
VoicePipelineAgent doesn't send the transcription automatically to the client #236

Open
cuongpham-1001 opened this issue Dec 30, 2024 · 2 comments

I tried to use VoicePipelineAgent but didn't receive any transcriptions on the client when listening for the RoomEvent.TranscriptionReceived event.
The LiveKit documentation mentions that "VoicePipelineAgent and MultimodalAgent can generate and deliver transcriptions automatically".
I also tried to find the code that implements the transcription forwarding, but it looks like it hasn't been implemented yet.
Here is my code to set up the VoicePipelineAgent:

```ts
import { llm, pipeline, tokenize } from '@livekit/agents';
import * as openai from '@livekit/agents-plugin-openai';
import * as silero from '@livekit/agents-plugin-silero';

// options is passed in by the surrounding setup function
const { logger, voiceSettings } = options;
const initialContext = new llm.ChatContext().append({
  role: llm.ChatRole.SYSTEM,
  text: options.defaultInstruction,
});

const vad = await silero.VAD.load();
const agent = new pipeline.VoicePipelineAgent(
  vad,
  new openai.STT(),
  new openai.LLM(),
  new openai.TTS({
    voice: voiceSettings?.voice,
  }),
  {
    chatCtx: initialContext,
    allowInterruptions: true,
    interruptSpeechDuration: 500,
    interruptMinWords: 0,
    minEndpointingDelay: 500,
    transcription: {
      userTranscription: true,
      agentTranscription: true,
      agentTranscriptionSpeech: 1,
      sentenceTokenizer: new tokenize.basic.SentenceTokenizer(),
      wordTokenizer: new tokenize.basic.WordTokenizer(false),
      hyphenateWord: tokenize.basic.hyphenateWord,
    },
    beforeLLMCallback: (_, ctx) => {
      // log the latest user message before it is sent to the LLM
      const lastMessage = ctx.messages[ctx.messages.length - 1];
      if (lastMessage) {
        logger.info({
          content: lastMessage.content,
        });
      }
    },
    beforeTTSCallback: async (_, source) => {
      // collect the full agent response (a string or an async stream of
      // chunks) before synthesis, so it can be logged
      const messageChunks: string[] = [];
      if (typeof source === 'string') {
        messageChunks.push(source);
      } else {
        for await (const chunk of source) {
          messageChunks.push(chunk);
        }
      }
      const message = messageChunks.join('');
      logger.info({
        content: message,
      });
      return message;
    },
  },
);
```
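
For completeness, the agent is then started from the job entrypoint in the usual way (a sketch following the agents-js examples; `ctx` here is the JobContext passed to the entrypoint):

```ts
// sketch of the surrounding entrypoint, following the agents-js examples
await ctx.connect();
const participant = await ctx.waitForParticipant();
agent.start(ctx.room, participant);
```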

valdrox commented Jan 1, 2025

I left a comment in the docs. I had the same issue. I think I saw a comment somewhere saying they were working on this.

This is what I'm going with until this is added:

```ts
// BasicTranscriptionForwarder ships with @livekit/agents;
// ctx, participant, and agent come from the entrypoint as above
import { BasicTranscriptionForwarder } from '@livekit/agents';

let messageCounter = 0;

const handleSpeechCommitted = async (text: { content?: string }) => {
  if (text.content != undefined) {
    messageCounter++;
    const textForwarder = new BasicTranscriptionForwarder(
      ctx.room,
      participant.identity,
      'trackSid', // placeholder: use the SID of the participant's audio track
      messageCounter.toString(),
    );
    textForwarder.start();
    textForwarder.pushText(text.content);
    textForwarder.markTextComplete();
    textForwarder.close(false);
  }
};

agent.on(pipeline.VPAEvent.USER_SPEECH_COMMITTED, handleSpeechCommitted);
agent.on(pipeline.VPAEvent.AGENT_SPEECH_COMMITTED, handleSpeechCommitted);
```

Then catch it in the front-end with the code in the docs.
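
For reference, the client-side listener looks roughly like this (a sketch using livekit-client, where RoomEvent.TranscriptionReceived delivers an array of TranscriptionSegment):

```ts
import { Room, RoomEvent, TranscriptionSegment } from 'livekit-client';

const room = new Room();
// register before (or after) connecting with room.connect(url, token)
room.on(RoomEvent.TranscriptionReceived, (segments: TranscriptionSegment[]) => {
  for (const segment of segments) {
    // segment.final is false for interim results, true once the segment is complete
    console.log(segment.final ? 'final:' : 'interim:', segment.text);
  }
});
```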

nbsp (Member) commented Jan 3, 2025

Our method of sending transcription events is currently very rudimentary; this is something we're working on for a future release.
