[Cartesia] Upgrading the message cutoff for Cartesia Synthesizer to use timestamps #700

Merged

Conversation

chongzluong
Contributor

Overview

Discussed with Ajay over Slack, recapping below.

Cartesia's notion of continuations differs from what Vocode generally expects. Vocode seems to expect an N:N ratio of senders to receivers, whereas our continuations use an N:1 ratio. Vocode's get_message_up_to also reflects this N:N assumption.

The proposed solution is to store two new variables, self.ctx_message and self.ctx_timestamps. The Cartesia TTS request now asks for timestamps, and those timestamps indicate how far into the message we've gotten. If timestamps aren't available for some reason, or if they lag by more than 2 seconds, we fall back to an estimated WPM together with self.ctx_message to get a best approximation.
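For readers skimming the thread, here is a rough sketch of the cutoff logic described above. It is not the code in this PR: the function name `message_up_to`, the `(word, start, end)` tuple layout, and the default of 150 WPM are assumptions made for illustration. Only the 2 second gap and the idea of falling back to a WPM estimate over the stored message come from the description.

```python
from typing import List, Tuple

# Hypothetical layout: one (word, start_seconds, end_seconds) entry per word.
WordTimestamp = Tuple[str, float, float]


def message_up_to(
    ctx_message: str,
    ctx_timestamps: List[WordTimestamp],
    seconds_spoken: float,
    wpm: float = 150.0,   # assumed default speaking rate
    max_gap: float = 2.0,  # the 2 second gap from the description
) -> str:
    """Return the portion of ctx_message estimated to have been spoken."""
    if ctx_timestamps:
        last_end = ctx_timestamps[-1][2]
        # Trust timestamps only if they are not lagging too far behind playback.
        if seconds_spoken - last_end <= max_gap:
            spoken = [w for (w, _start, end) in ctx_timestamps if end <= seconds_spoken]
            return " ".join(spoken)

    # Fallback: estimate the number of words spoken from an assumed WPM.
    words = ctx_message.split()
    estimated_count = int(seconds_spoken * wpm / 60.0)
    return " ".join(words[:estimated_count])


if __name__ == "__main__":
    msg = "Hello there, how are you?"
    stamps = [
        ("Hello", 0.0, 0.4),
        ("there,", 0.4, 0.8),
        ("how", 0.8, 1.0),
        ("are", 1.0, 1.2),
        ("you?", 1.2, 1.6),
    ]
    print(message_up_to(msg, stamps, seconds_spoken=1.0))
    print(message_up_to(msg, [], seconds_spoken=1.0, wpm=120))
```

In this sketch the timestamp path keeps only words that finished before the cutoff, so a word that is cut off mid-utterance is dropped rather than counted as spoken.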

Testing

I set up the telephony_app per the Vocode directions. I then adjusted the synthesizer and played around with it locally on my own Vocode deployment to check that it works as intended.

Contributor

@ajar98 ajar98 left a comment


will approve / merge once linting is good! can run make lint from the root dir

@ajar98 ajar98 merged commit dc983a0 into vocodedev:main Sep 6, 2024
4 checks passed
@cyrilS-dev

This PR is causing an error:

vocode.streaming.synthesizer.cartesia_synthesizer:chunk_generator:185 - Caught error while receiving audio chunks from CartesiaSynthesizer: Failed to generate audio:
Error generating audio:
error processing TTS request: Language must be specified for timestamps.

3 participants