
ChatVertexAI example always returns a truncated response #4968

Closed
svengau opened this issue Apr 3, 2024 · 3 comments
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

svengau commented Apr 3, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

import { ChatVertexAI } from "@langchain/google-vertexai";
// Or, if using the web entrypoint:
// import { ChatVertexAI } from "@langchain/google-vertexai-web";

const model = new ChatVertexAI({
  temperature: 0.7,
  modelName: "gemini-1.0-pro",
});

const response = await model.invoke("Why is the ocean blue?");
console.log(response);
/*
AIMessageChunk {
  content: [{ type: 'text', text: 'The ocean' }], // <===== ONLY FIRST CHUNK
  additional_kwargs: {},
  response_metadata: {}
}
 */

Error Message and Stack Trace (if applicable)

N/A

Description

I've tried the example provided in the docs, and the invoke method always returns only the first chunk of what should be the full streamed response.

As a workaround, I use:

const stream = await (llm as any).stream("...");
let response = "";
for await (const chunk of stream) {
  console.log(chunk.lc_kwargs.content);
  response += chunk?.lc_kwargs?.content?.toString();
}
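
For comparison, here is a minimal sketch that merges the streamed chunks into a single message rather than concatenating raw text. It reuses the model configuration from the example above and relies on the public stream() method and the concat() helper that LangChain.js message chunks expose; it is only an illustration of the workaround, not of how invoke() is implemented internally.

import { ChatVertexAI } from "@langchain/google-vertexai";

const model = new ChatVertexAI({
  temperature: 0.7,
  modelName: "gemini-1.0-pro",
});

// Merge every streamed chunk into one message, which is roughly the
// aggregation one would expect invoke() to perform before returning.
let full;
for await (const chunk of await model.stream("Why is the ocean blue?")) {
  full = full === undefined ? chunk : full.concat(chunk);
}
console.log(full?.content); // the complete response instead of only the first chunk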

System Info

langchain: 0.1.30
@langchain/community: 0.0.44
@langchain/core: 0.1.53
@langchain/google-vertexai: 0.0.2
@langchain/openai: 0.0.25
MacOS Sonoma 14.2.1 / Apple M1 Pro
node: 20.10.0
yarn: 1.22.1

@svengau svengau changed the title ChatVertexAI example always returns a Stream ChatVertexAI example always returns a truncated response Apr 3, 2024

dosubot bot commented Apr 3, 2024

Hey @svengau,

Thanks for bringing this to our attention and for the detailed report! It seems like you've encountered an issue where the ChatVertexAI class from @langchain/google-vertexai is only returning the first chunk of the expected response. This behavior is indeed not intended for most use cases, as you'd typically want the full response to your query.

Your workaround using the streaming approach is a smart move and effectively bypasses the issue by manually concatenating the response chunks. This method ensures that you receive the complete response as intended. However, I understand that having to resort to such workarounds is not ideal.

I've taken a closer look at the issue, and it appears to be related to how the invoke method handles response chunks. The method should ideally aggregate all chunks into a single response before returning, but it seems like it's currently stopping after the first chunk. This is something that needs to be addressed in the library's code.

While I cannot directly create a pull request or issue due to my limitations, I highly recommend reporting this behavior as a bug in the @langchain/google-vertexai repository. When reporting, please include the details you've shared here, along with your temporary workaround. This will help the maintainers understand the issue fully and expedite a fix.

In the meantime, continue using your streaming workaround for a complete response. Keep an eye on the repository for updates and potential fixes to this issue. The LangChain.js community is quite responsive, so I'm hopeful a solution will be available soon.

If you have any more questions or run into other issues, feel free to reach out. I'm here to help!

Best,
Dosu


@dosubot dosubot bot added the auto:bug label Apr 3, 2024
@afirstenberg
Contributor

Hmmm... that almost seems consistent, since Vertex only has streaming, but it should have combined things. And it probably shouldn't be returning a chunk.

@jacoblee93 - Please assign this to me.

@dosubot dosubot bot added the stale label (issue has not had recent activity or appears to be solved; stale issues will be automatically closed) Jul 3, 2024
@dosubot dosubot bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jul 10, 2024
@dosubot dosubot bot removed the stale label Jul 10, 2024
@afirstenberg
Contributor

(For the record, I think this was fixed when we implemented real non-streaming mode.)
