Unable to get streaming response from BedrockChat in a custom agent #5917

Closed
MrHertal opened this issue Jun 27, 2024 · 6 comments
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@MrHertal

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

  const llm = new BedrockChat({
    model: "anthropic.claude-3-sonnet-20240229-v1:0",
    modelKwargs: {
      temperature: 0,
    },
    streaming: true,
    callbacks: [
      {
        async handleLLMNewToken(token) {
          console.log({ token });
        },
      },
    ],
  });

  const memory = new BufferMemory({
    chatHistory: new DynamoDBChatMessageHistory({
      tableName: memoryTable,
      partitionKey: "SessionId",
      sessionId,
    }),
    returnMessages: true,
    memoryKey: "chat_history",
    outputKey: "output",
  });

  const agent = createToolCallingAgent({
    llm,
    tools,
    prompt,
  });

  const agentExecutor = new AgentExecutor({
    agent,
    tools,
    memory,
  });

  const result = await agentExecutor.invoke({
    input: message,
  });

Error Message and Stack Trace (if applicable)

No response

Description

Hi,

I'm following this example: https://github.com/langchain-ai/langchainjs/blob/main/examples/src/agents/streaming.ts
to enable streaming in an agent response.

I'm using BedrockChat as the model with the following options:

  const llm = new BedrockChat({
    model: "anthropic.claude-3-sonnet-20240229-v1:0",
    modelKwargs: {
      temperature: 0,
    },
    streaming: true,
    callbacks: [
      {
        async handleLLMNewToken(token) {
          console.log({ token });
        },
      },
    ],
  });

I'm expecting to see a response from the agent streaming in the console, token by token.
But I always get a single token containing the full response:

{
  token: "the full response..."
}

Am I missing something?

System Info

{
  name: 'langchain',
  description: 'Typescript bindings for langchain',
  'dist-tags': {
    next: '0.2.3-rc.0',
    latest: '0.2.7'
  },
  version: '0.2.7',
  ...
}
yarn -v
1.22.22
node -v
v20.11.0
dosubot bot added the auto:bug label Jun 27, 2024

dosubot bot commented Jun 27, 2024

Hey there, @MrHertal! I'm here to help you with the issue you've encountered. I'm a bot designed to assist with bug-solving, answering questions, and guiding potential contributors. Let's work together to resolve this issue!

You are unable to get a streaming response from BedrockChat in a custom agent and instead receive a single token with the full response because the streaming option is not enabled in the BedrockChat configuration for the custom agent.

To enable streaming, you need to set the streaming property to true and provide the appropriate callbacks for handling new tokens and the end of the response. Here is an example of how to configure it:

import { BedrockChat } from "@langchain/community/chat_models/bedrock";
import { HumanMessage } from "@langchain/core/messages";

async function runStreaming() {
  // Instantiate the BedrockChat model with the desired configuration
  const model = new BedrockChat({
    model: "anthropic.claude-3-sonnet-20240229-v1:0",
    region: "us-east-1",
    credentials: {
      accessKeyId: process.env.BEDROCK_AWS_ACCESS_KEY_ID!,
      secretAccessKey: process.env.BEDROCK_AWS_SECRET_ACCESS_KEY!,
    },
    maxTokens: 150,
    temperature: 0.7,
    stopSequences: ["\n", " Human:", " Assistant:"],
    streaming: true, // Enable streaming
    trace: "ENABLED",
    guardrailIdentifier: "your-guardrail-id",
    guardrailVersion: "1.0",
    guardrailConfig: {
      tagSuffix: "example",
      streamProcessingMode: "SYNCHRONOUS",
    },
  });

  // Prepare the message to be sent to the model
  const message = new HumanMessage({ content: "Tell me a joke" });

  // Stream the response from the model
  const stream = await model.stream([message]);
  for await (const chunk of stream) {
    // Output each chunk of the response
    console.log(chunk.content);
  }
}

runStreaming().catch(console.error);

This configuration ensures that the BedrockChat instance is set up to handle streaming responses, and the stream method will yield each chunk of the response as it is received [1][2][3].


@MrHertal
Author

Sorry @dosu but this is not helping.

I'm using BedrockChat in the context of a custom agent, so I don't run the model directly.
This is done by the agent after calling executor.invoke.
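
For reference, callbacks can also be attached per call through the RunnableConfig second argument of invoke, so they reach the model the agent runs internally; a minimal sketch, reusing the agentExecutor and message from the snippet above (this is not the fix that ended up working in this thread):

const result = await agentExecutor.invoke(
  { input: message },
  {
    callbacks: [
      {
        async handleLLMNewToken(token) {
          // Fires per chunk only when the underlying model actually streams
          console.log({ token });
        },
      },
    ],
  }
);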


dosubot bot commented Jun 27, 2024

To enable proper token-by-token streaming responses from BedrockChat in your custom agent setup, you need to ensure that the streamRunnable parameter is set to true in the createToolCallingAgent function. Here is an example of how to configure it:

import { BedrockChat } from "@langchain/community/chat_models/bedrock";
import { BufferMemory } from "langchain/memory";
import { DynamoDBChatMessageHistory } from "@langchain/community/stores/message/dynamodb";
import { AgentExecutor, createToolCallingAgent } from "langchain/agents";

const llm = new BedrockChat({
  model: "anthropic.claude-3-sonnet-20240229-v1:0",
  modelKwargs: {
    temperature: 0,
  },
  streaming: true,
  callbacks: [
    {
      async handleLLMNewToken(token) {
        console.log({ token });
      },
    },
  ],
});

const memory = new BufferMemory({
  chatHistory: new DynamoDBChatMessageHistory({
    tableName: memoryTable,
    partitionKey: "SessionId",
    sessionId,
  }),
  returnMessages: true,
  memoryKey: "chat_history",
  outputKey: "output",
});

const agent = createToolCallingAgent({
  llm,
  tools,
  prompt,
  streamRunnable: true, // Enable streaming
});

const agentExecutor = new AgentExecutor({
  agent,
  tools,
  memory,
});

const result = await agentExecutor.invoke({
  input: message,
});

By setting streamRunnable: true, you enable the agent to invoke the underlying model in streaming mode, allowing for token-by-token streaming responses [1][2].


@MrHertal
Author

By default, streamRunnable is set to true.
I tried to explicitly set it to true, without success.

@jeasonnow
Contributor

jeasonnow commented Jul 5, 2024

By default streamRunnable is set to true. I tried to explicitly set it to true without success.

Can you try the following code?

const result = await agentExecutor.stream({
  input: message,
});
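
stream returns an async iterable rather than logging anything itself, so the chunks still need to be consumed; a minimal sketch, reusing the result variable from the call above:

for await (const chunk of result) {
  // Each chunk carries intermediate agent steps or the final output
  console.log(chunk);
}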

@MrHertal

@MrHertal
Author

@jeasonnow thanks, but that didn't work.

I finally made it work by implementing this solution: https://github.com/aws-samples/langchain-agents/tree/main/bedrock/langchain-js-stream-agent
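
For anyone landing here later: another way to surface token-level chunks from an AgentExecutor is the streamEvents API; a minimal sketch, assuming a LangChain.js 0.2.x release where the v2 event schema is available, and reusing the agentExecutor and message from the original snippet (a generic approach, not a reproduction of the linked AWS sample):

// streamEvents emits events from every runnable inside the agent,
// including per-token chat model chunks, even though invoke only
// returns the final assembled result.
const eventStream = agentExecutor.streamEvents(
  { input: message },
  { version: "v2" }
);

for await (const event of eventStream) {
  if (event.event === "on_chat_model_stream") {
    const content = event.data?.chunk?.content;
    if (content) {
      // Token-by-token output from the underlying BedrockChat model
      console.log({ token: content });
    }
  }
}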
