
[Experimental] Chat with history #832

Closed

Conversation

@khoangothe (Contributor) commented Sep 9, 2024

I am trying to implement chat history so you can chat with the final report. The code is just experimental, since I'm not sure whether this is where the chat agent should live. It puts the report into an in-memory vector store, vectorizes it to retrieve the relevant chunks, and uses the default LangGraph React agent. Would be cool to get some feedback on how to proceed! Thanks guys
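The flow described above (vectorize the report into an in-memory store, then retrieve the relevant chunks for the chat agent) can be sketched without dependencies. This is an illustrative stand-in, not the PR's actual code: the bag-of-words "embedding" below is only a placeholder for a real embedding model, and a real implementation would use LangChain's in-memory vector store instead.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Placeholder "embedding": bag-of-words term counts. A real
    # implementation would call an embedding model via LangChain.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class InMemoryStore:
    """Toy stand-in for an in-memory vector store."""

    def __init__(self) -> None:
        self.chunks: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self.chunks.append((embed(text), text))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Rank stored chunks by similarity to the query embedding.
        q = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c[0]), reverse=True)
        return [text for _, text in ranked[:k]]

# Vectorize the final report chunk by chunk, then pull the most
# relevant chunks when the user asks a follow-up question.
store = InMemoryStore()
for chunk in ["GPT Researcher crawled ten sources.",
              "The report's pricing section covers subscription tiers."]:
    store.add(chunk)
relevant = store.search("what about pricing?", k=1)
```

The retrieved chunks would then be handed to the React agent as context for the follow-up answer.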

Here's the UI (it defaults to chat after a report is generated; pressing the search symbol conducts another research run):
[screenshot of the chat UI]

@ElishaKay (Collaborator) commented Sep 10, 2024

Pretty awesome @khoangothe!

a) You'll want to think about the text on the homepage, as defined in "frontend/nextjs/components/InputArea.tsx".

Perhaps conditional text based on the current state?

[screenshot of the homepage input area]

b) I've played with the branch a bit, and the results are pretty solid for the follow-up questions.

I'm curious whether we should leverage the new GPTR vector_store parameter:

researcher = GPTResearcher(
    query=query,
    report_type="research_report",
    report_source="langchain_vectorstore",
    vector_store=vector_store,
)

@khoangothe (Contributor, Author) commented Sep 10, 2024

@ElishaKay Got it, thanks for the feedback! I'll see if I can push a change soon. Hoping I can change the answer to some chat-like interface too.

Regarding the Vector Store, I saw your ideas in the VectorStore PR #781 and I think it would be very cool to have those implemented as well. We could store all of the related content in a vector store and retrieve the information when asked for more context. But I'm struggling with how best to store new data, since the vector store is supplied by the user, which makes it a bit complicated:

Case 1: Since we are storing every crawled document in "vector_store", ideally "vector_store" should be empty or contain only related information, so new pages won't contaminate the existing data and queries stay accurate.
Case 2: We just want to store all the results, so the user can have an all-knowing bot that can answer questions about his/her previous reports.
Case 3: Should we add a default vector DB (InMemory or any local vector DB) in case the user does not provide a vector store?

I would love to hear your opinion. I'm not sure which case users will want, but I think I can quickly swap the InMemoryVectorStore in the code for the user-provided VectorStore once we have this feature! Would be cool if we could have a discussion, or if you could help with some ideas about this!

@ElishaKay (Collaborator) commented Sep 10, 2024

Good points @khoangothe

Case 1: Agreed. If we add a chat_id or report_id to the vector store records, I guess we'd be able to filter vector store searches by chat_id.

Case 2: Personalization would be great, but it depends on whether we want user authentication and multi-user logic within this repo, or whether the GPTR community should build that out independently on a case-by-case basis. That's not yet decided, so it's less relevant for this PR.

Case 3: FAISS seems like a good candidate, since it comes out of the box with a pip package. Postgres could be good as well for SQL querying. Have a look at the LangChain vector stores.
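Case 1 above amounts to tagging every record with a chat_id and filtering retrieval on it. The store below is a toy illustration of that metadata filter (names hypothetical); LangChain vector stores such as FAISS accept a similar metadata `filter` argument on `similarity_search`, so the same idea carries over.

```python
class ChatScopedStore:
    """Toy store demonstrating metadata filtering by chat_id.
    Real code would rank hits by embedding similarity; a plain
    substring match stands in for retrieval here."""

    def __init__(self) -> None:
        self.records: list[tuple[str, dict]] = []

    def add(self, text: str, *, chat_id: str) -> None:
        # Attach the owning chat's id as record metadata.
        self.records.append((text, {"chat_id": chat_id}))

    def search(self, query: str, *, chat_id: str) -> list[str]:
        # The chat_id filter keeps one chat's documents from
        # contaminating another chat's results (Case 1).
        return [text for text, meta in self.records
                if meta["chat_id"] == chat_id and query.lower() in text.lower()]

store = ChatScopedStore()
store.add("Report A: market analysis of EV batteries", chat_id="chat-1")
store.add("Report B: market analysis of solar panels", chat_id="chat-2")
hits = store.search("market analysis", chat_id="chat-1")
```

With this scoping in place, a single shared store (FAISS as the Case 3 default, or a user-supplied one) can serve many chats without cross-contamination.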

@ElishaKay (Collaborator) commented Sep 11, 2024

P.S. @assafelovic is the lead on backend integrations, so it would be worthwhile to bring him into the loop regarding the architecture plan for this PR.

@assafelovic:

a) are we ready for the Postgres integration?

b) if so, regarding the data model (table structure) I'm thinking of:

  • Chat

  • Message (1 or more per chat)

    • chat_id
    • type = enum(query, logs, report, followup_question...)

c) would love to get your input regarding the backend logic for followup questions (see chats above)

Jah bless ✌
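The Chat/Message data model proposed in (b) could be sketched as below. This is hypothetical, not a settled schema: stdlib sqlite3 stands in for Postgres, a CHECK constraint stands in for the enum, and all table and column names beyond those listed above are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chat (
    id INTEGER PRIMARY KEY
);
CREATE TABLE message (
    id INTEGER PRIMARY KEY,
    chat_id INTEGER NOT NULL REFERENCES chat(id),
    -- Stand-in for enum(query, logs, report, followup_question, ...)
    type TEXT NOT NULL CHECK (type IN ('query', 'logs', 'report', 'followup_question')),
    content TEXT NOT NULL
);
""")

# One chat holding a query and its follow-up question.
conn.execute("INSERT INTO chat (id) VALUES (1)")
conn.execute("INSERT INTO message (chat_id, type, content) VALUES (1, 'query', 'What is GPTR?')")
conn.execute("INSERT INTO message (chat_id, type, content) VALUES (1, 'followup_question', 'Tell me more')")
rows = conn.execute("SELECT type FROM message WHERE chat_id = 1 ORDER BY id").fetchall()
```

The chat_id foreign key here is the same key suggested above for filtering vector store records, so the two pieces would line up.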

@khoangothe (Contributor, Author):

@ElishaKay Thanks man, I just added a PR for the VectorStore: #838. I think if we want to proceed with the vector_store ideas, we need to implement that first. Then we can continue with this branch to chat with the crawled information in the vector DB.

@ElishaKay ElishaKay mentioned this pull request Sep 22, 2024
@ElishaKay (Collaborator) commented Sep 22, 2024

@khoangothe

Good thinking!
I've merged #838 into the AI Dev Team flow in #819
I'd like to also merge this PR into the same AI Dev Team branch & do extensive testing.

The main use cases I'd like to test & verify are:

I've also invited you as a collaborator on my GPTR fork, so you can feel free to push up commits to the AI Dev Team branch if inspiration sparks

@khoangothe (Contributor, Author):

> you can feel free to push up commits to the AI Dev Team branch if inspiration sparks

Thanks man, I do have some ideas on how to leverage the vector store. I'll read through the AI Dev Team branch and see what I can add

@assafelovic (Owner):

Hey @khoangothe, any update on this? Would love to see this merged!

@khoangothe (Contributor, Author):

Thanks @assafelovic, my plan was to have the vector_store PR merged first and then utilize the vector store for this PR. With that functionality, I can store crawled data into an in-memory DB, and then we can chat with the data in the DB (not just the report).
If you prefer, I can also merge the vector_store work into this PR and then push a commit to chat with the crawled data all at once. I'm working on it right now and can push a commit either way soon!

@assafelovic (Owner):

Got it @khoangothe. Apologies in advance: we've refactored quite a lot of the codebase to be better structured, which created some conflicts with the recent PRs.

@khoangothe (Contributor, Author) commented Oct 5, 2024

@assafelovic no worries! I think I'll just close this PR and open another one instead of resolving the merge conflicts, so it's easier to keep track of the change history. Will do once I finish merging.

@khoangothe khoangothe mentioned this pull request Oct 5, 2024
@khoangothe khoangothe closed this Oct 5, 2024