remove quotes from sql database prompts (caused syntax error) #4101

Merged
merged 1 commit into from
May 13, 2023

Contributor

@hansvdam hansvdam commented May 4, 2023

Fixes a syntax error reported in #2027 and #3305.
Another PR that tries to remedy this is #3385, but I believe it does not tackle the core problem.
#2027 also mentions a workaround that works, which is to add this to the prompt:
'The SQL query should be outputted plainly, do not surround it in quotes or anything else.'

To me it seems strange to first ask for:

SQLQuery: "SQL Query to run"

and then tell the LLM not to put quotes around it. The templates other than the SQL one do not use quotes in their steps.
This PR changes that line to:

SQLQuery: SQL Query to run
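
To illustrate why the wrapping quotes matter, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory `users` table, the two query strings, and the `run` helper are made up for illustration and are not taken from the linked issues. If the model copies the quoted format literally, the database receives a quoted token instead of a statement:

```python
import sqlite3

# Hypothetical in-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

plain_query = "SELECT name FROM users LIMIT 1"
# What the chain would execute if the LLM wraps its output in double quotes.
quoted_query = '"SELECT name FROM users LIMIT 1"'


def run(query):
    """Execute a query and return its rows, or an error string on failure."""
    try:
        return conn.execute(query).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


print(run(plain_query))
print(run(quoted_query))
```

The plain query returns the row, while the quoted one is rejected by SQLite with a syntax error.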

Contributor

@hwchase17 hwchase17 left a comment


seems reasonable to me!

@hwchase17 hwchase17 added the lgtm label May 5, 2023
@PhilipMay

> Another PR that tries to remedy this is #3385, but I believe it does not tackle the core problem.

I agree. This PR is a good solution to the problem. IMO stripping the quotes from the generated query (in case the LLM produces them anyway) would also be helpful. So I think #3385 is also a valid change that should be made.

@rdancer
Contributor

rdancer commented May 6, 2023

Doesn't fix the problem for me

Comparing a test run on the current master with one on the current master plus this PR, I see no difference: the extraneous quotes appear for the same queries (test DB and test-case Python script attached; https://github.com/rdancer/langchain.git@f873b94 is the current master with this PR applied).

test_sql_quoting.zip

@hansvdam
Contributor Author

hansvdam commented May 6, 2023

If you insert the following snippet into your script, it starts working. It should have the same effect as removing the quotes from the prompt file, which is strange:

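# ********
# Import paths assumed for the langchain release current at the time of this PR (May 2023).
# `llm` and `db` are expected to be defined by the surrounding test script.
from langchain import SQLDatabaseChain
from langchain.prompts import PromptTemplate
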
_sqlite_prompt = """You are a SQLite expert. Given an input question, first create a syntactically correct SQLite query to run, then look at the results of the query and return the answer to the input question.
Unless the user specifies in the question a specific number of examples to obtain, query for at most {top_k} results using the LIMIT clause as per SQLite. You can order the results to return the most informative data in the database.
Never query for all columns from a table. You must query only the columns that are needed to answer the question. Wrap each column name in double quotes (") to denote them as delimited identifiers.
Pay attention to use only the column names you can see in the tables below. Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.

Use the following format:

Question: Question here
SQLQuery: SQL Query to run
SQLResult: Result of the SQLQuery
Answer: Final answer here

Only use the following tables:
{table_info}

Question: {input}"""

# Override the default prompt because of https://github.com/hwchase17/langchain/issues/2027 (quotes around query)
_sqlite_prompt = PromptTemplate(
    input_variables=["input", "table_info", "top_k"],
    template=_sqlite_prompt,
)

db_chain = SQLDatabaseChain.from_llm(llm=llm, db=db, verbose=True, prompt=_sqlite_prompt)
# ************

@PhilipMay

PhilipMay commented May 6, 2023

> Doesn't fix the problem for me

IMO the solution for this problem should be two measures:

  1. adapt the prompt to reduce the probability that the SQL is in quotes - like in this PR
  2. remove the quotes just in case the LLM decided to generate them - like in #3385 (Fix problem with sql_chain and quotation marks: normalize_sql_cmd)
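
The second measure could look roughly like the sketch below. This is not the implementation from #3385; `strip_wrapping_quotes` is a hypothetical helper that only shows the idea of removing one pair of matching quotes that wraps the whole generated query:

```python
def strip_wrapping_quotes(sql: str) -> str:
    """Remove one pair of matching quotes or backticks wrapping the whole string.

    Hypothetical helper for illustration. It assumes a real SQL statement
    never starts with a quote character, so quoted identifiers inside the
    query are left untouched.
    """
    cleaned = sql.strip()
    if len(cleaned) >= 2 and cleaned[0] == cleaned[-1] and cleaned[0] in "\"'`":
        return cleaned[1:-1].strip()
    return cleaned


print(strip_wrapping_quotes('"SELECT name FROM users LIMIT 1"'))
print(strip_wrapping_quotes('SELECT "name" FROM "users"'))
```

Applying something like this to the LLM output before it reaches the database keeps the chain working even when the prompt change alone does not stop the model from adding quotes.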

@hansvdam
Contributor Author

hansvdam commented May 8, 2023

@rdancer
I had not previously run my commit rebased onto master, because I was battling with poetry to get langchain to build properly.

Now I have, and on my machine your example runs without error. Could you verify that the changes to the prompt file (removing the quotes) were actually applied when you tested the PR?
Thanks!
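
One way to check this is sketched below. It assumes you can get hold of the template text that the chain actually uses, and how to obtain that text depends on the installed langchain version. `asks_for_quoted_query` is a hypothetical helper, and the two sample strings reuse the format lines quoted in this conversation:

```python
import re

# Matches a format line that still asks for the query inside double quotes.
QUOTED_PLACEHOLDER = re.compile(r'^SQLQuery:\s*"', re.MULTILINE)


def asks_for_quoted_query(template: str) -> bool:
    """Return True if the prompt template still shows the SQL query in quotes."""
    return bool(QUOTED_PLACEHOLDER.search(template))


old_format = (
    "Question: Question here\n"
    'SQLQuery: "SQL Query to run"\n'
    "SQLResult: Result of the SQLQuery\n"
)
new_format = (
    "Question: Question here\n"
    "SQLQuery: SQL Query to run\n"
    "SQLResult: Result of the SQLQuery\n"
)

print(asks_for_quoted_query(old_format))
print(asks_for_quoted_query(new_format))
```

If the check still reports a quoted placeholder for the template in use, the test ran against the unpatched prompt file.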
