[Question]: How to create a query pipeline to chat with text docs and SQL tables #13775

Open
JonOnEarth opened this issue May 28, 2024 · 5 comments
Labels: question (Further information is requested)

@JonOnEarth

Question Validation

  • I have searched both the documentation and Discord for an answer.

Question

I have followed these two advanced examples [1, 2] and can successfully chat with SQL tables and with text docs, separately.
How can I stitch them together, so I can chat with multiple SQL tables and text docs at the same time?
The process should be to choose the correct doc or table based on the query and then chat with it. If a table is chosen, the text-to-SQL component is needed as well.
Also, if I ask a query that is irrelevant to these documents, how can I just use the LLM's own reply instead of always searching the documents?

JonOnEarth added the question label on May 28, 2024
@logan-markewich (Collaborator)

Basically, add some kind of router to route between the paths. Query pipelines support conditional links.

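The routing shape being suggested here looks roughly like the sketch below: each candidate path is its own query pipeline, and a router component picks one of them per query. This is only a sketch under assumptions, not tested code: sql_pipeline and rag_pipeline are hypothetical stand-ins for the pipelines from the two linked examples, and the import paths are assumed from the components used later in this thread.

from llama_index.core.query_pipeline import QueryPipeline, RouterComponent
from llama_index.core.selectors import LLMSingleSelector

# sql_pipeline / rag_pipeline: already-built QueryPipelines from the two
# linked examples (hypothetical names for this sketch).
router = RouterComponent(
    selector=LLMSingleSelector.from_defaults(),
    choices=[
        "Useful for questions over the SQL tables",
        "Useful for questions over the text documents",
    ],
    components=[sql_pipeline, rag_pipeline],  # each must expose exactly one required input key
    verbose=True,
)
top_pipeline = QueryPipeline(chain=[router], verbose=True)
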
@JonOnEarth (Author)

Thanks @logan-markewich. I am new to LlamaIndex and still have a lot of confusion; I am just trying to learn from the examples and real problems.
Here is what I did, and the error I get at the end.

I combined the two examples into basic query pipelines like this:
1st example, for the SQL tables:

qp = QP(
    modules={
        "input": InputComponent(),
        "table_retriever": obj_retriever,
        "table_output_parser": table_parser_component,
        "text2sql_prompt": text2sql_prompt,
        "text2sql_llm": llm,
        "sql_output_parser": sql_parser_component,
        "sql_retriever": sql_retriever,
        "response_synthesis_prompt": response_synthesis_prompt,
        "response_synthesis_llm": llm,
    },
    verbose=True,
)
qp.add_chain(["input", "table_retriever", "table_output_parser"])
qp.add_link("input", "text2sql_prompt", dest_key="query_str")
qp.add_link("table_output_parser", "text2sql_prompt", dest_key="schema")
qp.add_chain(
    ["text2sql_prompt", "text2sql_llm", "sql_output_parser", "sql_retriever"]
)
qp.add_link(
    "sql_output_parser", "response_synthesis_prompt", dest_key="sql_query"
)
qp.add_link(
    "sql_retriever", "response_synthesis_prompt", dest_key="context_str"
)
qp.add_link("input", "response_synthesis_prompt", dest_key="query_str")
qp.add_link("response_synthesis_prompt", "response_synthesis_llm")

2nd example for text RAG:

p = QueryPipeline(verbose=True)
p.add_modules(
    {
        "input": InputComponent(),
        "retriever": retriever,
        "summarizer": summarizer,
    }
)
p.add_link("input", "retriever")
p.add_link("input", "summarizer", dest_key="query_str")
p.add_link("retriever", "summarizer", dest_key="nodes")

I added a 3rd query pipeline that just uses the LLM:

qp_llm = QueryPipeline(
    modules={
        "llm": llm2,
    },
    verbose=True,
)

Then I added the router:

# define selector
selector = LLMSingleSelector.from_defaults()
choices = [
    "This tool answers questions related to wiki tables data",
    "This tool contains the knowledge about Paul Graham",
    "This tool only uses LLM itself to answer the questions non-related to Paul Graham and wiki tables data. "
]
router_c = RouterComponent(
    selector=selector,
    choices=choices,
    components=[p, qp_llm], #qp
    verbose=True,
)
# top-level pipeline
qp_t = QueryPipeline(chain=[router_c], verbose=True)

It returns the error:

/usr/local/lib/python3.10/dist-packages/llama_index/core/query_pipeline/components/router.py in __init__(self, selector, choices, components, verbose)
    109             # validate component has one input key
    110             if len(new_component.free_req_input_keys) != 1:
--> 111                 raise ValueError("Expected one required input key")
    112             query_keys.append(next(iter(new_component.free_req_input_keys)))
    113             new_components.append(new_component)

ValueError: Expected one required input key

@lazyFrogLOL

(quoting @JonOnEarth's comment and traceback above)

Is there any solution for this issue? I am running into it as well. How do I define a free_req_input_key for QueryPipeline modules?

@lazyFrogLOL

@JonOnEarth
I solved the issue by replacing InputComponent() with a specific PromptTemplate object.

prompt_str = "{query}"
prompt_tmpl = PromptTemplate(prompt_str)
p = QueryPipeline(verbose=True)
p.add_modules(
    {
        "input": prompt_tmpl,
        "retriever": retriever,
        "summarizer": summarizer,
    }
)
p.add_link("input", "retriever")
p.add_link("input", "summarizer", dest_key="query_str")
p.add_link("retriever", "summarizer", dest_key="nodes")

You can try using this code.
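
Extending that fix back to the original three-way setup, a rough sketch could look like the following. Assumptions: qp, p, and llm2 are built as in the comments above, the SQL pipeline's "input" module is likewise swapped to a single-variable PromptTemplate, and import paths may differ by llama-index version.

from llama_index.core import PromptTemplate
from llama_index.core.query_pipeline import QueryPipeline, RouterComponent
from llama_index.core.selectors import LLMSingleSelector

# Plain-LLM fallback pipeline, prefixed with a single-variable template so it
# also exposes exactly one free required input key.
qp_llm = QueryPipeline(chain=[PromptTemplate("{query}"), llm2], verbose=True)

# qp: the text-to-SQL pipeline above, with its "input" module replaced by
# PromptTemplate("{query}") in the same way as in p.
router_c = RouterComponent(
    selector=LLMSingleSelector.from_defaults(),
    choices=[
        "Answers questions about the wiki tables data",
        "Answers questions about Paul Graham",
        "Answers questions unrelated to the wiki tables or Paul Graham",
    ],
    components=[qp, p, qp_llm],
    verbose=True,
)
qp_t = QueryPipeline(chain=[router_c], verbose=True)

The key point, going by the traceback above, is that every component handed to RouterComponent must have exactly one free required input key, which is what the single-variable PromptTemplate provides.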

@JonOnEarth (Author)

(quoting @lazyFrogLOL's solution above)

@lazyFrogLOL Thanks for the solution.
