[Question]: Streaming response with metadata #14519

Closed
1 task done
narenSb1837 opened this issue Jul 2, 2024 · 2 comments
Labels
question Further information is requested

Comments

@narenSb1837

Question Validation

  • I have searched both the documentation and discord for an answer.

Question

from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.postprocessor import SentenceEmbeddingOptimizer
from llama_index.core.query_engine import CitationQueryEngine
from llama_index.core.vector_stores import FilterOperator, MetadataFilter, MetadataFilters
from llama_index.llms.openai import OpenAI
from llama_index.vector_stores.pinecone import PineconeVectorStore

# HomeNamespace, PineconeIndexEnum, get_index, get_secret_value and
# citation_qa_template are project-specific and defined elsewhere.


def get_completion(query: str, namespace: HomeNamespace, home_id: int):
    """
    Queries documents for a specific property and streams the response.

    Args:
        query (str): The query string.
        namespace (HomeNamespace): The namespace for the Pinecone index.
        home_id (int): The home ID to filter the documents.
    Yields:
        str: Chunks of the response text as they are generated.
    """
    # Initialize Pinecone index
    vector_store = PineconeVectorStore(pinecone_index=get_index(PineconeIndexEnum.HOME), namespace=namespace)
    index = VectorStoreIndex.from_vector_store(vector_store=vector_store)

    # Configure the re-ranking optimizer
    rerank = SentenceEmbeddingOptimizer(embed_model=Settings.embed_model, percentile_cutoff=0.5, threshold_cutoff=0.85)

    # Set metadata filters for the query
    filters = MetadataFilters(
        filters=[
            MetadataFilter(key="home_id", operator=FilterOperator.EQ, value=home_id),
        ]
    )

    # Initialize the citation query engine
    citation_query_engine = CitationQueryEngine.from_args(
        index,
        similarity_top_k=5,
        verbose=True,
        node_postprocessors=[rerank],
        filters=filters,
        citation_chunk_size=512,
        citation_qa_template=citation_qa_template,
        llm=OpenAI(model="gpt-4o-2024-05-13", api_key=get_secret_value("OPENAI_API_KEY")),
        streaming=True,
    )

    # Perform the query
    response = citation_query_engine.query(query)

    # Extract citations and modify the response string
    # citation_indices, response_str = extract_citations_and_modify_string(str(response))
    # citations = [response.source_nodes[i - 1].text for i in citation_indices]

    # Stream the response text chunk by chunk
    for text in response.response_gen:
        yield text

When I use this function, I only get the text of the response. I also want to access the metadata attributes of the source nodes so that I can cite page_number and other metadata.

@narenSb1837 narenSb1837 added the question Further information is requested label Jul 2, 2024
@logan-markewich
Collaborator

I think I shared this on Discord: either yield the metadata at the start or at the end, or attach it to every text chunk that you yield. It's still on the response object.

So either

yield response.source_nodes # or whatever other metadata
for text in response.response_gen:
    yield text

or

for text in response.response_gen:
    yield text
yield response.source_nodes # or whatever other metadata

or

for text in response.response_gen:
    yield {"text": text, "metadata": ....}
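
For anyone landing here later, the second option (text first, metadata at the end) can be sketched roughly as below. This is only an illustration, not the library's own helper: `stream_with_metadata`, the `"type"` field, and the event shapes are made up for this example, and the `Fake*` classes are stand-ins that mimic just the `response_gen` and `source_nodes` attributes used in this thread so the snippet runs without LlamaIndex or Pinecone. It assumes each source node exposes `.node.metadata` and `.score`, and that citations are numbered from 1 in source-node order, as the commented-out `source_nodes[i - 1]` lookup in the question implies.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List


@dataclass
class FakeNode:
    """Stand-in for a retrieved node: text plus a metadata dict."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class FakeNodeWithScore:
    """Stand-in for one entry of response.source_nodes."""
    node: FakeNode
    score: float


@dataclass
class FakeStreamingResponse:
    """Stand-in exposing only response_gen and source_nodes."""
    chunks: List[str]
    source_nodes: List[FakeNodeWithScore]

    @property
    def response_gen(self) -> Iterator[str]:
        return iter(self.chunks)


def stream_with_metadata(response) -> Iterator[Dict[str, Any]]:
    """Yield one event per text chunk, then a single final sources event."""
    for text in response.response_gen:
        yield {"type": "text", "text": text}
    # The source nodes stay on the response object, so they can be
    # read after the text stream is exhausted.
    yield {
        "type": "sources",
        "sources": [
            {"citation": i, "score": sn.score, "metadata": dict(sn.node.metadata)}
            for i, sn in enumerate(response.source_nodes, start=1)
        ],
    }


# Hypothetical data, only to show how a consumer tells the events apart.
demo = FakeStreamingResponse(
    chunks=["The roof ", "was replaced [1]."],
    source_nodes=[FakeNodeWithScore(FakeNode("roof report", {"page_number": 12}), 0.9)],
)
for event in stream_with_metadata(demo):
    if event["type"] == "text":
        print(event["text"], end="")
    else:
        print("\nsources:", event["sources"])
```

Tagging each event with a type lets the client keep appending text chunks and handle the final sources event separately, for example to map a `[1]` marker to its page_number.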

@narenSb1837
Author

OK, thank you so much, got it 👍
