mention OpenSearchHybridRetrieval in OpenSearch integration page #327
I made some suggestions for the usage example. You could discuss with @bilgeyucel and @dfokina whether we want to include the OpenSearchHybridRetriever in the integration page with a usage example or only link to the documentation pages. The other retrievers aren't described on the integrations page either. I suggest we at least link to the documentation pages of the document store and the three components.
top_k=10,
)

pipeline.run(query="What is the capital of France?")
This usage example can't work. `Pipeline` is neither imported nor initialized, and the retriever is never added to the pipeline. There are also no documents in the document store.
You could extend the following. It's based on the example we used in the 2.13.0 release: https://github.com/deepset-ai/haystack/releases/tag/v2.13.0
Please add the embedder, adjust the dimensions param of the doc store, and test it.
# pip install haystack-ai opensearch-haystack datasets "sentence-transformers>=3.0.0"
from datasets import load_dataset
from haystack import Document
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack_integrations.components.retrievers.opensearch import OpenSearchHybridRetriever
from haystack_integrations.document_stores.opensearch import OpenSearchDocumentStore

# Load chunks that already carry precomputed embeddings.
dataset = load_dataset("HaystackBot/medrag-pubmed-chunk-with-embeddings", split="train")
docs = [Document(content=doc["contents"], embedding=doc["embedding"]) for doc in dataset]

# Index the documents (requires a running OpenSearch instance).
document_store = OpenSearchDocumentStore()
document_store.write_documents(docs)

query = "What treatments are available for chronic bronchitis?"
# TODO: add a SentenceTransformersTextEmbedder with "BAAI/bge-small-en-v1.5"
result = OpenSearchHybridRetriever(document_store).run(...)
print(result)
pipeline.run(query="What is the capital of France?")
```

You can learn more about the `OpenSearchHybridRetriever` in the [documentation]().
- You can learn more about the `OpenSearchHybridRetriever` in the [documentation]().
+ You can learn more about the `OpenSearchHybridRetriever` in the [documentation](https://docs.haystack.deepset.ai/docs/opensearchhybridretriever).
Thanks for the suggestions! This is all still mostly in "draft" mode. I want to get the documentation done first; I only opened the PR to get some feedback on the structure, since other SuperComponents are in the docs only and not in integrations.
Co-authored-by: Julian Risch <julian.risch@deepset.ai>