mention OpenSearchHybridRetrieval in OpenSearch integration page #327

Draft
wants to merge 2 commits into
base: main

davidsbatista
Contributor

No description provided.

@davidsbatista davidsbatista requested a review from a team as a code owner May 19, 2025 16:58
Member

@julian-risch julian-risch left a comment

I made some suggestions for the usage example. You could discuss with @bilgeyucel and @dfokina whether we want to include the OpenSearchHybridRetriever on the integration page with a usage example or only link to the documentation pages. The other retrievers aren't described on the integration page either. I suggest we at least link to the documentation pages of the document store and the three components.

top_k=10,
)

pipeline.run(query="What is the capital of France?")
Member

This usage example can't work: `Pipeline` is neither imported nor initialized, and the retriever is never added to the pipeline. There are also no documents in the document store.

You could extend the following. It's based on the example we used in the 2.13.0 release: https://github.com/deepset-ai/haystack/releases/tag/v2.13.0
Please add the embedder, adjust the dimensions param of the doc store, and test it.

# pip install haystack-ai opensearch-haystack datasets "sentence-transformers>=3.0.0"

from haystack import Document
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack_integrations.components.retrievers.opensearch import OpenSearchHybridRetriever
from haystack_integrations.document_stores.opensearch import OpenSearchDocumentStore
from datasets import load_dataset

dataset = load_dataset("HaystackBot/medrag-pubmed-chunk-with-embeddings", split="train")
docs = [Document(content=doc["contents"], embedding=doc["embedding"]) for doc in dataset]
document_store = OpenSearchDocumentStore()
document_store.write_documents(docs)

query = "What treatments are available for chronic bronchitis?"
result = OpenSearchHybridRetriever(document_store).run(...)  # add SentenceTransformersTextEmbedder with "BAAI/bge-small-en-v1.5"
print(result)
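
One way the `run(...)` placeholder above might be filled in is sketched below. It is a sketch under stated assumptions, not a tested example for the integration page: the host URL, the index name, and the three sample texts are hypothetical, and it assumes an OpenSearch instance reachable without authentication. Instead of the pre-embedded dataset, it embeds a few documents with the same model used for queries, so that the document embeddings match `embedding_dim`.

```python
# pip install haystack-ai opensearch-haystack "sentence-transformers>=3.0.0"

# Assumptions (not from the PR): host URL, index name, and sample texts.
OPENSEARCH_HOST = "http://localhost:9200"
INDEX_NAME = "hybrid_retriever_demo"
EMBEDDING_MODEL = "BAAI/bge-small-en-v1.5"
EMBEDDING_DIM = 384  # vector size produced by bge-small-en-v1.5

# Illustrative content only, standing in for a real corpus.
SAMPLE_TEXTS = [
    "Bronchodilators are commonly prescribed for chronic bronchitis.",
    "Pulmonary rehabilitation can help patients with chronic lung disease.",
    "Paris is the capital of France.",
]


def run_hybrid_example(query: str) -> list:
    """Index SAMPLE_TEXTS in OpenSearch and return the documents retrieved for `query`."""
    # Imported inside the function so that loading this module needs neither
    # the OpenSearch integration nor a model download.
    from haystack import Document
    from haystack.components.embedders import (
        SentenceTransformersDocumentEmbedder,
        SentenceTransformersTextEmbedder,
    )
    from haystack.document_stores.types import DuplicatePolicy
    from haystack_integrations.components.retrievers.opensearch import OpenSearchHybridRetriever
    from haystack_integrations.document_stores.opensearch import OpenSearchDocumentStore

    # The index dimension must match the embedding model's output size.
    document_store = OpenSearchDocumentStore(
        hosts=OPENSEARCH_HOST, index=INDEX_NAME, embedding_dim=EMBEDDING_DIM
    )

    # Embed the documents with the same model that will embed the query.
    doc_embedder = SentenceTransformersDocumentEmbedder(model=EMBEDDING_MODEL)
    doc_embedder.warm_up()
    docs = [Document(content=text) for text in SAMPLE_TEXTS]
    embedded_docs = doc_embedder.run(documents=docs)["documents"]
    # OVERWRITE keeps repeated runs from failing on duplicate document IDs.
    document_store.write_documents(embedded_docs, policy=DuplicatePolicy.OVERWRITE)

    # The hybrid retriever takes the text embedder and embeds the query itself.
    retriever = OpenSearchHybridRetriever(
        document_store=document_store,
        embedder=SentenceTransformersTextEmbedder(model=EMBEDDING_MODEL),
        top_k_bm25=3,
        top_k_embedding=3,
    )
    return retriever.run(query=query)["documents"]
```

With OpenSearch running, `run_hybrid_example("What treatments are available for chronic bronchitis?")` should return a list of `Document` objects. Nothing runs at import time, so the connection is only attempted when the function is called.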

pipeline.run(query="What is the capital of France?")
```

You can learn more about the `OpenSearchHybridRetriever` in the [documentation]().
Member

Suggested change
You can learn more about the `OpenSearchHybridRetriever` in the [documentation]().
You can learn more about the `OpenSearchHybridRetriever` in the [documentation](https://docs.haystack.deepset.ai/docs/opensearchhybridretriever).

@davidsbatista
Contributor Author

Thanks for the suggestions!

This is all still mostly in "draft" mode. I want to get the documentation done first; I only opened the PR to get some feedback on the structure, since other SuperComponents are covered in the docs only and not in the integrations.

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
@davidsbatista davidsbatista marked this pull request as draft May 20, 2025 09:02
2 participants