@adrianegraphene

Add support for ColBERT models via FastEmbed with MaxSim scoring.

Related Issues

Proposed Changes:

This PR adds FastembedColBERTRanker, a new ranker component that enables bi-encoder reranking with ColBERT models through the FastEmbed library.

What's included:

  • New FastembedColBERTRanker component in haystack/components/rankers/
  • Implements MaxSim scoring algorithm for token-level similarity matching
  • Supports FastEmbed's ColBERT models (e.g., colbert-ir/colbertv2.0, answerdotai/answerai-colbert-small-v1)
  • Full feature parity with existing rankers: prefix/suffix support, metadata embedding, score thresholds, device selection
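
For readers unfamiliar with MaxSim, here is a minimal, dependency-free sketch of the scoring idea: each query token is matched to its most similar document token, and those maxima are summed. The names `maxsim_score` and `rank_by_maxsim` are hypothetical and used only for illustration. This is not the code in this PR, and it assumes token embeddings are already computed (and typically L2-normalized) by the model.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best dot-product match among document tokens."""
    if not doc_tokens:
        # Assumption for this sketch: an empty document scores zero.
        return 0.0
    return float(sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens))


def rank_by_maxsim(
    query_tokens: Sequence[Vector], docs: Sequence[Sequence[Vector]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted by descending MaxSim score."""
    scores = [maxsim_score(query_tokens, doc) for doc in docs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


# Toy example with 2-dimensional token embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # matches both query tokens
doc_b = [[1.0, 0.0]]                          # matches only the first query token
ranking = rank_by_maxsim(query, [doc_a, doc_b])
```

In a real ranker the per-token embeddings would come from the ColBERT model, and a matrix library would replace the nested loops. The plain-Python version is only meant to make the token-level matching explicit.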

How did you test it?

  pytest test/components/rankers/test_fastembed_colbert.py -m "not integration" -v

Notes for the reviewer

I just worked on something similar and wanted to get the patterns out of my head while it was fresh. I saw this issue needed contributions, so decided I'd help. Happy to fix up / add / remove anything at all.

Checklist

  • I have read the contributors guidelines and the code of conduct.
  • I have updated the related issue with new insights and changes.
  • I have added unit tests and updated the docstrings.
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I have documented my code.
  • I have added a release note file, following the contributors guidelines.
  • I have run pre-commit hooks and fixed any issue.

Addresses issue deepset-ai#8245
@adrianegraphene adrianegraphene requested a review from a team as a code owner January 27, 2026 19:32
@adrianegraphene adrianegraphene requested review from anakin87 and Copilot and removed request for a team and Copilot January 27, 2026 19:32
@vercel

vercel bot commented Jan 27, 2026

@adrianegraphene is attempting to deploy a commit to the deepset Team on Vercel.

A member of the Team first needs to authorize it.

@CLAassistant

CLAassistant commented Jan 27, 2026

CLA assistant check
All committers have signed the CLA.

Member

@anakin87 anakin87 left a comment

Hey @adrianegraphene, thanks for contributing!

We already have an integration with FastEmbed here: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/fastembed

So I recommend closing this PR and opening another one targeting the Haystack Core Integrations repository.

Development

Successfully merging this pull request may close these issues.

feat: support bi-encoder models in TransformerSimilarityRanker

3 participants