[HELP] Can Ragatouille take chunks with their embeddings as input for indexing? #264

@BlueKiji77

  • I want to index a long document with multi-vector embeddings.
  • Say I have span annotations for how I want my document to be chunked, and I want to apply late chunking to it; that is, I want to compute the embeddings of all my tokens before I perform the chunking.
  • This way I have contextualized chunks with multi-vector representations.
    Essentially, what I want to do is late chunking without the chunk-wise vector pooling at the end.
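
To make the slicing step concrete, here is a minimal sketch in plain NumPy. It assumes the token embeddings were already computed over the whole document, and that each token comes with a character offset pair. The function name `late_chunk` and the `token_offsets` / `chunk_spans` layout are hypothetical, chosen only for illustration; nothing here is RAGatouille API.

```python
import numpy as np


def late_chunk(token_embeddings, token_offsets, chunk_spans):
    """Slice document-level token embeddings into per-chunk matrices.

    token_embeddings: (n_tokens, dim) array computed over the full document.
    token_offsets:    list of (char_start, char_end), one pair per token.
    chunk_spans:      list of (char_start, char_end) span annotations.

    Returns one (n_tokens_in_chunk, dim) array per span. No pooling is
    applied, so every chunk keeps its multi-vector representation.
    """
    chunks = []
    for span_start, span_end in chunk_spans:
        # Keep only tokens that lie entirely inside the annotated span.
        idx = [
            i
            for i, (tok_start, tok_end) in enumerate(token_offsets)
            if tok_start >= span_start and tok_end <= span_end
        ]
        chunks.append(token_embeddings[np.array(idx, dtype=int)])
    return chunks


if __name__ == "__main__":
    # Toy stand-in for real model output: 6 tokens, 4 dimensions.
    rng = np.random.default_rng(0)
    embeddings = rng.normal(size=(6, 4))
    offsets = [(0, 3), (4, 7), (8, 11), (12, 15), (16, 19), (20, 23)]
    spans = [(0, 11), (12, 23)]
    for chunk in late_chunk(embeddings, offsets, spans):
        print(chunk.shape)
```

Each returned matrix is one contextualized chunk. What is still missing is a way to hand these precomputed matrices to the index instead of raw text.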

How do I do this with RAGatouille?
I know how to get multi-vector embeddings for my document, but I can't figure out how to index them and still leverage the optimizations RAGatouille provides.
