- I want to index a long document with multi-vector embeddings.
- Say I have span annotations that define how the document should be chunked, and I want to apply late chunking to it. That is, I want to compute the embeddings of all tokens over the full document before I perform the chunking.
- This way I get contextualized chunks, each with a multi-vector representation.
Essentially, what I want to do is late chunking without the chunk-wise vector pooling at the end.
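
To make the goal concrete, here is a minimal sketch of the slicing step I have in mind. It uses random vectors as a stand-in for real token embeddings and has nothing to do with RAGatouille's API. The function name `late_chunk_multivector` and the `(start, end)` token-index span format are assumptions made up for illustration.

```python
import random


def late_chunk_multivector(token_embeddings, spans):
    """Slice full-document token embeddings into per-chunk matrices.

    token_embeddings: list of per-token vectors computed over the WHOLE document.
    spans: list of (start, end) token indices, end exclusive (assumed format).
    Returns one list of token vectors per span; no pooling is applied.
    """
    chunks = []
    for start, end in spans:
        if not (0 <= start < end <= len(token_embeddings)):
            raise ValueError(f"invalid span ({start}, {end})")
        # Keep every token vector in the span instead of averaging them.
        chunks.append(token_embeddings[start:end])
    return chunks


# Stand-in for real contextualized embeddings: 10 tokens, 4 dimensions each.
random.seed(0)
token_embeddings = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(10)]

# Hypothetical span annotations covering the document.
spans = [(0, 3), (3, 7), (7, 10)]

chunks = late_chunk_multivector(token_embeddings, spans)
for (start, end), chunk in zip(spans, chunks):
    print(f"span [{start}, {end}) -> {len(chunk)} token vectors of dim {len(chunk[0])}")
```

Each chunk keeps one vector per token, and every vector was computed with the full document as context. The part I am missing is how to hand these per-chunk matrices to the index.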
How do I do this with RAGatouille?
I know how to get multi-vector embeddings for my document, but I can't figure out how to index them and leverage the optimizations provided by RAGatouille.