Incompatible with the official spark-pinecone connector

From [here](https://github.com/pinecone-io/pinecone-text/blob/main/pinecone_text/sparse/bm25_encoder.py#L267), the indices array in pinecone-text holds 32-bit unsigned integers. However, the sparse vector indices in the official Pinecone Spark connector (see the README [here](https://github.com/pinecone-io/spark-pinecone/blob/main/README.md)) are expected to be Spark `IntegerType`. Spark's integers are 32-bit _signed_, so they top out at 2,147,483,647, while unsigned 32-bit values go up to 4,294,967,295. That means pinecone-text produces indices that overflow Spark's integer type and are therefore incompatible with the Pinecone Spark connector. I've verified this.
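To illustrate the mismatch, here's a minimal sketch in plain Python (no pinecone-text or Spark required). The helper names `fits_spark_integer` and `reinterpret_as_int32` are made up for this example, and the sample indices are arbitrary values, not actual encoder output.

```python
import struct

# Upper bound of a 32-bit signed integer (what Spark's IntegerType can hold).
INT32_MAX = 2**31 - 1


def fits_spark_integer(index: int) -> bool:
    """Return True if a non-negative index fits in a 32-bit signed integer."""
    return 0 <= index <= INT32_MAX


def reinterpret_as_int32(index: int) -> int:
    """Reinterpret the bits of a uint32 value as a signed int32 (two's complement)."""
    return struct.unpack("<i", struct.pack("<I", index))[0]


# Arbitrary sample values spanning the uint32 range (assumption: illustrative only).
sample_indices = [1_024, 2_147_483_647, 2_147_483_648, 3_000_000_000, 4_294_967_295]

for idx in sample_indices:
    print(idx, fits_spark_integer(idx), reinterpret_as_int32(idx))
```

Any index above 2,147,483,647 fails the check, and a naive bit-for-bit cast would turn it into a negative number rather than preserving the value.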

Any ideas on what to do here? One solution might be for spark-pinecone to change that schema from `IntegerType` to `LongType`, but since these are both official Pinecone projects, I figured y'all might have better success getting that change made.

Incompatible with the official spark-pinecone connector #71
