From here, the indices array in pinecone-text holds 32-bit unsigned integers. However, the sparse vector indices in the official Pinecone Spark connector (see the README here) are expected to be Spark IntegerType, and Spark's integers are 32-bit signed. That means pinecone-text can produce indices above 2,147,483,647 (2^31 - 1), which overflow Spark's integer type and are therefore incompatible with the Pinecone Spark connector. I've verified this.
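To illustrate the range mismatch, here is a minimal sketch in plain Python (no Spark or pinecone-text involved). The helper names and the sample index values are hypothetical and only meant to show where an unsigned 32-bit index stops fitting in a signed 32-bit integer.

```python
import struct

INT32_MIN = -2**31       # smallest value a 32-bit signed integer can hold
INT32_MAX = 2**31 - 1    # largest value a 32-bit signed integer can hold
UINT32_MAX = 2**32 - 1   # largest value a 32-bit unsigned integer can hold


def fits_signed_int32(index: int) -> bool:
    """Return True if the index is representable as a 32-bit signed integer."""
    return INT32_MIN <= index <= INT32_MAX


def reinterpret_as_int32(index: int) -> int:
    """Reinterpret the bits of a uint32 as a signed int32 (two's complement).

    This shows what a silent wraparound would look like; it is not a fix.
    """
    return struct.unpack("<i", struct.pack("<I", index))[0]


# Hypothetical index values, chosen to sit on either side of the boundary.
sample_indices = [12345, INT32_MAX, INT32_MAX + 1, UINT32_MAX]

for idx in sample_indices:
    print(idx, fits_signed_int32(idx), reinterpret_as_int32(idx))
```

Any index in the upper half of the unsigned range either fails the range check or wraps to a negative number, while a 64-bit signed type (which is what LongType is) can hold every uint32 value unchanged.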
Any ideas on what to do here? One solution might be for spark-pinecone to change that schema field from IntegerType to LongType, but since these are both official Pinecone projects, I figured y'all might have better success getting that change made.