Research - OCR & RAG for Indic languages

**Is your feature request related to a problem? Please describe.**
As part of the CBC project, we have use cases in several Indic languages, e.g. Telugu, Gujarati, Assamese, and Odia. Some of these are low-resource languages. The knowledge base documents are also poorly scanned and mix English with local languages. Hence we need to evaluate more models and providers to find the best trade-off between accuracy, latency, and cost.

**Describe the solution you'd like**
- Evaluate different OCR models, or combinations of them.
- Evaluate OpenAI and Gemini for improving the accuracy of RAG.
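
To compare OCR candidates on accuracy and latency, one option is a small harness that scores each model's output against hand-transcribed ground truth. The sketch below is only an illustration, not an agreed evaluation method: it assumes character error rate (CER) as the accuracy metric, and `ocr_fn` and the `(image_path, reference_text)` sample pairs are hypothetical placeholders for whichever OCR wrapper and labelled pages are used.

```python
import time
import unicodedata
from typing import Callable, Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / reference length, after NFC normalization.

    NFC normalization keeps equivalent encodings of the same Indic
    character sequence from being counted as errors.
    """
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def evaluate(ocr_fn: Callable[[str], str],
             samples: List[Tuple[str, str]]) -> Dict[str, float]:
    """Run a hypothetical OCR callable over (image_path, reference_text)
    pairs and report mean CER and mean wall-clock latency in seconds."""
    if not samples:
        raise ValueError("samples must not be empty")
    total_cer = 0.0
    total_seconds = 0.0
    for image_path, reference in samples:
        start = time.perf_counter()
        hypothesis = ocr_fn(image_path)
        total_seconds += time.perf_counter() - start
        total_cer += char_error_rate(reference, hypothesis)
    n = len(samples)
    return {"mean_cer": total_cer / n, "mean_latency_s": total_seconds / n}
```

Running `evaluate` once per candidate over the same labelled sample set would give comparable numbers per model. Cost is not covered here and would need to be tracked per provider separately.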

**Describe alternatives you've considered**
- Marker, Google Vision, Xerox, Tesseract for OCR
- OpenAI for retrieval



Research - OCR & RAG for Indic languages #607
