Add Mistral AI backend support with abstraction layer #64
Closed
ddulic wants to merge 27 commits into allenporter:main from
Conversation
Introduces an AIService abstraction layer so the server can use either Google Gemini or Mistral AI for OCR, embeddings, and summary generation. The active backend is selected at startup based on which API key is configured (Mistral takes precedence when both are set).

- Add AIService abstract base class (ocr_image, embed_text, generate_json)
- Add MistralService implementing AIService via the mistralai SDK
- Refactor GeminiService to inherit AIService; accept model names at construction instead of per-call; add high-level AIService methods
- Replace GeminiOcrModule/GeminiEmbeddingModule with provider-agnostic OcrModule/EmbeddingModule accepting AIService
- Update SummaryModule and SearchService to use AIService
- Add Mistral config fields with env var support (SUPERNOTE_MISTRAL_API_KEY, SUPERNOTE_MISTRAL_OCR_MODEL, SUPERNOTE_MISTRAL_EMBEDDING_MODEL, SUPERNOTE_MISTRAL_CHAT_MODEL, SUPERNOTE_MISTRAL_MAX_CONCURRENCY)
- Add mistralai>=1.0.0 to server dependencies
- Update all tests to use the new AIService-based mocks

Note: switching AI providers invalidates stored embeddings; all notes will need to be re-processed after a provider change.
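The shape of the abstraction described above can be sketched as follows. The class names, the three method names, and the "Mistral wins when both keys are set" rule come from this PR; the exact signatures, the `Config` dataclass, and the `select_backend` helper are assumptions for illustration (the stub bodies are placeholders, not the real implementations).

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class AIService(ABC):
    """Provider-agnostic AI backend interface (method names from the PR)."""

    @abstractmethod
    def ocr_image(self, image: bytes) -> str: ...

    @abstractmethod
    def embed_text(self, text: str) -> list[float]: ...

    @abstractmethod
    def generate_json(self, prompt: str) -> str: ...


class GeminiService(AIService):
    # Stub bodies for illustration only.
    def ocr_image(self, image: bytes) -> str: return ""
    def embed_text(self, text: str) -> list[float]: return []
    def generate_json(self, prompt: str) -> str: return "{}"


class MistralService(AIService):
    def ocr_image(self, image: bytes) -> str: return ""
    def embed_text(self, text: str) -> list[float]: return []
    def generate_json(self, prompt: str) -> str: return "{}"


@dataclass
class Config:
    gemini_api_key: str = ""
    mistral_api_key: str = ""


def select_backend(config: Config) -> AIService:
    """Pick the backend at startup; Mistral takes precedence when both are set."""
    if config.mistral_api_key:
        return MistralService()
    if config.gemini_api_key:
        return GeminiService()
    raise ValueError("no AI provider API key configured")
```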
- Switch MistralService.ocr_image() from Pixtral chat completions to the dedicated mistral-ocr-latest API (client.ocr.process_async)
- Update the default mistral_ocr_model config to mistral-ocr-latest
- Validate non-empty embeddings before persisting in EmbeddingModule
- Accumulate PNG chunks in a list, then join once (avoids quadratic bytes concatenation)
- Guard against zero-norm query embeddings in SearchService to prevent NaN scores
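The chunk-accumulation fix above is a standard pattern; a minimal sketch (the function name is hypothetical, not from the PR):

```python
def assemble_png(chunks: list[bytes]) -> bytes:
    """Collect chunks in a list and join once at the end.

    Repeated `buf += chunk` copies the entire growing buffer on every
    append, giving quadratic total work over n chunks; a single
    b"".join() allocates the result once and is linear.
    """
    parts: list[bytes] = []
    for chunk in chunks:
        parts.append(chunk)
    return b"".join(parts)
```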
- Add a gemini_chat_model config field (SUPERNOTE_GEMINI_CHAT_MODEL, default gemini-2.0-flash) and fix a bug where gemini_ocr_model was incorrectly used for summary/chat generation
- Raise ValueError in GeminiService.embed_text() when values is empty/None, consistent with MistralService and the EmbeddingModule guard added earlier
- Remove the unused config parameter from the SearchService constructor and call sites
- Document in MistralService.ocr_image() why the prompt is intentionally unused
- Remove unused logger/logging imports from GeminiService and MistralService (they would fail ruff's unused-variable check in CI)
- Fix the test_summary_module assertion to use call_args.kwargs["prompt"], since generate_json is called with keyword arguments, leaving call_args.args empty
…tent fallback

- Guard against zero-norm candidate embeddings in SearchService cosine similarity to prevent NaN/inf scores, mirroring the existing query guard
- Change the generate_json non-string fallback from "" to str(content), so unexpected response shapes produce something inspectable rather than a silent empty string
- Add unit tests for MistralService covering OCR, embeddings, JSON generation, error paths, and concurrency limiting via semaphore
- README: update the tagline, pipeline description, and quick start to show both Gemini and Mistral as provider options
- server/README: update the feature description and prerequisites, and expand the AI configuration section with full env var tables for both providers, including the embedding dimension switching note
- note_processing_design: replace Gemini-specific references with provider-agnostic language
…lock

- Clamp max_concurrency to a minimum of 1 in the GeminiService and MistralService constructors so a bad value can never create a blocking semaphore
- Log a warning and clamp to 1 when SUPERNOTE_GEMINI_MAX_CONCURRENCY or SUPERNOTE_MISTRAL_MAX_CONCURRENCY is set to a value < 1
- Document the minimum value of 1 in the server README config tables
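The clamp-and-warn behavior described above can be sketched as follows (the helper name is hypothetical; the PR does this inside the service constructors):

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


def make_limiter(max_concurrency: int) -> asyncio.Semaphore:
    """Build the concurrency limiter, refusing values below 1.

    asyncio.Semaphore(0) starts with no permits, so every request would
    block forever waiting to acquire it; clamping to 1 guarantees the
    semaphore can always be acquired, and the warning surfaces the
    misconfigured env var instead of silently hiding it.
    """
    if max_concurrency < 1:
        logger.warning(
            "max_concurrency=%d is below the minimum of 1; clamping to 1",
            max_concurrency,
        )
        max_concurrency = 1
    return asyncio.Semaphore(max_concurrency)
```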
Mirrors the earlier SearchService cleanup: config was assigned but never read anywhere in the module after the Gemini-specific refactor. Removed it from the constructor, the app.py call site, and both test fixtures.
…arnings

- Handle missing/empty pages in MistralService.ocr_image() with safe getattr access and a per-page markdown guard, returning "" instead of raising TypeError
- Use compact json.dumps separators in generate_json() to reduce prompt token usage when sending the schema to the model
- Log a warning (instead of silently passing) when SUPERNOTE_GEMINI_MAX_CONCURRENCY or SUPERNOTE_MISTRAL_MAX_CONCURRENCY contains a non-integer value
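A small illustration of the separators change: `json.dumps` defaults to `", "` and `": "` separators, and dropping the spaces shrinks the serialized schema sent in the prompt. The schema below is a stand-in, not the project's actual schema.

```python
import json

# Stand-in schema; the real one is whatever generate_json() sends.
schema = {"type": "object", "properties": {"summary": {"type": "string"}}}

default = json.dumps(schema)                          # ", " and ": " separators
compact = json.dumps(schema, separators=(",", ":"))   # no whitespace at all

# Same JSON value, fewer characters (and so fewer prompt tokens).
assert json.loads(compact) == json.loads(default)
assert len(compact) < len(default)
```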
- Update the import from 'from mistralai import Mistral' to 'from mistralai.client import Mistral' for mistralai>=2.0.0 compatibility
- Bump the pyproject.toml constraint to mistralai>=2.0.0
- Fix test_full_processing_pipeline_with_real_file: generate_json was mocked to return '{}' (no segments), but the assertion expected summary content from the OCR text. Fix the mock to return a proper segments JSON and assert against it.
…alidation, and search

- Add test_gemini.py: full coverage for GeminiService (previously completely untested): is_configured, provider_name, ocr_image, embed_text with empty/missing values, generate_json, max_concurrency clamping, and concurrency limit enforcement
- test_mistral.py: add edge cases for empty/missing OCR pages, pages without markdown, generate_json with non-string content (json.dumps path), empty choices, and max_concurrency clamping to 1
- test_embedding.py: verify EmbeddingModule marks the task FAILED when the AI returns an empty vector
- test_search.py: verify zero-norm candidate embeddings are skipped and don't produce NaN scores
- CONTRIBUTING.md: show both Gemini and Mistral API key options for local dev
- note_processing_design.md: replace the GeminiAPIError example with a generic ValueError
- PLAN.md: update Phase 5 to reflect the AIService abstraction, MistralService, and the renamed OcrModule/EmbeddingModule (formerly GeminiOcrModule/GeminiEmbeddingModule)
- Remove stale type: ignore[arg-type] comments in gemini.py and mistral.py
- Use the proper UserMessage type for Mistral chat messages, with a cast to satisfy mypy's list invariance requirement
- Handle Optional[List[float]] from the Mistral embedding response
- Add method-assign to the type: ignore comments on AsyncMock assignments in test_gemini.py and test_mistral.py
Switch to more standard Python web server defaults (8000 for the main server, 8001 for the MCP server) to avoid conflicts with the commonly used 8080/8081 ports.
- Re-raise the exception in SummaryModule.process() after logging, so the task is marked FAILED (not COMPLETED) when AI summary generation fails
- Update the mistral_api_key docstring to mention summaries alongside OCR and embeddings
- Remove partial API key characters from log output for both the Gemini and Mistral API keys to avoid inadvertent credential exposure
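The log-then-re-raise pattern from the first bullet can be sketched generically; the helper name is hypothetical and this is not the PR's actual SummaryModule code:

```python
import logging

logger = logging.getLogger(__name__)


def run_step(work):
    """Log the failure for diagnostics, then re-raise.

    The bare `raise` lets the exception propagate to the task tracker,
    which marks the task FAILED. Logging without re-raising would let
    control fall through and the task would look COMPLETED even though
    no summary was produced.
    """
    try:
        return work()
    except Exception:
        logger.exception("AI summary generation failed")
        raise
```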
Add Mistral as an alternative AI backend
Fix Dockerfile ports, add Docker Compose, and update README
The Dockerfile already sets SUPERNOTE_PORT=8000 but the config.py default was still 8080, causing a mismatch when running outside Docker.
Fix default port to match Dockerfile (8080 -> 8000)
Author

Closing to reopen from a feature branch instead of main.
Summary
New config options
Port change
Default ports are now 8000 (API) and 8001 (file server), changed from 8080/8081. Update any existing configurations, reverse proxies, or firewall rules accordingly.
This is largely personal preference: 8080 and 8081 are very common ports and are already in use on my machine.
Robustness improvements
Important note on provider switching
Mistral embeddings are 1024-dimensional vs Gemini's 3072-dimensional. Switching providers invalidates all stored embeddings; notes will need to be re-processed after a provider change. Mixing embeddings from different providers in the same search index is not supported.
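A guard like the following can catch stale vectors after a provider switch; the function name is hypothetical and this is a sketch, not code from the PR:

```python
def validate_embedding(vector: list[float], expected_dim: int) -> list[float]:
    """Reject vectors whose dimension doesn't match the active provider.

    E.g. 3072-dim Gemini leftovers after switching to 1024-dim Mistral.
    Even when dimensions happen to match, vectors from different
    providers live in unrelated spaces, so mixing them in one index is
    never meaningful; re-processing is the only safe option.
    """
    if len(vector) != expected_dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, expected {expected_dim}; "
            "re-process notes after switching AI providers"
        )
    return vector
```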
Note that I don't have my Supernote device yet (it arrives on Friday); I tested with the (empty?) test note file and it looks to be working.
Linting and tests are passing.
Full disclosure that Claude Code was used to write this, with assistance from Copilot in reviews.