refactor(codec): simplify type-erased handlers #45

Merged
martsokha merged 14 commits into main from feat/ocr-spans on Mar 10, 2026

@martsokha (Member)

No description provided.

…ayers

- Merge SpanEdit into Span and SpanEditStream into SpanStream
- Remove Document<H> wrapper, fold source() into Handler trait
- Replace AnyImage/AnyAudio enums with BoxedImageHandler/BoxedAudioHandler
- Remove span_id generic from ImageRedaction and AudioRedaction
- Remove Concrete marker trait, use explicit DynHandler impls with Handler supertrait
- Make document span/stream modules private, expose only via re-exports

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
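
The move from per-format enums to boxed handlers can be sketched roughly as follows. This is a minimal illustration, not the crate's code: the trait methods (`source()` returning `&str`, `dimensions()`), `PngHandler`, and its fields are hypothetical, and the sketch skips the `DynHandler` bridge that this commit still used.

```rust
use std::fmt::Debug;

/// Hypothetical base trait. `source()` lives here rather than on a
/// separate `Document<H>` wrapper.
pub trait Handler: Debug {
    fn source(&self) -> &str;
}

/// Hypothetical image trait. `Handler` is a supertrait, so every image
/// handler also exposes `source()`.
pub trait ImageHandler: Handler {
    fn dimensions(&self) -> (u32, u32);
}

/// Hypothetical concrete handler, used only for illustration.
#[derive(Debug)]
pub struct PngHandler {
    source: String,
    width: u32,
    height: u32,
}

impl PngHandler {
    pub fn new(source: impl Into<String>, width: u32, height: u32) -> Self {
        Self { source: source.into(), width, height }
    }
}

impl Handler for PngHandler {
    fn source(&self) -> &str {
        &self.source
    }
}

impl ImageHandler for PngHandler {
    fn dimensions(&self) -> (u32, u32) {
        (self.width, self.height)
    }
}

/// Type-erased wrapper: one boxed trait object instead of an enum with a
/// variant per format.
#[derive(Debug)]
pub struct BoxedImageHandler(Box<dyn ImageHandler>);

impl BoxedImageHandler {
    pub fn new(handler: impl ImageHandler + 'static) -> Self {
        Self(Box::new(handler))
    }

    pub fn source(&self) -> &str {
        self.0.source()
    }

    pub fn dimensions(&self) -> (u32, u32) {
        self.0.dimensions()
    }
}
```

With this shape, supporting a new format means adding a trait impl rather than a new enum variant and a new match arm in every method.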
@martsokha martsokha self-assigned this Mar 10, 2026
@martsokha martsokha added the feat label (request for or implementation of a new feature) Mar 10, 2026
martsokha and others added 13 commits March 10, 2026 09:22
…e visibility

- Extract shared audio handler boilerplate into impl_audio_handler! macro
- Fix stale doc comments referencing removed view_spans/edit_spans methods
- Make all handler struct fields private (were inconsistently pub(crate))
- Add new() constructors to JsonHandler and CsvHandler, update loaders
- Remove unused as_*/into_* accessor boilerplate from AnyText and AnyRich
- Add TODO comment for unimplemented audio redaction stub

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
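
A macro like `impl_audio_handler!` might look something like the sketch below. Only the macro name comes from the commit message; the trait methods, the macro arguments, and the `WavHandler`/`Mp3Handler` names are assumptions made for illustration. It also shows the private-fields-plus-`new()` convention the commit describes.

```rust
/// Hypothetical trait standing in for the crate's audio handler trait.
pub trait AudioHandler {
    fn format(&self) -> &'static str;
    fn bytes(&self) -> &[u8];
}

/// Generates a struct with a private field, a `new()` constructor, and
/// the `AudioHandler` impl, so each format is declared in one line.
macro_rules! impl_audio_handler {
    ($name:ident, $format:literal) => {
        pub struct $name {
            // Private: callers go through `new()` and the trait methods.
            bytes: Vec<u8>,
        }

        impl $name {
            pub fn new(bytes: Vec<u8>) -> Self {
                Self { bytes }
            }
        }

        impl AudioHandler for $name {
            fn format(&self) -> &'static str {
                $format
            }

            fn bytes(&self) -> &[u8] {
                &self.bytes
            }
        }
    };
}

// Hypothetical formats; the real handler list is not shown in this PR page.
impl_audio_handler!(WavHandler, "wav");
impl_audio_handler!(Mp3Handler, "mp3");
```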
- Rename AnyDocument → Document, PdfHandler → RichTextHandler
- Delete DocxHandler stub, replaced by format-agnostic RichTextHandler
- Inline AudioSpanId into AudioHandler trait, removing associated type
- Eliminate DynAudioHandler — AudioHandler is now directly object-safe
- Add method-level docs to all handler and Dyn* traits

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
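
Why dropping the associated type removes the need for a `Dyn*` bridge can be shown with a small sketch. With an associated `SpanId` type, every `dyn AudioHandler` would have to name that type; with a concrete `AudioSpanId` in the signatures, a plain `Box<dyn AudioHandler>` works. The method names, the handler structs, and the internals of `AudioSpanId` below are placeholders, not the crate's definitions.

```rust
/// Placeholder span ID; the real type's layout is not assumed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AudioSpanId(pub u32);

/// Hypothetical trait shape: the ID type appears directly in the
/// signature, so there is no `type SpanId` to pin down per trait object.
pub trait AudioHandler {
    fn span_ids(&self) -> Vec<AudioSpanId>;
    fn duration_ms(&self) -> u64;
}

pub struct WavHandler {
    segments: u32,
    duration_ms: u64,
}

impl WavHandler {
    pub fn new(segments: u32, duration_ms: u64) -> Self {
        Self { segments, duration_ms }
    }
}

impl AudioHandler for WavHandler {
    fn span_ids(&self) -> Vec<AudioSpanId> {
        (0..self.segments).map(AudioSpanId).collect()
    }

    fn duration_ms(&self) -> u64 {
        self.duration_ms
    }
}

pub struct Mp3Handler {
    duration_ms: u64,
}

impl Mp3Handler {
    pub fn new(duration_ms: u64) -> Self {
        Self { duration_ms }
    }
}

impl AudioHandler for Mp3Handler {
    fn span_ids(&self) -> Vec<AudioSpanId> {
        Vec::new()
    }

    fn duration_ms(&self) -> u64 {
        self.duration_ms
    }
}

/// Different concrete handlers share one collection without a bridge trait.
pub fn total_duration_ms(handlers: &[Box<dyn AudioHandler>]) -> u64 {
    handlers.iter().map(|h| h.duration_ms()).sum()
}
```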
…eHandler

- Remove associated type from ImageHandler, use ImageSpanId directly
- Eliminate DynImageHandler trait and impl_dyn_image! macro
- Simplify BoxedImageHandler to wrap Box<dyn ImageHandler> directly
- Move AudioSpanId and ImageSpanId to own files, wrap Option<u32>
- Remove ImageHandler impl and RichImageSpan from RichTextHandler
- Remove H::ImageId: Default bound from ImageTransform

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Derive Default for XlsxHandler to satisfy clippy::new_without_default
- Update quinn-proto 0.11.13 → 0.11.14 (RUSTSEC-2026-0037, DoS fix)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
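
For context, `clippy::new_without_default` fires when a type has a public zero-argument `new()` but no `Default` impl. A minimal sketch of the fix, with a hypothetical `sheets` field (the real `XlsxHandler` fields are not shown in this PR page):

```rust
/// Hypothetical handler layout, for illustration only.
#[derive(Debug, Default, PartialEq)]
pub struct XlsxHandler {
    sheets: Vec<String>,
}

impl XlsxHandler {
    /// Deriving `Default` above satisfies the lint; `new()` can simply
    /// delegate to it.
    pub fn new() -> Self {
        Self::default()
    }

    pub fn sheet_count(&self) -> usize {
        self.sheets.len()
    }
}
```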
…de_image

- Replace hand-written From/as_*/into_* on Document with derive_more
  (From, IsVariant, TryInto); inline Document into document/mod.rs
- Add Default derive to all 11 loader structs
- Move decode_image into ImageData::decode with built-in tracing
- Add utility methods to AudioSpanId/ImageSpanId (new, index, From<u32>)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
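
The span ID utilities could look roughly like this. The commits only state that the ID types wrap `Option<u32>` and gain `new`, `index`, and `From<u32>`; the exact signatures, the derives, and the meaning given to `None` here are assumptions.

```rust
/// Sketch of a span ID newtype over `Option<u32>`.
/// Assumption: `None` stands for "no specific span".
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq, Hash)]
pub struct ImageSpanId(Option<u32>);

impl ImageSpanId {
    /// Creates an ID that points at a concrete span index.
    pub const fn new(index: u32) -> Self {
        Self(Some(index))
    }

    /// Returns the wrapped index, if any.
    pub const fn index(&self) -> Option<u32> {
        self.0
    }
}

impl From<u32> for ImageSpanId {
    fn from(index: u32) -> Self {
        Self::new(index)
    }
}
```

Keeping the `Option` inside the newtype means callers never handle a bare `Option<u32>` and cannot confuse an image index with an audio index.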
Consolidate magic-byte detection in nvisy-core via `infer`. ContentData
gains `detect_mime()`, `document_type()` (caching), and
`infer_document_type()` (non-caching). nvisy-codec adds a `detect`
module with `decode()` that auto-dispatches ContentData to the right
loader with default params. Remove `infer` from nvisy-codec deps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
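
The caching versus non-caching split can be sketched with `std::sync::OnceLock`. This is not the crate's implementation: the real code delegates to the `infer` crate, while the sketch below hand-checks two well-known signatures as a stand-in, and the `DocumentType` variants, field names, and return types are assumptions.

```rust
use std::sync::OnceLock;

/// Hypothetical subset of document types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DocumentType {
    Pdf,
    Png,
    Unknown,
}

/// Hypothetical layout: raw bytes plus a lazily filled detection cache.
pub struct ContentData {
    bytes: Vec<u8>,
    cached_type: OnceLock<DocumentType>,
}

impl ContentData {
    pub fn new(bytes: Vec<u8>) -> Self {
        Self { bytes, cached_type: OnceLock::new() }
    }

    /// Non-caching: inspects the leading magic bytes on every call.
    pub fn infer_document_type(&self) -> DocumentType {
        if self.bytes.starts_with(b"%PDF-") {
            DocumentType::Pdf
        } else if self.bytes.starts_with(b"\x89PNG\r\n\x1a\n") {
            DocumentType::Png
        } else {
            DocumentType::Unknown
        }
    }

    /// Caching: runs detection at most once and reuses the stored result.
    pub fn document_type(&self) -> DocumentType {
        *self.cached_type.get_or_init(|| self.infer_document_type())
    }
}
```

`OnceLock` lets the cache fill through `&self`, so detection stays lazy without requiring a mutable borrow.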
Move the free `decode()` function into `Document::decode()` as an
associated method and delete the `detect` module. Detection tests
moved to nvisy-core alongside `infer_document_type`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reorganise nvisy-core modules to reflect their actual purpose:

- `fs` → `media`: format classification (ContentKind, DocumentType)
- `io` → `content`: data containers, metadata, source identity
- `path` → folded into `content`: ContentSource now lives alongside
  the content types that reference it
- ContentMetadata moved from media to content (it's content metadata,
  not format classification)
- content/content.rs renamed to content/bundle.rs to avoid clippy
  module name collision

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Same pattern as BoxedImageHandler, BoxedAudioHandler, and
BoxedRichHandler — a boxed trait object with DynTextHandler bridge
trait and impl_dyn_text! macro for each concrete handler. From impls
generated by the macro replace the hand-rolled enum variants.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
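
The bridge-trait pattern described above might be sketched like this. The names `BoxedTextHandler`, `DynTextHandler`, and `impl_dyn_text!` come from the commit message; the trait methods, the reason `TextHandler` is not usable as a trait object (a generic method, assumed here), and the concrete handler layouts are all hypothetical.

```rust
/// Hypothetical typed trait. The generic method keeps it from being used
/// as a plain `dyn TextHandler`.
pub trait TextHandler {
    fn for_each_line<F: FnMut(&str)>(&self, f: F);
}

/// Object-safe bridge: no generics, so `Box<dyn DynTextHandler>` is allowed.
pub trait DynTextHandler {
    fn dyn_lines(&self) -> Vec<String>;
}

/// Type-erased wrapper around any handler that has a bridge impl.
pub struct BoxedTextHandler(Box<dyn DynTextHandler>);

impl BoxedTextHandler {
    pub fn lines(&self) -> Vec<String> {
        self.0.dyn_lines()
    }
}

/// Generates the bridge impl and the `From` conversion for one handler.
macro_rules! impl_dyn_text {
    ($handler:ty) => {
        impl DynTextHandler for $handler {
            fn dyn_lines(&self) -> Vec<String> {
                let mut out = Vec::new();
                TextHandler::for_each_line(self, |line| out.push(line.to_owned()));
                out
            }
        }

        impl From<$handler> for BoxedTextHandler {
            fn from(handler: $handler) -> Self {
                BoxedTextHandler(Box::new(handler))
            }
        }
    };
}

/// Hypothetical plain-text handler.
pub struct TxtHandler {
    text: String,
}

impl TxtHandler {
    pub fn new(text: impl Into<String>) -> Self {
        Self { text: text.into() }
    }
}

impl TextHandler for TxtHandler {
    fn for_each_line<F: FnMut(&str)>(&self, mut f: F) {
        for line in self.text.lines() {
            f(line);
        }
    }
}

/// Hypothetical CSV handler; each row is reported as one comma-joined line.
pub struct CsvHandler {
    rows: Vec<Vec<String>>,
}

impl CsvHandler {
    pub fn new(rows: Vec<Vec<String>>) -> Self {
        Self { rows }
    }
}

impl TextHandler for CsvHandler {
    fn for_each_line<F: FnMut(&str)>(&self, mut f: F) {
        for row in &self.rows {
            f(&row.join(","));
        }
    }
}

impl_dyn_text!(TxtHandler);
impl_dyn_text!(CsvHandler);
```

Each new handler then needs one `impl_dyn_text!` line instead of a new enum variant plus matching arms.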
Remove 24 tests across nvisy-core and nvisy-codec that tested
auto-derived behavior (Display, FromStr, Serialize, From conversions,
trivial constructors) without exercising any meaningful logic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consumers now import `nvisy_codec::Span` instead of
`nvisy_codec::document::Span`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
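
A private module exposed only through re-exports can be sketched as below. The module layout mirrors what the commits describe (`document` with a private `span` submodule, plus a crate-root re-export); the `Span` fields and the `width()` method are hypothetical.

```rust
mod document {
    // Private submodule: code outside `document` cannot name `span`.
    mod span {
        /// Hypothetical span layout, for illustration only.
        #[derive(Debug, Clone, PartialEq)]
        pub struct Span {
            pub start: usize,
            pub end: usize,
        }

        impl Span {
            pub fn width(&self) -> usize {
                self.end - self.start
            }
        }
    }

    // The type is exposed through a re-export, not through its module path.
    pub use self::span::Span;
}

// Root-level re-export, so the type is reachable as `<crate>::Span`.
pub use document::Span;
```

Because callers never spell out the inner module path, the internal file layout can change later without breaking imports.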
@martsokha martsokha merged commit 50d8270 into main Mar 10, 2026
5 checks passed
@martsokha martsokha deleted the feat/ocr-spans branch March 10, 2026 20:11