Multimodal redaction runtime for sensitive data.
Detect and remove sensitive information across documents, images, and audio. Combines deterministic patterns, NER, computer vision, and LLM-driven classification into auditable, policy-driven pipelines built for regulated industries such as healthcare, legal, government, and financial services.
- Multimodal codecs: read, edit, and write PDF, DOCX, images, audio, CSV, JSON, and plain text through a unified span-based content model
- Layered detection: regex, dictionary, and checksum patterns run first at low cost; NER, OCR, object detection, and LLM classification handle what deterministic methods cannot
- Context-aware redaction: mask, replace, hash, encrypt, blur, block, and pixelate with policy-driven rules scoped to entity type, document class, and confidence threshold
- Pipeline engine: DAG compiler and executor with retry, timeout, and chunked context-window policies
- Python extensions: PyO3 bridge for speech-to-text, NER, and OCR via embedded Python
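
The layered detection and policy-driven redaction described above can be illustrated with a minimal sketch. This is not Nvisy's API: `Span`, `detect`, `redact`, and `POLICY` are hypothetical names chosen for illustration, and the example covers only the deterministic layer (regex plus a Luhn checksum) on plain text. It shows how a checksum can filter regex candidates cheaply, and how a policy keyed by entity type and confidence threshold might select the redaction action.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A detected region of text (hypothetical type, for illustration only)."""
    start: int
    end: int
    entity: str
    confidence: float


# Candidate patterns: 13 to 19 digits with optional space/hyphen separators,
# and a deliberately simple email pattern.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect(text: str) -> list[Span]:
    """Deterministic layer: regex candidates, with a checksum filter for cards."""
    spans = [Span(m.start(), m.end(), "email", 1.0) for m in EMAIL_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        if luhn_ok(re.sub(r"\D", "", m.group())):
            spans.append(Span(m.start(), m.end(), "card_number", 1.0))
    return sorted(spans, key=lambda s: s.start)


# Assumed policy shape: entity type -> (action, minimum confidence).
POLICY = {"email": ("replace", 0.5), "card_number": ("mask", 0.5)}


def redact(text: str, spans: list[Span], policy: dict = POLICY) -> str:
    """Apply the policy to each span, right to left so offsets stay valid."""
    out = text
    for s in sorted(spans, key=lambda s: s.start, reverse=True):
        rule = policy.get(s.entity)
        if rule is None or s.confidence < rule[1]:
            continue  # no rule for this entity, or below the threshold
        action = rule[0]
        if action == "mask":
            repl = "*" * (s.end - s.start)
        else:
            repl = f"[{s.entity.upper()}]"
        out = out[:s.start] + repl + out[s.end:]
    return out


if __name__ == "__main__":
    sample = "Contact jane@example.com, card 4111 1111 1111 1111."
    print(redact(sample, detect(sample)))
```

In this sketch a digit sequence that fails the checksum is never emitted as a span, so nothing downstream has to reason about it. A fuller pipeline would presumably pass the remaining text to NER or LLM classification, which is outside the scope of this example.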
The fastest way to get started is with Nvisy Cloud.
For self-hosted deployments, refer to docker/ for compose files and
infrastructure requirements, and .env.example for configuration.
See docs/ for architecture, security, and API documentation.
See CHANGELOG.md for release notes and version history.
Licensed under the Apache License 2.0. See LICENSE.txt.
- Documentation: docs.nvisy.com
- Issues: GitHub Issues
- Email: support@nvisy.com