This project is in an incredibly early, pre-alpha stage of development. It is currently a solo project. As such, you should expect:
- Bugs and Instability: The server may crash, commands might not work as expected, and data could be corrupted.
- Inconsistencies: The API and feature set are subject to frequent and breaking changes without notice.
- Incomplete Features: Many SQL features and Redis commands are either missing or only partially implemented.
This is not ready for any form of production use. It is a learning and experimentation project. Contributions are very much welcomed!
MemFlux is an experimental, high-performance, in-memory, multi-model database engine built in Rust. It aims to blend the speed and simplicity of key-value stores like Redis with the power and flexibility of SQL databases.
MemFlux is designed with a dual-purpose architecture: it can run as a standalone server (compatible with the Redis protocol) or be embedded directly into your applications as a library via a C-compatible FFI. This approach allows MemFlux to function as both a fast, networked database and a powerful, in-process database for languages like Python, C++, and more.
Core Features:
- Multi-Model Data: Natively supports:
- Bytes/Strings: Classic key-value operations.
- JSON Documents: Rich, schemaless JSON manipulation at the key or sub-path level.
- Lists & Sets: Redis-compatible list and set operations.
- Property Graph: A complete property graph model with nodes (labels, properties) and relationships (types, properties).
- Low-Level Table/Row Commands: A direct, Redis-style command interface (`TABLE.CREATE`, `ROW.SET`, etc.) for manipulating tabular data, complementing the SQL engine.
- Integrated Query Engines:
- SQL Query Engine: A feature-rich SQL engine for querying JSON and tabular data. Supports complex `SELECT`s, `JOIN`s, CTEs (`WITH RECURSIVE`), DML, DDL, and advanced constraints.
- Cypher Query Engine: A powerful, from-scratch engine for querying the property graph, supporting `MATCH`, `CREATE`, `MERGE`, `RETURN`, `DELETE`, `SET`, path variables, variable-length traversals (`-[:KNOWS*1..3]->`), and functions like `shortestPath()`.
- Seamless SQL & Graph Interoperability:
- Query graph data with SQL: Graph nodes and relationships are automatically exposed as virtual SQL tables.
- Query SQL data with Cypher: SQL tables with foreign keys can be traversed as if they were graph nodes and relationships.
- `GRAPH_MATCH` in SQL: Embed Cypher queries directly inside your SQL `FROM` clause to perform hybrid queries that join graph results with SQL tables.
- Transactional Integrity with MVCC: Provides ACID-like properties with Snapshot Isolation using a Multi-Version Concurrency Control (MVCC) architecture. This allows for non-blocking reads and safe, concurrent writes across all data models.
- Dual-Mode Operation:
- Standalone Server: Run as a TCP server with a Redis-compatible (RESP) protocol.
- Embedded Library: Integrate directly into your application via a C-compatible Foreign Function Interface (FFI) for zero-latency, in-process database operations.
- Configurable Durability & Persistence: Persistence can be enabled or disabled. When enabled, durability is achieved through a Write-Ahead Log (WAL) and periodic snapshotting, with configurable durability levels (`fsync`, `full`).
- Secondary Indexing: Create indexes on JSON fields to dramatically accelerate SQL query performance.
- Configurable Memory Management: Set a `maxmemory` limit and choose from multiple eviction policies (`LRU`, `LFU`, `ARC`, `LFRU`, `Random`) to control memory usage.
- TLS Encryption: Secure client connections with TLS when running in server mode.
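To make the eviction idea concrete, here is a toy sketch of an LRU policy in Python. This is a generic illustration of the technique, not MemFlux's actual Rust implementation; the class and limit names are invented for the example:

```python
from collections import OrderedDict

class LRUCache:
    """Toy key-value store with a max-entry limit and LRU eviction."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.max_entries:
            self.data.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(max_entries=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # touch "a", so "b" is now the least recently used
cache.set("c", 3)  # exceeds the limit: "b" is evicted
print(sorted(cache.data))  # ['a', 'c']
```

A real engine tracks memory in bytes rather than entry counts, but the recency bookkeeping is the same shape.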
Note: As this is an alpha project, many of the features listed above are still under heavy development and may be incomplete or unstable.
While this README provides a quick start, the complete documentation contains a detailed reference for every command, SQL feature, and internal system.
---> Start with the Documentation Index <---
Key sections include:
- Configuration: How to configure the server, including memory limits, persistence, and TLS.
- Python Library Guide: A guide for using MemFlux as an embedded library in Python.
- Commands: Detailed reference for all non-SQL, Redis-style commands.
- SQL Reference: A comprehensive guide to the SQL engine, from DDL to complex `SELECT` queries.
MemFlux can be used in two primary ways: as a standalone server or as an embedded library.
In this mode, MemFlux runs as a background process and accepts client connections over the network using the Redis (RESP) protocol.
Running the Server:
- Build the server:

  ```shell
  cargo build --release
  ```

- Run the server binary:

  ```shell
  ./target/release/memflux-server
  ```
The server will start and listen on 127.0.0.1:8360.
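Because the wire format is RESP, every command a client sends is framed as an array of bulk strings. The framing can be sketched in a few lines of Python; this is an illustrative helper (not part of the project), and it assumes ASCII arguments so character length equals byte length:

```python
def encode_resp_command(*args):
    """Frame a command as a RESP array of bulk strings (ASCII args assumed)."""
    out = f"*{len(args)}\r\n"          # array header: number of arguments
    for arg in args:
        out += f"${len(arg)}\r\n{arg}\r\n"  # bulk string: length, then payload
    return out.encode()

# These are the bytes a client would write to the TCP socket on port 8360:
print(encode_resp_command("SET", "greeting", "hello"))
# b'*3\r\n$3\r\nSET\r\n$8\r\ngreeting\r\n$5\r\nhello\r\n'
```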
Connecting to the Server: You can connect using any Redis-compatible client. For interactive use, the included Python script is recommended:
```shell
# Run the interactive client
python3 test.py
```

In this mode, the database engine is loaded directly into your application's process, eliminating network overhead and providing direct, high-performance access.
The primary interface for this is the Python library included in `libs/python/`.
Using the Python Library:
- Build the dynamic library:

  ```shell
  cargo build --release
  ```

- Use the `memflux` Python module in your script:

  ```python
  import libs.python.memflux as memflux  # Python lib provided in the repo, relative to the project root
  import sys

  # Path to the compiled shared library
  if sys.platform == "win32":
      LIB_PATH = "./target/release/memflux.dll"
  elif sys.platform == "darwin":
      LIB_PATH = "./target/release/libmemflux.dylib"
  else:
      LIB_PATH = "./target/release/libmemflux.so"

  # Configuration for the database instance
  DB_CONFIG = {
      "persistence": True,
      "durability": "fsync",
      "wal_file": "memflux.wal",
      "wal_overflow_file": "memflux.wal.overflow",
      "snapshot_file": "memflux.snapshot",
      "snapshot_temp_file": "memflux.snapshot.tmp",
      "wal_size_threshold_mb": 128,
      "maxmemory_mb": 0,
      "eviction_policy": "lru",
      "isolation_level": "serializable",
  }

  # Connect to the database (loads it in-process)
  conn = memflux.connect(config=DB_CONFIG, lib=LIB_PATH)

  with conn.cursor() as cur:
      cur.execute("SQL CREATE TABLE products (id INT, name TEXT, price REAL)")
      cur.execute("SQL INSERT INTO products VALUES (?, ?, ?)", (1, 'Laptop', 1200.50))
      cur.execute("SQL SELECT name, price FROM products WHERE price > ?", (1000,))
      product = cur.fetchone()
      print(product)  # Output: {'name': 'Laptop', 'price': 1200.5}

  conn.close()
  ```
For a more detailed guide, see the Python Library Guide.
Note: This is a simplified explanation of an alpha-stage project. The implementation details are subject to change.
- Core Library (`src/lib.rs`): The core database logic is encapsulated in a Rust library. This library manages the in-memory storage (`DashMap`), persistence, indexing, and the SQL query engine. It exposes a high-level `MemFluxDB` struct.
- Server Binary (`src/main.rs`): The `memflux-server` binary is a lightweight wrapper around the core library. It handles TCP connections, TLS, and uses the RESP protocol to parse client requests, which it passes to the `MemFluxDB` instance.
- FFI Layer (`src/ffi.rs`): A C-compatible Foreign Function Interface exposes the core library's functionality, allowing it to be loaded and used directly by other languages (like Python's `ctypes` module) for in-process execution.
- Persistence Engine (MVCC): To prevent data loss and enable transactions, MemFlux uses a Multi-Version Concurrency Control (MVCC) model combined with a Write-Ahead Log (WAL).
- Versioning: Instead of overwriting data, every write operation creates a new version of a value, tagged with a transaction ID. A key now points to a chain of these versions.
- WAL: Every data-modifying command is first serialized and written to the WAL file (`memflux.wal`) on disk. This ensures that no acknowledged write is ever lost.
- Snapshots & Compaction: When the WAL file grows too large, a non-blocking snapshot of the current state of the database is written to disk. A background vacuum process cleans up old, no-longer-visible data versions to reclaim memory.
- Recovery: On startup, the server restores the last snapshot and replays any subsequent entries from the WAL to bring the database to a consistent, up-to-date state.
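The version-chain idea at the heart of MVCC can be sketched in a few lines of Python. This is a conceptual toy, not the project's actual Rust data structures; the `write`/`read` helpers and the transaction-ID visibility rule shown here are simplified assumptions:

```python
# Each key maps to a chain of (created_txn_id, value) versions.
store = {}

def write(key, value, txn_id):
    """Append a new version instead of overwriting the old one."""
    store.setdefault(key, []).append((txn_id, value))

def read(key, snapshot_txn_id):
    """Return the newest version visible to a snapshot taken at snapshot_txn_id."""
    visible = [v for (txn, v) in store.get(key, []) if txn <= snapshot_txn_id]
    return visible[-1] if visible else None

write("user:1", {"name": "Ada"}, txn_id=10)
write("user:1", {"name": "Ada Lovelace"}, txn_id=20)

# A reader whose snapshot predates txn 20 still sees the old version,
# so reads never block on concurrent writers:
print(read("user:1", snapshot_txn_id=15))  # {'name': 'Ada'}
print(read("user:1", snapshot_txn_id=25))  # {'name': 'Ada Lovelace'}
```

The vacuum process described above corresponds to pruning versions from these chains once no active snapshot can see them.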
- SQL Query Engine Pipeline:
- Parser: The raw SQL string is tokenized and parsed into an Abstract Syntax Tree (AST), which is a tree-like representation of the query structure.
- Logical Planner: The AST is converted into a Logical Plan. This plan represents the "what" of the query (e.g., "filter these rows, then join with that table"). During this phase, the query is validated against any existing virtual schemas.
- Physical Planner: The Logical Plan is transformed into a Physical Plan. This plan represents the "how" of the query execution. It can perform simple optimizations, such as deciding to use an index scan instead of a full table scan if a suitable index exists.
- Executor: The Physical Plan is executed using a Volcano-style iterator model. Each node in the plan (e.g., `TableScan`, `Filter`, `Join`) pulls rows from the node(s) below it, processes them, and passes the results up to the next node, until the final results are streamed back to the client.
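The pull-based executor described above can be mimicked with Python generators. This is a toy sketch of the Volcano model, not MemFlux's actual operators; the operator names and row format are invented for illustration:

```python
def table_scan(rows):
    """Leaf operator: yields raw rows one at a time."""
    yield from rows

def filter_op(child, predicate):
    """Pulls rows from its child and passes along only those matching the predicate."""
    for row in child:
        if predicate(row):
            yield row

def project(child, columns):
    """Pulls rows from its child and keeps only the requested columns."""
    for row in child:
        yield {c: row[c] for c in columns}

rows = [
    {"id": 1, "name": "Laptop", "price": 1200.5},
    {"id": 2, "name": "Mouse", "price": 25.0},
]

# Roughly the plan for: SELECT name FROM products WHERE price > 1000
plan = project(filter_op(table_scan(rows), lambda r: r["price"] > 1000), ["name"])
print(list(plan))  # [{'name': 'Laptop'}]
```

Because each operator only pulls one row at a time from below, results stream to the client without materializing intermediate tables.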
Disclaimer: This is a solo, pre-alpha project. The development process is informal. Contributions are highly encouraged!
For a deeper understanding of the project's architecture, see the Internals Documentation.
- The Rust programming language toolchain (`rustc`, `cargo`). You can install it from rust-lang.org.
- Python 3 for running the test suite.
You can build the project for debugging or release:
```shell
# For development (faster compile times, no optimizations)
cargo build

# For benchmarking or running "for real"
cargo build --release
```

The project includes a comprehensive test suite written in Python. It's the best way to validate changes and understand the expected behavior of different commands.
The test runner is the `test.py` script.
```shell
# First, start the server in a separate terminal
cargo run

# --- In another terminal ---

# Run all unit tests
python3 test.py unit all

# Run a specific test suite (e.g., just the SQL tests)
python3 test.py unit sql

# Available suites: byte, json, lists, sets, sql, snapshot, types,
# schema, aliases, case, like, subqueries, union, functions
```

MemFlux is an ambitious experiment to build a modern, multi-model database from the ground up in Rust. It explores the intersection of NoSQL flexibility and SQL's expressive power.
While the project is still in its infancy, it has a solid foundation with a working persistence layer and a surprisingly capable SQL engine.
Potential Future Directions:
- More robust query optimization.
- Expanded SQL syntax and function support.
- More complex data types.
- Improved concurrency control.
- Thorough documentation and stabilization of the API.
Again, this is an alpha project. Use it, break it, and please consider contributing! All feedback, bug reports, and pull requests are welcome.