sipemu/README.md

Simon Müller

Software Architect | Data Scientist | Ph.D. Mathematics

I design high-performance data systems in Rust and C++ and apply statistical and machine-learning methods to industrial planning problems. My work spans DuckDB extensions, time-series forecasting engines, and GenAI infrastructure (RAG, MCP servers, foundation model inference).

LinkedIn | Crates.io | DataZooDE


What I Do

  • Time-Series Forecasting -- Hierarchical, probabilistic, and intermittent-demand forecasting for supply chains, built as native DuckDB extensions.
  • Statistical Computing -- Production-grade regression, hypothesis testing, and causal inference in Rust, exposed through DuckDB and Polars.
  • GenAI & RAG Infrastructure -- Vector databases (HNSW/DiskANN), retrieval-augmented generation, and Model Context Protocol (MCP) servers for AI-assisted development.
  • Foundation Model Inference -- Pure-Rust inference engines for time-series models, targeting edge and WASM deployment without Python dependencies.
  • Enterprise Data Integration -- DuckDB extensions for SAP and API ecosystems, bridging legacy ERP systems with modern analytical workflows.
  • Inventory & Supply Chain Optimisation -- Stochastic inventory models and demand planning applications.

Business Impact

  • Forecast Accuracy & Speed -- anofox-forecast runs forecasts 2,900x faster than statsmodels, enabling near-real-time demand planning that reduces stockouts and excess inventory.
  • Inventory & Working Capital -- Stochastic inventory models optimise safety stock levels, freeing working capital while maintaining service levels.
  • Enterprise Data Accessibility -- erpl and flapi turn locked-away SAP/ERP data into queryable, API-accessible datasets, cutting integration timelines from months to days.
  • Data Quality & Trust -- Automated anomaly detection and validation (anofox-tabular) catches data issues before they reach dashboards and decisions.
  • AI-Ready Infrastructure -- RAG pipelines and vector search (Magpie) ground LLM responses in company knowledge, reducing hallucination and making GenAI safe for enterprise use.
  • Reduced Infrastructure Cost -- Pure-Rust inference (Chronos-2) and high-performance libraries (motif-rs, oxits-rs) eliminate Python overhead, cutting cloud compute costs and enabling edge deployment.
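
To make the safety-stock point above concrete, here is a minimal sketch of the textbook formula `z * sigma * sqrt(L)`. It is not taken from the inventory projects listed here: the function name `safety_stock`, the assumption of independent, normally distributed per-period demand with a fixed lead time, and the example numbers are all illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def safety_stock(service_level: float, demand_std: float, lead_time: float) -> float:
    """Safety stock for a target cycle service level (hypothetical helper).

    Assumes independent, normally distributed demand per period and a
    fixed lead time measured in the same periods as demand_std.
    """
    if not 0.0 < service_level < 1.0:
        raise ValueError("service_level must be strictly between 0 and 1")
    if demand_std < 0.0 or lead_time < 0.0:
        raise ValueError("demand_std and lead_time must be non-negative")
    # Standard normal quantile for the target service level
    z = NormalDist().inv_cdf(service_level)
    # Demand variance accumulates linearly over the lead time,
    # so the standard deviation scales with sqrt(lead_time)
    return z * demand_std * sqrt(lead_time)


if __name__ == "__main__":
    # Hypothetical inputs: 95% service level, 20 units/day std dev, 9-day lead time
    print(round(safety_stock(0.95, 20.0, 9.0), 1))
```

Raising the service level increases the quantile `z` and therefore the stock held, which is the trade-off between working capital and service level described above. Real demand is often intermittent or skewed, in which case a normal approximation like this one may be a poor fit.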

Featured Projects

Data Engineering & Integration

| Project | Highlight | Stack |
| --- | --- | --- |
| flapi | DuckDB-powered API gateway with MCP server and VS Code extension | C++, DuckDB |
| erpl | DuckDB extension bridging SAP systems via RFC | C++, DuckDB |
| dbt-lineage-viewer | Fast CLI for visualising dbt model lineage | Rust |
| anofox-tabular | Anomaly detection, validation, and data preparation in DuckDB | C++, DuckDB |

Time Series & Forecasting

| Project | Highlight | Stack |
| --- | --- | --- |
| anofox-forecast | 2,900x faster than statsmodels; DuckDB community extension | C++, Rust, DuckDB |
| Chronos-2 | Pure-Rust re-implementation of Amazon's Chronos-2 time-series foundation model | Rust, Candle |
| oxits-rs | Time-series classification and transformation library; a port of pyts | Rust |
| motif-rs | High-performance matrix profile library; 3x to 63x faster than stumpy | Rust |

GenAI & RAG

| Project | Highlight | Stack |
| --- | --- | --- |
| Magpie | Vector DB and RAG engine with HNSW, hybrid retrieval, AST-aware chunking | Rust |

Statistical Computing & Operations Research

| Project | Highlight | Stack |
| --- | --- | --- |
| polars-statistics | High-performance statistical testing and regression for Polars | Rust, Python |
| Inventory Optimisation | Stochastic inventory models for demand planning | Rust |
| fdars | Functional data analysis (FDA) algorithms: depth measures, clustering, smoothing, regression | Rust, R |

flapi and erpl are DataZooDE projects.

R Packages

| Package | Description |
| --- | --- |
| fdars-r | Functional Data Analysis R package with Rust backend |
| eventstudy | Financial event study analysis |
| case-based-reasoning | Case-based reasoning using machine learning methods |

Tech Stack

Core: Rust, C++, Python, R

Data & ML: DuckDB, Polars, Apache Arrow

Infrastructure: Docker, AWS, GitHub Actions


Pinned

  1. DataZooDE/anofox-statistics

    A DuckDB extension for statistical regression analysis, providing OLS, Ridge, WLS, and time-series regression capabilities with full diagnostics and inference directly in SQL.

    Rust, 7 stars, 1 fork

  2. DataZooDE/anofox-forecast

    Statistical time-series forecasting in DuckDB

    C++, 26 stars, 3 forks

  3. anofox-statistics-rs

    Statistical tests in Rust

    Rust, 4 stars

  4. anofox-regression

    Regression analysis in Rust.

    Rust, 3 stars

  5. anofox-forecast

    Time-series forecasting in Rust

    Rust, 3 stars

  6. fdars

    Functional Data Analysis in R and Rust: high-performance FDA algorithms including depth measures, metrics, clustering, smoothing, and regression

    Rust