
Mini-Inference Engine

English | 简体中文 | Docs


Mini-Inference Engine is a CUDA GEMM optimization learning project that bundles progressively optimized matrix-multiplication kernels, a lightweight inference runtime, and profiling-oriented benchmarks in a single repository.
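As a rough sketch of what the kernels compute, here is a plain C++ reference GEMM (C = A × B, row-major) of the kind CUDA learning projects typically validate their kernels against; the function name and the row-major convention are assumptions for illustration, not taken from this repository.

```cpp
#include <cstddef>
#include <vector>

// Naive reference GEMM: C = A * B, where A is M x K, B is K x N,
// all stored row-major. Optimized CUDA kernel outputs are commonly
// checked element-wise against a triple loop like this.
void gemm_reference(const std::vector<float>& A,
                    const std::vector<float>& B,
                    std::vector<float>& C,
                    std::size_t M, std::size_t N, std::size_t K) {
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                acc += A[i * K + k] * B[k * N + j];
            }
            C[i * N + j] = acc;
        }
    }
}
```

Each optimization level in such a project (tiling, shared memory, vectorized loads, and so on) must still produce the same result as this baseline, which is why a slow but obviously correct version is useful to keep around.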

Repository Overview

  • CUDA kernels and runtime headers in src/ and include/
  • Technical docs under docs/
  • Benchmarks and demos in benchmarks/
  • GitHub Pages site for documentation entry, reading paths, and project updates

Quick Start

cmake --preset release
cmake --build --preset release
./build/release/benchmark
./build/release/tests
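The `--preset release` flags above imply a CMakePresets.json at the repository root. A minimal fragment of the kind that would make those commands work might look like the following; the binary directory and cache variables are assumptions for illustration, not taken from the repository:

```json
{
  "version": 3,
  "configurePresets": [
    {
      "name": "release",
      "binaryDir": "${sourceDir}/build/release",
      "cacheVariables": { "CMAKE_BUILD_TYPE": "Release" }
    }
  ],
  "buildPresets": [
    { "name": "release", "configurePreset": "release" }
  ]
}
```

With a preset like this, `cmake --preset release` configures into build/release and `cmake --build --preset release` builds there, which matches the paths used by the benchmark and test commands above.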

Docs

  • Project docs: https://lessup.github.io/mini-inference-engine/
  • The site home page explains what to read first for architecture, optimization, and API details
  • See CONTRIBUTING.md for contribution workflow

License

MIT License

About

Mini Deep Learning Inference Engine (CUDA + C++17): 7-Level GEMM Optimization, FP16/INT8 Quantization & Auto-Tuner
