Awesome ACE-Step
A curated list of projects, tools, models, UIs, and resources for ACE-Step — the open-source music generation foundation model by ACE Studio and StepFun.
ACE-Step is a hybrid architecture combining a Language Model planner with a Diffusion Transformer to generate commercial-grade music from text prompts and lyrics. It runs locally on consumer hardware with as little as 4 GB VRAM, generating a full song in under 2 seconds on A100 or under 10 seconds on RTX 3090.
| Resource | Description |
| --- | --- |
| GitHub Repository (v1.5) | Latest codebase with Gradio UI, REST API, CLI, LoRA training. Mac, AMD, Intel, CUDA. |
| GitHub Repository (v1.0) | Original v1.0 codebase. |
| Project Page (v1.0) | Architecture overview, demos, and benchmarks. |
| Project Page (v1.5) | Hybrid LM + DiT architecture, new capabilities. |
| HuggingFace Space | Interactive online demo on HuggingFace Zero GPU. |
| HuggingFace Models | All official model weights, LoRAs, and spaces. |
| Discord | Community chat and support. |
DiT Models (Diffusion Transformer)
| Model | Steps | Quality | Speed | Features | Link |
| --- | --- | --- | --- | --- | --- |
| acestep-v15-turbo | 8 | Very High | Very Fast | text2music, cover, repaint | HF |
| acestep-v15-turbo-continuous | 8 | Very High | Very Fast | Optimized for streaming | HF |
| acestep-v15-sft | 50 | High | Medium | All features | HF |
| acestep-v15-base | 50 | Medium | Medium | All features, best for fine-tuning | HF |
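As a rough illustration of how the Steps column might guide a choice, the sketch below transcribes the table into a dictionary and filters it by a step budget. `DIT_MODELS` and `models_within_steps` are hypothetical names made up for this example; they are not part of the ACE-Step codebase, and the only data used is the step counts listed above.

```python
# Step counts transcribed from the DiT table above.
# The dictionary and helper are illustrative, not an ACE-Step API.
DIT_MODELS = {
    "acestep-v15-turbo": 8,
    "acestep-v15-turbo-continuous": 8,
    "acestep-v15-sft": 50,
    "acestep-v15-base": 50,
}


def models_within_steps(max_steps: int) -> list[str]:
    """Return the listed DiT models whose step count fits the budget."""
    return [name for name, steps in DIT_MODELS.items() if steps <= max_steps]


if __name__ == "__main__":
    # With a budget of 8 steps only the two turbo variants qualify.
    print(models_within_steps(8))
```

A step budget is only one axis; the Features column still decides whether a given model supports the task at hand.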
Language Models (Planner)
| Model | Base | VRAM | Capability | Link |
| --- | --- | --- | --- | --- |
| acestep-5Hz-lm-0.6B | Qwen3-0.6B | 6-8 GB | Lightweight | HF |
| acestep-5Hz-lm-1.7B | Qwen3-1.7B | 8-16 GB | Default, full features | HF |
| acestep-5Hz-lm-4B | Qwen3-4B | 16+ GB | Best quality, audio understanding | HF |
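The VRAM ranges above can be read as a simple lookup. The following is a minimal sketch, assuming the lower bound of each listed range is the deciding threshold (the ranges touch at 8 GB and 16 GB, so this picks the larger model at the boundary). `LM_PLANNERS` and `pick_lm_planner` are hypothetical names for illustration, not functions provided by ACE-Step.

```python
from typing import Optional

# (model name, lower bound of the VRAM range listed in the table, in GB),
# ordered from largest to smallest. Illustrative data, not an ACE-Step API.
LM_PLANNERS = [
    ("acestep-5Hz-lm-4B", 16),
    ("acestep-5Hz-lm-1.7B", 8),
    ("acestep-5Hz-lm-0.6B", 6),
]


def pick_lm_planner(vram_gb: float) -> Optional[str]:
    """Return the largest planner whose listed lower bound fits, else None."""
    for name, min_gb in LM_PLANNERS:
        if vram_gb >= min_gb:
            return name
    # The table lists no planner for budgets below 6 GB.
    return None


if __name__ == "__main__":
    for budget in (4, 6, 12, 24):
        print(budget, "GB ->", pick_lm_planner(budget))
```

Treat the thresholds as guidance from the table rather than hard limits; actual memory use may vary with settings and hardware.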
LoRA Adapters and Quantized Models
| Model | Type | Description | Link |
| --- | --- | --- | --- |
| ACE-Step-v1.5-chinese-new-year-LoRA | LoRA | Chinese folk instruments (dizi, erhu), festive style. Trained on 12 songs | HF |
| Serveurperso/ACE-Step-1.5-GGUF | GGUF | Full quantization suite (Q4-Q8, BF16) for acestep.cpp | HF |
| Project | Tech Stack | Highlights | Link |
| --- | --- | --- | --- |
| ace-step-ui (fspecii) | Node.js + Python | Spotify-inspired, dark/light modes, audio editor, stem extraction, video gen | GitHub |
| ace-step-studio (roblaughter) | React + FastAPI | Suno-style studio, create/library/player workflow, OpenAI-compatible LLM for lyrics, cover art gen | GitHub |
| Tadpole Studio | Next.js + FastAPI | AI DJ, Radio, Library, Playlists, LoRA training, HeartMuLa backend, 11 themes | GitHub |
| Ace-Step-Wrangler | Python + HTML/JS | DAW-inspired dark UI for musicians. Friendly sliders (Creativity, Strictly follow lyrics) instead of raw model params | GitHub |
| ace-step-ui.pinokio | Pinokio | One-click launcher for ace-step-ui (v1.5), auto backend + frontend | GitHub |
| ACE-Step-1.5-for-windows (sdbds) | Python + Windows | 936 Suno style tags with search/select; song parameter history; 4-language UI (EN/ZH/JA/KO); LoRA/LoKR training with GPU memory optimization | GitHub |
| ProdIA-MAX (ElWalki) | Node.js + Python | Fork of ace-step-ui with AI Chat Assistant (multi-LLM), Audio Codes conditioning, Voice Recorder + Whisper, Chord Progression Editor, Windows one-click setup | GitHub |
| ACE-Step-RADIO | Python | Continuous radio-style music stream powered by ACE-Step; auto-generates and plays songs back-to-back | GitHub |
| Project | Description | Link |
| --- | --- | --- |
| ComfyUI Native Support | ACE-Step 1.5 built into ComfyUI core. AIO and split model workflows | Docs |
| ComfyUI-AceMusic | 15-node full-featured integration: generation, cover, repaint, extend, edit, LoRA, HeartMuLa compatible | GitHub |
| ComfyUI_RH_ACE-Step | ComfyUI plugin for ACE-Step 1.5 generation | GitHub |
| scromfyUI-AceStep | 30+ specialized nodes: audio KSamplers with shift control, multi-API lyrics gen (Gemini/Groq/OpenAI/Claude), masking & inpainting | GitHub |
| ComfyUI-FL-AceStep-Training | LoRA training pipeline in ComfyUI: auto-label, tiled VAE, real-time loss charts | GitHub |
| Comfyui_SN_AceStepTrainer | LoRA training nodes for ACE-Step 1.5 inside ComfyUI | GitHub |
| ComfyUI-kaola-ace-step | ComfyUI custom nodes for ACE-Step music generation | GitHub |
| Project | Description | Link |
| --- | --- | --- |
| Side-Step | Standalone LoRA/LoKR toolkit for v1.5. Auto-detects variant, 8 GB VRAM training, interactive wizard + CLI | GitHub |
| ACE-Step-1.5-for-windows (sdbds) | LoRA and LoKR training with GPU memory offloading optimizations; integrated Gradio UI with style management and 4-language support | GitHub |
| ComfyUI-FL-AceStep-Training | End-to-end LoRA training inside ComfyUI with auto-labeling and live monitoring | GitHub |
| Ace-Step-1.5-Dataset-Manager | Desktop tool (Qt/C++) for editing LoRA training datasets: per-track caption, lyrics, BPM, key, audio preview | GitHub |
| Project | Description | Link |
| --- | --- | --- |
| acestep-captioner | 11B music captioning model (Qwen2.5 Omni). 1000+ instruments, timbre, structure analysis. Accuracy surpasses Gemini Pro 2.5 | HF |
| acestep-transcriber | Qwen2.5 Omni-based music transcription. Structure annotation, lyrics transcription, 50+ languages | HF |
Integrations and Extensions
| Project | Description | Link |
| --- | --- | --- |
| acestep.cpp | Portable C++17 / GGML implementation of ACE-Step 1.5. CPU, CUDA, Metal, Vulkan. Stereo 48 kHz WAV output | GitHub |
| ace-step-1.5 Docker | Docker image with models pre-baked (~15 GB). REST API server, RunPod template, CLI generation tool | GitHub |
| Generative Radio | Fully local AI radio station. Qwen3 generates prompts, ACE-Step 1.5 generates songs. Multi-listener, Apple Silicon optimized | GitHub |
| StemForge | Local GPU-accelerated audio workstation. Stem separation (Demucs, BS-Roformer), MIDI extraction, Stable Audio generation, ACE-Step composition, RVC voice conversion, mixing, and export, all in one browser UI | GitHub |
Open-Source Music Generation Landscape
A comparison of notable open-source music generation projects alongside ACE-Step.
| Project | Architecture | Capability | License | Link |
| --- | --- | --- | --- | --- |
| ACE-Step | LM + DiT | Text/lyrics → full song (vocal + BGM), cover, repaint, LoRA. <4 GB VRAM | Apache-2.0 | GitHub |
| YuE | LLaMA2 autoregressive | Lyrics → full song, multi-genre, multi-lingual, voice cloning, style transfer | Apache-2.0 | GitHub |
| AudioCraft / MusicGen | Autoregressive transformer | Text → music/audio, melody conditioning, style conditioning (JASCO) | MIT | GitHub |
| Amphion | Multiple (SVC, TTS, TTA) | Singing voice conversion, text-to-audio, vocoders, research toolkit | MIT | GitHub |
| Riffusion | Stable Diffusion (spectrograms) | Real-time text → music via spectrogram diffusion | MIT | GitHub |
| Stable Audio Tools | DiT + flow matching | Text → variable-length stereo audio (up to 47 s) | MIT | GitHub |
| DiffRhythm | Latent diffusion (DiT + VAE) | Lyrics → full-length song (up to 4 min 45 s) in ~10 s | Apache-2.0 | GitHub |
| HeartMuLa | LLM-based codec | Song gen, lyric recognition, audio codec, audio-text alignment | Apache-2.0 | GitHub |
| SongGeneration (LeVo) | Transformer-based | Lyrics → high-quality full song with multi-preference alignment (vocals + BGM) | Non-commercial | GitHub |
| Title | Topic | Link |
| --- | --- | --- |
| ACE-Step Prompt Guide | Detailed prompting tips: tags, lyrics structure, genre control | Ambience AI |
| Generate AI Music with ACE-Step 1.5 | Installation, generation, LoRA customization | DigitalOcean |
| ComfyUI ACE-Step 1.5 Guide | Official ComfyUI v1.5 workflow tutorial | Comfy.org |
| AMD ACE-Step 1.5 Local Guide | Running ACE-Step on AMD GPUs | PromptGalaxy |
| Running ACE-Step 1.5 on M2 Mac | Apple Silicon setup, MPS memory workarounds | BioErrorLog |
| Install ACE-Step 1.5 with UV | Git + UV package manager setup | PandaiTech |
| ACE-Step 1.5 DeepWiki | Architecture deep-dive, code walkthrough, Gradio UI internals | DeepWiki |
| ACE Studio | Professional AI music production suite | acestudio.ai |
| Paper | Version | Key Contribution | Link |
| --- | --- | --- | --- |
| ACE-Step: A Step Towards Music Generation Foundation Model | v1.0 | DCAE + linear transformer, REPA training | arXiv |
| ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation | v1.5 | Hybrid LM + DiT, intrinsic RL, comprehensive evaluation | arXiv |
Contributions welcome! Please read the contributing guidelines first.
To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this work.