AReaL 2026 Q1 Milestone Tracker
Introduction
This document tracks major planned enhancements for AReaL through April 30, 2026. Our development roadmap is organized into two categories to help contributors identify where they can make the most impact:
On-going sections contain features currently under active development by the core AReaL team. These represent our immediate priorities.
Planned but not in progress sections list features with concrete implementation plans that we currently lack bandwidth to pursue. We actively welcome community contributions for these items! If you're interested in contributing to any planned feature, please reach out to discuss implementation details.
Backends
On-going
- ZBPP & ZBPP-V support for the Archon backend (feat(archon): add InterleavedZeroBubble (ZB1P) pipeline schedule #936; feat(archon): add ZBVZeroBubble pipeline schedule support #916)
- FP8 training for Archon
- Online RL training with the proxy server (feat(proxy): add proxy gateway and online RL training mode #947)
Planned but not in progress
- Support for agentic training with large VLM MoE models (Archon backend)
- Omni model RL support with the FSDP/Archon backend ([Feature] Support Audio model in the future #879)
- Decoupling the agent service from the inference service
- LoRA support for the Archon backend
- Colocation mode with `awex` as the weight sync engine
- Multi-LLM training (different agents with different parameters)
- Auto-scaling inference engines in single-controller mode
- Elastic weight update setup and acceleration
- RL training with cross-node vLLM pipeline/context parallelism
Usability
On-going
- Flatten the import structure of `areal` modules
Planned but not in progress
- Publishing PyPI packages
- Support distributed training and debugging in Jupyter notebooks
- Example of using a generative or critic-like reward model
- Support directly constructing inference/training engines without config objects
- Add router in rollout controller for simpler proxy server usage
- Integrating `aenvironment` for environment handling
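To illustrate the "constructing engines without config objects" item above, here is a minimal sketch of what a config-free construction path could look like. All names here (`EngineConfig`, `InferenceEngine`, `from_kwargs`, the model path) are hypothetical placeholders for illustration and do not reflect AReaL's actual API.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for illustration only; AReaL's real classes differ.
@dataclass
class EngineConfig:
    model_path: str
    tensor_parallel_size: int = 1
    max_tokens: int = 4096


class InferenceEngine:
    def __init__(self, config: EngineConfig):
        # Current-style construction: caller must build a config object first.
        self.config = config

    @classmethod
    def from_kwargs(cls, model_path: str, **overrides) -> "InferenceEngine":
        # Proposed-style construction: accept plain keyword arguments and
        # assemble the config internally, so callers skip the extra object.
        return cls(EngineConfig(model_path=model_path, **overrides))


# Config-object construction (the pattern the roadmap item wants to relax):
engine_a = InferenceEngine(EngineConfig(model_path="some-org/some-model"))

# Direct construction (the planned usability improvement):
engine_b = InferenceEngine.from_kwargs(
    "some-org/some-model", tensor_parallel_size=2
)
print(engine_b.config.tensor_parallel_size)  # → 2
```

The design choice sketched here is simply a classmethod wrapper over the existing config-based constructor, which keeps both entry points available without breaking current callers.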
Documentation
On-going
N/A
Planned but not in progress
- Use case guides: multi-agent training
- Guide for online proxy mode training