An operating system for research-oriented coding agents.
PhD-Zero is a repository of reusable skills for AI research work. The point is not to make an agent sound smart for one turn. The point is to give it a workflow it can actually follow: plan the task, gather evidence, run experiments, keep context, ask for human review when needed, and write results down in a form another person can inspect.
The same skill library is exposed to different runtimes. Codex-style agents read workspace rules from AGENTS.md. Claude Code sees a mirrored discovery layer under .claude/skills/. The actual source of truth lives in .agents/skills/.
See PhD-Zero in action! We provide end-to-end demonstrations of how our research-oriented OS empowers autonomous AI agents to conduct complex research tasks.
Watch PhD-Zero autonomously investigate how different prompting tricks affect the reasoning capabilities of models. The agent handles everything from literature review and experiment design to execution and report generation.
The workflow below was completed autonomously by PhD-Zero: it used the deep-research skill to survey ideas, ran experiments independently in the background, analyzed the quantitative results, and drafted the final research report.
📄 Read the generated report: phd_zero_report_e2e_prompting_tricks_v0_0316.pdf
Watch PhD-Zero autonomously optimize a base model's performance on the AIME25 mathematics benchmark. By systematically experimenting with different datasets (numina-math, open-r1) and training algorithms (SFT, RL), and by exploring training tricks such as learning-rate tuning and data filtering, the agent raised the accuracy of Qwen3-1.7B-base from 0% to 20%.
If you just want to see whether the repo is wired correctly, do this:
git clone https://github.com/TenureAI/PhD-Zero.git
cd PhD-Zero
find .agents/skills -mindepth 1 -maxdepth 1 -type d
find .claude/skills -mindepth 1 -maxdepth 1 -type l
Those two commands should list the same skill names. If they do, the shared skill layer is in place.
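The comparison can also be scripted. The sketch below is not part of the repository: the function names are hypothetical, and it assumes, as the two commands above do, that canonical skills are real directories and mirror entries are symlinks.

```shell
# Hypothetical helpers, not shipped with PhD-Zero.

# Print the names of the immediate children of directory $1 that match
# find type $2 (d = directory, l = symlink), one per line, sorted.
list_skill_names() {
  find "$1" -mindepth 1 -maxdepth 1 -type "$2" -exec basename {} \; | LC_ALL=C sort
}

# Compare the canonical directory ($1) with the mirror directory ($2).
# Prints "in sync" and returns 0 when the name lists match,
# otherwise prints "out of sync" and returns 1.
skills_in_sync() {
  canonical_names=$(list_skill_names "$1" d)
  mirror_names=$(list_skill_names "$2" l)
  if [ "$canonical_names" = "$mirror_names" ]; then
    echo "in sync"
  else
    echo "out of sync"
    return 1
  fi
}
```

From the repository root, this could be called as `skills_in_sync .agents/skills .claude/skills`.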
From there:
- Read AGENTS.md to understand the workspace rules used by Codex-style agents.
- Inspect .agents/skills/ if you want the canonical skill implementations.
- Inspect .claude/skills/ if you want to verify the Claude Code mirror.
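One way to inspect the mirror is to print where each entry points. This is a minimal sketch with a hypothetical function name; it assumes the mirror entries are symlinks, which the `-type l` check above suggests but which you should confirm in your checkout.

```shell
# Hypothetical helper: list each symlink in directory $1 with its target.
show_mirror_targets() {
  for entry in "$1"/*; do
    # Skip anything that is not a symlink (including an unmatched glob).
    [ -L "$entry" ] || continue
    printf '%s -> %s\n' "$(basename "$entry")" "$(readlink "$entry")"
  done
}
```

Running `show_mirror_targets .claude/skills` from the repository root would then show one line per mirrored skill.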
If you prefer a landing page over the raw repository view, there is also a static site under docs/index.html.
The repository is intentionally small. It does not try to be a benchmark suite, a framework, and a demo app all at once. It is mostly a skill library plus the rules that tell agents how to use it.
.
├── AGENTS.md
├── REPO_CONVENTIONS.md
├── .agents/skills/ # canonical skill definitions
├── .claude/skills/ # Claude Code mirror layer
├── .github/workflows/ # repository validation
├── assets/ # shared visual assets
└── docs/ # static landing page
The CI in this repo checks that the skill directories under .agents/skills and .claude/skills stay in sync, and that every tracked skill has a readable SKILL.md.
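The second of those checks can be approximated locally. The function below is a hypothetical sketch, not the repository's actual CI script; it treats "readable" as "a regular file the current user can read".

```shell
# Hypothetical check: print the name of every skill directory under $1
# that does not contain a readable SKILL.md. Prints nothing when all pass.
missing_skill_docs() {
  for skill_dir in "$1"/*/; do
    # Skip an unmatched glob (for example, an empty skills directory).
    [ -d "$skill_dir" ] || continue
    if [ ! -f "${skill_dir}SKILL.md" ] || [ ! -r "${skill_dir}SKILL.md" ]; then
      basename "$skill_dir"
    fi
  done
}
```

An empty result from `missing_skill_docs .agents/skills` would mean every skill directory has its SKILL.md.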
The current skill set covers the basic loop of a research-oriented agent:
| Skill | What it is for |
|---|---|
| run-governor | Stage control, run discipline, and execution policy |
| research-workflow | The default loop for non-trivial research tasks |
| research-plan | Turning an open-ended goal into a concrete plan |
| deep-research | External search, literature comparison, and synthesis |
| experiment-execution | Running code, debugging, and experiment execution |
| memory-manager | Working state and reusable memory |
| project-context | Project-specific runtime context and conventions |
| human-checkpoint | Human review for risky or expensive decisions |
| paper-writing | Drafting and revising research artifacts |
That list will probably grow, but the idea is stable: break research into pieces that can be reused instead of trying to solve everything with one giant prompt.
PhD-Zero is for people who are already using coding agents in research or engineering-adjacent work and want more discipline around the process. If you care about literature review, experiment planning, reproducibility, or keeping an agent from improvising its way through a long task, this repo is meant to be useful. If you just want a flashy demo, it is probably not the right project.
Contributions are welcome, especially in three areas:
- new skills that fit the repository's scope
- tighter workflows for the existing skills
- validation and examples from real usage
Before opening a PR, check REPO_CONVENTIONS.md. This repo keeps reusable skill content in version control and keeps task-specific logs or run artifacts out.
PhD-Zero is shaped by the broader ecosystem around coding agents, research tooling, and writing support. In particular, the repository draws useful ideas from projects that treat workflows as first-class artifacts rather than one-off prompts.
We also want to acknowledge:
These are not runtime dependencies here, but they were useful references when thinking about writing quality and reusable editing guidance.
If PhD-Zero is useful in your workflow or research, you can cite it as:
@misc{phd_zero_github,
author = {TenureAI Contributors},
title = {PhD-Zero: An Operating System for Research-Oriented Coding Agents},
year = {2026},
howpublished = {\url{https://github.com/TenureAI/PhD-Zero}},
note = {GitHub repository}
}


