This repository contains the code, experiments, and supporting material for the paper:
RedMorph: Characterizing the Challenges of Attacker Adaptation in Realistic Networks
Stratosphere Laboratory, September 2025
RedMorph investigates the generalization problem of autonomous attacker agents in cybersecurity environments.
We study how agents can adapt their learned policies when facing new, unseen network topologies, focusing on:
- Data exfiltration attack scenarios
- Bridging the gap between abstract simulation environments and realistic network conditions
- Comparing different adaptation and generalization strategies
Our contributions include:
- A redesigned NetSecGame environment with changing topologies
- A systematic training/evaluation methodology for attacker adaptation
- Evaluation of multiple agents:
  - Deep Q-Networks (DQN)
  - Model-Agnostic Meta-Learning (MAML)
  - Conceptual Q-learning/SARSA (state abstraction)
  - Large Language Model (LLM)-based agents
- An XAI framework to analyze action distributions and explain generalization
Most attacker agents in cybersecurity simulations fail to transfer knowledge when network topology changes.
We ask:
- Can attackers learn abstract, general policies?
- How well do different RL/meta-RL/LLM agents adapt to unseen networks?
- Can we explain why and how agents generalize using interpretable methods?
We extend NetSecGame, a multi-agent security environment.
- Observation Space: networks, hosts, services, data, firewall rules
- Action Space: `ScanNetwork`, `FindServices`, `FindData`, `ExploitService`, `ExfiltrateData`, `BlockIP`
- Scenario: data exfiltration (a sketch of the interface follows this list)
  - Compromise internal hosts
  - Find and access the target server
  - Exfiltrate data to a C&C server
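To make the interface concrete, here is a minimal Python sketch of one plausible way to encode the observation fields and the six action types. Every name below (`ActionType`, `Observation`, the field names) is an illustrative assumption, not the actual NetSecGame API.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

# Illustrative encoding of the six action types; names are assumptions,
# not the actual NetSecGame API.
class ActionType(Enum):
    SCAN_NETWORK = auto()     # discover hosts in a known network
    FIND_SERVICES = auto()    # enumerate services on a known host
    FIND_DATA = auto()        # search a controlled host for data
    EXPLOIT_SERVICE = auto()  # take control of a host via a vulnerable service
    EXFILTRATE_DATA = auto()  # move found data to the C&C server
    BLOCK_IP = auto()         # firewall rule manipulation

# Hypothetical observation: the attacker's partial view of the network,
# mirroring the observation-space components listed above.
@dataclass
class Observation:
    known_networks: set[str] = field(default_factory=set)
    known_hosts: set[str] = field(default_factory=set)
    controlled_hosts: set[str] = field(default_factory=set)
    known_services: dict[str, set[str]] = field(default_factory=dict)  # host -> services
    known_data: dict[str, set[str]] = field(default_factory=dict)      # host -> data items
    firewall_rules: set[tuple[str, str]] = field(default_factory=set)  # (src, dst) blocked
```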
We evaluate four families of attacker agents (sketches of the key mechanisms follow this list):
- DQN: classic Deep Q-Learning attacker
  - Single-buffer and dual-buffer variants
  - Candidate-centric 12D state representation
- MAML: meta-RL attacker using Model-Agnostic Meta-Learning
  - Learns an initialization for fast adaptation to unseen topologies
- Conceptual Q-learning: abstraction-based agent
  - Uses concepts (roles, functions) instead of raw values (IP addresses, services)
- LLM agents: GPT-4o-mini and Gemma-3 evaluated as reasoning-based attackers
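A minimal PyTorch sketch of the candidate-centric idea behind the DQN agent: rather than one fixed output head per action, each candidate action is encoded together with the state as a 12-dimensional feature vector and scored independently, so the candidate set can vary across topologies. The feature layout and layer sizes below are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

# Candidate-centric Q-network sketch: each (state, candidate action) pair is
# represented by a 12-d feature vector and mapped to a scalar Q-value.
# Hidden sizes are assumptions.
class CandidateQNet(nn.Module):
    def __init__(self, feat_dim: int = 12, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar Q-value per candidate
        )

    def forward(self, candidate_feats: torch.Tensor) -> torch.Tensor:
        # candidate_feats: (num_candidates, 12) -> (num_candidates,)
        return self.net(candidate_feats).squeeze(-1)

# Greedy selection over a variable-size candidate set:
qnet = CandidateQNet()
feats = torch.randn(8, 12)  # 8 candidate actions, 12-d features each
best = torch.argmax(qnet(feats)).item()
```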
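A schematic MAML update (Finn et al., 2017) on a toy policy, showing the two-level structure: an inner adaptation step per topology (task), then an outer step that differentiates through the adapted parameters. The linear policy, toy data, and cross-entropy loss are assumptions for illustration; in the paper's setup the outer loss would be computed on held-out query rollouts rather than the same batch.

```python
import torch
import torch.nn as nn
from torch.func import functional_call

policy = nn.Linear(12, 6)  # toy policy: 12-d state -> 6 action scores
inner_lr, outer_lr = 0.01, 0.001
meta_opt = torch.optim.Adam(policy.parameters(), lr=outer_lr)

def task_loss(params, x, y):
    # Evaluate the policy with an explicit parameter dict.
    logits = functional_call(policy, params, (x,))
    return nn.functional.cross_entropy(logits, y)

# One synthetic batch per task; in MAML one task = one training topology.
tasks = [(torch.randn(16, 12), torch.randint(0, 6, (16,))) for _ in range(5)]

params = dict(policy.named_parameters())
meta_loss = 0.0
for x, y in tasks:
    # Inner loop: one gradient step of adaptation on this topology.
    loss = task_loss(params, x, y)
    grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)
    adapted = {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}
    # Outer objective: performance of the adapted parameters.
    meta_loss = meta_loss + task_loss(adapted, x, y)

meta_opt.zero_grad()
meta_loss.backward()  # second-order gradients flow through the inner step
meta_opt.step()
```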
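A minimal sketch of the state abstraction behind the conceptual agent, assuming a hand-specified role vocabulary: Q-values are keyed by roles such as "target_server" rather than concrete IP addresses, so they transfer to any topology where the same roles appear.

```python
# Minimal state-abstraction sketch for the conceptual Q-learning agent:
# concrete identifiers are mapped to roles, so learned Q-values are keyed by
# concepts and transfer across topologies. The role vocabulary is an assumption.
def abstract_state(known_hosts: set[str], roles: dict[str, str]) -> frozenset[str]:
    """Replace concrete host identifiers with their conceptual roles."""
    return frozenset(roles.get(h, "unknown_host") for h in known_hosts)

roles = {
    "10.0.2.7": "target_server",  # holds the data to exfiltrate
    "10.0.2.15": "workstation",
    "203.0.113.9": "cc_server",   # attacker-controlled C&C endpoint
}

# Two topologies with different addresses map to the same abstract state,
# so a Q-table keyed by abstract states reuses its values on a new network.
state = abstract_state({"10.0.2.7", "10.0.2.15"}, roles)
print(state)  # frozenset({'target_server', 'workstation'})
```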
- Agents are trained on five topologies and tested on a sixth, unseen topology (see the evaluation sketch after this list)
- Average generalization success of ~65–70% across agents
- Key insights:
  - Conceptual abstraction enables broader transfer
  - MAML improves zero-shot adaptation
  - LLMs show promise but require careful integration
  - DQN struggles when the evaluation topology diverges from training
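A sketch of the generalization protocol, written here as a full leave-one-out rotation over six topologies; the topology names and the `train`/`evaluate` callables are assumptions.

```python
# Generalization protocol sketch: train on five topologies, evaluate on the
# held-out sixth, rotating the held-out topology. Names are assumptions.
TOPOLOGIES = [f"topo_{i}" for i in range(6)]

def leave_one_out(train, evaluate):
    results = {}
    for held_out in TOPOLOGIES:
        train_set = [t for t in TOPOLOGIES if t != held_out]
        agent = train(train_set)                       # fit on 5 topologies
        results[held_out] = evaluate(agent, held_out)  # success on the unseen one
    return results
```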
We analyze action distributions over time to understand agent strategies (a sketch of the analysis follows this list):
- Q-learning vs. Conceptual Q-learning
- MAML adaptation vs. LLM strategies
- Insights into how different agents explore, exploit, and exfiltrate in new environments
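A minimal sketch of the action-distribution analysis: per-episode action proportions are computed and plotted over evaluation episodes. The `episodes` input format (a list of per-episode action-name lists) is an assumption.

```python
from collections import Counter

import matplotlib.pyplot as plt

ACTIONS = ["ScanNetwork", "FindServices", "FindData",
           "ExploitService", "ExfiltrateData", "BlockIP"]

def action_distributions(episodes: list[list[str]]) -> list[list[float]]:
    """Per-episode proportion of each action type (each row sums to 1)."""
    rows = []
    for ep in episodes:
        counts = Counter(ep)
        total = max(len(ep), 1)
        rows.append([counts.get(a, 0) / total for a in ACTIONS])
    return rows

def plot_distributions(rows: list[list[float]]) -> None:
    """Line plot of action proportions over evaluation episodes."""
    for i, action in enumerate(ACTIONS):
        plt.plot([r[i] for r in rows], label=action)
    plt.xlabel("evaluation episode")
    plt.ylabel("action proportion")
    plt.legend()
    plt.show()
```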
```
.
├── figures/      # Plots, environment diagrams, agent results
├── src/          # Agent implementations
│   ├── dqn/
│   ├── maml/
│   ├── conceptual/
│   └── llm/
├── env/          # NetSecGame modifications
├── experiments/  # Training & evaluation scripts
├── notebooks/    # Analysis & visualization
└── paper/        # LaTeX source of the paper
```
- Python 3.10+
- PyTorch
- Gym-like environment (custom NetSecGame)
- Transformers (for LLM agents)
- Matplotlib / Seaborn (for plotting)
```bash
git clone https://github.com/stratosphere-lab/RedMorph.git
cd RedMorph
pip install -r requirements.txt

# Train a DQN agent on the training topologies
python experiments/train_dqn.py --topologies data/topos --episodes 1000

# Evaluate the trained model on the held-out topology
python experiments/eval_dqn.py --model checkpoints/dqn.pth --topology data/test_topo.json
```

See notebooks/ for evaluation and XAI plots.
If you use this code or dataset, please cite:
```bibtex
@article{redmorph2025,
  title={RedMorph: Characterizing the Challenges of Attacker Adaptation in Realistic Networks},
  author={{Stratosphere Laboratory, Czech Technical University in Prague} and {University of Texas at El Paso}},
  journal={arXiv preprint},
  year={2025}
}
```
- NetSecGame: GitHub Repository
- MAML: Finn et al., ICML 2017
- Related cyber-RL works: Collyer (2022), Applebaum (2022), Janisch (2023), Wolk (2022)
Stratosphere Laboratory
https://www.stratosphereips.org