R2E-TestgenAgent
Overview
R2E-TestgenAgent is an execution-based testing agent that generates targeted unit tests for software engineering tasks. It is part of the R2E-Gym framework, an environment for training and evaluating software engineering agents.
Model Description
The agent specializes in:
- Targeted Unit Test Generation: Creates specific unit tests to validate code patches and implementations
- Execution-Based Verification: Generates tests that can be executed to verify the correctness of code changes
- Corner Case Detection: Identifies and tests edge cases and corner cases
- Patch Disambiguation: Creates tests that can differentiate between correct and incorrect patches
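To make the patch-disambiguation idea concrete, here is a minimal, self-contained sketch in Python. It does not use the R2E-Gym API. The candidate patches, the checks, and the helper `passed_checks` are all hypothetical names invented for illustration. Each candidate patch is modelled as a plain function, and each generated test as a callable that raises `AssertionError` on failure.

```python
from typing import Callable, Dict, List

Patch = Callable[[int, int], int]
Check = Callable[[Patch], None]

# Hypothetical candidate patches for a function that should return the
# absolute difference of two integers.
def patch_a(x: int, y: int) -> int:
    return abs(x - y)  # correct

def patch_b(x: int, y: int) -> int:
    return x - y  # wrong whenever x < y

def check_basic(fn: Patch) -> None:
    assert fn(5, 3) == 2  # both patches pass, so this check is not discriminative

def check_reversed_args(fn: Patch) -> None:
    assert fn(3, 5) == 2  # corner case that separates the two patches

def passed_checks(fn: Patch, checks: List[Check]) -> List[str]:
    """Run every check against one patch and return the names of those that pass."""
    passed = []
    for check in checks:
        try:
            check(fn)
            passed.append(check.__name__)
        except AssertionError:
            pass
    return passed

checks = [check_basic, check_reversed_args]
results: Dict[str, List[str]] = {
    name: passed_checks(fn, checks)
    for name, fn in {"patch_a": patch_a, "patch_b": patch_b}.items()
}

# A check disambiguates when the candidate patches disagree on its outcome.
disambiguating = [
    c.__name__
    for c in checks
    if len({c.__name__ in passed for passed in results.values()}) > 1
]
```

In this toy setup only `check_reversed_args` tells the patches apart, which is the kind of test the agent is meant to produce.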
Architecture
The agent is built on top of the Qwen2.5-Coder-32B-Instruct model and fine-tuned using R2E-Gym's SFT (Supervised Fine-Tuning) trajectories specifically designed for testing tasks.
Training Data
The model was trained on the R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories dataset, which contains:
- High-quality testing trajectories collected from Claude-3.5-Sonnet
- Execution-based testing scenarios
- Diverse software engineering problems across 13 repositories
- Real-world testing patterns and methodologies
Usage
Basic Usage
from r2egym.agenthub.environment.env import EnvArgs, RepoEnv
from r2egym.agenthub.agent.agent import AgentArgs, Agent
from pathlib import Path
from datasets import load_dataset
# Load dataset
ds = load_dataset("R2E-Gym/R2E-Gym-Lite")
env_args = EnvArgs(ds=ds['train'][0])
env = RepoEnv(env_args)
# Load testing agent configuration
agent_args = AgentArgs.from_yaml(Path('./config/testing_agent.yaml'))
agent_args.llm_name = 'r2e-gym/R2E-TestgenAgent'
agent = Agent(name="TestingAgent", args=agent_args)
# Run the testing agent
output = agent.run(env, max_steps=30, use_fn_calling=True)
Configuration
The agent uses specific prompts and configurations optimized for test generation:
system_prompt: |
  You are a specialized testing agent designed to generate targeted unit tests
  for software engineering tasks. Your goal is to create comprehensive tests
  that can validate code patches and identify potential issues.
instance_prompt: |
  Given the following problem and potential patches, create targeted unit tests
  that can effectively validate the correctness of the implementation.
Training Configuration
The model was trained using the following configuration:
- Base Model: Qwen/Qwen2.5-Coder-32B-Instruct
- Training Method: Full fine-tuning with DeepSpeed ZeRO-3
- Learning Rate: 1.0e-5
- Epochs: 2.0
- Batch Size: 1 (per device)
- Context Length: 20,480 tokens
- Optimizer: AdamW with cosine learning rate scheduling
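As an illustration of the cosine schedule listed above, the sketch below computes the learning rate at a given step, starting from the peak value of 1.0e-5. The total step count, the absence of warmup, and a final learning rate of zero are assumptions made for this example. The actual training run may have used different settings.

```python
import math

PEAK_LR = 1.0e-5  # peak learning rate from the configuration above

def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps (no warmup assumed)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp the step into [0, total_steps] so out-of-range values stay well defined.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Hypothetical step count; the real value depends on dataset size,
# number of devices, and gradient accumulation.
TOTAL_STEPS = 1000
schedule = [cosine_lr(s, TOTAL_STEPS) for s in (0, 500, 1000)]
```

Under these assumptions the rate starts at the peak, falls to half of it at the midpoint, and reaches zero at the final step.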
Performance
R2E-TestgenAgent is designed to work alongside other R2E-Gym components:
- Code Editing Agent: For generating and fixing code
- Execution-free Verifier: For reranking patches
- Hybrid Test-time Scaling: Combines execution-based and execution-free verification
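One simple way to combine the two kinds of verification is sketched below. It is an illustrative assumption, not necessarily the scheme used by R2E-Gym: candidates are first filtered to the top `n` by an execution-free score, then ranked by the number of generated tests they pass, with the execution-free score as a tie-breaker. The `Candidate` class, its fields, and `hybrid_rerank` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    name: str
    exec_free_score: float  # e.g. a verifier model's estimate that the patch is correct
    tests_passed: int       # number of generated tests the patch passes when executed

def hybrid_rerank(cands: List[Candidate], n: int) -> List[Candidate]:
    """Keep the top-n candidates by execution-free score, then rank by tests passed."""
    top_n = sorted(cands, key=lambda c: c.exec_free_score, reverse=True)[:n]
    # Ties on tests_passed are broken by the execution-free score.
    return sorted(top_n, key=lambda c: (c.tests_passed, c.exec_free_score), reverse=True)

# Made-up scores, used only to exercise the function.
candidates = [
    Candidate("p1", 0.90, 2),
    Candidate("p2", 0.80, 5),
    Candidate("p3", 0.70, 5),
    Candidate("p4", 0.10, 6),
]
ranking = [c.name for c in hybrid_rerank(candidates, n=3)]
```

Here `p4` passes the most tests but is dropped by the execution-free filter, while `p2` outranks `p3` on the tie-breaker. The example shows how each signal can override the other at a different stage.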
Integration with R2E-Gym
This agent is part of the larger R2E-Gym ecosystem:
- Environment: Works with R2E-Gym's 8.1K+ procedurally curated environments
- Evaluation: Can be evaluated on SWE-Bench Verified and other benchmarks
- Training: Supports continued training on additional trajectories
Citation
If you use R2E-TestgenAgent in your research, please cite:
@article{jain2025r2e,
title={R2e-gym: Procedural environments and hybrid verifiers for scaling open-weights swe agents},
author={Jain, Naman and Singh, Jaskirat and Shetty, Manish and Zheng, Liang and Sen, Koushik and Stoica, Ion},
journal={arXiv preprint arXiv:2504.07164},
year={2025}
}
License
This model is released under the same license as the base Qwen2.5-Coder model.
Links
- Paper: R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents
- GitHub: R2E-Gym
- Dataset: R2EGym-TestingAgent-SFT-Trajectories
- Related Models: