
R2E-TestgenAgent

Overview

R2E-TestgenAgent is a specialized execution-based testing agent designed for generating targeted unit tests for software engineering tasks. This agent is part of the R2E-Gym framework, which provides a comprehensive environment for training and evaluating software engineering agents.

Model Description

The agent specializes in:

  • Targeted Unit Test Generation: Creates specific unit tests to validate code patches and implementations
  • Execution-Based Verification: Generates tests that can be executed to verify the correctness of code changes
  • Corner Case Detection: Identifies and tests potential edge cases and corner scenarios
  • Patch Disambiguation: Creates tests that can differentiate between correct and incorrect patches

Architecture

The agent is built on Qwen2.5-Coder-32B-Instruct and fine-tuned on R2E-Gym's supervised fine-tuning (SFT) trajectories collected specifically for testing tasks.

Training Data

The model was trained on the R2E-Gym/R2EGym-TestingAgent-SFT-Trajectories dataset, which contains:

  • High-quality testing trajectories collected from Claude-3.5-Sonnet
  • Execution-based testing scenarios
  • Diverse software engineering problems across 13 repositories
  • Real-world testing patterns and methodologies

Usage

Basic Usage

from r2egym.agenthub.environment.env import EnvArgs, RepoEnv
from r2egym.agenthub.agent.agent import AgentArgs, Agent
from pathlib import Path
from datasets import load_dataset

# Load dataset
ds = load_dataset("R2E-Gym/R2E-Gym-Lite")
env_args = EnvArgs(ds=ds['train'][0])
env = RepoEnv(env_args)

# Load testing agent configuration
agent_args = AgentArgs.from_yaml(Path('./config/testing_agent.yaml'))
agent_args.llm_name = 'r2e-gym/R2E-TestgenAgent'
agent = Agent(name="TestingAgent", args=agent_args)

# Run the testing agent
output = agent.run(env, max_steps=30, use_fn_calling=True)

Configuration

The agent uses specific prompts and configurations optimized for test generation:

system_prompt: |
  You are a specialized testing agent designed to generate targeted unit tests 
  for software engineering tasks. Your goal is to create comprehensive tests 
  that can validate code patches and identify potential issues.

instance_prompt: |
  Given the following problem and potential patches, create targeted unit tests
  that can effectively validate the correctness of the implementation.

Training Configuration

The model was trained using the following configuration:

  • Base Model: Qwen/Qwen2.5-Coder-32B-Instruct
  • Training Method: Full fine-tuning with DeepSpeed ZeRO-3
  • Learning Rate: 1.0e-5
  • Epochs: 2.0
  • Batch Size: 1 (per device)
  • Context Length: 20,480 tokens
  • Optimizer: AdamW with cosine learning rate scheduling
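To make the cosine schedule concrete, here is a minimal sketch of how the learning rate would decay from the listed peak of 1.0e-5. The absence of warmup, the floor of 0.0, and the step counts used below are assumptions for illustration; the card does not state them, and the actual training script may differ.

```python
import math

BASE_LR = 1.0e-5  # peak learning rate from the configuration above

def cosine_lr(step: int, total_steps: int,
              base_lr: float = BASE_LR, min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate at `step`.

    Assumptions (not stated in the card): no warmup phase, decay to `min_lr`.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Hypothetical run of 1000 optimizer steps: print the rate at a few checkpoints.
for step in (0, 250, 500, 750, 1000):
    print(step, cosine_lr(step, total_steps=1000))
```

The rate starts at the peak, reaches half the peak at the midpoint, and decays to the floor at the final step.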

Performance

The R2E-TestgenAgent is designed to work in conjunction with other R2E-Gym components:

  • Code Editing Agent: For generating and fixing code
  • Execution-free Verifier: For reranking patches
  • Hybrid Test-time Scaling: Combines execution-based and execution-free verification
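One simple way to picture hybrid verification is sketched below: shortlist candidate patches by an execution-free score, then rerank the shortlist using the fraction of generated tests each patch passes. The `Candidate` fields, the `hybrid_rank` helper, the scoring formula, and the sample numbers are all assumptions made for this illustration; they are not taken from the R2E-Gym code and may not match the exact combination rule used there.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    name: str
    ef_score: float    # hypothetical execution-free verifier score in [0, 1]
    tests_passed: int  # generated tests this patch passes
    tests_total: int   # generated tests run against this patch

def hybrid_rank(cands: List[Candidate], top_n: int) -> List[Candidate]:
    """Shortlist by execution-free score, then rerank by a combined score."""
    shortlist = sorted(cands, key=lambda c: c.ef_score, reverse=True)[:top_n]

    def combined(c: Candidate) -> float:
        # Execution-based signal: pass rate on the generated tests.
        eb = c.tests_passed / c.tests_total if c.tests_total else 0.0
        # The execution-free score breaks ties between equal pass rates.
        return eb + c.ef_score

    return sorted(shortlist, key=combined, reverse=True)

# Made-up candidates, purely for illustration.
candidates = [
    Candidate("p1", ef_score=0.9, tests_passed=1, tests_total=4),
    Candidate("p2", ef_score=0.7, tests_passed=4, tests_total=4),
    Candidate("p3", ef_score=0.2, tests_passed=4, tests_total=4),
]
ranked = hybrid_rank(candidates, top_n=2)
print([c.name for c in ranked])
```

In this toy run `p3` is dropped by the execution-free shortlist, and `p2` overtakes `p1` because it passes every generated test.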

Integration with R2E-Gym

This agent is part of the larger R2E-Gym ecosystem:

  1. Environment: Works with R2E-Gym's 8.1K+ procedurally curated environments
  2. Evaluation: Can be evaluated on SWE-Bench Verified and other benchmarks
  3. Training: Supports continued training on additional trajectories

Citation

If you use R2E-TestgenAgent in your research, please cite:

@article{jain2025r2e,
  title={R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents},
  author={Jain, Naman and Singh, Jaskirat and Shetty, Manish and Zheng, Liang and Sen, Koushik and Stoica, Ion},
  journal={arXiv preprint arXiv:2504.07164},
  year={2025}
}

License

This model is released under the same license as the base Qwen2.5-Coder model.
