
ASGARD Robot 🤖

Creating Intelligent Home Assistant Robots for Human-Robot Interaction

ASGARD (Autonomous Service Generation for Advanced Robot Deployment) is a research and development initiative focused on creating practical home assistant robots capable of safely interacting with humans in domestic environments.


🎯 Mission

To develop autonomous robots that can:

  • Handle everyday household tasks safely and reliably
  • Interact naturally with humans in home environments
  • Hand over objects to humans with proper coordination and social awareness
  • Adapt to diverse home environments and situations

🏠 Focus Areas

1. Home Environment Manipulation

Our robots are designed to handle common household objects:

  • Food items (potatoes, condiments, containers)
  • Daily use objects (cups, utensils, small tools)
  • Delicate items requiring careful handling

2. Human-Robot Handover

Developing sophisticated coordination for:

  • Gesture Recognition: Understanding when and how humans want to receive objects
  • Force Feedback: Controlled grip force during handover to prevent drops or injury (see the sketch after this list)
  • Timing Coordination: Synchronizing robot and human movements
  • Social Awareness: Reading human intent and nonverbal cues
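
The force-feedback component in particular reduces to a small state machine: hold the object until a sustained pull from the human is measured, then release. Below is a minimal sketch of that release logic; the threshold, confirmation window, and sensor/gripper interfaces are illustrative assumptions, not values from our controller.

```python
# Handover-release sketch: open the gripper once the human's pull force
# stays above a threshold for several consecutive control ticks.
# Threshold/tick values and the hardware interfaces are illustrative.

RELEASE_FORCE_N = 2.0   # assumed pull-force threshold (newtons)
CONFIRM_TICKS = 5       # consecutive samples required before releasing
CONTROL_HZ = 30         # matches the 30 FPS recording/control rate

def handover_release(read_pull_force, open_gripper, sleep):
    """Block until a sustained pull is detected, then open the gripper.

    read_pull_force() -> float : external force along the pull axis (N)
    open_gripper()             : command the gripper to open
    sleep(seconds)             : wait one control tick
    """
    consecutive = 0
    while consecutive < CONFIRM_TICKS:
        consecutive = consecutive + 1 if read_pull_force() > RELEASE_FORCE_N else 0
        sleep(1.0 / CONTROL_HZ)
    open_gripper()
```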

3. Multi-Modal Understanding

Our robots integrate:

  • Vision: Dual camera systems (wrist + external) for comprehensive scene understanding
  • Touch: Force/torque feedback for delicate manipulation
  • Language: Natural language understanding for task specification
  • Context: Awareness of household context and social norms
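
Concretely, each control step consumes one synchronized bundle of these modalities. A minimal sketch of such an observation record follows; the field names and shapes mirror the sensor suite described on this page but are illustrative, not our dataset schema.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Observation:
    """One synchronized multi-modal observation (illustrative field names)."""
    wrist_rgb: np.ndarray        # (480, 640, 3) wrist-mounted camera frame
    external_rgb: np.ndarray     # (480, 640, 3) external (RealSense) frame
    joint_positions: np.ndarray  # (6,) one value per DOF
    wrench: np.ndarray           # (6,) force/torque feedback (Fx..Tz)
    instruction: str             # natural-language task specification
```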

📊 Current Models

Trained GR00T Models

1. Potato Manipulation Model

  • Model: groot-potato-inference
  • Task: Potato handling and cleaning in kitchen environments
  • Checkpoint: Step 2000
  • Base Model: NVIDIA GR00T N1.5-3B
  • Robot: ASGARD so101_follower (single-arm 6 DOF)
  • Performance: 99.53% reduction in training loss from its initial value
  • Dataset: 40 episodes, 30,795 frames

2. Condiment Handover Model

  • Model: groot-condiment-handover
  • Task: Condiment bottle handling and handover to humans
  • Checkpoint: Step 2000
  • Base Model: NVIDIA GR00T N1.5-3B
  • Robot: ASGARD so101_follower (single-arm 6 DOF)
  • Dataset: 40 episodes, 31,522 frames
  • Focus: Human-robot coordination for object handover
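
Either checkpoint is deployed behind the same thin control loop: read an observation, query the policy, send joint targets. The sketch below is hypothetical glue code rather than the LeRobot or Isaac GR00T API; `load_policy` and `robot` are placeholders for whatever runtime drives the arm, and the repo id prefix is an assumption.

```python
import time

def run_episode(load_policy, robot, steps=780, hz=30):
    """Hypothetical inference loop: ~26 s of control at 30 Hz."""
    policy = load_policy("asgard-robot/groot-condiment-handover")  # assumed repo id
    for _ in range(steps):
        obs = robot.get_observation()   # cameras + joint positions + force
        action = policy.predict(obs)    # 6 joint targets, incl. gripper
        robot.send_action(action)
        time.sleep(1.0 / hz)
```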

🗂️ Datasets

Training Datasets

1. Potato Training Data

  • Dataset: asgard_training_data_potato
  • Type: LeRobot v3.0 format
  • Episodes: 40 demonstrations
  • Frames: 30,795 (avg 770 per episode)
  • Duration: ~26 seconds per episode at 30 FPS
  • Modalities:
    • Dual RGB cameras (wrist + RealSense)
    • 6 DOF joint positions
    • Force feedback
  • Task: Potato manipulation and cleaning

2. Condiment Training Data

  • Dataset: asgard_training_data_condiment
  • Type: LeRobot v3.0 format
  • Episodes: 40 demonstrations
  • Frames: 31,522 (avg 788 per episode)
  • Duration: ~26 seconds per episode at 30 FPS
  • Modalities:
    • Dual RGB cameras (wrist + RealSense)
    • 6 DOF joint positions
    • Force feedback
  • Task: Condiment handling and human handover
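
Both datasets follow the LeRobot v3.0 layout, so they can be inspected with the `LeRobotDataset` class. A sketch, assuming the repo id carries the `asgard-robot/` prefix; note that the import path differs across LeRobot versions.

```python
# Inspecting a dataset in LeRobot format. The import path below is the
# pre-1.0 layout (newer releases expose lerobot.datasets.lerobot_dataset).
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

ds = LeRobotDataset("asgard-robot/asgard_training_data_potato")  # assumed repo id
print(ds.num_episodes, ds.fps)   # expect 40 episodes at 30 FPS
frame = ds[0]                    # dict of tensors: images, state, action, ...
print(sorted(frame.keys()))      # key names are dataset-specific
```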

🤖 Robot Platform

ASGARD so101_follower

  • Type: Single-arm manipulator
  • Degrees of Freedom: 6 (shoulder_pan, shoulder_lift, elbow_flex, wrist_flex, wrist_roll, gripper)
  • Sensors:
    • Wrist-mounted RGB camera (640×480)
    • External RGB camera (640×480)
    • Force/torque sensors
    • Joint position encoders
  • Capabilities:
    • Precise object manipulation
    • Force-controlled grasping
    • Human-safe operation
    • Real-time perception
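
An action for this arm is simply six joint targets in the order listed under Degrees of Freedom. A small helper that keeps that ordering explicit (a convention sketch, not our driver code):

```python
JOINTS = ("shoulder_pan", "shoulder_lift", "elbow_flex",
          "wrist_flex", "wrist_roll", "gripper")

def action_vector(**targets: float) -> list[float]:
    """Pack named joint targets into the fixed 6-DOF ordering."""
    return [targets[name] for name in JOINTS]

# e.g. action_vector(shoulder_pan=0.1, shoulder_lift=-0.4, elbow_flex=0.8,
#                    wrist_flex=0.0, wrist_roll=0.0, gripper=1.0)
```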

🧠 Technology Stack

Base Models

  • NVIDIA GR00T N1.5-3B: Foundation model for robot manipulation
    • Generalist robot foundation model
    • Trained on diverse manipulation tasks
    • Multi-modal understanding (vision + language + actions)
    • Flow matching for continuous action generation
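
At inference time, flow matching amounts to integrating a learned velocity field from Gaussian noise toward an action chunk, typically with a handful of Euler steps. A minimal sampler sketch; the horizon, step count, and `velocity_field` signature are illustrative, not GR00T's internals.

```python
import torch

def sample_actions(velocity_field, obs_embedding, horizon=16, dim=6, steps=8):
    """Euler-integrate a flow-matching velocity field from noise to actions.

    velocity_field(x_t, t, cond) -> dx/dt is an assumed signature.
    """
    x = torch.randn(horizon, dim)        # start from Gaussian noise
    for i in range(steps):
        t = torch.full((1,), i / steps)  # flow time in [0, 1)
        x = x + velocity_field(x, t, obs_embedding) / steps
    return x                             # denoised action chunk
```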

Training Framework

  • LeRobot: PyTorch-based robotics framework
    • ASGARD teleop control branch
    • GR00T policy support
    • Dataset format v3.0
    • Multi-GPU training with Hugging Face Accelerate
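
The multi-GPU setup follows the standard Accelerate pattern: wrap the model, optimizer, and dataloader with `accelerator.prepare`, then route gradients through `accelerator.backward`. A skeletal sketch, assuming the policy's forward pass returns the loss:

```python
from itertools import cycle, islice
from accelerate import Accelerator

def train(model, dataloader, optimizer, steps=2000):
    """Skeletal multi-GPU loop; model(batch) is assumed to return the loss."""
    accelerator = Accelerator(mixed_precision="bf16")  # matches the bf16 setting below
    model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
    for batch in islice(cycle(dataloader), steps):     # 2,000 optimizer steps
        loss = model(batch)
        accelerator.backward(loss)                     # handles gradient sync
        optimizer.step()
        optimizer.zero_grad()
```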

Hardware

  • Training: 4× NVIDIA H100 PCIe GPUs (80GB VRAM each)
  • Inference: Optimized for edge deployment
  • Compute: 320GB total VRAM for full fine-tuning

🔬 Research Goals

Short-Term

  1. Robust Manipulation: Reliable handling of diverse household objects
  2. Safe Handover: Zero accidents in human-robot handover scenarios
  3. Context Awareness: Understanding household context and social norms
  4. Adaptation: Quick adaptation to new objects and scenarios

Long-Term

  1. General Household Assistance: Cooking, cleaning, organization
  2. Human-Robot Collaboration: Seamless teamwork with humans
  3. Learning from Demonstration: Improved generalization from limited data
  4. Real-Time Adaptation: Dynamic adjustment to unexpected situations

🏗️ Architecture

Model Architecture

Our models are fine-tuned from GR00T N1.5-3B:

  • Frozen Components:
    • Vision encoder (preserves visual understanding)
    • LLM (maintains language understanding)
  • Trainable Components:
    • Diffusion transformer (action generation)
    • Projector (vision-language → action mapping)
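
In PyTorch terms this split is just toggling `requires_grad` per submodule. A sketch, under the assumption that the policy exposes the four components by these (illustrative) attribute names:

```python
def configure_trainable(policy):
    """Freeze perception/language; train action head + projector (names assumed)."""
    for module in (policy.vision_encoder, policy.llm):
        module.requires_grad_(False)      # frozen: keep pretrained understanding
    for module in (policy.action_transformer, policy.projector):
        module.requires_grad_(True)       # fine-tuned on the new task
    return [p for p in policy.parameters() if p.requires_grad]
```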

Training Strategy

  • Full Fine-Tuning: All trainable parameters updated
  • Batch Size: 512 (128 per GPU × 4 GPUs)
  • Training Steps: 2,000 per task
  • Approx. Epochs: ~33 (potato) / ~32 (condiment)
  • Learning Rate: 1e-4 with warmup
  • Precision: bf16 mixed precision
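
The epoch counts fall straight out of these numbers: epochs ≈ steps × global batch size / dataset frames. A two-line check:

```python
# epochs ≈ optimizer_steps * global_batch / dataset_frames
for name, frames in (("potato", 30_795), ("condiment", 31_522)):
    print(name, round(2_000 * 512 / frames, 1))  # -> potato 33.3, condiment 32.5
```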

📈 Performance

Training Results

Both models show excellent convergence:

  • Loss Reduction: 99%+ from initial to final
  • Stability: No overfitting observed
  • Convergence: Achieved around steps 1200-1600
  • Final Loss: ~0.006 (from initial ~1.2)
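
The headline figure is simply the relative drop from initial to final loss, consistent with the 99.53% reported for the potato model:

```python
initial, final = 1.2, 0.006
print(f"{(initial - final) / initial:.2%}")  # -> 99.50% (≈ the reported 99.53%)
```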

Metrics

  • Training Time: ~2 hours per model
  • Memory Usage: 60-70GB per GPU
  • Throughput: 2-3 samples/second per GPU
  • Checkpoints: 5 saved per training run (steps 400, 800, 1200, 1600, 2000)

🤝 Contributing

We welcome contributions in:

  • Additional household task datasets
  • Improved handover algorithms
  • Multi-robot coordination
  • Human behavior modeling
  • Safety protocols

📚 Citations

If you use our models or datasets, please cite:

```bibtex
@misc{asgard_robot_2024,
  title  = {ASGARD Robot: Home Assistant Robot for Human-Robot Interaction},
  author = {{ASGARD Team}},
  year   = {2024},
  url    = {https://huggingface.co/asgard-robot},
  note   = {Models: groot-potato-inference, groot-condiment-handover.
            Datasets: asgard_training_data_potato, asgard_training_data_condiment}
}
```

🎖️ Acknowledgments

  • Base Model: NVIDIA GR00T N1.5-3B
  • Framework: LeRobot (Hugging Face)
  • Hardware: Shadeform H100 Multi-GPU Cluster
  • Research: ASGARD Team

🌟 Vision

We envision a future where robots seamlessly integrate into home environments, assisting humans with daily tasks while maintaining the highest standards of safety, reliability, and social awareness. Our work focuses on practical applications that can improve quality of life and enable independent living.


Building the future of home robotics, one handover at a time. 🤖❤️