---
title: Robot Task Planning - Llama 3.1 8B
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
app_file: app.py
pinned: false
license: llama3.1
---

# 🤖 Robot Task Planning - Llama 3.1 8B (ZeroGPU)

This Space demonstrates a fine-tuned version of Meta's **Llama 3.1 8B** model, specialized for **robot task planning** and trained with QLoRA (4-bit quantization plus LoRA adapters).

## 🚀 Hardware: ZeroGPU

This Space runs on **ZeroGPU**, which allocates an NVIDIA H200 GPU dynamically:
- **Free** for Hugging Face users
- **Dynamic allocation** - GPU resources are allocated on demand
- **High performance** - requests run on an NVIDIA H200
- **60-second GPU duration** - each request can hold the GPU for up to 60 seconds
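
As a rough illustration of how a ZeroGPU Space typically requests its GPU, the sketch below wraps an inference function with the `spaces.GPU` decorator and a 60-second duration. The function name `plan_tasks` and its placeholder body are hypothetical and are not taken from this Space's `app.py`; the `ImportError` fallback is only there so the snippet also runs outside Spaces.

```python
# Sketch: request a ZeroGPU slot for at most 60 seconds per call.
try:
    import spaces  # provided inside Hugging Face Spaces

    gpu = spaces.GPU(duration=60)
except ImportError:
    # Fallback so the file also runs locally without the `spaces` package.
    def gpu(fn):
        return fn


@gpu
def plan_tasks(command: str) -> str:
    """Hypothetical entry point; a real app would call model.generate() here."""
    return f"(model output for: {command})"


print(plan_tasks("Deploy Excavator 1 to Soil Area 1 for excavation"))
```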

## 🎯 Purpose

Convert natural language commands into structured task sequences for construction robots including:
- **Excavators** - Digging, loading, positioning
- **Dump Trucks** - Material transport, loading, unloading
- **Multi-robot Coordination** - Complex task dependencies

## 🔗 Model

**Fine-tuned Model**: [YongdongWang/llama-3.1-8b-dart-qlora](https://huggingface.co/YongdongWang/llama-3.1-8b-dart-qlora)

**Base Model**: [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B)

## ✨ Features

- 🎮 **Interactive Chat Interface** - Real-time robot command processing
- ⚙️ **Configurable Generation** - Adjust temperature, top-p, max tokens
- 📝 **Example Commands** - Pre-built scenarios to get started
- 🚀 **Optimized Performance** - 4-bit quantization for efficient inference
- 📊 **Structured Output** - JSON-formatted task sequences
- ⚡ **ZeroGPU Powered** - Dynamic GPU allocation for free users
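
The configurable generation settings map naturally onto the sampling arguments of `model.generate` in Transformers. The helper below is a minimal sketch of that mapping; the function name and the default values are assumptions for illustration, not the settings this Space ships with.

```python
def generation_kwargs(
    temperature: float = 0.7,   # assumed default
    top_p: float = 0.9,         # assumed default
    max_new_tokens: int = 512,  # assumed default
) -> dict:
    """Turn UI slider values into keyword arguments for `model.generate`."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    if temperature <= 0:
        # Treat a zero temperature as a request for greedy decoding.
        return {"do_sample": False, "max_new_tokens": max_new_tokens}
    return {
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
        "max_new_tokens": max_new_tokens,
    }


print(generation_kwargs(temperature=0.2, top_p=0.95, max_new_tokens=256))
```

Lower temperatures tend to suit structured output, since they make the JSON format more repeatable from run to run.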

## 🚀 Usage

1. **Input**: Natural language robot commands
   ```
   "Deploy Excavator 1 to Soil Area 1 for excavation"
   ```

2. **Output**: Structured task sequences
   ```json
   {
     "tasks": [
       {
         "robot": "Excavator_1",
         "action": "move_to",
         "target": "Soil_Area_1",
         "duration": 30
       },
       {
         "robot": "Excavator_1",
         "action": "excavate",
         "target": "Soil_Area_1",
         "duration": 120
       }
     ]
   }
   ```
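
Because the model returns free-form text, a consumer may want to check the JSON before sending it to a robot controller. The sketch below validates a plan against the four fields shown in the example above. The `parse_plan` helper is hypothetical, and the schema is inferred from that single example, so real outputs may include other fields.

```python
import json

# Fields and types taken from the example output above.
REQUIRED_FIELDS = {
    "robot": str,
    "action": str,
    "target": str,
    "duration": (int, float),
}


def parse_plan(raw: str) -> list:
    """Parse model output and check every task for the expected fields."""
    plan = json.loads(raw)  # raises a ValueError subclass on malformed JSON
    if not isinstance(plan, dict) or not isinstance(plan.get("tasks"), list):
        raise ValueError("expected a JSON object with a 'tasks' list")
    for i, task in enumerate(plan["tasks"]):
        if not isinstance(task, dict):
            raise ValueError(f"task {i}: expected an object")
        for field, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(task.get(field), expected_type):
                raise ValueError(f"task {i}: missing or invalid field '{field}'")
    return plan["tasks"]


example = """
{"tasks": [
  {"robot": "Excavator_1", "action": "move_to", "target": "Soil_Area_1", "duration": 30},
  {"robot": "Excavator_1", "action": "excavate", "target": "Soil_Area_1", "duration": 120}
]}
"""
tasks = parse_plan(example)
total_duration = sum(task["duration"] for task in tasks)
print(f"{len(tasks)} tasks, total duration {total_duration}")
```

The unit of `duration` is not stated in this README, so the total is reported without one.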

## 🛠️ Technical Details

- **Architecture**: Llama 3.1 8B + QLoRA adapters
- **Quantization**: 4-bit (NF4) with double quantization
- **Framework**: Transformers + PEFT + bitsandbytes (`BitsAndBytesConfig`)
- **Hardware**: ZeroGPU (Dynamic NVIDIA H200)
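
The stack listed above could be wired together roughly as follows. This is a sketch under two assumptions: that the fine-tuned repository contains PEFT adapter weights to be applied on top of the base model, and that bfloat16 is used as the compute dtype (the README does not state it). The base model is gated, so loading it requires an access-approved Hugging Face account.

```python
BASE_MODEL = "meta-llama/Llama-3.1-8B"
ADAPTER = "YongdongWang/llama-3.1-8b-dart-qlora"

# 4-bit NF4 with double quantization, as listed above.
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
}


def load_model():
    """Load the quantized base model and attach the LoRA adapter."""
    # Imported lazily so the constants above can be inspected without a GPU stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        **QUANT_KWARGS,
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: not stated in this README
    )
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL,
        quantization_config=quant_config,
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base, ADAPTER)
    return tokenizer, model
```

Calling `load_model()` needs `torch`, `transformers`, `peft`, `bitsandbytes`, and a CUDA GPU.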

## ⚡ Performance Notes

- **First Generation**: 5-10 seconds (GPU allocation + model loading)
- **Subsequent Generations**: 2-5 seconds per response
- **Memory Usage**: ~8 GB VRAM with 4-bit quantization
- **Context Length**: Up to 2048 tokens
- **GPU Duration**: 60 seconds per request
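
For a sense of where the memory goes, the back-of-the-envelope sketch below estimates two components of the footprint. It assumes roughly 8 billion parameters and the commonly published Llama 3.1 8B attention layout (32 layers, 8 key/value heads, head dimension 128) with a 16-bit KV cache. None of these figures come from this README, and the estimate ignores unquantized layers, activations, and CUDA overhead.

```python
GIB = 1024 ** 3


def weight_bytes(n_params: float, bits: int) -> float:
    """Storage for the weights alone at the given bit width."""
    return n_params * bits / 8


def kv_cache_bytes(
    tokens: int,
    layers: int = 32,          # assumed Llama 3.1 8B layout
    kv_heads: int = 8,         # assumed (grouped-query attention)
    head_dim: int = 128,       # assumed
    bytes_per_value: int = 2,  # assumed 16-bit cache
) -> int:
    """Key and value tensors for one sequence of `tokens` tokens."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


print(f"4-bit weights:          {weight_bytes(8e9, 4) / GIB:.2f} GiB")
print(f"KV cache @ 2048 tokens: {kv_cache_bytes(2048) / GIB:.2f} GiB")
```

Under these assumptions the two components account for only part of the ~8 GB figure. The rest would come from the parts this estimate leaves out.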

## 📚 Example Commands

Try these robot commands:

- `"Deploy Excavator 1 to Soil Area 1 for excavation"`
- `"Send Dump Truck 1 to collect material, then unload at storage"`
- `"Coordinate multiple excavators across different areas"`
- `"Create evacuation sequence for all robots from dangerous zone"`

## 🔬 Research Applications

This model demonstrates:
- **Natural Language → Robot Planning** translation
- **Multi-agent Task Coordination**
- **Efficient LLM Fine-tuning** with QLoRA
- **Real-time Robot Command Processing**
- **ZeroGPU Integration** for scalable deployment

## 📄 License

This project is distributed under Meta's Llama 3.1 Community License. Please review the license terms before use.

## 🤝 Contributing

For issues, improvements, or questions about the model, please visit the [model repository](https://huggingface.co/YongdongWang/llama-3.1-8b-dart-qlora).