---
title: Robot Task Planning - Llama 3.1 8B
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
app_file: app.py
pinned: false
license: llama3.1
---

# 🤖 Robot Task Planning - Llama 3.1 8B (ZeroGPU)

This Space demonstrates a fine-tuned version of Meta's **Llama 3.1 8B** model specialized for **robot task planning**, trained with the QLoRA technique (4-bit quantization + LoRA).

## 🚀 Hardware: ZeroGPU

This Space uses **ZeroGPU**, which allocates Nvidia H200 GPUs dynamically:

- **Free** for HuggingFace users
- **Dynamic allocation** - GPU resources allocated on-demand
- **High performance** - H200 offers superior performance
- **60-second duration** - GPU time limit per request

## 🎯 Purpose

Convert natural language commands into structured task sequences for construction robots, including:

- **Excavators** - Digging, loading, positioning
- **Dump Trucks** - Material transport, loading, unloading
- **Multi-robot Coordination** - Complex task dependencies

## 🔗 Model

**Fine-tuned Model**: [YongdongWang/llama-3.1-8b-dart-qlora](https://huggingface.co/YongdongWang/llama-3.1-8b-dart-qlora)

**Base Model**: [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B)

## ✨ Features

- 🎮 **Interactive Chat Interface** - Real-time robot command processing
- ⚙️ **Configurable Generation** - Adjust temperature, top-p, max tokens
- 📝 **Example Commands** - Pre-built scenarios to get started
- 🚀 **Optimized Performance** - 4-bit quantization for efficient inference
- 📊 **Structured Output** - JSON-formatted task sequences
- ⚡ **ZeroGPU Powered** - Dynamic GPU allocation for free users

## 🚀 Usage

1. **Input**: Natural language robot commands

   ```
   "Deploy Excavator 1 to Soil Area 1 for excavation"
   ```

2. **Output**: Structured task sequences

   ```json
   {
     "tasks": [
       {
         "robot": "Excavator_1",
         "action": "move_to",
         "target": "Soil_Area_1",
         "duration": 30
       },
       {
         "robot": "Excavator_1",
         "action": "excavate",
         "target": "Soil_Area_1",
         "duration": 120
       }
     ]
   }
   ```

## 🛠️ Technical Details

- **Architecture**: Llama 3.1 8B + QLoRA adapters
- **Quantization**: 4-bit (NF4) with double quantization
- **Framework**: Transformers + PEFT + bitsandbytes (`BitsAndBytesConfig`)
- **Hardware**: ZeroGPU (Dynamic Nvidia H200)

## ⚡ Performance Notes

- **First Generation**: 5-10 seconds (GPU allocation + model loading)
- **Subsequent Generations**: 2-5 seconds per response
- **Memory Usage**: ~8GB VRAM with 4-bit quantization
- **Context Length**: Up to 2048 tokens
- **GPU Duration**: 60 seconds per request

## 📚 Example Commands

Try these robot commands:

- `"Deploy Excavator 1 to Soil Area 1 for excavation"`
- `"Send Dump Truck 1 to collect material, then unload at storage"`
- `"Coordinate multiple excavators across different areas"`
- `"Create evacuation sequence for all robots from dangerous zone"`

## 🔬 Research Applications

This model demonstrates:

- **Natural Language → Robot Planning** translation
- **Multi-agent Task Coordination**
- **Efficient LLM Fine-tuning** with QLoRA
- **Real-time Robot Command Processing**
- **ZeroGPU Integration** for scalable deployment

## 📄 License

This project is released under Meta's Llama 3.1 license. Please review the license terms before use.

## 🤝 Contributing

For issues, improvements, or questions about the model, please visit the [model repository](https://huggingface.co/YongdongWang/llama-3.1-8b-dart-qlora).
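
For readers who want to consume the JSON shown under Usage in their own code, here is a minimal sketch of how the output might be parsed and sanity-checked. It uses only the Python standard library. The helper names `parse_task_plan` and `total_duration` are hypothetical and not part of this Space, and the set of required keys is an assumption taken from the single example output above.

```python
import json

# Assumption: every task carries the four keys seen in the example output.
REQUIRED_KEYS = {"robot", "action", "target", "duration"}


def parse_task_plan(raw: str) -> list:
    """Parse model output and return the validated list of tasks.

    Raises ValueError if the text is not JSON or a task is malformed.
    """
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc

    tasks = plan.get("tasks") if isinstance(plan, dict) else None
    if not isinstance(tasks, list):
        raise ValueError('expected a top-level "tasks" list')

    for index, task in enumerate(tasks):
        if not isinstance(task, dict):
            raise ValueError(f"task {index} is not a JSON object")
        missing = REQUIRED_KEYS - task.keys()
        if missing:
            raise ValueError(f"task {index} is missing keys: {sorted(missing)}")
        duration = task["duration"]
        if not isinstance(duration, (int, float)) or duration < 0:
            raise ValueError(f"task {index} has an invalid duration: {duration!r}")
    return tasks


def total_duration(tasks: list) -> float:
    """Sum the duration fields (the README does not state their unit)."""
    return sum(task["duration"] for task in tasks)


# The example output from the Usage section.
example_output = """
{
  "tasks": [
    {"robot": "Excavator_1", "action": "move_to",
     "target": "Soil_Area_1", "duration": 30},
    {"robot": "Excavator_1", "action": "excavate",
     "target": "Soil_Area_1", "duration": 120}
  ]
}
"""

tasks = parse_task_plan(example_output)
print(len(tasks), total_duration(tasks))  # prints: 2 150
```

Because a language model can emit malformed or partial JSON, raising a `ValueError` early is one reasonable way to keep bad plans away from downstream robot controllers.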
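
The quantization setup listed under Technical Details (4-bit NF4 with double quantization, loaded through Transformers and PEFT) could look roughly like the sketch below. This is not necessarily what `app.py` does. It assumes that the fine-tuned repository hosts PEFT adapters on top of the base model, that `bfloat16` is an acceptable compute dtype, and that `transformers`, `peft`, `bitsandbytes`, `accelerate`, and `torch` are installed. The function names `quantization_settings` and `load_model` are hypothetical.

```python
BASE_MODEL_ID = "meta-llama/Llama-3.1-8B"
ADAPTER_ID = "YongdongWang/llama-3.1-8b-dart-qlora"


def quantization_settings() -> dict:
    """Keyword arguments for transformers.BitsAndBytesConfig (4-bit NF4)."""
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_use_double_quant": True,
    }


def load_model():
    """Load the 4-bit base model and attach the adapters (needs a GPU)."""
    # Heavy imports stay local so this module can be imported without them.
    import torch
    from peft import PeftModel
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    bnb_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption, not from the README
        **quantization_settings(),
    )
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID,
        quantization_config=bnb_config,
        device_map="auto",  # requires the accelerate package
    )
    # Assumption: ADAPTER_ID contains LoRA adapter weights, not a merged model.
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()
    return tokenizer, model
```

Calling `load_model()` downloads the gated base model, so it requires accepting the Llama 3.1 license on the Hub and an authenticated session.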