metadata
title: Robot Task Planning - Llama 3.1 8B
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
app_file: app.py
pinned: false
license: llama3.1
Robot Task Planning - Llama 3.1 8B (ZeroGPU)
This Space demonstrates a version of Meta's Llama 3.1 8B model fine-tuned for robot task planning with QLoRA (4-bit quantization plus LoRA adapters).
Hardware: ZeroGPU
This Space runs on ZeroGPU, which allocates NVIDIA H200 GPUs dynamically:
- Free for Hugging Face users
- Dynamic allocation - GPU resources are allocated on demand
- High performance - requests run on NVIDIA H200 hardware
- 60-second GPU duration limit per request
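The on-demand allocation described above is typically requested per function call. The following is a minimal sketch, assuming the app uses the `spaces.GPU` decorator from the `spaces` package; `plan_tasks` is a hypothetical handler name, not taken from this Space's `app.py`.

```python
try:
    import spaces  # provided on Hugging Face Spaces with ZeroGPU hardware

    # Ask for a GPU for at most 60 seconds each time the function is called.
    gpu = spaces.GPU(duration=60)
except ImportError:
    # Local fallback so the script still runs where `spaces` is not installed.
    def gpu(func):
        return func


@gpu
def plan_tasks(command: str) -> str:
    """Hypothetical handler; a real app would call model.generate() here."""
    return f"received: {command}"


print(plan_tasks("Deploy Excavator 1 to Soil Area 1 for excavation"))
```

With this pattern the GPU is held only while the decorated function runs, which is why the duration limit applies per request rather than per session.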
Purpose
Converts natural-language commands into structured task sequences for construction robots, including:
- Excavators - Digging, loading, positioning
- Dump Trucks - Material transport, loading, unloading
- Multi-robot Coordination - Complex task dependencies
Model
Fine-tuned Model: YongdongWang/llama-3.1-8b-dart-qlora
Base Model: meta-llama/Llama-3.1-8B
Features
- Interactive Chat Interface - Real-time robot command processing
- Configurable Generation - Adjust temperature, top-p, max tokens
- Example Commands - Pre-built scenarios to get started
- Optimized Performance - 4-bit quantization for efficient inference
- Structured Output - JSON-formatted task sequences
- ZeroGPU Powered - Dynamic GPU allocation for free users
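To illustrate the configurable generation settings, here is a rough sketch of how the three UI values could be validated and mapped to keyword arguments for a Transformers `generate()` call. The helper name, the default values, and the allowed ranges are assumptions for illustration; the Space's actual slider limits are not documented here.

```python
def generation_kwargs(temperature: float = 0.7, top_p: float = 0.9,
                      max_new_tokens: int = 512) -> dict:
    """Validate UI values and map them to generate() keyword arguments."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0, 2]")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be positive")
    if temperature == 0.0:
        # Temperature 0 is treated as greedy decoding; sampling knobs are omitted.
        return {"do_sample": False, "max_new_tokens": max_new_tokens}
    return {
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
        "max_new_tokens": max_new_tokens,
    }


print(generation_kwargs(temperature=0.3, top_p=0.95, max_new_tokens=256))
```

Lower temperature and lower top-p make the output more deterministic, which usually suits structured JSON output better than free-form chat.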
Usage
Input: Natural language robot commands
"Deploy Excavator 1 to Soil Area 1 for excavation"
Output: Structured task sequences
{
  "tasks": [
    {
      "robot": "Excavator_1",
      "action": "move_to",
      "target": "Soil_Area_1",
      "duration": 30
    },
    {
      "robot": "Excavator_1",
      "action": "excavate",
      "target": "Soil_Area_1",
      "duration": 120
    }
  ]
}
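A downstream consumer would normally validate this output before dispatching it to a robot. Below is a small sketch that parses the example above into typed records. The `Task` class and `parse_plan` function are hypothetical names, and the schema is inferred only from the single example shown, so the model may emit other fields.

```python
import json
from dataclasses import dataclass


@dataclass
class Task:
    robot: str
    action: str
    target: str
    duration: int  # unit is not stated in this README


def parse_plan(raw: str) -> list[Task]:
    """Parse model output and fail loudly if it does not match the example schema."""
    data = json.loads(raw)
    if not isinstance(data, dict) or not isinstance(data.get("tasks"), list):
        raise ValueError("expected an object with a top-level 'tasks' list")
    # Task(**item) raises TypeError on missing or unexpected fields.
    return [Task(**item) for item in data["tasks"]]


raw = """
{ "tasks": [
  { "robot": "Excavator_1", "action": "move_to", "target": "Soil_Area_1", "duration": 30 },
  { "robot": "Excavator_1", "action": "excavate", "target": "Soil_Area_1", "duration": 120 }
] }
"""
plan = parse_plan(raw)
total_duration = sum(task.duration for task in plan)
print([task.action for task in plan], total_duration)
```

Rejecting malformed output at this stage is safer than passing a partially valid plan to hardware.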
Technical Details
- Architecture: Llama 3.1 8B + QLoRA adapters
- Quantization: 4-bit (NF4) with double quantization
- Framework: Transformers + PEFT + bitsandbytes (configured via BitsAndBytesConfig)
- Hardware: ZeroGPU (Dynamic NVIDIA H200)
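The stack listed above could be wired together roughly as follows. This is a sketch, not the Space's actual loading code: the `bfloat16` compute dtype, `device_map="auto"`, and loading the tokenizer from the base model are assumptions, and both function names are hypothetical.

```python
def bnb_4bit_settings() -> dict:
    """Keyword arguments for transformers.BitsAndBytesConfig (NF4 + double quantization)."""
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_use_double_quant": True,
    }


def load_model(base_id: str = "meta-llama/Llama-3.1-8B",
               adapter_id: str = "YongdongWang/llama-3.1-8b-dart-qlora"):
    # Heavy imports are deferred so this file can be imported without a GPU stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
        **bnb_4bit_settings(),
    )
    tokenizer = AutoTokenizer.from_pretrained(base_id)  # assumes the base tokenizer is used
    base = AutoModelForCausalLM.from_pretrained(
        base_id, quantization_config=quant, device_map="auto"
    )
    # Attach the fine-tuned LoRA adapter weights on top of the quantized base model.
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model
```

Calling `load_model()` downloads the gated base model, so it requires accepted license terms and a Hugging Face access token.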
Performance Notes
- First Generation: 5-10 seconds (GPU allocation + model loading)
- Subsequent Generations: 2-5 seconds per response
- Memory Usage: ~8 GB VRAM with 4-bit quantization
- Context Length: Up to 2048 tokens
- GPU Duration: 60 seconds per request
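As a back-of-envelope check on the memory figure, the quantized weights alone can be estimated as below. The parameter count (about 8.03 billion) and the per-parameter overhead of roughly 0.127 bits for quantization constants under double quantization (the figure reported in the QLoRA paper) are assumptions brought in for illustration. The sketch also simplifies by treating every parameter as quantized.

```python
PARAMS = 8.03e9        # approximate parameter count of Llama 3.1 8B (assumption)
WEIGHT_BITS = 4.0      # NF4 stores each quantized weight in 4 bits
CONSTANT_BITS = 0.127  # assumed per-parameter overhead with double quantization


def quantized_weight_gb(params: float = PARAMS,
                        bits_per_param: float = WEIGHT_BITS + CONSTANT_BITS) -> float:
    """Estimate weight storage in decimal gigabytes: bits -> bytes -> GB."""
    return params * bits_per_param / 8 / 1e9


estimate = quantized_weight_gb()
print(f"quantized weights: about {estimate:.1f} GB")
```

The estimate lands at roughly half of the ~8 GB quoted above. The remainder is plausibly taken by layers kept in higher precision, the LoRA adapters, the KV cache, activations, and the CUDA context, though this README does not break the figure down.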
Example Commands
Try these robot commands:
"Deploy Excavator 1 to Soil Area 1 for excavation"
"Send Dump Truck 1 to collect material, then unload at storage"
"Coordinate multiple excavators across different areas"
"Create evacuation sequence for all robots from dangerous zone"
Research Applications
This model demonstrates:
- Natural Language → Robot Planning translation
- Multi-agent Task Coordination
- Efficient LLM Fine-tuning with QLoRA
- Real-time Robot Command Processing
- ZeroGPU Integration for scalable deployment
License
This project is subject to Meta's Llama 3.1 Community License Agreement. Please review the license terms before use.
Contributing
For issues, improvements, or questions about the model, please visit the model repository (YongdongWang/llama-3.1-8b-dart-qlora).