---
title: "DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution"
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
app_file: app.py
pinned: false
license: llama3.1
---

<div align="center">
<h1>DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models (Spaces)</h1>
<div class="project-info">
    This project is part of the <a href="https://moonshot-cafe-project.org/en/" target="_blank">Moonshot Café Project</a>
</div>
<div class="authors">
  <a href="https://researchmap.jp/wangyongdong?lang=en" target="_blank">Yongdong Wang</a><sup class="org-1">1,*</sup>,
  Runze Xiao<sup class="org-1">1</sup>,
  <a href="https://www.robot.t.u-tokyo.ac.jp/~louhi_kasahara/index-e.html" target="_blank">Jun Younes Louhi Kasahara</a><sup class="org-1">1</sup>,
  <a href="https://researchmap.jp/r-yaj?lang=en" target="_blank">Ryosuke Yajima</a><sup class="org-1">1</sup>,
  <a href="http://k-nagatani.org/" target="_blank">Keiji Nagatani</a><sup class="org-1">1</sup><sup class="org-2">, 2</sup>,
  <a href="https://www.robot.t.u-tokyo.ac.jp/~yamashita/" target="_blank">Atsushi Yamashita</a><sup class="org-3">3</sup>,
  <a href="https://www.robot.t.u-tokyo.ac.jp/asamalab/en/members/asama/biography.html" target="_blank">Hajime Asama</a><sup class="org-4">4</sup>
</div>
<div class="affiliations">
  <sup class="org-1">1</sup>Graduate School of Engineering, The University of Tokyo<br>
  <sup class="org-2">2</sup>Faculty of Systems and Information Engineering, University of Tsukuba<br>
  <sup class="org-3">3</sup>Graduate School of Frontier Sciences, The University of Tokyo<br>
  <sup class="org-4">4</sup>Tokyo College, The University of Tokyo
</div>
<div class="corresponding-author">
  *Corresponding author: <a href="mailto:[email protected]">[email protected]</a>
</div>
<div align="center">
  <a href="https://arxiv.org/pdf/2411.09022" target="_blank" rel="noopener noreferrer">
    <img src="https://img.shields.io/badge/arXiv-2411.09022-b31b1b" alt="arXiv Badge">
  </a>
  <a href="https://github.com/wyd0817/QA_LLM_Module" target="_blank" rel="noopener noreferrer">
    <img src="https://img.shields.io/badge/QA_LLM_Module-GitHub-blue" alt="QA LLM Module GitHub Badge">
  </a>
  <a href="https://huggingface.co/datasets/YongdongWang/dart_llm_tasks" target="_blank" rel="noopener noreferrer">
    <img src="https://img.shields.io/badge/Dataset-Hugging_Face-blue" alt="Dataset Badge">
  </a>
  <a href="https://huggingface.co/spaces/YongdongWang/DART-LLM-Llama3.1-8b" target="_blank" rel="noopener noreferrer">
    <img src="https://img.shields.io/badge/Spaces-DART--LLM--Llama3.1--8b-lightgrey" alt="Spaces Badge">
  </a>
  <a href="https://www.youtube.com/watch?v=p3A-yg3yv0Q" target="_blank" rel="noopener noreferrer">
    <img src="https://img.shields.io/badge/Video-YouTube-red" alt="Video Badge">
  </a>
  <a href="https://www.youtube.com/watch?v=T3M94hP8NFQ" target="_blank" rel="noopener noreferrer">
    <img src="https://img.shields.io/badge/Real_Robot-YouTube-orange" alt="Real Robot Badge">
  </a>
</div>

## Overview

This Hugging Face Space hosts DART-LLM, a meta-llama/Llama-3.1-8B model fine-tuned with QLoRA for construction robotics. It converts natural-language robot commands into structured JSON task sequences with explicit inter-task dependencies, supporting multi-robot coordination, spatial reasoning, and action planning.
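
For example, a command like "Deploy Excavator 1 to Soil Area 1 for excavation" is decomposed into a dependency-annotated task sequence along these lines (illustrative only; the exact field names here are an assumption, not the model's guaranteed schema):

```json
{
  "tasks": [
    {
      "task": "task_1",
      "instruction_function": {
        "name": "Excavation",
        "robot_ids": ["robot_excavator_01"],
        "dependencies": [],
        "object_keywords": ["soil_area_1"]
      }
    }
  ]
}
```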

## Quick Start

1. Enter your robot command in the provided interface.
2. Click **Generate Tasks**.
3. Review the structured JSON output describing the robot task sequence.
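
You can also call the Space programmatically with `gradio_client`. The sketch below is illustrative: the endpoint name and argument layout depend on how `app.py` defines the interface, so check `Client.view_api()` before relying on it.

```python
from gradio_client import Client

# Connect to this Space.
client = Client("YongdongWang/DART-LLM-Llama3.1-8b")

# Inspect the exposed endpoints and their signatures first.
client.view_api()

# Hypothetical call; positional arguments must match what app.py exposes.
result = client.predict("Deploy Excavator 1 to Soil Area 1 for excavation")
print(result)
```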

## Local/Edge Deployment (Recommended for Jetson)

For local deployment on edge devices such as NVIDIA Jetson, we recommend the quantized GGUF models below for the best balance of performance and memory efficiency:

### Available GGUF Models

| Model | Size | Memory Usage | Recommended Hardware |
|-------|------|--------------|---------------------|
| [llama-3.2-1b-lora-qlora-dart-llm-gguf](https://huggingface.co/YongdongWang/llama-3.2-1b-lora-qlora-dart-llm-gguf) | 870MB | ~2GB RAM | Jetson Nano, Jetson Orin Nano |
| [llama-3.2-3b-lora-qlora-dart-llm-gguf](https://huggingface.co/YongdongWang/llama-3.2-3b-lora-qlora-dart-llm-gguf) | 1.9GB | ~4GB RAM | Jetson Orin NX, Jetson AGX Orin |
| [llama-3.1-8b-lora-qlora-dart-llm-gguf](https://huggingface.co/YongdongWang/llama-3.1-8b-lora-qlora-dart-llm-gguf) | 4.6GB | ~8GB RAM | High-end Jetson AGX Orin |
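
Each GGUF file can be fetched with `wget` (shown in Option 2 below) or with the Hugging Face CLI; a sketch for the 1B model, using the same quantized filename as the examples that follow:

```bash
# Install the Hugging Face CLI and download the quantized model file
# into the current directory.
pip install -U "huggingface_hub[cli]"
huggingface-cli download YongdongWang/llama-3.2-1b-lora-qlora-dart-llm-gguf \
  llama_3.2_1b-lora-qlora-dart-llm_q5_k_m.gguf --local-dir .
```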

### Deployment Options

#### Option 1: Using Ollama (Recommended)

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Create a Modelfile
cat > Modelfile << EOF
FROM ./llama_3.2_1b-lora-qlora-dart-llm_q5_k_m.gguf
TEMPLATE """### Instruction:
{{ .Prompt }}

### Response:
"""
PARAMETER stop "### Instruction:"
PARAMETER stop "### Response:"
EOF

# Create the model
ollama create dart-llm-1b -f Modelfile

# Run inference
ollama run dart-llm-1b "Deploy Excavator 1 to Soil Area 1 for excavation"
```
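
Once created, the model can also be queried through Ollama's local REST API (it listens on port 11434 by default); a minimal sketch:

```bash
# Query the running Ollama server; "stream": false returns one JSON
# object instead of streaming line-delimited chunks.
curl http://localhost:11434/api/generate -d '{
  "model": "dart-llm-1b",
  "prompt": "Deploy Excavator 1 to Soil Area 1 for excavation",
  "stream": false
}'
```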

#### Option 2: Using llama.cpp

```bash
# Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Download model
wget https://huggingface.co/YongdongWang/llama-3.2-1b-lora-qlora-dart-llm-gguf/resolve/main/llama_3.2_1b-lora-qlora-dart-llm_q5_k_m.gguf

# Run inference
./main -m llama_3.2_1b-lora-qlora-dart-llm_q5_k_m.gguf \
  -p "### Instruction:\nDeploy Excavator 1 to Soil Area 1 for excavation\n\n### Response:\n" \
  -n 512
```
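
Recent llama.cpp checkouts build with CMake and rename the binaries (`main` becomes `llama-cli`, `server` becomes `llama-server`), so adjust the commands above to match your version. If you prefer serving the model over HTTP rather than one-shot inference, the bundled server works with the same GGUF file; a minimal sketch:

```bash
# Start llama.cpp's HTTP server (named `server` in older checkouts,
# `llama-server` in newer ones) on port 8080.
./llama-server -m llama_3.2_1b-lora-qlora-dart-llm_q5_k_m.gguf --port 8080

# Query the native completion endpoint from another shell.
curl http://localhost:8080/completion -H "Content-Type: application/json" \
  -d '{"prompt": "### Instruction:\nDeploy Excavator 1 to Soil Area 1 for excavation\n\n### Response:\n", "n_predict": 512}'
```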

#### Option 3: Using Python (llama-cpp-python)

```bash
# Install llama-cpp-python
pip install llama-cpp-python

# Python script
python3 << EOF
from llama_cpp import Llama

# Load model
llm = Llama(model_path="llama_3.2_1b-lora-qlora-dart-llm_q5_k_m.gguf", n_ctx=2048)

# Generate response
prompt = "### Instruction:\nDeploy Excavator 1 to Soil Area 1 for excavation\n\n### Response:\n"
output = llm(prompt, max_tokens=512, stop=["</s>"], echo=False)

print(output['choices'][0]['text'])
EOF
```
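
Because downstream robot execution depends on the output being valid JSON, it is worth parsing and validating the generated text before dispatching tasks. A minimal sketch, assuming the task JSON is embedded somewhere in the model's raw output:

```python
import json

def parse_task_output(text: str) -> dict:
    """Extract and parse the first JSON object found in model output."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start:end + 1])

# With the `output` dict from the llama-cpp-python example above:
# tasks = parse_task_output(output['choices'][0]['text'])
```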

## Citation

If you use this work, please cite:

```bibtex
@article{wang2024dart,
  title={{DART-LLM}: Dependency-aware multi-robot task decomposition and execution using large language models},
  author={Wang, Yongdong and Xiao, Runze and Kasahara, Jun Younes Louhi and Yajima, Ryosuke and Nagatani, Keiji and Yamashita, Atsushi and Asama, Hajime},
  journal={arXiv preprint arXiv:2411.09022},
  year={2024}
}
```