Model Card for sciencebase-metadata-llama3-8b (v1.0)
Model Details
| Field | Value |
|---|---|
| Developed by | Quan Quy, Travis Ping, Tudor Garbulet, Chirag Shah, Austin Aguilar |
| Contact | [email protected] • [email protected] • [email protected] • [email protected] • [email protected] |
| Funded by | U.S. Geological Survey (USGS) & Oak Ridge National Laboratory – ARM Data Center |
| Model type | Autoregressive LLM, instruction-tuned for structured-data → metadata generation |
| Base model | meta-llama/Llama-3.1-8B-Instruct |
| Languages | English (metadata vocabulary) |
| Finetuned from | unsloth/Meta-Llama-3.1-8B-Instruct |
Model Description
This model was fine-tuned on ≈9,000 ScienceBase “data → metadata” pairs to automate the creation of FGDC/ISO-style metadata records for scientific datasets.
Model Sources
| Resource | Link |
|---|---|
| Repository | https://huggingface.co/ARM-Development/Llama-3.1-8B-tabular-1.0 |
| Demo | https://colab.research.google.com/drive/1saCEFhkBYDhQWkdTwnwiE_-AiWmD6p0f#scrollTo=WeniLP-Ah1QL |
Uses
Direct Use
Generate schema-compliant metadata text from a JSON/CSV representation of a ScienceBase item.
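The sketch below shows one way to do this with transformers. It is a minimal example under stated assumptions: the published checkpoint loads directly with `AutoModelForCausalLM` (an adapter-only release would instead be attached to the base model with peft), and the example item and prompt wording are illustrative, not the exact instruction template used in training.

```python
# Minimal inference sketch (assumptions noted above).
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ARM-Development/Llama-3.1-8B-tabular-1.0"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

# Hypothetical ScienceBase item; field names are illustrative only.
item = {"title": "Streamflow measurements, 2020-2023", "provider": "USGS"}

messages = [
    {"role": "system", "content": "You generate metadata records for scientific datasets."},
    {"role": "user", "content": "Generate a metadata record for this item:\n" + json.dumps(item)},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```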
Downstream Use
Integrate as a micro-service in data-repository pipelines.
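As a sketch of that pattern, the snippet below wraps generation in a small HTTP service. FastAPI and the transformers text-generation pipeline are illustrative choices, not part of this model's released code or listed stack, and the endpoint and prompt wording are hypothetical.

```python
# Hypothetical micro-service wrapper around the model (illustrative only).
import json
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
generator = pipeline(
    "text-generation",
    model="ARM-Development/Llama-3.1-8B-tabular-1.0",
    torch_dtype="auto",
    device_map="auto",
)

class Item(BaseModel):
    fields: dict  # arbitrary key/value view of a ScienceBase item (illustrative)

@app.post("/metadata")
def generate_metadata(item: Item) -> dict:
    messages = [{
        "role": "user",
        "content": "Generate a metadata record for this item:\n" + json.dumps(item.fields),
    }]
    out = generator(messages, max_new_tokens=512, return_full_text=False)
    return {"metadata": out[0]["generated_text"]}
```

Such a service could then be launched with uvicorn (module name depends on the file) and called by POSTing the item JSON to `/metadata`.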
Out-of-Scope
Open-ended content generation or any application outside metadata curation.
Bias, Risks, and Limitations
- Domain-specific bias toward ScienceBase field names.
- Possible hallucination of fields when prompts are underspecified.
Training Details
Training Data
- ≈9,000 ScienceBase records paired with curated metadata.
Training Procedure
| Hyper-parameter | Value |
|---|---|
| Max sequence length | 100,000 tokens |
| Precision | fp16 / bf16 (auto) |
| Quantisation | 4-bit QLoRA (load_in_4bit=True) |
| LoRA rank / α | 16 / 16 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Optimiser | adamw_8bit |
| LR / schedule | 2 × 10⁻⁴, linear |
| Epochs | 1 |
| Effective batch size | 4 (1 GPU × per-device batch 1 × gradient accumulation 4) |
| Trainer | trl SFTTrainer + peft 0.15.2 |
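For orientation, the hyper-parameters above map onto an unsloth + trl set-up roughly like the following sketch. This is not the released training script: the dataset path and text formatting are assumptions, and some keyword names shift between trl releases.

```python
# Training-configuration sketch reproducing the table above (assumptions noted).
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct",
    max_seq_length=100_000,
    dtype=None,            # auto fp16 / bf16
    load_in_4bit=True,     # 4-bit QLoRA
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Hypothetical JSONL file with a pre-formatted "text" column of prompt + response pairs.
dataset = load_dataset("json", data_files="sciencebase_pairs.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="outputs",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,   # effective batch of 4 on one GPU
        num_train_epochs=1,
        learning_rate=2e-4,
        lr_scheduler_type="linear",
        optim="adamw_8bit",
        max_seq_length=100_000,
    ),
)
trainer.train()
```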
Hardware & Runtime
| Field | Value |
|---|---|
| GPU | 1 × NVIDIA A100 80 GB |
| Total training time | ~120 hours |
| Cloud/HPC provider | ARM Cumulus HPC |
Software Stack
| Package | Version |
|---|---|
| Python | 3.12.9 |
| PyTorch | 2.6.0 + CUDA 12.4 |
| Transformers | 4.51.3 |
| Accelerate | 1.6.0 |
| PEFT | 0.15.2 |
| Unsloth | 2025.3.19 |
| BitsAndBytes | 0.45.5 |
| TRL | 0.15.2 |
| Xformers | 0.0.29.post3 |
| Datasets | 3.5.0 |
| … | … |
Evaluation
Evaluation is still in progress.
Technical Specifications
Architecture & Objective
QLoRA-tuned Llama-3.1-8B-Instruct; causal-LM objective with structured-to-text instruction prompts.
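As an illustration of that objective, a single training example can be rendered as one chat-templated string over which the causal-LM (next-token) loss is computed. The field names and instruction wording below are hypothetical, not the exact template used for this model.

```python
# Illustrative formatting of one "structured data -> metadata" training pair.
import json
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("unsloth/Meta-Llama-3.1-8B-Instruct")

item = {"title": "Streamflow measurements, 2020-2023", "format": "CSV"}         # hypothetical input
metadata = "<metadata><idinfo><citation>...</citation></idinfo></metadata>"     # hypothetical target

text = tokenizer.apply_chat_template(
    [
        {"role": "user",
         "content": "Generate a metadata record for this item:\n" + json.dumps(item)},
        {"role": "assistant", "content": metadata},
    ],
    tokenize=False,
)
print(text)  # the causal-LM objective is next-token prediction over this string
```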
Model Card Authors
Quan Quy, Travis Ping, Tudor Garbulet, Chirag Shah, Austin Aguilar
Model tree for ARM-Development/Llama-3.1-8B-tabular-1.0
- meta-llama/Llama-3.1-8B (base model)
- → meta-llama/Llama-3.1-8B-Instruct (finetuned)
- → unsloth/Meta-Llama-3.1-8B-Instruct (finetuned)
- → ARM-Development/Llama-3.1-8B-tabular-1.0 (this model)