GeoLLM-Qwen3.5-2B

Fine-tuned Qwen3.5-2B for mineral exploration geology, targeting the Western Australian geological domain.

This model is part of the GeoLLM-Qwen3.5-FineTune benchmark. All five model sizes (0.8B--27B) were trained with identical hyperparameters on the same dataset for fair comparison.

Performance

Metric	Base	Fine-tuned
Overall weighted score	0.355	0.343
QA ROUGE-L	0.1336	0.1869
QA BERTScore	0.8108	0.8475
CoT ROUGE-L	0.1261	0.1972
Hallucination pass rate	80.0%	26.7%
Training loss	--	1.576

See the full benchmark comparison for all models.
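To make the table easier to read at a glance, the short sketch below computes the absolute and relative change of each metric from base to fine-tuned. The values are copied from the table; the `metrics` dictionary and the `deltas` helper are illustrative names, not part of the benchmark code.

```python
# (base, fine-tuned) pairs copied from the performance table.
# The hallucination pass rate is expressed as a fraction (80.0% -> 0.800).
metrics = {
    "Overall weighted score": (0.355, 0.343),
    "QA ROUGE-L": (0.1336, 0.1869),
    "QA BERTScore": (0.8108, 0.8475),
    "CoT ROUGE-L": (0.1261, 0.1972),
    "Hallucination pass rate": (0.800, 0.267),
}


def deltas(base, tuned):
    """Return (absolute change, relative change) of tuned versus base."""
    absolute = tuned - base
    return absolute, absolute / base


for name, (base, tuned) in metrics.items():
    absolute, relative = deltas(base, tuned)
    print(f"{name:<26} {absolute:+.4f} ({relative:+.1%})")
```

Read this way, the text-overlap metrics (ROUGE-L, BERTScore) improve after fine-tuning, while the hallucination pass rate falls by 53.3 percentage points and the overall weighted score by 0.012.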

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AshkanTaghipour/GeoLLM-Qwen3.5-2B"

# Load the weights in bfloat16; device_map="auto" places them on the available device(s).
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a specialist geologist with expertise in Western Australian mineral exploration."},
    {"role": "user", "content": "What geophysical methods would you recommend for targeting komatiite-hosted nickel sulphide deposits in the Eastern Goldfields?"},
]

# Render the chat as a prompt string; enable_thinking=False requests a direct answer without a thinking block.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sampling must be enabled for temperature and top_p to apply, so set do_sample explicitly.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.95)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

Training Details

Method: bf16 LoRA (r=16, alpha=16) via Unsloth + SFTTrainer
Epochs: 5
Optimizer: adamw_8bit, cosine LR schedule (2e-4, 10% warmup)
Dataset: mineral-exploration-geology-qa (479 train, 26 test examples)
Hardware: NVIDIA A100-80GB
VRAM (training): 5 GB
VRAM (inference): ~5 GB
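The learning-rate schedule listed above (peak 2e-4, 10% warmup, cosine decay) can be sketched in plain Python as follows. This is an illustration of the usual linear-warmup-plus-cosine form, not the trainer's own code. The `lr_at` helper is a hypothetical name, and the effective batch size of 8 is an assumption made only to get a concrete step count, since the batch size is not stated in the training details.

```python
import math


def lr_at(step, total_steps, peak_lr=2e-4, warmup_ratio=0.10):
    """Learning rate at `step`: linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumed effective batch size of 8 (not given in the source); 479 training
# examples and 5 epochs are taken from the training details.
steps_per_epoch = math.ceil(479 / 8)
total_steps = steps_per_epoch * 5

# Sample the schedule at the start, mid-warmup, peak, mid-decay, and end.
for step in (0, total_steps // 20, total_steps // 10, total_steps // 2, total_steps):
    print(step, f"{lr_at(step, total_steps):.2e}")
```

With a different batch size or gradient accumulation setting the step counts change, but the shape of the curve stays the same: a linear ramp over the first 10% of steps, then a smooth decay toward zero.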

Author

Ashkan Taghipour -- GitHub | HuggingFace

Model tree for AshkanTaghipour/GeoLLM-Qwen3.5-2B

Base model: Qwen/Qwen3.5-2B-Base
Finetuned: Qwen/Qwen3.5-2B
Adapter: this model
