HVF-SLM v3 (Qwen): Maritime Domain SLM
We present HVF-SLM, the first language model built specifically for maritime intelligence and data. All dataset creation and supervised fine-tuning (SFT) were conducted by Hitachi Vantara Federal. This is the third model HVF has produced for this domain; the same dataset was previously used for both Magistral (v1) and Llama (v2), as detailed below. This model, v3 (Qwen), is by far the strongest of the three and is extremely fast: at just 7B parameters, it competes directly with much larger, more expensive models.
Less is better for domain-specific, mission-critical applications.
v3, based on Qwen2.5-7B, is the third and final iteration in the HVF-SLM AIS research. It addresses the critical failures of v1-magistral and v2-llama, demonstrating accurate vessel extraction and maritime calculations without hallucination, even when provided with 100k+ tokens of structured AIS JSON data.
Model Performance
Validated Capabilities:
- Accurate vessel extraction: Successfully identifies and extracts specific vessels from 100k+ token JSON contexts
- No coordinate hallucination: Uses only actual vessel positions from provided data
- Correct maritime calculations: Applies correct physics, rhumb-line formulas, and nautical units as trained
- No repetition issues: Generates clean outputs without the endless repetition of v2
- 100k context window: Handles massive AIS datasets via YaRN scaling
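The rhumb-line calculations mentioned above follow the standard Mercator-projection formulation. A minimal sketch in Python (the function name and the nautical-mile Earth radius are our own illustration, not part of the model or its training code):

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles

def rhumb_distance_nm(lat1, lon1, lat2, lon2):
    """Rhumb-line (constant-bearing) distance between two points, in nautical miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Take the shorter way around the antimeridian.
    if abs(dlam) > math.pi:
        dlam = dlam - math.copysign(2 * math.pi, dlam)
    # Latitude difference as projected on a Mercator chart.
    dpsi = math.log(math.tan(math.pi / 4 + phi2 / 2) / math.tan(math.pi / 4 + phi1 / 2))
    # East-west course correction; fall back to cos(lat) on a constant-latitude track.
    q = dphi / dpsi if abs(dpsi) > 1e-12 else math.cos(phi1)
    return math.sqrt(dphi ** 2 + (q * dlam) ** 2) * EARTH_RADIUS_NM
```

One degree of longitude along the equator, or one degree of latitude along a meridian, both come out to roughly 60 nautical miles, which is a quick sanity check for any AIS distance output.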
Model Details
- Base Model: Qwen/Qwen2.5-7B-Instruct
- Context Length: 100k tokens (YaRN rope_scaling factor 4.0)
- Training Dataset: 21,497 synthetic maritime Q&A pairs (averaging 95k tokens per example)
- Fine-tuning Method: QLoRA rank 256, alpha 512, LoRA dropout 0.1
- Training Loss: 0.117 (training) / 0.084 (eval)
- Optimal Temperature: 0.7 (tested range: 0.1-0.9)
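The 100k window comes from YaRN rope scaling, which the Qwen2.5 documentation enables through a `rope_scaling` entry in the model's `config.json`. A sketch of that entry (the `original_max_position_embeddings` value of 32768 is Qwen2.5's stock context window, stated here as an assumption):

```python
import json

# Hypothetical sketch of the YaRN length-extension entry described in the
# Qwen2.5 documentation; factor 4.0 matches this model card.
rope_scaling = {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,  # Qwen2.5's native window (assumed)
}
print(json.dumps({"rope_scaling": rope_scaling}, indent=2))
```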
Inference
We highly recommend the following settings for inference. In our case, we serve the model with vLLM.
payload = {
    "model": "hvf-slm-qwen",
    "prompt": full_prompt,
    "max_tokens": 2500,
    "temperature": 0.7,
    "top_p": 0.9,
    "stop": ["<|im_end|>", "<|im_start|>"],
}
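A minimal end-to-end request with these settings might look like the following. The endpoint URL and server setup are assumptions; vLLM exposes an OpenAI-compatible `/v1/completions` route when run as a server:

```python
import json
import urllib.request

def build_payload(full_prompt: str) -> dict:
    """Assemble the recommended sampling settings for hvf-slm-qwen."""
    return {
        "model": "hvf-slm-qwen",
        "prompt": full_prompt,
        "max_tokens": 2500,
        "temperature": 0.7,
        "top_p": 0.9,
        "stop": ["<|im_end|>", "<|im_start|>"],
    }

def complete(full_prompt: str, base_url: str = "http://localhost:8000") -> str:
    # POST to vLLM's OpenAI-compatible completions endpoint (URL is an assumption).
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(build_payload(full_prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]
```

The stop strings matter: Qwen2.5's ChatML delimiters (`<|im_end|>`, `<|im_start|>`) prevent the model from continuing past its own turn.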
Comparisons
After v1-magistral's complete failure and v2-llama's hallucination issues, v3-qwen succeeds due to:
- Architectural advantages: Qwen2.5's native long-context support and structured data capabilities (pre-trained on JSON and long contexts)
- Training methodology: Questions positioned before vessel data to prevent truncation
- System instruction inclusion: Maritime context provided during training
- Aggressive learning rate: A 2e-4 learning rate forced genuine learning rather than memorization
- Proper regularization: Dropout and weight decay prevented overfitting
- Cosine restarts: A cosine learning-rate schedule with hard restarts decayed the LR within each cycle, improving robustness and helping escape loss plateaus
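The cosine-restart schedule above can be sketched as a plain function of training progress. This is a generic hard-restart cosine decay, not HVF's exact trainer code; Hugging Face's `get_cosine_with_hard_restarts_schedule_with_warmup` implements the same shape:

```python
import math

def cosine_with_restarts(step: int, total_steps: int,
                         base_lr: float = 2e-4, num_cycles: int = 3) -> float:
    """Cosine decay with hard restarts: LR falls toward 0 within each cycle,
    then jumps back to base_lr at the start of the next cycle."""
    progress = step / total_steps
    if progress >= 1.0:
        return 0.0
    cycle_progress = (num_cycles * progress) % 1.0  # position within the current cycle
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * cycle_progress))
```

At step 0 the LR is the full 2e-4 from the model card; mid-cycle it has decayed halfway, and each restart resets it, which is what lets training climb out of a loss plateau.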
Validated Use Cases
- Maritime vessel tracking and identification
- AIS data analysis and extraction
- Vessel trajectory calculations
- Port congestion analysis
- Maritime safety assessments
Research Value
- Low training loss (0.0002 in v2) can indicate memorization rather than learning
- Proper context ordering (questions first) is critical for extreme sequence lengths
- System instructions must be included during training, not just inference
- Higher learning rates with instability can force genuine pattern learning
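The question-first ordering can be illustrated with a simple prompt builder. The function name and template below are our own sketch, not the exact HVF training format:

```python
import json

def build_prompt(question: str, ais_records: list[dict]) -> str:
    """Place the question BEFORE the bulky AIS JSON so it survives
    any truncation of the right-hand end of an extreme-length sequence."""
    data = json.dumps(ais_records)
    return (
        "You are a maritime intelligence assistant.\n\n"  # system-style maritime context
        f"Question: {question}\n\n"
        f"AIS data:\n{data}\n"
    )
```

With 95k-token examples, putting the question last risks it being clipped away; leading with it guarantees the model always sees what it is being asked.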
Citation
Part of the HVF-SLM research series documenting iterative improvements in maritime AI. Full citation available upon publication.