---
license: cc-by-nc-4.0
language:
- en
base_model:
- mistralai/Ministral-8B-Instruct-2410
base_model_relation: finetune
pipeline_tag: text-generation
library_name: transformers
tags:
- alignment
- conversational
- conversational-ai
- collaborate
- chat
- chatbot
- research
- persona
- personality
- friendly
- reasoning
- vanta-research
- LLM
- collaborative-ai
- frontier
- reflective
---

<div align="center">

![vanta_trimmed](https://cdn-uploads.huggingface.co/production/uploads/686c460ba3fc457ad14ab6f8/hcGtMtCIizEZG_OuCvfac.png)
  
  <h1>VANTA Research</h1>
    
  <p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p>
  
  <p>
    <a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a>
    <a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a>
    <a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a>
    <a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a>
  </p>
</div>

---

# Atom v1 8B Preview

**Developed by VANTA Research**

Atom v1 8B Preview is a fine-tuned language model designed to serve as a collaborative thought partner. Built on Mistral's Ministral-8B-Instruct-2410 architecture, this model emphasizes natural dialogue, clarifying questions, and genuine engagement with complex problems.
This model was developed as part of a larger research and development project on Atom's persona and its cross-architectural compatibility.

## Model Details

- **Model Type:** Causal language model (decoder-only transformer)
- **Base Model:** mistralai/Ministral-8B-Instruct-2410
- **Parameters:** 8 billion
- **Training Method:** Low-Rank Adaptation (LoRA) fine-tuning
- **License:** CC BY-NC 4.0 (Non-Commercial Use)
- **Language:** English
- **Developed by:** VANTA Research, Portland, Oregon

## Intended Use

Atom v1 8B Preview is designed for:

- Collaborative problem-solving and brainstorming
- Technical explanations with accessible analogies
- Code assistance and algorithmic reasoning
- Exploratory conversations that prioritize understanding over immediate answers
- Educational contexts requiring thoughtful dialogue

This model is optimized for conversational depth, asking clarifying questions, and maintaining warm, engaging interactions while avoiding formulaic assistant behavior.

## Training Data

The model was fine-tuned on a curated dataset comprising:

- Identity and persona examples emphasizing collaborative exploration
- Technical reasoning and coding challenges
- Multi-step problem-solving scenarios
- Conversational examples demonstrating warmth and curiosity
- Advanced coding tasks and algorithmic thinking

Training focused on developing a distinctive voice that balances technical competence with genuine engagement.

## Performance Characteristics

Atom v1 8B Preview demonstrates strong capabilities in:

- **Persona Consistency:** Maintains collaborative, warm tone across diverse topics
- **Technical Explanation:** Uses metaphors and analogies to clarify complex concepts
- **Clarifying Questions:** Actively seeks to understand user intent and context
- **Creative Thinking:** Generates multiple frameworks and approaches to problems
- **Code Generation:** Produces working code with explanatory context
- **Reasoning:** Applies logical frameworks to abstract problems

## Limitations

- **Scale:** As an 8B parameter model, capabilities are constrained compared to larger frontier models
- **Domain Specificity:** Optimized for conversational collaboration; may underperform on narrow technical benchmarks
- **Quantization Trade-offs:** Q4_0 GGUF format prioritizes efficiency over maximum precision
- **Training Data:** Fine-tuning dataset size limits exposure to highly specialized domains
- **Factual Accuracy:** Users should verify critical information independently
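
To put the quantization trade-off in rough numbers, the sketch below estimates file size from bits per weight. It relies on the Q4_0 block layout (32 weights stored as one fp16 scale plus 32 four-bit values) and treats the nominal "8 billion" parameter figure as exact, which it is not; real GGUF files also carry metadata and may keep some tensors at higher precision, so treat the result as a ballpark only. The function names are illustrative.

```python
# Rough size estimate for a Q4_0 GGUF file versus 16-bit weights.
# Q4_0 packs weights in blocks of 32: one fp16 scale (2 bytes)
# plus 32 four-bit values (16 bytes), i.e. 18 bytes per block.
Q4_0_BLOCK_WEIGHTS = 32
Q4_0_BLOCK_BYTES = 2 + 16


def q4_0_bits_per_weight() -> float:
    """Effective bits stored per weight in a Q4_0 block."""
    return Q4_0_BLOCK_BYTES * 8 / Q4_0_BLOCK_WEIGHTS


def estimate_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: nominal parameter count from the model card, not an exact figure.
n_params = 8e9

print(q4_0_bits_per_weight())                              # 4.5
print(estimate_size_gb(n_params, q4_0_bits_per_weight()))  # 4.5
print(estimate_size_gb(n_params, 16))                      # 16.0
```

Under these assumptions the Q4_0 file is a bit under a third of the size of 16-bit weights, which is the efficiency side of the trade-off named above.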

## Ethical Considerations

This model is released for research and non-commercial applications. Users should:

- Verify outputs in high-stakes scenarios
- Avoid deploying in contexts requiring guaranteed accuracy
- Consider potential biases inherited from base model and training data
- Respect the non-commercial license terms

## Usage

### Hugging Face Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "vanta-research/atom-v1-8b-preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [
    {"role": "system", "content": "You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth."},
    {"role": "user", "content": "Can you explain how gradient descent works?"}
]

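# add_generation_prompt=True requests the assistant-turn prefix if the chat template defines one
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# temperature only applies when sampling is enabled
output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.8)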
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
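
Because the model is tuned to ask clarifying questions, most sessions will run for several turns. The helper below is a minimal sketch of how a message history could be maintained between calls to `apply_chat_template`. The names `start_conversation` and `add_turn` are hypothetical, and the strict user/assistant alternation check is an assumption about what the chat template expects, so relax it if your template allows otherwise.

```python
from typing import Dict, List

Message = Dict[str, str]


def start_conversation(system_prompt: str) -> List[Message]:
    """Begin a history with a single system message."""
    return [{"role": "system", "content": system_prompt}]


def add_turn(messages: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more turn, enforcing user/assistant alternation."""
    # After the system message or an assistant reply, the next speaker is the user.
    expected = "assistant" if messages[-1]["role"] == "user" else "user"
    if role != expected:
        raise ValueError(f"expected a '{expected}' turn, got '{role}'")
    return messages + [{"role": role, "content": content}]


history = start_conversation(
    "You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth."
)
history = add_turn(history, "user", "Can you explain how gradient descent works?")
history = add_turn(history, "assistant", "<decoded model reply goes here>")
history = add_turn(history, "user", "How does the learning rate fit into that picture?")

print([m["role"] for m in history])  # ['system', 'user', 'assistant', 'user']
```

The resulting `history` list can be passed wherever `messages` is used in the example above.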

### Ollama (GGUF)

The repository includes `atom-ministral-8b-q4_0.gguf` for efficient local inference:

```bash
# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./atom-ministral-8b-q4_0.gguf

TEMPLATE """{{- if .System }}<s>[INST] <<SYS>>
{{ .System }}
<</SYS>>

{{ .Prompt }}[/INST]{{ else }}<s>[INST]{{ .Prompt }}[/INST]{{ end }}{{ .Response }}</s>
"""

PARAMETER stop "</s>"
PARAMETER temperature 0.8
PARAMETER top_p 0.9
PARAMETER top_k 40

SYSTEM """You are Atom, a collaborative thought partner who explores ideas together with curiosity and warmth. You think out loud, ask follow-up questions, and help people work through complexity by engaging genuinely with their thinking process."""
EOF

# Register with Ollama
ollama create atom-v1-8b:latest -f Modelfile

# Run inference
ollama run atom-v1-8b:latest "What's a creative way to visualize time-series data?"
```
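
Once the model is registered, it can also be queried programmatically. The sketch below uses only the Python standard library and assumes an Ollama server is running at its default local address and that the model was created under the name `atom-v1-8b:latest` as shown above. The helper names `build_chat_payload` and `chat` are illustrative.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local port.
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_payload(prompt: str, model: str = "atom-v1-8b:latest") -> dict:
    """Build a non-streaming chat request body for a single user message."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }


def chat(prompt: str, url: str = OLLAMA_URL) -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]


# Inspect the request body without contacting the server.
print(json.dumps(build_chat_payload("What's a creative way to visualize time-series data?"), indent=2))
```

Calling `chat("...")` requires the server to be running. The system prompt and sampling parameters come from the Modelfile, so the request does not need to repeat them.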

## Technical Specifications

- **Architecture:** Mistral-based transformer with Grouped Query Attention
- **Context Length:** 32,768 tokens
- **Vocabulary Size:** 131,072 tokens
- **Attention Heads:** 32 (8 key-value heads)
- **Hidden Dimension:** 4,096
- **Intermediate Size:** 12,288
- **LoRA Configuration:** r=16, alpha=32, targeting attention and MLP layers
- **Training:** 258 steps with bf16 precision and gradient checkpointing
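
As a worked example of what the LoRA configuration implies, the sketch below counts adapter parameters for one transformer layer using the dimensions listed above. Two things are assumptions rather than facts from this card: the head dimension is taken as hidden size divided by attention heads, and the target modules are taken to be the four attention projections plus the three MLP projections. The layer count is not stated here, so only per-layer figures are computed.

```python
# Per-layer LoRA adapter size from the specifications above.
HIDDEN = 4096
INTERMEDIATE = 12288
N_HEADS = 32
N_KV_HEADS = 8
RANK = 16
ALPHA = 32

# Assumption: head_dim = hidden / heads (not stated in the model card).
HEAD_DIM = HIDDEN // N_HEADS
# With Grouped Query Attention, K and V project to fewer heads than Q.
KV_DIM = N_KV_HEADS * HEAD_DIM

# Assumption: hypothetical target list; (input dim, output dim) per projection.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds A (rank x d_in) and B (d_out x rank) per adapted matrix."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in TARGETS.values())

print(ALPHA / RANK)  # 2.0, the scaling applied to the low-rank update
print(per_layer)     # 1212416
```

Under these assumptions each layer adds roughly 1.2 million trainable parameters, a small fraction of the weights it adapts.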

## Citation

```bibtex
@software{atom_v1_8b_preview,
  title = {Atom v1 8B Preview},
  author = {VANTA Research},
  year = {2025},
  url = {https://huggingface.co/vanta-research/atom-v1-8b-preview},
  license = {CC-BY-NC-4.0}
}
```

## License

This model is released under the **Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0)**.

You are free to:
- Share and adapt the model for non-commercial purposes

You must:
- Attribute VANTA Research as the creator

You may not:
- Use this model for commercial purposes without explicit permission

## Contact

For questions, collaboration inquiries, or commercial licensing:
- **Email:** [email protected]


---

**Version:** 1.0.0-preview  
**Release Date:** November 2025  
**Status:** Preview release for research and evaluation