# Model Card for nexa-OLMo-sci7b
## Model Details

### Model Description
nexa-OLMo-sci7b is a fine-tuned variant of allenai/OLMo-7B, optimized for scientific research generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning was performed using PEFT with LoRA in 4-bit quantized mode via bitsandbytes.
- Developed by: Allan (Independent Scientific Intelligence Architect)
- Shared by: Allan (https://huggingface.co/allan-wandia)
- Model type: Decoder-only transformer (causal language model)
- Language(s): English (scientific domain-specific vocabulary)
- License: Apache 2.0
- Fine-tuned from: allenai/OLMo-7B
- Repository: https://huggingface.co/allan-wandia/nexa-olmo-sci7b
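
Loading the model for inference might look like the sketch below. The 4-bit settings mirror the quantization described above; the prompt and generation parameters are illustrative, and whether the repository hosts directly loadable weights or only a LoRA adapter (which would instead need to be attached to allenai/OLMo-7B with peft) should be verified.

```python
# Minimal inference sketch. Assumes the repository hosts weights loadable
# directly with transformers; if it only contains a LoRA adapter, load
# allenai/OLMo-7B first and attach the adapter with peft.PeftModel instead.
# Depending on the transformers version, OLMo checkpoints may also require
# trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "allan-wandia/nexa-olmo-sci7b"

# 4-bit quantization, matching the setup described in this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Illustrative scientific-writing prompt.
prompt = "Write an abstract for a study on exoplanet atmosphere spectroscopy:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```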
## Training Details

### Training Data
- Size: 100 million tokens
- Source: Curated scientific literature (Bio, Physics, QST, Astro)
### Hyperparameters
- Sequence length: 1024
- Batch size: 1
- Gradient Accumulation Steps: 64
- Effective Batch Size: 64
- Learning rate: 2e-05
- Epochs: 2
- LoRA: Enabled (PEFT)
- Quantization: 4-bit
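
As a rough guide, these hyperparameters could be expressed with peft and transformers as in the sketch below; the LoRA rank, alpha, dropout, output directory, and precision flag are placeholder assumptions not stated in this card.

```python
# Training configuration sketch. LoRA rank/alpha/dropout and the bf16 flag
# are assumptions; batch size, gradient accumulation, learning rate, and
# epoch count come from the hyperparameter list above. The 1024-token
# sequence length would be applied at tokenization time.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit quantized base model, as used during fine-tuning.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter configuration (placeholder values).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Effective batch size 64 = per-device batch 1 x gradient accumulation 64.
training_args = TrainingArguments(
    output_dir="nexa-olmo-sci7b",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,
    learning_rate=2e-5,
    num_train_epochs=2,
    bf16=True,
)
```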
### Results

The model performs reliably on scientific prose tasks; the novelty of its outputs varies with the diversity of the prompts it is given.