# Model Card for nexa-OLMo-sci7b
## Model Details

### Model Description
nexa-OLMo-sci7b is a fine-tuned variant of allenai/OLMo-7B, optimized for scientific research generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning was performed using PEFT with LoRA in 4-bit quantized mode via bitsandbytes.
- Developed by: Allan (Independent Scientific Intelligence Architect)
- Shared by: Allan (https://huggingface.co/allan-wandia)
- Model type: Decoder-only transformer (causal language model)
- Language(s): English (scientific domain-specific vocabulary)
- License: Apache 2.0
- Fine-tuned from: allenai/OLMo-7B
- Repository: https://huggingface.co/allan-wandia/nexa-olmo-sci7b
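
Loading the model for inference might look like the sketch below. The 4-bit settings mirror the quantization described above; the prompt and generation parameters are illustrative, and whether the repository hosts directly loadable weights or only a LoRA adapter (which would instead need to be attached to allenai/OLMo-7B with peft) should be verified.

```python
# Minimal inference sketch. Assumes the repository hosts weights loadable
# directly with transformers; if it only contains a LoRA adapter, load
# allenai/OLMo-7B first and attach the adapter with peft.PeftModel instead.
# Depending on the transformers version, OLMo checkpoints may also require
# trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "allan-wandia/nexa-olmo-sci7b"

# 4-bit quantization, matching the setup described in this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Illustrative scientific-writing prompt.
prompt = "Write an abstract for a study on exoplanet atmosphere spectroscopy:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```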
## Training Details

### Training Data
- Size: 100 million tokens
- Source: Curated scientific literature (Bio, Physics, QST, Astro)
### Hyperparameters
- Sequence length: 1024
- Batch size: 1
- Gradient Accumulation Steps: 64
- Effective Batch Size: 64
- Learning rate: 2e-05
- Epochs: 2
- LoRA: Enabled (PEFT)
- Quantization: 4-bit
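
As a rough guide, these hyperparameters could be expressed with peft and transformers as in the sketch below; the LoRA rank, alpha, dropout, output directory, and precision flag are placeholder assumptions not stated in this card.

```python
# Training configuration sketch. LoRA rank/alpha/dropout and the bf16 flag
# are assumptions; batch size, gradient accumulation, learning rate, and
# epoch count come from the hyperparameter list above. The 1024-token
# sequence length would be applied at tokenization time.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit quantized base model, as used during fine-tuning.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter configuration (placeholder values).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Effective batch size 64 = per-device batch 1 x gradient accumulation 64.
training_args = TrainingArguments(
    output_dir="nexa-olmo-sci7b",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,
    learning_rate=2e-5,
    num_train_epochs=2,
    bf16=True,
)
```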
### Results

The model performs reliably on scientific prose tasks; the novelty of its outputs varies with the diversity of the prompts it is given.