# Finetuned TinyLlama Chat (4-bit, LoRA) - Healthcare Assistant

This model is a fine-tuned LoRA adapter based on the 4-bit quantized `TinyLlama/TinyLlama-1.1B-Chat-v1.0`, using the Unsloth framework and Hugging Face's PEFT library. It is designed as a lightweight healthcare assistant that helps users ask common health-related questions and receive informative, conversational responses.

⚠️ Note: This model is currently in the build stage and was trained on a small synthetic dataset. It is intended for research and experimentation only, not for clinical use.
## Model Details

- Model Name: `tinyllama-healthcare-4bit-lora` (see the usage sketch after this list)
- Base Model: `TinyLlama/TinyLlama-1.1B-Chat-v1.0`
- Fine-tuned with: LoRA (Low-Rank Adaptation) + 4-bit quantization via Unsloth
- Libraries: `peft`, `transformers`, `unsloth`
- Language: English
- License: Apache 2.0 (inherited from the base model)
- Instruction Format: Medical question answering (symptoms, conditions, advice)
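A minimal inference sketch is shown below. It assumes the adapter is published at `Dhivs/tinyllama-healthcare-4bit-lora` (the repository named in the citation) and that `bitsandbytes` is available for 4-bit loading; the prompt and generation settings are only illustrative.

```python
# Minimal inference sketch: load the 4-bit base model, attach the LoRA adapter,
# and generate a reply with the TinyLlama chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "Dhivs/tinyllama-healthcare-4bit-lora"

bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter

messages = [{"role": "user", "content": "What are common causes of a persistent dry cough?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=230)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```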
## Intended Use

### ✅ Intended Applications

- Patient education bots
- General health Q&A assistants
- Symptom explainer tools
- On-device medical chatbots (informational)

### ❌ Not Intended For

- Real-time clinical diagnosis or decision-making
- Emergency or high-risk medical settings
- Replacement for professional medical consultation
## ⚠️ Bias, Risks, and Limitations

- This model may hallucinate or produce oversimplified, incorrect advice
- It was not trained on real clinical datasets
- May reflect bias from public medical Q&A examples
- Should not be trusted for personalized or critical health advice
## Recommendations

- Use only in low-stakes, informational or prototyping contexts
- Always encourage users to consult healthcare professionals
- Evaluate thoroughly before deployment in any production use case
## Training Details

### Dataset

- Format: JSONL with `instruction`, `input`, and `output` fields (a record sketch follows this list)
- Size: 1000 examples
- Split: 90% train / 10% eval
- Max length: 230 tokens
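The training records themselves are not published with this card; the sketch below only illustrates the `instruction`/`input`/`output` shape described above and one plausible way to fold a record into the TinyLlama chat template. The example text and the merging strategy are assumptions, not the actual data.

```python
# Hypothetical JSONL record matching the instruction/input/output format above;
# the wording is illustrative and not taken from the real dataset.
import json

record = {
    "instruction": "Explain what the symptom below may indicate and when to see a doctor.",
    "input": "Mild headache lasting three days",
    "output": "A mild headache lasting a few days is often caused by tension or dehydration ...",
}
print(json.dumps(record))  # one line of the JSONL file


def to_chat_text(example, tokenizer):
    """Fold instruction + input into one user turn and output into the assistant turn."""
    user_msg = example["instruction"]
    if example.get("input"):
        user_msg += "\n\n" + example["input"]
    messages = [
        {"role": "user", "content": user_msg},
        {"role": "assistant", "content": example["output"]},
    ]
    # Render with the base tokenizer's chat template (assumed prompt strategy).
    return tokenizer.apply_chat_template(messages, tokenize=False)
```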
### LoRA Configuration

- Target modules: `q_proj`, `v_proj` (see the Unsloth sketch after this list)
- `r`: 8
- `alpha`: 16
- `dropout`: 0.05
- Gradient checkpointing: enabled
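The configuration above maps onto Unsloth's API roughly as follows. This is only a sketch; anything not listed above, such as reusing the 230-token dataset limit as `max_seq_length`, is an assumption.

```python
# Sketch of the adapter setup with the hyperparameters listed above.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    max_seq_length=230,   # assumption: reuse the dataset max length
    load_in_4bit=True,    # 4-bit quantized base weights
)

model = FastLanguageModel.get_peft_model(
    model,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    use_gradient_checkpointing=True,
)
```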
### Training Environment

- Hardware: 1× NVIDIA Tesla T4 (16 GB)
- Duration: ~2.9 minutes total
- Precision: bfloat16 (fallback to fp16)
- Framework: `unsloth.FastLanguageModel` + Hugging Face `Trainer` (see the training sketch below)
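A sketch of the training loop under these settings is shown below. It continues from the adapter sketch in the previous section, so `model`, `tokenizer`, and a tokenized 90/10 `train_dataset`/`eval_dataset` split are assumed to already exist; the hyperparameters flagged in the comments are assumptions not stated in this card.

```python
# Training-loop sketch: use bfloat16 where supported, otherwise fp16
# (a Tesla T4 lacks bfloat16, so this falls back to fp16 in practice).
import torch
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

bf16_ok = torch.cuda.is_bf16_supported()

args = TrainingArguments(
    output_dir="tinyllama-healthcare-4bit-lora",
    per_device_train_batch_size=2,      # assumption: not stated in this card
    gradient_accumulation_steps=4,      # assumption: not stated in this card
    num_train_epochs=1,                 # assumption: not stated in this card
    learning_rate=2e-4,                 # assumption: not stated in this card
    bf16=bf16_ok,
    fp16=not bf16_ok,
    logging_steps=10,
    eval_strategy="epoch",
)

trainer = Trainer(
    model=model,                        # from the Unsloth/LoRA sketch above
    args=args,
    train_dataset=train_dataset,        # tokenized 90% split
    eval_dataset=eval_dataset,          # tokenized 10% split
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```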
## Evaluation

- Final training loss: ~1.08
- Evaluation loss: low, but limited generalization is expected
- Benchmarks: not evaluated on public QA or health benchmarks
### Next Steps

- Incorporate multi-turn dialog examples
- Train with more realistic, diverse health datasets
- Evaluate with medically focused test suites
## Environmental Impact

- GPU Used: Tesla T4
- Training Time: < 3 minutes
- Estimated Carbon Emission: ~0.05 kg CO₂eq (via the ML CO2 calculator)
## Sources & Dependencies

- Base model: `TinyLlama/TinyLlama-1.1B-Chat-v1.0`
- Adapter: Unsloth
- PEFT library: Hugging Face PEFT
## Citation

    @misc{tinyllama_healthcare_2025,
      title={TinyLLaMA Healthcare Assistant via LoRA (Finetuned)},
      author={Dhivakar G},
      year={2025},
      howpublished={\url{https://huggingface.co/Dhivs/tinyllama-healthcare-4bit-lora}},
    }
## Model Contact

- Author: Dhivakar G
- HF Username: Dhivs
- Contact: [email protected]
## Framework Versions

- `peft`: 0.16.0
- `transformers`: 4.53.3
- `torch`: 2.7.1+cu126
- `unsloth`: 2025.7.8
- CUDA Toolkit: 12.6