File size: 5,307 Bytes

---
license: apache-2.0
tags:
- finance
- fine-tuning
- conversational-ai
- named-entity-recognition
- sentiment-analysis
- topic-classification
- rag
- multilingual
- lightweight-llm
- phi-architecture
datasets:
- Josephgflowers/Finance-Instruct-500k
- Josephgflowers/Phinance
base_model:
- Josephgflowers/Phinance-Phi-3.5-mini-instruct-finance-v0.2
---

# Phinance-Phi-3.5-mini-instruct-finance-v0.3


![image/png](https://cdn-uploads.huggingface.co/production/uploads/6328952f798f8d122ce62a44/JXmrUfVIgvuzF8hBZwgRI.png)

## Overview

**Phinance-Phi-3.5-mini-instruct-finance-v0.3** is a fine-tuned mini language model built specifically for financial tasks, reasoning, and multi-turn conversations. This version improves upon v0.2 by leveraging additional curated datasets and incorporating enhancements to better align with real-world Retrieval-Augmented Generation (RAG) workflows. It offers superior instruction-following capabilities and financial expertise while maintaining a lightweight architecture.

Key Updates in v0.3:
- **Updated RAG Formatting**: Retrieved context is now included at the start of the `user` field, aligning with widely used practices in RAG workflows.
- **Expanded Dataset**: Trained on the updated **Finance-Instruct-500k** dataset, incorporating broader multilingual and financial tagging examples.
- **Improved Instruction Tuning**: Enhanced handling of multi-turn conversations and context retention for financial reasoning tasks.
- **Structured Output in JSON Format**: Most NER and parsing tasks prompt the model to return structured JSON output, enabling seamless extraction of structured data from unstructured input.

---

## Key Features

- **Finance-Focused Reasoning**: Handles tasks like portfolio analysis, market trends, and financial question answering.
- **Instruction Following**: Tailored for fine-grained instruction-based tasks within the financial domain.
- **Multi-Turn Conversations**: Optimized for context-aware dialogue, supporting long interactions on financial topics.
- **RAG-Compatible**: Prepares retrieved context at the beginning of the `user` field, improving integration with RAG systems.
- **Lightweight Architecture**: Efficient performance on resource-constrained systems while maintaining robust output quality.
- **JSON Structured Output**: Excels in returning structured JSON data for parsing and NER tasks.

---

## Training Data

The model was fine-tuned on the **Finance-Instruct-500k** dataset, a diverse and meticulously curated financial corpus. The dataset features multi-turn conversations and instruction-tuning examples formatted for modern RAG workflows.

### Dataset Highlights
- **Topics**: Market trends, investment strategies, financial analysis, and more.
- **Format**: Conversations structured as `system`, `user`, `assistant`, with retrieved context prepended to the `user` field for RAG use cases.
- **Filtering**: High-quality financial content curated through advanced methods.
- **NER and Parsing Tasks**: Prompts often structured to encourage JSON-formatted outputs, aiding structured data extraction.

---

## Supported Tasks

1. **Financial Question Answering**: Address complex queries about markets, terminology, and strategies.
2. **Multi-Turn Conversations**: Engage in coherent, context-rich dialogues.
3. **Instruction Following**: Execute finance-specific prompts with precision.
4. **RAG Applications**: Seamlessly integrate external data for enhanced responses.
5. **NER and Parsing**: Extract structured JSON data from unstructured financial inputs.
6. **Lightweight Financial Assistant**: Serve as an efficient domain expert for finance-related tasks.

---

## Usage

This model is ideal for:
- Financial advisory tools and assistants
- Chatbots for customer interactions
- Financial QA systems
- Lightweight, domain-specific applications

---

## Example Code

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Josephgflowers/Phinance-Phi-3.5-mini-instruct-finance-v0.3"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage
inputs = tokenizer("System: You are a financial assistant.\nUser: What is the difference between stocks and bonds?", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## Limitations

- **Niche Knowledge**: Best suited for financial topics; may underperform on general-purpose tasks.
- **Bias**: Data filtering could introduce biases toward specific financial sectors.
- **Validation Needed**: Outputs should be verified for critical use cases.

---

## Model Details

- **Base Model**: phi-3.5-mini
- **Fine-Tuned Dataset**: Finance-Instruct-500k
- **Version**: v0.3
- **Parameters**: Mini-sized architecture for efficient performance
- **Training Framework**: Hugging Face Transformers

---

## License

This model is released under the Apache 2.0 license.

---

## Citation

If you use this model, please cite:

```bibtex
@model{josephgflowers2025phinance,
  title={Phinance-Phi-3.5-mini-instruct-finance-v0.3},
  author={Joseph G. Flowers},
  year={2025},
  url={https://huggingface.co/Josephgflowers/Phinance-Phi-3.5-mini-instruct-finance-v0.3}
}
```