gajesh committed · verified
Commit 921a87e · Parent: 25833d8

Re-upload model with updated configuration

Files changed (1): README.md (+92, −0)

---
library_name: transformers
tags:
- causal-lm
- llama
- fine-tuned
- text-generation
---

# Fine-Tuned LLaMA 3.2 1B Model

This model is a fine-tuned version of [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct), trained on custom data to generate coherent and contextually relevant responses to input prompts.

## Model Description

- **Model Type**: LLaMA (Large Language Model Meta AI)
- **Architecture**: Causal Language Model (`LlamaForCausalLM`)
- **Base Model**: `meta-llama/Llama-3.2-1B-Instruct`
- **Fine-Tuning**: Fine-tuned on domain-specific data to improve performance on targeted tasks.
- **Intended Use**: Text generation, question answering, and code analysis and explanation.

## Training Data

The model was fine-tuned on a dataset of domain-specific examples designed to improve its understanding and generation within those contexts. The training data included:

- **Code Samples**: Examples in various programming languages, for code analysis and explanation (formatted for training roughly as sketched below).
- **Technical Documentation**: To improve technical writing and explanation capabilities.
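
The exact dataset schema is not published with this card. As an illustration only, a single instruction-style example could be rendered into training text with the tokenizer's chat template; the `prompt` and `response` field names below are hypothetical, and the repo id is the placeholder used in the Usage section:

```python
from transformers import AutoTokenizer

# The fine-tuned tokenizer inherits the Llama 3.2 chat template from the base model.
tokenizer = AutoTokenizer.from_pretrained("username/your-fine-tuned-llama")

# Hypothetical raw example -- the actual dataset schema is not published.
example = {
    "prompt": "Explain what this Python function does: def f(n): return n * f(n - 1) if n else 1",
    "response": "It recursively computes the factorial of n, returning 1 when n reaches 0.",
}

messages = [
    {"role": "user", "content": example["prompt"]},
    {"role": "assistant", "content": example["response"]},
]

# Render the conversation into the single string the model is trained on.
text = tokenizer.apply_chat_template(messages, tokenize=False)
print(text)
```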

## Training Details

- **Fine-Tuning Epochs**: 5
- **Batch Size**: 1 per device (with gradient accumulation)
- **Learning Rate**: 1e-5
- **Hardware**: NVIDIA A10G GPU on an AWS `g5.16xlarge` instance
- **Optimizer**: AdamW with weight decay
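
The training script itself is not included; a minimal sketch of how these hyperparameters could be expressed as `transformers` `TrainingArguments` follows. The `gradient_accumulation_steps` and `weight_decay` values are assumptions, since the card states only that gradient accumulation and weight decay were used:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3.2-1b-finetuned",  # hypothetical output path
    num_train_epochs=5,                   # Fine-Tuning Epochs: 5
    per_device_train_batch_size=1,        # Batch Size: 1 per device
    gradient_accumulation_steps=8,        # assumed value
    learning_rate=1e-5,                   # Learning Rate: 1e-5
    optim="adamw_torch",                  # AdamW optimizer
    weight_decay=0.01,                    # assumed value
)
```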

### Model Configuration

- **Hidden Size**: 2048
- **Number of Layers**: 16
- **Number of Attention Heads**: 32
- **Intermediate Size**: 8192
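
These numbers can be confirmed from the model configuration alone, without downloading the weights (again using the placeholder repo id from the Usage section):

```python
from transformers import AutoConfig

# Loading only the config is enough to verify the architecture details above.
config = AutoConfig.from_pretrained("username/your-fine-tuned-llama")
print(config.hidden_size)          # 2048
print(config.num_hidden_layers)    # 16
print(config.num_attention_heads)  # 32
print(config.intermediate_size)    # 8192
```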

## Usage

You can download the model and run it locally with the `transformers` library, or call it through the Hugging Face Inference API.

### Using with `transformers`

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("username/your-fine-tuned-llama")
model = AutoModelForCausalLM.from_pretrained("username/your-fine-tuned-llama")

# Generate text; max_new_tokens bounds only the generated tokens, and
# sampling with a moderate temperature keeps outputs focused but varied
prompt = "What does EigenLayer do exactly?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
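
On a GPU, you may want to load the model in half precision and let `accelerate` place it automatically, e.g. `AutoModelForCausalLM.from_pretrained("username/your-fine-tuned-llama", torch_dtype=torch.bfloat16, device_map="auto")` (this requires the `accelerate` package and `import torch`), and move `inputs` to `model.device` before calling `generate`.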

### Using with the Hugging Face Inference API

You can also query the model through the Hugging Face Inference API endpoint:

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/username/your-fine-tuned-llama"
headers = {"Authorization": "Bearer YOUR_HUGGING_FACE_API_TOKEN"}

def query(prompt):
    # POST the prompt to the hosted endpoint and return the parsed JSON response
    response = requests.post(API_URL, headers=headers, json={"inputs": prompt})
    return response.json()

print(query("Explain how EigenLayer functions."))
```
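
While the model is loading, the hosted endpoint may return a 503 error; passing `json={"inputs": prompt, "options": {"wait_for_model": True}}` asks the API to block until the model is ready.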

## Limitations

- The model may generate incorrect or biased information; verify its outputs before relying on them in critical applications.
- Fine-tuning may have introduced domain-specific biases into its generations.

## Ethical Considerations

Use the outputs of this model responsibly. It may generate unintended or harmful content, so apply caution in sensitive applications.

## Acknowledgements

This model was fine-tuned from [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct). Thanks to the open-source community and the contributors to the `transformers` library.