junghwankim committed
Commit efeb387 · verified · 1 Parent(s): f4adf98

Update README.md

Files changed (1):
  1. README.md +9 -90

README.md CHANGED
@@ -8,41 +8,25 @@ pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  ---
 
- # SentenceTransformer based on meta-llama/Llama-3.2-1B
+ # Multilingual Style Representation based on meta-llama/Llama-3.2-1B
 
- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B). It maps sentences & paragraphs to a 2048-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
-
- ## Model Details
-
- ### Model Description
- - **Model Type:** Sentence Transformer
+ This is the Style Representation model, presented in "Leveraging Multilingual Training for Authorship Representation: Enhancing Generalization across Languages and Domains".
+
+ The Style Representation model encodes documents written by the same author as nearby vectors in the embedding space.
+ The model can be used for authorship attribution, style similarity, machine-generated text detection, and more.
+
+ For training and evaluation code, refer to our repository [here](https://github.com/junghwanjkim/multilingual_aa).
+
+ ## Model Details
+ - **Model Type:** [Sentence Transformer](https://www.SBERT.net)
  - **Base model:** [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) <!-- at revision 4e20de362430cd3b72f300e6b0f18e50e7166e08 -->
  - **Maximum Sequence Length:** 131072 tokens
  - **Output Dimensionality:** 2048 dimensions
  - **Similarity Function:** Cosine Similarity
- <!-- - **Training Dataset:** Unknown -->
- <!-- - **Language:** Unknown -->
- <!-- - **License:** Unknown -->
-
- ### Model Sources
-
- - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
-
- ### Full Model Architecture
-
- ```
- SentenceTransformer(
-   (0): Transformer({'max_seq_length': 131072, 'do_lower_case': False}) with Transformer model: BidirectionalLlamaModel
-   (1): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
- )
- ```
 
  ## Usage
 
- ### Direct Usage (Sentence Transformers)
-
  First install the Sentence Transformers library:
 
  ```bash
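
The usage snippet itself falls outside the hunks shown here: new lines 33-55 are unchanged and elided, with only the opening ```bash fence above and the closing `print(similarities.shape)` in the next hunk header visible as context. For orientation, a minimal sketch of the standard sentence-transformers usage pattern such cards follow; the model id below is a placeholder, not this repository's actual id, and this is not the file's verbatim content:

```python
# Hedged sketch of the usage block elided between the two hunks; this follows
# the generic sentence-transformers pattern, not the README's exact code.
from sentence_transformers import SentenceTransformer

# Placeholder id: substitute the actual repository id of this model.
model = SentenceTransformer("junghwankim/<this-model>")

documents = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(documents)
print(embeddings.shape)  # (3, 2048), matching the Output Dimensionality above

# Pairwise cosine similarities (the card's stated Similarity Function)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3)
```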
@@ -72,70 +56,5 @@ print(similarities.shape)
  ```
 
  <!--
- ### Direct Usage (Transformers)
-
- <details><summary>Click to see the direct usage in Transformers</summary>
-
- </details>
- -->
-
- <!--
- ### Downstream Usage (Sentence Transformers)
-
- You can finetune this model on your own dataset.
-
- <details><summary>Click to expand</summary>
-
- </details>
- -->
-
- <!--
- ### Out-of-Scope Use
-
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
- -->
-
- <!--
- ## Bias, Risks and Limitations
-
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
- -->
-
- <!--
- ### Recommendations
-
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
- -->
-
- ## Training Details
-
- ### Framework Versions
- - Python: 3.12.11
- - Sentence Transformers: 4.1.0
- - Transformers: 4.55.2
- - PyTorch: 2.7.1+cu126
- - Accelerate: 1.7.0
- - Datasets: 3.6.0
- - Tokenizers: 0.21.4
-
  ## Citation
-
- ### BibTeX
-
- <!--
- ## Glossary
-
- *Clearly define terms in order to be accessible across audiences.*
  -->
-
- <!--
- ## Model Card Authors
-
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
- -->
-
- <!--
- ## Model Card Contact
-
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->
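
Since the updated card positions this model as a style encoder for authorship tasks, here is a short illustration of the "authorship attribution / style similarity" use the added description mentions: comparing two documents by the cosine similarity of their style embeddings. The helper function name and the 0.8 threshold are assumptions for this sketch, not values from the card.

```python
# Illustrative authorship-verification helper; the function name and the
# decision threshold are assumptions for this sketch, not part of the model card.
import numpy as np
from sentence_transformers import SentenceTransformer

def style_similarity(model: SentenceTransformer, doc_a: str, doc_b: str) -> float:
    """Cosine similarity between the style embeddings of two documents."""
    emb = model.encode([doc_a, doc_b], normalize_embeddings=True)
    return float(np.dot(emb[0], emb[1]))

# Usage sketch (placeholder model id):
# model = SentenceTransformer("junghwankim/<this-model>")
# if style_similarity(model, text_1, text_2) > 0.8:  # threshold is illustrative
#     print("Likely the same author")
```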
 
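
The "Full Model Architecture" block removed by this commit showed a mean-pooling head (`pooling_mode_mean_tokens: True`) over a bidirectional Llama encoder. A minimal sketch of what that pooling step computes, assuming standard (batch, seq_len, hidden) token embeddings and an attention mask; this is an illustration, not the model's actual implementation:

```python
# Minimal sketch of the mean pooling described by the removed architecture
# block: average token embeddings over the sequence, ignoring padding tokens.
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 2048); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)                   # (batch, 2048)
    counts = mask.sum(dim=1).clamp(min=1e-9)                        # (batch, 1)
    return summed / counts                                          # (batch, 2048)
```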