shinnosukeono committed
Commit dfb0245 · verified · 1 Parent(s): 714662b

Update README.md

Files changed (1)
  1. README.md +5 -23
README.md CHANGED
@@ -113,31 +113,17 @@ Use the code below to get started with the model.
 
 <!-- This section describes the evaluation protocols and provides the results. -->
 
-### Testing Data, Factors & Metrics
+We evaluated our model, JPharmatron-7B, against other general and domain-specific models of a similar size.
 
-#### Testing Data
+### Testing Data
 
 <!-- This should link to a Dataset Card if possible. -->
 
-[More Information Needed]
-
-#### Factors
-
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
-[More Information Needed]
-
-#### Metrics
-
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
-[More Information Needed]
+[JPharmaBench](https://huggingface.co/collections/EQUES/jpharmabench-680a34acfe96870e41d050d8) and two existing benchmarks (JMMLU (pharma) and IgakuQA) were used.
 
 ### Results
 
-[More Information Needed]
-
-#### Summary
+Compared with Meditron3-Qwen2.5-7B and Llama3.1-Swallow-8B-Instruct-v0.3, JPharmatron-7B achieved the highest score on all five benchmarks.
 
 
 
@@ -170,8 +156,4 @@ See our preprint: [A Japanese Language Model and Three New Evaluation Benchmarks
 
 ## Model Card Authors [optional]
 
-@shinnosukeono
-
-## Model Card Contact
-
-[More Information Needed]
+[@shinnosukeono](https://shinnosukeono.github.io/)