Intelligent-Internet
/

II-Medical-8B-1706

Text Generation

text-generation-inference

Model card Files Files and versions

hoanganhpham commited on Jun 17

Commit

db075b4

·

verified ·

1 Parent(s): 071e9c8

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -40,7 +40,7 @@ For RL stage we setup training with:
 ## III. Evaluation Results
-Our II-Medical-8B-1706 model achieved a 46.8% score on [HealthBench](https://openai.com/index/healthbench/), a comprehensive open-source benchmark evaluating the performance and safety of large language models in healthcare. This performance is comparable to OpenAI's o1 reasoning model and GPT-4.5, OpenAI's largest and most advanced model to date. We provide a comparison to models available in ChatGPT below.
 <!-- ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/61f2636488b9b5abbe184a8e/5r2O4MtzffVYfuUZJe5FO.jpeg) -->
 Detailed result for HealthBench can be found [here](https://huggingface.co/datasets/Intelligent-Internet/OpenAI-HealthBench-II-Medical-8B-1706-GPT-4.1).

 ## III. Evaluation Results
+Our II-Medical-8B-1706 model achieved a 46.8% score on [HealthBench](https://openai.com/index/healthbench/), a comprehensive open-source benchmark evaluating the performance and safety of large language models in healthcare. This performance is comparable to MedGemma-27B from Google. We provide a comparison to models available in ChatGPT below.
 <!-- ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/61f2636488b9b5abbe184a8e/5r2O4MtzffVYfuUZJe5FO.jpeg) -->
 Detailed result for HealthBench can be found [here](https://huggingface.co/datasets/Intelligent-Internet/OpenAI-HealthBench-II-Medical-8B-1706-GPT-4.1).