Image-Text-to-Text
PEFT
Safetensors
English
llm
music
multimodal
midi
phi-3
question-answering
optical-music-recognition
custom_code
Instructions to use puar-playground/Phi-3-MusiX with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use puar-playground/Phi-3-MusiX with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-vision-128k-instruct") model = PeftModel.from_pretrained(base_model, "puar-playground/Phi-3-MusiX") - Notebooks
- Google Colab
- Kaggle
Update README.md
Browse files
README.md
CHANGED
|
@@ -97,10 +97,13 @@ Each entry in the dataset includes:
|
|
| 97 |
## 🎓 Reference
|
| 98 |
If you use this dataset in your work, please cite it using the following reference:
|
| 99 |
```
|
| 100 |
-
@
|
| 101 |
-
|
| 102 |
-
|
| 103 |
-
|
| 104 |
-
|
|
|
|
|
|
|
|
|
|
| 105 |
}
|
| 106 |
```
|
|
|
|
| 97 |
## 🎓 Reference
|
| 98 |
If you use this dataset in your work, please cite it using the following reference:
|
| 99 |
```
|
| 100 |
+
@misc{chen2025musixqaadvancingvisualmusic,
|
| 101 |
+
title={MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models},
|
| 102 |
+
author={Jian Chen and Wenye Ma and Penghang Liu and Wei Wang and Tengwei Song and Ming Li and Chenguang Wang and Jiayu Qin and Ruiyi Zhang and Changyou Chen},
|
| 103 |
+
year={2025},
|
| 104 |
+
eprint={2506.23009},
|
| 105 |
+
archivePrefix={arXiv},
|
| 106 |
+
primaryClass={cs.CV},
|
| 107 |
+
url={https://arxiv.org/abs/2506.23009},
|
| 108 |
}
|
| 109 |
```
|