🎧 Bigvox

Bigvox is a high-performance, low-latency speech-language multimodal model specialized for Korean speech recognition. It is built on naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B. 🚀


📂 Model Access

🌟 Key Features

  • 🇰🇷 Korean-specialized: optimized for Korean speech patterns and linguistic characteristics
  • ⚡ Lightweight: efficient inference with about 1B parameters
  • 🎯 High accuracy: strong performance across a variety of Korean speech environments
  • 🔧 Practical: suitable for real-time speech recognition applications

📋 Model Information

| Item | Details |
| --- | --- |
| Base model | naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B |
| Language | Korean |
| Model size | ~1B parameters |
| Task type | Speech-to-text (speech multimodal) |
| License | Apache 2.0 |

🔧 Repository Download and Environment Setup

To get started with Bigvox, clone the repository and set up the environment as follows. 🛠️

  1. Clone the repository:

    git clone https://github.com/bigdefence/bigvox-hyperclovax
    cd bigvox-hyperclovax
    
  2. Install dependencies:

    bash setting.sh
    

📥 Download Methods

Using the Hugging Face CLI:

pip install -U huggingface_hub
huggingface-cli download bigdefence/Bigvox-HyperCLOVAX-Audio --local-dir ./checkpoints

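Using snapshot_download (Python):

# Install the client first: pip install -U huggingface_hub
from huggingface_hub import snapshot_download

# Download the full model repository into ./checkpoints.
# Interrupted downloads resume automatically in recent huggingface_hub releases.
snapshot_download(
    repo_id="bigdefence/Bigvox-HyperCLOVAX-Audio",
    local_dir="./checkpoints",
)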

Using Git:

git lfs install
git clone https://huggingface.co/bigdefence/Bigvox-HyperCLOVAX-Audio
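
After any of the download methods above, a quick look at what landed on disk can catch an interrupted or partial download. The sketch below is only a sanity check built on the standard library; the helper name `summarize_checkpoint` is hypothetical and is not part of the Bigvox repository.

```python
from pathlib import Path


def summarize_checkpoint(directory: str) -> tuple[int, int]:
    """Return (file_count, total_bytes) for all regular files under directory."""
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(f"checkpoint directory not found: {root}")
    # Walk the tree recursively and keep only regular files.
    files = [p for p in root.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


def format_summary(directory: str) -> str:
    """Render the summary as a single human-readable line."""
    count, total = summarize_checkpoint(directory)
    return f"{directory}: {count} files, {total / 1024**2:.1f} MiB"
```

Calling `print(format_summary("./checkpoints"))` covers the CLI and `snapshot_download` methods. The Git method clones into `./Bigvox-HyperCLOVAX-Audio` instead, and the count for that directory also includes files under `.git`.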

🛠️ Dependency Models

🔄 Local Inference

To run inference with Bigvox, follow the steps below to set up the models and run them locally. 📡

  1. Prepare the models:

    • Download Bigvox from Hugging Face 📦
    • Download the Whisper-large-v3 speech encoder from Hugging Face and place it in the ./models/speech_encoder/ directory 🎤
  2. Run inference:

    • Speech-to-text (S2T) inference:
      python3 omni_speech/infer/bigvox.py --query_audio test_audio.wav
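
The two steps above can be wrapped in a small launcher that checks the documented prerequisites before invoking the script. This is a minimal sketch, not part of the repository: the helper names `build_command`, `preflight`, and `transcribe` are hypothetical, and it assumes it is run from the repository root. Only the script path, the `--query_audio` flag, and the `./models/speech_encoder/` directory come from the steps above.

```python
import subprocess
import sys
from pathlib import Path

SCRIPT = "omni_speech/infer/bigvox.py"          # script path from the command above
ENCODER_DIR = Path("./models/speech_encoder")   # encoder location from step 1


def build_command(audio_path: str) -> list[str]:
    """Build the documented S2T command for one audio file."""
    # sys.executable stands in for "python3" so the current interpreter is reused.
    return [sys.executable, SCRIPT, "--query_audio", audio_path]


def preflight(audio_path: str, encoder_dir: Path = ENCODER_DIR) -> list[str]:
    """Return a list of problems; an empty list means the run can start."""
    problems = []
    if not Path(audio_path).is_file():
        problems.append(f"audio file not found: {audio_path}")
    if not encoder_dir.is_dir() or not any(encoder_dir.iterdir()):
        problems.append(f"speech encoder directory missing or empty: {encoder_dir}")
    return problems


def transcribe(audio_path: str) -> int:
    """Run the inference script if the checks pass; return its exit code."""
    problems = preflight(audio_path)
    if problems:
        for problem in problems:
            print(problem, file=sys.stderr)
        return 1
    return subprocess.run(build_command(audio_path)).returncode
```

For example, `transcribe("test_audio.wav")` returns the script's exit code, or 1 with a message on stderr when a prerequisite is missing.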
      

🔧 Training Details

Dataset

  • VoiceAssistant: Korean conversational speech data

Training Configuration

  • Base Model: naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B
  • Hardware: 1x NVIDIA RTX 6000A GPU
  • Training Time: 3 hours

⚠️ Limitations

  • Performance may degrade in environments with heavy background noise
  • Recognition accuracy may drop for very fast or mumbled speech
  • Recognition accuracy for technical terms and proper nouns may vary by domain

📜 License

This model is distributed under the Apache 2.0 license. Commercial use is permitted; see the LICENSE file for details.

📞 Contact

  • Developed by: BigDefence

📈 Update Log

v1.0.0 (2024.12)

  • 🎉 Initial model release: Bigvox made public
  • 🇰🇷 Korean-specialized: a Korean speech-to-text speech multimodal model based on HyperCLOVAX-SEED-Text-Instruct-0.5B

🤝 Contributing

If you would like to contribute to the Bigvox project:

Join BigDefence in building the future of Korean AI speech recognition! 🚀🇰🇷

"Every voice matters, every word counts"
