Dia TTS - Dhivehi Fine-tuned Model

This is a fine-tuned version of nari-labs/Dia-1.6B specifically trained for Dhivehi (Maldivian) text-to-speech synthesis.

Model Description

  • Base Model: Dia-1.6B
  • Language: Mixed, Dhivehi (dv)
  • Task: Text-to-Speech (TTS)
  • Fine-tuning: Specialized for Dhivehi audio synthesis

Usage

# Install Dia library first:
# pip install git+https://github.com/nari-labs/dia.git
# pip install soundfile

print("🎤 Testing Dhivehi Dia TTS model...")

try:
    # Import inside the try block so the ImportError handler below
    # can actually report missing packages
    from dia.model import Dia
    import soundfile as sf

    # Load your fine-tuned model
    print("📥 Loading model from HuggingFace...")
    model = Dia.from_pretrained("alakxender/Dia-1.6B-dhivehi-18k")
    print("✓ Model loaded successfully!")
    
    # Test texts, keyed by sample name (the name becomes the output filename)
    test_samples = {
        # Basic samples
        "basic_english": "Hello, this is a test.",
        "basic_dhivehi": "އައްސަލާމް ޢަލައިކުމް، މިއީ ވަކި ޓެސްޓެކެވެ.",
        
        # Mixed language tests
        "mixed_greeting": "Hello އައްސަލާމް ޢަލައިކުމް، how are you? ހާލު ކިހިނެއް؟",

        # Emotional expressions and sounds
        "with_laughter": "That was so funny! (laughs) ވަރަށް މަޖާ އެނގޭ! (laughs) I can't stop laughing!",
        
        # Complex emotional scenarios
        "happy_announcement": "(laughs) Guess what? ބަލާ! I got the job! އަހަރެން ވަޒީފާ ލިބުނު! (claps) (claps) (laughs)",
        "achievement": "After years of hard work... (claps) finally! އެންމެ ފަހުން! I graduated! އަހަރެން ފުރިހަމަ ކުރީ! (claps) (claps) (laughs)"
    }
    
    print("\n🗣️  Generating speech samples...")
    generated_files = []
    
    for name, text in test_samples.items():
        try:
            print(f"🎤 Generating: {name}")
            print(f"   Text: {text[:60]}{'...' if len(text) > 60 else ''}")
            
            output = model.generate(text)
            filename = f"{name}.wav"
            # 44100 Hz is the sample rate used in the upstream Dia examples
            sf.write(filename, output, 44100)
            generated_files.append((filename, len(output)))
            print(f"   ✓ Saved: {filename} ({len(output)/44100:.2f}s)")
            
        except Exception as e:
            print(f"   ❌ Failed to generate {name}: {e}")
    
    print("\n🎉 TTS generation completed!")
    print(f"📁 Generated {len(generated_files)} audio files:")
    
    total_duration = 0
    for filename, samples in generated_files:
        duration = samples / 44100
        total_duration += duration
        print(f"   - {filename:<25} ({duration:.2f}s)")
    
    print(f"\n📊 Total audio generated: {total_duration:.2f} seconds")
    
except ImportError as e:
    print("❌ Missing dependencies. Please install:")
    print("   pip install git+https://github.com/nari-labs/dia.git")
    print("   pip install soundfile")
    print(f"   Error: {e}")
    
except Exception as e:
    print(f"❌ Error during TTS generation: {e}")
    print("💡 Make sure the model was uploaded correctly and is accessible")
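
The test prompts above embed parenthesised cues such as (laughs) and (claps). Before generating audio, it can help to check which cues a prompt contains, or to produce a cue-free version for comparison. The following is a minimal sketch in plain Python; `count_cues` and `strip_cues` are hypothetical helper names and are not part of the Dia library, and the pattern assumes cues are lowercase words in parentheses, as in the samples above.

```python
import re
from collections import Counter

# Assumption: cues are lowercase words in parentheses, e.g. "(laughs)".
CUE_PATTERN = re.compile(r"\(([a-z][a-z ]*)\)")


def count_cues(text: str) -> Counter:
    """Count the parenthesised cues found in a prompt (hypothetical helper)."""
    return Counter(CUE_PATTERN.findall(text))


def strip_cues(text: str) -> str:
    """Return the prompt with cues removed and whitespace collapsed."""
    return " ".join(CUE_PATTERN.sub(" ", text).split())


sample = "That was so funny! (laughs) I got the job! (claps) (claps) (laughs)"
print(count_cues(sample))
print(strip_cues(sample))
```

Generating both the original and the stripped prompt is one way to hear how much the cues change the output.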

Training Details

  • Base Model: nari-labs/Dia-1.6B
  • Training Data: Dhivehi audio dataset
  • Fine-tuning Approach: Direct training on Dhivehi audio without language tags
  • Checkpoint: Step 18,000

Model Performance

This model has been specifically fine-tuned for Dhivehi speech synthesis, providing natural-sounding speech generation for Dhivehi text input.

Note: Training for this checkpoint was stopped at step 18,000. The full run is available at alakxender/Dia-1.6B-dhivehi-ep1.

Limitations

  • Optimized specifically for the Dhivehi language
  • May not perform well on other languages
  • Performance depends on input text quality and pronunciation patterns
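
Because results depend on input text quality, one option is to clean the text and check how much of it is Dhivehi before calling the model. The sketch below is an illustration under stated assumptions, not part of the Dia library: `normalize_text` and `thaana_ratio` are hypothetical helpers, NFC normalization and whitespace collapsing are assumed to be safe for your data, and Dhivehi text is detected by the Unicode Thaana block (U+0780 to U+07BF).

```python
import unicodedata

# Unicode Thaana block, the script used to write Dhivehi.
THAANA_START, THAANA_END = 0x0780, 0x07BF


def is_thaana(ch: str) -> bool:
    """True if the character falls in the Thaana block."""
    return THAANA_START <= ord(ch) <= THAANA_END


def normalize_text(text: str) -> str:
    """Apply NFC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def thaana_ratio(text: str) -> float:
    """Fraction of letters and Thaana signs that are Thaana (0.0 if none)."""
    # Thaana vowel signs are combining marks, so count them explicitly.
    relevant = [ch for ch in text if ch.isalpha() or is_thaana(ch)]
    if not relevant:
        return 0.0
    return sum(is_thaana(ch) for ch in relevant) / len(relevant)


prompt = normalize_text("  Hello   \u0789\u07a8\u0787\u07a9  ")
print(prompt, thaana_ratio(prompt))
```

A low ratio suggests the prompt is mostly non-Dhivehi text, which the limitations above flag as a case where the model may not perform well.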

License

This model is released under the Apache 2.0 License, following the original Dia model licensing.
