
# Development

## Design Decisions

We opt for a single-Space leaderboard for simplicity. The Gradio UI stays interactive while models are being evaluated because evaluation runs in separate processes (via multiprocessing) rather than in a separate Space. Leaderboard entries are persisted in a Hugging Face Dataset to avoid paying for persistent storage, and evaluation tasks are deliberately ephemeral.
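
Concretely, the multiprocessing approach amounts to spawning a worker process per submission so the Gradio event loop never waits on an evaluation. A minimal sketch of that pattern (the function names are illustrative, not the actual code in `app/tasks.py`):

```python
# Sketch only: each submission is evaluated in its own process so the
# Gradio event loop never blocks on inference or metric computation.
import multiprocessing as mp

def evaluate_model(model_id, results):
    # Long-running work (inference + metrics) happens off the main process.
    score = 0.0  # placeholder for the real evaluation
    results.put((model_id, score))

def submit_model(model_id, results):
    worker = mp.Process(target=evaluate_model, args=(model_id, results), daemon=True)
    worker.start()  # returns immediately, so the UI stays responsive
    return f"Evaluation of {model_id} started in the background."

if __name__ == "__main__":
    results = mp.Queue()
    print(submit_model("facebook/wav2vec2-lv-60-espeak-cv-ft", results))
    print(results.get())  # blocks until the worker reports a result
```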

## Local Setup

### Prerequisites

- Python 3.10
- git with git-lfs (https://git-lfs.com)

### Quick Installation

1. Make sure git-lfs is installed (https://git-lfs.com):

   ```bash
   git lfs install
   ```

2. Clone this repository:

   ```bash
   git clone https://huggingface.co/spaces/KoelLabs/IPA-Transcription-EN
   ```

3. Set up your environment:

   ```bash
   # Create a virtual environment with Python 3.10
   python3.10 -m venv venv

   # Activate the virtual environment (use `deactivate` to exit it)
   . ./venv/bin/activate

   # Install the required dependencies
   pip install -r requirements_lock.txt

   # Log in with an HF_TOKEN that has access to your backing dataset (see app/hf.py)
   # and to any models you want to be able to run
   # (a quick sanity check for this login follows the steps below)
   huggingface-cli login
   ```

4. Launch the leaderboard:

   ```bash
   . ./scripts/run-dev.sh      # development mode (auto-reloads)
   . ./scripts/run-prod.sh     # production mode (no auto-reloads)
   ```

5. Visit http://localhost:7860 in your browser and see the magic! ✨
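
If the leaderboard later fails to read or write its backing dataset, a quick way to confirm the login from step 3 worked is to query your identity with `huggingface_hub` (this is just a sanity check, not part of the setup scripts):

```python
# Optional sanity check (run inside the activated venv) that the token
# from `huggingface-cli login` was picked up correctly.
from huggingface_hub import whoami

info = whoami()  # raises if no valid token is configured
print("Logged in as:", info["name"])
```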

## Adding New Datasets

The datasets are pre-processed into a single combined test set stored in `app/data/test` with three columns: `audio` (16 kHz), `ipa`, and `dataset` (the original source). This is done by `scripts/sample_test_set.py`; to add new datasets, add them to this script. Beware that existing leaderboard entries will then need to be recalculated. You can do this locally by accessing the dataset corresponding to the `LEADERBOARD_ID` stored in `app/hf.py`.
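
For orientation, adding a source to `scripts/sample_test_set.py` roughly amounts to loading it, resampling the audio to 16 kHz, and renaming columns to match the combined schema. The sketch below assumes a hypothetical source dataset and column names; only the target columns (`audio`, `ipa`, `dataset`) come from the actual test set described above:

```python
# Hypothetical example of preparing an additional source for the combined
# test set; the dataset ID and the original column names are placeholders.
from datasets import Audio, load_dataset

def load_my_new_source():
    ds = load_dataset("username/my-ipa-dataset", split="test")  # hypothetical source
    ds = ds.cast_column("audio", Audio(sampling_rate=16_000))   # resample to 16 kHz
    ds = ds.map(lambda row: {"ipa": row["transcription"], "dataset": "my-ipa-dataset"})
    return ds.select_columns(["audio", "ipa", "dataset"])
```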

## Adding/Removing Dependencies

1. Activate the virtual environment with `. ./venv/bin/activate`
2. Add the dependency to `requirements.txt` (or remove it)
3. Make sure you have no unused dependencies with `pipx run deptry .` (if necessary, install pipx with `python -m pip install pipx`)
4. Run `pip install -r requirements.txt`
5. Freeze the dependencies with `pip freeze > requirements_lock.txt`

## Forking Into Your Own Leaderboard

1. Navigate to the space, click the three dots on the right, and select "Duplicate this Space".
2. Modify the `LEADERBOARD_ID` in `app/hf.py` to point at a dataset that you own and that the new space can use to store data (see the sketch below). You don't need to create the dataset in advance, but if you do, it should be empty.
3. Open the settings of your new space and add a new secret named `HF_TOKEN`. You can create a token at https://huggingface.co/settings/tokens; it just needs read access to all models you want to add to the leaderboard and write access to the private backing dataset specified by `LEADERBOARD_ID`.
4. Submit some models and enjoy!
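
For reference, the change in step 2 boils down to pointing the `LEADERBOARD_ID` constant at a dataset repo you own; the surrounding code in `app/hf.py` may look different, and the dataset ID below is a placeholder:

```python
# app/hf.py (sketch): the space reads and writes leaderboard entries to this
# dataset using the HF_TOKEN secret configured in step 3.
LEADERBOARD_ID = "your-username/your-leaderboard-results"  # placeholder dataset ID
```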

## File Structure

The two most important files are `app/app.py` for the main Gradio UI and `app/tasks.py` for the background tasks that evaluate models.

```
IPA-Transcription-EN/
├── README.md                   # General information about the leaderboard
├── CONTRIBUTING.md             # Contribution guidelines
├── DEVELOPMENT.md              # Development setup and design decisions
├── requirements.txt            # Python dependencies
├── requirements_lock.txt       # Locked dependencies
├── scripts/                    # Helper scripts
│   ├── sample_test_set.py      # Compute the combined test set
│   ├── run-prod.sh             # Run the leaderboard in production mode
│   └── run-dev.sh              # Run the leaderboard in development mode
├── venv/                       # Virtual environment
├── app/                        # All application code lives here
│   ├── data/                   # Phoneme transcription test set
│   ├── app.py                  # Main Gradio UI
│   ├── hf.py                   # Interface with the Huggingface API
│   ├── inference.py            # Model inference
│   ├── metrics.py              # Evaluation metrics
│   └── tasks.py                # Background tasks for model evaluation
└── img/                        # Images for README and other documentation
```