---
license: cc-by-nc-4.0
datasets:
  - rfcx/frugalai
language:
  - en
metrics:
  - accuracy
pipeline_tag: audio-classification
tags:
  - acoustics
  - lgbm
  - frugality
  - signal-processing
  - climate
  - chainsaw
---

# Quefrency Guardian: Chainsaw Noise Detector

An efficient model to detect chainsaw activity in forest soundscapes using spectral and cepstral audio features. The model is designed for environmental conservation and is based on a LightGBM classifier, capable of low-energy inference on both CPU and GPU devices. This repository provides the complete code and configuration for feature extraction, model implementation, and deployment.

## Installation

To use the model, clone the repository and install the dependencies:

```bash
git clone https://huggingface.co/tlmk22/QuefrencyGuardian
cd QuefrencyGuardian
pip install -r requirements.txt
```

## Model Overview

### Features

The model uses:

- **Spectrogram Features**
- **Cepstral Features**: Computed as the FFT of the log spectrogram between [f_min, f_max], restricted to a filtered quefrency range [fc_min, fc_max].
- **Time Averaging**: Both feature sets are averaged over the whole audio clip (Welch-style) for robustness in noisy settings.
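
The cepstral features described above can be sketched roughly as follows. This is a minimal illustration assuming NumPy/SciPy; the parameter names and values are placeholders, not the repository's defaults (those live in `model/features.json`):

```python
import numpy as np
from scipy.signal import spectrogram

def cepstral_features(audio, sr, f_min=200.0, f_max=5000.0, fc_min=2, fc_max=40):
    """Sketch: FFT of the log spectrogram restricted to [f_min, f_max],
    keeping a quefrency band [fc_min, fc_max], then averaged over time
    (Welch-style) for robustness to noise. All values are illustrative."""
    freqs, _, spec = spectrogram(audio, fs=sr, nperseg=512, noverlap=256)
    band = (freqs >= f_min) & (freqs <= f_max)
    log_spec = np.log(spec[band] + 1e-10)             # (n_band_bins, n_frames)
    cepstrum = np.abs(np.fft.rfft(log_spec, axis=0))  # FFT along frequency axis
    cepstrum = cepstrum[fc_min:fc_max]                # keep quefrency band
    return cepstrum.mean(axis=1)                      # average over time frames

# Example on 2 s of synthetic audio at 12 kHz
rng = np.random.default_rng(0)
feats = cepstral_features(rng.standard_normal(24000), sr=12000)
print(feats.shape)  # (38,)
```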

### LightGBM Model

The model is a binary classifier (chainsaw vs environment) trained on the rfcx/frugalai dataset. Key model parameters are included in `model/lgbm_params.json`.

## Usage

### Load the Model and Parameters

```python
import json
from fast_model import FastModel

# Load parameters
with open("model/features.json", "r") as f:
    features = json.load(f)

with open("model/lgbm_params.json", "r") as f:
    lgbm_params = json.load(f)

# Initialize the model
model = FastModel(
    feature_params=features,
    lgbm_params=lgbm_params,
    model_file="model/model.txt",  # Path to the serialized model file
    device="cuda",
)
```

### Predict on a Dataset

```python
from datasets import load_dataset

dataset = load_dataset("rfcx/frugalai")
predictions = model.predict(dataset["test"])
print(predictions)
```

## Performance

- **Accuracy**: 95% on the test set, with a 4.5% false-positive rate at the default threshold.
- **Low-Energy Mode**: Using only 1 second of audio for inference reduces energy consumption by 50%, at the cost of ~1% accuracy.
- **Environmental Impact**: Inference energy consumption is 0.21 Wh, tracked with CodeCarbon.
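
Threshold-dependent metrics like these can be recomputed from raw model scores. A small sketch (the function name is illustrative; the convention assumed here treats chainsaw, label `0` in this dataset, as the positive class):

```python
import numpy as np

def accuracy_and_fpr(probs, labels, threshold=0.5):
    """Accuracy and false-positive rate at a decision threshold.
    Assumed convention: probs = P(chainsaw); dataset label 0 = chainsaw,
    so labels are remapped to 1 = chainsaw for the metrics."""
    y_true = (labels == 0).astype(int)         # 1 = chainsaw
    y_pred = (probs >= threshold).astype(int)
    acc = (y_pred == y_true).mean()
    negatives = y_true == 0
    fpr = y_pred[negatives].mean() if negatives.any() else 0.0
    return acc, fpr

# Toy scores and labels (0 = chainsaw, 1 = environment)
probs = np.array([0.9, 0.2, 0.7, 0.1])
labels = np.array([0, 1, 0, 1])
acc, fpr = accuracy_and_fpr(probs, labels)
print(acc, fpr)  # 1.0 0.0
```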

## License

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. You are free to share and adapt the work for non-commercial purposes, provided attribution is given.


## Dataset

The model was trained and evaluated on the Rainforest Connection (RFCx) Frugal AI dataset.

Labels:

- `0`: Chainsaw
- `1`: Environment

## Limitations

- **Audio Length**: The classifier is designed for 1 to 3 seconds of audio sampled at either 12 kHz or 24 kHz.
- **Environmental Noise**: The model may misclassify very noisy recordings, or recordings containing machinery acoustically similar to a chainsaw.
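
Recordings at other sampling rates can be converted before inference. A minimal sketch using SciPy's polyphase resampler, targeting 12 kHz (one of the two supported rates above; the function name is illustrative):

```python
import numpy as np
from math import gcd
from scipy.signal import resample_poly

def to_supported_rate(audio, sr, target_sr=12000):
    """Resample audio at an arbitrary rate to a supported rate (12/24 kHz)."""
    if sr == target_sr:
        return audio
    g = gcd(sr, target_sr)
    return resample_poly(audio, target_sr // g, sr // g)

clip = np.zeros(44100)               # 1 s of silence at 44.1 kHz
out = to_supported_rate(clip, 44100)
print(len(out))  # 12000
```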

This README serves as the primary documentation for Hugging Face and provides an overview of the model's purpose, data requirements, and usage.