
	---
	license: cc-by-nc-4.0
	datasets:
	- rfcx/frugalai
	language:
	- en
	metrics:
	- accuracy
	pipeline_tag: audio-classification
	tags:
	- acoustics
	- lgbm
	- frugality
	- signal-processing
	- climate
	- chainsaw
	---
	# Quefrency Guardian: Chainsaw Noise Detector

	An efficient model to detect chainsaw activity in forest soundscapes using spectral and cepstral audio features. The model is designed for environmental conservation and is based on a LightGBM classifier, capable of low-energy inference on both CPU and GPU devices. This repository provides the complete code and configuration for feature extraction, model implementation, and deployment.

	## Installation

	To use the model, clone the repository and install the dependencies:

	```bash
	git clone https://huggingface.co/tlmk22/QuefrencyGuardian
	cd QuefrencyGuardian
	pip install -r requirements.txt
	```

	## Model Overview

	### Features

	The model uses:
	- Spectrogram Features
	- Cepstral Features: Computed as the FFT of the log spectrogram restricted to the frequency band [`f_min`, `f_max`], keeping only the quefrency range [`fc_min`, `fc_max`].
	- Time Averaging: Both feature sets are averaged over the whole audio clip for robustness in noisy settings (Welch's method).
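
The feature steps listed above can be sketched with NumPy alone. This is a minimal illustration, not the repository's implementation: the function name `welch_cepstral_features`, the frame settings `n_fft` and `hop`, and every default value for `f_min`, `f_max`, `fc_min`, and `fc_max` are assumptions made for the example. Averaging the log spectrum (rather than the power spectrum before taking the log) is also an assumption.

```python
import numpy as np


def welch_cepstral_features(audio, sample_rate, f_min=500.0, f_max=5000.0,
                            fc_min=1e-3, fc_max=1e-2, n_fft=1024, hop=512):
    """Return (mean log spectrum, mean cepstrum magnitude) for one clip.

    All default values are illustrative assumptions, not the model's settings.
    """
    audio = np.asarray(audio, dtype=float)

    # Split the clip into overlapping Hann-windowed frames: (n_frames, n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[idx] * np.hanning(n_fft)

    # Power spectrogram and its frequency axis in Hz
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)

    # Log spectrogram restricted to the band [f_min, f_max]
    band = (freqs >= f_min) & (freqs <= f_max)
    log_spec = np.log(power[:, band] + 1e-12)

    # Cepstrum: FFT of the log spectrum along the frequency axis;
    # the resulting axis is quefrency, measured in seconds
    cepstrum = np.abs(np.fft.rfft(log_spec, axis=1))
    quefrency = np.fft.rfftfreq(log_spec.shape[1], d=freqs[1] - freqs[0])
    keep = (quefrency >= fc_min) & (quefrency <= fc_max)

    # Average both feature sets over all time frames
    return log_spec.mean(axis=0), cepstrum[:, keep].mean(axis=0)


# One second of synthetic noise at 12 kHz stands in for a real recording
rng = np.random.default_rng(0)
clip = rng.standard_normal(12_000)
spec, cep = welch_cepstral_features(clip, 12_000)
```

Concatenating the two returned vectors would give one fixed-length feature vector per clip, which is the kind of input a gradient-boosted tree classifier expects.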

	### LightGBM Model

	The model is a binary classifier (chainsaw vs environment) trained on the `rfcx/frugalai` dataset.
	Key model parameters are included in `model/lgbm_params.json`.

	## Usage

	### Load the Model and Parameters

	```python
	import json

	from datasets import load_dataset
	from fast_model import FastModel

	# Load the feature-extraction and LightGBM parameters
	with open("model/features.json", "r") as f:
	    features = json.load(f)

	with open("model/lgbm_params.json", "r") as f:
	    lgbm_params = json.load(f)

	# Initialize the model
	model = FastModel(
	    feature_params=features,
	    lgbm_params=lgbm_params,
	    model_file="model/model.txt",  # Path to the serialized model file
	    device="cuda",  # Run on GPU (the model also supports CPU inference)
	)

	# Predict on a dataset
	dataset = load_dataset("rfcx/frugalai")
	predictions = model.predict(dataset["test"])
	print(predictions)
	```

	## Performance

	- Accuracy: 95% on the test set, with a 4.5% false positive rate (FPR) at the default threshold.
	- Low-Energy Mode: Using only 1 second of audio for inference reduces energy consumption by 50%, at the cost of about 1% accuracy.
	- Environmental Impact: Inference energy consumption is 0.21 Wh, tracked using CodeCarbon.
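
As a rough sketch of how the accuracy and FPR figures can be reproduced from predictions, the helper below computes both from integer labels. It assumes the `rfcx/frugalai` convention (`0` for chainsaw, `1` for environment) and treats chainsaw as the positive class, so a false positive is an environment clip predicted as chainsaw. The README does not state which class is positive, and the function name `accuracy_and_fpr` is hypothetical.

```python
import numpy as np

CHAINSAW, ENVIRONMENT = 0, 1  # label convention of rfcx/frugalai


def accuracy_and_fpr(y_true, y_pred):
    """Accuracy and false positive rate, with chainsaw as the positive class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    accuracy = float(np.mean(y_true == y_pred))
    # Negatives are environment clips; a false positive is one flagged as chainsaw
    negatives = y_true == ENVIRONMENT
    false_pos = negatives & (y_pred == CHAINSAW)
    fpr = float(false_pos.sum() / negatives.sum())
    return accuracy, fpr


# Toy labels for illustration only, not model output
acc, fpr = accuracy_and_fpr([0, 0, 1, 1, 1, 1], [0, 1, 1, 1, 1, 0])
print(f"accuracy={acc:.2f}, FPR={fpr:.2f}")  # accuracy=0.67, FPR=0.25
```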

	## License

	This project is licensed under the [Creative Commons Attribution-NonCommercial 4.0 International](https://creativecommons.org/licenses/by-nc/4.0/) license. You are free to share and adapt the work for non-commercial purposes, provided attribution is given.

	---

	## Dataset

	The model was trained and evaluated on the [Rainforest Connection (RFCx) Frugal AI](https://huggingface.co/datasets/rfcx/frugalai) dataset.

	### Labels
	- `0`: Chainsaw
	- `1`: Environment

	## Limitations

	- Audio Length: The classifier is designed for clips of 1 to 3 seconds, sampled at either 12 kHz or 24 kHz.
	- Environmental Noise: The model may misclassify recordings that are very noisy or that contain machinery sounding similar to a chainsaw.
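
A small input check can guard against the audio-length and sampling-rate limits above before a clip reaches the model. The sketch below is a hypothetical helper (`prepare_clip` is not part of the repository) that rejects unsupported sampling rates and crops a clip to a chosen duration between 1 and 3 seconds. Whether `FastModel` already performs equivalent checks internally is not stated here.

```python
import numpy as np

SUPPORTED_RATES = (12_000, 24_000)  # Hz, from the limitations above


def prepare_clip(audio, sample_rate, seconds=3.0):
    """Validate the sampling rate and keep at most `seconds` of audio."""
    if sample_rate not in SUPPORTED_RATES:
        raise ValueError(f"unsupported sampling rate: {sample_rate} Hz")
    if not 1.0 <= seconds <= 3.0:
        raise ValueError("seconds must be between 1 and 3")
    audio = np.asarray(audio, dtype=np.float32)
    n_samples = int(round(seconds * sample_rate))
    # Slicing keeps the start of the clip and drops the remainder
    return audio[:n_samples]


# Crop a 3-second, 12 kHz placeholder signal down to 1 second
clip = prepare_clip(np.zeros(36_000), 12_000, seconds=1.0)
print(clip.shape)  # (12000,)
```

Cropping to 1 second corresponds to the shorter input used by the low-energy mode; clips shorter than the requested duration are returned unchanged by this sketch.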

	---

	This README serves as the primary documentation for Hugging Face and provides an overview of the model's purpose, data requirements, and usage.