---
license: cc-by-nc-4.0
datasets:
- rfcx/frugalai
language:
- en
metrics:
- accuracy
pipeline_tag: audio-classification
tags:
- acoustics
- lgbm
- frugality
- signal-processing
- climate
- chainsaw
---

# Quefrency Guardian: Chainsaw Noise Detector

An efficient model to detect chainsaw activity in forest soundscapes using spectral and cepstral audio features. The model is designed for environmental conservation and is based on a LightGBM classifier, capable of low-energy inference on both CPU and GPU devices. This repository provides the complete code and configuration for feature extraction, model implementation, and deployment.

## Installation

To use the model, clone the repository and install the dependencies:

```bash
git clone https://huggingface.co/tlmk22/QuefrencyGuardian
cd QuefrencyGuardian
pip install -r requirements.txt
```

## Model Overview

### Features

The model uses:

- **Spectrogram Features**
- **Cepstral Features**: Computed as the FFT of the log spectrogram restricted to the frequency band [`f_min`, `f_max`], keeping only the quefrency range [`fc_min`, `fc_max`].
- **Time Averaging**: Both feature sets are averaged over the whole audio clip (Welch-style averaging) for robustness in noisy settings.

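
The feature pipeline described above can be sketched roughly as follows. This is an illustrative NumPy-only approximation, not the repository's implementation: the function name `spectral_cepstral_features`, the frame parameters `n_fft` and `hop`, and the choice to average the log spectrum over frames are all assumptions. Only the names `f_min`, `f_max`, `fc_min`, and `fc_max` come from the description above.

```python
import numpy as np


def spectral_cepstral_features(audio, sr, n_fft=1024, hop=512,
                               f_min=0.0, f_max=None,
                               fc_min=0.0, fc_max=None):
    """Time-averaged log spectrum and cepstrum of one clip (hypothetical helper)."""
    audio = np.asarray(audio, dtype=np.float64)
    f_max = sr / 2 if f_max is None else f_max

    # Slice the clip into overlapping Hann-windowed frames.
    n_frames = 1 + (len(audio) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[idx] * np.hanning(n_fft)

    # Power spectrogram, shape (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)

    # Keep the band [f_min, f_max] and take the log.
    band = (freqs >= f_min) & (freqs <= f_max)
    log_spec = np.log(power[:, band] + 1e-10)

    # Cepstrum: FFT of the log spectrum along the frequency axis.
    cepstrum = np.abs(np.fft.rfft(log_spec, axis=1))
    df = freqs[1] - freqs[0]
    quefrency = np.fft.rfftfreq(log_spec.shape[1], d=df)  # in seconds

    # Keep the quefrency range [fc_min, fc_max].
    fc_max = quefrency[-1] if fc_max is None else fc_max
    keep = (quefrency >= fc_min) & (quefrency <= fc_max)

    # Average both feature sets over all frames of the clip.
    return log_spec.mean(axis=0), cepstrum[:, keep].mean(axis=0)


# Example on synthetic noise: 3 seconds at 12 kHz.
rng = np.random.default_rng(0)
audio = rng.standard_normal(3 * 12_000)
spec, cep = spectral_cepstral_features(audio, 12_000, f_min=500.0, f_max=5000.0)
```

Averaging over frames yields one fixed-length feature vector per clip regardless of its duration, which is the kind of input a tabular classifier such as LightGBM expects.
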
### LightGBM Model

The model is a **binary classifier** (chainsaw vs environment) trained on the `rfcx/frugalai` dataset.
Key model parameters are included in `model/lgbm_params.json`.

## Usage

### Load the Model and Parameters

```python
import json

from datasets import load_dataset
from fast_model import FastModel

# Load the feature-extraction and LightGBM parameters
with open("model/features.json", "r") as f:
    features = json.load(f)

with open("model/lgbm_params.json", "r") as f:
    lgbm_params = json.load(f)

# Initialize the model
model = FastModel(
    feature_params=features,
    lgbm_params=lgbm_params,
    model_file="model/model.txt",  # Path to the serialized model file
    device="cuda",
)

# Predict on a dataset
dataset = load_dataset("rfcx/frugalai")
predictions = model.predict(dataset["test"])
print(predictions)
```

### Performance

- **Accuracy**: 95% on the test set, with a 4.5% false positive rate (FPR) at the default threshold.
- **Low-Energy Mode**: Using only 1 second of audio for inference reduces energy consumption by 50%, at the cost of ~1% accuracy.
- **Environmental Impact**: Inference energy consumption is **0.21 Wh**, tracked using CodeCarbon.

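
For readers who want to compute comparable metrics on their own predictions, here is a minimal sketch. It assumes the dataset's label convention (`0` for chainsaw, `1` for environment), treats chainsaw as the positive class, and assumes predictions are available as hard labels; the helper name `accuracy_and_fpr` is hypothetical and is not part of this repository.

```python
import numpy as np

CHAINSAW, ENVIRONMENT = 0, 1  # label convention of rfcx/frugalai


def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate), with chainsaw as the positive class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    accuracy = float(np.mean(y_true == y_pred))

    # A false positive is an environment clip flagged as chainsaw.
    negatives = y_true == ENVIRONMENT
    false_positives = int(np.sum(negatives & (y_pred == CHAINSAW)))
    fpr = false_positives / max(int(negatives.sum()), 1)
    return accuracy, fpr


# Toy example: 2 chainsaw clips and 4 environment clips.
y_true = [0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0, 1]
acc, fpr = accuracy_and_fpr(y_true, y_pred)
print(f"accuracy={acc:.3f}, FPR={fpr:.3f}")
```

In a monitoring setting the FPR matters as much as accuracy, since every false alarm may trigger an unnecessary field check.
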
### License

This project is licensed under the [Creative Commons Attribution-NonCommercial 4.0 International](https://creativecommons.org/licenses/by-nc/4.0/) license. You are free to share and adapt the work for non-commercial purposes, provided attribution is given.

---

## Dataset

The model was trained and evaluated on the [Rainforest Connection (RFCx) Frugal AI](https://huggingface.co/datasets/rfcx/frugalai) dataset.

### Labels

- `0`: Chainsaw
- `1`: Environment

## Limitations

- **Audio Length**: The classifier is designed for 1 to 3 seconds of audio sampled at either 12 kHz or 24 kHz.
- **Environmental Noise**: The model may misclassify noisy recordings or recordings that contain machinery acoustically similar to chainsaws.

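
A small guard along the following lines could enforce these input constraints before inference. It is only a sketch: the helper `prepare_clip` and its parameters are hypothetical, and rejecting short clips or truncating long ones is an assumed policy rather than documented behavior of this repository. Passing `max_seconds=1.0` would produce the clip length used in the low-energy mode.

```python
import numpy as np

SUPPORTED_RATES = (12_000, 24_000)  # sample rates the classifier is designed for


def prepare_clip(audio, sr, min_seconds=1.0, max_seconds=3.0):
    """Validate a clip and truncate it to at most max_seconds (hypothetical helper)."""
    if sr not in SUPPORTED_RATES:
        raise ValueError(f"unsupported sample rate {sr}; expected one of {SUPPORTED_RATES}")

    audio = np.asarray(audio, dtype=np.float32)
    if len(audio) < int(min_seconds * sr):
        raise ValueError("clip is shorter than the minimum supported duration")

    # Keep only the first max_seconds of audio.
    return audio[: int(max_seconds * sr)]


# Example: a 5-second clip at 12 kHz is truncated to 3 seconds.
clip = prepare_clip(np.zeros(5 * 12_000), 12_000)
```
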
---

This README serves as the primary documentation for Hugging Face and provides an overview of the model's purpose, data requirements, and usage.