Model Card Vuurwerkverkenner

This model, developed by the Netherlands Forensic Institute, is designed to link fragments from exploded fireworks to their corresponding firework types. An application utilizing this model is available at www.vuurwerkverkenner.nl.

Architecture

The classification process involves two components: an embedding model that generates embeddings, and a classification model that determines classifications based on the distances between these embeddings. The classification component is used mainly for model evaluation. In practice, the application compares the embedding of the provided snippet image with the embeddings of the wrappers in the database. Because matching relies on stored embeddings rather than a fixed set of output classes, new wrappers can be added without retraining the model.

Embedding model

First, we train an embedding model that produces similar embeddings for snippets from the same source and dissimilar embeddings for snippets from different sources. This model is based on the Vision Transformer architecture (arXiv) and fine-tuned with the following specifications:

Model: ViT-B/32, with an L2-normalized linear layer as embedding head
Input: RGB image of 448x448 pixels
Output/embedding layer size: 128
Training loss: ProxyAnchorLoss (see here) with margin = 0.5 and alpha = 64
Fixed learning rate of 1e-4 for the model weights and 1e-2 for the proxy vectors with AdamW optimizer
Batch size: 150
Epochs: 20
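
As an illustration of the training objective, the following is a minimal pure-Python sketch of the Proxy Anchor loss as it is commonly defined, using the margin and alpha values listed above. It is not the training code of this model: the function names are hypothetical, the proxies are fixed toy vectors rather than learned parameters, and the embeddings are two-dimensional instead of 128-dimensional.

```python
import math

def l2_normalize(vector):
    # Scale a vector to unit length, as the embedding head does.
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def cosine_similarity(a, b):
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def proxy_anchor_loss(embeddings, labels, proxies, margin=0.5, alpha=64.0):
    # One proxy per class; labels are indices into `proxies`.
    proxies_with_positives = 0
    positive_term = 0.0
    negative_term = 0.0
    for class_index, proxy in enumerate(proxies):
        sims = [cosine_similarity(e, proxy) for e in embeddings]
        pos = [s for s, y in zip(sims, labels) if y == class_index]
        neg = [s for s, y in zip(sims, labels) if y != class_index]
        if pos:
            proxies_with_positives += 1
            # Pull samples of this class towards their proxy.
            positive_term += math.log1p(
                sum(math.exp(-alpha * (s - margin)) for s in pos)
            )
        # Push samples of other classes away from this proxy.
        negative_term += math.log1p(
            sum(math.exp(alpha * (s + margin)) for s in neg)
        )
    return (positive_term / max(proxies_with_positives, 1)
            + negative_term / len(proxies))

# Toy data: two classes whose proxies point in opposite directions.
proxies = [[1.0, 0.0], [-1.0, 0.0]]
embeddings = [[0.9, 0.1], [-0.8, 0.2]]
good = proxy_anchor_loss(embeddings, [0, 1], proxies)  # labels match proxies
bad = proxy_anchor_loss(embeddings, [1, 0], proxies)   # labels swapped
```

In this toy setup the loss is lower when each embedding lies close to the proxy of its own class, which is the behaviour the training procedure aims for.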

Classification

To connect a snippet photo to a firework wrapper, the trained embedding model first generates reference embeddings from a background dataset. It then generates an embedding for the snippet photo in the same way. Classification is achieved by calculating the cosine distance between the snippet photo embedding and the reference embeddings for each firework wrapper. For each category, the minimum distance among its reference embeddings serves as the representative score.
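
The scoring step described above can be sketched as follows. This is a minimal illustration, not the application's code: the function names, category names, and two-dimensional embeddings are hypothetical, and sorting the categories by ascending score is an assumption about how the scores are used.

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity; 0 means the vectors point the same way.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def rank_categories(snippet_embedding, reference_embeddings):
    # reference_embeddings maps a category name to its list of embeddings.
    # The score of a category is the minimum distance to any of its references.
    scores = {
        category: min(cosine_distance(snippet_embedding, ref) for ref in refs)
        for category, refs in reference_embeddings.items()
    }
    # Smallest distance first.
    return sorted(scores.items(), key=lambda item: item[1])

references = {
    "wrapper_a": [[1.0, 0.0], [0.9, 0.1]],
    "wrapper_b": [[0.0, 1.0], [0.2, 0.8]],
}
ranking = rank_categories([0.8, 0.2], references)
```

In this sketch, supporting a new wrapper only requires adding an entry to `references`, which mirrors why the embedding model does not need retraining when wrappers are added.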

Text filter

A text filter can optionally be applied after classification. It matches firework labels based on text found on the snippet. The snippet text must be entered manually, and all text fragments must be present on the label to get a match.
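
A minimal sketch of such a filter is shown below. The function names and label texts are hypothetical, and case-insensitive substring matching is an assumption; the only behaviour taken from the description above is that every entered fragment must occur on the label.

```python
def matches_label(label_text, fragments):
    # True only if every entered fragment occurs in the label text.
    label = label_text.casefold()
    return all(fragment.casefold() in label for fragment in fragments)

def filter_by_text(ranked_categories, label_texts, fragments):
    # Keep the classification order, dropping categories whose label
    # does not contain all fragments.
    return [
        category
        for category in ranked_categories
        if matches_label(label_texts.get(category, ""), fragments)
    ]

label_texts = {
    "wrapper_a": "Thunder King 100 shots",
    "wrapper_b": "Cobra 6",
}
ranked = ["wrapper_a", "wrapper_b"]
kept = filter_by_text(ranked, label_texts, ["thunder", "100"])
none_kept = filter_by_text(ranked, label_texts, ["thunder", "cobra"])
```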

Data

The model is trained and evaluated using data from fireworks involved in cases at the Netherlands Forensic Institute since 2010. The dataset is divided into three parts: train, validation, and holdout. The train and validation sets are used for training and model selection. The final model in the application is trained on all data except the holdout set. Further information on the development and application data can be found here and here.

Real snippets

We have generated snippets for the available firework categories by detonating the fireworks. These real snippets (also called 'lab snippets') are photographed with a high-quality DSLR camera against a white background, with optimal lighting conditions. The snippets are segmented, distributed across train, validation, and holdout sets, and grouped into images containing 1 to 10 snippets.

Mock-crime scene snippets

For certain categories, we have created photos that imitate crime scene conditions, e.g. by using suboptimal lighting and/or a phone camera. Because less background noise benefits model performance, these photos show the snippets against 'DNA blankets', which provide a fairly uniform background.

Artificial snippets

To ensure the embedding model can produce embeddings for all firework wrappers, including those without real snippets, we create 'artificial snippets' by randomly cropping wrapper images. Each artificial snippet image contains 1 to 10 snippet pieces, and several such images are generated per wrapper. An additional set is generated for each wrapper to serve as the reference dataset. Its embeddings are stored and compared against the embedding of the image provided in the application.
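
The random cropping step might look roughly like the sketch below, which only generates crop coordinates for one artificial snippet image. The function name, the rectangular crop shape, and the size fractions are all assumptions made for illustration; the actual crop shapes and sizes are not specified here.

```python
import random

def random_crop_boxes(width, height, rng, min_pieces=1, max_pieces=10,
                      min_frac=0.05, max_frac=0.3):
    # Returns (left, top, right, bottom) boxes inside a width x height image.
    boxes = []
    for _ in range(rng.randint(min_pieces, max_pieces)):
        w = max(1, int(width * rng.uniform(min_frac, max_frac)))
        h = max(1, int(height * rng.uniform(min_frac, max_frac)))
        left = rng.randint(0, width - w)
        top = rng.randint(0, height - h)
        boxes.append((left, top, left + w, top + h))
    return boxes

# A seeded generator makes the crops reproducible.
boxes = random_crop_boxes(1000, 600, random.Random(0))
```

Each box could then be cut from the wrapper image with an image library and pasted onto a plain canvas to form one artificial snippet image.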

Evaluation

To assess differences in performance across conditions, we compose a test set containing artificial, real, and mock-crime scene images. The evaluation covers the entire set and also breaks down performance by snippet type and for categories with many similar wrappers.

Metrics

Metric	Value
RecallAtKValidator(k=1)	0.9475017269168777
RecallAtKValidator(k=3)	0.9715634354133088
RecallAtKValidator(k=5)	0.9757080359198711
CategoricalRecallAtKValidator(k=1)	0.985493898227032
CategoricalRecallAtKValidator(k=5)	0.9945889937830993
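
For readers unfamiliar with recall at k, the sketch below shows the standard definition: the fraction of queries whose true label appears among the top k ranked results. It is an illustration with toy data and a hypothetical function name. How RecallAtKValidator and CategoricalRecallAtKValidator are computed exactly, and how they differ, is not specified here.

```python
def recall_at_k(ranked_predictions, true_labels, k):
    # ranked_predictions[i] is the ranked list of labels for query i,
    # best match first.
    hits = sum(
        1
        for ranking, truth in zip(ranked_predictions, true_labels)
        if truth in ranking[:k]
    )
    return hits / len(true_labels)

rankings = [
    ["a", "b", "c"],  # true label ranked first
    ["b", "a", "c"],  # true label ranked second
    ["c", "b", "a"],  # true label ranked third
]
truths = ["a", "a", "a"]
recall_1 = recall_at_k(rankings, truths, k=1)
recall_2 = recall_at_k(rankings, truths, k=2)
recall_3 = recall_at_k(rankings, truths, k=3)
```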

Limitations

The evaluation results may not reflect the model's real-world performance, for several reasons. Training and testing have occurred exclusively with snippets featuring plain backgrounds and optimal lighting, which may not always be achievable in practice. Performance is likely to be best with good-quality photos, a sufficient number of distinctive snippets, and correctly entered text, and may degrade when these conditions are not met. Additionally, if the firework type under examination is new or rare, it may be absent from the reference database, in which case the model cannot return it.

Using the model

This model is intended for use with the Vuurwerkverkenner application, which includes the necessary code for operation. The application's source code can be accessed on GitHub.
