Image Classification with AutoGluon

This project demonstrates how to perform image classification using AutoGluon's MultiModalPredictor on an external dataset, adhering to the specified rubric.

Dataset

The dataset used is mohitk24/image_dataset from Hugging Face, created by a classmate. The dataset contains images for a binary classification task, where the goal is to classify images into one of two classes based on the provided labels (0 and 1).

The original dataset has 30 images, and an augmented split with 300 images is used for training to increase the dataset size and improve model generalization.
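As a reference, the dataset can be pulled directly from the Hub with the datasets library. This is a minimal sketch; the split name and the "image"/"label" column names are assumptions and should be checked against the dataset card.

```python
from datasets import load_dataset

# Minimal sketch: load the classmate's dataset from the Hugging Face Hub.
# The split name ("train") and column names ("image", "label") are assumptions;
# verify them against the dataset card before use.
ds = load_dataset("mohitk24/image_dataset", split="train")
print(ds)           # number of rows and feature names
print(ds.features)  # label mapping, e.g. 0 = pen, 1 = toy
```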

Supervised Task

The supervised task is binary image classification. Given an image, the model predicts whether it belongs to class 0 (pen) or class 1 (toy).

AutoML and Model Training

AutoGluon's MultiModalPredictor was used to automate the model training process. The training was conducted with the following considerations:

  • Budgeted AutoML: The presets="medium_quality" setting was used to guide the AutoML process toward a reasonable trade-off between performance and training time.
  • Architectures/Hyperparameters: The model architecture was specified as resnet18 using hyperparameters={"model.names": ["timm_image"], "model.timm_image.checkpoint_name": "resnet18"}. AutoGluon automatically handles other hyperparameters like optimizer, learning rate, weight decay, and augmentation based on the chosen preset.
  • Early Stopping: AutoGluon internally uses early stopping based on the validation metric (accuracy) to prevent overfitting and optimize training time.
  • Validation Scheme: The data was split into training (240 images) and validation (60 images) sets using sklearn.model_selection.train_test_split with test_size=0.2 and stratify=df_aug["label"] so that both splits preserve the class balance. The validation set (df_aug_test) was passed to the tuning_data parameter of predictor.fit() for evaluating performance during training and for early stopping (see the sketch after this list).
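A minimal sketch of this setup is shown below. It assumes df_aug is a pandas DataFrame produced by the augmentation step, with an image-path column and a "label" column; the exact placement of presets (constructor vs. fit()) can differ between AutoGluon versions.

```python
from sklearn.model_selection import train_test_split
from autogluon.multimodal import MultiModalPredictor

# df_aug is assumed to be the augmented DataFrame from the previous step,
# e.g. columns: "image" (file path) and "label" (0 or 1).
df_aug_train, df_aug_test = train_test_split(
    df_aug, test_size=0.2, stratify=df_aug["label"], random_state=42
)

# Depending on the AutoGluon version, presets may be accepted by the
# constructor or by fit(); this sketch passes it to the constructor.
predictor = MultiModalPredictor(
    label="label",
    problem_type="binary",
    eval_metric="accuracy",
    presets="medium_quality",
    path="autogluon_image_predictor_dir",
)

predictor.fit(
    train_data=df_aug_train,
    tuning_data=df_aug_test,   # used for validation and early stopping
    hyperparameters={
        "model.names": ["timm_image"],
        "model.timm_image.checkpoint_name": "resnet18",
    },
)
```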

Training Summary

The predictor.fit_summary() provides details about the training process. Key information includes:

  • Best Architecture: resnet18
  • Hyperparameters: AutoGluon automatically tunes various hyperparameters. The specific values used can be inspected in the hparams.yaml file within the saved predictor artifact.
  • Training Curves/Early-Stop Rationale: Training progress and the validation metric over epochs can be visualized with TensorBoard from the logs saved in the predictor artifact directory. Early stopping monitored val_accuracy: a checkpoint was saved whenever the validation accuracy improved, and training stopped once several epochs passed without further improvement on the validation set (see the sketch after this list).
  • Training time: The training time was reported as 1366.70 seconds (~23 minutes).
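For reference, the summary and training curves can be inspected as follows; the exact layout of the log files inside the predictor directory is version-dependent, so treat the paths as assumptions.

```python
# Returns details about the fitted model, such as the backbone and training time.
summary = predictor.fit_summary()
print(summary)

# In Colab, the training/validation curves can be viewed with TensorBoard,
# pointing it at the predictor directory that holds the event files:
#   %load_ext tensorboard
#   %tensorboard --logdir autogluon_image_predictor_dir
```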

Test Metrics

After training, the model was evaluated on the original, non-augmented dataset (df_orig). The following metrics were calculated:

  • Accuracy: 0.9667
  • Weighted F1 Score: 0.9662

These metrics were calculated using sklearn.metrics.accuracy_score and sklearn.metrics.f1_score.

For a more detailed look at per-class performance, a confusion matrix or classification report can also be computed with sklearn.metrics.confusion_matrix or sklearn.metrics.classification_report, as sketched below.
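A sketch of the evaluation step, assuming df_orig is the original 30-image DataFrame with a "label" column and the pen/toy class names from above:

```python
from sklearn.metrics import (accuracy_score, f1_score,
                             confusion_matrix, classification_report)

# df_orig: the original, non-augmented DataFrame used as the test set (assumed layout).
y_true = df_orig["label"]
y_pred = predictor.predict(df_orig.drop(columns=["label"]))

print("Accuracy:", accuracy_score(y_true, y_pred))
print("Weighted F1:", f1_score(y_true, y_pred, average="weighted"))

# Optional per-class diagnostics mentioned above.
print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=["pen", "toy"]))
```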

Reproducibility

This Colab notebook is designed to be reproducible.

  • Fixed Seeds: While AutoGluon handles some internal randomness, setting random_state=42 in sklearn.model_selection.train_test_split ensures the same data split each time. For full reproducibility with AutoGluon, additional seeds may need to be set as described in its documentation (see the sketch after this list).
  • Stated Compute Budget: Training on a standard Colab environment (CPU, as indicated by the logs) took approximately 23 minutes. Using a GPU would significantly reduce this time.
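A minimal seed-setting sketch, assuming the standard Python/NumPy/PyTorch RNGs are the relevant sources of randomness; AutoGluon may expose its own seeding options depending on the version.

```python
import random
import numpy as np
import torch

SEED = 42

# Best-effort seeding of the libraries AutoGluon builds on; this does not
# guarantee bitwise-identical runs but reduces run-to-run variation.
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)

# The data split itself is made deterministic via random_state=SEED in
# sklearn.model_selection.train_test_split (as used above).
```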

Hugging Face Model Upload

The trained model was uploaded to the Hugging Face Hub repository: FaiyazAzam/24679-image-autolguon-predictor.

The repository contains the following:

  • autogluon_image_predictor_dir.zip: A zipped archive of the native AutoGluon predictor directory, allowing the model to be loaded directly with autogluon.multimodal.MultiModalPredictor.load() (a loading sketch follows this list).

  • autogluon_image_predictor.pkl: A pickled version of the MultiModalPredictor object using cloudpickle, providing another way to load the model.

  • Model Details: Information about the model architecture (resnet18) and the task (binary image classification).

  • Training Details: Summary of the training process, including the dataset used, hyperparameters (implicitly handled by AutoGluon presets and the specified backbone), training curves (referencing the TensorBoard logs), and early stopping.

  • Evaluation Results: The accuracy and weighted F1 score on the original test set.

  • Known Failure Modes: Potential failure modes could include performance degradation on images significantly different from the training data, or issues with images that are not clear or contain multiple objects. Further analysis on specific image types or edge cases would be needed to identify more specific failure modes.
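To use the uploaded model, the zipped predictor directory can be downloaded from the Hub and loaded back with AutoGluon. The extracted folder name below is an assumption and should match the directory that was zipped before upload.

```python
import zipfile
from huggingface_hub import hf_hub_download
from autogluon.multimodal import MultiModalPredictor

# Download the zipped predictor directory from the model repository.
zip_path = hf_hub_download(
    repo_id="FaiyazAzam/24679-image-autolguon-predictor",
    filename="autogluon_image_predictor_dir.zip",
)

# Extract the archive and load the native AutoGluon predictor.
with zipfile.ZipFile(zip_path) as zf:
    zf.extractall("predictor_unzipped")

predictor = MultiModalPredictor.load("predictor_unzipped/autogluon_image_predictor_dir")
```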
