Toxic-Predict

Toxic-Predict is a machine learning project developed as part of the Cellula Internship, focused on safe and responsible multi-modal toxic content moderation. It classifies text queries and image descriptions into nine toxicity categories, including "Safe", "Violent Crimes", "Non-Violent Crimes", and "Unsafe". The project combines deep learning (Keras/TensorFlow), NLP preprocessing, and benchmarking against modern transformer models to build and evaluate a robust multi-class toxic content classifier.


🚩 Project Context

This project is part of the Cellula Internship proposal:
"Safe and Responsible Multi-Modal Toxic Content Moderation"
The goal is to build a dual-stage moderation pipeline for both text and images, combining hard guardrails (Llama Guard) and soft classification (DistilBERT/Deep Learning) for nuanced, policy-compliant moderation.
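
A rough sketch of how the two stages chain together (the guardrail and classifier below are placeholder stubs, not the project's actual models):

def hard_guardrail(text: str, image_desc: str) -> bool:
    # Placeholder for a Llama Guard call; returns True when content is flagged outright.
    return "attack plan" in text.lower()

def soft_classifier(text: str, image_desc: str) -> tuple[str, float]:
    # Placeholder for the DistilBERT/CNN/LSTM soft classifier.
    return "Safe", 0.99

def moderate(text: str, image_desc: str = "") -> dict:
    # Stage 1: hard filter blocks clearly policy-violating content.
    if hard_guardrail(text, image_desc):
        return {"action": "block", "stage": "hard_guardrail"}
    # Stage 2: soft classifier assigns one of the nine toxicity categories.
    label, score = soft_classifier(text, image_desc)
    return {"action": "allow" if label == "Safe" else "review", "label": label, "score": score}

print(moderate("hello world", "a sunny beach"))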


Features

  • Dual-stage moderation: hard filter (Llama Guard) + soft classifier (DistilBERT/CNN/LSTM)
  • Data cleaning, preprocessing, and label encoding
  • Tokenization and sequence padding for text data
  • Deep learning and transformer-based models for multi-class toxicity classification
  • Evaluation metrics: classification report and confusion matrix
  • Jupyter notebooks for data exploration and model development
  • Streamlit web app for demo and deployment
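
A minimal sketch of the Streamlit demo wiring (the classify stub is a stand-in; the real app would load the trained model and tokenizer as shown under Usage below):

import streamlit as st

def classify(text: str, image_desc: str) -> tuple[str, float]:
    # Stand-in for the trained classifier; replace with the real model call.
    return "Safe", 0.99

st.title("Toxic-Predict demo")
text = st.text_area("Query text")
image_desc = st.text_area("Image description (optional)")
if st.button("Classify") and text:
    label, score = classify(text, image_desc)
    st.write({"label": label, "score": score})

Run it with streamlit run app.py (the file name is assumed).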


Usage

  • Preprocessing and Tokenization:
    See notebooks/Preprocessing.ipynb and notebooks/tokenization.ipynb for step-by-step data cleaning, splitting, and tokenization.
  • Model Training:
    Model architecture and training code are in models/model.py.
  • Inference:
    Load the trained model (models/toxic_classifier.h5 or .keras) and tokenizer (data/tokenizer.pkl) to predict toxicity categories for new samples.
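
A minimal inference sketch, assuming the tokenizer was saved with pickle and the model expects padded sequences (maxlen=100 and the way the two inputs are combined are assumptions; check the notebooks for the exact preprocessing):

import pickle
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences

model = load_model("models/toxic_classifier.keras")
with open("data/tokenizer.pkl", "rb") as f:
    tokenizer = pickle.load(f)

# Combine query text and image description into one input string (assumption).
sample = "This is a dangerous post" + " " + "Knife shown in the image"
seq = tokenizer.texts_to_sequences([sample])
padded = pad_sequences(seq, maxlen=100)  # maxlen=100 is an assumption

probs = model.predict(padded)[0]
print(int(np.argmax(probs)), float(np.max(probs)))  # encoded category and confidence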

Data

  • CSV files with columns: query, image descriptions, Toxic Category, and Toxic Category Encoded.
  • Data splits: train.csv, eval.csv, and test.csv; cleaned.csv holds the processed data.
  • 9 categories: Safe, Violent Crimes, Elections, Sex-Related Crimes, Unsafe, Non-Violent Crimes, Child Sexual Exploitation, Unknown S-Type, Suicide & Self-Harm.
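
A quick way to inspect a split and its label balance (the file path assumes the CSVs sit in a data/ folder; pandas is an extra dependency not listed under Requirements):

import pandas as pd

train = pd.read_csv("data/train.csv")  # path is an assumption
print(train.columns.tolist())                  # query, image descriptions, Toxic Category, Toxic Category Encoded
print(train["Toxic Category"].value_counts())  # distribution over the nine categories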

Model

  • Deep learning model built with Keras (TensorFlow backend).
  • Multi-class classification with label encoding for toxicity categories.
  • Benchmarking with PEFT-LoRA DistilBERT and baseline CNN/LSTM.
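
A minimal sketch of a baseline Keras classifier along the CNN/LSTM lines above (vocabulary size, sequence length, and layer sizes are assumptions, not the actual hyperparameters in models/model.py):

from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN, NUM_CLASSES = 20000, 100, 9  # assumed values

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB_SIZE, 128),
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dropout(0.3),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),  # one probability per toxicity category
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # labels are integer-encoded
              metrics=["accuracy"])
model.summary()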

Evaluation

  • Classification report and confusion matrix are generated for model evaluation.
  • See the evaluation steps in notebooks/Preprocessing.ipynb.
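
A minimal sketch of the evaluation step (the label arrays below are dummy placeholders; scikit-learn is an extra dependency not listed under Requirements):

import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Replace with the encoded test labels and the argmax of the model's softmax output.
y_true = np.array([0, 1, 2, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0])

print(classification_report(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))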


🤗 Hugging Face Inference

This model is available on the Hugging Face Hub: NightPrince/Toxic_Classification

Inference API Usage

You can use the Hugging Face Inference API or widget with two fields:

  • text: The main query or post text
  • image_desc: The image description (if any)

Example (Python):

from huggingface_hub import InferenceClient

client = InferenceClient("NightPrince/Toxic_Classification")

# The repo's custom pipeline.py accepts both fields in a single payload.
result = client.text_classification({
    "text": "This is a dangerous post",
    "image_desc": "Knife shown in the image"
})
print(result)  # e.g. {'label': 'Violent Crimes', 'score': 0.98}

Custom Pipeline Details

  • The model uses a custom pipeline.py for multi-input inference (a sketch follows the file list below).
  • The output is a dictionary with the predicted label (class name) and score (confidence).
  • Class names are mapped using label_map.json.

Files in the repo:

  • pipeline.py (custom inference logic)
  • tokenizer.json (Keras tokenizer)
  • label_map.json (class code to name mapping)
  • TensorFlow SavedModel files (saved_model.pb, variables/)
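
A minimal sketch of the kind of multi-input logic pipeline.py implements; the class name, maxlen, and the string keys of label_map.json are assumptions, and loading a SavedModel directory with load_model assumes a TF 2.x / Keras 2 setup:

import json
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import tokenizer_from_json
from tensorflow.keras.preprocessing.sequence import pad_sequences

class ToxicPipeline:
    def __init__(self, path="."):
        self.model = tf.keras.models.load_model(path)        # SavedModel directory (saved_model.pb, variables/)
        with open(f"{path}/tokenizer.json") as f:
            self.tokenizer = tokenizer_from_json(f.read())   # Keras tokenizer
        with open(f"{path}/label_map.json") as f:
            self.label_map = json.load(f)                    # class code -> class name

    def __call__(self, inputs: dict) -> dict:
        # Combine both fields into one sequence, pad, and classify.
        combined = f"{inputs.get('text', '')} {inputs.get('image_desc', '')}"
        seq = pad_sequences(self.tokenizer.texts_to_sequences([combined]), maxlen=100)  # maxlen assumed
        probs = self.model.predict(seq)[0]
        idx = int(np.argmax(probs))
        return {"label": self.label_map[str(idx)], "score": float(np.max(probs))}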

Requirements:

tensorflow
keras
numpy


License

MIT License


Author: Yahya Muhammad Alnwsany
Contact: [email protected]
Portfolio: https://nightprincey.github.io/Portfolio/
