
Source: CrisisMMD (Alam et al., 2018)

Data Type: Multimodal — each sample includes:

tweet_text (social media text)

tweet_image (corresponding image from the tweet)

Total Samples Used: ~18,802 (from the dataset)

Class Labels:

0 → Non-informative

1 → Informative

Only samples whose tweet_text and tweet_image labels agree were kept. This yielded 12,743 tweets, which were converted into train and test .pt files.
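One reading of the filtering step above is that a tweet is kept only when its text-level and image-level annotations match. The following is a minimal sketch under that assumption; the field names `label_text` and `label_image`, the label strings, and the sample records are hypothetical and not taken from this repository.

```python
# Hypothetical label strings mapped to the 0/1 classes listed above.
LABEL_TO_ID = {"not_informative": 0, "informative": 1}


def keep_agreeing(records):
    """Keep records whose text and image labels match, and attach a 0/1 label."""
    kept = []
    for rec in records:
        if rec["label_text"] == rec["label_image"]:
            kept.append({**rec, "label": LABEL_TO_ID[rec["label_text"]]})
    return kept


# Made-up records for illustration only.
records = [
    {"tweet_id": "a", "label_text": "informative", "label_image": "informative"},
    {"tweet_id": "b", "label_text": "informative", "label_image": "not_informative"},
    {"tweet_id": "c", "label_text": "not_informative", "label_image": "not_informative"},
]

filtered = keep_agreeing(records)
```

Here the second record is dropped because its two annotations disagree, which mirrors the reduction from roughly 18,802 samples to 12,743.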

✅ Preprocessing Done

Text:

Tokenized using BERT tokenizer (bert-base-uncased)

Extracted input_ids and attention_mask

Image:

Processed using ResNet-50

Extracted 2048-dimensional feature vectors

Label:

Encoded to 0 or 1 as per class

The final preprocessed dataset was saved as .pt files:

train_info.pt

test_info.pt

Each contains: input_ids, attention_mask, image_vector, and label tensors.
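The saved tensors can be wrapped in a small PyTorch dataset for batching. This sketch assumes each .pt file holds a dict keyed by the four names listed above; the actual on-disk layout is not documented here, so a dummy dict stands in for `torch.load("train_info.pt")`, and the sequence length of 32 is an arbitrary choice.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class CrisisTensorDataset(Dataset):
    """Wraps a dict of aligned tensors (assumed layout of the .pt files)."""

    KEYS = ("input_ids", "attention_mask", "image_vector", "label")

    def __init__(self, tensors):
        missing = [k for k in self.KEYS if k not in tensors]
        if missing:
            raise KeyError(f"missing keys: {missing}")
        self.tensors = tensors

    def __len__(self):
        return self.tensors["label"].shape[0]

    def __getitem__(self, idx):
        # One sample: token ids, attention mask, 2048-d image vector, label.
        return {k: self.tensors[k][idx] for k in self.KEYS}


# Dummy stand-in for the loaded file; shapes other than 2048 are assumptions.
dummy = {
    "input_ids": torch.zeros(6, 32, dtype=torch.long),
    "attention_mask": torch.ones(6, 32, dtype=torch.long),
    "image_vector": torch.randn(6, 2048),
    "label": torch.tensor([0, 1, 1, 0, 1, 0]),
}

dataset = CrisisTensorDataset(dummy)
batch = next(iter(DataLoader(dataset, batch_size=4)))
```

The default collate function stacks each field, so `batch` is a dict of tensors whose leading dimension is the batch size.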

✅ Model Architecture

A custom multimodal neural network combining both BERT and ResNet features:

| Component | Details |
| --- | --- |
| Text Encoder | BERT base model (bert-base-uncased), outputs pooler_output (768-d) |
| Image Encoder | ResNet-50 pre-extracted features (2048-d) |
| Fusion | Concatenation → FC layers → Softmax |
| Classifier | Fully connected layers with BatchNorm, ReLU, Dropout |
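The fusion and classifier stages can be sketched as a small PyTorch module. This is an illustration, not the released model: the hidden size (512), dropout rate (0.3), and single hidden layer are assumptions, and the module takes a precomputed 768-d text vector in place of running BERT so that the example stays self-contained.

```python
import torch
import torch.nn as nn


class FusionClassifier(nn.Module):
    """Concatenates text and image features and classifies them (sketch)."""

    def __init__(self, text_dim=768, image_dim=2048, hidden_dim=512,
                 num_classes=2, dropout=0.3):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, text_features, image_features):
        # Fusion by concatenation along the feature dimension: 768 + 2048 = 2816.
        fused = torch.cat([text_features, image_features], dim=1)
        return self.classifier(fused)  # raw logits


model = FusionClassifier().eval()  # eval mode: BatchNorm uses running statistics
with torch.no_grad():
    logits = model(torch.randn(4, 768), torch.randn(4, 2048))
    probs = torch.softmax(logits, dim=1)  # softmax applied at inference time
```

The module returns logits rather than probabilities because `nn.CrossEntropyLoss` applies log-softmax internally; the softmax is applied only when probabilities are needed.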

✅ Training Setup

Loss Function: CrossEntropyLoss

Optimizer: AdamW

Scheduler: StepLR (γ = 0.9)

Epochs: 8

Batch Size: 16

Device: CUDA (if available)
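The settings above can be wired together roughly as follows. The learning rate (2e-5) and the scheduler `step_size` (1 epoch) are not stated in this card and are assumptions, and a single linear layer over random features stands in for the real model and data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(768 + 2048, 2).to(device)  # stand-in for the fusion model

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # lr is assumed
# gamma=0.9 is from the card; step_size=1 (decay every epoch) is assumed.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)

# Random stand-in data: 32 fused feature vectors with binary labels.
features = torch.randn(32, 768 + 2048)
labels = torch.randint(0, 2, (32,))

batch_size = 16
lrs = []
for epoch in range(8):
    model.train()
    for start in range(0, features.shape[0], batch_size):
        x = features[start:start + batch_size].to(device)
        y = labels[start:start + batch_size].to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    lrs.append(scheduler.get_last_lr()[0])  # learning rate used this epoch
    scheduler.step()  # multiply the learning rate by gamma
```

Under these assumptions the learning rate in the eighth epoch is the initial rate multiplied by 0.9 seven times, a little under half of where it started.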

✅ Evaluation Metrics

Accuracy

Precision

Recall

F1 Score

✅ Test Accuracy: 0.8518

✅ Precision: 0.8289

✅ Recall: 0.8032

✅ F1 Score: 0.8142
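For reference, the four metrics can be computed from predictions as sketched below. The card does not say which averaging mode was used; this sketch assumes binary metrics with Informative (1) as the positive class, and the labels are made up. The reported F1 (0.8142) differs slightly from the harmonic mean of the reported precision and recall (about 0.816), so the published figures may instead be macro-averaged.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration: 1 = Informative, 0 = Non-informative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```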

Newly created dataset: https://huggingface.co/datasets/Henishma/crisisMMD_cleaned_task1
