---
datasets:
- QCRI/CrisisMMD
language:
- en
metrics:
- accuracy
- f1
- recall
- precision
base_model:
- google-bert/bert-base-uncased
- microsoft/resnet-50
---
Source: CrisisMMD dataset (Alam et al., 2018)

✅ Original Labels (8 classes from annotations):

- Infrastructure and utility damage
- Vehicle damage
- Rescue, volunteering, or donation efforts
- Affected individuals
- Injured or dead people
- Missing or found people
- Other relevant information
- Not humanitarian

✅ Label Preprocessing (Class Merging):

- Vehicle damage merged into Infrastructure and utility damage
- Missing or found people merged into Affected individuals
- Not humanitarian retained as a separate class
- Very low-frequency categories removed from the label set (e.g., Missing or found people no longer appears as a separate class); the full mapping is sketched after the final label set below

✅ Final Label Set (5 classes total):

- Infrastructure and utility damage
- Rescue, volunteering, or donation efforts
- Affected individuals
- Injured or dead people
- Not humanitarian
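
A minimal sketch of this label mapping. The snake_case label strings follow the naming convention of the CrisisMMD annotation files, but the exact strings are an assumption:

```python
# Map the original CrisisMMD humanitarian labels to the merged 5-class set.
LABEL_MAP = {
    "infrastructure_and_utility_damage": "infrastructure_and_utility_damage",
    "vehicle_damage": "infrastructure_and_utility_damage",        # merged
    "rescue_volunteering_or_donation_effort": "rescue_volunteering_or_donation_effort",
    "affected_individuals": "affected_individuals",
    "missing_or_found_people": "affected_individuals",            # merged
    "injured_or_dead_people": "injured_or_dead_people",
    "not_humanitarian": "not_humanitarian",
    # "other_relevant_information" has no entry: it is absent from the final
    # 5-class set, so those posts are assumed to be dropped.
}

# Final 5 classes, indexed for use as CrossEntropyLoss targets.
CLASSES = [
    "infrastructure_and_utility_damage",
    "rescue_volunteering_or_donation_effort",
    "affected_individuals",
    "injured_or_dead_people",
    "not_humanitarian",
]
CLASS_TO_ID = {name: i for i, name in enumerate(CLASSES)}
```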

✅ Multimodal Consistency:

- Selected only those posts where the text and image annotations matched (see the filtering sketch below)
- This resulted in a total of 8,219 consistent samples:
  - Train set: 6,574 posts
  - Test set: 1,644 posts
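
A sketch of the consistency filter, assuming the annotation TSV exposes separate text and image labels per post (the column names are illustrative) and reusing `LABEL_MAP` from the sketch above:

```python
import pandas as pd

def filter_consistent(annotations_tsv: str) -> pd.DataFrame:
    """Keep only posts whose text and image labels agree after class merging."""
    df = pd.read_csv(annotations_tsv, sep="\t")
    # "label_text" / "label_image" are assumed column names; adjust to the actual schema.
    df["text_label"] = df["label_text"].map(LABEL_MAP)
    df["image_label"] = df["label_image"].map(LABEL_MAP)
    # Drop posts whose label was removed (maps to NaN) or whose modalities disagree.
    df = df.dropna(subset=["text_label", "image_label"])
    return df[df["text_label"] == df["image_label"]].reset_index(drop=True)
```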

✅ Preprocessing Done

Text:

- Tokenized using the BERT tokenizer (`bert-base-uncased`)
- Extracted `input_ids` and `attention_mask`
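
A minimal sketch of the text side; the maximum sequence length is not stated above, so 128 is an assumed value:

```python
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

def encode_texts(texts, max_length=128):
    """Tokenize a list of tweet texts into padded input_ids and attention_mask."""
    enc = tokenizer(
        list(texts),
        padding="max_length",
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    return enc["input_ids"], enc["attention_mask"]
```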

Image:

- Processed using ResNet-50
- Extracted 2048-dimensional image features

The preprocessed data was saved in PyTorch `.pt` format:

- `train_human.pt` and `test_human.pt`
- Each file contains: `input_ids`, `attention_mask`, `image_vector`, and `label`
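
A sketch of the image-feature extraction and the saved `.pt` layout. It assumes a torchvision ResNet-50 with its classification head replaced by an identity (yielding the 2048-d pooled feature) and standard ImageNet preprocessing, and it reuses `encode_texts` from the tokenizer sketch above; only the feature dimension and file names come from the card.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

# ResNet-50 backbone with the classification head removed -> 2048-d pooled features.
resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
resnet.fc = nn.Identity()
resnet.eval()

# Standard ImageNet preprocessing (assumed; the exact transforms are not stated above).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def image_vector(path: str) -> torch.Tensor:
    """Return the 2048-d ResNet-50 feature vector for one image file."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    return resnet(x).squeeze(0)

def save_split(texts, image_paths, label_ids, out_path):
    """Pack one split into the .pt layout described above (e.g. train_human.pt)."""
    input_ids, attention_mask = encode_texts(texts)   # from the tokenizer sketch
    image_vectors = torch.stack([image_vector(p) for p in image_paths])
    torch.save(
        {
            "input_ids": input_ids,
            "attention_mask": attention_mask,
            "image_vector": image_vectors,
            "label": torch.tensor(label_ids, dtype=torch.long),
        },
        out_path,
    )
```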

✅ Model Architecture

A custom multimodal classifier that combines BERT and ResNet-50 outputs:

| Component     | Details                                                           |
|---------------|-------------------------------------------------------------------|
| Text Encoder  | BERT base (`bert-base-uncased`) – outputs `pooler_output` (768-d) |
| Image Encoder | Pre-extracted ResNet-50 image features (2048-d)                   |
| Fusion        | Concatenation → FC layers → Softmax over 5 classes                |
| Classifier    | Fully connected layers with BatchNorm, ReLU, Dropout              |
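
A minimal sketch of such a fusion model; the hidden width and dropout rate are illustrative values, since the card only specifies the input dimensions, the layer types, and the 5-way output:

```python
import torch
import torch.nn as nn
from transformers import BertModel

class MultimodalClassifier(nn.Module):
    """BERT pooler_output (768-d) + precomputed ResNet-50 features (2048-d)
    -> concatenation -> fully connected stack -> 5 humanitarian classes."""

    def __init__(self, num_classes=5, hidden_dim=512, dropout=0.3):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.classifier = nn.Sequential(
            nn.Linear(768 + 2048, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, num_classes),  # logits; softmax is applied inside CrossEntropyLoss
        )

    def forward(self, input_ids, attention_mask, image_vector):
        text_feat = self.bert(input_ids=input_ids,
                              attention_mask=attention_mask).pooler_output  # (B, 768)
        fused = torch.cat([text_feat, image_vector], dim=1)                 # (B, 2816)
        return self.classifier(fused)
```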

✅ Training Setup

- Loss Function: CrossEntropyLoss
- Optimizer: AdamW
- Scheduler: StepLR (γ = 0.9)
- Epochs Tried: 1, 3, 5, 8, 10
- Batch Size: 16
- Runtime: ~2 minutes 20 seconds per epoch on Google Colab (T4 GPU)
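
A training-loop sketch using the stated loss, optimizer, scheduler, and batch size; the learning rate and the StepLR `step_size` are not given in the card, so those values are placeholders:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

device = "cuda" if torch.cuda.is_available() else "cpu"

data = torch.load("train_human.pt")
train_loader = DataLoader(
    TensorDataset(data["input_ids"], data["attention_mask"],
                  data["image_vector"], data["label"]),
    batch_size=16,      # batch size from the card
    shuffle=True,
)

model = MultimodalClassifier().to(device)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)                      # lr is a placeholder
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)  # step_size is a placeholder

for epoch in range(3):  # 3 epochs gave the best test accuracy among the values tried
    model.train()
    for input_ids, attention_mask, image_vector, labels in train_loader:
        optimizer.zero_grad()
        logits = model(input_ids.to(device), attention_mask.to(device),
                       image_vector.to(device))
        loss = criterion(logits, labels.to(device))
        loss.backward()
        optimizer.step()
    scheduler.step()
```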

✅ Evaluation Metrics

- Accuracy
- Precision
- Recall
- F1 Score
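
These metrics can be computed with scikit-learn as sketched below; the averaging mode for precision, recall, and F1 is not stated in the card, so macro averaging is assumed:

```python
import torch
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

@torch.no_grad()
def evaluate(model, loader, device="cpu"):
    """Return accuracy, precision, recall, and F1 on a test DataLoader."""
    model.eval()
    preds, golds = [], []
    for input_ids, attention_mask, image_vector, labels in loader:
        logits = model(input_ids.to(device), attention_mask.to(device),
                       image_vector.to(device))
        preds.extend(logits.argmax(dim=1).cpu().tolist())
        golds.extend(labels.tolist())
    accuracy = accuracy_score(golds, preds)
    precision, recall, f1, _ = precision_recall_fscore_support(
        golds, preds, average="macro", zero_division=0)  # averaging mode is an assumption
    return accuracy, precision, recall, f1
```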

✅ Metrics (epoch 3, which gave the highest test accuracy)

| Metric        | Value  |
|---------------|--------|
| Test Accuracy | 0.8820 |
| Precision     | 0.6854 |
| Recall        | 0.7176 |
| F1 Score      | 0.7005 |

The cleaned dataset is available at: https://huggingface.co/datasets/Henishma/crisisMMD_cleaned_task2