
Source: CrisisMMD (Alam et al., 2018)

Data Type: Multimodal. Each sample includes:

tweet_text (social media text)

tweet_image (corresponding image from the tweet)

Total Samples Used: ~18,802 (from the dataset)

Class Labels:

0 → Non-informative

1 → Informative

Only samples whose tweet_text and tweet_image labels agree were kept. This yielded 12,743 tweets, which were converted into train and test .pt files.
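
A minimal sketch of the filtering step, assuming it means that the text label and the image label of a tweet must match. The column names label_text and label_image, the label strings, and the sample rows are hypothetical and only illustrate the idea.

```python
# Hypothetical rows; the column names and label strings are assumptions.
rows = [
    {"tweet_id": "1", "label_text": "informative", "label_image": "informative"},
    {"tweet_id": "2", "label_text": "informative", "label_image": "not_informative"},
    {"tweet_id": "3", "label_text": "not_informative", "label_image": "not_informative"},
]

# Integer encoding used by the card: 0 = Non-informative, 1 = Informative.
LABEL_TO_ID = {"not_informative": 0, "informative": 1}


def keep_agreeing(rows):
    """Keep rows whose text and image labels match and attach the integer label."""
    kept = []
    for row in rows:
        if row["label_text"] == row["label_image"]:
            kept.append({**row, "label": LABEL_TO_ID[row["label_text"]]})
    return kept


filtered = keep_agreeing(rows)
print([(r["tweet_id"], r["label"]) for r in filtered])
```

Tweet 2 is dropped because its two modalities disagree, so each remaining sample carries one unambiguous label.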

✅ Preprocessing Done

Text:

Tokenized using BERT tokenizer (bert-base-uncased)

Extracted input_ids and attention_mask

Image:

Processed using ResNet-50

Extracted 2048-dimensional feature vectors

Label:

Encoded to 0 or 1 as per class

The final preprocessed dataset was saved as .pt files:

train_info.pt

test_info.pt

Each contains: input_ids, attention_mask, image_vector, and label tensors.
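
As an illustration of that layout, the sketch below saves and reloads a dictionary of tensors with torch.save and torch.load. The dictionary structure, the sequence length of 128, and the dummy values are assumptions; the card only names the four fields.

```python
import os
import tempfile

import torch

# Hypothetical layout: one dict of equally long tensors per split.
# seq_len=128 is an assumption; 2048 matches the ResNet-50 feature size.
n, seq_len = 4, 128
split = {
    "input_ids": torch.zeros(n, seq_len, dtype=torch.long),
    "attention_mask": torch.ones(n, seq_len, dtype=torch.long),
    "image_vector": torch.randn(n, 2048),
    "label": torch.tensor([0, 1, 1, 0]),
}

path = os.path.join(tempfile.mkdtemp(), "train_info.pt")
torch.save(split, path)

loaded = torch.load(path)
for name, tensor in loaded.items():
    print(name, tuple(tensor.shape), tensor.dtype)
```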

✅ Model Architecture

A custom multimodal neural network combining both BERT and ResNet features:

Text Encoder: BERT base model (bert-base-uncased), outputs pooler_output (768-d)

Image Encoder: ResNet-50 pre-extracted features (2048-d)

Fusion: Concatenation → FC layers → Softmax

Classifier: Fully connected layers with BatchNorm, ReLU, Dropout
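
A minimal sketch of the fusion head described above, assuming a single hidden layer. The class name FusionClassifier, the hidden size of 512, and the dropout rate of 0.3 are illustrative choices, not values from this card. Random tensors stand in for the BERT pooler_output and the ResNet-50 features so the block runs without downloading any weights.

```python
import torch
from torch import nn


class FusionClassifier(nn.Module):
    """Concatenate text and image features, then classify with FC layers."""

    def __init__(self, text_dim=768, image_dim=2048, hidden_dim=512,
                 num_classes=2, dropout=0.3):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, text_features, image_features):
        fused = torch.cat([text_features, image_features], dim=1)
        return self.classifier(fused)  # raw logits, one row per sample


model = FusionClassifier()
model.eval()
text_features = torch.randn(4, 768)    # stand-in for BERT pooler_output
image_features = torch.randn(4, 2048)  # stand-in for ResNet-50 features
logits = model(text_features, image_features)
probs = torch.softmax(logits, dim=1)
print(logits.shape, probs.sum(dim=1))
```

The head returns logits because CrossEntropyLoss applies log-softmax internally; in this sketch the softmax is applied only when probabilities are needed at inference time.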

✅ Training Setup

Loss Function: CrossEntropyLoss

Optimizer: AdamW

Scheduler: StepLR (γ = 0.9)

Epochs: 8

Batch Size: 16

Device: CUDA (if available)
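
The setup above could be wired together roughly as follows. The learning rate (2e-5) and the scheduler step_size (1, meaning decay after every epoch) are assumptions, since the card does not state them, and a single linear layer on random features stands in for the real model and data.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in data and model; the real model consumes BERT and ResNet features.
features = torch.randn(64, 2816)  # 768 + 2048 concatenated features
labels = torch.randint(0, 2, (64,))
loader = DataLoader(TensorDataset(features, labels), batch_size=16, shuffle=True)
model = nn.Linear(2816, 2).to(device)

criterion = nn.CrossEntropyLoss()
# lr=2e-5 and step_size=1 are assumptions, not values from the model card.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)

for epoch in range(8):
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # multiply the learning rate by gamma once per epoch

final_lr = scheduler.get_last_lr()[0]
print(final_lr)
```

Under these assumptions the learning rate after 8 epochs is the initial rate multiplied by 0.9 eight times, which is a bit under half of its starting value.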

✅ Evaluation Metrics

Accuracy

Precision

Recall

F1 Score

✅ Test Accuracy: 0.8518

✅ Precision: 0.8289

✅ Recall: 0.8032

✅ F1 Score: 0.8142
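
The card does not say how precision, recall, and F1 were averaged. The reported F1 (0.8142) is slightly below the harmonic mean of the reported precision and recall (about 0.816), which would be consistent with macro averaging over the two classes, but that is an inference. The sketch below computes macro-averaged metrics in plain Python on made-up labels; the function names and data are illustrative only.

```python
def per_class_counts(y_true, y_pred, cls):
    """Return true positives, false positives, false negatives for one class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    return tp, fp, fn


def macro_metrics(y_true, y_pred, classes=(0, 1)):
    """Accuracy plus unweighted per-class averages of precision, recall, F1."""
    precisions, recalls, f1s = [], [], []
    for cls in classes:
        tp, fp, fn = per_class_counts(y_true, y_pred, cls)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return {
        "accuracy": correct / len(y_true),
        "precision": sum(precisions) / len(classes),
        "recall": sum(recalls) / len(classes),
        "f1": sum(f1s) / len(classes),
    }


# Made-up labels: 0 = Non-informative, 1 = Informative.
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 0, 1]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```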

Newly created dataset: https://huggingface.co/datasets/Henishma/crisisMMD_cleaned_task1
