How to use aap9002/NLI-BILSTM with Keras:
    # Available backend options are: "jax", "torch", "tensorflow".
    import os
    os.environ["KERAS_BACKEND"] = "jax"

    import keras

    model = keras.saving.load_model("hf://aap9002/NLI-BILSTM")
language: en
license: cc-by-4.0
tags:
- text-classification
repo: https://github.com/AAP9002/COMP34812-NLU-NLI
Model Card for z72819ap-e91802zc-NLI
This is a binary classification model trained to predict whether or not a premise entails a hypothesis.
Model Details
Model Description
This model is based on the Enhanced LSTM for Natural Language Inference architecture, using BILSTM layers instead of LSTM layers. It was trained on over 24K premise-hypothesis pairs from the shared task dataset for Natural Language Inference (NLI).
- Developed by: Alan Prophett and Zac Curtis
- Language(s): English
- Model type: Supervised
- Model architecture: BILSTM
- Finetuned from model [optional]: None
Model Resources
- Repository: https://github.com/AAP9002/COMP34812-NLU-NLI
- Paper or documentation: None
Training Details
Training Data
24K+ premise-hypothesis pairs from the shared task dataset provided for Natural Language Inference (NLI).
Training Procedure
Training Hyperparameters
- seed: 42
- learning_rate: 1e-04
- train_batch_size: 64
- eval_batch_size: 64
- num_epochs: 20
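For orientation, the hyperparameters above can be gathered into a small config object. This is a minimal sketch, not the authors' training script: the `TrainConfig` class, the `steps_per_epoch` helper, and the round figure of 24,000 training pairs (the card only says "24K+") are assumptions made for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as listed in the model card (class name is hypothetical)."""
    seed: int = 42
    learning_rate: float = 1e-4
    train_batch_size: int = 64
    eval_batch_size: int = 64
    num_epochs: int = 20


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Batches needed to see every example once (the last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


config = TrainConfig()

# Seeds only Python's built-in RNG; a real run would also seed the ML framework.
random.seed(config.seed)

# Assumption: 24,000 is a stand-in for the "24K+" training pairs.
assumed_train_pairs = 24_000
batches = steps_per_epoch(assumed_train_pairs, config.train_batch_size)
total_updates = batches * config.num_epochs
print(batches, total_updates)
```

Under these assumed numbers, one epoch is 375 batches, so a full 20-epoch run would perform 7,500 parameter updates.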
Speeds, Sizes, Times
- overall training time: 3 minutes 4 seconds
- duration per training epoch: 34 seconds
- model size: 30.7 MB
Evaluation
Testing Data & Metrics
Testing Data
A subset of the development set provided, amounting to 6K+ pairs.
Metrics
- Recall
- F1-score
- Accuracy
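To make the three metrics concrete, the sketch below computes them from binary labels in plain Python. It is illustrative only: the `binary_metrics` helper and the toy labels are hypothetical, and the card does not state which averaging scheme was used for F1, so this version assumes label 1 is the positive class. Scikit-learn, listed under Software, provides equivalent functions (`recall_score`, `f1_score`, `accuracy_score`).

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Recall, F1-score and accuracy for binary labels, with 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels, not taken from the actual development set.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```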
Results
On the testing data, the BILSTM model obtained an F1-score of 70% and an accuracy of 70%.
Technical Specifications
Hardware
- RAM: at least 25 GB
- Storage: at least 38.1 GB
- GPU: NVIDIA A100 40 GB
Software
- TensorFlow 2.18.0+cu12.4
- Pandas 2.2.2
- NumPy 2.0.2
- Seaborn 0.13.2
- Matplotlib 3.10.0
- Scikit-learn 1.6.1
Bias, Risks, and Limitations
Any input (the concatenation of the two sequences) longer than 512 subwords is truncated by the model.
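The effect of this limit can be sketched as follows. The card does not describe the tokenizer or where truncation is applied, so everything here is an assumption: whitespace splitting stands in for subword tokenization, the `[SEP]` marker and the `concat_and_truncate` helper are hypothetical, and tokens are dropped from the end of the concatenated sequence.

```python
from typing import List, Sequence

MAX_LEN = 512  # limit stated in the model card


def concat_and_truncate(
    premise_tokens: Sequence[str],
    hypothesis_tokens: Sequence[str],
    max_len: int = MAX_LEN,
    sep: str = "[SEP]",  # hypothetical separator token
) -> List[str]:
    """Join premise and hypothesis, then keep at most max_len tokens from the start."""
    combined = list(premise_tokens) + [sep] + list(hypothesis_tokens)
    return combined[:max_len]


# Whitespace splitting is only a stand-in for a real subword tokenizer.
premise = ("the cat sat on the mat " * 100).split()  # 600 tokens
hypothesis = "a cat is sitting".split()              # 4 tokens

tokens = concat_and_truncate(premise, hypothesis)
print(len(tokens), "[SEP]" in tokens)
```

In this toy case the premise alone exceeds the limit, so the separator and the whole hypothesis are cut off. Under an end-truncation scheme like the one assumed here, very long premises would leave the model nothing to compare against.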
Additional Information
The hyperparameters were determined by experimentation with different values.