metadata
library_name: keras-hub
Model Overview
A Keras model implementing the RetinaNet meta-architecture for object detection. The constructor
requires num_classes, bounding_box_format, and a backbone. Optionally,
a custom label encoder and prediction decoder may be provided.
Links
- RetinaNet Quickstart Notebook
- RetinaNet API Documentation
- RetinaNet Model Card
- KerasHub Beginner Guide
- KerasHub Model Publishing Guide
Installation
Keras and KerasHub can be installed with:
pip install -U -q keras-hub
pip install -U -q keras
JAX, TensorFlow, and Torch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.
Presets
The following model checkpoints are provided by the Keras team. Full code examples for each are available below.
| Preset name | Parameters | Description |
|---|---|---|
| retinanet_resnet50_fpn_coco | 34.12M | RetinaNet model with ResNet50 backbone fine-tuned on COCO at 800x800 resolution. |
Arguments
- num_classes: the number of classes in your dataset excluding the background class. Classes should be represented by integers in the range [0, num_classes).
- bounding_box_format: the format of the bounding boxes in the input dataset. Refer to the keras.io docs for more details on supported bounding box formats.
- backbone: a keras.Model. If the default feature_pyramid is used, it must implement the pyramid_level_inputs property with keys "P3", "P4", and "P5" and layer names as values. A sensible backbone to use in many cases is keras_cv.models.ResNetBackbone.from_preset("resnet50_imagenet").
- anchor_generator: (Optional) a keras_cv.layers.AnchorGenerator. If provided, the anchor generator will be passed to both the label_encoder and the prediction_decoder. Only to be used when label_encoder and prediction_decoder are both None. Defaults to an anchor generator with the parameterization: strides=[2**i for i in range(3, 8)], scales=[2**x for x in [0, 1 / 3, 2 / 3]], sizes=[32.0, 64.0, 128.0, 256.0, 512.0], and aspect_ratios=[0.5, 1.0, 2.0].
- label_encoder: (Optional) a keras.Layer that accepts an image Tensor, a bounding box Tensor and a bounding box class Tensor in its call() method, and returns RetinaNet training targets. By default, a KerasCV standard RetinaNetLabelEncoder is created and used. Results of this object's call() method are passed to the loss object as the y_true argument of box_loss and classification_loss.
- prediction_decoder: (Optional) a keras.layers.Layer that is responsible for transforming RetinaNet predictions into usable bounding box Tensors. If not provided, the default prediction_decoder is a keras_cv.layers.MultiClassNonMaxSuppression layer, which uses Non-Max Suppression for box pruning.
- feature_pyramid: (Optional) a keras.layers.Layer that produces a list of 4D feature maps (batch dimension included) when called on the pyramid-level outputs of the backbone. If not provided, the reference implementation from the paper will be used.
- classification_head: (Optional) a keras.Layer that performs classification of the bounding boxes. If not provided, a simple ConvNet with 3 layers will be used.
- box_head: (Optional) a keras.Layer that performs regression of the bounding boxes. If not provided, a simple ConvNet with 3 layers will be used.
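The default anchor parameterization listed for anchor_generator can be made concrete with a small pure-Python sketch. It is illustrative only and is not the library's anchor generator: the helper name anchor_shapes is hypothetical, and it assumes that an aspect ratio means height / width and that each anchor keeps the area (size * scale) ** 2. Libraries differ on that convention.

```python
import math

# Default parameterization quoted in the Arguments list.
STRIDES = [2**i for i in range(3, 8)]  # 8, 16, 32, 64, 128
SCALES = [2**x for x in [0, 1 / 3, 2 / 3]]
SIZES = [32.0, 64.0, 128.0, 256.0, 512.0]
ASPECT_RATIOS = [0.5, 1.0, 2.0]


def anchor_shapes(size, scales=SCALES, aspect_ratios=ASPECT_RATIOS):
    """Return (width, height) pairs for the anchors at one pyramid level.

    Assumption (hypothetical convention): ratio = height / width, and the
    anchor area is (size * scale) ** 2 regardless of the ratio.
    """
    shapes = []
    for scale in scales:
        area = (size * scale) ** 2
        for ratio in aspect_ratios:
            width = math.sqrt(area / ratio)
            height = width * ratio
            shapes.append((width, height))
    return shapes


# One list of anchor shapes per pyramid level, paired with that level's stride.
anchors_per_level = {
    stride: anchor_shapes(size) for stride, size in zip(STRIDES, SIZES)
}
```

With three scales and three aspect ratios, every feature-map location carries nine anchors at each of the five levels.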
Example Usage
Pretrained RetinaNet model
import keras_hub
import numpy as np

object_detector = keras_hub.models.ImageObjectDetector.from_preset(
"retinanet_resnet50_fpn_coco"
)
# Random values stand in for a batch of two RGB images.
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)
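As noted for prediction_decoder in the Arguments section, overlapping detections are pruned with non-max suppression. The following is a minimal pure-Python sketch of greedy, single-class NMS, included only to illustrate the idea. It is not the KerasHub or KerasCV implementation; the function names are hypothetical, and boxes are assumed to be in (x_min, y_min, x_max, y_max) form.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first.

    A box is kept only if it does not overlap any already-kept box by
    more than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed and kept holds the indices of the first and third boxes.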
Fine-tune the pre-trained model
backbone = keras_hub.models.Backbone.from_preset(
"retinanet_resnet50_fpn_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
"retinanet_resnet50_fpn_coco"
)
# CLASSES is assumed to be the list of class names in your dataset.
model = keras_hub.models.RetinaNetObjectDetector(
backbone=backbone,
num_classes=len(CLASSES),
preprocessor=preprocessor
)
Custom training the model
image_converter = keras_hub.layers.RetinaNetImageConverter(
scale=1/255
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor(
image_converter=image_converter
)
# Load a pre-trained ResNet50 model.
# This will serve as the base for extracting image features.
image_encoder = keras_hub.models.Backbone.from_preset(
"resnet_50_imagenet"
)
# Build the RetinaNet Feature Pyramid Network (FPN) on top of the ResNet50
# backbone. The FPN creates multi-scale feature maps for better object detection
# at different sizes.
backbone = keras_hub.models.RetinaNetBackbone(
image_encoder=image_encoder,
min_level=3,
max_level=5,
use_p5=False
)
model = keras_hub.models.RetinaNetObjectDetector(
backbone=backbone,
num_classes=len(CLASSES),
preprocessor=preprocessor
)
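To see what min_level and max_level select, the sketch below lists the pyramid levels and the approximate feature-map size each one produces. It is a rough illustration, not library code: the helper name is hypothetical, and it assumes that level l has stride 2**l and that feature maps are ceil(image_size / stride) cells per side, which depends on the padding used by the image encoder.

```python
import math


def pyramid_feature_sizes(image_size, min_level=3, max_level=5):
    """Map each pyramid level name to (stride, feature-map side length).

    Assumptions: level l has stride 2**l, and each side of the feature
    map has ceil(image_size / stride) cells for a square input image.
    """
    sizes = {}
    for level in range(min_level, max_level + 1):
        stride = 2**level
        sizes[f"P{level}"] = (stride, math.ceil(image_size / stride))
    return sizes


# Levels selected by min_level=3, max_level=5 for an 800x800 input.
levels = pyramid_feature_sizes(800)
```

Under these assumptions, an 800x800 image yields 100x100, 50x50, and 25x25 feature maps for P3, P4, and P5.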
Example Usage with Hugging Face URI
Pretrained RetinaNet model
object_detector = keras_hub.models.ImageObjectDetector.from_preset(
"hf://keras/retinanet_resnet50_fpn_coco"
)
input_data = np.random.uniform(0, 1, size=(2, 224, 224, 3))
object_detector(input_data)
Fine-tune the pre-trained model
backbone = keras_hub.models.Backbone.from_preset(
"hf://keras/retinanet_resnet50_fpn_coco"
)
preprocessor = keras_hub.models.RetinaNetObjectDetectorPreprocessor.from_preset(
"hf://keras/retinanet_resnet50_fpn_coco"
)
model = keras_hub.models.RetinaNetObjectDetector(
backbone=backbone,
num_classes=len(CLASSES),
preprocessor=preprocessor
)