| --- |
| license: mit |
| library_name: pytorch |
| tags: |
| - faster-rcnn |
| - object-detection |
| - computer-vision |
| - pytorch |
| - kitti |
| - autonomous-driving |
| - from-scratch |
| pipeline_tag: object-detection |
| datasets: |
| - kitti |
| widget: |
| - src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bounding-boxes-sample.png |
|   example_title: "Sample Image" |
| model-index: |
| - name: faster-rcnn-kitti-vanilla |
|   results: |
|   - task: |
|       type: object-detection |
|     dataset: |
|       type: kitti |
|       name: KITTI Object Detection |
|     metrics: |
|     - type: mean_average_precision |
|       name: mAP |
|       value: "TBD" |
| --- |
| |
| # Faster R-CNN - KITTI Object Detection Vanilla |
|
|
| A Faster R-CNN model trained from scratch on the KITTI dataset for object detection in autonomous driving scenes. |
|
|
| ## Model Details |
|
|
| - **Model Type**: Faster R-CNN Object Detection |
| - **Dataset**: KITTI Object Detection |
| - **Training Method**: Trained from scratch |
| - **Framework**: PyTorch |
| - **Task**: Object Detection |
|
|
| ## Dataset Information |
|
|
| This model was trained on the **KITTI Object Detection** dataset, using the following object classes: |
|
|
| car, pedestrian, cyclist |
|
|
| ### Dataset-specific Details: |
|
|
| **KITTI Object Detection Dataset:** |
| - Real-world autonomous driving dataset |
| - Contains stereo imagery from vehicle-mounted cameras |
| - Focus on cars, pedestrians, and cyclists |
| - Challenging scenarios with varying lighting and weather conditions |
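
To make the class list above concrete, here is a minimal sketch of reading 2D boxes for these three classes from a KITTI label file. It assumes the standard KITTI object label layout (one object per line, the type in the first field and the 2D box as left, top, right, bottom in fields 5 to 8). The card does not say how the training pipeline parsed labels or whether other KITTI types were merged into these classes, so `TARGET_CLASSES`, `Box2D`, and `parse_kitti_labels` are hypothetical names, and the sample lines are made-up values in that layout.

```python
from dataclasses import dataclass

# Assumption: only these three KITTI label types are kept; every other
# type (for example Van or DontCare) is skipped in this sketch.
TARGET_CLASSES = {"Car": "car", "Pedestrian": "pedestrian", "Cyclist": "cyclist"}


@dataclass
class Box2D:
    name: str
    left: float
    top: float
    right: float
    bottom: float


def parse_kitti_labels(text):
    """Return the 2D boxes of the target classes found in KITTI label text."""
    boxes = []
    for line in text.splitlines():
        fields = line.split()
        # A usable line needs at least the type plus the 2D box fields.
        if len(fields) < 8 or fields[0] not in TARGET_CLASSES:
            continue
        # Fields 4..7 (0-based) hold left, top, right, bottom in pixels.
        left, top, right, bottom = map(float, fields[4:8])
        boxes.append(Box2D(TARGET_CLASSES[fields[0]], left, top, right, bottom))
    return boxes


# Illustrative label lines (invented values, not taken from the dataset).
sample = """Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59
DontCare -1 -1 -10 503.89 169.71 590.61 190.13 -1 -1 -1 -1000 -1000 -1000 -10
Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01"""

parsed = parse_kitti_labels(sample)
for box in parsed:
    print(box)
```

The remaining fields on each line (truncation, occlusion, 3D dimensions, and so on) are ignored here because a 2D detector only needs the class and the image-plane box.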
|
|
| ## Usage |
|
|
| This model can be used with PyTorch and common object detection frameworks: |
|
|
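| ```python |
| import torch |
| import torchvision.transforms as transforms |
| from PIL import Image |
| |
| # Load the model (example using torchvision). |
| # This assumes the checkpoint stores the whole pickled model, not a state_dict. |
| # weights_only=False unpickles arbitrary objects, so only load files you trust. |
| model = torch.load('path/to/model.pth', map_location='cpu', weights_only=False) |
| model.eval() |
| |
| # Prepare your image: ToTensor converts a PIL image to a float tensor in [0, 1] |
| transform = transforms.Compose([ |
|     transforms.ToTensor(), |
| ]) |
| |
| # Force 3-channel RGB so grayscale or RGBA files do not change the input shape |
| image = Image.open('path/to/image.jpg').convert('RGB') |
| image_tensor = transform(image).unsqueeze(0) |
| |
| # Run inference without tracking gradients |
| with torch.no_grad(): |
|     predictions = model(image_tensor) |
| |
| # Process results for the first (and only) image in the batch |
| boxes = predictions[0]['boxes'] |
| scores = predictions[0]['scores'] |
| labels = predictions[0]['labels'] |
| ``` |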
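
The raw outputs usually need a confidence filter and a label-to-name lookup before they are useful. The sketch below shows one way to do that on plain Python lists (call `.tolist()` on the `boxes`, `scores`, and `labels` tensors first). The index-to-name mapping in `CLASS_NAMES` and the 0.5 threshold are assumptions for illustration; the card does not document the label indices this model uses, and `filter_detections` is a hypothetical helper.

```python
# Assumption: index 0 is background and the classes follow in this order.
# Check the mapping against the training configuration before relying on it.
CLASS_NAMES = {1: "car", 2: "pedestrian", 3: "cyclist"}


def filter_detections(boxes, scores, labels, score_threshold=0.5):
    """Keep detections scoring at least score_threshold and attach class names."""
    kept = []
    for box, score, label in zip(boxes, scores, labels):
        if score >= score_threshold:
            kept.append({
                "box": [float(v) for v in box],
                "score": float(score),
                "name": CLASS_NAMES.get(int(label), f"unknown({int(label)})"),
            })
    return kept


# Stand-in values shaped like one image's predictions after .tolist().
boxes = [[10.0, 20.0, 110.0, 90.0], [200.0, 50.0, 230.0, 120.0], [5.0, 5.0, 15.0, 15.0]]
scores = [0.92, 0.61, 0.08]
labels = [1, 2, 3]

detections = filter_detections(boxes, scores, labels, score_threshold=0.5)
for det in detections:
    print(det["name"], det["score"], det["box"])
```

A higher threshold trades recall for precision, so the right value depends on the application.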
|
|
| ## Model Performance |
|
|
| This model was trained from scratch on the KITTI Object Detection dataset using the Faster R-CNN architecture. Quantitative results (mAP) have not been reported yet. |
|
|
| ## Architecture |
|
|
| **Faster R-CNN** (Region-based Convolutional Neural Network) is a two-stage object detection framework: |
|
|
| 1. **Region Proposal Network (RPN)**: Generates object proposals |
| 2. **Fast R-CNN detector**: Classifies proposals and refines bounding box coordinates |
|
|
| Key advantages: |
| - High-accuracy object detection |
| - Precise localization |
| - Good performance on small objects |
| - Well-established architecture with extensive research backing |
|
|
| ## Intended Use |
|
|
| - **Primary Use**: Object detection in autonomous driving scenarios |
| - **Suitable for**: Research, development, and deployment of object detection systems |
| - **Limitations**: Performance may vary on images significantly different from the training distribution |
|
|
| ## Citation |
|
|
| If you use this model, please cite: |
|
|
| ```bibtex |
| @article{ren2015faster, |
| title={Faster {R-CNN}: Towards real-time object detection with region proposal networks}, |
| author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian}, |
| journal={Advances in neural information processing systems}, |
| volume={28}, |
| year={2015} |
| } |
| ``` |
|
|
| ## License |
|
|
| This model is released under the MIT License. |
|
|
| ## Keywords |
|
|
| Faster R-CNN, Object Detection, Computer Vision, KITTI, Autonomous Driving, Deep Learning, Two-Stage Detection |
|
|