Add features to the YOLO model from the latest YOLO variants #817
Merged
Borda
merged 103 commits into
Lightning-Universe:master
from
groke-technologies:yolo-update
May 29, 2023
9adb664
Added features from latest YOLO versions
senarvi 476adb9
Fixed ONNX export
senarvi 8d70ca1
meshgrid() call made future-proof by using the indexing argument
senarvi 35a98ba
torch.jit.script fails with a lambda function
senarvi a91793c
YOLOV4Tiny, YOLOV5, and YOLOX network architectures in plain PyTorch
senarvi b237099
Improvements to YOLO
senarvi 82f4de1
YOLO output layer name includes the number of outputs
senarvi 8a201a0
Complete type hints
senarvi 2737fa4
Updated CHANGELOG.
senarvi 26987eb
Torchvision import made conditional
senarvi 09bce80
Use expand() instead of broadcast_to() for backward compatibility
senarvi 471107f
Merge branch 'master' into yolo-update
senarvi a80536e
Use pytorch_lightning.utilities.distributed if pytorch_lightning.util…
senarvi b1b8db3
YOLOV4P6 network architecture
senarvi f499b76
Merge branch 'master' into yolo-update
senarvi 5661dba
Fixed document generation, when MeanAveragePrecision is not available
senarvi 0ad5867
Use arxiv URL to avoid a too long line
senarvi 84a949f
Use torch.div() instead of //
senarvi cf1646c
remove under_review decorators
redleaf-kim fecf88c
add yolo cfg with giou & update related test function
redleaf-kim b5abc8f
add serveral yolo config & layers function test
redleaf-kim 198ebc1
Merge branch 'Lightning-AI:master' into yolo_review
redleaf-kim 08f17f7
remove unused import & variable
redleaf-kim db2601a
add type hints
redleaf-kim fe38bb7
remove and merge duplicated test
redleaf-kim 7da9d4a
improve readability
redleaf-kim 8c3ed4e
Merge remote-tracking branch 'origin/yolo_review' into yolo_review
redleaf-kim 8f69419
Merge branch 'master' into yolo_review
otaj 63c7eef
Use distance_box_iou(), complete_box_iou() and the corresponding loss…
senarvi f6af0a4
Merge branch 'master' into yolo-update
senarvi b25a864
Merge branch 'master' into yolo_review
otaj 9ff86ab
Merge branch 'master' into yolo_review
otaj 17fab64
add catch_warning fixture
353f119
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] a3445ac
fix pytest error; indexing argument will be required to pass in upcom…
redleaf-kim 189346c
fix pytest catch_warnings; MisconfigurationException error
redleaf-kim a1d97b6
fix pytest error
redleaf-kim d5b5fb9
Merge remote-tracking branch 'origin/yolo_review' into yolo_review
redleaf-kim 0b4eca4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] b52ab5b
Merge branch 'master' into yolo_review
Borda eb9930e
Fix most obvious CI failings
fdf38fb
fix test with a missing warning
28813ac
Refactoring
senarvi 9db3f9d
Merge branch 'master' into yolo-update
senarvi a42cdec
resolve accidentally introduced errors
d0c68ed
Merge branch 'master' into yolo-update
senarvi 1d324e5
infer() returns the model to the previous mode
senarvi 374a3ec
CLI YOLO application uses the YOLOv4 architecture, if a Darknet confi…
senarvi 737ec64
Minor documentation improvements
senarvi d534cfa
add catch_warnings
57c9baf
Merge branch 'master' into yolo_review
390578d
Merge branch 'yolo-update' into yolo_review
senarvi 7bd8d34
Fixed a typo
senarvi ddc4a46
Fixed unit tests and added catch_warnings to all tests.
senarvi 0b6a6d0
Merge branch 'yolo-update' of github.com:groke-technologies/pytorch-l…
senarvi 37a3f7c
Added a README and documentation for YOLO
senarvi de69cb2
YOLO tests use giou loss, which is available in Torchvision 0.12
senarvi ff2a521
Fixed type annotation
senarvi a040ffd
Removed unused import
senarvi 2bc8a31
Check typing for YOLO
senarvi cc21337
Fixed hyperlinks
senarvi 91ed315
Merge branch 'master' into yolo-update
senarvi 4083bd2
Fixed mypy errors
senarvi 8457fee
Removed iou and giou metrics and losses, as these are provided by Tor…
senarvi 06e9399
Merge branch 'master' into yolo-update
senarvi 207cfc1
Merge branch 'master' into yolo-update
senarvi cfc3a4a
Merge branch 'master' into yolo-update
senarvi 568f390
Merge branch 'master' into yolo-update
senarvi 8d4a3b4
Fixed by mdformat
senarvi d57055a
Avoid using a lambda function
senarvi 9e0a33f
Avoid local functions
senarvi fa8ad41
Avoid lambda functions
senarvi 54f1eb0
Avoid a lambda function
senarvi 997b6a8
Merge branch 'master' into yolo-update
senarvi b95caf2
Use sync_dist=True and don't fail if there are no step outputs
senarvi b50ee2a
Merge branch 'master' into yolo-update
senarvi e4cb505
Added documentation
senarvi 606acb1
Fixed an off-by-one bug when reading YOLOv4 backbone depths
senarvi d840f5a
Merge branch 'master' into yolo-update
senarvi b95f026
YOLOv7 network with deep supervision
senarvi 4bc4198
Merge branch 'master' into yolo-update
senarvi 3e3bad5
Fixed a too long line
senarvi d9d64ea
Avoid using "input" as a variable name
senarvi 49d9709
Fixed type annotations
senarvi 8e4afc9
SimOTA uses also size ratio for "center prior" filtering
senarvi a9c72e3
Fixed docstrings
senarvi cc57b60
Fixed LRScheduler import for PyTorch 2.0
senarvi e6f6cc1
Added support for label smoothing
senarvi 2e55236
Added unit tests for YOLOv7 and box_size_ratio()
senarvi c509285
Speeded up YOLO unit tests (NMS) considerably by using a higher confi…
senarvi 5fa9d56
Merge branch 'master' into yolo-update
senarvi 614f7e6
detection_boxes is now called detections in MeanAveragePrecision
senarvi 2242736
Use giou in YOLO tests to allow them to pass also with older versions…
senarvi 4807f3c
Use double underscores in links in the docstring to avoid duplicate n…
senarvi 64a3e47
Check that targets are given in training mode
senarvi 26c9a7c
Fixed docstring formatting
senarvi c891ca4
Merge branch 'master' into yolo-update
senarvi e119ee9
Merge branch 'master' into yolo-update
senarvi 98e4c2e
Add
senarvi bf6295b
Removed files that popped up back in the merge
senarvi 9042812
Code formatting fixed by ruff
senarvi a399bed
Work around a problem with mypy and if-else ternary operator
senarvi c976147
Fixed bcefunc assignment so that both ruff and mypy are happy
senarvi
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,35 @@ | ||
from pl_bolts.models.detection import components | ||
from pl_bolts.models.detection.faster_rcnn import FasterRCNN | ||
from pl_bolts.models.detection.retinanet import RetinaNet | ||
from pl_bolts.models.detection.yolo.yolo_config import YOLOConfiguration | ||
from pl_bolts.models.detection.yolo.darknet_network import DarknetNetwork | ||
from pl_bolts.models.detection.yolo.torch_networks import ( | ||
YOLOV4Backbone, | ||
YOLOV4Network, | ||
YOLOV4P6Network, | ||
YOLOV4TinyBackbone, | ||
YOLOV4TinyNetwork, | ||
YOLOV5Backbone, | ||
YOLOV5Network, | ||
YOLOV7Backbone, | ||
YOLOV7Network, | ||
YOLOXNetwork, | ||
) | ||
from pl_bolts.models.detection.yolo.yolo_module import YOLO | ||
|
||
__all__ = [ | ||
"components", | ||
"FasterRCNN", | ||
"YOLOConfiguration", | ||
"YOLO", | ||
"RetinaNet", | ||
"DarknetNetwork", | ||
"YOLOV4Backbone", | ||
"YOLOV4Network", | ||
"YOLOV4P6Network", | ||
"YOLOV4TinyBackbone", | ||
"YOLOV4TinyNetwork", | ||
"YOLOV5Backbone", | ||
"YOLOV5Network", | ||
"YOLOV7Backbone", | ||
"YOLOV7Network", | ||
"YOLOXNetwork", | ||
"YOLO", | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
# YOLO | ||
|
||
The YOLO model has evolved considerably since its original publication in 2016. The original source code was written in C, using a framework called [Darknet](https://github.com/pjreddie/darknet). The final revision by the original author was called YOLOv3 and is described in an [arXiv paper](https://arxiv.org/abs/1804.02767). Since then, various other authors have written implementations that improve different aspects of the model or the training procedure. The [YOLOv4 implementation](https://github.com/AlexeyAB/darknet) was still based on Darknet, whereas [YOLOv5](https://github.com/ultralytics/yolov5) was written using PyTorch. Most other implementations are based on these. | ||
|
||
This PyTorch Lightning implementation combines features from some of the notable YOLO implementations. The most important papers are: | ||
|
||
- *YOLOv3*: [Joseph Redmon and Ali Farhadi](https://arxiv.org/abs/1804.02767) | ||
- *YOLOv4*: [Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao](https://arxiv.org/abs/2004.10934) | ||
- *YOLOv7*: [Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao](https://arxiv.org/abs/2207.02696) | ||
- *Scaled-YOLOv4*: [Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao](https://arxiv.org/abs/2011.08036) | ||
- *YOLOX*: [Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, and Jian Sun](https://arxiv.org/abs/2107.08430) | ||
|
||
## Network Architecture | ||
|
||
Any network can be used with YOLO detection heads as long as it produces feature maps with the correct number of features. Typically the network consists of a CNN backbone combined with a [Feature Pyramid Network](https://arxiv.org/abs/1612.03144) or a [Path Aggregation Network](https://arxiv.org/abs/1803.01534). Backbone layers reduce the size of the feature map and the network may contain multiple detection heads that operate at different resolutions. | ||
|
||
The user can write the network architecture in PyTorch, or construct a computational graph based on a Darknet configuration file using the [`DarknetNetwork`](https://github.com/Lightning-AI/lightning-bolts/tree/master/pl_bolts/models/detection/yolo/darknet_network.py) class. The network object is passed to the YOLO constructor in the `network` argument. `DarknetNetwork` is also able to read weights from a Darknet model file. | ||
|
||
There are several network architectures included in the [`torch_networks`](https://github.com/Lightning-AI/lightning-bolts/tree/master/pl_bolts/models/detection/yolo/torch_networks.py) module (YOLOv4, YOLOv5, YOLOX). Larger and smaller variants of these models can be created by varying the `width` and `depth` arguments. | ||
|
||
## Anchors | ||
|
||
A detection head can try to detect objects at each of the anchor points that are spaced evenly across the image in a grid. The size of the grid is determined by the width and height of the feature map. There can be a number of anchors (typically three) per grid cell. The number of features predicted per grid cell has to be `(5 + num_classes) * anchors_per_cell`. | ||
|
||
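As a rough illustration of the arithmetic above, the following sketch computes the anchor grid size and the number of features that a detection head has to predict per grid cell. The function names and the example stride of 32 are assumptions made for this illustration. They are not part of the library.

```python
from typing import Tuple


def features_per_cell(num_classes: int, anchors_per_cell: int = 3) -> int:
    # Every anchor predicts 4 box coordinates, 1 confidence value,
    # and one probability per class.
    return (5 + num_classes) * anchors_per_cell


def grid_size(image_width: int, image_height: int, stride: int) -> Tuple[int, int]:
    # The feature map, and therefore the anchor grid, is the image size
    # divided by the total downsampling ratio (stride) of the network.
    if image_width % stride != 0 or image_height % stride != 0:
        raise ValueError("Image size has to be divisible by the stride.")
    return image_width // stride, image_height // stride


# 80 classes (as in COCO) and three anchors per cell.
print(features_per_cell(80))     # 255
print(grid_size(608, 608, 32))   # (19, 19)
```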
The width and the height of a bounding box are detected relative to a prior shape. Each detection head uses `anchors_per_cell` prior shapes, which are defined in the network configuration. That is, if the network uses three detection heads, and each head detects three bounding boxes per grid cell, nine prior shapes need to be defined. They are defined in the Darknet configuration file or provided to the network class constructor. The default values have been obtained by clustering bounding box shapes in the COCO dataset. Note that if you use a different image size, you probably want to scale the prior shapes too. | ||
|
||
The prior shapes are also used for matching the ground-truth targets to anchors during training. With the exception of the SimOTA matching algorithm, targets are matched only to anchors from the closest grid cell. The prior shapes are used to determine which anchors from that cell the target is matched to. The losses are computed between the target boxes and the predictions that correspond to their matched anchors. Different matching rules have been implemented: | ||
|
||
- *maxiou*: The original matching rule that matches a target to the prior shape that gives the highest IoU. | ||
- *iou*: Matches a target to an anchor, if the IoU between the target and the prior shape is above a threshold. Multiple anchors may be matched to the same target, and the loss will be computed from a number of pairs that is generally not the same as the number of ground-truth boxes. | ||
- *size*: Calculates the ratios of the target box width and height to the prior width and height. If both the width and the height are close enough to the prior shape, matches the target to the anchor. | ||
- *simota*: The SimOTA matching algorithm from YOLOX. Targets can be matched not only to anchors from the closest grid cell, but to any anchors that are inside the target bounding box and whose prior shape is close enough to the target shape. The matching algorithm is based on Optimal Transport and uses the training loss between the target and the predictions as the cost. That is, the matching itself is based on the predictions corresponding to the anchors, not on the prior shapes. | ||
|
||
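The three shape-based rules above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the library's implementation: the function names and thresholds are hypothetical, the IoU is computed between shapes as if the boxes shared the same center, and the *size* rule is assumed to use a symmetric ratio (the larger of `target / prior` and `prior / target`).

```python
from typing import List, Sequence, Tuple

Shape = Tuple[float, float]  # (width, height)


def shape_iou(a: Shape, b: Shape) -> float:
    # IoU of two boxes that share the same center, so only the shapes matter.
    intersection = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - intersection
    return intersection / union


def match_maxiou(target: Shape, priors: Sequence[Shape]) -> int:
    # "maxiou": the single prior shape that gives the highest IoU.
    ious = [shape_iou(target, prior) for prior in priors]
    return max(range(len(priors)), key=ious.__getitem__)


def match_iou(target: Shape, priors: Sequence[Shape], threshold: float) -> List[int]:
    # "iou": every prior shape whose IoU with the target is above a threshold.
    return [i for i, prior in enumerate(priors) if shape_iou(target, prior) > threshold]


def match_size(target: Shape, priors: Sequence[Shape], max_ratio: float) -> List[int]:
    # "size": both the width ratio and the height ratio have to be small enough.
    matched = []
    for i, (prior_w, prior_h) in enumerate(priors):
        w_ratio = max(target[0] / prior_w, prior_w / target[0])
        h_ratio = max(target[1] / prior_h, prior_h / target[1])
        if w_ratio < max_ratio and h_ratio < max_ratio:
            matched.append(i)
    return matched


priors = [(10.0, 13.0), (16.0, 30.0), (33.0, 23.0)]
target = (15.0, 28.0)
print(match_maxiou(target, priors))
print(match_iou(target, priors, threshold=0.4))
print(match_size(target, priors, max_ratio=2.0))
```

Note that *iou* and *size* may return several anchors (or none) for one target, which is why the number of matched pairs generally differs from the number of ground-truth boxes.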
## Input Data | ||
|
||
The model input is expected to be a list of images. Each image is a tensor with shape `[channels, height, width]`. The images from a single batch will be stacked into a single tensor, so the sizes have to match. Different batches can have different image sizes. The feature pyramid network introduces another constraint on the image size: the width and the height have to be divisible by the ratio in which the network downsamples the input. | ||
|
||
During training, the model expects both the image tensors and a list of targets. It's possible to train a model using one integer class label per box, but the YOLO model also supports multiple labels per box. For multi-label training, simply use a boolean matrix that indicates which classes are assigned to which boxes, in place of the class labels. Each target is a dictionary containing the following tensors: | ||
|
||
- *boxes*: `(x1, y1, x2, y2)` coordinates of the ground-truth boxes in a matrix with shape `[N, 4]`. | ||
- *labels*: Either integer class labels in a vector of size `N` or a class mask for each ground-truth box in a boolean matrix with shape `[N, classes]` | ||
|
||
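The two label formats can be illustrated with a small sketch. To keep it free of dependencies, it uses nested Python lists where the model would expect tensors, and `labels_to_class_mask()` is a hypothetical helper written for this example, not a library function.

```python
from typing import List


def labels_to_class_mask(labels: List[int], num_classes: int) -> List[List[bool]]:
    # One row per box, one column per class. True marks an assigned class.
    return [[c == label for c in range(num_classes)] for label in labels]


# Single-label target: N = 2 boxes with one integer class label each.
target = {
    "boxes": [[10.0, 20.0, 110.0, 220.0], [50.0, 60.0, 90.0, 120.0]],  # (x1, y1, x2, y2)
    "labels": [2, 0],
}

# Multi-label target: the same boxes with an [N, classes] boolean mask.
multi_label_target = {
    "boxes": target["boxes"],
    "labels": labels_to_class_mask(target["labels"], num_classes=3),
}
# The first box additionally belongs to class 1.
multi_label_target["labels"][0][1] = True
```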
## Training | ||
|
||
The command line application demonstrates how to train a YOLO model using PyTorch Lightning. The first step is to create a network, either from a Darknet configuration file, or using one of the included PyTorch networks. The network is passed to the YOLO model constructor. | ||
|
||
The data module needs to resize the data to a suitable size, in addition to any augmenting transforms. For example, the YOLOv4 network requires that the width and the height are multiples of 32. | ||
|
||
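One simple way to pick a suitable size is to round the desired width and height to a multiple of the downsampling ratio. The helper below is a hypothetical sketch that rounds up. Rounding down or to the nearest multiple would work equally well.

```python
def round_to_multiple(size: int, divisor: int = 32) -> int:
    # Round up to the next multiple of the divisor, e.g. 600 -> 608 for 32.
    return ((size + divisor - 1) // divisor) * divisor


width, height = round_to_multiple(600), round_to_multiple(400)
print(width, height)  # 608 416
```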
## Inference | ||
|
||
During inference, the model requires only the input images. The `forward()` method receives a mini-batch of images in a tensor with shape `[N, channels, height, width]`. | ||
|
||
Every detection head predicts a bounding box at every anchor. `forward()` returns the predictions from all detection heads in a tensor with shape `[N, anchors, classes + 5]`, where `anchors` is the total number of anchors in all detection heads. The predictions are `x1`, `y1`, `x2`, `y2`, confidence, and the probability for each class. The coordinates are scaled to the input image size. | ||
|
||
The `infer()` method filters and processes the predictions. A class-specific score is obtained by multiplying the class probability with the detection confidence. Only detections with a high enough score are kept. YOLO does not use `softmax` to normalize the class probabilities, but each probability is normalized individually using `sigmoid`. Consequently, one object can be assigned to multiple categories. If more than one class has a score that is above the confidence threshold, these will be split into multiple detections during postprocessing. Then the detections are filtered using non-maximum suppression. The processed output is returned in a dictionary containing the following tensors: | ||
|
||
- *boxes*: a matrix of predicted bounding box `(x1, y1, x2, y2)` coordinates in image space | ||
- *scores*: a vector of detection confidences | ||
- *labels*: a vector of predicted class labels |
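
The postprocessing steps described above can be sketched in plain Python for the predictions of a single image. This is a simplified illustration, not the code of `infer()`: the function names and default thresholds are hypothetical, lists stand in for tensors, and non-maximum suppression is assumed to be applied separately for each class.

```python
from typing import Dict, List, Sequence


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    # IoU of two boxes given as (x1, y1, x2, y2).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(
    predictions: Sequence[Sequence[float]],
    confidence_threshold: float = 0.2,
    nms_threshold: float = 0.45,
) -> Dict[str, List]:
    # Each row is (x1, y1, x2, y2, confidence, class probabilities...).
    candidates = []
    for row in predictions:
        box, confidence, class_probs = list(row[:4]), row[4], row[5:]
        for label, prob in enumerate(class_probs):
            score = confidence * prob  # class-specific score
            if score > confidence_threshold:
                # A box with several high-scoring classes yields several detections.
                candidates.append((score, label, box))

    # Greedy NMS: go through detections from the highest score down and drop
    # those that overlap too much with a kept detection of the same class.
    candidates.sort(key=lambda c: c[0], reverse=True)
    kept = []
    for score, label, box in candidates:
        if all(k[1] != label or box_iou(box, k[2]) <= nms_threshold for k in kept):
            kept.append((score, label, box))

    return {
        "boxes": [box for _, _, box in kept],
        "scores": [score for score, _, _ in kept],
        "labels": [label for _, label, _ in kept],
    }
```

With tensors, the same filtering would typically be vectorized and the suppression delegated to an existing NMS routine.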
cc: @lantiga