kmeans anchors #754

Closed

wang-xinyu
Contributor

@wang-xinyu wang-xinyu commented Jan 2, 2020

Hi,
I adapted lars76/kmeans-anchor-boxes to this project. It loads data in the YOLO label format and runs k-means clustering on the bounding-box dimensions to get good priors.

I think this would help when training on custom datasets, because the default anchors from the paper were clustered from VOC and COCO.
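As a minimal sketch of the loading step, assuming the usual YOLO label layout of one `class x_center y_center width height` line per object with the box values normalized to 0-1: the function names `parse_yolo_label_lines` and `load_label_dir`, and the one-directory-of-`.txt`-files layout, are hypothetical and are not taken from the PR's gen_anchors.py.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def parse_yolo_label_lines(lines: Iterable[str]) -> List[Tuple[float, float]]:
    """Return (width, height) pairs from YOLO-format label lines.

    Each line is expected to look like: class x_center y_center width height,
    with the four box values normalized to the 0-1 range.
    """
    boxes = []
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        _, _, _, w, h = parts
        boxes.append((float(w), float(h)))
    return boxes


def load_label_dir(label_dir: str) -> List[Tuple[float, float]]:
    """Collect (width, height) pairs from every *.txt file in a directory.

    The directory layout is an assumption made for this sketch.
    """
    boxes = []
    for txt in sorted(Path(label_dir).glob("*.txt")):
        boxes.extend(parse_yolo_label_lines(txt.read_text().splitlines()))
    return boxes


# Small in-memory example so the sketch runs without any files on disk.
sample = ["0 0.50 0.50 0.20 0.40", "", "1 0.25 0.30 0.10 0.10"]
wh = parse_yolo_label_lines(sample)
print(wh)
```

Only the width and height are kept, because anchor clustering ignores where a box sits in the image.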

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Introducing a new script for generating custom anchor boxes using k-means clustering.

📊 Key Changes

  • Added gen_anchors.py script:
    • Allows generation of anchor boxes tailored to specific datasets.
    • Provides functions to load datasets and calculate anchor boxes using k-means.
    • Outputs accuracy, sorted boxes by area, and aspect ratios of generated anchors.
  • Included kmeans.py within the utils folder:
    • Implements the k-means clustering algorithm, specifically designed to work with bounding box dimensions.
    • Provides functions for calculating IoU (Intersection over Union), translating boxes to the origin, and performing the actual clustering.
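The clustering described in the summary can be sketched roughly as follows. This is an illustrative version, not the PR's kmeans.py: the names `wh_iou` and `kmeans_wh`, the random initialization, and the use of the cluster mean as the update step are all assumptions made for this example. Boxes are treated as if their corners were translated to the origin, so IoU depends only on width and height.

```python
import numpy as np


def wh_iou(boxes: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """IoU between (N, 2) boxes and (K, 2) anchors, all aligned at the origin."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_a = anchors[:, 0] * anchors[:, 1]
    return inter / (area_b[:, None] + area_a[None, :] - inter)


def kmeans_wh(boxes, k: int, iters: int = 100, seed: int = 0) -> np.ndarray:
    """Cluster box (width, height) pairs, assigning each box to its highest-IoU anchor."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct boxes picked at random (an assumption of this sketch).
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)].copy()
    assign = np.full(len(boxes), -1)
    for _ in range(iters):
        new_assign = np.argmax(wh_iou(boxes, anchors), axis=1)
        if np.array_equal(new_assign, assign):
            break  # assignments stopped changing
        assign = new_assign
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # leave an empty cluster's anchor unchanged
                anchors[j] = members.mean(axis=0)
    # Return the anchors sorted by area, smallest first.
    return anchors[np.argsort(anchors.prod(axis=1))]


# Synthetic widths and heights in pixels, used only to exercise the sketch.
rng = np.random.default_rng(1)
centers = np.array([[10.0, 10.0], [50.0, 100.0], [200.0, 150.0]])
boxes = np.concatenate([c + rng.uniform(-2, 2, size=(50, 2)) for c in centers])
anchors = kmeans_wh(boxes, k=3)
print(np.round(anchors, 1))
```

Using highest IoU rather than smallest Euclidean distance for the assignment keeps large boxes from dominating the clustering.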

🎯 Purpose & Impact

  • ⚙️ Purpose: This PR aims to enhance the model's accuracy by allowing users to generate custom anchor boxes that better fit the distribution of their own dataset's object sizes.
  • 📈 Impact: Users can expect improved object detection performance, particularly if their dataset's object sizes deviate from the default COCO dataset anchors that YOLOv3 is originally tuned for. This customization could be particularly beneficial for specialized tasks with unique object dimensions.

@glenn-jocher
Member

glenn-jocher commented Jan 2, 2020

@wang-xinyu thanks for your interest in our work and for submitting this PR! We already have a kmeans function that directly generates the anchor string needed to copy and paste into cfg files, however we have not had success in applying these anchors to produce better mAPs on COCO. An example utilization of the function is here. You supply the range of your image sizes (i.e. for multiscale training), your anchor count, and your data file path.

yolov3/utils/utils.py

Lines 742 to 744 in 2328823

def kmean_anchors(path='data/coco64.txt', n=12, img_size=(320, 640)):
    # from utils.utils import *; _ = kmean_anchors(n=9)
    # Produces a list of target kmeans suitable for use in *.cfg files

@glenn-jocher
Member

glenn-jocher commented Jan 2, 2020

@wang-xinyu hmm interesting, I looked at the lars repo, and it seems their 'average IoU' metric must be the average of the best IoUs among all 9 anchors. In this repo we display that metric as the 'best' average IoU on the final line. If I run kmean_anchors() on coco/train2017.txt, for example, under a 320-640 img-size multiscale assumption, I get 0.57 best-IoU:

from utils.utils import *
_ = kmean_anchors(path='../coco/train2017.txt', n=9, img_size=(320, 640))
Caching labels (117266 found, 1021 missing, 0 empty, 0 duplicate, for 118287 images): 100%|██████████| 118287/118287 [00:52<00:00, 2246.04it/s]
Running kmeans on 849942 points...
0.10 iou_thr: 0.982 best possible recall, 4.3 anchors > thr
0.20 iou_thr: 0.953 best possible recall, 3.0 anchors > thr
0.30 iou_thr: 0.910 best possible recall, 2.1 anchors > thr
kmeans anchors (n=9, img_size=(320, 640), IoU=0.01/0.18/0.57-min/mean/best): 15,15,  28,47,  72,42,  57,106,  149,93,  111,201,  291,167,  205,340,  454,329

For a fixed img-size of 512 I get the same mean value of 0.57 best-IoU:

_ = kmean_anchors(path='../coco/train2017.txt', n=9, img_size=(512, 512))
Caching labels (117266 found, 1021 missing, 0 empty, 0 duplicate, for 118287 images): 100%|██████████| 118287/118287 [00:54<00:00, 2179.55it/s]
Running kmeans on 849942 points...
0.10 iou_thr: 0.984 best possible recall, 4.4 anchors > thr
0.20 iou_thr: 0.957 best possible recall, 3.1 anchors > thr
0.30 iou_thr: 0.915 best possible recall, 2.2 anchors > thr
kmeans anchors (n=9, img_size=(512, 512), IoU=0.01/0.18/0.57-min/mean/best): 16,17,  31,49,  78,46,  60,114,  152,99,  115,216,  286,165,  219,351,  447,301
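One plausible reading of the numbers printed above, sketched here as an assumption rather than as the repo's actual code: build the IoU matrix between every label and every anchor, then report its overall minimum and mean, the mean of each label's best IoU, and, for a threshold, the fraction of labels whose best IoU exceeds it along with the average number of anchors above it per label. The names `wh_iou` and `anchor_metrics` and the toy labels are hypothetical.

```python
import numpy as np


def wh_iou(boxes: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """IoU between (N, 2) boxes and (K, 2) anchors, all aligned at the origin."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_a = anchors[:, 0] * anchors[:, 1]
    return inter / (area_b[:, None] + area_a[None, :] - inter)


def anchor_metrics(labels_wh: np.ndarray, anchors: np.ndarray, thr: float) -> dict:
    """Summarize how well a set of anchors covers the label sizes (assumed definitions)."""
    iou = wh_iou(labels_wh, anchors)   # shape (num_labels, num_anchors)
    best = iou.max(axis=1)             # best-matching anchor for each label
    return {
        "min_iou": float(iou.min()),
        "mean_iou": float(iou.mean()),
        "best_iou": float(best.mean()),
        "best_possible_recall": float((best > thr).mean()),
        "anchors_above_thr": float((iou > thr).sum(axis=1).mean()),
    }


# Toy labels in pixels; two of the anchors from the fixed-size run above.
labels_wh = np.array([[16.0, 17.0], [100.0, 200.0]])
anchors = np.array([[16.0, 17.0], [115.0, 216.0]])
metrics = anchor_metrics(labels_wh, anchors, thr=0.3)
print(metrics)
```

Under this reading, the 'best' figure rewards anchor sets where every label has at least one close match, while the plain mean is dragged down by the many poorly matched label-anchor pairs.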

What values do you get when you run the lars anchors on the COCO data @wang-xinyu?

@wang-xinyu
Contributor Author

wang-xinyu commented Jan 3, 2020

@glenn-jocher Oh sorry, I didn't notice that this repo already has kmeans anchors.

I didn't try COCO data. I ran kmeans on my own data, for head detection, and trained with the new anchors, but I didn't get a better result either. I still think it is reasonable to use custom anchors clustered from the training data. Maybe custom anchors cannot make a big impact on mAP, because the features are trained to adapt to the final bbox.

I also think your best-IoU measurement is more reasonable than lars's Average-IoU. But the key is kmeans, not how the IoU is measured, so your implementation and lars's are similar, with no big difference. There is no need to merge my PR...

Thanks for your patience.

@wang-xinyu wang-xinyu closed this Jan 3, 2020
@xyl-507

xyl-507 commented Aug 14, 2020

@glenn-jocher
Sorry to bother you! I don't know how to use this repo's kmean_anchors function on my own dataset. Can you show the function's usage and where to add it? Thanks!

@glenn-jocher
Member

glenn-jocher commented Aug 14, 2020

@xyl-507 I would highly suggest you use our YOLOv5 repo with auto anchor. It automatically checks your anchors against your data and evolves new anchors for you if you need them. This is the default behavior in v5 for all trainings.

@xyl-507

xyl-507 commented Aug 15, 2020

@glenn-jocher
Thanks! I'm sure I'll give it a try! I am a newbie, so I'd like to learn v3 first.
BTW, I have a new question. The first image is my training result after 500 epochs, and the second is after 1000 epochs. Both were trained from scratch on my single-class dataset of 40,000 images with the same parameters: yolov3-tiny-1cls.cfg, size 256, batch-size 32, device 0, and otherwise default parameters. At first I thought 500 epochs was not enough to converge, so I tried 1000 epochs, but there was still no convergence. Finally I trained from pretrained weights, and the result is the third image. Could you please tell me what is wrong and how to improve my result? Thank you in advance!

[Images: three training "results" plots, in the order given in the comment]

@glenn-jocher
Member

@xyl-507 overfitting is a phenomenon that larger models are more prone to. YOLOv3-tiny is honestly a very poor model, with quite low capabilities, so it would be very hard to overfit 40,000 images with this model.

Finetuning (aka starting from pretrained weights) is a good method for training smaller datasets, and in particular for achieving results quickly. Larger datasets benefit from it to a lesser degree.

I would repeat your above experiments with YOLOv5s in place of YOLOv3-tiny (it trains to more than 2x the mAP on COCO, and is the same size). Additionally, YOLOv5 benefits from numerous bug fixes and feature additions absent from YOLOv3.

@xyl-507

xyl-507 commented Aug 15, 2020

@glenn-jocher your reply is very helpful to me. Excuse me for asking so many questions.
Following your answer, I just ran kmean_anchors on my own dataset and used the result when training from pretrained weights. But I got a very low mAP: 0.00126. Is it because the pretrained weights were not trained with the same anchors?

@glenn-jocher
Member

v5 buddy. There’s zero sense pondering issues that may already be resolved there.

@xyl-507

xyl-507 commented Aug 15, 2020

@glenn-jocher OK, I'll move to the v5 repo! Thanks!

@glenn-jocher
Member

@xyl-507 yes, definitely move to https://github.com/ultralytics/yolov5. I can't emphasize enough the enormous improvement we've made just in the last few months from this repo to our YOLOv5 repo. We have over 30 contributors there as well, providing PRs and helping align it with best practices. Our intention is to make YOLOv5 the simplest, most robust, and most accurate detection model in the world.

@glenn-jocher
Member

glenn-jocher commented Aug 15, 2020

Ultralytics has open-sourced YOLOv5 at https://github.com/ultralytics/yolov5, featuring faster, lighter and more accurate object detection. YOLOv5 is recommended for all new projects.



** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from [google/automl](https://github.com/google/automl) at batch size 8.
  • August 13, 2020: v3.0 release: nn.Hardswish() activations, data autodownload, native AMP.
  • July 23, 2020: v2.0 release: improved model definition, training and mAP.
  • June 22, 2020: PANet updates: new heads, reduced parameters, improved speed and mAP 364fcfd.
  • June 19, 2020: FP16 as new default for smaller checkpoints and faster inference d4c6674.
  • June 9, 2020: CSP updates: improved speed, size, and accuracy (credit to @WongKinYiu for CSP).
  • May 27, 2020: Public release. YOLOv5 models are SOTA among all known YOLO implementations.
  • April 1, 2020: Start development of future compound-scaled YOLOv3/YOLOv4-based PyTorch models.

Pretrained Checkpoints

| Model | AP val | AP test | AP50 | Speed GPU | FPS GPU | params | FLOPS |
|---|---|---|---|---|---|---|---|
| YOLOv5s | 37.0 | 37.0 | 56.2 | 2.4ms | 416 | 7.5M | 13.2B |
| YOLOv5m | 44.3 | 44.3 | 63.2 | 3.4ms | 294 | 21.8M | 39.4B |
| YOLOv5l | 47.7 | 47.7 | 66.5 | 4.4ms | 227 | 47.8M | 88.1B |
| YOLOv5x | 49.2 | 49.2 | 67.7 | 6.9ms | 145 | 89.0M | 166.4B |
| YOLOv5x + TTA | 50.8 | 50.8 | 68.9 | 25.5ms | 39 | 89.0M | 354.3B |
| YOLOv3-SPP | 45.6 | 45.5 | 65.2 | 4.5ms | 222 | 63.0M | 118.0B |

** APtest denotes COCO test-dev2017 server results, all other AP results in the table denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or test-time augmentation. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.001
** SpeedGPU measures end-to-end time per image averaged over 5000 COCO val2017 images using a GCP n1-standard-16 instance with one V100 GPU, and includes image preprocessing, PyTorch FP16 image inference at --batch-size 32 --img-size 640, postprocessing and NMS. Average NMS time included in this chart is 1-2ms/img. Reproduce by python test.py --data coco.yaml --img 640 --conf 0.1
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
** Test Time Augmentation (TTA) runs at 3 image sizes. Reproduce by python test.py --data coco.yaml --img 832 --augment

For more information and to get started with YOLOv5 please visit https://github.com/ultralytics/yolov5. Thank you!


3 participants