kmeans anchors #754
Conversation
@wang-xinyu thanks for your interest in our work and for submitting this PR! We already have a kmeans function that directly generates the anchor string needed to copy and paste into cfg files; however, we have not had success in applying these anchors to produce better mAPs on COCO. An example utilization of the function is in utils/utils.py (lines 742 to 744 at commit 2328823). You supply the range of your image sizes (i.e. for multiscale training), your anchor count, and your data file path.
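For reference, a minimal sketch of applying that function to a custom dataset, assuming the same signature as the COCO example shown below; `data/custom.txt` is a hypothetical label-list path:

```python
# Sketch of calling the repo's kmeans helper on a custom dataset, based on the
# usage shown below in this thread; the path here is hypothetical.
from utils.utils import *

_ = kmean_anchors(path='data/custom.txt',  # list of training images (YOLO labels)
                  n=9,                     # number of anchors to generate
                  img_size=(320, 640))     # (min, max) sizes for multiscale training
```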
@wang-xinyu hmm interesting, I looked at the lars repo, and it seems their 'average IoU' metric must be the average, over all labels, of each label's best IoU among the 9 anchors. In this repo we display that metric as the 'best' average IoU on the final line. If I run kmean_anchors() on coco/train2017.txt, for example, under a 320-640 img-size multiscale assumption I get 0.57 best-IoU:

```python
from utils.utils import *

_ = kmean_anchors(path='../coco/train2017.txt', n=9, img_size=(320, 640))
```

```
Caching labels (117266 found, 1021 missing, 0 empty, 0 duplicate, for 118287 images): 100%|██████████| 118287/118287 [00:52<00:00, 2246.04it/s]
Running kmeans on 849942 points...
0.10 iou_thr: 0.982 best possible recall, 4.3 anchors > thr
0.20 iou_thr: 0.953 best possible recall, 3.0 anchors > thr
0.30 iou_thr: 0.910 best possible recall, 2.1 anchors > thr
kmeans anchors (n=9, img_size=(320, 640), IoU=0.01/0.18/0.57-min/mean/best): 15,15, 28,47, 72,42, 57,106, 149,93, 111,201, 291,167, 205,340, 454,329
```

For a fixed img-size of 512 I get the same mean value of 0.57 best-IoU:

```python
_ = kmean_anchors(path='../coco/train2017.txt', n=9, img_size=(512, 512))
```

```
Caching labels (117266 found, 1021 missing, 0 empty, 0 duplicate, for 118287 images): 100%|██████████| 118287/118287 [00:54<00:00, 2179.55it/s]
Running kmeans on 849942 points...
0.10 iou_thr: 0.984 best possible recall, 4.4 anchors > thr
0.20 iou_thr: 0.957 best possible recall, 3.1 anchors > thr
0.30 iou_thr: 0.915 best possible recall, 2.2 anchors > thr
kmeans anchors (n=9, img_size=(512, 512), IoU=0.01/0.18/0.57-min/mean/best): 16,17, 31,49, 78,46, 60,114, 152,99, 115,216, 286,165, 219,351, 447,301
```

What values do you get when you run the lars anchors on the COCO data @wang-xinyu?
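For readers following along, here is a self-contained sketch of how summary metrics like the ones above can be computed on dummy data. This is my own illustration, not the repo's exact code, and it assumes the min/mean/best triple is the per-label minimum, mean, and maximum anchor IoU averaged over all labels:

```python
import numpy as np

def wh_iou(boxes, anchors):
    """IoU between (N,2) label w-h and (K,2) anchor w-h, centers aligned."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union                                     # shape (N, K)

anchors = np.array([[15, 15], [28, 47], [72, 42], [57, 106], [149, 93],
                    [111, 201], [291, 167], [205, 340], [454, 329]])
boxes = np.random.uniform(8, 480, size=(10000, 2))           # dummy label w-h
iou = wh_iou(boxes, anchors)
best = iou.max(1)                                            # best anchor per label
print(f'IoU={iou.min(1).mean():.2f}/{iou.mean():.2f}/{best.mean():.2f}'
      '-min/mean/best')                                      # repo-style summary
for thr in (0.10, 0.20, 0.30):
    print(f'{thr:.2f} iou_thr: {(best > thr).mean():.3f} best possible recall, '
          f'{(iou > thr).sum(1).mean():.1f} anchors > thr')
```

Under this reading, lars's 'Average IoU' and the repo's final 'best' value are the same quantity: the mean of `best` over all labels.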
@glenn-jocher Oh sorry, I didn't notice there was already a kmeans anchors function in this repo. I didn't try COCO data; I ran kmeans on my own data for head detection and used the new anchors to train, but didn't manage to get better results either. Still, I think it is reasonable to use custom anchors clustered from the training data. Maybe custom anchors cannot make a big impact on mAP, since the features are trained to adapt to the final bbox. And I think your best-IoU measurement is more reasonable than lars's Average-IoU. But the key is the kmeans itself, not how the IoU is measured, so your implementation and lars's are similar, no big difference. So there's no need to merge my PR... Thanks for your patience.
@glenn-jocher
@xyl-507 I would highly suggest you use our YOLOv5 repo with autoanchor. It automatically checks your anchors against your data and evolves new anchors for you if you need them. This is the default behavior in v5 for all training runs.
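For anyone who wants to run YOLOv5's anchor evolution by hand rather than relying on the automatic check, the repo exposed a kmean_anchors() utility in utils/autoanchor.py around the time of this thread. The sketch below assumes that API (check your version, as names and defaults may have changed), and `data/custom.yaml` is a hypothetical dataset file:

```python
# Run from inside a clone of https://github.com/ultralytics/yolov5
from utils.autoanchor import kmean_anchors

# n: anchor count, img_size: training image size, thr: anchor-multiple threshold,
# gen: generations of genetic evolution applied after the kmeans initialization
_ = kmean_anchors(dataset='data/custom.yaml', n=9, img_size=640, thr=4.0, gen=1000)
```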
@glenn-jocher
@xyl-507 overfitting is a phenomenon that larger models are more prone to. YOLOv3-tiny is honestly a very poor model with quite low capacity, so it would be very hard to overfit 40,000 images with it. Finetuning (i.e. starting from pretrained weights) is a good method for training on smaller datasets, and in particular for achieving results quickly; larger datasets benefit from it to a lesser degree. I would repeat your above experiments with YOLOv5s in place of YOLOv3-tiny (it trains to more than 2x the mAP on COCO and is the same size). Additionally, YOLOv5 benefits from numerous bug fixes and feature additions absent from YOLOv3.
@glenn-jocher your reply is very helpful to me. Excuse me for asking so many questions.
v5 buddy. There’s zero sense pondering issues that may already be resolved there.
@glenn-jocher OK, I'll move to the v5 repo! Thanks!
@xyl-507 yes, definitely move to https://github.com/ultralytics/yolov5. I can't emphasize enough the enormous improvement we've made just in the last few months from this repo to our YOLOv5 repo. We have over 30 contributors there as well providing PRs and helping align it with best practices. Our intention is to make YOLOv5 the simplest, most robust, and most accurate detection model in the world.
Ultralytics has open-sourced YOLOv5 at https://github.com/ultralytics/yolov5, featuring faster, lighter and more accurate object detection. YOLOv5 is recommended for all new projects.

** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from [google/automl](https://github.com/google/automl) at batch size 8.
Pretrained Checkpoints
** APtest denotes COCO test-dev2017 server results; all other AP results in the table denote val2017 accuracy. For more information and to get started with YOLOv5, please visit https://github.com/ultralytics/yolov5. Thank you!
Hi,
I adapted lars76/kmeans-anchor-boxes to this project: it loads data in the YOLO label format and runs k-means clustering on the bounding-box dimensions to get good priors (see the sketch below for the underlying idea).
I think this would be helpful for training on custom datasets, because the default anchors from the paper are clustered from VOC and COCO.
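For context, the underlying technique (introduced with YOLOv2 and implemented by lars76/kmeans-anchor-boxes) replaces the Euclidean distance in k-means with d(box, centroid) = 1 − IoU, so clusters group boxes by shape and scale rather than by raw coordinate distance. A minimal self-contained sketch on dummy data, not the PR's actual code:

```python
import numpy as np

def wh_iou(boxes, centroids):
    """Width-height IoU between (N,2) boxes and (K,2) centroids, centers aligned."""
    inter = np.minimum(boxes[:, None, 0], centroids[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (centroids[:, 0] * centroids[:, 1])[None, :] - inter
    return inter / union

def kmeans_iou(boxes, k, iters=300, seed=0):
    """k-means on box w-h using 1 - IoU as the distance (assignment = max IoU)."""
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(iters):
        assign = wh_iou(boxes, centroids).argmax(1)    # nearest centroid by IoU
        new = centroids.copy()
        for i in range(k):
            members = boxes[assign == i]
            if len(members):                           # keep old centroid if empty
                new[i] = np.median(members, axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids[np.argsort(centroids.prod(1))]    # sort anchors by area

boxes = np.random.uniform(10, 400, size=(5000, 2))     # dummy label w-h in pixels
print(kmeans_iou(boxes, k=9).round(0))
```

Using the median (as lars76 does) rather than the mean makes the centroids robust to outlier box sizes.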
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
Introducing a new script for generating custom anchor boxes using k-means clustering.
📊 Key Changes
- New gen_anchors.py script
- New kmeans.py within the utils folder

🎯 Purpose & Impact