This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Clojure example for fixed label-width captcha recognition #13769

Merged: 8 commits merged on Jan 10, 2019

Conversation

kedarbellare
Contributor

@kedarbellare kedarbellare commented Jan 4, 2019

Description

Adds an example for training a captcha OCR model and using the learned model for performing inference.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Captcha OCR training and inference with unit tests for training

Comments

I ran this on both OS X and Ubuntu Linux, but only with :cpu. When I tried to train on a :gpu, I ran into memory issues:

Exception in thread "main" org.apache.mxnet.MXNetError: [19:18:33] src/storage/./pooled_storage_manager.h:143: cudaMalloc failed: out of memory

Reviewers

@gigasquid

@gigasquid
Member

Cool! Excited to look at this closer 😸

@Roshrini
Member

Roshrini commented Jan 4, 2019

@kedarbellare Thanks for this contribution! Which GPU machine did you run this example on?

@mxnet-label-bot Add [pr-awaiting-review, Clojure]

@marcoabreu added the Clojure and pr-awaiting-review (PR is waiting for code review) labels Jan 4, 2019
@kedarbellare
Contributor Author

@Roshrini : I ran this on my personal desktop with Nvidia GTX 1070 with 8GB memory (ubuntu 18.04, cuda-9.0). I tried varying the batch sizes but that didn't help. Let me know if you want more details.

@gigasquid
Member

gigasquid commented Jan 8, 2019

@kedarbellare - getting to try this out - how long did your training take locally (and for how many epochs)?

  • Never mind I just found the info in the README :)
prefix `ocr-`. I was able to achieve an exact match accuracy of ~0.954 and
~0.628 on training and validation data respectively.
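
For readers unfamiliar with the metric: "exact match accuracy" presumably counts a prediction as correct only when every character of the captcha label matches. The following is a minimal plain-Clojure sketch under that assumption. The function name and the sample data are hypothetical and are not taken from the PR.

```clojure
;; Hypothetical helper for illustration only; not part of the PR.
(defn exact-match-accuracy
  "Fraction of examples whose predicted label sequence equals the
   true label sequence in every position."
  [labels predictions]
  (let [pairs (map vector labels predictions)
        hits  (count (filter (fn [[l p]] (= l p)) pairs))]
    (/ (double hits) (count pairs))))

;; Two 4-digit captchas; the second prediction has one wrong digit,
;; so only one of the two examples counts as an exact match.
(exact-match-accuracy [[1 2 3 4] [5 6 7 8]]
                      [[1 2 3 4] [5 6 7 9]])
;; => 0.5
```

Because a single wrong character makes the whole example count as a miss, this metric is stricter than per-character accuracy.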

Contributor

@aaronmarkham aaronmarkham left a comment


I left some suggested edits.
Also, it would be nice for the code files to have comments or some indication of what each file does.

contrib/clojure-package/examples/captcha/README.md: nine review comments (outdated, resolved)
@kedarbellare
Contributor Author

It's pretty slow with CPU training. On my MacBook Pro, with lein train (the default, training on one CPU), the speedometer shows ~25 samples/sec (end-to-end training took over an hour for 10 epochs). My beefier desktop gets double that with CPU training, and GPU training shows over 600 samples/sec.

@kedarbellare
Contributor Author

@aaronmarkham I followed your suggestions for the README. I'll add more comments in the code in a subsequent commit.

Member

@gigasquid gigasquid left a comment


Great addition! I tested it out and everything works perfectly.
Thank you for this contribution 👍

  []
  (let [data (sym/variable "data")
        ;; normalize the input pixels
        scaled (sym/div (sym/- data 127) 128)
Member


Great way to normalize the pixels 💯
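
The normalization in the snippet above maps 8-bit pixel values from [0, 255] to roughly [-1, 1]. Here is a minimal plain-Clojure sketch of the same arithmetic, assuming the inputs are integer pixel values. `scale-pixel` is a hypothetical helper for illustration; it is not part of the PR or of the MXNet API.

```clojure
;; Hypothetical helper mirroring the symbolic expression
;; (data - 127) / 128 for a single pixel value.
(defn scale-pixel
  [x]
  (/ (- x 127) 128.0))

;; Darkest, middle, and brightest 8-bit values.
(map scale-pixel [0 127 255])
;; => (-0.9921875 0.0 1.0)
```

Subtracting 127 centers the values near zero, and dividing by 128 keeps the result within [-1, 1].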

@gigasquid gigasquid merged commit d3bd5e7 into apache:master Jan 10, 2019
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019

* Clojure example for fixed label-width captcha recognition

* Update evaluation

* Better training and inference (w/ cleanup)

* Captcha generation for testing

* Make simple test work

* Add test and update README

* Add missing consts file

* Follow comments
Labels
Clojure, pr-awaiting-review (PR is waiting for code review)
5 participants