Add an asynchronous single GPU dataloader example #1521
Conversation
Hello @HenryJia! Thanks for updating this PR.
Comment last updated at 2020-04-29 20:03:05 UTC
This pull request is now in conflict... :(
Codecov Report
@@           Coverage Diff           @@
##           master   #1521   +/-   ##
=======================================
  Coverage      88%      88%
=======================================
  Files          71       71
  Lines        4175     4175
=======================================
  Hits         3692     3692
  Misses        483      483
@HenryJia The len attribute seems to be wrong. To the best of my knowledge, DataLoaders do not have a len. To access the length, you should use DataLoader.dataset to get at the underlying dataset.
@veritas9872 That's not correct. The PyTorch DataLoader does have a length, computed from the batch size and the dataset/sampler length. It's the dataset that does not necessarily have a length.
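For reference, a minimal sketch of the behaviour being described here (standard PyTorch, nothing specific to this PR): len() on a map-style DataLoader returns the number of batches, derived from the dataset/sampler length and the batch size; it is a dataset without __len__ (e.g. an IterableDataset) where a length is genuinely unavailable.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 100 samples, batches of 32
dataset = TensorDataset(torch.randn(100, 3))
loader = DataLoader(dataset, batch_size=32)

print(len(dataset))  # 100 -> number of samples
print(len(loader))   # 4   -> number of batches: ceil(100 / 32) with drop_last=False
```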
This pull request is now in conflict... :(
Looks good to me now :)
I think as an example this is good now. However, there are some issues with DDP, where we automatically alter the sampler when necessary.
I'd say merge this now, since your loader code currently lives only in the examples, but maybe follow up with a new PR that adds the code to the framework and also takes care of these changes?
Thoughts @Borda @williamFalcon @PyTorchLightning/core-contributors?
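For context, a rough sketch of what "automatically alter the sampler" under DDP amounts to (the helper name and signature below are hypothetical, not Lightning's actual API): the user's DataLoader is rebuilt around a DistributedSampler so each process reads a distinct shard of the dataset, which is why a loader wrapped outside the framework could conflict with that step.

```python
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler


def with_distributed_sampler(loader: DataLoader, num_replicas: int, rank: int) -> DataLoader:
    """Hypothetical helper: rebuild a DataLoader with a DistributedSampler for DDP."""
    sampler = DistributedSampler(loader.dataset, num_replicas=num_replicas, rank=rank)
    return DataLoader(
        loader.dataset,
        batch_size=loader.batch_size,
        sampler=sampler,                 # replaces the original (typically random) sampler
        num_workers=loader.num_workers,
        pin_memory=loader.pin_memory,
    )
```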
from pl_examples.models.lightning_template import LightningTemplateModel
from pl_examples.utils.loaders import AsynchronousLoader

SEED = 2334
this shall be fixed in #1572
Is the seed actually needed? The example does not produce any meaningful results; it's just a demo for a dataloader, right? Otherwise I would wait until #1572 is merged.
@justusschock I thought about adding it to Lightning itself instead of just as an example, but I'm not sure exactly where to put it. It's not very generalisable, and it's only single-GPU compatible. Getting it to work with infinite dataloaders would add a bit of complexity, since I'd need to add thread-stopping conditions. So, sticking with the PyTorch/PyTorch Lightning philosophy of modularity, I figured I'd keep it as a wrapper example around the dataloader.
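For readers following along, here is a minimal sketch of the wrapper idea being discussed (the class and its names are hypothetical, not the PR's actual AsynchronousLoader code): a background thread iterates over an ordinary DataLoader and issues host-to-device copies on a side CUDA stream, so the transfer can overlap with compute on the default stream. Meaningful overlap also assumes the wrapped DataLoader uses pin_memory=True and yields sequences of tensors.

```python
import queue
import threading

import torch


class AsyncGPULoaderSketch:
    """Hypothetical sketch of an asynchronous single-GPU loader wrapper."""

    def __init__(self, dataloader, device=torch.device("cuda:0"), queue_size=2):
        self.dataloader = dataloader
        self.device = device
        self.queue = queue.Queue(maxsize=queue_size)
        self.stream = torch.cuda.Stream(device=device)  # side stream for the copies

    def _worker(self):
        for batch in self.dataloader:  # assumes each batch is a sequence of tensors
            with torch.cuda.stream(self.stream):
                # non_blocking copies from pinned host memory overlap with compute
                batch = [t.to(self.device, non_blocking=True) for t in batch]
            self.queue.put(batch)
        self.queue.put(None)  # sentinel: end of the epoch

    def __iter__(self):
        threading.Thread(target=self._worker, daemon=True).start()
        while True:
            batch = self.queue.get()
            if batch is None:
                return
            # make the copies issued on the side stream visible to the default stream
            torch.cuda.current_stream(self.device).wait_stream(self.stream)
            yield batch

    def __len__(self):
        return len(self.dataloader)
```

The thread-stopping conditions for infinite dataloaders mentioned above are exactly what this sketch omits: the worker thread only terminates once the wrapped loader is exhausted.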
This pull request is now in conflict... :(
Looks good, I tried it myself and it worked fine.
The template filename could be renamed to gpu_async_template just for better lexicographic ordering in the folder.
And maybe the dataloader could be moved to a "dataloaders" folder, just like the models that live in the "models" folder.
There is a merge conflict, could you resolve it?
@HenryJia this is awesome, but I think this belongs in bolts?
Could be a good place for these kinds of examples
Maybe it could go to bolts.dataloaders?
Following some discussions with William, we agreed that this would be best added to the datamodules or dataloaders section of bolts instead. As such, I'm closing this PR for now unless anyone objects. I will open a new PR for bolts soon, once I'm past my final-year university exams.
Before submitting
What does this PR do?
Addresses #1454, #1404, #1316
This was the simplest way to do it without adding any extra complexity to pytorch-lightning itself. It's only possible to load asynchronously for single-GPU training anyway: multi-GPU goes through PyTorch's DataParallel.scatter(), which seems to have synchronisation constraints.
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃