Online batch generation for unsupervised learning #2453

Closed
djbyrne opened this issue Jul 1, 2020 · 12 comments
Labels
feature (Is an improvement or enhancement) · help wanted (Open to be worked on) · won't fix (This will not be worked on)

Comments

@djbyrne
Contributor

djbyrne commented Jul 1, 2020

🚀 Feature

As an alternative to pulling batches from a pre-made dataset, provide users with a hook that allows them to specify how batches are generated for each training step. This would allow users to provide logic for generating a batch online during training. This would be ideal for Reinforcement Learning models and other unsupervised problems.

Motivation

Currently Lightning expects all models to use a pre-made dataset that generates mini-batches through a DataLoader. This works for most use cases, but for some, such as Reinforcement Learning models, this requirement is counter-intuitive, as the data is generated online during training. In order for these models to work with Lightning, custom Datasets and runners need to be created to generate the batch and wrap it in a DataLoader.

Pitch

Add additional hooks for specifying how a batch is generated for train, val and test. Going forward, I think there are two options for implementing this feature.

1: Check if the user has populated the train_batch function and call it directly in the training loop instead of the dataloader. This will probably require a lot of code churn in order to handle the various checks for the dataloader in the main Trainer.

2: Create a 'dummy' dataloader that wraps the train_batch function provided by the Lightning model. This is probably a simpler implementation as the underlying logic in the Trainer that handles the dataloader shouldn't need to change.
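
Something like the following rough sketch could work for option 2 (assuming a hypothetical `train_batch` generator hook on the LightningModule; this hook does not exist in Lightning today):

```python
# Sketch only: `train_batch` is the hypothetical hook proposed above,
# not an existing Lightning API. Wrapping it in an IterableDataset lets
# the Trainer keep consuming a DataLoader while the batches actually
# come from the hook.
from torch.utils.data import DataLoader, IterableDataset


class _TrainBatchDataset(IterableDataset):
    """Yields whatever the model's `train_batch` generator produces."""

    def __init__(self, train_batch_fn):
        self.train_batch_fn = train_batch_fn

    def __iter__(self):
        return iter(self.train_batch_fn())


def dummy_dataloader(model):
    # batch_size=None disables automatic batching, so each item yielded by
    # the hook is passed straight to training_step as a complete batch.
    return DataLoader(_TrainBatchDataset(model.train_batch), batch_size=None)
```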

Alternatives

The alternative to this would be to have the user create custom DataLoader/Datasets that generate the batch on the fly. This will give the same end result, but is arguably a messier solution and requires more work and effort for the user.

@djbyrne djbyrne added the feature and help wanted labels Jul 1, 2020
@github-actions
Contributor

github-actions bot commented Jul 1, 2020

Hi! Thanks for your contribution, great first issue!

@justusschock
Member

Hi @djbyrne, this is already possible. In your LightningModule's train_dataloader you just have to provide an iterable type that returns batches. Training should work as normal; only the magic that happens around the DataLoader (several sanity checks for the sampler etc.) won't work :)
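
For example (a rough sketch; `collect_rollout` is just a placeholder name, not part of any API):

```python
# Rough sketch: train_dataloader returns a plain iterable of ready-made
# batches; no Dataset or DataLoader involved. `collect_rollout` is a
# placeholder for whatever produces a collated batch online.
import pytorch_lightning as pl


class BatchIterable:
    def __init__(self, make_batch):
        self.make_batch = make_batch

    def __iter__(self):
        while True:
            yield self.make_batch()  # one pre-collated batch per step


class OnlineModel(pl.LightningModule):
    def collect_rollout(self):
        ...  # placeholder: run the agent and return a collated batch

    def train_dataloader(self):
        return BatchIterable(self.collect_rollout)
```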

@djbyrne
Contributor Author

djbyrne commented Jul 2, 2020

Hey @justusschock, you are correct that this is already possible. I have actually been doing something like this for the RL models in the Lightning Bolts repo: https://github.com/PyTorchLightning/pytorch-lightning-bolts/blob/master/pl_bolts/models/rl/common/experience.py

Although this works, it seems a bit cumbersome for users to go off and create their own data source that feeds the iterator for the dataloader. I know that this will be doing practically the same thing, but I feel that it would be cleaner and provide a much better user experience.

@justusschock
Member

I see :) I think you are right. I'm not familiar with RL, so to clarify the interface: Do you need the function to produce whole batches or just single samples?

Because for whole batches, we would also have to make sure that batching in the DataLoader (if used) is disabled, or not use the DataLoader at all...

@djbyrne
Contributor Author

djbyrne commented Jul 2, 2020

It would be both. The typical flow would be:

1: carry out N steps in the environment using the agent

2: add to a buffer or just gather the experience data

3: sample from the buffer or take the gathered experience data as the batch.

I think the best way would be to have an IterableDataset that consumes this train_batch() hook. I believe that by doing this, batching would not be used.
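
Roughly, the generator behind that hook could look like this (just a sketch; the env/agent interface is the old gym style, and train_batch is the proposed hook, not an existing API):

```python
# Sketch of the 1/2/3 flow above, written as a generator that an
# IterableDataset could consume.
import random
from collections import deque


def train_batch(env, agent, n_steps=4, batch_size=32, buffer_size=10_000):
    buffer = deque(maxlen=buffer_size)
    state = env.reset()
    while True:
        # 1: carry out N steps in the environment using the agent
        for _ in range(n_steps):
            action = agent(state)
            next_state, reward, done, _ = env.step(action)
            # 2: add the experience to a buffer
            buffer.append((state, action, reward, done, next_state))
            state = env.reset() if done else next_state
        # 3: sample from the buffer as the batch
        yield random.sample(list(buffer), min(len(buffer), batch_size))
```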

@justusschock
Member

justusschock commented Jul 2, 2020

If you use an IterableDataset and pass it to the loader, I think batching will still be done. We could, however, bypass this with a custom collate_fn. But maybe we don't need the loader here at all, since its main purpose is multiprocess loading and I'm not sure how useful this is for RL.
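
For example, either of these would avoid the extra batching (sketch; `dataset` is assumed to be an IterableDataset that already yields whole batches):

```python
from torch.utils.data import DataLoader

# a) disable automatic batching entirely
loader = DataLoader(dataset, batch_size=None)

# b) keep the default batch_size=1 and unwrap with an identity-style collate_fn
loader = DataLoader(dataset, collate_fn=lambda items: items[0])
```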

@christofer-f
Contributor

This is something I am looking into as well.
In this repo ( https://github.com/christofer-f/Hierarchical-Actor-Critic-HAC-PyTorch ) the agent is populating 6 replay buffers before training.

@djbyrne The code solves MountainCarContinuous-v0 in 100 epochs... which is quite nice :-)

//Christofer

@djbyrne
Contributor Author

djbyrne commented Jul 3, 2020

@christofer-f that looks awesome!

@djbyrne
Contributor Author

djbyrne commented Jul 3, 2020

@justusschock I have pushed a proof of concept for the train_batch interface here https://github.com/djbyrne/pytorch-lightning-bolts/blob/enhancement/train_batch_function/pl_bolts/models/rl/vanilla_policy_gradient_model.py

The datamodules for this can be found here https://github.com/djbyrne/pytorch-lightning-bolts/blob/enhancement/train_batch_function/pl_bolts/datamodules/experience_source.py

This is still WIP but would like to get some feedback on it so far

@christofer-f
Contributor

I found this article:
https://medium.com/speechmatics/how-to-build-a-streaming-dataloader-with-pytorch-a66dd891d9dd

It seems really useful...
I am trying to create a small example where you fit a network to a dataset that is created on the fly, and where you populate the dataset in chunks using yield.

Hopefully, I will put some code in my repo to show my ideas...
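
Roughly the idea from the article (a sketch; `generate_chunks` is just a placeholder producer):

```python
# A generator yields chunks of data created on the fly, and an
# IterableDataset flattens them into single samples that the DataLoader
# then batches as usual.
from itertools import chain

import torch
from torch.utils.data import DataLoader, IterableDataset


def generate_chunks(n_chunks=100, chunk_size=256):
    for _ in range(n_chunks):
        # placeholder: create one chunk of samples on the fly
        yield [torch.randn(10) for _ in range(chunk_size)]


class StreamingDataset(IterableDataset):
    def __iter__(self):
        # flatten the stream of chunks into a stream of samples
        return chain.from_iterable(generate_chunks())


loader = DataLoader(StreamingDataset(), batch_size=32)
```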

@djbyrne
Contributor Author

djbyrne commented Jul 3, 2020

@christofer-f yeah, that iterable dataset approach is what is currently being used in the PyTorch Lightning Bolts RL module.

@stale

stale bot commented Sep 1, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the won't fix label Sep 1, 2020
@stale stale bot closed this as completed Sep 10, 2020