The data for all the Q-Prop results is contained in data/local/*. To reproduce the plots from the paper, run python plot_rewards.py; to generate the same plot with a separate legend in each subfigure (useful for cropping), run python plot_rewards.py --mini.
NOTE: Running the experiments in sandbox/rocky/tf/launchers/sample_run.sh may throw a ModuleNotFoundError. To fix this, add the repository's top-level folder to your PYTHONPATH environment variable.
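For example, a minimal fix from the shell might look like the following, assuming the repository was cloned to ~/rllabplusplus (adjust the path to your actual checkout):

```bash
# Assumption: the repo lives at ~/rllabplusplus; replace with your checkout path.
export PYTHONPATH="$HOME/rllabplusplus:$PYTHONPATH"
```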
rllab++ is a framework for developing and evaluating reinforcement learning algorithms, built on rllab. It provides additional algorithm implementations beyond those included in rllab, such as Q-Prop (see the algo_name flag below).
The code is experimental and may require tuning or modifications to reach the best reported performance.
Please follow the basic installation instructions in the rllab documentation.
From the launchers directory, run the following, with optional additional flags defined in launcher_utils.py:
python algo_gym_stub.py --exp=<exp_name>
Flags include:
- algo_name: trpo (TRPO), vpg (vanilla policy gradient), ddpg (DDPG), qprop (Q-Prop with trpo), etc. See launcher_utils.py for more variants.
- env_name: OpenAI Gym environment name, e.g. HalfCheetah-v1.
The experiment results will be saved in data/local/<exp_name>.
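For example, a Q-Prop run on HalfCheetah might be launched as sketched below. This assumes the additional flags defined in launcher_utils.py are passed in the same --flag=value form as --exp; check launcher_utils.py for the exact flag names and defaults.

```bash
# Hypothetical example: train Q-Prop (TRPO variant) on HalfCheetah-v1,
# saving results under data/local/qprop_halfcheetah.
python algo_gym_stub.py --exp=qprop_halfcheetah --algo_name=qprop --env_name=HalfCheetah-v1
```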
If you use rllab++ for academic research, you are highly encouraged to cite the following papers:
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schoelkopf, Sergey Levine. "Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning". arXiv:1706.00387 [cs.LG], 2017.
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. "Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic". Proceedings of the International Conference on Learning Representations (ICLR), 2017.
- Yan Duan, Xi Chen, Rein Houthooft, John Schulman, Pieter Abbeel. "Benchmarking Deep Reinforcement Learning for Continuous Control". Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.