The data for all the Q-Prop results is contained in data/local/*. To reproduce the plots from the paper, run python plot_rewards.py; to generate the same plot with a separate legend in each subfigure (useful for cropping), run python plot_rewards.py --mini.
NOTE: Running the experiments in sandbox/rocky/tf/launchers/sample_run.sh may throw a ModuleNotFoundError. To fix this, add the repository's top-level folder to your PYTHONPATH environment variable.
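For example, a minimal fix from the shell might look like the following, assuming the repository was cloned to ~/rllabplusplus (adjust the path to your actual checkout):

```bash
# Assumption: the repo lives at ~/rllabplusplus; replace with your checkout path.
export PYTHONPATH="$HOME/rllabplusplus:$PYTHONPATH"
```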
rllab++ is a framework for developing and evaluating reinforcement learning algorithms, built on rllab. It provides additional algorithm implementations beyond those included in rllab, such as Q-Prop (see the algo_name flag below).
The code is experimental and may require tuning or modifications to reach the best reported performance.
Please follow the basic installation instructions in the rllab documentation.
From the launchers directory, run the following, with optional additional flags defined in launcher_utils.py:
python algo_gym_stub.py --exp=<exp_name>
Flags include:
- algo_name: trpo (TRPO), vpg (vanilla policy gradient), ddpg (DDPG), qprop (Q-Prop with trpo), etc. See launcher_utils.py for more variants.
- env_name: OpenAI Gym environment name, e.g. HalfCheetah-v1.
The experiment results will be saved in data/local/<exp_name>.
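For example, a Q-Prop run on HalfCheetah might be launched as sketched below. This assumes the additional flags defined in launcher_utils.py are passed in the same --flag=value form as --exp; check launcher_utils.py for the exact flag names and defaults.

```bash
# Hypothetical example: train Q-Prop (TRPO variant) on HalfCheetah-v1,
# saving results under data/local/qprop_halfcheetah.
python algo_gym_stub.py --exp=qprop_halfcheetah --algo_name=qprop --env_name=HalfCheetah-v1
```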
If you use rllab++ for academic research, you are highly encouraged to cite the following papers:
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schoelkopf, Sergey Levine. "Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning". arXiv:1706.00387 [cs.LG], 2017.
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. "Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic". Proceedings of the International Conference on Learning Representations (ICLR), 2017.
- Yan Duan, Xi Chen, Rein Houthooft, John Schulman, Pieter Abbeel. "Benchmarking Deep Reinforcement Learning for Continuous Control". Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.