Keras RL
Deep Reinforcement Learning for Keras
- keras-rl implements some state-of-arts deep reinforcement learning in Python and integrates with keras
- keras-rl works with OpenAI Gym out of the box. This menas that evaluating and playing around with different algorithms easy
- You can use built-in Keras callbacks and metrics or define your own
What is included?
As of today, the following algorithms have been implemented:
- Deep Q Learning (DQN) [1], [2]
- Double DQN [3]
- Deep Deterministic Policy Gradient (DDPG) [4]
- Continuous DQN (CDQN or NAF) [6]
- Cross-Entropy Method (CEM) [7], [8]
- Dueling network DQN (Dueling DQN) [9]
- Deep SARSA [10]
- Asynchronous Advantage Actor-Critic (A3C) [5]
- Proximal Policy Optimization Algorithms (PPO) [11]
You can find more information on each agent in the doc.
Keras Examples
%run examples/dqn_cartpole.py
#python examples/dqn_cartpole.py
Cartpole
import numpy as np
import gym
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory
ENV_NAME = 'CartPole-v0'
# Get the environment and extract the number of actions.
env = gym.make(ENV_NAME)
np.random.seed(123)
env.seed(123)
nb_actions = env.action_space.n
# Next, we build a very simple model.
model = Sequential()
model.add(Flatten(input_shape=(1,) + env.observation_space.shape))
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(nb_actions))
model.add(Activation('linear'))
print(model.summary())
shape_test = ((1,) + env.observation_space.shape) # env.observation_space.shape (4)
shape_test
>> (1, 4)
# Finally, we configure and compile our agent. You can use every built-in Keras optimizer and
# even the metrics!
memory = SequentialMemory(limit=50000, window_length=1)
policy = BoltzmannQPolicy()
Simple Reinforcement Learning with Tensorflow Part 7: Action-Selection Strategies for Exploration
Boltzmann Exploration Done Right
keras DQN implementation
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=10,
target_model_update=1e-2, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=True, verbose=2)
# After training is done, we save the final weights.
dqn.save_weights('dqn_{}_weights.h5f'.format(ENV_NAME), overwrite=True)
# Finally, evaluate our algorithm for 5 episodes.
dqn.test(env, nb_episodes=5, visualize=True)