deep-reinforcement-learning
Appendix
Policy Gradient
Off-Policy Actor-Critic
Generalized Advantage Estimation
Soft Actor-Critic
PPO-Penalty
Last updated
6 years ago