# Sparse, Delayed, and Deceptive Rewards

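Games characterized by sparse rewards, such as Montezuma's Revenge, remain a challenge for most deep RL methods. Recent advances that combine DQN with intrinsic motivation or with expert demonstrations can help, but they have not removed the difficulty. Intrinsically motivated reinforcement learning and hierarchical reinforcement learning both have a long research history and may be useful here. The Minecraft-based Project Malmo environment is a good venue for creating tasks with very sparse rewards, in which agents need to set their own goals. Derivative-free (gradient-free) methods such as evolution strategies and genetic algorithms explore the parameter space by local sampling and are promising for these games, especially when combined with novelty search.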
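To make the last point more concrete, here is a minimal sketch of how novelty search might replace the game reward as the selection signal. It scores a candidate by the mean distance from its behavior to the k nearest behaviors already in an archive, then keeps the most novel child among local perturbations of a parent. This is a simple mutate-and-select loop, not a full evolution strategy or genetic algorithm. The names `novelty`, `behavior_of`, and `novelty_search_step` are hypothetical, and the behavior descriptor is a stand-in: in a real game it might be something like the agent's final position.

```python
import math
import random


def novelty(behavior, archive, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest archive entries."""
    if not archive:
        return float("inf")  # nothing seen yet, so everything is novel
    dists = sorted(math.dist(behavior, other) for other in archive)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


def behavior_of(params):
    """Hypothetical behavior descriptor (assumption): the first two parameters.

    In a game this would come from running the policy, e.g. its final (x, y).
    """
    return (params[0], params[1])


def novelty_search_step(parent, archive, rng, pop_size=20, sigma=0.1, k=3):
    """Sample Gaussian perturbations of `parent` and keep the most novel child."""
    children = [
        [p + rng.gauss(0.0, sigma) for p in parent] for _ in range(pop_size)
    ]
    best = max(children, key=lambda c: novelty(behavior_of(c), archive, k))
    archive.append(behavior_of(best))  # remember what has been tried
    return best


# Usage: ten selection steps driven only by novelty, with no reward signal.
rng = random.Random(0)
parent = [0.0, 0.0, 0.0]
archive = [behavior_of(parent)]
for _ in range(10):
    parent = novelty_search_step(parent, archive, rng)
```

Because selection ignores the reward entirely, a search like this cannot be misled by a deceptive reward, though in practice novelty is often mixed with the task reward rather than used alone.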

