Exploration vs. Exploitation – Learning the Optimal Reinforcement Learning Policy

Welcome back to this series on reinforcement learning! Last time, we left our discussion of Q-learning with an open question: when selecting an action, how does an agent choose between exploring the environment and exploiting what it has already learned? In this video, we’ll answer this question by introducing a type of strategy called an epsilon-greedy strategy.
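
As a rough illustration of the idea, here is a minimal Python sketch of epsilon-greedy action selection. It assumes the Q-values for the current state are stored as a plain list with one entry per action; the function name `choose_action` and its parameters are hypothetical and are not taken from the video.

```python
import random

def choose_action(q_row, epsilon):
    """Pick an action index from one row of a Q-table (hypothetical helper).

    q_row:   Q-values for the current state, one per action (assumed layout)
    epsilon: exploration rate, between 0 and 1
    """
    if random.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return random.randrange(len(q_row))
    # Exploit: pick the action with the highest current Q-value.
    return max(range(len(q_row)), key=lambda a: q_row[a])
```

With `epsilon = 1` the agent always explores, and with `epsilon = 0` it always exploits. A common choice is to start with a high epsilon and decay it as the agent learns more about the environment.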

We’ll also explore how the agent uses this strategy to decide which actions to take. Then we’ll see exactly how a Q-value is calculated and updated in the Q-table, working through the math with an example from the lizard game we introduced last time.
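
For readers who want to see the update as code, here is a minimal sketch of the standard Q-learning update, written as a weighted average of the old Q-value and the newly learned value. The function name `q_update`, the learning rate `alpha`, the discount rate `gamma`, and the numbers in the usage line are illustrative assumptions, not values confirmed by this description.

```python
def q_update(q_old, reward, max_next_q, alpha, gamma):
    """Return the new Q-value for one state-action pair (hypothetical helper).

    q_old:      current Q-value for the state-action pair
    reward:     reward received after taking the action
    max_next_q: highest Q-value available from the next state
    alpha:      learning rate, between 0 and 1
    gamma:      discount rate, between 0 and 1
    """
    learned_value = reward + gamma * max_next_q
    # Blend the old estimate with the newly learned value.
    return (1 - alpha) * q_old + alpha * learned_value

# Illustrative numbers only: a Q-table initialized to zero and a reward of -1.
new_q = q_update(q_old=0.0, reward=-1.0, max_next_q=0.0, alpha=0.7, gamma=0.99)
```

A higher `alpha` makes the agent adopt new information faster, while a lower one keeps more of the previous estimate.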

Check out the corresponding blog and other resources for this video at:
http://deeplizard.com/learn/video/mo96Nqlo1L8

❤️🦎 Special thanks to the following polymaths of the deeplizard hivemind:
Ruicong Xie

Support collective intelligence, and join the deeplizard hivemind:
http://deeplizard.com/hivemind

Recommended books on AI:
The Most Human Human: What Artificial Intelligence Teaches Us About Being Alive:
http://amzn.to/2GtjKqu
Life 3.0: Being Human in the Age of Artificial Intelligence
https://amzn.to/2H5Iau4

Playlists:
Data Science – https://www.youtube.com/playlist?list=PLZbbT5o_s2xrth-Cqs_R9-us6IWk9x27z
Machine Learning – https://www.youtube.com/playlist?list=PLZbbT5o_s2xq7LwI2y8_QtvuXZedL6tQU
Keras – https://www.youtube.com/playlist?list=PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL
TensorFlow.js – https://www.youtube.com/playlist?list=PLZbbT5o_s2xr83l8w44N_g3pygvajLrJ-
PyTorch – https://www.youtube.com/watch?v=v5cngxo4mIg&list=PLZbbT5o_s2xrfNyHZsM6ufI0iZENK9xgG
Reinforcement Learning – https://www.youtube.com/playlist?list=PLZbbT5o_s2xoWNVdDudn51XM8lOuZ_Njv

Music:
Thinking Music by Kevin MacLeod
Jarvic 8 by Kevin MacLeod
YouTube: https://www.youtube.com/channel/UCSZXFhRIx6b0dFX3xS8L1yQ
Website: http://incompetech.com/
Licensed under Creative Commons: By Attribution 3.0 License
http://creativecommons.org/licenses/by/3.0/