Why is RL Different than AI/ML Planning? at 3:26

Exploration starts at 19:27

Evaluating an RL algorithm at 40:18

Sorry, too many filler sounds (ähhm, ahh) in your speech; I can't listen to this.

In the slide at 24:18, the last term is the square root of a fraction. In the context of a game, for example, the numerator is "twice the natural logarithm of the number of actions taken since the beginning of the game" and the denominator is "the number of times this particular action has been chosen this game". The term is meant to give rarely chosen actions a little boost in how likely they are to be chosen. I'm pretty sure that's the correct understanding; anyone care to disagree?
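That description matches the exploration bonus in the standard UCB1 bandit rule, sqrt(2 ln t / n_a), assuming that is what the slide shows (the slide itself is not reproduced here). A minimal Python sketch of that reading follows; the function names `ucb1_bonus` and `ucb1_select` are made up for illustration and do not come from the lecture. One nuance: in UCB1 the bonus is added to each action's average reward and the highest total is picked deterministically, so the "boost" raises an action's score rather than a sampling probability.

```python
import math


def ucb1_bonus(total_plays, action_plays):
    """Exploration bonus sqrt(2 * ln(t) / n_a) for one action.

    total_plays  (t):   actions taken since the beginning of the game
    action_plays (n_a): times this particular action has been chosen
    """
    if action_plays == 0:
        # An untried action gets an infinite bonus so it is tried first.
        return math.inf
    return math.sqrt(2.0 * math.log(total_plays) / action_plays)


def ucb1_select(mean_rewards, counts):
    """Return the index of the action with the largest mean + bonus."""
    total = sum(counts)
    scores = [m + ucb1_bonus(total, n) for m, n in zip(mean_rewards, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


# Two actions with the same average reward: the rarely chosen one wins.
choice = ucb1_select([0.5, 0.5], [90, 10])
```

With equal average rewards, the action chosen 10 times gets a larger bonus than the one chosen 90 times, which is the "boost for rarely chosen actions" described above. The bonus shrinks as n_a grows and grows only logarithmically with t.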

The guy who was continuously typing on a laptop or something has managed to ruin the whole recording. Congrats. If you do not listen, why do you come?

Very good lecture, but I find the person constantly typing on the laptop extremely disrespectful.

RL starts here

slides?