Smartmind
Which of the following is NOT a type of policy in reinforcement learning?
Stochastic policy
Value-based policy
Overlook minor misbehaviors
Impose harsh punishments for any infraction