A significant challenge in designing reward functions for multi-agent reinforcement learning (MARL) is:
Data collection
Overfitting
Overlooking minor misbehaviors
Imposing harsh punishments for any infraction
