Project Bella

A Life of Crime

This week, I did what I've done pretty much every week and didn't do anything close to what I said I'd do last week. I took my professor's suggestion to try and fashion one of the existing Unity ML-Agents examples into a game, and I believe I'm making decent progress. I took the Food Collector example since it was so similar to my forager agent (but in 3D!) and messed around with it. In the original example, the agents were capable of picking up food and poison, and temporarily freezing other agents with a laser. I modified it so that each agent is now a sort of combination of a penguin agent (tbt my girl Joundoom) and a foraging agent:

  • There are five agents in the scene that are competing for resources
  • Each agent has its own "nest"
  • An agent can move forward and turn (like Joundoom the penguin)
  • An agent can collect little green food balls and red poison balls scattered around the scene, much like the foraging agent; these collectables regenerate upon collection
  • An agent can carry only a limited number of food balls; once full, they must drop them off at their nest before they can collect more
  • Eating food balls increases an agent's energy but also slows them down
  • An agent can enter "aggressive" mode, in which they turn red and speed up; if they touch another agent while aggressive, the touched agent is sent back to their nest and frozen for a few moments; this mode consumes energy and appears in neither the penguin nor the foraging agent
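The game logic itself lives in Unity C#, but the rules above can be sketched abstractly. Here's a minimal Python model of them, where every constant, number, and method name is a placeholder of mine rather than the actual implementation:

```python
from dataclasses import dataclass

# Placeholder constants -- the real values live in the Unity project.
MAX_FOOD = 4              # food balls an agent can carry at once ("max food")
AGGRO_ENERGY_COST = 0.01  # energy drained per step while aggressive

@dataclass
class Agent:
    food: int = 0        # food balls currently carried
    energy: float = 1.0  # fraction of max energy remaining
    aggressive: bool = False
    frozen: bool = False

    def collect(self, is_poison: bool) -> None:
        """Pick up a ball: food fills the inventory, poison never does."""
        if not is_poison and self.food < MAX_FOOD:
            self.food += 1
            self.energy = min(1.0, self.energy + 0.1)  # eating restores energy

    def drop_off(self) -> None:
        """Empty the inventory at the nest so more food can be collected."""
        self.food = 0

    def speed(self) -> float:
        """Aggressive mode speeds an agent up; carried food slows it down."""
        base = 2.0 if self.aggressive else 1.0
        return base * (1.0 - 0.1 * self.food)

    def step(self) -> None:
        """Per-step upkeep: aggressive mode continuously burns energy."""
        if self.aggressive:
            self.energy = max(0.0, self.energy - AGGRO_ENERGY_COST)
            if self.energy == 0.0:
                self.aggressive = False  # can't stay aggressive on empty
```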

Each of these agents receives the following observations:

  • The direction they're facing
  • The direction toward their nest
  • The distance to their nest
  • Whether or not they are frozen (which occurs if they're touched by an aggressive agent)
  • Whether or not they're in aggressive mode
  • Whether or not they're carrying the maximum number of food items
  • The percent of their maximum energy remaining
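In ML-Agents these are gathered on the C# side in the agent's CollectObservations method; purely as an illustration, here's how that observation vector might be laid out in Python (the encoding choices and names are my guesses, not the project's):

```python
import math

def observation_vector(facing_deg, nest_dir_deg, nest_dist,
                       frozen, aggressive, at_max_food, energy_frac):
    """Flatten agent state into the float vector a policy would consume.
    Directions are encoded as sin/cos pairs so that 359 degrees and
    1 degree come out numerically close -- one common convention."""
    f = math.radians(facing_deg)
    n = math.radians(nest_dir_deg)
    return [
        math.cos(f), math.sin(f),    # direction the agent is facing
        math.cos(n), math.sin(n),    # direction toward its nest
        nest_dist,                   # distance to its nest
        1.0 if frozen else 0.0,      # frozen by an aggressive agent?
        1.0 if aggressive else 0.0,  # currently in aggressive mode?
        1.0 if at_max_food else 0.0, # carrying the maximum food?
        energy_frac,                 # percent of max energy remaining
    ]
```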

Each agent received the following rewards and punishments on the first attempt:

  • +1 when a food ball is collected
  • +1 when they bring food to their nest
  • -1 when a poison ball is collected
  • -1 when touched by an aggressive agent
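That first-attempt reward scheme is simple enough to write down directly; a Python sketch, with event names that are placeholders of mine:

```python
def reward(event: str) -> float:
    """First-attempt reward scheme for the food-collecting agents."""
    table = {
        "collect_food": +1.0,          # picked up a food ball
        "deliver_food": +1.0,          # brought food back to the nest
        "collect_poison": -1.0,        # picked up a poison ball
        "touched_by_aggressor": -1.0,  # hit by an aggressive agent
    }
    return table.get(event, 0.0)  # everything else is reward-neutral
```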

I trained sixteen of these agents for 20 thousand iterations, and by the end they were grabbing food and avoiding poison with precision. Even so, they still weren't returning to their nests to drop the food off. They were also becoming aggressive as much as possible because it made them move faster. Since energy and food were connected during this run, becoming aggressive would burn both energy and food; the burned-off food freed up carrying capacity, so they could collect even more, and their scores skyrocketed even though they never performed the crucial step of returning to the nest. On the next attempt, I added a very small punishment for becoming aggressive and stopped resetting energy when the round resets. With these modifications, they STILL weren't returning to the nest, but they became aggressive far less often (only as a small boost to grab a piece of food right in front of them). They also started displaying some SUPER cute behavior, like stopping, looking around, and charging.
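The second-attempt fix boils down to one extra shaping term on top of the original ±1 rewards. The magnitude below is a placeholder; the only real requirement is that it be tiny compared to ±1:

```python
AGGRO_PENALTY = -0.01  # placeholder: just enough to discourage spamming

def reward_v2(event: str) -> float:
    """Second-attempt rewards: same as before, plus a small cost for
    entering aggressive mode so it stops being a free speed boost."""
    table = {
        "collect_food": +1.0,
        "deliver_food": +1.0,
        "collect_poison": -1.0,
        "touched_by_aggressor": -1.0,
        "enter_aggressive": AGGRO_PENALTY,
    }
    return table.get(event, 0.0)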

Finally, I added a curriculum and a stronger penalty for getting touched by an aggressive agent: the touched agent is returned to their nest, temporarily frozen, AND loses all their food. I also decoupled food and energy, so an agent can now lose energy without losing food but still gains energy by eating food. This forces them to return to their nest if they want to keep collecting food and building up energy. The curriculum I used modulated two parameters: "nest radius" and "max food". Nest radius determines how close an agent has to be to their nest for food drop-off to occur, and max food determines how much food they can hold at once. Nest radius gradually decreases while max food gradually increases as training progresses.
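A two-parameter schedule like this can be written as an ML-Agents curriculum; here is an illustrative sketch in the environment_parameters config format, where the behavior name, lesson names, thresholds, and values are all placeholders rather than my actual settings:

```yaml
environment_parameters:
  nest_radius:
    curriculum:
      - name: WideNest          # drop-off works from far away at first
        completion_criteria:
          measure: reward
          behavior: FoodCollector
          threshold: 1.5
        value: 6.0
      - name: NarrowNest        # then the agent must actually reach the nest
        value: 2.0
  max_food:
    curriculum:
      - name: SmallInventory    # frequent trips home early on
        completion_criteria:
          measure: reward
          behavior: FoodCollector
          threshold: 1.5
        value: 2.0
      - name: LargeInventory    # longer foraging runs later
        value: 6.0
```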

200 thousand iterations later (an hour and fifteen minutes), they were finally returning to the nest.

This game is playable if the agents' Behavior Parameters behavior type is switched from Default to Heuristic, but it is obviously not polished yet. The AIs are formidable and play like a very focused human, but I don't think the game is engaging enough. These agents clearly function as challenging and somewhat sympathetic (they're really cute) AIs with a complex machine learning model for a brain. I have an idea of exactly what I want to do next week, but I won't reveal it yet; it has to do with theming for the game, for which I have very specific inspiration that I'm excited to try out.