Microsoft researchers master Ms. Pac-Man with AI


Microsoft researchers have created an AI system that beats Ms. Pac-Man, and the techniques used to train it could help train AI agents to perform tasks that aid people in their work and daily lives. Some of the system's agents were tasked with keeping away from the ghosts in the game, while others got a reward for locating a particular pellet. The team used a divide-and-conquer method that may have significant implications for how artificial intelligence systems can be taught to carry out complex tasks in place of humans.

An associate professor of computer science at McGill University told Microsoft that the method Maluuba used was very similar to theories about how the brain works, and suggested that this might be a step toward "more general intelligence".

The researchers found that the best results were achieved when each agent acted egotistically, while the top agent made the choice that was best for everyone. Mastering the game is a significant achievement, since AI researchers have long found Ms. Pac-Man among the hardest games to crack.

As the game is played, these agents give the AI's top agent, described by researchers as a sort of "senior manager" of a company, feedback on which direction it should move Ms. Pac-Man.
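The relationship between the agents and the top agent can be pictured with a small sketch. It is only an illustration under stated assumptions, not Maluuba's implementation: it assumes each agent reports a numeric preference per direction and that the top agent simply adds those preferences up and picks the highest total. The function name choose_direction, the agent names, and all of the scores are hypothetical.

```python
from collections import defaultdict

DIRECTIONS = ("up", "down", "left", "right")

def choose_direction(agent_preferences):
    """Hypothetical 'senior manager': sum every agent's per-direction
    scores and return the direction with the highest total."""
    totals = defaultdict(float)
    for prefs in agent_preferences:
        for direction, score in prefs.items():
            totals[direction] += score
    # Ties are broken by the fixed order of DIRECTIONS.
    return max(DIRECTIONS, key=lambda d: totals[d])

# Made-up scores: two agents each want their own pellet,
# and one agent strongly wants to avoid a ghost on the left.
pellet_a = {"left": 1.0, "up": 0.2}
pellet_b = {"right": 0.8}
ghost = {"left": -5.0}

best = choose_direction([pellet_a, pellet_b, ghost])
print(best)
```

In this toy case the ghost agent's strong objection outweighs the pellet on the left, so the top agent moves right. Each agent looks only at its own goal, and only the top agent weighs them against each other.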

The method is called Hybrid Reward Architecture.

Some might question why a cutting-edge technology such as AI is being trained on games designed in the 1980s.

Reaching the perfect Ms. Pac-Man score required the researchers to divide the task of mastering the arcade game into small pieces, which were then distributed to more than 150 artificial intelligence agents. You'd think you could keep going if you wanted to.

As Pedro Domingos explains in his book The Master Algorithm, reinforcement learning is a field "dedicated to algorithms that explore on their own, flail, hit on rewards, and figure out how to get them again in the future, much like babies crawling around and putting things in their mouths".

Sure, a robot might not be able to speak well (right now), but chatbots are becoming more common, and wouldn't you rather instant-message a salesperson than talk to them on the phone?

With reinforcement learning, an agent gets both positive and negative responses and learns through trial and error to maximise the positive ones.
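The trial-and-error idea can be shown with a toy example. This is a minimal sketch, not the Ms. Pac-Man system: it assumes a single decision with two possible moves, a fixed reward of -1 for running into a ghost and +1 for eating a pellet, and an epsilon-greedy rule (one common strategy, picked here purely for illustration). All names and numbers are hypothetical.

```python
import random

# Hypothetical rewards: moving left hits a ghost, moving right eats a pellet.
REWARDS = {"left": -1.0, "right": 1.0}

def learn(steps=200, epsilon=0.1, seed=0):
    """Estimate the value of each move by trying moves and averaging rewards."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in REWARDS}  # current reward estimates
    counts = {action: 0 for action in REWARDS}    # times each move was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random move.
            action = rng.choice(list(REWARDS))
        else:
            # Exploit: take the move that currently looks best.
            action = max(values, key=values.get)
        reward = REWARDS[action]
        counts[action] += 1
        # Incremental average of the rewards seen for this move.
        values[action] += (reward - values[action]) / counts[action]
    return values

values = learn()
preferred = max(values, key=values.get)
print(preferred, values)
```

After a few negative responses from the ghost, the agent settles on the move that earns the positive reward, while still exploring occasionally in case something better turns up.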