Guide to making AI agents that are so good, it's actually scary
A blog post for AI and game development enthusiasts who want to use leading-edge technology to create amazing, scarily good AIs, better and faster, by learning from humans.
Introduction to Reinforcement Learning (RL)
Before we dive right into creating AI agents that can perform complex tasks, it's worthwhile introducing some relevant concepts that will be used in this blog post. So let's start with: what is Reinforcement Learning (RL)? RL is an area of Machine Learning focused on creating intelligent agents that can complete tasks on their own using reinforcement signals (punishments or positive rewards). RL algorithms have been used in industries ranging from robotics, manufacturing, and energy to video games, with notable mentions being OpenAI's Dota 2 agent and DeepMind's StarCraft 2 and Go agents, which beat the best players in their respective games. A straightforward way to describe an RL agent and its relationship with its environment can be seen in Figure 1: the environment constantly feeds the agent a state and a reward, the agent takes an action, and that action in turn changes the environment's state and reward.
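The state-action-reward loop from Figure 1 can be sketched in a few lines of Python. The environment below is a toy of my own invention (not UnityML code), and the "agent" simply acts randomly; the point is only to show the shape of the loop:

```python
import random

class ParityEnv:
    """A toy environment: the state is a step counter, and the agent
    earns +1 reward whenever its action matches the counter's parity."""
    def __init__(self):
        self.state = 0

    def step(self, action):
        reward = 1 if action == self.state % 2 else 0
        self.state += 1
        done = self.state >= 10  # the episode lasts 10 steps
        return self.state, reward, done

env = ParityEnv()
total_reward, done = 0, False
while not done:
    action = random.choice([0, 1])          # the agent picks an action...
    state, reward, done = env.step(action)  # ...the environment answers with a new state and reward
    total_reward += reward
print(total_reward)
```

A real RL algorithm would replace the `random.choice` with a policy that is updated from the rewards it receives.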
Drawbacks of Reinforcement Learning
Using RL algorithms is unfortunately not the solution for all problems, as RL has some drawbacks that make certain implementations too computationally expensive to run. For instance, high-dimensional environments, very delayed rewards, and not understanding the correlation between tasks are some hurdles that can make training an agent very tedious and computationally expensive. Even a simple relationship, like understanding that completing action A gives the ability to do action B, might be too complicated for a simple agent to learn. A great example of this A-B relationship is shown in the famous Atari game "Montezuma's Revenge," where a key (A) gives an agent the ability to enter a door and get a reward (B). Although this might sound like a simple task, if we let our agent explore the game space without "helping" it understand the correlation between tasks, the agent might never discover the "door-key" relationship and thus never reach its desired goal. But don't worry, there are ways of solving this, and in this blog post we will use one possible solution, known as imitation learning, to solve a similar problem.
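To get a feel for why sparse, sequential rewards are so hard, here is a made-up miniature version of the key-door problem (my own toy, not the actual Atari game): an agent random-walks along a corridor and is only rewarded if it first touches the "key" at one end and then the "door" at the other. A purely random explorer rarely pulls this off:

```python
import random

def random_walk_success_rate(episodes=2000, steps=60, seed=0):
    """Estimate how often a purely random agent first grabs the 'key'
    (position 10) and then reaches the 'door' (position 0) in one episode."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(episodes):
        pos, has_key = 5, False
        for _ in range(steps):
            pos = max(0, min(10, pos + rng.choice((-1, 1))))
            if pos == 10:
                has_key = True
            if pos == 0 and has_key:  # sparse reward: only this exact sequence pays out
                successes += 1
                break
    return successes / episodes

rate = random_walk_success_rate()
print(rate)
```

Because the reward only arrives after the full key-then-door sequence, almost every episode ends with zero signal for the agent to learn from; this is exactly the kind of gap that imitation learning helps bridge.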
Imitation Learning
Imitation Learning is a set of techniques that aim to help agents mimic human behavior by mapping state-action pairs from human demonstrations or recordings and letting an agent sample from those recordings. Imitation Learning procedures are no different from the methods used to teach humans to do specific tasks, as we often imitate others' behavior when we are in completely novel environments. For instance, if you are driving in a foreign city, you might imitate how certain drivers in front of you move on the road, as they might have more experience regarding where certain potholes are located.
It's worth mentioning that imitation learning encompasses a family of distinct algorithms that all aim to do a similar job. For this particular blog post, we will be using a Generative Adversarial Imitation Learning (GAIL) approach, which is similar to Direct Policy Learning. In short, both methods feed human-created data to an agent so it can learn an initial policy, then let the agent proceed to learn and update its policy iteratively by itself, leveraging the human-created data as well as the agent's own data. For more information on different imitation learning algorithms, SmartLab created a great blog post on the topic here, and a GAIL-specific post here. In the GAIL case we also have a discriminator that assesses whether a sample came from a human or was created by the agent itself; this will be explained further later in the post.
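To make the discriminator idea concrete, here is a toy sketch with synthetic made-up data (this is not ML-Agents' actual implementation): a tiny logistic classifier learns to tell "expert" samples from "agent" samples, and its score on the agent's samples is the kind of signal GAIL converts into a reward:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for state-action pairs: "expert" samples (from human
# demonstrations, label 1) cluster differently from the agent's own (label 0).
expert = rng.normal(loc=1.0, size=(200, 4))
agent = rng.normal(loc=-1.0, size=(200, 4))
X = np.vstack([expert, agent])
y = np.concatenate([np.ones(200), np.zeros(200)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A one-layer logistic discriminator trained by plain gradient descent.
w, b = np.zeros(4), 0.0
for _ in range(500):
    p = sigmoid(X @ w + b)              # probability "this sample looks human"
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

# GAIL turns the discriminator's score on the agent's samples into a reward:
# the more "human-like" the behaviour, the higher the reward signal.
pseudo_reward = sigmoid(agent @ w + b)
accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(round(float(accuracy), 3))
```

In real GAIL the policy and the discriminator are trained against each other: as the agent gets better at looking "human," the discriminator has to work harder to tell the two apart.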
Now that we have most of the definitions out of the way, we can create an agent using Imitation Learning.
Step 1: Download Unity
Unity is a free game engine for students and individuals, which offers a vast ecosystem of tools with a lot of support from game developers from around the world. Download the Unity Hub application from the main Unity website here. Unity also has external packages that can be downloaded to import certain assets, scripts, and animations. For our AI examples, we will only use UnityML, which is an official Unity package that includes various examples and code snippets to train and visualize AI agents. You can also run multiple agents in parallel using UnityML, which greatly speeds up training time.
Once you have successfully downloaded Unity, you will see an application called Unity Hub, where you can manually download specific versions of Unity to your computer. For this blog post, we will be using Unity 2018.4.31f1.
Step 2: Download Python
UnityML needs Python to be installed. I will be using Python version 3.9.2 and pip version 21.0.1 (a Python package installer), but feel free to try this blog post with a different Python configuration. Figure 3 showcases all the configuration options that can be used with UnityML.
With Python, pip, and Unity installed, we can now run the following command in our computer's terminal to make sure the UnityML packages are installed:
pip install mlagents
Step 3: Github Repository Cloning
Now make sure to clone the UnityML repo onto your computer from its official GitHub page. If you don't have a GitHub account, feel free to make one here.
It's very likely that you will encounter specific errors when installing these packages, so feel free to search Stack Overflow for UnityML-related questions, or contact me if you are having trouble running these packages.
Step 4: Running UnityML Demo Scenes
Assuming you were able to download all relevant packages and clone the GitHub repo, you can now create a new Unity 3D project. (Make sure to install the specific Unity version you want to use before starting a default project.) Now, open your Unity game file and it should load an empty world like in Figure 4.
Now you will want to import the UnityML packages from the Package Manager, found under the "Window" tab at the top left. Search for UnityML and download the relevant packages into your project. You should also be able to import the packages from your cloned ml-agents repository into your Unity 3D project. If you got to this step, your setup is done.
There are a bunch of AIs worth exploring; feel free to go to the Examples folder and browse all the particular game scenes that are already present. We will focus on creating an agent in the game Pyramids, as it's very complicated and can best show how Imitation Learning can considerably boost our agent's training time. (In your folder you can open this scene by going to ML-Agents>Examples>Pyramids>Scenes>Pyramids.)
Step 5: Start Recording
Once your scene is set up, try running the game and see how the default AI agent behaves. It's alright at the game, nothing too extraordinary. Now, while in edit mode, select the agent in the game and add an ML component using the right Inspector tab; Figure 6 helps for reference. Scroll down, press the Add Component button, select ML Agents, and add a "Demonstration Recorder" script. Once the script is added, toggle the "Record" checkbox and add a name for your recordings (optionally, also add a name for the folder you want to save your recordings to).
Now, in the same Inspector tab, find the "Behavior Parameters" script and change the "Behavior Type" property to "Heuristic Only" (it's usually set to Default). These steps make sure our environment is set up to record our movements in the game. Now start the game and notice that you will be controlling the agent with your keyboard (the controls are WASD). Make sure to do multiple runs with the agent. Once you have finished your recording, a file will appear wherever you set your recordings to be saved. You can view stats of your gameplay by clicking the recording script, as seen in Figure 7.
Now look for the project's "config" folder, which might be outside your Unity folder.
Inside that folder, go into "imitation" and then "pyramid" (recap: config>imitation>pyramid). Open the YAML file in a code or text editor and copy the following lines inside it:
gail:
  strength: 0.01
  gamma: 0.99
  encoding_size: 128
  demo_path: "Full Path of your Game Recording"
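For context, in ML-Agents config files of this era the gail block usually sits under a behavior's reward_signals section, alongside the ordinary extrinsic reward. A hedged sketch of how the Pyramids entry might look (the behavior name and exact keys vary between ML-Agents versions, so treat this as a guide rather than a copy-paste target):

```yaml
Pyramids:
  reward_signals:
    extrinsic:
      strength: 1.0
      gamma: 0.99
    gail:
      strength: 0.01
      gamma: 0.99
      encoding_size: 128
      demo_path: "Full Path of your Game Recording"
```

Training is then typically launched from a terminal with something like `mlagents-learn <path-to-your-config>.yaml --run-id=Pyramids --train` before pressing Play in the editor (flag names also differ between ML-Agents releases, so check the version you installed).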
Feel free to play around with properties like strength, gamma, encoding size, and so forth. There is a great official Unity blog post about GAIL and its parameters here.
Now press Play again, and you will notice a difference in how your agent plays. The agent behaves differently because it now imitates some of the actions that you recorded. I leave you with the following GIF of how my agent played.
Congratulations, you were able to record yourself playing a game in order to help an AI agent navigate a very complicated task 🥳🥳🥳.
I plan to do another tutorial later on about how to create an agent brain from scratch, as well as how to submit your trained models to an AI model aggregator that is coming soon. Thank you for your time, and I hope you found this helpful.
Links mentioned for quick access.
Training your agents 7 times faster with ML-Agents
https://blogs.unity3d.com/2019/11/11/training-your-agents-7-times-faster-with-ml-agents/
Great Intro to Reinforcement Learning
https://www.kdnuggets.com/2018/03/5-things-reinforcement-learning.html
Montezuma's Revenge explanation and why it's so complicated to solve
https://www.theverge.com/2016/6/9/11893002/google-ai-deepmind-atari-montezumas-revenge
Imitation Learning Post from SmartLab
https://smartlabai.medium.com/a-brief-overview-of-imitation-learning-8a8a75c44a9c
UnityML GitHub