I randomly found THIS blog online and got very inspired by it.
I had already created a basic neural network.
So I quickly programmed the typical snake game mechanics in Unity and then created a simple player controller to verify that everything was working as expected.
After that, I had to figure out how the network should "perceive" the world around it.
I decided to go for four inputs which all asked the question:
Is the path ahead (1 square) safe? Yes: 1, No: 0
The four directions would be up, down, left and right. However, after trying that I figured out that the direction "down" will always be 0 once the player has picked up the first point/food, because the square directly behind the head is then occupied by the snake's own body. So I scrapped the "down" input and was left with three: up, left and right.
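A minimal sketch of these three safety inputs, written in Python rather than the project's Unity code. It assumes the directions are relative to the snake's heading ("up" meaning straight ahead), a grid with (x, y) cells where y grows upwards, and a body stored as a list of cells. The names `is_safe` and `safety_inputs` are made up for illustration.

```python
def turn_left(heading):
    """Rotate a (dx, dy) heading 90 degrees counter-clockwise (y grows upwards)."""
    dx, dy = heading
    return (-dy, dx)


def turn_right(heading):
    """Rotate a (dx, dy) heading 90 degrees clockwise (y grows upwards)."""
    dx, dy = heading
    return (dy, -dx)


def is_safe(cell, body, width, height):
    """A cell is safe if it lies inside the field and is not part of the snake."""
    x, y = cell
    inside = 0 <= x < width and 0 <= y < height
    return inside and cell not in body


def safety_inputs(head, heading, body, width, height):
    """Return [ahead, left, right], each 1.0 if the next square is safe, else 0.0."""
    inputs = []
    for direction in (heading, turn_left(heading), turn_right(heading)):
        next_cell = (head[0] + direction[0], head[1] + direction[1])
        inputs.append(1.0 if is_safe(next_cell, body, width, height) else 0.0)
    return inputs
```

For a snake in the bottom-left corner heading upwards, only the square to its left is outside the field, so the left input is the only 0.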
I decided to calculate fitness similarly to the reference article. I calculate the distance from the snake's head to the food and determine whether this distance has gotten bigger or smaller with the previous move. I then subtract 1.5 if the distance has gotten bigger and add 1 if it has gotten shorter. The result is stored in a variable, and if that variable drops below -20, the snake is killed. I took this approach because just surviving is not enough to determine how good a network is at playing snake: it actually needs to find food and eat it to get bigger, and not circle around forever.
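The scoring rule above could be sketched roughly like this in Python. The distance metric is not stated in the text, so Manhattan distance is an assumption here, as are the function names. A move that leaves the distance unchanged does not alter the score in this sketch.

```python
def manhattan(a, b):
    """Grid distance between two (x, y) cells (assumed metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def update_score(score, old_head, new_head, food):
    """Reward moving towards the food (+1) and punish moving away (-1.5)."""
    before = manhattan(old_head, food)
    after = manhattan(new_head, food)
    if after > before:
        score -= 1.5
    elif after < before:
        score += 1.0
    return score


def is_starved(score, limit=-20.0):
    """The snake is killed once its score falls below the limit."""
    return score < limit
```

Because moving away costs more than moving closer earns, a snake that circles without approaching the food steadily loses score and eventually gets killed.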
After this I had snakes that would stop running into the edges of the playing field, but they could not determine where the food was and would die out rather quickly. So I gave them the ability to see food that is not just right next to them but up to 5 squares away. The same rule applies as above: a simple yes or no question is asked and 1 or 0 is fed into the network. There are three directions, so there are three more input nodes for this question: Does the path ahead (up to 5 squares) contain any food? Yes: 1, No: 0
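A small Python sketch of the three food inputs, under the same assumptions as before: directions are relative to the heading, cells are (x, y) tuples with y growing upwards, and only squares in a straight line are checked. The name `food_inputs` and the `reach` parameter are made up for illustration.

```python
def turn_left(heading):
    """Rotate a (dx, dy) heading 90 degrees counter-clockwise (y grows upwards)."""
    dx, dy = heading
    return (-dy, dx)


def turn_right(heading):
    """Rotate a (dx, dy) heading 90 degrees clockwise (y grows upwards)."""
    dx, dy = heading
    return (dy, -dx)


def food_inputs(head, heading, food, reach=5):
    """Return [ahead, left, right], each 1.0 if food lies within `reach` squares."""
    inputs = []
    for direction in (heading, turn_left(heading), turn_right(heading)):
        seen = 0.0
        for step in range(1, reach + 1):
            cell = (head[0] + direction[0] * step, head[1] + direction[1] * step)
            if cell == food:
                seen = 1.0
                break
        inputs.append(seen)
    return inputs
```

Together with the three safety inputs, this gives the six values that are fed into the network.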
You can see the result of these networks in the first video linked at the top of the page.
A bit more about how the snake determines its moves: it reads all six input values, which become the neural network's inputs. A simple feed-forward pass then calculates the values of the output nodes. In video 1 there are two output nodes, a horizontal and a vertical output. The algorithm picks whichever of the two has the bigger absolute value. If it's the horizontal value, the snake moves forward. If it's the vertical value, the algorithm checks whether the value is negative or positive: negative moves the snake left and positive moves it right.
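The two-output decision rule from video 1 might look like this in Python. The function name and the move labels are illustrative, and the tie case (both absolute values equal) is not covered by the description, so treating it as "forward" is an assumption.

```python
def choose_move_two_outputs(horizontal, vertical):
    """Pick a move from the two network outputs used in video 1."""
    # The output with the larger absolute value wins; ties go to "forward" (assumption).
    if abs(horizontal) >= abs(vertical):
        return "forward"
    # The vertical output won, so its sign decides the turn.
    return "left" if vertical < 0 else "right"
```

For example, outputs of 0.1 (horizontal) and -0.6 (vertical) result in a left turn, because the vertical value dominates and is negative.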
In video 2 there are three output nodes (I wanted to try this because the main article also had three output nodes). Here the outputs are horizontal, left and right. Similar to the first version, the biggest of the three is determined and the snake is moved accordingly.
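For the three-output variant, a sketch in Python could simply take the largest output. It assumes the outputs arrive in the order horizontal, left, right, and that "horizontal" means moving forward, as in the two-output version. The function name is made up.

```python
def choose_move_three_outputs(outputs):
    """Pick the move whose output node has the largest value (video 2 variant)."""
    moves = ("forward", "left", "right")  # assumed order of the output nodes
    best_index = max(range(len(moves)), key=lambda i: outputs[i])
    return moves[best_index]
```

Unlike the two-output version, the sign of a value carries no meaning here; only which node is largest matters.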
The neural networks train by "breeding" the best survivors. I sort the networks by fitness and select the best 4 to breed:
1 and 2, 2 and 3, 3 and 4
1 and 3, 2 and 4
1 and 4
This creates 6 offspring, for 10 networks in total in the next generation.
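The selection and pairing step can be sketched in Python as follows, with breeding done by averaging the weights of the two parents. Representing each network as a flat list of weights is a simplification, and carrying the 4 parents over unchanged is my reading of how 6 offspring add up to 10 networks. The function names are illustrative.

```python
from itertools import combinations


def breed(weights_a, weights_b):
    """Create a child whose weights are the average of both parents' weights."""
    return [(a + b) / 2.0 for a, b in zip(weights_a, weights_b)]


def next_generation(population, parents=4):
    """population: list of (fitness, weights) pairs. Returns the new weight lists."""
    ranked = sorted(population, key=lambda item: item[0], reverse=True)
    best = [weights for _, weights in ranked[:parents]]
    # Every unordered pair of the 4 best networks breeds once: 6 pairs in total.
    children = [breed(a, b) for a, b in combinations(best, 2)]
    return best + children
```

`combinations(best, 2)` yields the same six pairings as the list above (1-2, 1-3, 1-4, 2-3, 2-4, 3-4), just in a different order.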
"Breeding" simply calculates the average values for all weights in two neural networks.
Please reach me at email@example.com if you want to get in touch.