28/01/2008 - So far I have finished random bot and am now giving thought to how to start the first iteration of my GoBot.
01/03/2008 - Implemented first stage of back-end, Simple 1-ply capture/save MGU, started work on influence MGU.
The overall design of my GoBot is to have a collection of move generation units (MGU) that will each generate one or more moves (must generate at least one even if it is random) with an associated value. This value is intended to be an estimation of how many points are generated by the move. It can actually be a negative value. The values for the moves are then weighted based on trust of the MGU. The move generation units can have a dynamic trust value, for example a MGU that attempts to maximize influence on the board will have a larger trust in the beginning of the game and it will lower as the game goes on. The output of one or more MGUs can be the input to another MGU. For example there could be an MGU which looks at all generated moves and does a large ply tactical search to see if the move may actually die in the end and will then spit out a negative value associated with the move.
I have chosen this method for two main reasons: first, it allows me to easily improve the GoBot by adding more MGUs so I can start quickly with a crappy GoBot and improve over time; second, each MGU has flaws and I am not a good enough player to find all the exceptions so I hope that by having a large number of MGUs and the ability to train the weighting scale I will mitigate the flaws.
I am going to need to build the core of the MGU engine. I am currently planning a tree network with each node spiting out a linked list of move/value tuples. I have two ideas of how to make it work: first, each node prunes a list of moves so the base nodes receive all legal moves and then prune all moves that they see as being meaningless and then pass on the list to the following node to be further pruned; second, each node generates moves internally and isn’t limited to just in the input moves. While the first seems like an interesting design it doesn’t allow the same simple training as the second.
Each link will be given a static weight. All dynamic weight functions will need to be done as a node itself, presumably the top node. The highest value on the final nodes list will be played, if all values on the list are negative then the GoBot will PASS.
The back-end will need to handle passing the messages of each node through to the top node. I am currently unsure how the design will look. I am considering a calling the upper most node which will then call each linked node which will call each linked node etc. until the base nodes which evaluate. This allows each node to be a class with links to connecting nodes and an internal weight value for the links. This makes implementation easier but makes it more difficult to train as the network is generated on runtime and could theoretically be dynamic also each node could be called more than once if it is a child of more than one node wasting processing time. The other option is a static tree that is stored in a file with weight values.
I would like to have the moves have an identification of what node they came from but I am not sure if this is a good idea. It may make better and more versatile code if I limit each node to only have a list of move/weight tuples and no idea how they were generated.
Implemented a simple version of the backend. Only one level of nodes. All nodes go into the master node which merges all the move/value pairs and plays the highest one. This should be enough for now. Though slower (more redundant calculations) it is simple to work with and I can optimize later. I plan to implement a simple liberty optimizer (moves have a value proportional to size of group, and how much it increases the number of liberties by) and an influence MGU (value determined by increase to total board influence). Perhaps a few versions of these. Once done I’ll run the node weighting though a GA to see how well it can play against amigo.