These notes were posted on the GnuGo mailing list the 30th of July, 1999

These are ideas I looked into many years ago when wanting to write a semei module. Basically, the interesting features on the board is the mapping from position to worms, the set and number of liberties of a worm, and potentionally the set of neighboring worms. As these features are costly to calculate we want to maintain them incrementally. Below I'll just focus on the worms themselves, but liberties and other can be treated similarily.

As we know, a worm at a given position is characterized by the maximal reachable subgraph from that point such that all points are occupied by the same color. The effect of placing a stone is one of three: the creation of a new worm, the extention of an existing worm, or the merging (connection) of two or more worms. There is a very efficient algorithm with O(log* n) complexity to do this incrementally, known as the union-find algorithm. I'll attempt a description here, but this is certainly described better in the literature.

The union find maintains an equivalence relation on some discrete set S (isomorphic to a range of integers) with two operations, union and find. Find returns a cardinal element (a representative for the equivalence set), thus two elemets e1 and e2 are equivalent iff find e1 == find e2. Union merges two equivalences into one, taking as argument two arbitrary elements from each set. Initially every member is related only to itself, that is, find is the identity.

The efficient way to implement this uses an array to represent inverse trees. The array maps an element to another element from the equivalence. The cardinal elements are fixpoints, ie they map to themselves. Find thus only need to search through this chain of elements until it reaches a fixpoint. Union is also simple: just find the cardinal elements of one (or both) of the sets and set one to point to the other.

The efficiency comes from a slight modification to find. In addition to finding the cardinal elements, it afterwards backpatch all the other elements to point directly to the cardinal elements, such that future finds on these will terminate immediately. With this modification find gets an amortized "near-constant" (O(log* n)) cost.

It is an easy extension to get efficient access to all the members of an equivalence, just link all the members in the equivalence in a circular list maintained by union.

Another attraction of this algorithm is the simplicity of the code. It can be done in only a handful of easily understandable lines of C. The problem with union-find is that splitting an equivalence has linear complexity. In the case where you only want to undo a union operation this can be made reasonable by saving the set of elements from the smaller of the two sets (I used a sligthly different approach, see Tinygo2 for the details).

Unfortunately the union-find doesn't help much if you want to maintain the enclosed region of empty intersections (it is the reverse problem).

Tinygo2 (I didn't name it that) doesn't follow completely the thoughts above. The reason is that I reasoned (perhaps wrongly) that accesses would be so much more often that it would pay to do the patching of find up front and rendering the find operation a trivial array lookup. In addition, tinygo maintains the liberty sets and counts. Here's a local copy of Tinygo2.

I'd be happy to have comments, but please remember that the code is old :-) For example, today I would use the point representation I suggested earlier today.

Cheers, Tommy