Now With Line of Sight

Over the weekend I updated my bot to use line of sight when deciding what each ant should do. This fixes some of the worst behaviors my ants had, but it’s not perfect. The ants no longer crowd up trying to get to food or a hill they can’t reach, but they’re still not terribly bright. I had to turn off the food reservation system because it wasn’t intelligent enough and the side effects were worse than the benefits.
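
The check itself is simple in principle. Here’s a minimal sketch of the idea, not my actual code (tile-at and squares-between are hypothetical helpers): walk the grid squares on the straight line between an ant and a target, and drop the target if water blocks the way.

;; Sketch only: tile-at and squares-between are made-up helpers.
(defn line-of-sight? [world from to]
  (every? #(not= :water (tile-at world %))
          (squares-between from to)))

;; Only chase food the ant has a clear line to.
(defn visible-food [world ant]
  (filter #(line-of-sight? world ant %) (:food world)))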

So how smart is my bot now? I’m not sure. It should be better, but the AI Challenge keeps having server problems. My bot had to wait most of the weekend for games to start being played again, and they seemed to have another problem today. Because of this, games aren’t happening very fast at all, so my bot hasn’t had the chance to move up to its true rank. The AI Challenge forums have posts about alternate servers you can test your bot on; I may have to resort to that.

I’m a little tired of working on the bot at this point, even though I can think of quite a few things I should do. After all the time I’ve put in lately, it would be nice to see my bot reach its potential (or at least get back to where it was before the last change). The slow game rate was very demotivating.

Getting Better At Clojure

My bot topped out around 1000th place. After a while it started to lose, which wasn’t that surprising. For various reasons, my bot is very bad on the maze maps and those seem to come up the majority of the time.

I spent the week trying to make my bot more intelligent. I’ve wanted to use gradients to give my ants marching orders, so I worked on code to do it. I wrote it over two days or so and rewrote it at least twice before I tested it, as I kept rethinking the algorithm. I was really pleased when, except for one mistyped variable name, it worked the first time.
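
To give an idea of what I mean by gradients, here’s a minimal sketch of the approach, not my actual code (passable-neighbors is a hypothetical helper): flood fill breadth-first outward from every food square, recording each square’s distance to the nearest food. An ant then just steps onto the neighboring square with the lowest value.

;; Build a map from square to distance-to-nearest-food.
(defn food-gradient [world food-squares]
  (loop [dist (zipmap food-squares (repeat 0))
         frontier food-squares
         d 1]
    (if (empty? frontier)
      dist
      (let [nexts (distinct
                    (for [sq frontier
                          n (passable-neighbors world sq)
                          :when (not (contains? dist n))]
                      n))]
        (recur (into dist (map vector nexts (repeat d)))
               nexts
               (inc d))))))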

I started testing how fast it was (since bots in the AI Challenge are limited to 3 seconds per turn), and it was great on tiny test cases. When I went to test it on random data the size of the large maps, though, it started crashing. I spent a bunch of time chasing down a StackOverflowError. For quite a while I was sure it was a bug in my code, but it turned out to be a known issue in Clojure; my code was technically correct. If you nest LazySeq inside LazySeq inside LazySeq over and over, then when Clojure finally goes to realize a value the call chain can run so deep that it exhausts the stack. The solution is to force evaluation to strip out the laziness.
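
To make the failure concrete, here’s a stripped-down illustration of the pattern (not my gradient code): each pass wraps the previous lazy seq in another lazy map layer, and realizing the final value has to unwind every layer at once. Forcing each layer with doall keeps the stack shallow.

;; Each iteration adds another lazy layer around the seq.
(defn layered [n]
  (reduce (fn [s _] (map inc s)) (range 10) (range n)))

(first (layered 100000))
;; => StackOverflowError: realizing the outer seq has to recurse
;; through every unrealized layer beneath it

;; Forcing each layer as it is built avoids the deep recursion.
(defn layered-forced [n]
  (reduce (fn [s _] (doall (map inc s))) (range 10) (range n)))

(first (layered-forced 100000))
;; => 100000, no crash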

Unfortunately, things are still way too slow. Doing a full rebuild of the gradient every turn is never going to work, so I need to make it do partial updates each turn instead. While doing a little research I found that Clojure has a hidden queue type, which is drastically faster at removal (about 60x in a simple test case) than my list-based solution.
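
For anyone hunting for it: the type is clojure.lang.PersistentQueue. It has no literal syntax and no constructor function, which is probably why it feels hidden; you build one up from the empty queue.

;; The empty queue is the only entry point.
(def q (into clojure.lang.PersistentQueue/EMPTY [:a :b :c]))

(peek q)    ;; => :a, the front of the queue
(pop q)     ;; => the queue (:b :c); removing from the front is cheap
(conj q :d) ;; => the queue (:a :b :c :d); conj adds at the back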

In the meantime I have made my bot smarter. I’m hoping my bot is now up to tired toddler, but we’ll see as it starts to run games. I added a food reservation system to prevent tons of ants from piling up trying to get to a single piece of food. I also added some code to help with exploration, but I had to disable it because it caused a weird side effect I didn’t feel like debugging tonight: it actually caused more jams.
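
The reservation part is just bookkeeping. A rough sketch of the idea (closest-unclaimed-food is a made-up helper): each piece of food can be claimed by at most one ant, and every other ant pretends it isn’t there.

;; Assign each ant at most one unclaimed piece of food.
;; Returns a map of ant -> food square.
(defn assign-food [world ants]
  (first
    (reduce (fn [[orders claimed] ant]
              (if-let [food (closest-unclaimed-food world ant claimed)]
                [(assoc orders ant food) (conj claimed food)]
                [orders claimed]))
            [{} #{}]
            ants)))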

Speaking of jams, while watching my last replay I noticed an odd case. The ants are programmed not to go backwards if possible. When a group of ants gets boxed into a concave area of the map and the outside ants push in, the ants won’t want to go backward and end up locking the whole group in place. I did shuffle some code around, so this may no longer be an issue. I’ll just have to keep an eye on it.

Ants Gain Courage, Film at 11

I’ve uploaded a new version of my bot to the AI Challenge site. My bot was, I believe, the 10th best Clojure bot before I updated my code and wiped out my rating. The new code probably won’t play its first game until early tomorrow morning. I put in a fix for the possible issue of my ants failing to defend their hills: my ants now gain courage when they can see one of my own hills.

The problem with ants getting trapped that I mentioned before actually bit me. In my bot’s 5th game, its first move was into such a corner. As a result, it did precisely nothing the entire match, until an enemy ant kindly came along and wiped me out.

At least that’s fixed now. New intelligence rating: very tired toddler.

AI Design Flaw

My bot has played two more games, and legitimately won both. But while I was thinking last night I came to a bad realization: since my bot runs from enemy ants, it will run away instead of protecting my hills. This could allow a single enemy ant to capture a hill that has lots of my ants around it.

This hasn’t happened yet. I’d better put a fix in before it does.

My AI Challenge Ants are Alive

My entry for Google’s Fall 2011 AI Challenge has been posted. In its first game it pulled out a decisive victory, in that it was gathering food but none of the other bots were. Still, it’s up there.

Current intelligence rating: badly sleep deprived toddler.

Right now, the logic is pretty simple. For each ant it works through a list of objectives, making the first move that an objective returns. It tries to raze opponents’ hills and gather food, all while running away from enemies. If it can’t do either of those, it basically makes a random move. There is code to prevent the ants from cycling between two squares, but it has an unintended consequence: the ants are never allowed to go back to the last square they were on. This leads to ants trapped on little islands surrounded by water (there are five such trapped ants in the shot above).
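
The objective chain maps naturally onto Clojure. A sketch of its shape (the objective names are hypothetical, not my actual functions): each objective takes the game state and an ant and returns either a move or nil, and some hands back the first non-nil answer.

;; Try each objective in priority order, taking the first move offered.
(defn choose-move [world ant]
  (some (fn [objective] (objective world ant))
        [raze-enemy-hill ; head for a visible enemy hill
         gather-food     ; head for nearby food
         flee-enemies    ; run away from enemy ants
         random-walk]))  ; otherwise, wander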

On the whole it works well. Since my colony spends so much time gathering food, it grows fast. The cowering I built into them helps keep them alive, and the random walking combined with the food seeking provides some exploration. In the end, the colony can grow pretty large and they tend to stay clumped up. I’d imagine I’m no match for the good bots, but I should be able to survive OK.

As I post this, there are only 25 Clojure-based bots. I’m hoping to make a good showing on that list. That’s my main goal.

Trying AI Challenge Fall 2011 with Clojure

About two weeks ago a post popped up on Reddit announcing that the Fall 2011 Google AI Challenge had opened. This time, you command your own little ant army in an attempt to beat out the other colonies on the map through attrition (total or partial). I’ve been interested in Lisp and functional programming for a few years now, with a special eye towards Clojure. My interest had been rekindled after reading Writing Tetris in Clojure, and once I saw that Clojure was a first-class citizen in the contest, I was off.

I don’t think Clojure 1.2 was out the last time I played with it, so there are some small changes to the language. I’ve been using the La Clojure plugin for IntelliJ as my IDE, and I’m trying to learn the Leiningen tool for Clojure project management. To top it all off, I’m using git instead of SVN to track my changes. This means I’m learning quite a bit, but things aren’t perfect.

La Clojure doesn’t know anything about Leiningen, and I had to search for a while to find a good tutorial on Leiningen. Most seem to be of the form “Download, run this command, you’re all set,” which doesn’t give you much information. The blog entry I finally found was more useful, but I lost the link (it’s somewhere in my history).

Debugging Clojure is just as frustrating as I remember. There is no real debugger; you’re basically on your own. You can try to use a Java debugger, but stepping through compiled Clojure makes little sense. There is a package of debugging macros for Clojure that is supposed to make things a little better, but I haven’t used it yet. This is all complicated by the fact that the AI Challenge tools run your program for you in a little sandbox, so I’m not even the one spawning the process. I’ve fallen back on the tried-and-true method of printing out values I’m interested in, which isn’t ideal. Leiningen supports running tests, so as I start to get a grip on my program I’m going to have to try writing some soon.
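
In the meantime, a tiny macro takes some of the sting out of print debugging. This is a common community pattern rather than anything from my bot: it prints an expression and its value to stderr (stdout is reserved for the bot’s orders in this contest) and returns the value, so it can wrap any expression in place.

;; Print an expression and its value to stderr, then return the value.
(defmacro dbg [x]
  `(let [v# ~x]
     (binding [*out* *err*]
       (println '~x "=>" (pr-str v#)))
     v#))

;; Usage: (move-ant (dbg (closest-food world ant)))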

I took the Clojure starter bot and split it out into a couple of files to help me keep the code organized. At this point, each ant moves towards the closest piece of food, or randomly if it can’t move in that direction. The ants aren’t intelligent and they like to run into each other. But I’m making progress. It took me quite a while to sort things into files, but only an hour or two to do some additional refactoring and add the basic food seeking.
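
The food seeking is about as naive as it sounds. It’s roughly this shape, where distance and direction stand in for the starter bot’s helpers (treat the names as approximate):

;; Head toward the closest piece of food, if there is any.
(defn food-move [world ant]
  (when-let [food (seq (:food world))]
    (let [target (apply min-key (partial distance world ant) food)]
      (direction world ant target))))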

Clojure can still stymie me, though. The other day I was stuck with my program just exiting after the setup phase. There was no error message; it would just quit. It came down to some changes I had made. I took a piece of code that looked like this:

(loop [blah blah]
	(if (test? program-exiting-condition)
		(do-something-and-return)
		(more-calculations-and-recur)))

and I removed the part I didn’t need, the (do-something-and-return), replacing it with a comment to put code there later. This effectively turned the code into this:

(loop [blah blah]
	(if (test? program-exiting-condition)
		(more-calculations-and-recur)
		)); Fall through and exit program

That took me way too long to find. Once I figured out where the problem was, it took me a few minutes to get back into the Clojure mindset and remember why that wouldn’t work: deleting the first branch promoted (more-calculations-and-recur) into the “then” slot, so the loop recurred only when the exit condition was true, and on every normal pass the if returned nil and the program exited.
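
For the record, the single-branch version I actually wanted looks like this; when-not makes the fall-through explicit and keeps the recur in tail position:

(loop [blah blah]
	(when-not (test? program-exiting-condition)
		(more-calculations-and-recur))) ; returns nil, and exits, only when the exit condition is true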

Don’t use FileMerge with 25MB files

How long does it take to run OS X’s FileMerge on a pair of 25MB files? Somewhere between 20 minutes and 2 hours. I left my computer after 20 minutes since it wasn’t done, but it was when I came back. At this point, I’m convinced I don’t need to worry about using PyPy with Skeinforge, so I can start saving time. Now if I could only quadruple the speed of my ToM…

Differences in PyPy generated gcode (Updated!)

Comparison of CPython and PyPy fill patterns

As I said in my last post, the gcode generated when using PyPy instead of CPython can differ, so I’ve been trying to find out when. I already knew that the standard 20mm calibration cube had no differences, and the same seems to be true of a simple cylinder generated with OpenSCAD. So next I generated a cylinder intersecting a box, and the differences appeared. It’s in the picture below, but here is the code:

$fn = 25; // so the cylinder is reasonably smooth

union() {
	cylinder(30, 12, 12);

	translate([5, -5, 25]) {
		rotate([0, 30, 30]) {
			cube([35, 35, 5], true);
		}
	}
}

So what’s different?

The first thing I noticed when running a diff between the files is that the bounding boxes for layers can be listed in a slightly different order. For example, while CPython will generate this:

(<boundaryPoint> X26.813 Y-12.614 Z16.375 </boundaryPoint>)
(<boundaryPoint> X9.313 Y17.697 Z16.375 </boundaryPoint>)
(<boundaryPoint> X5.232 Y15.341 Z16.375 </boundaryPoint>)
(<boundaryPoint> X22.732 Y-14.969 Z16.375 </boundaryPoint>)

PyPy will put the first line fourth. It’s the same bounding box, but it shows up in a diff. More importantly, PyPy also makes different decisions when it comes to fill. My profile is set to 20% fill, and you can see the differences in the image above. Neither implementation’s output actually seems to suffer. In the case above (a few layers after the intersection starts), PyPy uses fewer straight lines, but further up the object CPython puts in more angles. Things are never too different, though; the two stay pretty close. Maybe this isn’t something to worry too much about. I’ll have to spot-check one of the more complicated objects to see if an obvious problem occurs.

Update:

After posting this I received this email from Carl Friedrich Bolz, one of the PyPy developers:

[I] wanted to point out two of the most likely sources of differences (I’m a PyPy developer):

– dictionary order is not guaranteed between implementations
– CPython probably uses x87 floating point math, whereas PyPy uses SSE2. those two have slightly different rounding behavior, so if the algorithms are not numerically stable, you can get diverging results.

if you find out that it’s not one of those two, it might be a bug in PyPy.

The SSE issue was actually one of my theories for what was going on. Floating point rounding would explain what Skeinforge is doing: on my example layer above, PyPy must have thought it was a hair under on plastic, so it added some extra kinks in the fill lines to make up the difference. The dictionary ordering explains the outline coordinates.

I haven’t gotten around to spot-checking the larger object, but between my results above and confirmation that PyPy handles floating point numbers differently, I think I’m going to start using it permanently with Skeinforge. The 4x speedup is really wonderful.

Trying PyPy with ReplicatorG

PyPy + ReplicatorG

While I’ve been having a ton of fun with my MakerBot Thing-O-Matic, generating the gcode to run the machine takes an amazingly long time, even on a quad-core Core 2 running a hair over 2.5GHz. Skeinforge is just not efficient: Python isn’t fast at that much number crunching, and the process that turns the 3D model into a series of 2D layers is almost totally serial.

I’ve been using a program called Pleasant3D to look over the gcode Skeinforge makes and check for errors. While reading about the software, I found a blog entry by the author describing how he got tired of waiting on Skeinforge, ported the gcode generation algorithms to Objective-C, and ended up with a massive speedup (up to 230x in one module).

Now, I don’t want to get away from the official MakerBot way of doing things yet; I’m not that confident. The easiest way to speed up Skeinforge would be to speed up Python. People used to recommend Psyco, but it is no longer updated and doesn’t run in 64-bit mode (which is how Python runs on my Mac). I finally got around to playing with PyPy, which grew out of Psyco, to see just how much of a difference it made. I’d seen references to other people doing this, so I knew all the libraries had to be there.

Installing PyPy & TkInter

Installing PyPy was as easy as downloading a pre-built binary distribution and unzipping it. The problem is that to use PyPy with Skeinforge (and therefore ReplicatorG), you need to have TkInter installed. I found a great set of steps to follow in a PyPy blog entry about using the IDLE IDE. The only problem was that the TkInter package for PyPy looks for Tk 8.4, but OS X Lion ships with 8.5. Luckily, since it’s a point release, I figured I could just swap them out: I updated the files in the tkinter-pypy package and it worked fine (updated zip for download). Making ReplicatorG use PyPy is as easy as selecting it in the preferences window.

Benchmarking

So I ran some benchmarks, running a couple of models through with both CPython and PyPy using my normal profile, no support:

Model                      CPython       PyPy      Speedup
20mm calibration cube      19s           13s       1.46x
Vica illusion sculpture    5m 36s        1m 48s    3.11x
V8 cam shafts              6m 37s        1m 18s    5.09x
Tank!                      2h 45m 29s    38m 45s   4.27x

As you can see, PyPy can be dramatically faster. I chose the Tank! model (which, unmodified, is about 4x the max print area of my ToM) to see just how big the difference would be on very complex models. Getting a 4x speedup is a great result, but there is just one minor problem: the gcode is different. Generating gcode should be deterministic; given the same profile, the same model, and the same version of Skeinforge, you should always get the same output.

In the case of the 20mm calibration cube the gcode is identical, but for every other model it’s different. The F parameters on the move commands are different, and sometimes the points themselves seem different. I haven’t looked at it too hard yet. It seems like it might be due to floating point differences, but PyPy is supposed to be a drop-in replacement for most Python programs, so this shouldn’t be happening.

I’m going to have to look into it further.