The Lumberjacks are Winning

I’m still working on ForestryService. I spent two hours last night trying to convert it to using a mutable vector instead of a list to keep track of the forest, but I couldn’t get all the monad stuff sorted out.

Tonight I switched to a non-mutable vector and had it running in about 10 minutes. When I turned optimization on the program flew until it got to image generation, which is still an issue. So with that solved I decided to try to track down my lumberjack bug.

It didn’t take too long. I added printouts to see how many lumberjacks there were at the start and end of each turn and quickly found the culprit. When a bear and a lumberjack collided and a mauling occurred I had the test backwards. I was removing the non-mauled lumberjacks and keeping the one who was supposed to be removed due to injury. One character change.

But that led to another issue, one which I thought might be a problem. We can end up with a couple of bears, but the number of lumberjacks quickly explodes. Within 25 years I have 1400 lumberjacks. At that size the list operations on lumberjacks (specifically finding if a square if free of other lumberjacks) absolutely dwarfs all other calculations in the program. I need to store them (and while I’m at it the bears) in sets. That would change my search from linear to log and should fix the issue. It will also make some of the tests cleaner since I don’t have to find intersections manually.

After that it’s back to the image code. I’m not sure if the problem is my pixel generation or the image library it’s self. I think it may be the integer division I’m doing to implement a basic scaling algorithm, but I haven’t profiled it since I switched to vectors.

Also I fixed all the spelling errors in the file. I can’t spell “forest.” I fixed it in the file and repo name before I first pushed ’em to github, but I didn’t touch all the variables and methods before.

A Week of Haskell. Also… Ruby

Over the last week and a half I’ve spent quite a bit of time writing Haskell programs. I needed ideas for what to do so I started giving some of the challenges in /r/dailyprogrammer a try.

First Challenge – Lines

The first one was LineIntersections.hs, which I posted as a GitHub Gist. It was for the #163 – Hard challenge. This was mostly practice since it didn’t require much in the way of new skills compared to my WordLadders program. It took me a few tries to find a good line intersection algorithm that would tell me where the lines intersected and to get that implemented.

Second Challenge – Termites

Things got much more complex with #163 – Intermediate, my solution is called BunkerMaster. You’re given a little map and you have to help humans build walls around their living area to keep out giant killer termites that come out of a nest.

Most people were doing an A* algorithm repeatedly to keep closing off the paths that the termites could taken. This often minimized the number of walls used but also tended to minimize the human’s area due to the way people implemented it. I’ve written A* algorithms before, in a way that’s what WordLadders used. I wanted to try something different so after a while I came up with the idea of flood-filling the map from both endpoints and putting walls wherever things met.

To do this I marked two boolean bits on each square of the map to note if the humans or termites has reached that square. As soon as both bits were set we knew we reached a place that needed a wall. I would end up with more walls than necessary but the humans and termites has similar amounts of territory on ‘fair’ maps.

The biggest problem I had once my solution was done was that because I was processing both opponents each turn if they met in the middle you can end up with double walls. It could be fixed but it didn’t seem worth it.

Third Challenge – Befunge

I really had to press myself for the third challenge. #164 – Hard was a challenge to create a Befunge 93 interpreter. If you’ve never heard of it before it’s an esoteric language designed to be nearly impossible to compile by using a 2D grid of mutable characters as instructions and non-stack storage. Here’s an example that prints 99 Bottles of Beer:

v v0\1:\1:{CODE}{CODE}\!:p15-<     Bottles of Beer for Befunge
0 \ {befunge} >" ekaT">v   written by Brian Raiter, 5/97
0>>.0"llaw eht no "v<#,:
"\,     >"eno"^>0 #"^1^_v
c1,>51g#^_"ti"^. >vr :  $
"::^" down, pass "<e    5
>^|\*25,<^     # i e ^g1< 
  ,>052*":dnuor t"vbv:<
v0_^    .      , ^< " "
>52*".ll"v   >,^   fb e
v<v<v_$!|>"aw eht no r"v
""" ,:  >" ;"^      f ^<@
meo >^"bottle"<    "o   $
o m^"re:"<v  _^#g15<  v_^
s""  >52*^>"s"^v"les"<,:
"^<  ^"99 bott"<    >^>^<
>" yub ,erots eht ot oG"^


So if writing an interpreter (which I’ve never done) for an odd language (which I didn’t know) wasn’t enough, I pushed myself by using the State monad (actually StateT stacked on IO) to keep track of and modify the interpreter state.

My interpreter is called BefungeIt and ran the test programs I found. It’s not 100% compatible since I don’t limit the size of the stack and I don’t think Befunge is supposed to have 32 bit stack values.

Figuring out the state stuff took quite a few tries. My worst bug took me maybe 20% of the total programming time to fix. When I was supposed to pop two values off the stack and then save the result in a memory cell I was accidentally *restoring* the stack values at the end of the function because of the way I modified the state. This caused any program that used the ‘p’ instruction to go haywire, but after stepping through things (and learning a fair amount of Befunge along the way) I figured it out. In the end the state monad was quite useful.

Fourth Challenge – Forestry

Now I’m working on ForestryService, which I started for #165 – Hard which asked for an ecology simulator.

I started this using the StateT stuff from my interpreter to carry the state of the world. I’m learning quite a bit here as I’m using things like sequence, mapM_, and more.

I went crazy with this too, because what fun is a step-by-step ecology simulator if it doesn’t output graphics? So I’m doing something I’ve never done in any programing language: output an animated gif of the world changing over time.

In fact that’s the slowest part of the program by far. Having to get the state of each of the 10k cells each frame is where most of the time (and memory) are being spent in my program. At the moment I’m doing the easy thing and using lists of lists to store everything and that’s not nearly performant enough. I plant to try replacing that stuff with a mutable Vector, but I haven’t gotten to that yet.

I’ve had to fix quite a few bugs. Things are supposed to be able to move to any of the 8 neighboring cells, but a mistake in my list comprehension meant they would only walk diagonally. That was harder to find that it should have been because I forgot to call the functions that reset how far characters could walk each turn so they would only walk once ever.

Right now I know of one bug (besides performance): something is causing a plague to kill my Lumberjacks. I haven’t figured out what’s happening but over the course of a year I can lose hundreds of ’em when no more than 1 should disappear. This is probably a test somewhere that’s not restoring values correctly, but it needs to be debugged.

When I’m done with this (or at least further through it) I think I’m going to ask /r/haskell about it. All the state transformation code looks way too imperative to me. I know that’s supposed to be a benefit of the state monad but I’m worried that I may have taken it too far.

I ran a quick profile on the code to see what was slow (any other hotspots are currently dwarfed by the gif creation, which was pretty obvious just from running it), but I was surprised how much faster the code got when compiled with -O2. Since I spent so much time in Java I’m not used to thinking about having to turn on a special optimization mode, that’s done automatically by the JRE. It’s been a while since I worked with a compiled language (well the Objective-C stuff I did didn’t need that kind of performance tuning).

Then There’s Ruby

Playing with Haskell has had quite an effect on my brain. I enjoy strongly type languages and have been LOVING ghc’s ability to find errors and suggest solutions. I’ve found myself doing things in Java at work and thinking “Haskell would have caught that”.

Well in the last few days I’ve been working on a little Ruby program/set of scripts at work and… it’s different. After my little Haskell immersion week going to an ultra-loosely typed language like Ruby is a huge swing of the pendulum. Besides simple parameter checking Ruby doesn’t seem to be too sharp at parsing. As an example I was missing an “end” in my file and instead of indicating the problem at the end of the file or in the actual problem area it indicated a problem with a requires statement at the top of the file (which was a total red-herring).

I’ve looked at Ruby before but I’m not very familiar with it and I keep getting surprised at how many options there are to do some things. Often there are multiple methods that are just aliases for each other (such as Array.length, Array.size, and Array.count), which makes me wonder a bit. I’m also having problems with documentation I’m finding online. While the core documentation is pretty good 3rd party gems often have highly questionable documentation that was clearly auto-generated and doesn’t provide any information that’s not in the method signature. I already know that a method takes 2 parameters and returns an Object; I want to know if that’s a String or a MyStruct or a whatever.

Bits of Ruby style are also getting to me. I’m so used to camelCase that I keep having to catch myself to match the use_of_underscores that seems to be the common Ruby style. I’m not sure about some other stylistic points, such as when I should use parenthesis in method calls or if people put “then”s on their “if”s.

I was quite happy to see characters on the end of method names to indicate if they’e a predicate (such as .nil?) or if they have side effects (such as .map!). I was really impressed with that pattern when I first saw it in Lisp/Scheme and it’s nice to see another language use it.

I’m having problems finding advice on Ruby online though. For example I wanted to print a number in Hex and did a quick Google search. Numerous sources told me that num.to_s(16) would do it, but Ruby complained that to_s didn’t take parameters. I’m not sure if that’s a version issue (I’m using 2.0, but 1.9.x and 2.1.x both seem to say they support it) or what. I’ve also run into a few things questions the solutions are clearly poor and a library found in a later answer is a much better solution. I ran into this finding people suggesting rolling your own command line option parser instead of something in the standard library. I get the feeling this is because of the number of people writing Ruby without much programming experience not knowing how many libraries are out there, much like I remember seeing in PHP.

Despite all this I’m having fun. I love playing around in new languages and I’m getting work done.

But I want to get back to playing with Haskell. I want to help my Lumberjacks.

WordLadders in Haskell

Over the weekend I wrote my first Haskell program. It’s called WordLadders and I’ve spent a total of about 5.5 hours on it.

I’ve been interested in functional programing for quite a while. Last year I started reading about Haskell more heavily, including all of Learn You a Haskell and Real World Haskell. I’d been trying to find a small program that I thought I could actually make on my first try, but I hadn’t come up with any really good answers.

Then on Friday Nathan Smith posted I wrote a Go program for solving “doublets” (AKA word ladders) using A* search algorithm on /r/golang. It uses a dictionary and the A* algorithm to try to connect two words through a list of real words, changing only one letter at a time. So if you entered “goat” and “toad”, the path might be goat -> goad -> toad.

Well that seemed simple enough to me. There are only four parts to the program:

  1. Reading in the dictionary file
  2. Creating a graph that links the words together
  3. Finding the shortest path between two words with A*
  4. Handling user IO (getting words, showing answers)

That’s the order I ended up writing the program in. Haskell has a reputation that “If the code compiles, it works (almost all the time)“, and that was my experience. My program contained only a single bug that the compiler didn’t find. When doing recursion within my A* implementation I was forgetting to remove the node I just processed from the open set, leading to an infinite loop. There’s no way for the compiler to catch that, but it caught everything else.

The actual process of writing the program was a little rough. I use IntelliJ all day at work and wanted to continue to use it, but the Haskell plugin doesn’t work right now. Development just picked back up after a two year hiatus and the old plugin doesn’t work with my (much newer) version of IntellJ. In the end I ended up using TextWrangler with a simple syntax highlighter and the ghci REPL.

My program works correctly as far as I can tell. It can find the longest word ladder (between ‘charge’ and ‘comedo’) in 8.5s on my 2010 MacBook Pro. It also correctly handles words that can’t be linked (such as ‘apples’ and ‘spoons’).

I’ve posted it to /r/haskell to get some advice on improving my program. There are a few things that I felt I stumbled on. There were also some issues I didn’t fully understand, but as often happens the solutions came to me while I was writing out the questions. Rubber duck debugging strikes again.