Input: A point in the plane, known as the eye, together with a list of points in the plane which form the vertices of a polygon. All coordinates are integers. For example, in the polygon below, the edges coloured red are those visible to the black dot representing the eye:
Output: The set of sides of the polygon which are at least partially visible to the eye (looking in any direction).
The algorithm is based around a function with the following type:
lineSegMeetsLineSeg :: LineSeg -> LineSeg -> Bool
This function returns true or false according to whether or not two line segments in the plane intersect. Given this function, the algorithm for determining the visibility of a given side starts out with a viewport that can see the entire side; for each other side of the polygon it then uses lineSegMeetsLineSeg to check whether that side meets either edge of the viewport and clips the viewport accordingly, taking care to handle end-point intersections correctly. This is accomplished by the following functions:
clipEdge :: Edge -> LineSeg -> ViewPort -> ViewPort
When a polygon side does intersect an edge of the viewport, this function returns a new appropriately-clipped viewport.
clipByOneSide :: ViewPort -> LineSeg -> Maybe ViewPort
This function determines whether a viewport needs clipping and calls clipEdge if necessary.
clipByAllSides :: Point -> [Point] -> Int -> Maybe LineSeg
This function calls clipByOneSide for all polygon sides (other than the side whose visibility we are testing) keeping track of the viewport as it proceeds.
And that's it! If we wanted to speed things up we could also use 2-dimensional back-face culling but there is no need.
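For concreteness, here is a minimal sketch of the kind of test lineSegMeetsLineSeg performs, using signed-area (orientation) tests. This is not the actual fence4 code: the Point and LineSeg synonyms and the handling of touching and collinear cases are just one plausible choice.

```haskell
-- Minimal sketch (not the original implementation): two segments meet iff each
-- one's endpoints lie strictly on opposite sides of the other, with collinear
-- and touching cases handled by an on-segment check.
type Point   = (Integer, Integer)
type LineSeg = (Point, Point)

-- Twice the signed area of the triangle p q r; its sign gives the orientation.
cross :: Point -> Point -> Point -> Integer
cross (px, py) (qx, qy) (rx, ry) = (qx - px) * (ry - py) - (qy - py) * (rx - px)

-- Is r inside the bounding box of the segment p-q?  (Used only when p, q, r
-- are already known to be collinear.)
onSegment :: Point -> Point -> Point -> Bool
onSegment (px, py) (qx, qy) (rx, ry) =
  min px qx <= rx && rx <= max px qx && min py qy <= ry && ry <= max py qy

lineSegMeetsLineSeg :: LineSeg -> LineSeg -> Bool
lineSegMeetsLineSeg (p1, p2) (q1, q2)
  | d1 /= 0 && d2 /= 0 && d3 /= 0 && d4 /= 0 =
      signum d1 /= signum d2 && signum d3 /= signum d4   -- proper crossing
  | d1 == 0 && onSegment q1 q2 p1 = True                 -- end-point / collinear cases
  | d2 == 0 && onSegment q1 q2 p2 = True
  | d3 == 0 && onSegment p1 p2 q1 = True
  | d4 == 0 && onSegment p1 p2 q2 = True
  | otherwise = False
  where
    d1 = cross q1 q2 p1
    d2 = cross q1 q2 p2
    d3 = cross p1 p2 q1
    d4 = cross p1 p2 q2
```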
Although it was a slightly tedious exercise translating the Haskell code to C++, it was interesting to see the sort of C++ code that resulted. It seemed to me to be a good example of how it is possible to write good quality code with almost no object orientation. If nothing else, this exercise convinced me to try harder for referential transparency when writing C++ in future.
In any case for those who might be interested, here is a link to both the C++ and Haskell sources. I think the algorithm is particularly easy to understand in the Haskell implementation.
One final comment on the Haskell. The key function has the following type:
clipByOneSide :: ViewPort -> LineSeg -> Maybe ViewPort
and the visibility problem is solved by folding this function over the polygon sides. Since clipByOneSide returns a value in the Maybe monad, I had to use the monadic version of foldl, namely foldM. However, when debugging I wanted to watch the progress of the clipping as it proceeded, and so I wanted a monadic equivalent of the scanl function. To my surprise, the Control.Monad module (which contains foldM) does not contain a corresponding scanM function. Here is how easy it was to introduce the missing scanM function:
c :: (Monad m) => (a -> b -> m a) -> m a -> b -> m a
c f y z = y >>= ((flip f) z)
scanM :: (Monad m) => (a -> b -> m a) -> a -> [b] -> [m a]
scanM f x l = scanl (c f) (return x) l
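For example (a toy computation of my own in the Maybe monad, nothing to do with the clipping code), the difference from foldM is that scanM also exposes the intermediate states:

```haskell
-- Subtract each number from a running total, failing if it would go negative.
step :: Int -> Int -> Maybe Int
step acc n = if acc >= n then Just (acc - n) else Nothing

-- foldM reports only the final outcome, whereas scanM also shows exactly
-- where the computation failed:
--   foldM step 10 [3, 4, 5]  ==  Nothing
--   scanM step 10 [3, 4, 5]  ==  [Just 10, Just 7, Just 3, Nothing]
```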
Suppose we imagine the viewport triangle and ask ourselves in how many different ways a line can intersect it. (This line represents the line containing a side which might be clipping the viewport.) For now let's exclude the case where the line passes through a vertex of the viewport and see how many cases there are. In the pictures below, E represents the eye, A and B are points on the side to which the viewport belongs, and the "clipping" line is in blue; P and Q are the points where it meets the lines EB and EA respectively:
*(Figures: Cases 1 to 9, one picture for each of the nine intersection configurations.)*
Evidently there are 9 cases, because there are 3 different locations for P (behind E, between E and B, beyond B) and 3 for Q. In fact we can simplify things by introducing the line at infinity and so passing to the projective plane. If we do this then the cases "behind E" and "beyond B" become the same, since the two ends of the line BE are joined up at infinity. Similarly for Q. Thus there are now just 2 × 2 = 4 cases instead of 9 overall: each of P and Q either lies on its edge of the viewport or does not.
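Spelling out the bookkeeping in hypothetical code (the names are mine, not the implementation's), the collapse from 9 cases to 4 looks like this:

```haskell
-- Affinely, each of P and Q falls into one of three positions along the line
-- of its viewport edge, giving 3 x 3 = 9 cases in total.
data AffinePos = BehindEye | BetweenEyeAndSide | BeyondSide
  deriving (Eq, Show)

-- Projectively, "behind the eye" and "beyond the side" are identified,
-- because the two ends of the line are joined at infinity.
data ProjPos = OnViewportEdge | OffViewportEdge
  deriving (Eq, Show)

toProj :: AffinePos -> ProjPos
toProj BetweenEyeAndSide = OnViewportEdge
toProj _                 = OffViewportEdge

-- A generic clipping line is then classified by a pair: 2 x 2 = 4 cases.
classify :: AffinePos -> AffinePos -> (ProjPos, ProjPos)
classify p q = (toProj p, toProj q)
```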
First note that the viewport actually defines a triangulation of the entire projective plane: all we have to do is extend its three sides to full projective lines, and the result is a subdivision of the projective plane with 3 vertices, 6 edges and 4 triangular faces.
All very well, but what's the point? Well, now note that in fact we also get a natural induced triangulation of the dual projective plane with the same number of vertices, edges and faces as the original. This is because three points in general position (i.e., non-collinear) in the projective plane uniquely determine three points in general position in the dual projective plane (i.e., three non-concurrent lines). In other words, it is equivalent to specify a triangle by providing its vertices or its sides. To spell this out, recall that the points of the dual projective plane are lines of the original, that the lines of the dual projective plane are points of the original, and that the incidence relation between points and lines is unchanged. The 4 faces of the dual triangulation correspond to the 4 two-dimensional families of lines that have no degenerate intersections (i.e., do not pass through A, B or E). The 6 edges of the dual triangulation correspond to the 6 one-dimensional families of degenerate intersections, namely lines which pass through exactly one of A, B, E (each such pencil of lines is cut into two arcs by the two side lines through that vertex). Finally, the 3 vertices of the dual triangulation are the 3 lines AB, BE, EA.
Thus the different types of behaviour of the clipping line, including all its degenerate intersection possibilities, correspond to where it sits, as a point in the dual projective plane, in terms of the triangulation of the dual projective plane that is naturally induced by the viewport triangle. I find it pleasing that something which can be annoying, namely the degenerate intersection cases, can be described so concisely.
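In hypothetical code (again, the names are my own and this is only an illustration of the count), the 13 classes are exactly the 4 faces, 6 edges and 3 vertices of the dual triangulation:

```haskell
data Corner = CornerA | CornerB | CornerE          -- vertices of the viewport
  deriving (Eq, Show)

data ProjPos = OnViewportEdge | OffViewportEdge    -- as in the previous sketch
  deriving (Eq, Show)

data SideLine = LineAB | LineBE | LineEA
  deriving (Eq, Show)

-- 4 + 3*2 + 3 = 13 classes of clipping line in total.
data LineClass
  = Generic ProjPos ProjPos       -- 4 faces: the line misses A, B and E
  | ThroughCorner Corner ProjPos  -- 6 edges: through exactly one of A, B, E,
                                  --   meeting or missing the opposite edge
  | IsSideLine SideLine           -- 3 vertices: the line is AB, BE or EA itself
  deriving (Eq, Show)
```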
One final word on the use of the projective plane. Since the problem of clipping concerns only incidence geometry, we are free to employ the full group of projective transformations as a tool to simplify our problem. I didn't bother, but it is easy to find a projective transformation that takes the eye to infinity and the vertices A, B to (0, 0), (1, 0) and the lines AE, BE to vertical lines. Doing this transforms the viewport into a semi-infinite rectangle which can greatly simplify and speed up clipping calculations. I believe this is the standard approach when clipping in three dimensions.
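For illustration only, here is one way such a transformation might be written down. The construction is my own (the post does not give one) and it assumes the eye does not lie on the line AB.

```haskell
-- Sketch of a projective change of coordinates sending A to (0,0), B to (1,0)
-- and the eye E to the vertical point at infinity, so that the lines AE and BE
-- become the vertical lines x = 0 and x = 1.  Purely illustrative.
type Pt = (Double, Double)

toStripCoords :: Pt -> Pt -> Pt -> Pt -> Pt
toStripCoords (ax, ay) (bx, by) (ex, ey) p = (x', y')
  where
    -- Affine part: translate A to the origin, then rotate and scale so that
    -- B - A becomes the unit vector (1, 0).
    aff (qx, qy) =
      let dx = qx - ax
          dy = qy - ay
          ux = bx - ax
          uy = by - ay
          d  = ux * ux + uy * uy
      in ((ux * dx + uy * dy) / d, (ux * dy - uy * dx) / d)
    (e1, e2) = aff (ex, ey)   -- image of the eye; e2 /= 0 since E is off line AB
    (x , y ) = aff p          -- image of the query point
    -- Projective part: fixes (0,0) and (1,0) and sends the eye to infinity.
    -- Points on the line through E parallel to AB have w = 0 and go to infinity.
    w  = 1 - y / e2
    x' = (x - e1 * y / e2) / w
    y' = y / w
```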
Lastly, note that if we really wanted to opt for purity in our algorithm, we could construct one which is told only which of the 13 cases it is in, together with the ordering along the clipping line of the four points P, Q, I, J, where P, Q are as above and I, J are the vertices of the side defining the clipping line. I considered implementing this but decided there were more cases than I wanted to handle, especially as we would need to consider cases when some of the points P, Q, I, J coincide. I believe it could be done elegantly though.
Having already indulged myself with ludicrously expansive comments on the projective plane I will keep it brief and say only that good algorithms would be based around spatial subdivision data structures (e.g., BSP trees with nodes corresponding to lines through polygon sides). The point is that if we subdivide the plane into regions in which the set of sides visible to the eye is constant then we need only decide which region the eye is in to solve the problem. Thus the challenge is to represent this planar subdivision in a way that is optimized for repeated querying. For example, it would be possible to set things up so that query time is determined by the (log of the) size of a BSP tree and not (directly) the number of sides of the polygon.
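To sketch the idea (the types and names below are invented for illustration, not part of any actual fence4 solution): precompute a tree whose leaves are labelled with the set of visible sides, so that each query is just a walk from the root to a leaf.

```haskell
-- Sketch of a precomputed structure for repeated visibility queries.
-- Each internal node splits the plane by a line a*x + b*y + c = 0 (in a real
-- solution, a line through a polygon side); each leaf stores the indices of
-- the sides visible from every eye position in that region.
data SplitLine = SplitLine { la, lb, lc :: Double }

data BSP = Leaf [Int]
         | Node SplitLine BSP BSP

-- The cost of a query is the depth of the tree rather than (directly) the
-- number of polygon sides.
visibleSides :: BSP -> (Double, Double) -> [Int]
visibleSides (Leaf sides) _ = sides
visibleSides (Node (SplitLine a b c) neg pos) p@(x, y)
  | a * x + b * y + c < 0 = visibleSides neg p
  | otherwise             = visibleSides pos p
```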
Back in the good old days of first person shooters when all walls were vertical and you could not look up and down, i.e., when I was idkfa-ing my way through Doom, this was exactly the problem that had to be solved and I believe BSP trees were part of the solution that John Carmack used. Doom's predecessor Wolfenstein 3D employed simpler raycasting-based algorithms and the geometry of the levels was necessarily much simpler (though on the flip side you did get to shoot it out with Hitler at the end of Wolfenstein).
I think it would be fun if the next challenge the USACO website presented to the user who has just finished "fence4" was the multiple-eye-position question with otherwise unchanged parameters but a huge number of different eye positions and a tight running time constraint!