Thursday, November 14, 2013

Ramblings on Haskell's 'cognitive overhead'

Does Haskell have 'excessive' cognitive overhead?

The question about whether Haskell (or a very similar language) might one day become a popular language for solving the increasingly tough programming problems ahead of us (concurrency, distributed computing etc.) seems to come up quite often.

It's an interesting question.  

One of the great things about Haskell is that it does seem to have answers to big problems and those answers can be traced directly back in many cases to the defining fundamentals of the language (universal purity, composability and to some extent laziness).  Pure functional programming tackles many of the ills that bedevil software today.  That ideas from the functional programming world are highly valuable is less and less controversial, as features like lambda functions/closures and even monads are being imported into other languages and their libraries with gusto.  

However, those features alone aren't the sum total of Haskell.   Indeed, Haskell is characterized as much by its powerful type system as by its exclusive use of functions for expressing computation.  Haskell's type system is also differentiated from that of most other languages in that, while static, it is also inferred - so for the most part the programmer never has to write type annotations, even though doing so is considered good practice.  This property is very hard for imperative languages to achieve in general (although languages like C# have introduced limited forms of it).  

I have watched many developers, who have spent most of their careers being trained and practiced in imperative programming, attempt to take to Haskell.  Admittedly most of these developers have not spent much time thinking about and using functions as first class objects, even as these have been added to imperative languages.  That's because most developers I know have spent years with Java, which unfortunately has been a laggard in terms of first class functions.  Nevertheless, most programmers do believe they understand functions when first asked to compare their experience with what they think they'll find in Haskell.  What they may be unaccustomed to doing however, is thinking exclusively in terms of functional composition.  

Invariably, there's a lot to learn when approaching a new language, and even more when meeting a new programming paradigm for the first time.   
Haskell presents a number of potentially avoidable initial challenges to those seeking to learn how to use it effectively, for example:
  • Tutorials are getting much better, but traditionally have been couched in mathematical terms of reference.  Even today, much discussion of Haskell language features and programming issues is conducted in terms that the average software developer will find completely foreign.  Unfortunately, it seems that the established Haskell community enjoys a certain amount of this mystique and even some outright 'occultism' ;-)  
  • Similarly, Haskell's libraries, especially beyond the standard ones, are sparsely documented.  Nor does it help that Haddock generates type signatures for functions but does not include argument names.  Argument names are often incredibly terse in many Haskell libraries and programs, but where they do convey useful information it does not appear in the resulting Haddock docs.  This can make browsing and discovery of library functions more difficult than it should be.  There are plenty of examples of functions with signatures like drawRectangle :: Widget -> Bool -> Int -> Int -> Int -> Int, where the meaning of most of the arguments is completely opaque.  
  • Developers used to IDE support have been somewhat challenged by the state of Haskell tooling, although again this is improving nicely with EclipseFP and Leksah.  Even so, it is remarkably hard to search for functions when you know some arguments or some type that you want.  Hoogle is one tool that purports to do this, but I still find it very awkward and I've been playing with Haskell for years.
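To illustrate the drawRectangle complaint, one remedy open to library authors is to gather positional arguments into a record whose field names document themselves.  The types below are hypothetical stand-ins (the original function belongs to some unnamed widget library); this is only a sketch of the idea:

```haskell
-- Hypothetical stand-ins for the opaque signature above; the record's
-- field names do the documentation that a row of bare Ints cannot.
data Widget = Widget

data Rect = Rect
  { rectX      :: Int
  , rectY      :: Int
  , rectWidth  :: Int
  , rectHeight :: Int
  }

-- Returning a String here just to keep the sketch self-contained.
drawRectangle :: Widget -> Bool -> Rect -> String
drawRectangle _ filled r =
  (if filled then "filled " else "outline ")
    ++ "rect " ++ show (rectWidth r) ++ "x" ++ show (rectHeight r)
    ++ " at (" ++ show (rectX r) ++ "," ++ show (rectY r) ++ ")"

main :: IO ()
main = putStrLn (drawRectangle Widget True (Rect 0 0 10 20))
```

The call site now reads Rect 0 0 10 20, or better still Rect { rectX = 0, rectY = 0, rectWidth = 10, rectHeight = 20 }, and Haddock shows the field names for free.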

Notwithstanding these peripheral issues however, for imperative programmers new to the Haskell language the real unavoidable conceptual displacement surely comes with:
  • Learning how to think in terms of composing functions
  • Learning how Haskell's type system works
  • Getting comfortable with common language features and patterns like type classes and monads
  • Learning a basic 'vocabulary' of functions in standard libraries

The first two of these seem to present the largest overhead for newcomers.  The plain fact is that most people just have not built the mental apparatus to think naturally about function composition and polymorphism with the degree of expressiveness that Haskell's type language offers (although thankfully "generics" have been a feature of OO languages like Java and C# for years now).  

New tutorials like the excellent Learn You a Haskell finally provide an accessible and well-paced route through the material that seems to work for most developers converting to Haskell from the imperative world.  This has really helped with some of the 'unavoidable' learning curve. 

Once developers have genuinely grasped the fundamentals of Haskell, they are typically aware of the following dynamics:
  • It can take more front-loaded reasoning to get to an initial shape of a solution
  • The type system is an ever-present reality.  
  • Once the program compiles, it usually works solidly
  • However, it may not work efficiently or perform as desired (and then exploring and debugging a running program is quite a different experience)
  • It may also begin to dawn on them just how flexible and powerful functions are as a raw material - due to every function having its own 'interface' and being directly composable with other functions.  
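The last point can be sketched in a couple of lines: because a function's type is already its interface, small functions snap together directly with (.), with no adapter classes required (a toy example):

```haskell
import Data.Char (toUpper)

-- Three small functions composed into one pipeline with (.).
-- Each function's type is its 'interface', so no wrapper classes
-- or adapters are needed to combine them.
shout :: String -> String
shout = map toUpper . filter (/= ' ') . reverse

main :: IO ()
main = putStrLn (shout "hello world")   -- prints "DLROWOLLEH"
```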

Some developers, when initially confronted with the ideas of Haskell, or perhaps as they are learning the ropes, start to complain that the type system is an impediment.  This reaction seems far more pervasive when newbies connect with Haskell than when they pick up something like Java or C#.  Adding to this is perhaps the increasing exposure that programmers get to scripting and dynamic languages (such as Python, Clojure or Ruby) that can condition them toward code-test cycles that do not involve any form of compile-time type checking. 

This gets us squarely into the discussion about static vs dynamic languages.  While this is a dichotomy that lies outside the scope of Haskell's advantages compared to imperative languages in general, it is nonetheless an important overall factor in how Haskell is perceived these days within the community of people likely to express an interest.  So, it's worth a bit of discussion on the topic.

People who support statically typed languages argue along these lines:
  1. Correctness in software is clearly critical
  2. Statically typing code is a way to get early validation and warning about a fairly broad category of problems that can exist in code (though runtime testing is essential for validating algorithms and workflows)
  3. Computers are extremely good about managing large quantities of detail.  Humans are good at planning and design, but can only focus on a handful of things at once.  Ergo, it's advantageous to have the computer play a role as a correctness tool for details as software is being developed.  Types provide low-level coverage of general details, where runtime testing provides higher-level validation of behaviour (but only for things you remember to test). 

People who support dynamic languages argue along these lines:
  1. All that really matters is whether some code eventually behaves correctly compared to its requirements.  
  2. Therefore runtime testing is the only thing you need if you are disciplined enough to write appropriate tests as you write your program logic.
  3. The cost of writing and managing the annotations required to feed a static type checker outweighs its benefits; just coding and writing tests is more productive.

Given that the use of dynamic languages in large, complex, software projects is a relatively new wave, I think the jury is still out on how the economics play out over the full lifetime of a large software system.  I'm certainly very curious to hear and read of experiences with this software practice.  However, I am already well aware of a number of projects that have gone very much off the rails, specifically with the use of Ruby at large scales.  

One aspect of all this that seems common sense (to me!) is that software development is an extremely human activity.  As we scale up the number of people actively engaged on a software project, we naturally have to be concerned with:
  • Division of labour and responsibility
  • Practical and effective coordination
  • Good communication (intent, architecture, design and implementation choices/constraints)

Personally, I think this social aspect of development is under-discussed as a determinant in the appropriate choice of programming language.  

The concept of contracts is essential in any activity that involves multiple actors.  These may be formal or informal, technical, legal or social contracts.  
What's pertinent to this discussion is that groups of developers have to establish and maintain many agreements between parties if they are to be efficient (which includes reaching an acceptable implementation quickly and finding problems with quality as early as possible).  Many agreements are not formalized and checkable by computer, e.g. who works on what, who accepts work as complete, what libraries should be used.   However, where we choose to divide up the design of software (for problem decomposition or assignment of responsibility) we create the need to agree contracts.  Software contracts are typically called interfaces of course and these are something that can be formally expressed and checked by a computer.  Preferably, the checking should cover both types and behaviour - but type checking alone is nonetheless highly valuable as a computed proof of some very important properties of an interface.

It seems to me, while pondering some of the disasters I'm aware of, that the dynamic vs static issue should be treated with rather more nuance than it usually is.  I suppose people find it easier to cleave to black and white positions, just as many people like to flock together when taking a line on technology (usually motivated by their own prior investment in skills).  In truth, the value of formalizing contracts and interfaces changes with scale.  I might ask the neighbour's son to rake some leaves on the lawn in exchange for a small fee, but if I'm engaging a company to paint my house then invariably we'll want to ascertain more precisely what gets done and for what price.  Similarly, a dynamic language (we used to call many such things scripting languages) can be a very lightweight way to express solutions to small problems.  By small I probably mean a few tens of pages of code, or perhaps a few 'modules', but in fact that is proportional to the collective 'detail capacity' of the actors involved.  This is a very interesting part of the problem.  

It's far more likely that a single author can construct and memorize details of code they've personally created.  Of course, the passage of time is well known to disrupt this, but still.  
I have seen high-quality teams of half a dozen people informally maintain the social contracts, discipline and communication necessary to keep everyone well-coordinated in software projects.  This works well when there is a strong sense of leadership or shared values in the team, making it a 'high context' environment in which people mostly just know what the right things to do are.  
However, there's a point where, whether by pure scale or by heterogeneity (dislocation, politics, disagreement, competition), a team will simply not be able to function with the requisite cohesion to allow informality, culture and values to constrain possibilities enough for efficient progress of the group as a whole.  At some point more formal contracts become highly valuable - and arguably, therefore, so do well-typed interfaces.

Given the foregoing discussion, I would characterise Haskell as being an appropriate language for high-complexity, large-scale projects.  

However, as discussed, it is also differentiated from most statically typed languages by virtue of having almost universal type inference.  This means that theoretically you don't have to write a single type declaration in your program, yet Haskell will still have the power to check and reject bad programs.  Of course, that's different to having no type checker at all, but it goes some way to addressing a complaint that you often hear from proponents of dynamic languages: that they'd like to write code without having to spend time writing out and managing a bunch of types as they add and modify things.  In that sense Haskell makes a great prototyping language, but I would add another aspect too.  In many languages, there's a lot of boiler-plate code to capture essential structural elements that are required by the language design.  In languages like Java and C# for instance, you will be typing quite a lot just to express the minimum wrapping of code in a class or two, or code to express accessor methods on properties.  That's different in something like Python or Ruby, where there is implied encapsulation for 'naked' code, and this is also true to a great extent in Haskell.  In Haskell, every function has its own interface, so you certainly get to experiment with combining functions much more freely before you ever need to start thinking about the distillation of design structure (e.g. custom data types, type classes, etc.).  

Ultimately, you will be dealing with types in Haskell however, no matter what might happen at the small scales of prototyping and trivially small projects and experiments.  So you will certainly be thinking about types you're consuming and probably types you're producing.  If you consider this an overhead, then you certainly have to place it into the context of the cost:benefits of contracts in general.  As projects get larger, the nature of where the costs and benefits actually are changes.  With more actors and separation (of code and people) there are more interstitial elements.  Are these challenges going to be managed purely socially, or are they going to be at least partly formalized and checkable by a computer?

Given that the type system is one of the most distinctive things about Haskell, at least as differentiated from most other languages people will be using in commercial software development, I suspect that this is where most of the "cognitive overhead" lies.  Gaining intuition about functional composition is a close second, followed by learning common libraries (which isn't a conceptual issue and is a common problem when converting to any new language).

In very practical terms, the biggest stumbling blocks I've seen with newbies are:
  • Understanding, in terms of type expressions, how functions work and what application does to a type. 
    That includes seeing a function type as a superposition of possible results (other functions).  Most imperative developers don't naturally understand application and partial application.  They tend to think of a function as having to consume all its formal arguments and producing a simple value. 
  • Interpreting type expressions and trusting them
    You can trust Haskell's type expressions in a much deeper way than you would an OO interface or method signature.  Imperative programmers tend to take a long while to start trusting type expressions - and that trust is the path to seeing Haskell's type system as a friend.
  • Understanding 'dispatch' on types, including the return type. 
    Functions like 'read' and 'return' can be completely mysterious to most OO programmers.
  • Unnecessary confusion about monads.
    Too much fuss is made about monads.  They are called out as some strange new thing, whereas they are just a pattern in types (a wrapper around some value) with a way to produce new wrapped values from other ones. 
  • The relationship between first-order and higher-order patterns is probably net new to an imperative programmer, and Haskell doesn't help with its current naming schemes.  I've always thought that "fmap" should just be called "map"; if anything deserved qualification, the list-specific version should be "lmap".  History is what it is, I suppose.   This also touches on the mathematical nomenclature issue.  To most people, "Functor" is an awfully scary name that suggests a hard-to-grasp new concept.  When people actually have it explained (or learn to read type expressions as they should) it's just such a no-brainer!
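The first and third points above can be made concrete in a few lines: partial application yields new functions rather than values, and 'read' selects its behaviour from the type the caller demands:

```haskell
-- Partial application: applying a two-argument function to one argument
-- yields a new function still awaiting the rest ("a superposition of
-- possible results").
add :: Int -> Int -> Int
add x y = x + y

addFive :: Int -> Int
addFive = add 5

-- Dispatch on the *return* type: 'read' chooses its parser from the
-- type the caller asks for - there is no direct OO analogue of this.
main :: IO ()
main = do
  print (addFive 10)          -- 15
  print (read "42" :: Int)    -- 42
  print (read "42" :: Double) -- 42.0
```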

So far, we haven't considered features of Haskell that set it apart as having real benefits compared to most alternatives.
Amongst these are its fitness for concurrency and the value that laziness has in separating producers and consumers.

Concurrency (and to a lesser extent even wider-area distributed evaluation) is such a big and growing problem that every significant language and application framework vendor is scrabbling to find answers to offer to their developer communities.  Apple has Grand Central Dispatch, a framework for scheduling closures on serial queues and one system managed parallel queue.  Microsoft has its async library and C# language extensions for continuation passing and futures/promises.  

Haskell gets so many of the fundamentals right here, in a way that is very hard to retrofit to other languages.  It is difficult to see how it won't continue to be held in ever higher regard as its fitter DNA continues to differentiate it in solving the hardest problems.   Moreover, it achieves its benefits quietly and cleanly through not having sold its soul to the devil, even when it was by no means certain how some of the 'simpler' issues such as side-effects could ever be accommodated satisfactorily.  Meanwhile the popular imperative languages like Java, C#, C++ and Python are adopting features so rapidly that they are becoming a multi-paradigm mess.  It will be interesting to watch them continue to evolve, but also to see how the folklore develops in terms of how to wield their many features appropriately.   The naturalness of expressing various forms of concurrency in Haskell is of course a different thing to automatic concurrency (still no more than a pipe-dream), yet with Haskell you can freely integrate concurrency of various types with libraries without distorting any of the general idioms of the language or forcing heavy-weight design patterns.  Still, concurrency is unlikely to be an issue that a newbie would find essential to learn anyway, so while Haskell has many advantages here, this is not an 'overhead' thing for learning the language.

In Haskell, lazy evaluation allows generators to be described at any time.  If you have a function that produces a thing, it's trivial to create a generator for a stream of those things.  This generator will be as easily consumable as any data structure, e.g. a list of things.  You don't have to create a specific interface, e.g. "getNextThing", or worry about the states of enumerators and iterators.  This is a very powerful feature in a language that is essentially about the transformation of values, and it is also representationally powerful in that it allows a fully algebraic style in the code.  The benefits of laziness, then, are that it defers details of usage and conserves generality and reuse.  Laziness does present some challenges for newbies in several ways.  First, developers are not generally conditioned into thinking about whether processing occurs 'in the provider' or 'in the consumer'.  This is often unimportant, but at scale it can become important for efficiency reasons (e.g. generating lots of suspensions/lazy-thunks in memory).  One of the first places this is likely to be exposed is in discussion around the family of fold functions and which fold is correct in what circumstance.  A second difficulty arises with the subject of infinite data structures, quintessentially lists.  With nothing similar in OO/procedural languages, Haskell can seem a little mysterious.  One of the biggest challenges of laziness-by-default, however, is the difference it causes in terms of reasoning about time and space.  This is not likely to be an issue that newbies will encounter in their pure learning phase, but later, as they tackle real projects with some scale, they are likely to bump into performance issues that require some additional knowledge of these subtleties.  In this sense, laziness adds some cognitive overhead in the 'medium term'. 
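A tiny example of the generator point: an infinite list of squares is defined once and consumed like any other list, with only the demanded prefix ever computed:

```haskell
-- Laziness gives generators "for free": squares is an infinite stream,
-- but consumers pull only what they demand - no iterator state and no
-- getNextThing interface.
squares :: [Integer]
squares = map (^ 2) [1 ..]

main :: IO ()
main = print (take 5 squares)   -- [1,4,9,16,25]
```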

In summary, I think it's probably true that Haskell has more overheads compared to something like Python if you're just tackling small projects, unless these overheads have shrunk to near zero through accrued experience with Haskell and its libraries.  Haskell arguably has features that scale much better for large projects/teams however.  It will be interesting to hear more reports from groups using Haskell at 'industrial scale' in the coming years, assuming the language continues to accrete followers.  With the continued and growing challenges in software engineering arising from scale, parallelism and distribution of data and processes, it will also be interesting to see if Haskell is indeed "doomed to succeed", or whether the current popular engineering languages can evolve to be 'good enough' - and therefore maintain their ascendancy.  I really hope that a handful of little impediments to using Haskell more broadly in general engineering will get addressed.  Improved tooling is a major theme here, including better Haddock, fixing cabal's DLL-hell issues, and a full-featured and stable IDE.  Removing more of these unnecessary hurdles still leaves a real learning exercise and some real conceptual displacement for OO developers, but as the benefits become more evident the motivation should override the learning curve issue.

Monday, October 14, 2013

My Introduction to Monads

Everyone seems to eventually do an "introduction to monads" in their Haskell ramblings.  FWIW, here's my take on it.

A good way to grok monads is to look at the history of the advent of monads in programming - specifically, why and how they were adopted into Haskell.

Haskell is a pure functional language.  That means it is fundamentally about using mathematical functions to build up computation.  Mathematical (pure) functions cannot generate results that are independent in any way of their formal arguments.  Therefore there can be no side-effects, no referencing of state outside of the function and a function that is passed the same argument values on several invocations must always return the same results.  

Pure functions as a basis for programming have huge advantages (though we won't go into these here), but one big disadvantage is this:  the world changes around the program and these changes appear to the program as values that change spontaneously.  Therefore if you want to "readCurrentTemperature" or "getLineOfTextFromConsole" or "readMousePosition", then you would apparently need a function that returned values independent of any inputs it was given.  Taking the console reading example, in Haskell's type language, it would seem that you would want a function:
getLine :: String  -- a function that when called, gets a typed line of text from the console as a String value

It would seem that no arguments are needed simply to receive text from the console, and you would expect to receive the text that the user typed on the line.
It's easy to see that in a pure functional world, this is a big problem.  A function with no arguments is a constant - it can return a value sure, but that value cannot change because there are no arguments that could be used to affect the computation of the result.  If we allowed this function to return different results on each call, then it would no longer be a pure function.  This is anathema in Haskell because the impurity would propagate through the whole program and suddenly Haskell would lose all the advantages it gets from guaranteeing that everything in the program definition is pure. 

Another example of this is of course writing to the console.  Here, we might expect to provide a String, the text, which will get written out to stdout or somewhere.  There's nothing particularly interesting that this function might return (other than some success/failure flag, but it's reasonable to assume success in this case).  In Haskell, such a function might be typed:
putStr :: String -> ()
This type expression says that putStr is a function that takes a String and returns a ()… which is called 'unit' and is a value that can carry no useful information.  It's fine to think of unit being the same thing as "void" in the C languages.  In other words, once the text has been written out, there's nothing useful to return from the function.
On the face of it this function isn't as bad as getLine… you could argue that it just happens that all text maps to the same (unit) value - so in isolation it isn't as intrinsically evil as getLine.  Nevertheless it is required to perform a side-effect, and this is completely invisible from the type as we currently have it.  Worse though, is the fact that Haskell reserves the right to optimise away some/any calls to putStr because it knows that it must result in a type that has exactly one value (unit) - i.e. it is essentially a function that ignores its input and always returns a constant value.

Aside from the obvious problem with getLine in isolation, another issue comes with combining actions with side-effects like getLine and putStr.  If we want to read a line of text from the console and then output it to the console, we might expect to combine these two actions together into an 'echo' like this:
echo = putStr (getLine) -- get a line of text and then output the string that is obtained
which would then have the type:
echo :: ()
This type says absolutely nothing about the actions that are implied by its definition.  It just says that it results in no useful information!
Moreover, there's a deeper problem.  One of the important things that Haskell is able to do because of its insistence on purity is to reorder function evaluation, as well as sharing and remembering any constant values in the program.  There is nothing in the echo definition that would tell Haskell to always (re)run getLine before running the putStr action (especially if getLine had already been run once and produced a string that looks like it is the only constant output possible from that function!).

By all accounts Haskell languished with the incompatibility of stateful, action-based aspects of programs with its essential, existential requirement for purity for quite a long time in its early development.  Eventually however, Philip Wadler (after reading some discourse on monads in a paper by Eugenio Moggi) made a connection between a mathematical object called a "monad"  and the problem of encapsulating stateful effects in Haskell.  So what is this "monad" thing and why did it appear to solve that problem?

A "monad" (and many would say that it's a shame that we have to use the mathematical name) is simply an object that wraps a value and that allows functions that operate on that inner value to be composed together - BUT those functions must always put the result back into the same kind of wrapper.  This is what creates the encapsulation because:
  • There's an outer shell around the value, with a distinct type
  • You can only get at the inner value by using a monadic combinator to convert a function that operates on the inner value into one that operates on the whole monad (and must return a monadic value)
  • It's possible to completely deny raw access to the inner value by refusing to create a function that simply unwraps the value
  • It's possible for a given monad to build sequencing of evaluation into the definition of that monad's composition function.  This means you can force the order in which IO operations must be performed, for instance.

Any monad has two main defining functions - these are the functions that any type must declare for itself in order to be a monad:
  1. return - a function to wrap an inner value with the monad type 'wrapper'.  This has the type:  a -> m a.  You could also say that return lifts a value into the monad.
  2. bind - a function to compose some transformation on the inner value with access and re-wrapping in a monadic type.  This has the type:  m a -> (a -> m b) -> m b
With these two, you can define a monadic type and provide the means to make new monadic values of this type (wrapping other values 'inside') and the ability to apply functions to the inner value - but only so that the result is another monadic value of the same type. 

What does it mean to have an 'inner' value and a wrapper?  Well, this happens all the time.  Consider Haskell's Maybe type for instance.  This is Haskell's type for adding a type-safe null value to any other type.  The general type is Maybe a, so for a String value, this would be Maybe String.   The inner type (payload) here then would be a String, while the wrapper is the Maybe.  The outer type provides some extra functionality - in this case Maybe adds the one extra value "Nothing" that the value might represent if it doesn't actually hold a value of type String.  The Maybe type predates monads in Haskell, but it illustrates the structural wrapping of a value in a simple way.  
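In today's Haskell, Maybe is itself a monad, so it can make return and bind concrete in a minimal sketch: bind unwraps the inner value, applies a function that must re-wrap its result, and a Nothing anywhere short-circuits the whole chain:

```haskell
-- Division that wraps its result in Maybe: Nothing marks failure.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Chaining with bind (>>=): each step sees the unwrapped Int and must
-- produce another Maybe Int.  A Nothing at any step aborts the chain.
good, bad :: Maybe Int
good = return 100 >>= (`safeDiv` 5) >>= (`safeDiv` 2)  -- Just 10
bad  = return 100 >>= (`safeDiv` 0) >>= (`safeDiv` 2)  -- Nothing

main :: IO ()
main = print (good, bad)
```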

So, back to the initial motivating problem of IO operations...  

Wadler realised that these features of a monad were completely appropriate and transformative for the "IO problem".
All IO (reading, writing, input, output, etc.) can be encapsulated by the IO monad.  This monad's values describe IO actions that can be performed at some time in the future.  These IO actions can be composed together with the IO monad's bind operator in order to create more complex IO actions from more basic ones.  However, the result is always an IO action value.  Not only does this mean that any code that performs IO is 'honest' about this in its type, but it also keeps all IO bound up inside the IO monad and clearly separated from any pure code.  The IO value itself remains completely pure.  Furthermore, the IO monad forces IO operations to be performed in the strict sequence that is intended.

So, looking at the problematic getLine and putStr functions, getLine would now have the type:
getLine :: IO String -- a function whose return value represents an IO action that will produce a String 

and putStr will have the type:
putStr :: String -> IO ()  -- a function that creates an IO action (returning nothing of interest) when provided with a String to output

How do you combine these together to make our little echo program?  
Well, now you can use the IO monad's bind function.  Recollect this has the type m a -> (a -> m b) -> m b
If the first argument of bind is getLine (with type IO String) the "m" will be IO and the "a" will be String.  This specializes the bind to be:  IO String -> (String -> IO b) -> IO b
The second argument fits the type signature of putStr, making the "b" into ().  So we can write the echo program as:
echo = bind getLine putStr
… or with Haskell's operator form of bind (>>=) we can write:
echo = getLine >>= putStr
… where bind has been specialized to the type:  IO String -> (String -> IO ()) -> IO ()
… and the type of echo (which is the result of bind, the last component of the type above) is therefore just IO ()

We are free to continue binding more IO operations onto echo to chain together even more IO operations.  For instance:
twoEchos = echo >> echo
will just perform one echo operation (getting a line and writing it out) and then another.  By the way >> is like >>= (bind) except that it doesn't use a result from the first monadic value in the production of the resulting monadic value.  That's needed here because echo doesn't take an argument and only produces an inner value of (), which we don't need for anything as it represents 'void'.
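In fact, >> can be defined directly in terms of >>= by throwing away the first result.  Here's a sketch (thenDiscard is a made-up name, used here to avoid clashing with the Prelude operator), demonstrated with the Maybe monad so the behaviour is easy to observe:

```haskell
-- A sketch (not the Prelude source) of how (>>) can be defined in
-- terms of (>>=): the first action's result is simply discarded.
thenDiscard :: Monad m => m a -> m b -> m b
thenDiscard m k = m >>= \_ -> k

main :: IO ()
main = do
    print ((Just 1 :: Maybe Int) `thenDiscard` Just "kept")   -- 1 is discarded
    print ((Nothing :: Maybe Int) `thenDiscard` Just "kept")  -- Nothing short-circuits
```

Note that in the Maybe monad the sequencing shows up as short-circuiting: once a Nothing appears, the rest of the chain is skipped.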

So, if we create a big conglomerate IO action by combining lots of smaller actions, when does the IO actually get done?  Well, the performing of the IO only starts when the Haskell program is run.  The top level of a Haskell program is its "main" function which is typed as "IO ()" - so the entry point of a Haskell program is an IO action itself.  Haskell will start performing the contents of any IO actions that are composed up to the main function and the specific definition of the bind operator defined for the IO monad will ensure that the actions are executed precisely in the order that was intended by the program.

The IO monad was a huge success for separating IO operations in Haskell from pure code, but other monads are also used to separate stateful effects and/or enforce sequential execution.  For example the State monad allows arbitrary state to be collected and maintained by a program, the Writer monad allows log-like actions to be emitted by code at various points, the STM monad defines transactions and the list monad (via the MonadPlus interface) describes branching computations that generate multiple values.
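To make the State monad idea concrete, here is a hand-rolled miniature version (SimpleState, get' and put' are made-up names for illustration; the real implementation lives in the mtl/transformers packages).  It shows that 'stateful' code is really just pure functions of type s -> (a, s) chained together by bind:

```haskell
-- A miniature State monad: each value wraps a pure state-transition
-- function, and bind threads the state from one step to the next.
newtype SimpleState s a = SimpleState { runSimpleState :: s -> (a, s) }

instance Functor (SimpleState s) where
    fmap f (SimpleState g) = SimpleState $ \s ->
        let (a, s') = g s in (f a, s')

instance Applicative (SimpleState s) where
    pure a = SimpleState $ \s -> (a, s)
    SimpleState f <*> SimpleState g = SimpleState $ \s ->
        let (h, s')  = f s
            (a, s'') = g s'
        in (h a, s'')

instance Monad (SimpleState s) where
    SimpleState g >>= f = SimpleState $ \s ->
        let (a, s') = g s in runSimpleState (f a) s'

get' :: SimpleState s s
get' = SimpleState $ \s -> (s, s)

put' :: s -> SimpleState s ()
put' s = SimpleState $ \_ -> ((), s)

-- A counter threaded through the computation with no mutable variables:
tick :: SimpleState Int Int
tick = do
    n <- get'
    put' (n + 1)
    return n

main :: IO ()
main = print (runSimpleState (tick >> tick >> tick) 0)  -- (2,3)
```

Running three ticks from 0 yields the pair (2,3): the last tick returned 2, and the final state is 3.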

As a programming idiom, monads not only offer an encapsulation of values, but also a way to create embedded domain languages right in Haskell.  Any functions that can operate on values inside a specific monad must declare the monad they operate over in their types and only functions 'in' the same monad will compose directly together.  Of course, the inner value can remain polymorphic so monads can work like an interface.  

Monads are so successful and prevalent in Haskell that the language has grown special syntax to make monadic composition (e.g. bind/>>= and anonymousBind/>>) more convenient.  This is the so-called "do notation" and consists of the keyword "do" followed by a block of 'actions' that will be combined with bind or anonymousBind.  The echo function could be written this way with do notation:

echo = do
    lineText <- getLine
    putStr lineText

The "do notation" is sometimes called Haskell's embedded imperative language, as it resembles the 'top to bottom' style of programming of other languages.  
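The desugaring is completely mechanical.  Using the Maybe monad (so the example can run without console input), here is a sketch of the same computation written both ways; addInputs is a made-up example function:

```haskell
import Text.Read (readMaybe)

-- The do block and the explicit bind chain are the same program;
-- GHC mechanically desugars the former into the latter.
addInputs :: String -> String -> Maybe Int
addInputs s1 s2 = do
    x <- readMaybe s1
    y <- readMaybe s2
    return (x + y)

addInputs' :: String -> String -> Maybe Int
addInputs' s1 s2 =
    readMaybe s1 >>= \x ->
    readMaybe s2 >>= \y ->
    return (x + y)

main :: IO ()
main = do
    print (addInputs "2" "3")      -- Just 5
    print (addInputs' "2" "oops")  -- Nothing
```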

So, that's my quick introduction to monads.  For some reason, monad 'introductions' often seem very convoluted.  Some people want to talk about the category theory from which the concept was derived and imported into Haskell, but that's an irrelevancy for most people.  Most people just need to know that it's an encapsulation that preserves the outer wrapping type and allows operations on the inner value to be chained together.  Put like that, you are quite likely to have seen monads as a pattern before, used for different purposes in different languages… you just didn't know there was an awkward mathematical name for them :-)

Monads seem to have been the subject of quite a few "me too" discussions and implementations in a range of other languages recently (Java, Ruby, Javascript, Python etc.).  That's probably mostly a bit of a meme thing, due to the exotic name.  It would be nice if the importance of composability came to the fore much more in the industry and certainly the monadic pattern is an important way to have encapsulation and composability.  

Nevertheless, the real issues arguably lie with the problems of shared mutable state in software and the features that languages may or may not have to limit side-effects and handle change in a rational way.  Concurrency and distribution of computing are both looming issues that we're struggling to deal with, aside from creating more options that developers must learn to wield effectively.  In transitioning common languages from a state of being extremely unfit for these new challenges to at least offering some solutions, we are possibly creating quite a mess of patterns, idioms, libraries and folklore which we expect teams of developers to apply with necessary correctness and consistency.  

While Haskell may not be the answer to all of this in the end, it does have some fundamental properties that address these issues head on.  Indeed purity and laziness (with purity being the most important) look increasingly essential when code has to be analysed, decomposed, transmitted and evaluated in multiple cores or locations; especially as the complexity of sequence and coordination has to go somewhere.  Patterns like monads were essential in Haskell to make the language pragmatic when it already had some of the fundamentals right, whereas we see other languages adopting feature after feature (e.g. lambdas/closures, monads, actors etc.) as cool ideas when they still haven't put their houses in order in other more fundamental ways.  In other words, monads (or a similar thing like arrows) are essential to Haskell, whereas they are perhaps merely a fun pattern and some 'good practice' in other languages.  

One reason that's cited as being why the world doesn't just adopt Haskell, given that it seems to otherwise be such a good fit for the emerging new challenges, is that it has some cognitive overhead.  It's fascinating, to me at least, to ponder whether this overhead comes from training away from Haskell's concept space when developers are learning the ropes, or whether there are some essential features that are just genuinely harder to grok (when you strip away the boring issues of positioning and nomenclature) than using the current popular stock of languages.  I may ponder on that some more in a future blog post.

Monday, April 29, 2013

Haskell Close Encounters (of the TUI and GUI kinds)

Most actual functional programs I've ever written have been 'engine' like things, not having any IO for attended operation per se.  That includes utilities and small apps that just use the command line for processing data and outputting results to the console or files.

Due to some prospective new projects, I have been wondering what sort of UI libraries might exist for Haskell and what the experience of writing UIs natively in Haskell might be.

My all-time favourite GUI framework is Apple's Cocoa environment.  I find this to be extremely powerful and (normally) easy to use.  Certainly, it is mature, with a strong/clean design.   It's also convenient and fast.  Unfortunately it is to all intents and purposes Mac only and there's no up-to-date, well maintained Haskell binding (I've found hoc, but that appears to be abandoned).  Although I don't spend any time in the Windows world any more, the situation there seems to be pretty fluid - but in any case the main UI frameworks there are again 'proprietary' and platform-specific.

Now, I can see value in two types of UI for various kinds of program I write:
  1. For simple applications and utilities, a pseudo-graphical terminal UI in the style of Midnight Commander can be very effective.  Moreover, these TUIs are potentially cross-platform and are very light-weight for use across networks.  While these TUIs aren't exactly common these days (I suppose they hearken too much back to DOS!), they are practically zero-client!  
  2. For more sophisticated applications, and of course where actual audio-visual media must be presented or edited, then a full GUI is required.
As I'm interested in both of these types of UI, I have taken to exploring the options of doing each with Haskell.  There are also web UIs of course, and Haskell does seem to have several very good frameworks for developing these (Yesod and Snap come to mind).  

TUI libraries in Haskell

I'm surprised we don't see more utilities and 'technical' apps provided with this first kind of UI.  I suspect this may be partly to do with the relative cost of creating TUIs with commonly available libraries, though it's probably also true that the set of apps appropriate for such treatment is limited and most people just reach for the GUI toolkits come what may.

If you have spent any time around Unix, then you'll know that there's a family of C libraries known as "curses" libraries available for building apps that do textual 'drawing' in a terminal/console.  These libraries have had a long history.  The latest variant is "ncurses" (new curses), which has been available for years on unix-style systems, including Linux.  

Because this library is so well known, my first search for a TUI library involved looking through the hackage database for curses bindings for Haskell.  There are no fewer than 4 different libraries for accessing ncurses from Haskell.   The two that seemed to offer good coverage of the C library and that were relatively recently maintained were:
  • hscurses
  • ncurses
The first of these, hscurses, appears to be the most complete of the two.  It has bindings for the C functions in ncurses and a few extra modules providing utilities, including a module defining a "widget" abstraction on top of the basic curses functionality.  

Experimenting with hscurses is pretty straightforward, though you can see how quickly application code gets complicated due to the relatively low-level nature of addressing a cursor around the screen.  Terminal resizing is supported, but if you want any persistent content, this must be stored and redrawn when ncurses is effectively restarted with a new terminal size.  One strange thing about the library is that it appears to miss a key binding to the function that sets the 'background' of a window (the name of a simple area of the screen configured in curses).  It's interesting to note that the widget module does not use windows at all, but rather draws directly on the top level screen 'window'.

The ncurses library makes a claim to providing a much more convenient programming interface.  This isn't a simple one-to-one mapping of the C interface to Haskell functions, and the few code samples illustrating the difference look compelling enough.  Unfortunately, the library doesn't seem to work properly on the Mac (maybe elsewhere?) for its colour features, because the maximum number of colours reported for the terminal is "-1".  Perhaps this is not supposed to be a signed value, but in any case the library refuses to allow colour configuration because it expects this value to be greater than zero.   ncurses also has support for "panels", which allow 'overlapping' windows of the sort needed to implement menus, forms and dialogs.  That would be a great addition on top of the working basic functionality. 

In the final analysis then, both of these libraries have issues, albeit probably quite easily fixable.  Both are also quite low level and even some simple experimentation reveals how much housekeeping and legwork is required to achieve a reasonable effect.  

However, there's at least one more option.  On the Haskell IRC channel, the "vty-ui" library was suggested to me as a possibility.  This is a library that is designed to provide a much higher-level API for creating Text User Interfaces.  Written by Jonathan Daugherty at Galois, the library appears to be actively maintained and has excellent documentation - something of a rarity.

In chatting with Jonathan, it's apparent that the vty-ui library currently does not support overlapping widgets of the sort made possible with 'panels', but rather just a flat tiling of widgets like the basic ncurses functionality.  Nevertheless, vty-ui is the library that I'm exploring for use making TUIs.

GUI libraries in Haskell

The choice of Haskell GUI library looks like a minefield.  There are many offerings for writing GUIs in Haskell, but broadly speaking the options fall into two camps:
  • Fairly direct bindings to established GUI toolkits
  • More idiomatic 'functional' approaches for configuring GUIs and the events that flow between elements and application logic
The latter category was the one that piqued my initial curiosity.  After all, when you're in a programming environment with a strong flavour like Haskell, it seems reasonable to use the most natural methods to express things.   Of course, whether the most elegant functional stylings are necessarily the most 'natural' begs the question "Natural for whom?".  Nevertheless, I investigated which high-level GUI libraries existed with a view to figuring out which are the most established and popular. With GUI programming being so ubiquitous, I was expecting a clear leader or two.

These high-level libraries are typically based on the concept of "Functional Reactive Programming" (FRP).  In a nutshell, this works by 'wiring' up event streams between producers and consumers.  The libraries I looked at all used the very general compositional power of "arrows".  

Sadly however, I could not find a single high-level GUI library that was established and well maintained.  These libraries appear to have quite short shelf lives.  Many seem to begin as research exercises, manifesting ideas enunciated in academic papers.  They have a burst of attention and updates, but then get abandoned.  Probably, this abandonment is due to a combination of factors: novelty/learning-curve, limited functionality and general lack of uptake in the developer community.  

If there's no good choice for a high-level GUI library yet, then what about the lower-level Haskell wrappers for the GUI toolkits?  The two currently active such libraries are:
  • wxHaskell (wx)
  • gtk2hs (gtk)
Out of these two, wxHaskell comes from a motivation to provide a full cross-platform GUI abstraction. Consequently, it is rumoured to work well on Windows.  Conversely, gtk is very mature and hugely popular, but it comes from the Unix/X11 world, so support for Linux will be very strong, followed by Mac OS X, with Windows likely to be a distant third.  

The last updates of these two packages were recent...ish, with wx having been updated May 2012 and gtk in Nov 2013.  While such dates aren't particularly significant, it might suggest that more people are interested in the gtk binding.  Certainly, the Haskell GUI apps that I have used (Leksah, threadscope) and several examples posted on YouTube are using gtk.

So, my limited survey of GUI toolkits seems to conclude that gtk2hs is currently the best option.  
I have consequently begun to play with gtk2hs on the Mac, along with the glade-3 GUI builder tool for creating loadable XML UI descriptions.  

One wrinkle with gtk on the Mac is that the standard gtk 'back end' uses X11 as the underlying graphics stack.   While X11 apps are quite nicely supported under Mac OS X via the XQuartz app, nevertheless apps running under X11 still seem a little 'foreign' as they take longer to start up if X11 isn't already running and have other look and feel details that don't seem quite right.  There is however a Quartz backend for gtk that aims to solve most of these issues.  This appears to have to be linked with individual apps, so I'll have to figure this out some time.  In the meantime running under X11 works just fine. 

Tuesday, April 16, 2013

Amazing Mouse (Squeak the Last)

This is the last post in the Amazing Mouse series.

We're going to round out the application with some IO so that we can track the mouse through the maze.  We'll print the state of the world (maze plus mouse) for each move that the mouse makes and we'll pause for a key press between each move.

In the process, the following concepts will be introduced:
  • IO in Haskell 
  • Getting a package from Hackage using cabal
  • Using a library function from the installed package
We're going to print the maze on the console.  We can use the original character maze (CharMaze) for this, rather than converting the Material maze (Maze) back to characters.  However, we need to create a function called markLocation that will replace one character in the maze with the position of the mouse (representing the mouse as a different character... a dot).

The markLocation function needs to take a coordinate for the location of the mouse, and a CharMaze. It will then return a CharMaze (with the location of the mouse indicated).  The type is therefore:
markLocation :: (Int,Int) -> CharMaze -> CharMaze

In order to deconstruct and reconstruct the maze with the mouse location replaced with a dot, the markLocation function will need to do the following:
  • Separate the rows into those before the mouse location row, the mouse row, and those after the mouse location row
  • For the row containing the mouse, separate the row positions into those before the mouse, the character with the mouse, and the positions after the mouse
  • We can then build a new mouse row, with the character position of the mouse replaced with a dot
  • We can then rebuild the maze, with the rows before, the new mouse row, and the rows after
Here's a definition for this transformation:
markLocation (x,y) charMaze = beforeRows ++ newRow : afterRows
    where
        (beforeRows, theRow : afterRows) = splitAt y charMaze
        (beforeItems, _ : afterItems) = splitAt x theRow
        newRow = beforeItems ++ '.' : afterItems

We are making use of the splitAt standard list function to divide a list into two components.  It so happens that the second list after the split always includes the item at the split position, so it's easy to split this individual item off the front of this second list using the list cons operator (:).  In the case of the  rows, we bind the name theRow to this single row at the mouse (split) position.  Similarly, when we deconstruct the row 'items' we bind names for beforeItems and afterItems, but this time there's no point assigning a name to the character at the mouse location as we're about to replace it with a dot.

To create a new row, we do everything in reverse, constructing the new row out of the beforeItems appended to the list made from the dot character cons'ed with the afterItems.

Finally, we similarly recompose the rows, with the beforeRows appended to the list of rows made up of the newRow cons'ed with the afterRows.
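To see the splitAt gymnastics in action, here is a self-contained version with a toy 3x3 maze (assuming, as in the earlier posts, that CharMaze is a list of strings):

```haskell
type CharMaze = [String]

-- Replace the character at the mouse location with a '.', by splitting
-- the rows around the mouse row and then that row around the mouse column.
markLocation :: (Int, Int) -> CharMaze -> CharMaze
markLocation (x, y) charMaze = beforeRows ++ newRow : afterRows
    where
        (beforeRows, theRow : afterRows) = splitAt y charMaze
        (beforeItems, _ : afterItems) = splitAt x theRow
        newRow = beforeItems ++ '.' : afterItems

main :: IO ()
main = mapM_ putStrLn (markLocation (1, 1) ["###", "# #", "###"])
```

Running this prints the maze with the centre space replaced by a dot.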

We're now ready to make the function that will plot a sequence of maze states, one for each mouse location in a given Path.  This involves doing IO for the first time in this project.

Haskell's IO is extremely clever, but you won't have seen anything quite like it in any other language.  Because function purity is sacrosanct in Haskell, we are not allowed to do anything in a function that directly causes or uses side-effects.  IO is a huge elephant in the room when it comes to being side-effect free, because side-effects are the whole point of IO!  The clever thing about Haskell is it does IO by not performing the IO directly in regular functions, but allowing you to create and compose together IO operation values that are then 'performed' in the right sequence at the right time.  The right time is when the main function in your program has to perform the top-level IO command in order to get anything done.  At this point, all the IO commands are 'performed', along with the regular functions that were glued into them to do normal things like adding up numbers.  

The biggest thing to remember about using Haskell's IO features is that if you're defining a function that returns an IO operation value then you're in this special world.  You'll be using some special ways to compose together IO operations to make bigger IO constructs.  It's all quite cute when you get used to it, but there are some more special functions to learn, and some special syntax called "do notation" that makes composition of IO sequences more convenient.  In fact, when you're using do notation, you're doing something very similar to using an imperative language embedded in a functional language.

With some scene-setting out the way, let's start by thinking about the type of the function (we'll call it plotPath) that we want to create:
plotPath :: CharMaze -> Path -> IO ()

The arguments here are straightforward.  plotPath will take an original CharMaze and a Path to plot. The original CharMaze will be modified with markLocation at each step in the given Path, then printed to the screen.  We'll then wait for a character to be entered at the keyboard with the getChar function.

The return type of plotPath is one of these IO operations.  In fact it's the IO operation that has no residual value, which is represented by the unit type that we mentioned previously when discussing tuples.  Output IO tends to leave no 'residue' (the values are 'gone' to the output device), whereas input is all about obtaining some value that is the residual of the IO.  The type of getChar for instance is:
getChar :: IO Char 
... indicating that a character is the result of performing this IO operation when that finally happens.

Here's the definition for plotPath:
plotPath charMaze path = forM_ path printMazeAndWait
    where
        printMazeAndWait point = do
            forM_ (markLocation point charMaze) putStrLn
            getChar
            return ()

The first line of the definition "forM_ path printMazeAndWait" works just like a 'foreach' in imperative languages, however it builds a sequence of IO operations, each operating on one element from the list of values provided.  In this case, we're saying "for each location in path, do the printMazeAndWait operation".  Again, it's important to remember that these functions are only composing together IO operations that will eventually be performed later by something underlying the main function in our program.

In the 'where' part of the definition we define this output IO function in turn.  Clearly, from the foregoing, this function takes a point (a location) as its argument.  We then use Haskell's "do notation" to glue together a sequence of IO operations...
  • First, for each row in the transformed maze (using markLocation) we print the row to the console (stdout) with a newline after each row.
  • Then we wait and obtain the next key press from the console (stdin)
The result of this function is the IO () value representing all these IO operations combined and in order.
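As an aside, forM_ (exported by Control.Monad) is simply mapM_ with its arguments flipped, which often reads more naturally when the 'body' function is longer than the list.  A minimal runnable illustration:

```haskell
import Control.Monad (forM_)

main :: IO ()
main =
    -- forM_ xs f composes one IO action per list element into a single
    -- IO () value; it is just mapM_ with its arguments flipped.
    forM_ ["one", "two", "three"] putStrLn
```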

We need one more utility function to be ready to finally put this through its paces.  This is a function that brings together generating a single path for a CharMaze and then showing it on the screen:

plotFirstPath charMaze = plotPath charMaze $ head $ 
                                             mousePathsFromStart (materialMaze charMaze) 1

Now, we can finally complete the main function that is the entry point of the program and the top-level IO operation that gets performed.

main = do
    putStrLn "Press space to move mouse one step"
    hSetBuffering stdin NoBuffering
    hSetEcho stdin False
    plotFirstPath charMaze0
    putStrLn "\\o/ The mouse lived happily ever after!"

Again, we use the "do" syntax to sequence some IO operations that represent the root of the program that will execute when you run the binary in the console.  There are a couple of special IO commands here for technical reasons:
  • "hSetBuffering stdin NoBuffering" just makes sure that the input IO system doesn't wait for a newline before making characters available to the program (otherwise getChar will not behave as expected to pause the program between maze drawings).
  • "hSetEcho stdin False" also turns off the echoing of typed characters to the terminal.  We want getChar to do its thing 'silently'.
If you compile and run the program, you should see mazes scrolling up the console when you press keys, until the program completes. 

To go the final mile, we'll make a change to clear the console between maze drawing for a better effect.

To do this, we'll engage the services of one of the very many additional packages from the hackage repository.  This one is called "ansi-terminal" and it provides a "clearScreen" function that will clear any ANSI terminal screen.  This should certainly work on any Unix and Linux system.  YMMV on Windows!

The hackage repository has a browser allowing users to see packages by category and to search for packages.   All the available packages are obtainable using the standard package manager tool for Haskell called "cabal".   Cabal is installed automatically when you install the Haskell Platform, which is the standard way of obtaining Haskell these days.  Cabal takes care of downloading, building and archiving packages in standard locations.  As most modern Haskell IDEs use cabal packages for projects, building projects (e.g. in EclipseFP or Leksah) will automatically invoke cabal to build the package artifacts (applications or libraries).  

Both EclipseFP and Leksah have a way to declare external package dependencies for your project.  
In EclipseFP, double-clicking on the "<name>.cabal" node near the top of the hierarchy in the Project Explorer will open the cabal properties editor.  Clicking on the "Executables" tab will allow package dependencies to be added to executables included in the project cabal package.

In Leksah, the cabal settings are available from the Package -> Edit menu and the "Dependencies" tab provides a similar editor.

Whichever IDE you are using, you will need to add "ansi-terminal" as a package dependency.  Then, this package will need to be fetched and installed.  Both EclipseFP and Leksah have means to install dependencies from within the IDE.  However, this can also be done from the command line with the cabal tool:

> cabal install ansi-terminal

When you execute this command in the shell, you should see messages indicating that the package is being fetched, built and installed successfully.  Cabal has the ability to transitively fetch and install dependencies, so several packages might be obtained and built whenever you request a specific package for inclusion in a project.

Having installed ansi-terminal and added it to our project dependencies there are only a couple of things left to do to use it...
  1. Include an import statement for the modules you need from the package.  In this case we need to add: import System.Console.ANSI
  2. Add the line "clearScreen" beneath "getChar" in the printMazeAndWait local function of plotPath.  
When the application is rebuilt and run, you should see the effect of the clearScreen between key presses and maze drawing (assuming you are running the application in a shell in an ANSI terminal window).  

So that just about wraps it up for the Amazing Mouse exercise.

Haskell can initially be a steep learning curve, especially when cross-training from imperative/procedural and OO languages.  However, as is always the case, immersion and experimentation are the fastest paths to enlightenment.  

I've tried to strike a balance in this example of explaining enough to avoid big mysteries, but invariably there will be gaps if you are completely new to the language.  That being the case, I would highly recommend the book "Learn You a Haskell for Great Good!".  Once past the first half-dozen chapters, this example should make a lot more sense!

There are plenty of ways that this example, as presented, could be improved for efficiency and style, let alone adding a few more bells and whistles.  I may subsequently post a few suggestions in case anyone lacks imagination (!) but wants to tinker some more.

Monday, April 15, 2013

Amazing Mouse (Squeak the Second)

Continuing our maze solving Haskell programming example, we now get to the interesting stuff involving navigation about the maze.

Maze solving is a classical search problem, involving the following basic concerns:
  • Generating one or more solution 'paths'.  Where...
  • Each path is a list of visited locations from the Start to the Exit.  Where...
  • At each step, we explore the possible next steps in some order of preference and we never consider moves that would cycle back to a previously visited location. Where...
  • Next steps can be generated from the current location along the path by sampling all the locations around the current location and determining which adjacent locations are navigable (not a Wall!)
So the algorithm is basically a multistep 'generate and test'.
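Before specialising to the maze, the generate-and-test idea can be sketched generically (searchPaths and the toy step function below are invented names for illustration; the series builds mousePaths along these lines):

```haskell
-- A generic depth-first 'generate and test' path search: from each
-- location, expand the candidate next steps, pruning already-visited
-- locations, until the goal test succeeds.
searchPaths :: Eq a => (a -> [a]) -> (a -> Bool) -> a -> [[a]]
searchPaths nextSteps isGoal = go []
  where
    go visited loc
        | isGoal loc = [reverse (loc : visited)]
        | otherwise  = concat [ go (loc : visited) next
                              | next <- nextSteps loc
                              , next `notElem` (loc : visited) ]

main :: IO ()
main = print (head (searchPaths next (== 5) 1))
  where
    -- toy search space: from n you can step to n+1 or n+2, capped at 5
    next n = filter (<= 5) [n + 1, n + 2]
```

Because the result is a lazy list of paths, callers can take just the first solution (as head does here) without paying for the rest of the search.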

The most basic ability to provide our virtual mouse is the ability for it to determine which adjacent locations can be visited.  For a given x y location, we will need to consider the eight possible adjacent locations (we'll allow our mouse to move vertically, horizontally and diagonally around the grid to any non-Wall location).

What are the rules for generating 'legal' pairs of (x,y) move-candidate locations around a current location?
  • The location can only be -1, 0 or 1 squares removed in each of the x and y coordinates.
  • ... but the location may not be 0 squares removed in both x and y, as this is the current location.
  • The location must be legal in the maze (no smaller than 0 x or y, and smaller than width in x, smaller than height in y).
  • The location must not be a Wall
Given that we're generating another list (of candidate locations around the mouse), this is another job for a list comprehension.  Here's the code:

possibleMoves maze x y = [(x',y') | x' <- [x, x - 1, x + 1], y' <- [y, y - 1, y + 1],
                        not (x' == x && y' == y),
                        x' >= 0, y' >= 0, x' < mazeXMax, y' < mazeYMax,
                        materialAt maze x' y' /= Wall]
    where
        (mazeXMax, mazeYMax) = mazeSize maze

Recall that a list comprehension has list constructor syntax (square brackets), with the first term in the brackets (up to a vertical bar) being the constructor for individual elements of the list.  So, we're going to be building a list of (x',y') pairs.  The apostrophes perhaps look unusual, but they are just characters that Haskell allows in identifiers that are borrowed from mathematics.  You can read them as "derived" i.e. we're constructing a pair "derived x" and "derived y".  Mathematicians call the apostrophe "prime", by the way, e.g. x' is pronounced "x-prime".

The rest of the clauses in the list comprehension match the rules we outlined above, that generate and test possible new locations, given the original location x y that are provided as arguments.  

To begin with, we generate possible derived x and y candidates by taking the current coordinate, the immediately preceding coordinate (one space to the left/up) and the immediately succeeding coordinate (one space to the right/down).  

Then, we ensure that this candidate location is not the current location (where x' == x and y' == y).  

Next, we check that this location lies properly within the bounds of the maze.  We use names "mazeXMax" and "mazeYMax" that are bound in a where clause at the end of the function definition, again using our mazeSize function to obtain this.  This is another potential inefficiency, as we'll be recalculating the maze size every time possibleMoves gets called, but this doesn't cost that much ;-)

Finally, we test to ensure that the candidate location is not a Wall using materialAt.  In the current implementation, only the Wall Material is not traversable in the maze.
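As a runnable sketch, here is possibleMoves together with minimal stand-ins for the Material, mazeSize and materialAt definitions from the first post (these stand-ins are assumptions about the earlier code, reduced to essentials):

```haskell
-- Minimal stand-ins, assuming a Maze is a list of rows of Material:
data Material = Wall | Space deriving (Eq, Show)
type Maze = [[Material]]

mazeSize :: Maze -> (Int, Int)
mazeSize maze = (length (head maze), length maze)

materialAt :: Maze -> Int -> Int -> Material
materialAt maze x y = maze !! y !! x

-- Generate and test the (up to eight) candidate moves around (x,y):
possibleMoves :: Maze -> Int -> Int -> [(Int, Int)]
possibleMoves maze x y =
    [ (x', y') | x' <- [x, x - 1, x + 1], y' <- [y, y - 1, y + 1]
               , not (x' == x && y' == y)
               , x' >= 0, y' >= 0, x' < mazeXMax, y' < mazeYMax
               , materialAt maze x' y' /= Wall ]
    where (mazeXMax, mazeYMax) = mazeSize maze

main :: IO ()
main = print (possibleMoves tinyMaze 1 1)
  where
    -- the mouse at (1,1) has only one non-Wall neighbour, at (2,1)
    tinyMaze = [ [Wall, Wall,  Wall]
               , [Wall, Space, Space]
               , [Wall, Wall,  Wall] ]
```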

Now, our mouse can 'see' around him and discern possible directions to travel in the maze.  However, he still has no idea which direction might be best.  We could provide no distinction at all between the possible moves the mouse could make, in which case the mouse would have to make an essentially 'random walk' through the maze (albeit being able to refrain from doubling back on himself).  Alternatively, we could provide the mouse with some ability to sense a better direction to travel - perhaps he can smell the pile of cheese lying just beyond the maze exit, or maybe he has some general sense of the direction of the exit. 

Let's give the mouse a general sense of direction to the Exit, rather than model any particular physical manifestation that might indicate the direction of the exit.  We'll allow our virtual mouse to know an approximate distance to the Exit from any location (thus, the possible locations of a move).  This will help discriminate between possible moves, but it will not help the mouse avoid the obstruction of maze walls, as distance is independent of such obstacles.  

OK, so let's create a distance function.  This will return some distance metric for any two locations (given by two pairs of x,y coords).  A true distance would involve the full Pythagorean equation, i.e.:
distance = sqrt(deltaX^2 + deltaY^2)
However, we don't really care about the exact distance.  We only need a distance score with which we can order a set of locations, and since sqrt is monotonic, ordering by the squared distance gives exactly the same order.  Consequently, it is sufficient to use:
distanceSqr = deltaX^2 + deltaY^2
or frankly, even the sum of absolute distances in x and y would do.
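For comparison, the absolute-distance variant mentioned above (often called the Manhattan distance) would look like this.  This is just a hypothetical sketch for illustration; the series itself uses the squared-distance version that follows:

```haskell
-- Manhattan distance: an even cheaper ordering metric than the
-- squared Euclidean distance, and equally adequate for ranking moves.
distanceAbs :: (Int, Int) -> (Int, Int) -> Int
distanceAbs (x, y) (x', y') = abs (x - x') + abs (y - y')

main :: IO ()
main = print (distanceAbs (0, 0) (3, 4))  -- → 7
```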

The actual code should be pretty straightforward by now:
distanceSqr (x,y) (x',y') = deltaX * deltaX + deltaY * deltaY
    where
        deltaX = x - x'
        deltaY = y - y'

Note that we're again unpacking two data structures (the coordinate pairs) in the argument list (LHS) of the function definition.  

Now that we have our distance function, we can specialise possibleMoves with a new function orderedMoves that will sort the locations from possibleMoves using their distance from the Exit.

orderedMoves maze x y = sortWith (\location -> distanceSqr location $ exitLocation maze) $
                            possibleMoves maze x y

The sortWith function (found in the GHC.Exts module) uses the result of a provided function as the key on which to sort each element (in this case a location).  Sorting is ascending, which is what we want: the nearest locations end up toward the beginning of the resulting list, and the more distant locations at the end.
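A quick standalone illustration of that sorting behaviour, using a hypothetical exit at (0,0) as the reference point:

```haskell
import GHC.Exts (sortWith)

-- Order candidate locations by squared distance from a hypothetical
-- exit at (0, 0).  sortWith sorts ascending on the computed key,
-- so the nearest location comes first.
main :: IO ()
main = print (sortWith (\(x, y) -> x * x + y * y) [(3, 4), (0, 1), (2, 2)])
-- → [(0,1),(2,2),(3,4)]
```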

Armed with some orderedMoves to try on each step along the path toward the Exit, we're now ready to generate paths through the maze.  We'll create a function that returns an arbitrary number of possible paths that the mouse could take given the navigation rules we're coding.  Again, this will be a 'generator' kind of function, and later on we can write functions that, for instance, just take the first path found, or take some number of generated paths and return the shortest.
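One of those later functions — taking some generated paths and returning the shortest — might look like this.  The helper name and details here are mine, not part of the series' code; it's just a sketch of the idea:

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Hypothetical helper: of some generated paths (each a list of
-- (x,y) locations), return the one with the fewest steps.
shortestPath :: [[(Int, Int)]] -> [(Int, Int)]
shortestPath = minimumBy (comparing length)

main :: IO ()
main = print (shortestPath [[(0,0),(0,1),(1,1)], [(0,0),(1,0)]])
-- → [(0,0),(1,0)]
```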

Let's write the type declaration for a function mousePaths:
mousePaths :: Maze -> Int -> Int -> [[(Int, Int)]]
This says that mousePaths will take a maze, two integers (actually an x and y starting location) and will return a list of lists of pairs of integers.  The inner list here is of course a path (a list of locations defined as (x,y) coordinates).  

Actually, that last type is a bit ugly, but we can clean it up by saying that the inner part has a name "Path":
type Path = [(Int, Int)]
in which case the type of mousePaths can be rendered:
mousePaths :: Maze -> Int -> Int -> [Path]
... which is a lot more readable.

Let's now think about how to generate paths.

Clearly, a path can be constructed by recursively choosing a new location from the end of the current path.  If the initial path is just the starting location, then we'll 'grow' a path from this point, using orderedMoves to select a new location from the last location.  

One important rule that we'll build into the path generation is that we won't let a path double back on itself.  This is easily achieved by disallowing any proposed new location that already exists in the path list we've already built (the locations previously visited). 

We know we have a new complete path to return when the next location is in fact the exit location.  

Here's a definition for mousePaths, that captures the above strategy:
mousePaths maze x y =
    let exit = exitLocation maze
        mousePaths' path = do
            let (thisX, thisY) = head path
            next@(nextX, nextY) <- orderedMoves maze thisX thisY
            guard $ not $ elem next path
            let newPath = next : path
            if next == exit then return newPath else mousePaths' newPath
    in map reverse $ mousePaths' [(x,y)]

A lot of the discussed rules should be recognisable right away, but there are a few things to pull out for discussion.

The mousePaths function is provided with the maze and an initial x and y value.  We need to define a recursive 'helper' function that deals exclusively in the paths that are being built (again, a path being a list of (x,y) location coordinates).  This local helper function (mousePaths') is defined in a let block along with a definition to obtain the exit location from the given maze. 

The outer function simply kicks off the search for mouse paths by calling mousePaths' with an initial path containing only the current location, constructed from the starting x and y coords provided as arguments.  Because paths are built up as lists (with the last location visited being the head of the list) we will need to reverse any paths that are returned by the mousePaths function; hence the "map reverse" applied to the list of paths returned by mousePaths'.

Clearly, the real meat of this function lies with the helper function mousePaths'.  While the type of this function is not provided in the source, it is quite interesting:
mousePaths' :: Path -> [Path] 
You can copy this type declaration into the source above the definition of mousePaths' if you want to verify this (or indeed to better annotate the source!).

This means that for any Path that this function is given, it will return a bunch of potential Paths.  In other words, mousePaths' represents the strategy of generating alternative new Paths, given some rules, from some initial Path.  In this case, we'll be extending the initial Path with new locations from the orderedMoves list on each iteration.  

As a side note, this is an example of a branching computation.  From some initial 'seed' path, containing only the starting location, we continually split into new Paths, one per legal move, each of which in turn splits again on recursion.  Thus, we 'concurrently' explore all the possible paths that are allowed by the branching rules.  
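The same branching idea can be seen in miniature in this standalone example (hypothetical, not part of the maze code): each recursive step splits into two alternatives, and the list monad explores every combination, just as each step of the path search splits into one branch per legal move.

```haskell
-- Each step branches into two alternatives (0 or 1); the list monad
-- explores all combinations, growing one result per branch.
bitStrings :: Int -> [[Int]]
bitStrings 0 = [[]]
bitStrings n = do
    b    <- [0, 1]              -- branch: two choices at this step
    rest <- bitStrings (n - 1)  -- recurse within each branch
    return (b : rest)

main :: IO ()
main = print (bitStrings 2)     -- → [[0,0],[0,1],[1,0],[1,1]]
```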

The body of the mousePaths' helper function defines the branching rules for exploring multiple Paths.  First, we simply get the last location of the current path we're processing.  This is the head of the path list.  The coordinates thisX and thisY are extracted from the location coordinate pair.  

Now we need to determine the potential next locations to which we could move.  We use orderedMoves to do this, so that the 'best' locations are tried first.  Note that the left arrow in this line works exactly the same as the left arrow in a list comprehension (in fact we're really doing another list comprehension, just in a different format!).  Each item returned by orderedMoves will therefore be bound to the symbols on the left hand side of the arrow, which in this case is a pattern match that extracts the location details nextX and nextY, as well as preserving the original location pair returned by orderedMoves as next.   The commercial-at sign (@) used in pattern matching simply allows us to bind a name for the whole value at the same time that we use pattern matching to extract details from the value. 

Just like the earlier list comprehensions, once we have a candidate value for returning, we can test to see if it conforms to some rules.  In this case, the only extra criterion is that the new location must not be any of the locations previously visited so far in this path.   To do that, we simply assert that the next location is not an element (elem) of the current path.  The 'elem' function is from the standard list library.  Whereas we were able to express predicates like this directly within a list comprehension, outside of this we must use the guard function to apply a predicate and decide whether to reject the current proposed next location (and therefore path under construction).  The guard function is from the standard Control.Monad library.
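To see that equivalence concretely, here is a standalone illustration: the do-block below behaves exactly like the list comprehension [x | x <- [1..10], even x], with guard playing the role of the comprehension's predicate.

```haskell
import Control.Monad (guard)

-- Equivalent to the list comprehension [x | x <- [1..10], even x]:
-- guard True lets a branch continue; guard False prunes it to [].
evens :: [Int]
evens = do
    x <- [1 .. 10]
    guard (even x)   -- reject odd candidates, like a comprehension predicate
    return x

main :: IO ()
main = print evens   -- → [2,4,6,8,10]
```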

If our proposed next location survives the guard, then we consider it fit for extending the current path.   Accordingly, we build a newPath value by prepending the next location to the existing path.   Again, we're building up the path at the head of the list, because lists are only cheaply modifiable at their head.  

Having determined the new path, all that remains is to decide whether we have 'arrived' at the Exit location.  If we have then this defines a complete path, which is returned as a 'result' Path from mousePaths'.  Otherwise, this partially constructed path to the Exit still needs further 'moves' to extend it to the Exit.  Of course, the function that extends any single Path is mousePaths' itself, so we just make a recursive call, but with newPath.  

Hopefully, you were able to follow all the logic here, but it may take a few reading passes.  

There is a little 'magic' operating beneath the hood here, but it's exactly the same as for a list comprehension.  The magic is simply that lists in Haskell belong to a type class (Monad, and specifically MonadPlus) that implements a branching computation strategy, where a single initial value (e.g. a Path) can expand into a list of values (e.g. Paths).  The guard function requires this membership, as it is used to prune away rejected paths (which are represented as empty lists).  We only ever get to see lists in the result that have not been 'rejected' in this way.  In any case, this behaviour (and of course list comprehensions) is an absolutely standard feature of Haskell and its core libraries.  

The final addition in this episode will be to simply wrap the mousePaths function with a function that provides the start location and only returns a set number of Path solutions.

mousePathsFromStart :: Maze -> Int -> [Path]
mousePathsFromStart maze n = take n $ mousePaths maze startX startY
    where
        (startX, startY) = startLocation maze

There shouldn't be anything surprising in this definition, given the foregoing.  

So, that's a wrap. 

In the next instalment, we'll add some console IO to plot mazes and the movement of the mouse step-by-step.