Geeky things and other research from a work in progress

2009-02-19

The strict danger of switching from pure to monadic Template Haskell

Due to several issues with the Template Haskell (TH) deriving in EMGM (which I knew were there, but needed significant motivation to fix), I realized it was time to convert much of my code from pure to monadic. From the beginning, I developed it to be as non-monadic (i.e. not in Q) as possible for all the reasons one is supposed to. This led to several challenges that were not too difficult to surmount at the time. Namely, reify and newName had to be used at the top level of my function hierarchy, and their values had be passed down. But now I really need reify in a few places in the depths of the code, and it is quite impractical to do otherwise. For example, I need to expand type synonyms, and if I do that at the top level, every name is now either a name or some adaptation of a type declaration.

Most of the refactoring was surprisingly easy. I simply changed uppercase constructors to their lowercase, monadic equivalents (see the docs for the difference). However, I ran into a problem in which a function specific to bifunctor types was getting applied to monomorphic types. This lead to strange errors (compile-time of course, since it's TH) that told me something had changed in the way the functions were used. It was very likely not the programming logic itself, because I was carefully migrating individual functions a few at a time and testing as I went. But somehow, once I reached the point in the code where I needed reify, I started getting these error reports (my code, though it was for unexpected inputs).

After several hours of tracing through with report, I found the problem. Here is the blah-ified (and simplified) original code:

-- Non-monadic:
blah :: ... -> ...
blah ... = ...

-- Monadic wrapper
blahM :: ... -> Q [...]
blahM ... = return [blah ...]

-- Exposed
fun :: ... -> Q ...
fun ... = do
  a <- ...
  b <- blahM ...
  let x =
        case ... of
          1 -> a
          2 -> b
          _ -> []
  return (... ++ x)

Most of the logic lies under blah, so it follows that I monadicized (monadified?) blah which required a minor changed to blahM:

blah :: ... -> Q ...
blah ... = ...

blahM :: ... -> Q [...]
blahM ... = do
  x <- blah ...
  return [x]

Now, this didn't affect anything immediately. I kept converting my functions to the monadic religion, and at some point, I suppose I evangelized too much. Then, I believe that blah, which by the original design was meant to be called lazily, became too strict. It was getting called every time, instead of only when the case expression matched on 2. Once I realized this, it was evident that I needed to change fun.

fun :: ... -> Q ...
fun ... = do
  x <-
    case ... of
      1 -> ...
      2 -> blahM ...
      _ -> return []
  return (... ++ x)

Now, everything is strictly normal again. No strange errors to keep me up at night. I believe now that blahM was always getting evaluated even though blah was not. So, as I pushed the monad down into the code, more and more things were getting strictly evaluated.

I keep learning that I don't understand laziness as well as I think I do. Or maybe it's that I don't understand strictness. Or perhaps I'm too lazy to strictly learn both.

I also resolve never to write non-monadic Template Haske11 code again (excluding small and/or simple functions, of course). The amount of work required is not worth the benefits gained.

No comments:

Post a Comment