[Ed: *This being the inaugural post in my technical blog, I felt I should start with a topic that's rather introductory and... *constructive* as the case may be.*]

I recently found myself confronted with the following problem in my Haskell code. I needed to create elements of an algebraic data type; however, in order to store the intermediate results of a computation, part of each constructor was generated from the other parts. It can be difficult to work with such a type, but there is a solution! Let me attempt to provide a scenario and explain both the naïve way and the *smart* way.

First, let's use the following data type as our pedagogical substrate:

dataExpr = Val Int | Add Expr Expr

The above tells me that this *expression* type may be constructed as a value `Val`

containing an integer `Int`

or as an addition operation `Add`

containing 2 subexpressions. With this fairly simple type, I might want to encode an arithmetic expression `e1`

such as `(4 + 2) + 7`

. The result might look like so:

e1 = Add (Add (Val 4) (Val 2)) (Val 7)

The canonical follow-up to this would be an evaluation mechanism for expressions:

eval :: Expr -> Int

eval (Val i) = i

eval (Add e1 e2) = eval e1 + eval e2

When I call `eval e1`

, it recursively traverses the tree described in `e1`

as evidenced by the calls to `eval`

in the `Add`

component.

Now, suppose I'm developing an application using the data type `Expr`

[Ed: *What—a calculator?*], and I need to deal with very large expressions. The expression trees can be very deep and wide, requiring `eval`

to spend time visiting every node. In order to remove the need for that potentially time-consuming traversal, I want to have expressions evaluated upon construction from the bottom up. So, given that the only real computation involved in our `eval`

function is addition, let's append the result type `Int`

onto the `Add`

constructor for storing the result.

dataExpr = Val Int | Add Expr ExprInt

Now, when we build our expression, we do the addition as well.

e2 = Add (Add (Val 4) (Val 2)(4+2)) (Val 7)(4+2+7)

But wait... That doesn't really make sense. Shouldn't the computer be forming the addition operations for us? This seems very error-prone. Let's try breaking down our construction of `e2`

, so that we're building a single expression at a time.

e3 = Add (Val 4) (Val 2) (4+2)

e4 = Add e3 (Val 7) (??+7)

That didn't really help much, either. I suppose we could extract out the `4+2`

bit like so...

i = 4+2

e5 = Add (Val 4) (Val 2) i

e6 = Add e3 (Val 7) (i+7)

but then it seems like we're creating extra work. I mean, this is *functional* programming! There should be an easier way to do this! [Ed: *Fine. Now, will you please get to the point soon.*]

And indeed there is! It's called a smart constructor. [Ed: *Oh really? The title did *not* give that away at all, I swear!*] The process is as simple as replacing the actual `Add`

constructor with a function. This function takes in the same arguments as `Add`

with the exception of the result value, computes the result, and returns a complete `Add`

. Here it is in all its glory.

makeAdd :: Expr -> Expr -> Expr

makeAdd (Val i1) (Val i2) = Add (Val i1) (Val i2) (i1+i2)

makeAdd (Val i1) (Add e2a e2b s2) = Add (Val i1) (Add e2a e2b s2) (i1+s2)

makeAdd (Add e1a e1b s1) (Val i2) = Add (Add e1a e1b s1) (Val i2) (s1+i2)

makeAdd (Add e1a e1b s1) (Add e2a e2b s2) = Add (Add e1a e1b s1) (Add e2a e2b s2) (s1+s2)

As you can see, there are no recursive calls in `makeAdd`

. We are effectively hiding the computation as well as the fact that there is this third argument to the constructor. Consequently, `makeAdd`

is just as easy to use as the constructor of the original `Add`

(before appending the `Int`

):

e7 = makeAdd (makeAdd (Val 4) (Val 2)) (Val 7)

In addition, we can rewrite `eval`

so that it will simply fetch the precomputed value of an `Add`

.

eval :: Expr -> Int

eval (Val i) = i

eval (Add e1 e2 s) = s

And that's the basics for smart constructors. There are a few more things to add, but I think I'll stop for now. Perhaps I'll revisit this subject in a later entry.

Welcome to Planet Haskell! I like your writing style, keep it up. =)

ReplyDeleteThanks for the encouragement! I will do my best.

ReplyDeleteI'm not exactly sure why your strategy should buy you anything in terms of efficiency (in time or space). Maybe you want to demonstrate the purpose of your idea in a follow-up article?

ReplyDeleteIf you are building this as you parse an expression, why not just eliminate the data type altogether and just build up the value directly? I just don't see how this buys you any efficiency or clarity.

ReplyDeleteWhy can't makeAdd simply be:

ReplyDeletemakeAdd e1 e2 = Add e1 e2 (eval e1 + eval e2) ?