[Ed: This being the inaugural post in my technical blog, I felt I should start with a topic that's rather introductory and... constructive as the case may be.]
I recently found myself confronted with the following problem in my Haskell code. I needed to create elements of an algebraic data type; however, in order to store the intermediate results of a computation, part of each constructor was generated from the other parts. It can be difficult to work with such a type, but there is a solution! Let me attempt to provide a scenario and explain both the naïve way and the smart way.
First, let's use the following data type as our pedagogical substrate:
data Expr = Val Int | Add Expr Expr
The above tells me that this expression type may be constructed as a value
Val containing an integer
Int or as an addition operation
Add containing 2 subexpressions. With this fairly simple type, I might want to encode an arithmetic expression
e1 such as
(4 + 2) + 7. The result might look like so:
e1 = Add (Add (Val 4) (Val 2)) (Val 7)
The canonical follow-up to this would be an evaluation mechanism for expressions:
eval :: Expr -> Int
eval (Val i) = i
eval (Add e1 e2) = eval e1 + eval e2
When I call
eval e1, it recursively traverses the tree described in
e1 as evidenced by the calls to
eval in the
Now, suppose I'm developing an application using the data type
Expr [Ed: What—a calculator?], and I need to deal with very large expressions. The expression trees can be very deep and wide, requiring
eval to spend time visiting every node. In order to remove the need for that potentially time-consuming traversal, I want to have expressions evaluated upon construction from the bottom up. So, given that the only real computation involved in our
eval function is addition, let's append the result type
Int onto the
Add constructor for storing the result.
data Expr = Val Int | Add Expr Expr Int
Now, when we build our expression, we do the addition as well.
e2 = Add (Add (Val 4) (Val 2) (4+2)) (Val 7) (4+2+7)
But wait... That doesn't really make sense. Shouldn't the computer be forming the addition operations for us? This seems very error-prone. Let's try breaking down our construction of
e2, so that we're building a single expression at a time.
e3 = Add (Val 4) (Val 2) (4+2)
e4 = Add e3 (Val 7) (??+7)
That didn't really help much, either. I suppose we could extract out the
4+2 bit like so...
i = 4+2
e5 = Add (Val 4) (Val 2) i
e6 = Add e3 (Val 7) (i+7)
but then it seems like we're creating extra work. I mean, this is functional programming! There should be an easier way to do this! [Ed: Fine. Now, will you please get to the point soon.]
And indeed there is! It's called a smart constructor. [Ed: Oh really? The title did not give that away at all, I swear!] The process is as simple as replacing the actual
Add constructor with a function. This function takes in the same arguments as
Add with the exception of the result value, computes the result, and returns a complete
Add. Here it is in all its glory.
makeAdd :: Expr -> Expr -> Expr
makeAdd (Val i1) (Val i2) = Add (Val i1) (Val i2) (i1+i2)
makeAdd (Val i1) (Add e2a e2b s2) = Add (Val i1) (Add e2a e2b s2) (i1+s2)
makeAdd (Add e1a e1b s1) (Val i2) = Add (Add e1a e1b s1) (Val i2) (s1+i2)
makeAdd (Add e1a e1b s1) (Add e2a e2b s2) = Add (Add e1a e1b s1) (Add e2a e2b s2) (s1+s2)
As you can see, there are no recursive calls in
makeAdd. We are effectively hiding the computation as well as the fact that there is this third argument to the constructor. Consequently,
makeAdd is just as easy to use as the constructor of the original
Add (before appending the
e7 = makeAdd (makeAdd (Val 4) (Val 2)) (Val 7)
In addition, we can rewrite
eval so that it will simply fetch the precomputed value of an
eval :: Expr -> Int
eval (Val i) = i
eval (Add e1 e2 s) = s
And that's the basics for smart constructors. There are a few more things to add, but I think I'll stop for now. Perhaps I'll revisit this subject in a later entry.