- F# Pattern Matching – Part 1 – Introduction
- F# Pattern Matching – Part 2 – Active Patterns (this post)
In part 1 of this series, we were introduced to pattern matching basics in F#. We started with very simple examples (standard
let bindings) and worked our way to more complicated examples with record types and discriminated unions. We also looked at the useful match expression construct.
In part 2, we will continuing our exploration of pattern matching via active patterns. In addition, I will sprinkle in a few useful tidbits that we didn’t have a chance to examine in part 1.
Just as standard pattern matching in F# has several types of patterns available there are quite a few variations of active patterns at our disposal. This myriad of patterns can empower us to create concise code that more closely models our business domain.
We will explore each of these active patterns in this post.
Let’s begin with the definition of active patterns, according to MSDN:
Active patterns enable you to define named partitions that subdivide input data, so that you can use these names in a pattern matching expression just as you would for a discriminated union. You can use active patterns to decompose data in a customized manner for each partition.
So what does this mean?
My simplified explanation is that you can use active patterns to match an abstraction, rather than the specific underlying data. You can hide away some complicated logic that represents the pattern you are trying to match and give it an appropriate name. This sounds a lot like a function doesn’t it? However, the compiler will not allow functions as a pattern in a match statement. Active patterns give you a way to do that.
Let’s dive into some examples to illustrate this point.
I’ve seen this active pattern referred to as single-case, single-complete and single-total. I prefer single-case myself, but if you see any of those terms when reading about active patterns, this is the one being referenced.
A single-case active pattern can be thought of as a pattern match compliant function that represents a single-case discriminator (think of it as a single-case discriminated union that contains a function body). Let’s do an example.
First I will show you a function being used in a match expression.
Here you can see that we are trying to do a match expression on a pattern that is a function (an abstraction that counts the number of words in a string). Now we could have just as easily put the code from the function body directly in the match expression like so:
match document.Split([|' '|]).Length with
However, the function models our domain better (i.e. it is more self documenting as to the intention of the match statement). Also, we may have more places in our code that we need to use the
countWords function. Lastly, this is a contrived example where were are simply using a one line function body.
Unfortunately, line 6 (just the first illegal line of many that the compiler hits) will give you the following compiler error:
The pattern discriminator ‘countWords’ is not defined
Bummer. So how can we get around this limitation? Active patterns to the rescue!
This looks very similar to the previous example. The only difference is that we put
|) (called banana clips) around the
countWords function identifier (and we also changed the name to
CountWords, to follow standard naming conventions). What this does is transform the identifier into an active recognizer.
Let’s examine the match expression a little further. The expression starts with
match document with. This is passing the value in
document to the list of patterns. The first pattern makes a call to the
CountWords active recognizer, passing the value from
document as a parameter. The expression is then evaluated and returned to be matched against the rest of the pattern (in this case, the integer
5). This repeats for each pattern until a match is found. You’ll notice we are doing three types of patterns:
- Matching against the literal value
- Capturing the result of
wcand comparing it to a
- Capturing the result of
wcand treating it like a wildcard.
Number 3 is required to ensure we have a complete match expression, otherwise our result could fall through the bottom (the compiler will warn us too). Remember from part 1 in this series, we could have used the underscore character if we didn’t need to use the value on the right hand side of the
| Countwords _ -> printfn "too many words to count").
We can also combine multiple single-case active patterns to create something like a validation chain.
Here we’ve created two single-case active patterns checking the character count and whether or not we have at least one numeric character. Next, we’ve combined those together in a match expression for
IsValidPassword, which is itself a single-case active pattern. These active patterns combine to form a (poor) password validation chain that returns a tuple of the validation result and a failure reason (note: check out this excellent post on a much better approach for success/failure strategies).
A quick digression here. Note the tidbit I snuck in with the
IsValidPassword active pattern. The match expression (
match ... with) has been replaced by the
function keyword. Also, the parameter being passed into
IsValidPassword is not there. This is a short-hand for any active pattern (or function for that matter) that simply does a pattern match on its parameter (or specifically, its last parameter). It also comes in handy for inline pattern matching in lambda expressions. I will not show that here, but take a look at this stackoverflow question for some good explanations.
So with these single-case active patterns in place, we could create a function such as this:
setPassword function calls the
IsValidPassword active recognizer, which in turn calls each active recognizer contained inside the
IsValidPassword until a match is found. The result returned is a tuple that is then matched for
(true, _) or
(false, failureReason). This will then result in either the password being set or an exception being thrown (see the two examples at the bottom of the code sample).
While, we’re still on this example, we’re going to look at a few different ways to use
Here we’re actually using the active recognizer directly in the parameter of
setPassword. This is nice as it simplifies the match expression in the function. The end result is the same (although I changed the bad password to trigger the other failure message this time).
Here is one final thing I will show you before we move on to the next type of active pattern.
From this example you can see that a
let expression is just a pattern match. This allows us to use the
IsValidPassword directly, without needing a separate function. Very cool!
By the way, those tick marks for
message' on the
goodPassword let expression are just because we already have
message in the line of code above. We’re just avoiding identifier name collisions. The ticks are nothing special.
By now you’re probably in single-case active pattern overload. So let’s move on to the next type of active pattern.
As the named suggests, you can create a single-case active pattern that contains extra parameters. Let’s jump right to an example.
There is not much of a difference from a standard single-case active pattern here other than passing a couple of extra parameters to the active recognizer
GreaterThanThreshold. The key thing to note is that the value from the
match .. with (in this case the
text identifier) expression is passed in as the last value to the active recognizer.
Remember too, that we could have done the following:
profanityChecker function, we’ve used the
function keyword shortcut in place of explicitly declaring the
text parameter and then using it in the
match text with expression.
There’s not much more to say about parameterized single-case active patterns. As I said, they are exactly the same as normal single-case active patterns except for allowing extra parameters.
I’ve heard this active pattern referred to as multiple-case and multiple-total. This active pattern is a nice way to classify some value and optionally decompose it. Let’s do a couple of examples. We’ll start with a very simple one that is sort of the canonical multiple-case active pattern.
The notable difference from the previous examples is that there are two active recognizers in the same definition,
(|Even|Odd|), separated by a
| (pipe) character. In a way, this resembles a discriminated union. Conveniently enough, the function
describeNumber looks like a pattern match on a discriminated union too. That’s probably the easiest way to reason about multiple-case active patterns.
This example was simply classifying the input (the integer parameter) as
Odd based on the expression in the multiple-case active pattern.
We can do more interesting things by decomposing the data, however. Let’s do a more complicated example, showing how we can take advantage of multiple-case active patterns with abstractions that represent our domain.
We’ve created four active recognizers in our multiple-case active pattern (you can have a maximum of seven) representing the four seasons of the year. Let’s breakdown what is going on:
- Our multiple-case active pattern takes a
DateTimeobject and decomposes it into a tuple representing the
- The tuple then goes through the following match patterns:
Dayis disregarded) or
Dayis disregarded) return
Dayis disregarded, but we already failed the
< 21check, so it has to be
>= 21) or
Dayis disregarded) or
Dayis disregarded) return
- Repeat similar pattern for remaining seasons…
- The last pattern is our catchall, which will match when
Dayhas to be
>= 21because of the previous pattern). Using the underscore for the final pattern makes the compiler happy since it doesn’t know that the integer representing
Monthhas to be in the 1-12 range (assuming we can trust the .Net base class library).
You may have noticed, I slipped in another tidbit we never covered in part 1. That is the single
| (pipe) used to combine match patterns into a single
OR expression on one line. This is a convenient way to return the same result for multiple pattern matches. There is one caveat though. All of the patterns must bind to the same variable(s). So that means we couldn’t do the following to the match expressions, even though it is a bit more convenient:
This will result in the following compile error:
The two sides of this ‘or’ pattern bind to different sets of variables.
This is because
Day is being bound to
_ (which really disregards the day) for the
2 cases, but
d for the third case. Also, even removing the
when clause or changing the two
_ to a bound identifier other than
d will still result in the same error. It cannot be different between the cases, period.
Now with that being said, we could shorten the code by doing the following (I cleaned it up to look nicer and more readable, though I tend not to space things this way. It makes code maintenance harder when you add new cases etc.):
Each pattern is decomposing the
Day into the same
d variable. It just has to be the same for each
OR clause (the same line), not the same for each pattern (separate lines). The downside to this approach is that we are binding
Day to the
d identifier, but then not using it on the right hand side of
->. It is required to use an identifier for the
when clauses however, so we can’t just ditch the
So it is a toss up and a matter of preference. This version is probably more readable to most people, but it creates unnecessary bindings.
Now that we have this nice little multiple-case active pattern to determine the season, let’s make use of it. We’ll even use another
OR pattern for good measure.
Here we have a pattern matching function that will take a
DateTime and convert it to a Canadian season. In Canada we joke that we really only have two seasons: Construction and winter. So our function will take a match on
Fall and print Construction season. A match on
Winter will print Winter season.
Remember that by using the
function keyword shortcut, since this function only contains a pattern match expression, we don’t have to specify the
DateTime parameter explicitly or the
match .. with.
Now before we move on to the next active pattern, I shall digress for a moment. Since we are talking about the
OR match pattern, we should probably look quickly at the
AND match pattern.
As you would expect, we use the
& (ampersand) character. We’ll go back to a simple combination of single-case active patterns for this example.
This is a solution to the standard FizzBuzz problem. Yes I went there. It should be pretty self-explanatory. If you don’t know what that is and can’t deduce it from the code (and comments), checkout the link above.
There is also a nice solution doing something similar that I found on FsSnip using multiple-case patterns.
While we’re in this digression intermission, lets hit on one final tidbit that didn’t make into part 1. I promise we’ll get back to active patterns after this.
Maybe you’ve heard of partial application for functions. That is where you make a new function that calls another function with some parameters baked-in. This is an effective form of code reuse in F#. Check out this post for an excellent overview.
Active patterns also allow partial application. Let’s look at a rather silly example.
So what are we doing here? First we have created a parameterized single-case active pattern called
StartsWith. This takes a string,
text, which is then checked to see if it starts with the first parameter
find. The partial application happens on line 4 with
IsAnFWord. We are baking the string
f into a call to
wordCheck function can then use the
IsAnFWord active recognizer in a match statement without needing to pass in the first parameter (remember the string
f is already included for us), while still using
StartsWith (which does requires the first parameter to be passed in).
I hope that made sense. If not, do some reading on regular partial application for functions (there are lots of examples out there). The exact same principles apply to partial application for active patterns.
Ok, intermission is over. Let’s move on to the next active pattern.
Partial Active Pattern
Ironically, partial active pattern sounds similar to partial application for active patterns (what we just talked about), but they are not the same thing. Partial active patterns are also referred to as single-case partial active patterns or a few other variations. There are no multi-case partial active patterns though, so partial active pattern should suffice.
So what does partial active pattern mean? The simplified explanation is that sometimes you will get a match and sometimes you won’t. So what is returned is an option type. If you don’t know about the option type, go check it out. I’ll wait here.
Sometimes option types are a pain, because you need to match against
Some x (to extract the value and store it in
x). Luckily, partial active patterns allow you use option types for cleaner pattern matching expressions without the extra fuss. Recall our
FizzBuzz example from earlier:
In lines 6, 7 and 8 we are forced to match against the value
true. This seems like extra noise that would be nice to avoid (unless for some reason we wanted to match against
false of course). So how can we make this a little nicer? By using a partial active pattern for both
Notice we’ve added a
_ (underscore) as a second active recognizer for
Buzz. This signals that we are dealing with a partial active pattern, which then requires us to return an option type (the compiler will complain if you do not). We have done that by returning the given active recognizer wrapped in an option type (calling the
Some constructor) when our condition is
true and returning
None when our condition is not met.
Looking at the
fizzBuzz function, you can see that we no longer require
true for the match expression.
Let’s try a more interesting example to show off the power of partial active patterns. We’ll do some log parsing.
There is quite a lot going on here in this small amount of code.
First, we have created a discriminated union to hold our types of log messages. We can have an
Error. Each union type holds a
DateTime and a
Message string. For
Error, there is also a
Next we have two partial active patterns that are wrapping the
Int32.TryParse functions. These will handle invalid data for us by wrapping data in an
Option type. Either we have the data we expect in the proper format or we don’t. When we don’t, we’ll get
None returned, which then saves us from ugly error handling code.
Now let’s put this code to use with another partial active pattern.
This partial active pattern tokenizes a line and attempts to match the tokens against our expected format for an
Error (we’ll see the format shortly).
Notice in the match expression, we can decompose the tokens nicely using a
List match pattern, which in turn decomposes data using our existing
Int partial active patterns.
So if we have a proper
DateTime, then the string
Integer and then our message string, we will match successfully and create an
Error, wrapped in an
Option. If any of those pieces don’t match we return
We could do the same for
Warning, but I’ll leave that out for brevity. Next let’s create a function to parse errors from the log file and then test it out.
parseErrors takes a
log string sequence and recursively parses each line (we convert the sequence of strings to a list of strings on the initial call to the recursive function on line 9 to take advantage of list pattern matching). If an
Error is matched via our partial active pattern for errors, we prepend it to our list of errors. Otherwise, we just continue parsing the next line.
For our test, we have sample sequence of strings (stored in
testLog) in the format
[DateTime] [Type] [Error Code if applicable] Message. This sequence is to simulate reading lines from a log file. We take that sample log and call
parseErrors to produce a list containing only errors.
Ok, on to the final active pattern.
Parameterized Partial Active Pattern
A parameterized partial active pattern is just the same as a partial active pattern, but with one or more extra parameters. Let’s jump straight to our example.
This is very similar to the regular partial active pattern FizzBuzz example. However, we’ve added a
MultipleOf partial active pattern that takes two parameters. The first parameter is our
multiplicand and the second parameter is our
number that we are matching against. With this parameterized partial active pattern, we can then take advantage of partial application for the
Buzz partial active patterns by baking in the
5 values. The rest of the code is exactly the same as the previous example used in regular partial active patterns.
So concludes my extremely long winded explanation of active patterns and a few extra tidbits thrown in for good measure. Hopefully you see how they can take pattern matching to the next level to allow your problem domain to shine through your code better.
Thanks for reading. Please leave any questions or suggestions in the comments.