Pattern Matching

Rust also has match, which resembles match in many functional programming languages and depends heavily on types. Perl's attempt at a switch syntax with smart-matching is perhaps the closest analogy. Indeed, I think smart-matching failed because it tried to use information about the types of its operands, and Perl didn't really have that information. In Perl the operator normally decides how its operands are treated, but the smart-match operator, unlike every other operator, was not the boss.
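To make the dependence on types concrete, here is a minimal sketch of match over an enum. The Token type and the describe function are made up for illustration; they do not come from any real program.

```rust
// Hypothetical type for illustration only.
enum Token {
    Number(i64),
    Word(String),
    Pair(i64, i64),
    Empty,
}

// Describe a token by matching on its variant and its contents.
// The compiler checks that every variant is covered.
fn describe(token: &Token) -> String {
    match token {
        // A literal inside a pattern matches one specific value.
        Token::Number(0) => "zero".to_string(),
        // A guard adds a condition to a pattern.
        Token::Number(n) if *n < 0 => format!("negative number {n}"),
        Token::Number(n) => format!("number {n}"),
        Token::Word(w) => format!("word {w:?}"),
        // Patterns destructure, binding the parts to names.
        Token::Pair(a, b) => format!("pair summing to {}", a + b),
        Token::Empty => "nothing".to_string(),
    }
}

fn main() {
    let tokens = [
        Token::Number(0),
        Token::Number(-3),
        Token::Word("Tim".to_string()),
        Token::Pair(2, 5),
        Token::Empty,
    ];
    for token in &tokens {
        println!("{}", describe(token));
    }
}
```

Because the type of token is known, the compiler can reject a match that forgets a variant, which is exactly the information smart-matching never had.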

Update (2022-05-29): I talked about pattern matching at the Rust & C++ Cardiff book club. We are reading The Rust Programming Language together and I did the bit on chapter 18 (it starts about 50 minutes in). Also, you can view my slides.

Regular Expressions

When we think about pattern matching in Perl, we probably think of regular expressions. Regular expressions aren't just built into Perl, they are part of its DNA.

Rust has no built-in regular expressions, nor even any in the standard library. To employ regular expressions in a Rust program, we have to import a library from crates.io.

But before reaching for a regular expression library, consider other solutions. We use regular expressions for all sorts of things in Perl because they are so fast and easy. A simple string match like

my $s = 'Timothy';

say $s if $s =~ /^Tim/;

might better be done in Rust with a string method


fn main() {
    let s = "Timothy";

    if s.starts_with("Tim") {
        println!("{s}");
    }
}

No regular expression required.
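A few more everyday Perl matches have direct string-method counterparts. The sketch below uses only the standard library; split_setting is a hypothetical helper, and the Perl regexes in the comments are rough equivalents rather than exact translations.

```rust
// Hypothetical helper: split "key = value" at the first '=' and trim
// whitespace around both halves.
// Roughly Perl's /^\s*(.*?)\s*=\s*(.*?)\s*$/ with captures $1 and $2.
fn split_setting(line: &str) -> Option<(&str, &str)> {
    let (key, value) = line.split_once('=')?;
    Some((key.trim(), value.trim()))
}

fn main() {
    let s = "Timothy";

    // Perl: $s =~ /thy$/
    println!("{}", s.ends_with("thy"));

    // Perl: $s =~ /mot/
    println!("{}", s.contains("mot"));

    // Perl: ($rest) = $s =~ /^Tim(.*)/
    println!("{:?}", s.strip_prefix("Tim"));

    // Returns None when there is no '=' at all.
    println!("{:?}", split_setting("name = Tim"));
}
```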

But when we do wish to use a regular expression library in Rust, there are several to choose from. Probably the most common choice is regex, which provides regular expressions like those of Perl's re::engine::RE2, not its built-in engine. If we need any of Perl's fancier features, such as backreferences or look-arounds, which regex leaves out so that it never has to backtrack, we will need to choose a different library.

Here is a Perl regex I used in is_epoch

    # Version 1 UUIDs have timestamps in them. For example,
    # 33c41a44-6cea-11e7-907b-a6006ad3dba0 => 1e76cea33c41a44
    # -------- ----  ---               $+{high}$+{mid}$+{low}
    # low      mid   high
    #  8        4     3
    # and 1e76cea33c41a44 => 2017-07-20T01:24:40.472634Z
    my $UUIDv1 = qr{(?<low>[0-9A-Fa-f]{8})    -?
                    (?<mid>[0-9A-Fa-f]{4})    -?
                    1                            # this means version 1
                    (?<high>[0-9A-Fa-f]{3})   -?
                    [0-9A-Fa-f]{4}            -?
                    [0-9A-Fa-f]{12}              }mxs;

then later I use it like so

if ($arg =~ /^$UUIDv1$/) {
    ...
}
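The comment in that regex compresses some arithmetic: the three captured fields are glued together in high, mid, low order to form a count of 100-nanosecond ticks since 1582-10-15, which then has to be shifted to the Unix epoch. Here is a minimal sketch of that step in Rust. The function name uuid_v1_unix_time is hypothetical and this is not the code from is_epoch or epochs-cli, only an illustration of the calculation.

```rust
// Number of 100-nanosecond ticks between the start of the Gregorian
// calendar (1582-10-15), which version 1 UUIDs count from, and the
// Unix epoch (1970-01-01).
const GREGORIAN_TO_UNIX: u64 = 0x01B2_1DD2_1381_4000;

// Hypothetical helper: take the three captured hex fields and return
// (whole seconds since the Unix epoch, leftover 100-ns ticks).
// Returns None for non-hex input or a time before 1970.
fn uuid_v1_unix_time(low: &str, mid: &str, high: &str) -> Option<(u64, u64)> {
    // Reassemble in high, mid, low order: 1e7 6cea 33c41a44.
    let hex = format!("{high}{mid}{low}");
    let ticks = u64::from_str_radix(&hex, 16).ok()?;
    let since_unix = ticks.checked_sub(GREGORIAN_TO_UNIX)?;
    // Ten million 100-ns ticks make one second.
    Some((since_unix / 10_000_000, since_unix % 10_000_000))
}

fn main() {
    // Fields of 33c41a44-6cea-11e7-907b-a6006ad3dba0 (version digit dropped).
    let (secs, ticks) = uuid_v1_unix_time("33c41a44", "6cea", "1e7").unwrap();
    // prints 1500513880.4726340, which is 2017-07-20T01:24:40.472634Z
    println!("{secs}.{ticks:07}");
}
```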

A similar Rust regex I used in epochs-cli looks like this


static RE: Lazy<Regex> = Lazy::new(|| {
    Regex::new(
        r"(?x)
        ([0-9A-Fa-f]{8})  -?
        ([0-9A-Fa-f]{4})  -?
        ([0-9]{1})
        ([0-9A-Fa-f]{3})  -?
        [0-9A-Fa-f]{4}    -?
        [0-9A-Fa-f]{12}
        ",
    )
    .unwrap()
});

then later I used it like this


if let Some(cap) = RE.captures(text) {
    ...
}

We've already talked about the regex crate, but what's that Lazy business? Compiling a regex is not cheap, so we want to do it only once. Lazy does exactly that: the regex is compiled at run time, the first time RE is used, and every later use gets the already-compiled regex. This is similar to what qr does in Perl. Lazy comes from the once_cell crate. That is, at the top of this program we have


use once_cell::sync::Lazy;
use regex::Regex;

which provides the Lazy and Regex which appear in the code above.
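To see the initialize-once behaviour in isolation, here is a sketch that uses only the standard library. It swaps in std::sync::OnceLock (available from Rust 1.70) for once_cell's Lazy, and a made-up build_table function stands in for Regex::new, so treat it as an illustration of the idea rather than the code from epochs-cli.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the expensive step actually runs.
static BUILDS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for Regex::new: pretend this is costly.
fn build_table() -> Vec<char> {
    BUILDS.fetch_add(1, Ordering::SeqCst);
    ('0'..='9').chain('a'..='f').chain('A'..='F').collect()
}

// Returns the shared table, building it on the first call only.
fn hex_digits() -> &'static [char] {
    static TABLE: OnceLock<Vec<char>> = OnceLock::new();
    TABLE.get_or_init(build_table)
}

fn main() {
    // Nothing has been built before the first use.
    println!("builds before first use: {}", BUILDS.load(Ordering::SeqCst));

    // The first call builds the table, the second reuses it.
    println!("{} hex digits", hex_digits().len());
    println!("{} hex digits", hex_digits().len());

    println!("builds after two uses: {}", BUILDS.load(Ordering::SeqCst));
}
```

The counter goes from 0 to 1 and stays there, which is the property we want for a compiled regex.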

So there is a bit more ceremony involved in using regular expressions in Rust, but it's not too bad. And we use them less often in Rust because of the rich set of string methods at our disposal, so it's really not a problem.