Perl to Rust

Introduction

Many introductions to Rust already exist. Most of them are aimed at C++ programmers. That makes sense, but lots of folks are coming to Rust from other languages now.

My current[1] day job is mostly Perl. It occurred to me that an introduction to Rust aimed at people who already know Perl could be useful.

Rust is exciting to Perl programmers for a number of reasons.

  • We can write faster programs. Rust is generally more performant than Perl. Often much more performant.
  • We can write more memory-efficient programs. Rust gives us much greater control over the memory we use than does Perl. Our Perl programs are often memory hogs.
  • We can write multi-threaded programs. Rust provides easy access to threads. In Perl, we are largely restricted to processes.
  • We can write Perl extension modules in Rust. Rust interacts with the C ABI really well. If we have a slow Perl module that we were thinking of re-writing in C, we can consider writing it in Rust instead.
  • We can target WebAssembly in Rust. If we care about web programming, we might be able to use Rust in place of much of our JavaScript. And with the advent of WASI, there are more reasons to target WebAssembly; it's not just for the web anymore. For example, with Krustlet we can substitute WebAssembly artifacts for Docker containers in the Cloud. And now there is Spin!

Rust is a big, complicated language. Where do we begin? It feels like we have to know everything at once. As Lisa Passing put it, the learning curve can be more like a wall you hit. But she goes on to say, "once you've learned how to walk through this wall...you can walk through walls."

I want to walk through walls!

Having to know everything at once makes it hard to teach Rust as well. It seems like no matter where we start, we are always touching on concepts that we haven't covered yet. This is quite the opposite of Perl, where it's fairly easy to learn as we go. But perhaps making this one assumption--- that we all know Perl--- will help us navigate the complexities of Rust. I don't know if this is going to work, but I thought I'd try it.

-- Tim Heaney


[1] Update: I no longer have this job. If you would like to hire me, please get in touch!

Installation

If you don't already have Rust installed, we should probably start there. It's not entirely necessary. You could explore for some time on the Rust Playground without installing anything. But eventually, you'll probably want to install it.

System Rust

One way to install Rust might be through your operating system. I am writing this on a Debian 10 machine, so I could install Rust with

$ sudo apt install rustc

or, better, install Cargo with

$ sudo apt install cargo

which has rustc as a dependency.

Rustup

Another way to install Rust is with the amazing rustup. Just as we often install and manage multiple versions of Perl with perlbrew or plenv, we likely want to do the same for Rust. If we're doing this to ensure we always have the latest version, then it's even more important for Rust than for Perl. They release a new version of stable Perl about once a year. They release a new version of stable Rust every six weeks! That sounds like it could be painful, but it's usually no big deal. We run one command

$ rustup update stable

and a few seconds later, we have the latest Rust toolchain. I can handle that every six weeks.

Overview

Tools

As we've just seen, Rust includes some great tooling. Things like cargo and rustup make it a real pleasure to use. And we'll see more as we go.

Rust              Perl
rustc             perl
cargo             cpanm, dzil, and more
rustup            plenv, perlbrew
rustfmt           perltidy
clippy            perlcritic
rustdoc           perldoc
module            module
crate             distribution
crates.io         metacpan.org
Rust Foundation   The Perl Foundation

We've spent decades coming up with some of this stuff in Perl. In Rust, it's all here already!

Differences

As we'll see, there are a lot of huge differences between Perl and Rust.

Rust                          Perl
static types                  dynamic types
strong types                  weak types
move, borrow                  copy, reference
immutable by default (mut)    mutable
private by default (pub)      public
expressions                   statements, expressions

Similarities

But some things will be familiar. Scope works pretty much the same. Both languages use the use keyword and :: in similar ways. And both have lots of "C-like" syntax in common.

Hello, Rust!

Let's start with the "hello world" program! This was originated by Brian Kernighan some 40 years ago. It's a brilliant piece of pedagogy in C. It's less interesting in Perl and Rust, but it will serve our purposes nonetheless. The "hello world" program in Perl might look like this

perl -E 'say "Hello, World!"'

Here we call perl directly, feeding it our Perl source code to be immediately interpreted. Alternatively, we could put our code in a file

#!/usr/bin/env perl

use v5.28;
use warnings;

say "Hello, World!";

and then feed that file to perl. If the file were called hello, then we could call perl directly

$ perl hello
Hello, World!

or through the shebang line

$ ./hello
Hello, World!

In Rust, the "hello world" program looks like this

fn main() {
    println!("Hello, World!");
}

To run it, we must first compile it. If it were in a file called hello.rs, we could call the rust compiler directly with

$ rustc hello.rs

This would create another file, hello, which is an executable

$ ./hello
Hello, World!

Our Rust source code is no longer required. We have a stand-alone executable, specific to our machine.

$ file hello
hello: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=05fddacb48ac6fa3ac26ff897c61f31941a04c4d, with debug_info, not stripped

However, if we wanted it to run on some other machine, we might have to recompile it for that platform.

Contrast that with Perl. Our Perl source code can be run by any perl on any platform. Perl itself is a big ole C program that must be compiled to a perl executable for each platform. But once we have it (often it comes with the operating system, so we don't even have to compile it), we can run any Perl code from anywhere.

[Figure: compile and run string diagram]

So that's the first big difference in our workflow. We run Perl immediately as we are developing, but we must first compile our Rust.

Hello Cargo

In truth, we rarely call the Rust compiler directly like that. Rust comes with a package manager called cargo which does all kinds of things for us, including calling rustc.

In fact, cargo will write the "hello world" program! If we use "cargo new" to create a new project

$ cargo new hello
     Created binary (application) `hello` package

Then we get a directory called "hello" with a few things already in it

$ tree hello
hello
├── Cargo.toml
└── src
    └── main.rs

1 directory, 2 files

If we look in that src/main.rs file, we see the "hello world" program

$ cat hello/src/main.rs 
fn main() {
    println!("Hello, world!");
}

If we change to that directory, then "cargo run" will first call the compiler and then run the resulting executable

$ cd hello
$ cargo run
   Compiling hello v0.1.0 (/home/tim/hello)
    Finished dev [unoptimized + debuginfo] target(s) in 0.28s
     Running `target/debug/hello`
Hello, world!

So this is more like a typical workflow in Rust. Not bad, right? Cargo really takes the rough edges off. It does a whole lot more too, so we'll be seeing it again.

[Figure: cargo run string diagram]

Output

Let's go back to our "hello world" programs, in Perl

#!/usr/bin/env perl

use v5.28;
use warnings;

say "Hello, World!";

and Rust

fn main() {
    println!("Hello, World!");
}

We see that println! in Rust is like say in Perl. There are similar things for print and warn.

Rust        Perl
print!      print
println!    say
eprint!     warn
eprintln!   warn

But for anything more complicated than a simple string like "Hello, World!" we need to format things first. In that sense, maybe a better analogy is to Perl's printf and sprintf

Rust        Perl            TMTOWTDI
print!      print           printf
println!    say
eprint!     warn
eprintln!   warn
format!     interpolation   sprintf

Instead of interpolating the values of variables as in Perl

my $name = 'Tim';

say "Hello, $name!";

we format them like so


fn main() {
    let name = "Tim";

    println!("Hello, {}!", name);
}

The curly brackets ({}) are a placeholder for the value of name. To insert more variables, add more pairs of them.


fn main() {
    let name = "Tim";
    let salutation = "Mr.";

    println!("Hello, {} {}!", salutation, name);
}

(I don't think this is common elsewhere, but here in Baltimore I am much more likely to be called "Mr. Tim" than either "Mr. Heaney" or "Tim"; especially by younger people.) Anyway, there are lots more options, but you get the idea.
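To give a taste of those other options, here is a small sketch of a few format specifiers next to their rough sprintf equivalents. The price_line helper is made up for illustration; only the format strings matter.

```rust
// Hypothetical helper, for illustration only: one line of a price list.
fn price_line(label: &str, amount: f64) -> String {
    // {:<8} left-aligns in 8 columns; {:>8.2} right-aligns with 2 decimals.
    // In Perl: sprintf("%-8s%8.2f", $label, $amount)
    format!("{:<8}{:>8.2}", label, amount)
}

fn main() {
    println!("{}", price_line("coffee", 3.5));
    println!("{:05}", 42);  // zero-padded to width 5, like %05d
    println!("{:#x}", 255); // hexadecimal with a 0x prefix, like %#x
}
```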

One of my favorite ways to show output while I'm working on Perl is with Data::Printer. We can achieve something similar with Rust's formatting.


fn main() {
    let name = "Tim";
    let salutation = "Mr.";

    println!("Hello, {:?} {}!", salutation, name);
}

When we run this, we get something slightly different. The {} formatted the variable according to something called its Display trait. But the {:?} formats it according to its Debug trait. It doesn't make much difference here, but in practice I find it indispensable.

I often use Data::Printer in Perl to print out the values of complex data structures. It does that with clever introspection at run time. We can't do anything like that in Rust, but if we make sure the data structures we care about implement the Debug trait (and the #[derive(Debug)] attribute does exactly this for us), then we can just print them out with {:?}. If they're very complex, we can print them out with {:#?}, which is the same thing only prettier (arguably more like Data::Printer).

Rust                      Perl
println!("{:#?}", foo)    use DDP; p $foo
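As a minimal sketch of how that looks on a type of our own, here is a made-up Person struct (not from any real project) that derives Debug so it can be printed both ways.

```rust
// Hypothetical struct, for illustration only.
#[derive(Debug)]
struct Person {
    salutation: String,
    name: String,
}

fn sample() -> Person {
    Person {
        salutation: String::from("Mr."),
        name: String::from("Tim"),
    }
}

fn main() {
    let p = sample();
    println!("{:?}", p);  // compact, all on one line
    println!("{:#?}", p); // pretty-printed, one field per line
}
```

Without the derive line, both println! calls would be compile-time errors, because Person would have no Debug implementation.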

I've been using println! everywhere (which is generally true in practice as well), but this format syntax works for all of the macros above.


fn main() {
    let name = "Tim";
    let salutation = "Mr.";

    // Format the string and return it.
    let greeting = format!("Hello, {} {}!", salutation, name);
    println!("{}", greeting);

    // Print the line to stderr, rather than stdout.
    eprintln!("Hello, {} {}!", salutation, name);
}

There is also a handy little dbg! macro, which prints out the variable name and value, along with the file name and line number, to stderr

Rust         Perl
dbg!(foo)    warn "\$foo = $foo"
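One nice property of dbg! is that it hands back the value it was given, so we can wrap it around an expression without otherwise changing the code. A small sketch, with a made-up double function:

```rust
// Hypothetical function, for illustration only.
fn double(x: i32) -> i32 {
    // dbg! prints the expression and its value to stderr,
    // then returns the value, so it can sit in tail position.
    dbg!(x * 2)
}

fn main() {
    let y = double(21);
    println!("{}", y);
}
```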

Update (2022-01-13): With the release of Rust 1.58, we can put captured identifiers in format strings! This is kind of like interpolation in Perl!


fn main() {
    let name = "Tim";
    let salutation = "Mr.";

    println!("Hello, {salutation} {name}!");
}

Macros

What are macros? When we see a "function" with an exclamation point in its name like println!, format!, and dbg!, we know it is not actually a function, but a macro.

Macros are how we do metaprogramming in Rust. A metaprogram is a program that generates a program. In Perl, we do this with things like source filters, Devel::Declare, Moose, and eval. In Rust, we have macros.

Macros are not Rust code, per se. Rather, they generate Rust code. They are expanded early in compilation, before the rest of the compiler gets to see the code. As such, they can do things that functions can't. For example, you may have noticed that println! is variadic. We've already called it with one, two, and three arguments.


fn main() {
    let name = "Tim";
    let salutation = "Mr.";

    println!("Hello, World!");
    println!("Hello, {}!", name);
    println!("Hello, {} {}!", salutation, name);
}

You may not have noticed, because that's normal for Perl. But Rust does not have variadic functions. We couldn't make a println function like this. But we can do it with a macro.

There's a lot more to macros, but for now it's probably enough to know that the exclamation mark indicates we're looking at a macro rather than a function. Sort of like the sigils in Perl!
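To make the variadic point concrete, here is a minimal sketch of a declarative macro that takes any number of arguments. The concat_all! macro is invented for this example; it is not part of the standard library.

```rust
// Hypothetical macro, for illustration only: joins any number of
// displayable arguments into one String, a bit like join('', LIST) in Perl.
macro_rules! concat_all {
    ($($arg:expr),*) => {{
        let mut out = String::new();
        // This line is repeated once per argument when the macro expands.
        $( out.push_str(&$arg.to_string()); )*
        out
    }};
}

fn sample() -> String {
    concat_all!("a", 2, 3.5)
}

fn main() {
    println!("{}", concat_all!("just one"));
    println!("{}", sample());
}
```

The $( ... ),* pattern is what lets the macro match a comma-separated list of any length, which a plain function signature cannot express.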

Control Flow

Flow of control in Rust will look pretty familiar. In Rust, we don't use round brackets on the condition in if, while, and for as we do in Perl.

Rust        Perl
if          if, ternary operator
if !        unless
while       while
loop        while(1)
for         for, foreach
continue    next
break       last
'label:     LABEL:
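Here is a small sketch of the last three rows in action. The first_pair function is made up for illustration; it uses continue where Perl would use next and a labeled break where Perl would use last OUTER. It returns an Option, which shows up properly later on.

```rust
// Hypothetical function, for illustration only: finds the first pair
// (i, j) with i <= j and i * j == target among the digits 1 through 9.
fn first_pair(target: u32) -> Option<(u32, u32)> {
    let mut found = None;
    'outer: for i in 1..=9 {
        for j in 1..=9 {
            if j < i {
                continue; // like Perl's next
            }
            if i * j == target {
                found = Some((i, j));
                break 'outer; // like Perl's last OUTER
            }
        }
    }
    found
}

fn main() {
    println!("{:?}", first_pair(12));
    println!("{:?}", first_pair(97));
}
```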

In Rust, almost everything is an expression, and there are very few statements. Even if is an expression, so we don't need a separate ternary operator.

In Perl, we might rewrite this if statement

my $name;
if ($formal) {
    $name = 'Timothy';
} else {
    $name = 'Tim';
}

with a ternary operator.

my $name = $formal ? 'Timothy' : 'Tim';

Many eschew the ternary operator, but I think that's better code. We replace a separate variable declaration, an if statement, and two assignment statements with a single assignment.

In Rust, we would probably never write the first version


fn main() {
    let formal = true;

    let name;
    if formal {
        name = "Timothy";
    } else {
        name = "Tim";
    }
}

because the second version


fn main() {
    let formal = true;

    let name = if formal { "Timothy" } else { "Tim" };
}

just seems more natural. If if is an expression, then the first version is really going out of its way to ignore what it's returning just to repeat the name = assignment twice.

Note that neither the Perl nor the Rust versions need to stay on one line.

my $name = $formal
    ? 'Timothy'
    : 'Tim';

fn main() {
    let formal = true;

    let name = if formal {
        "Timothy"
    } else {
        "Tim"
    };
}

Loops

Loops are pretty much the same. If we want an infinite loop, we say loop instead of "while true".

while (1) {
    # do stuff
}

fn main() {
    loop {
        // do stuff
    }
}

We don't have a C-style three-part for-loop in Rust as we do in Perl.

for (my $i = 1; $i <= 10; $i++) {
    say $i;
}

That's okay. I almost never use it in Perl, so I almost never miss it in Rust.

Rust's for is more like Perl's foreach back when we distinguished between for and foreach.

for my $i (1..10) {
    say $i;
}

fn main() {
    for i in 1..=10 {
        println!("{}", i);
    }
}

If we needed a three-part for-loop, we would have to write the analogous while-loop with the three separate parts.

my $i = 1;
while ($i <= 10) {
    say $i;
    $i++;
}

fn main() {
    let mut i = 1;
    while i <= 10 {
        println!("{}", i);
        i += 1;
    }
}

I might have more to say about Rust's for later when we talk about iterators. It's really quite interesting how it works.

Maybelet

Rust also has if let and while let, which we will talk about more after we've discussed error handling. Paul Evans described them beautifully at FOSDEM 2021. At about 24:30 of this video, he proposes if(maybelet and while(maybelet for Perl 2025. That's pretty much how and why if let and while let work in Rust.
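Just to show the shape of these before we get to the details, here is a minimal sketch. Both functions are made up for illustration. Each one keeps going only when the pattern Some(...) matches, which is the "maybe" in maybelet.

```rust
// Hypothetical function: pop returns Some(value) until the vector is
// empty and None after that, which ends the loop.
fn sum_by_popping(mut stack: Vec<i32>) -> i32 {
    let mut total = 0;
    while let Some(top) = stack.pop() {
        total += top;
    }
    total
}

// Hypothetical function: an empty string has no first character,
// so the else branch handles that case.
fn initial(name: &str) -> String {
    if let Some(first) = name.chars().next() {
        format!("{}.", first)
    } else {
        String::from("?")
    }
}

fn main() {
    println!("{}", sum_by_popping(vec![1, 2, 3]));
    println!("{}", initial("Tim"));
}
```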

Scope

Scope in Rust works much like scope in modern Perl. Every block (if, for, while, &c.) creates a new scope. We can add a bare block for a new scope in exactly the same way ({}).

These two do the same things for the same reasons.

my $first_name = "Dean";
my $last_name = "Venture";
{
    my $first_name = "Hank";
    say "$first_name $last_name";
}
say "$first_name $last_name";

fn main() {
    let first_name = "Dean";
    let last_name = "Venture";
    {
        let first_name = "Hank";
        println!("{} {}", first_name, last_name);
    }
    println!("{} {}", first_name, last_name);
}

Both print

Hank Venture
Dean Venture

We first print "Hank Venture" because the first name is Hank in the new scope, but we can still see the Venture in the outer scope. Then we print "Dean Venture" because the inner scope is over and first name is still Dean in the outer scope.

One difference is that Rust encourages shadowing of variables (re-using a name in the same scope), but Perl does not.

my $first_name = "Dean";
my $first_name = "Hank"; # Warning!

fn main() {
    let first_name = "Dean";
    let first_name = "Hank"; // first_name is *rebound* to "Hank".
}

Code like this throws a warning in Perl, but is often seen in Rust code. Note that the re-used symbol gets a completely new binding, so it can even change type.


fn main() {
    let first_name = "Dean";
    let first_name = String::from("Hank");
}

Here first_name changes from a &str to a String.

Pattern Matching

Rust also has match which is like match from many functional programming languages. It is very much dependent on types. Perl's attempt at a switch syntax with smart-matching is perhaps the closest analogy. Indeed, I think smart-matching failed because it tried to use information about the types of the operands and it didn't really have this information. Unlike every other operator in Perl, the smart-match operator was not the boss.
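For a first look, here is a minimal sketch of match on an integer. The describe function is invented for this example. The compiler checks that the arms cover every possible i32, which is why the final catch-all arm is needed.

```rust
// Hypothetical function, for illustration only.
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",
        1..=9 => "a single digit",  // a range pattern
        x if x < 0 => "negative",   // a binding with a guard
        _ => "big",                 // everything else
    }
}

fn main() {
    for n in [0, 7, -3, 42] {
        println!("{} is {}", n, describe(n));
    }
}
```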

Update (2022-05-29): I talked about pattern matching at the Rust & C++ Cardiff book club. We are reading The Rust Programming Language together and I did the bit on chapter 18 (it starts about 50 minutes in). Also, you can view my slides.

Regular Expressions

When we think about pattern matching in Perl, we probably think of regular expressions. Regular expressions aren't just built into Perl, they are part of its DNA.

Rust has no built-in regular expressions, nor even any in the standard library. To employ regular expressions in a Rust program, we have to import a library from crates.io.

But before reaching for a regular expression library, consider other solutions. We use regular expressions for all sorts of things in Perl because they are so fast and easy. A simple string match like

my $s = 'Timothy';

say $s if $s =~ /^Tim/;

might better be done in Rust with a string method


fn main() {
    let s = "Timothy";

    if s.starts_with("Tim") {
        println!("{s}");
    }
}

No regular expression required.

But when we do wish to use a regular expression library in Rust, there are several to choose from. Probably the most common choice is regex, which provides regular expressions like Perl's re::engine::RE2 regexes, not its built-ins. If we need any of Perl's fancier features like backreferences or look-arounds, we will need to choose a different library.

Here is a Perl regex I used in is_epoch

    # Version 1 UUID's have timestamps in them. For example,
    # 33c41a44-6cea-11e7-907b-a6006ad3dba0 => 1e76cea33c41a44
    # -------- ----  ---               $+{high}$+{mid}$+{low}
    # low      mid   high
    #  8        4     3
    # and 1e76cea33c41a44 => 2017-07-20T01:24:40.472634Z
    my $UUIDv1 = qr{(?<low>[0-9A-Fa-f]{8})    -?
                    (?<mid>[0-9A-Fa-f]{4})    -?
                    1                            # this means version 1
                    (?<high>[0-9A-Fa-f]{3})   -?
                    [0-9A-Fa-f]{4}            -?
                    [0-9A-Fa-f]{12}              }mxs;

then later I use it like so

if ($arg =~ /^$UUIDv1$/) {
    ...
}

A similar Rust regex I used in epochs-cli looks like this


static RE: Lazy<Regex> = Lazy::new(|| {
    Regex::new(
        r"(?x)
        ([0-9A-Fa-f]{8})  -?
        ([0-9A-Fa-f]{4})  -?
        ([0-9]{1})
        ([0-9A-Fa-f]{3})  -?
        [0-9A-Fa-f]{4}    -?
        [0-9A-Fa-f]{12}
        ",
    )
    .unwrap()
});

then later I used it like this


if let Some(cap) = RE.captures(text) {
    ...
}

We've already talked about the regex crate, but what's that Lazy business? Compiling a regex is not inexpensive, so we only want to do it once. Lazy defers that work until the regex is first used at run time and then reuses the result from then on. This is similar to what qr does in Perl. We'll do that with help from the once_cell crate. That is, at the top of this program we have


use once_cell::sync::Lazy;
use regex::Regex;

which provides the Lazy and Regex which appear in the code above.

So there is a bit more ceremony involved in using regular expressions in Rust, but it's not too bad. And we use them less often in Rust because of the rich set of string methods at our disposal, so it's really not a problem.
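As a sketch of what those string methods can replace, here is a made-up parse_pair function that splits a "key = value" line. In Perl we would likely reach for something like /^\s*(.+?)\s*=\s*(.*?)\s*$/; this is only roughly equivalent, not a drop-in translation.

```rust
// Hypothetical function, for illustration only: splits on the first '='
// and trims whitespace from both halves. Returns None if there is no '='.
fn parse_pair(line: &str) -> Option<(&str, &str)> {
    let (key, value) = line.split_once('=')?;
    Some((key.trim(), value.trim()))
}

fn main() {
    println!("{:?}", parse_pair("  name = Tim  "));
    println!("{:?}", parse_pair("no equals sign here"));
    // strip_prefix both tests for a prefix and returns the rest.
    println!("{:?}", "Timothy".strip_prefix("Tim"));
}
```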

Types

Perl is dynamically typed. Rust is statically typed. In practice, I think this is the biggest difference for the programmer. In general, we have to be much more aware of the types of variables in Rust than we do in Perl.

Also, Perl is weakly typed, whereas Rust is strongly typed.

Rust      Perl
static    dynamic
strong    weak

That means that Rust is not going to do any casting or automatic conversion of types the way that Perl does.

Turning that table inside-out, we see that Rust and Perl couldn't be more different

          static    dynamic
strong    Rust
weak                Perl

Let's fill in the rest of the table with other languages you may know

          static    dynamic
strong    Rust      Python
weak      C         Perl

Python is dynamic, like Perl, but its types are pretty strong. For example, we can't just use the string "3" as a number in Python as we do in Perl; we have to convert it first.

$ perl -E 'say "3" + 4'
7

$ python -c 'print("3" + 4)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects

$ python -c 'print(int("3") + 4)'
7

C is static, like Rust, but its types are very weak. You want to add a character to an integer? Have at it!

    char c = '3';
    int i = 4;

    printf("%d\n", c + i);

This may not print what you expected, but it will compile and print something.

So Rust types are both static, like C, and strong, like Python? That sounds like it might be a pain to use. And indeed Rust is a good deal more finicky than Perl. It will take some getting used to for sure.

The good news is, Rust has a really nice type system! Unlike the type systems found in most imperative programming languages, Rust has algebraic data types --- something usually only found in functional programming languages.

Rust also has type inference, so we can often leave off the type and Rust will figure it out.

These two properties combined mean Rust's types become more of a tool we use, rather than a burden we endure.

Numbers

In Perl, something like "3" + 4 works because the + operator places each of its arguments in numeric context. In Rust, the types of the operands must match because there's a different + operator for each type.

In Perl, the operators are in charge. In Rust, the operands are. (More specifically, the types of the operands.)

And we don't just mean strings and numbers; Rust has lots of different numeric types. There are signed integers, unsigned integers, and floating-point numbers, each in various sizes. We cannot do arithmetic until all the types are the same.

The names of the types are letters i, u, and f followed by the number of bits. There are also types for the native size on the current platform. So we have

signed integers: i8, i16, i32, i64, i128, isize
unsigned integers: u8, u16, u32, u64, u128, usize
floating-point numbers: f32, f64

That's fourteen kinds of numbers, each with its own + operation. Yikes!

In practice, it's not as bad as you might think. This Perl code

    my $x = 3;
    my $y = 4;

    my $z = $x + $y;

    say $z;

looks nearly the same in Rust


fn main() {
    let x = 3;
    let y = 4;

    let z = x + y;

    println!("{}", z);
}

We have let instead of my and all of the dollar signs are missing, but that's about it. So where are all the types we were so worried about? In Perl, the dollar signs at least tell us we have scalars. In Rust, we get nothing. In fact, everything is implied. First, those numeric literals (the 3 and 4) have defaults. Since we didn't specify, Rust assumes they are 32-bit signed integers. We could have written them with the type appended, like so.


fn main() {
    let x = 3i32;
    let y = 4i32;
}

From there, the types of x and y are inferred. Since we're stuffing i32's in them, they must be of type i32. And then the type of z is inferred. Summing two i32's gives another i32 and we're stuffing that in z, so z must be of type i32.

Incidentally, we can put underscores in numeric literals anywhere we want, just as in Perl. That's handy for big numbers, like 1_000_000 in both Perl and Rust. But in Rust, it's also common to use it for these type annotations.


fn main() {
    let x = 3_i32;
    let y = 4_i32;
}

Alternatively, we could specify the types of the variables, like so.


fn main() {
    let x: i32 = 3;
    let y: i32 = 4;
}

We could even do both, but that's starting to look silly.


fn main() {
    let x: i32 = 3_i32;
    let y: i32 = 4_i32;
}

If x and y were two different types, then trying to add them would be an error.


fn main() {
    let x: i32 = 3;
    let y: i64 = 4;

    let z = x + y;
}

Note that this is a compile-time error; we never get a chance to run this. Here is some of the compiler output

...
error[E0308]: mismatched types
  --> src/main.rs:12:17
   |
12 |     let z = x + y;
   |                 ^ expected `i32`, found `i64`

error[E0277]: cannot add `i64` to `i32`
  --> src/main.rs:12:15
   |
12 |     let z = x + y;
   |               ^ no implementation for `i32 + i64`
   |
   = help: the trait `std::ops::Add<i64>` is not implemented for `i32`

error: aborting due to 2 previous errors
...

One way to remedy this would be to use the as keyword to cast one of the values to the other's type.


fn main() {
    let x: i32 = 3;
    let y: i64 = 4;

    let z = (x as i64) + y;
}

Here, z would be inferred to be an i64, as it's the sum of two i64s. This is perfectly safe, as every i32 is expressible as an i64. If we went the other way, namely


fn main() {
    let x: i32 = 3;
    let y: i64 = 4;

    let z = x + (y as i32);
}

then we have to be a bit careful. In this case, we're fine because 4 is obviously expressible as an i32. But not every i64 is expressible as an i32. And the as keyword is naïve, so it could quietly give us the wrong answer. We will say more about this when we discuss error handling.
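Here is a small sketch of what "quietly give us the wrong answer" looks like, next to a checked alternative from the standard library. The wrap and narrow functions are made up for illustration; i32::try_from is the real checked conversion.

```rust
use std::convert::TryFrom; // already in the prelude as of the 2021 edition

// Hypothetical function: as simply keeps the low 32 bits.
fn wrap(y: i64) -> i32 {
    y as i32
}

// Hypothetical function: try_from reports failure instead of wrapping.
fn narrow(y: i64) -> Option<i32> {
    i32::try_from(y).ok()
}

fn main() {
    let big: i64 = 5_000_000_000;
    println!("{}", wrap(big));     // not five billion
    println!("{:?}", narrow(big)); // None, because it does not fit
    println!("{:?}", narrow(4));   // Some(4)
}
```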

Strings

Strings are complicated. ("Anyone who says differently is selling something.")

In Perl, a lot of the complications are hidden away. This is an example of Perl's DWIM approach. For the most part, strings in Perl behave as we expect and we don't really think about it too much.

In Rust, all of the complications of strings are right in our faces. Rust's approach is not DWIM, rather Rust prefers to be explicit about most everything. One consequence of this is that we end up with multiple types for strings!

Let's look at our "hello name" examples again. In Perl, we had

#!/usr/bin/env perl

use v5.28;
use warnings;

my $name = "Tim";

say "Hello, $name!";

while in Rust, we had

fn main() {
    let name = "Tim";

    println!("Hello, {}!", name);
}

Again, we see that Rust's let is kind of like Perl's my and it's apparently inferring the type of the string literal. Writing it explicitly, we'd have the following

fn main() {
    let name: &str = "Tim";

    println!("Hello, {}!", name);
}

So the type of name is &str. What is that? It's a string slice, which is not very similar to a Perl string. The static string literal is somewhere in memory and we just get a view into it. Rust has another type called String which is more akin to a Perl string. A Rust String is a dynamic chunk of memory holding our string, but to get one from a string literal we have to convert it. One way to do that is like so

fn main() {
    let name: String = String::from("Tim");

    println!("Hello, {}!", name);
}

Again, we could let Rust infer the type--- which truly feels redundant here--- and just write

fn main() {
    let name = String::from("Tim");

    println!("Hello, {}!", name);
}

but it still seems like a lot of work for just a string. But wait, there's more!

Decoded or not?

A String in Rust is a sequence of bytes that is guaranteed to be valid UTF-8. And a &str is a slice that always points to a valid UTF-8 sequence, so it can be used to view into a String as well as a static string literal. So these are akin to decoded strings in Perl.

In Perl, if we don't decode a string, explicitly or implicitly, then it's just a sequence of arbitrary bytes. The same thing in Rust would be a byte slice.


fn main() {
    let name = b"Tim";
    println!("{:?}", name);
}

Running this would produce [84, 105, 109], where 84 is the 'T', 105 is the 'i', and 109 is the 'm'. So b"Tim" contains all of the data to make a string, but it's not really a string yet.
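Turning bytes into a string is the Rust counterpart of decoding in Perl, and it can fail. A minimal sketch, where decode is a made-up wrapper around the standard std::str::from_utf8:

```rust
// Hypothetical wrapper, for illustration only. from_utf8 checks that the
// bytes are valid UTF-8 and, if so, views them as a &str without copying.
fn decode(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

fn main() {
    println!("{:?}", decode(b"Tim"));
    // 0xff can never appear in valid UTF-8, so this fails to decode.
    println!("{:?}", decode(&[0xff, 0xfe]));
}
```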

Characters

I guess now is a good time to mention that a character in Rust is not stored in a byte. A char is a single UTF-32 character, so it takes four bytes. So a string in Rust is not a sequence of characters! A String is a UTF-8 sequence, but a char is a UTF-32 value.
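We can see the difference by counting. In this sketch (the counts function is made up), the string "naïve" is written with an escape for the ï so there is no doubt about which code point is meant.

```rust
// Hypothetical function: returns (bytes, chars) for a string.
fn counts(s: &str) -> (usize, usize) {
    // len counts UTF-8 bytes; chars().count() counts Unicode scalar values.
    (s.len(), s.chars().count())
}

fn main() {
    let word = "na\u{ef}ve"; // U+00EF takes two bytes in UTF-8
    println!("{:?}", counts(word));
    println!("{}", std::mem::size_of::<char>());
}
```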

Foreign strings

The Rust standard library also contains some string types for dealing with sequences of bytes that do not decode into valid UTF-8, but are still considered strings in other contexts.

For things like path names, we have operating system strings, std::ffi::OsString and std::ffi::OsStr. The OsString is like String, but it could contain, say, a Windows-1252 string with values that are not valid UTF-8. The OsStr is analogous to str, so we usually see it as &OsStr just as we usually see &str.

Rust also has types just for going back and forth between C code, std::ffi::CString and std::ffi::CStr. In C, strings are null-terminated sequences of bytes. It's not inexpensive to convert those to and from Rust Strings, so we sometimes use CString and &CStr instead.
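We mostly bump into these types at the edges of a program. Here is a brief sketch with two made-up helper functions: one gets an OsStr back from a path and converts it lossily, and the other builds a CString to show the trailing NUL byte.

```rust
use std::ffi::CString;
use std::path::Path;

// Hypothetical helper: Path::extension returns an Option<&OsStr>;
// to_string_lossy replaces anything that is not valid Unicode.
fn extension(path: &str) -> Option<String> {
    Path::new(path)
        .extension()
        .map(|ext| ext.to_string_lossy().into_owned())
}

// Hypothetical helper: CString::new fails if the input contains a NUL
// byte, since C would treat that as the end of the string.
fn c_bytes(s: &str) -> Vec<u8> {
    CString::new(s)
        .expect("no interior NUL bytes")
        .into_bytes_with_nul()
}

fn main() {
    println!("{:?}", extension("hello.rs"));
    println!("{:?}", c_bytes("hi"));
}
```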

Booleans

Rust has a Boolean type (bool) and things like if and while expect things of type bool only. Contrast this to Perl where things like if and while place whatever they're given in boolean context.

Additionally, everything in Rust must be initialized. There are no undefined values.

So in Perl we have a somewhat complicated set of rules about the truthiness and definedness of various expressions and values. In Rust, true is true, false is false, and anything else is going to have to get converted into a Boolean.
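A short sketch of what that conversion tends to look like in practice. Both functions are invented for this example. Where Perl would test a number or a possibly-undefined string directly, Rust asks us to spell out the comparison, and a value that might be missing becomes an Option.

```rust
// Perl: if ($count) { ... }
// Rust wants an explicit comparison that produces a bool.
fn has_items(count: usize) -> bool {
    count != 0
}

// Perl: if (defined $name && length $name) { ... }
// Rust has no undef; a value that may be absent is an Option.
fn greet(name: Option<&str>) -> String {
    match name {
        Some(n) if !n.is_empty() => format!("Hello, {}!", n),
        _ => String::from("Hello, stranger!"),
    }
}

fn main() {
    println!("{}", has_items(0));
    println!("{}", greet(Some("Tim")));
    println!("{}", greet(None));
}
```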

Compound Types

In Perl, we essentially have just three types: scalar, array, and hash. Each of these is indicated by its sigil.

my $scalar = 3;

my @array = (1, 2, 3);

my %hash = (alpha => 1, beta => 2, gamma => 3);

They are so flexible, though, that we accomplish enormous amounts of work with just these three things. This flexibility comes from the dynamic nature of Perl. A scalar holds a single value. An array holds a list of scalars. A hash holds a list of key-value pairs. There are no other restrictions.

my $scalar = "\N{TOP HAT}";

my @array = (1, "2", ['5', "banana", 3]);

my %hash = (alpha => 1, beta => \@array, gamma => $scalar);

The static nature of Rust means compound data types are more restricted. The Rust compiler needs to know the type of everything.

Arrays

Arrays in Rust are fixed length sequences of things that are all the same type.


fn main() {
    let array: [i32; 3] = [1, 2, 3];
}

What? That's not at all like an array in Perl! In truth, we don't often use arrays directly in Rust either; we more often access them through slices. This is similar to accessing Strings with &str.
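Here is a minimal sketch of that slice access. The total function is made up; the point is that one function taking a slice works on a whole array, on part of an array, and on a vector.

```rust
// Hypothetical function: takes a slice, so it does not care how long
// the underlying array is or whether the data lives in a Vec.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let array: [i32; 5] = [1, 2, 3, 4, 5];
    let vector = vec![10, 20];

    println!("{}", total(&array));       // the whole array
    println!("{}", total(&array[1..3])); // just elements 1 and 2
    println!("{}", total(&vector));      // a Vec works too
}
```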

You might have occasion to use a Rust array while programming some time, but in the meantime just ignore the name similarity.

Vectors

When you think you need a Perl array, you probably need a Rust vector.


fn main() {
    let vector = vec![1, 2, 3];
}

These hold lists of things that are all of the same type too, but we can add and remove things from them much as we do in Perl.

my @array = (1, 2, 3);
push @array, 4;

fn main() {
    let mut vector = vec![1, 2, 3];
    vector.push(4);
}

In Rust, everything is immutable by default, so we must explicitly declare a vector to be mutable if we wish to change it (e.g., push onto it).

Tuples

The elements of a Rust vector all have to be the same type, though. If you need a list of things of different types, you might need a Rust tuple. These can't change size or change types, but they can hold something akin to a given Perl list.

my @array = (1, '2', "banana");

fn main() {
    let tuple = (1, '2', "banana");
}

This tuple has type (i32, char, &str). It's not comparable to tuples of other lengths or even to other triples of different component types.

HashMaps

When we need something like a Perl hash in Rust, we usually want a HashMap. We must pick a type for our keys and a type for our values, but other than that they are similar to use.

my %hash;
$hash{alpha} = 1;
$hash{beta} = 2;
$hash{gamma} = 3;

use std::collections::HashMap;

fn main() {
    let mut hash = HashMap::new();
    hash.insert("alpha", 1);
    hash.insert("beta", 2);
    hash.insert("gamma", 3);
}

There are other data structures in Rust's std::collections which we might choose for things that we would probably just use a hash for in Perl. For example, in Perl we might use an existence hash (where we only care about the keys and the values are always 1), whereas in Rust we might choose a HashSet.
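Here is a minimal sketch of a HashSet doing the job of a Perl existence hash. The helper count_distinct is hypothetical; insert, len, and contains are methods of the standard HashSet.

```rust
use std::collections::HashSet;

// Hypothetical helper: counts distinct words, like `scalar keys %seen` in Perl.
fn count_distinct(words: &[&str]) -> usize {
    let mut seen = HashSet::new();
    for word in words {
        // Inserting a word that is already present leaves the set unchanged.
        seen.insert(*word);
    }
    seen.len()
}

fn main() {
    let words = ["alpha", "beta", "alpha", "gamma", "beta"];
    println!("{} distinct words", count_distinct(&words));

    // Collect straight into a set, then test membership like `exists $seen{...}`.
    let seen: HashSet<&str> = words.iter().copied().collect();
    println!("seen gamma? {}", seen.contains("gamma"));
    println!("seen delta? {}", seen.contains("delta"));
}
```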

Algebraic Data Types

As mentioned earlier, Rust's type system is algebraic. We can create new types from combinations of existing types. We can make both products (product types) and coproducts (sum types). Rust's products are tuples or structures (keyword struct) and its coproducts are enumerations (keyword enum).

Structures

Structures group things together; we have both this and that.

For example, here is a structure from epochs-cli


struct Datelike {
    source: String,
    viewed_as: View,
    epochs: HashMap<String, NaiveDateTime>,
}

Each instance of a Datelike struct contains all three of those things.

Enumerations

Enumerations offer a choice; we have either this or that.

For example, here is an enum from epochs-cli


enum View {
    Decimal,
    Float,
    Hexadecimal,
    UUIDv1,
}

Each instance of a View enum is exactly one of those four variants.

Tuples

We've already talked about tuples a bit, but they belong here too as they are products. Perhaps the simplest product type is a pair of other types like (char, i32). It's kind of like a struct without the labels. We refer to the elements by number. If t is a pair, then t.0 is the first thing and t.1 is the second.
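A small sketch of both ways to get at the parts of a pair: by number, and by destructuring. The function divmod is a made-up example, not something from the standard library.

```rust
// Hypothetical helper returning a pair: the quotient and the remainder.
fn divmod(x: i32, y: i32) -> (i32, i32) {
    (x / y, x % y)
}

fn main() {
    // Access the elements by position.
    let t = divmod(17, 5);
    println!("{} remainder {}", t.0, t.1);

    // Or destructure, much like `my ($q, $r) = divmod(17, 5);` in Perl.
    let (q, r) = divmod(17, 5);
    println!("{} remainder {}", q, r);
}
```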

Unit

The empty tuple, (), is Rust's unit type. In Rust, almost everything is an expression; even blocks, if, and match produce values. The unit type is the type returned when things "don't return anything."

Conclusion

Most type systems have things like tuples and structures, but many lack rich enumerations like Rust has. You may have used Perl's Types::Standard::Enum, which provides C-style enumerations. Rust's enumerations can also contain data, so they're a more proper coproduct.

A type system like Rust's can be used to make illegal states unrepresentable.
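To make that a little more concrete, here is a sketch of an enum whose variants carry data. The Connection type is invented for illustration. Each variant holds only the data that makes sense for it, so there is no way to build a "disconnected" value that still has a peer address.

```rust
// A hypothetical connection state; each variant carries its own data.
enum Connection {
    Disconnected,
    Connecting { attempt: u32 },
    Connected(String), // the peer's address
}

fn describe(c: &Connection) -> String {
    // The compiler insists that every variant is handled.
    match c {
        Connection::Disconnected => String::from("not connected"),
        Connection::Connecting { attempt } => format!("connecting (attempt {})", attempt),
        Connection::Connected(peer) => format!("connected to {}", peer),
    }
}

fn main() {
    let states = [
        Connection::Disconnected,
        Connection::Connecting { attempt: 2 },
        Connection::Connected(String::from("10.0.0.1")),
    ];
    for state in &states {
        println!("{}", describe(state));
    }
}
```

In Perl we would likely reach for a hash with a state key and some optional keys, and then trust ourselves to keep them consistent.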

Generic Types

Rust also supports the notion of generic types. We can write code that supports a number of types by using a stand in, often just the letter T. This includes when defining structures and enumerations.
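As a quick sketch, here is what a generic structure might look like. Pair and swap are hypothetical names; the point is only that T is filled in with a concrete type at each use.

```rust
// A hypothetical generic struct: both fields share one type T.
struct Pair<T> {
    first: T,
    second: T,
}

impl<T> Pair<T> {
    // Returns a new pair with the two fields exchanged.
    fn swap(self) -> Pair<T> {
        Pair { first: self.second, second: self.first }
    }
}

fn main() {
    let numbers = Pair { first: 1, second: 2 }.swap();   // T is an integer
    let words = Pair { first: "a", second: "b" }.swap(); // T is &str
    println!("{} {}", numbers.first, numbers.second);
    println!("{} {}", words.first, words.second);
}
```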

Rust does not have higher-kinded types1. If you don't know what that means, don't worry about it. If you do, then you might be looking for them. You can stop.


1

In Rust 1.65.0, generic associated types were stabilized. These are sort of generic over something that is itself generic, so it's rubbing up against one aspect of higher-kinded types.

Functions

Rust functions are like Perl subroutines. They don't have to be pure functions (mathematical functions).

Named functions in Rust are created with the keyword fn. We've already seen one of these; the "hello world" program used a main function.

fn main() {
    println!("Hello, World!");
}

In Perl, we don't have an explicit main, though it is there implicitly (we are in package main unless we say otherwise).

In Rust, there is also a separate syntax for closures. In Perl, we use keyword sub for both named and anonymous subroutines.

Rust        Perl
fn          sub
closures    sub
traits      roles

In Rust, function signatures are the one place where there is never any type inference. We always have to say the types of our arguments and of our return value. This is kind of nice, as it serves as a form of documentation.

If you've done any Haskell, you know that it will infer types in the function definitions as well. But having that information at hand is so valuable that Haskell programmers usually include a type signature declaration even though it's optional to do so.

In Haskell, it's a best practice. In Rust, it's required.

Hello Functions

Let's go back to our hello name example, but this time create a function to do the greeting. In Perl it might look like this.

#!/usr/bin/env perl

use v5.28;
use warnings;

greet("Tim");

sub greet {
    my $name = shift;
    say "Hello, $name!";
}

The same thing in Rust might look like this.

fn main() {
    greet("Tim");
}

fn greet(name: &str) {
    println!("Hello, {}!", name);
}

Here we see the type of the parameter that must be passed to greet in the function signature. Perl's subroutine signatures are still experimental1, but if we use those then the two examples look more similar.

#!/usr/bin/env perl

use v5.28;
use warnings;
use experimental qw(signatures);

greet("Tim");

sub greet($name) {
    say "Hello, $name!";
}

Here we see the Perl subroutine greet takes a single scalar as a parameter. The Rust function greet takes a single parameter also, but it must be a string slice (&str). No other type will do.

In Perl, we could add

greet("Tim");
greet(3);

and it would print

Hello, Tim!
Hello, 3!

In Rust, it wouldn't compile.

error[E0308]: mismatched types
 --> src/main.rs:3:11
  |
3 |     greet(3);
  |           ^ expected `&str`, found integer

We would have to first convert the integer to a string slice.

fn main() {
    greet("Tim");
    greet(&3.to_string());
}

fn greet(name: &str) {
    println!("Hello, {}!", name);
}

There's a tiny bit of magic here: we've actually passed a reference to a String (&String), not a string slice (&str). Rust automatically converts a &String into a &str that covers the whole string (a deref coercion), so it's okay.

Hello, Tim!
Hello, 3!

There's an analogous situation with Rust's arrays, vectors, and slices.
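A minimal sketch of that analogy, assuming a made-up function total: a parameter of slice type &[i32] accepts a borrowed array, a borrowed vector, or part of either.

```rust
// Accepts any contiguous run of i32 values, whatever container owns them.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let array = [1, 2, 3];
    let vector = vec![4, 5, 6];

    println!("{}", total(&array));       // &[i32; 3] is accepted as &[i32]
    println!("{}", total(&vector));      // &Vec<i32> is accepted as &[i32]
    println!("{}", total(&vector[1..])); // an explicit sub-slice: 5 and 6
}
```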


1

As of Perl 5.36, subroutine signatures are no longer experimental.

Closures

In Rust, we have a separate syntax for closures. In Perl, we re-use sub. For example, we can create an anonymous subroutine and call it like so.

my $greet2 = sub($name) {say "Hello, $name!"};

$greet2->("Tim");

The subroutine itself doesn't have a name. $greet2 is an ordinary Perl scalar that holds a coderef. The same thing in Rust looks like this.

fn main() {
    let greet2 = |name| { println!("Hello, {}!", name) };

    greet2("Tim");
}

The name of the parameter goes between those two pipes. You can think of the pipes as the lambda in lambda calculus (so |name| is like λ name). This is similar to the Ruby syntax, except in Rust the pipes go outside the block and in Ruby they go inside. Ironically, this probably means that you are going to type it wrong a lot if you are familiar with Ruby, but you will learn it quickly and easily if you have never seen anything like it before. Seems kind of unfair!

We can also access variables in the outer scope. In Perl, an alternative way to write the above is

my $name = "Tim";
my $greet3 = sub {say "Hello, $name!"};
$greet3->();

Here the anonymous subroutine takes no argument. It is accessing the $name variable in the outer scope. In Rust, we have just the two pipes with nothing between.

fn main() {
    let name = "Tim";
    let greet3 = || { println!("Hello, {}!", name) };
    greet3();
}

All of the above print

Hello, Tim!

Methods

In Rust, we don't have classes or inheritance, but we can attach methods to structs and enums. This looks an awful lot like object-oriented Perl. Say we had a rectangle object that knew how to find its own area. In Perl, we might write something like this.

#!/usr/bin/env perl

use v5.28;
use warnings;
use experimental qw(signatures);

package Rectangle {
    sub new($class, $length, $width) {
        my $self = {
            _length => $length,
            _width  => $width,
        };
        bless $self, $class;
        return $self;
    }

    sub area($self) {
        $self->{_length} * $self->{_width}
    }
}

my $r = Rectangle->new(2, 3);

say "Area is ", $r->area;

In Rust, we'd create a Rectangle struct first and then separately implement new and area methods for that struct in an impl block. We use a dot (.) in Rust where we use an arrow (->) in Perl, but otherwise it looks the same.

struct Rectangle {
    length: f64,
    width: f64,
}

impl Rectangle {
    fn new(length: f64, width: f64) -> Self {
        Self{length, width}
    }

    fn area(&self) -> f64 {
        self.length * self.width
    }
}

fn main() {
    let r = Rectangle::new(2.0, 3.0);

    println!("Area is {}", r.area());
}

Indeed, in both Rust and Perl, these method calls are just sugar for function calls. That is, just as in Perl, these two are the same

    say "Area is ", $r->area;

    say "Area is ", Rectangle::area($r);

in Rust, these two are the same.


    println!("Area is {}", r.area());

    println!("Area is {}", Rectangle::area(&r));

Traits

Rust traits are more akin to Perl roles; they describe "does-a" rather than "is-a" relationships.

If we define a role like this in Perl

package Area {
    use Role::Tiny;
    requires qw(area);
}

then anything that does the Area role must have a method called area.

Similarly, if we define a trait like this in Rust


trait Area {
    fn area(&self) -> f64;
}

then anything that implements the Area trait must have a method called area with that exact function signature.

For example, in Perl we could create a Circle and a Rectangle that both do the Area role like so

#!/usr/bin/env perl

use v5.28;
use warnings;
use experimental qw(signatures);

package Area {
    use Role::Tiny;
    requires qw(area);
}

package Circle {
    use Role::Tiny::With;
    with 'Area';
        
    sub new($class, $radius) {
        my $self = {
            _radius => $radius,
        };
        bless $self, $class;
        return $self;
    }

    sub area($self) {
        3.14159265358979 * $self->{_radius} ** 2
    }
}

package Rectangle {
    use Role::Tiny::With;
    with 'Area';
        
    sub new($class, $length, $width) {
        my $self = {
            _length => $length,
            _width  => $width,
        };
        bless $self, $class;
        return $self;
    }

    sub area($self) {
        $self->{_length} * $self->{_width}
    }
}

my $c = Circle->new(1);
say "Area of circle is ", $c->area;

my $r = Rectangle->new(2, 3);
say "Area of rectangle is ", $r->area;

Alternatively, here it is again using Perl's Object::Pad.

Similarly, in Rust we could create a Circle and a Rectangle that both implement the Area trait like so

trait Area {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

impl Circle {
    fn new(radius: f64) -> Self {
        Self{radius}
    }
}

impl Area for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius.powi(2)
    }
}

struct Rectangle {
    length: f64,
    width: f64,
}

impl Rectangle {
    fn new(length: f64, width: f64) -> Self {
        Self{length, width}
    }
}

impl Area for Rectangle {
    fn area(&self) -> f64 {
        self.length * self.width
    }
}

fn main() {
    let c = Circle::new(1.0);
    println!("Area of circle is {}", c.area());

    let r = Rectangle::new(2.0, 3.0);
    println!("Area of rectangle is {}", r.area());
}

Running either of these produces the following (the Rust version prints one more digit of pi: 3.141592653589793)

Area of circle is 3.14159265358979
Area of rectangle is 6

There's more to Rust's traits (they're sort of the key to all of the magic in Rust), but that'll do for now.
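As one small taste of that "more", here is a sketch of a function that works with a mix of types through the trait alone. It uses trait objects (&dyn Area), which we haven't covered; total_area and Square are names made up for this example.

```rust
trait Area {
    fn area(&self) -> f64;
}

struct Square {
    side: f64,
}

struct Rectangle {
    length: f64,
    width: f64,
}

impl Area for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

impl Area for Rectangle {
    fn area(&self) -> f64 {
        self.length * self.width
    }
}

// Hypothetical helper: all it knows about each shape is that it does Area.
fn total_area(shapes: &[&dyn Area]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    let s = Square { side: 2.0 };
    let r = Rectangle { length: 2.0, width: 3.0 };
    let shapes: [&dyn Area; 2] = [&s, &r];
    println!("Total area is {}", total_area(&shapes));
}
```

This is roughly what we get for free in Perl when we loop over a list of objects and call $_->area on each one.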

Hello Generics

Earlier we were looking at this greet function that only accepted a string slice.

fn main() {
    greet("Tim");
}

fn greet(name: &str) {
    println!("Hello, {}!", name);
}

If we wanted to greet integers, we'd need another function.

fn main() {
    greet(3);
}

fn greet(name: i64) {
    println!("Hello, {}!", name);
}

How do we write a single function that accepts either? That's where generics come in. The basic syntax for declaring a generic function looks like this.


fn greet<T>(name: &T) {
    println!("Hello, {}!", name);
}

That T in angle brackets stands for any type. Then our variable name is a reference to whatever that type is. But this isn't going to work because we're trying to format that variable with {}. Earlier we mentioned that this depends on the Display trait. Thus our function isn't completely generic. We can't accept any type T. We can only accept types that satisfy the Display trait. We can do that by adding a trait bound like so. The Display trait is defined in the standard library, so first we use it.


use std::fmt::Display;

fn greet<T: Display>(name: &T) {
    println!("Hello, {}!", name);
}

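Now, it turns out that every generic type parameter such as T carries an implicit Sized bound as well, and a bare string slice (str) is not Sized. We don't care about that here, since we only ever use T behind a reference, so we need to relax the bound like so.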

use std::fmt::Display;

fn main() {
   greet("Tim");
   greet(&3);
}

fn greet<T: Display + ?Sized>(name: &T) {
    println!("Hello, {}!", name);
}

Now we can greet string slices, integer references, and anything else that implements Display.

Running this gives

Hello, Tim!
Hello, 3!

If the trait bounds get too unwieldy, there is an alternative syntax with a where clause added afterwards.

use std::fmt::Display;

fn main() {
   greet("Tim");
   greet(&3);
}

fn greet<T>(name: &T)
where
    T: Display + ?Sized,
{
    println!("Hello, {}!", name);
}

TMTOWTDI!

Now, the Display trait was required for the {} in the println!. If we wanted to print it with {:?} instead, that would require the Debug trait.

use std::fmt::Debug;

fn main() {
    greet("Tim");
    greet(&3);
}

fn greet<T: Debug + ?Sized>(name: &T) {
    println!("Hello, {:?}!", name);
}

Running this yields

Hello, "Tim"!
Hello, 3!

The 3 printed the same, but notice the quotes around Tim. That can be handy. I often add quotes around things like filenames in Perl.

say "Writing players to file '$path'..." if $opt->verbose;

In Rust, I can format them with {:?} instead of {}.

fn main() {
    let path = "players.json";
    let verbose = true;
    if verbose {
        println!("Writing players to file {:?}...", path);
    }
}

Ownership

Perl manages our memory for us at run time. Rust does virtually everything at compile time, yet it is still memory safe. How does that work? It does this through its ownership model.

Rust's ownership model is what sets it apart from every other programming language I've used. Earlier, we said Rust's type system was the biggest practical difference for Perl programmers. While I believe that's true, Rust's ownership model is the biggest theoretical difference. It's what allows Rust to do so much at compile time. It's what makes Rust Rust!

Ownership Rules

There are three rules to Rust's ownership

  1. Every value has a variable called its owner.
  2. There can only be one owner at a time.
  3. When the owner goes out of scope, the value is dropped.

So scope determines when values are dropped. That's nice, because earlier we said that Rust's scoping rules work much like Perl's. We're well on our way to understanding already!

Move Semantics

By default, whenever we assign a value to a variable, pass it to a function, or return it from a function, ownership is transferred.

fn main() {
    let name = String::from("Tim"); // String is owned by name

    greet(name);                    // Ownership is transferred
}

fn greet(name: String) {
    println!("Hello, {}!", name);
}

This example works fine, but note that when we return from greet, name is no longer accessible in main. It has been dropped because ownership was transferred to the name in greet, which went out of scope when the function completed.

If we tried to call it again, it would fail.

fn main() {
    let name = String::from("Tim"); // String is owned by name

    greet(name);                    // Ownership is transferred

    greet(name);                    // use of moved value: `name`
}

fn greet(name: String) {
    println!("Hello, {}!", name);
}

Note that this is a compile-time error. We never get a chance to run this. The full compiler message is

$ cargo run
   Compiling ownership v0.1.0 (/home/tim/rust/ownership)
error[E0382]: use of moved value: `name`
 --> src/main.rs:5:11
  |
2 |     let name = String::from("Tim");
  |         ---- move occurs because `name` has type `String`, which does not implement the `Copy` trait
3 | 
4 |     greet(name);
  |           ---- value moved here
5 |     greet(name);
  |           ^^^^ value used here after move

error: aborting due to previous error

For more information about this error, try `rustc --explain E0382`.
error: could not compile `ownership`

To learn more, run the command again with --verbose.

Note the message, "move occurs because name has type String, which does not implement the Copy trait." That gives us a hint how we could fix this. If we make a copy of our string (String types have a clone method which does exactly this), then we could move that while retaining the original.

fn main() {
    let name = String::from("Tim"); // String is owned by name

    greet(name.clone());            // Make a copy and move that

    greet(name);                    // Original is still available
}

fn greet(name: String) {
    println!("Hello, {}!", name);
}

Borrowing

Well, move semantics sounds like a pain! We really have to explicitly clone things all the time? No, this is where borrowing comes in.

Rather than take ownership of a value, we can borrow it.

fn main() {
    let name = String::from("Tim");

    greet(&name);
    greet(&name);
}

fn greet(name: &String) {
    println!("Hello, {}!", *name);
}

That ampersand in &String means greet doesn't want to take ownership of the String, it just wants a reference to it. And because it's not the owner, the string is not dropped when the function finishes executing. So we can call it a second time with no problems.

Since the name inside greet is not a String, but a string reference, we dereference it when we use it with *name.

This is analogous to the following in Perl.

#!/usr/bin/env perl

use v5.28;
use warnings;
use experimental qw(signatures);

my $name = "Tim";

greet(\$name);

sub greet($ref) {
    say "Hello, ${$ref}!";
}

In fact, Rust will do the dereference for us. That is, this works just fine as well.

fn main() {
    let name = String::from("Tim");

    greet(&name);
    greet(&name);
}

fn greet(name: &String) {
    println!("Hello, {}!", name);
}

This is one of the few times Rust does not make us be explicit.

In Perl, the analogous thing

#!/usr/bin/env perl

use v5.28;
use warnings;
use experimental qw(signatures);

my $name = "Tim";

greet(\$name);

sub greet($name) {
    say "Hello, $name!";
}

would print out something like

Hello, SCALAR(0xDEADBEEF)!

Shared and Unique Borrows

There are actually two kinds of borrows in Rust. The above is a shared (immutable) borrow. We can read the value, but we cannot change it. If we need to do that, we must use a unique (mutable) borrow. (I really like the terms "shared" and "unique", but it seems "immutable" and "mutable" have won out. I guess because of the mut keyword.)

We can have multiple immutable borrows. Lots of things can read a value at the same time. We can only have one mutable borrow. Only one thing at a time can change a value. Moreover, if there is a mutable borrow, there can be no shared borrows. If we're writing a value, then no one should be reading it. Indeed, if there is a mutable borrow, not even the owner can read the value.

fn main() {
    let mut name = String::from("Hank");

    greet(&name);
    change(&mut name);
    greet(&name);
}

fn greet(name: &String) {
    println!("Hello, {}!", name);
}

fn change(name: &mut String) {
    *name = String::from("Dean");
}

In Perl, our references are always mutable. The above might look like this

#!/usr/bin/env perl

use v5.28;
use warnings;
use experimental qw(signatures);

my $name = "Hank";

greet($name);
change(\$name);
greet($name);

sub greet($name) {
    say "Hello, $name!";
}

sub change($ref) {
    $$ref = "Dean";
}

That's not really idiomatic in either language, but you get the idea.

The Rust compiler has a borrow-checker. It keeps track of all the borrows in our programs and notifies us when we've violated any of these rules.
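Here is a small sketch of those rules in action. The function full_name is hypothetical. The commented-out line is the interesting part: putting it back keeps a shared borrow alive across the unique borrow, and the compiler rejects the program with error E0502.

```rust
fn full_name() -> String {
    let mut name = String::from("Hank");

    // Any number of shared borrows may coexist.
    let a = &name;
    let b = &name;
    println!("{} and {}", a, b);

    // A unique borrow is fine once the shared borrows are no longer used.
    let c = &mut name;
    c.push_str(" Venture");

    // Uncommenting the next line would make `a` live across the unique
    // borrow, and the program would no longer compile.
    // println!("{}", a);

    name
}

fn main() {
    println!("{}", full_name());
}
```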

Closures

When we talked about closures, we said they can access variables from the enclosing scope. There are three ways to do this: move and the two kinds of borrows. There are traits for these.

  • FnOnce => move (like self)
  • FnMut => borrow mutably (like &mut self)
  • Fn => borrow immutably (like &self)
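The three cases in the list above can be sketched like so. The helpers call_fn, call_fn_mut, and call_fn_once are made up for this example; each asks for the least demanding trait it can get away with.

```rust
// Hypothetical helpers, one per closure trait.
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}

fn call_fn_mut<F: FnMut()>(mut f: F) {
    f();
    f();
}

fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn demo() -> (usize, u32, String) {
    let name = String::from("Tim");

    // Only reads name, so it borrows immutably: Fn
    let len = call_fn(|| name.len());

    // Changes count, so it borrows mutably: FnMut
    let mut count = 0;
    call_fn_mut(|| count += 1);

    // Gives name away, so it can run only once: FnOnce
    let moved = call_fn_once(move || name);

    (len, count, moved)
}

fn main() {
    println!("{:?}", demo());
}
```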

Copy Types

Earlier we said, by default, ownership is transferred. If we don't want to use a borrow, we need to make a copy. Copy types do this automatically. Types such as i32 implement the Copy trait because it's just not worth the trouble of setting up a borrow. Why copy a reference to the stack when we can just copy our 32 bits to the stack?

So, for example, this works as we expect.

fn main() {
    let num = 3;
    greet(num);
    greet(num);
}

fn greet(num: i32) {
    println!("Hello, {}!", num);
}

This is just like the following in Perl.

#!/usr/bin/env perl

use v5.28;
use warnings;
use experimental qw(signatures);

my $num = 3;
greet($num);
greet($num);

sub greet($num) {
    say "Hello, $num!";
}

I didn't present a Copy type first--- even though it is more akin to what happens in Perl--- because I wanted to stress that this is the exception, not the rule. Move semantics are the norm in Rust; Copy types are the special case.
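Our own types can opt in to copy semantics too, provided all of their fields are Copy. A minimal sketch, with a made-up Point type:

```rust
// A hypothetical small type; deriving Copy (which also requires Clone)
// opts it in to copy semantics.
#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn sum(p: Point) -> i32 {
    p.x + p.y
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{}", sum(p));
    println!("{}", sum(p)); // fine: p was copied, not moved
}
```

Without the derive line, the second call would fail with the same "use of moved value" error we saw with String.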

Data Races

Rust's ownership model gives memory safety guarantees like "no dangling pointers," and "no double-frees." It turns out, the things it does to ensure memory safety also prevent data races!

Earlier, we said that--- unlike Perl--- Rust gives us easy access to threads. But programming with multiple threads offers whole new challenges. If one thread wants to change a piece of memory, how do we ensure that no other thread can read it at the same time? Rust's ownership model already does exactly this!

There are other kinds of race conditions possible in Rust, but we don't have to worry about data races. Rust's ownership model is remarkable!
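As a sketch of what this looks like in practice, here several threads bump one shared counter. This leans on two standard library types we haven't discussed: Arc, which lets several threads share ownership of a value, and Mutex, which hands out one unique borrow at a time. The function count_with_threads is hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical demo: several threads increment one shared counter.
fn count_with_threads(threads: u32, per_thread: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));
    let mut handles = Vec::new();

    for _ in 0..threads {
        // Each thread gets its own handle to the same counter.
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_thread {
                // lock gives us unique access until the guard is dropped.
                *counter.lock().unwrap() += 1;
            }
        }));
    }

    // Wait for every thread to finish.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_with_threads(4, 1000));
}
```

If we try to share a plain mutable integer between the threads instead, the program simply doesn't compile.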

Error Handling

Error handling in Rust is very different from Perl. Rust has no null value like Perl's undef. Rust doesn't really have exceptions either, though we can panic! which terminates safely and unwinds the stack. Instead, Rust leverages its type system to create two very common enumerations: Option and Result.

Option

An option is either Some, and has some data, or it is None, and has no data.


pub enum Option<T> {
    Some(T),
    None,
}

Since we can have all sorts of optional types, we use a generic type T. An option is either the variant Some(T), in which case it carries data of type T, or it is the variant None and there is no data.

Result

A result is either Ok, and has some good data, or it is Err, and has some error data.


pub enum Result<T, E> {
    Ok(T),
    Err(E),
}

Now there are two generic types, good data of type T and error data of type E.

Options

Say we wanted to write an integer divide that avoided crashing on division by zero. In Perl, we might write this.

sub safediv($x, $y) {
    return if $y == 0;
    int($x / $y);
}

This returns an undefined value when we try to divide by zero. We might call it like so

my $d = safediv(1, 2);
say $d if defined $d;

The value that we get back might be undef, so we check first before printing it.

In Rust, we might write this.


fn safediv(x: i32, y: i32) -> Option<i32> {
    if y == 0 {
        return None;
    }
    Some(x/y)
}

This returns an Option<i32>. It will be None if we try to divide by zero. Otherwise it will have our integer wrapped in a Some. We might call it like so

fn main() {
    let option_d = safediv(1, 2);
    match option_d {
        Some(d) => println!("{}", d),
        None => {},
    }
}

fn safediv(x: i32, y: i32) -> Option<i32> {
   if y == 0 {
       return None;
   }
   Some(x/y)
}

Here we match on our Option. If it is a Some, then we destructure it as part of the match; we can access our integer as d. If it is None, then we do nothing.

Another way would be with if let.

fn main() {
    if let Some(d) = safediv(1, 2) {
        println!("{}", d);
    }
}

fn safediv(x: i32, y: i32) -> Option<i32> {
   if y == 0 {
       return None;
   }
   Some(x/y)
}

This implicitly does the destructuring match and only does what's inside if it succeeds.

We can also unwrap an option. If we unwrap a Some, we get what's inside. If we unwrap a None, it panics.

fn main() {
    let d = safediv(1, 2).unwrap();
    println!("{}", d);
}

fn safediv(x: i32, y: i32) -> Option<i32> {
   if y == 0 {
       return None;
   }
   Some(x/y)
}

This is kind of like this Perl

    my $d = safediv(1, 2) // die;
    say $d;

Similarly, we can expect, which is like unwrap with an added message.

fn main() {
    let d = safediv(1, 2).expect("Cannot divide by zero!");
    println!("{}", d);
}

fn safediv(x: i32, y: i32) -> Option<i32> {
   if y == 0 {
       return None;
   }
   Some(x/y)
}

This is kind of like this in Perl

    my $d = safediv(1, 2) // die "Cannot divide by zero!";
    say $d;
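Not every missing value deserves a panic. As a sketch of the gentler alternatives, unwrap_or supplies a default, much like Perl's // with a fallback value, and map changes the value inside a Some while leaving a None alone. Both are methods on the standard Option type; the numbers are just for illustration.

```rust
fn safediv(x: i32, y: i32) -> Option<i32> {
    if y == 0 {
        return None;
    }
    Some(x / y)
}

fn main() {
    // Like `safediv(6, 0) // -1` in Perl: fall back to a default.
    let d = safediv(6, 0).unwrap_or(-1);
    println!("{}", d);

    // map transforms the value inside a Some and leaves a None alone.
    let doubled = safediv(6, 3).map(|d| d * 2);
    println!("{:?}", doubled);
}
```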

Results

Results are similar to options, but in addition to indicating that there is an error, we get a chance to say what kind of error it is. For example, we might write our safediv like this instead


fn safediv(x: i32, y: i32) -> Result<i32, String> {
    if y == 0 {
        return Err(String::from("Division by zero!"));
    }
    Ok(x/y)
}

This returns a result which is either Ok with our integer or Err with our error. In this case the error is a String, but we can choose any type. Often we create a custom error type. I guess this is kind of analogous to string exceptions versus exception objects in Perl. But again, there are no exceptions here in Rust; a result is an ordinary enum with two different possibilities. As such, we can match on it, the same as with an option (or any other enum).

fn main() {
    match safediv(1, 2) {
        Ok(d) => println!("{}", d),
        Err(e) => println!("{}", e),
    }
}

fn safediv(x: i32, y: i32) -> Result<i32, String> {
   if y == 0 {
       return Err(String::from("Division by zero!"));
   }
   Ok(x/y)
}
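For comparison, here is a sketch of the same function with a custom error type instead of a String. DivError and its variants are invented for this example. Implementing Display and std::error::Error is the usual convention for error types, though neither is strictly required to put a type in an Err.

```rust
use std::fmt;

// A hypothetical error type for our division function.
#[derive(Debug, PartialEq)]
enum DivError {
    DivisionByZero,
    Overflow,
}

impl fmt::Display for DivError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            DivError::DivisionByZero => write!(f, "Division by zero!"),
            DivError::Overflow => write!(f, "Quotient does not fit in an i32!"),
        }
    }
}

impl std::error::Error for DivError {}

fn safediv(x: i32, y: i32) -> Result<i32, DivError> {
    if y == 0 {
        return Err(DivError::DivisionByZero);
    }
    // checked_div returns None when i32::MIN / -1 overflows.
    x.checked_div(y).ok_or(DivError::Overflow)
}

fn main() {
    for (x, y) in [(1, 2), (1, 0), (i32::MIN, -1)] {
        match safediv(x, y) {
            Ok(d) => println!("{}", d),
            Err(e) => println!("{}", e),
        }
    }
}
```

Now a caller can match on the kind of error rather than comparing strings, which is the same reason we prefer exception objects in Perl.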

And, alternatively, we can use if let.

fn main() {
    if let Ok(d) = safediv(1, 2) {
        println!("{}", d);
    }
}

fn safediv(x: i32, y: i32) -> Result<i32, String> {
   if y == 0 {
       return Err(String::from("Division by zero!"));
   }
   Ok(x/y)
}

Similarly, we can unwrap a result

fn main() {
    let d = safediv(1, 2).unwrap();
    println!("{}", d);
}

fn safediv(x: i32, y: i32) -> Result<i32, String> {
   if y == 0 {
       return Err(String::from("Division by zero!"));
   }
   Ok(x/y)
}

Or call expect on a result

fn main() {
    let d = safediv(1, 2).expect("Cannot divide by zero");
    println!("{}", d);
}

fn safediv(x: i32, y: i32) -> Result<i32, String> {
   if y == 0 {
       return Err(String::from("Division by zero!"));
   }
   Ok(x/y)
}

Question Mark

Lots of library functions--- including the standard library, as well as third-party libraries--- return either options or results, so you'll likely get used to using things like match, if let, while let, unwrap, and expect even before you make an Option or Result yourself.

Example: Open

For example, when we open a file in Perl, it returns an undefined value when it fails, so we usually write something like this

open my $fh, '<', $file or die "Cannot open file '$file': $!";

Upon success, our file handle is in $fh. Upon failure, we retrieve the reason for the failure from $!.

The similar maneuver in Rust

use std::fs::File;
use std::path::Path;

fn main() {
    let path = Path::new("no_such_file");
    let result = File::open(&path);
    match result {
        Ok(f) => println!("The open succeeded\n{:#?}", f),
        Err(e) => println!("The open failed\n{:#?}", e),
    }
}

gives us a Result<File, Error>. Upon success, a std::fs::File is wrapped in an Ok. Among other things, it contains the file descriptor we need to read from the file. Upon failure, a std::io::Error is wrapped in an Err. Among other things, it contains the reason for the failure.

If we match on the result, we could inspect the File struct or the Error struct like so

use std::fs::File;
use std::path::Path;

fn main() {
    let path = Path::new("no_such_file");
    let result = File::open(&path);
    match result {
        Ok(f) => println!("The open succeeded\n{:#?}", f),
        Err(e) => println!("The open failed\n{:#?}", e),
    }
}

Again, we can call unwrap or expect on a Result and it will panic if it is Err, so perhaps a more direct analogue to

open my $fh, '<', $file or die "Cannot open file '$file': $!";

would be

use std::fs::File;
use std::path::Path;

fn main() {
    let path = Path::new("no_such_file");
    let f = File::open(&path).expect("Cannot open file!");
    println!("The open succeeded\n{:#?}", f);
}

This unwraps our File on success and panics with our message on failure.

The ? (question mark) operator

Rust has a more idiomatic way of dealing with options and results. Perhaps this is best shown by example. Say we wanted to read the first eight bytes of a file (perhaps we want to check if it's a PNG file or something). We might write something like this

use std::fs::File;
use std::io;
use std::io::{Error, ErrorKind};
use std::io::prelude::*;
use std::path::Path;

fn main() {
   let path = Path::new("src/main.rs");

   let first_eight = read_eight_bytes(&path);

   dbg!(&first_eight);
}

fn read_eight_bytes(path: &Path) -> Result<[u8; 8], io::Error> {
    let result = File::open(path);

    let mut f = match result {
        Ok(f) => f,
        Err(e) => return Err(e),
    };

    let mut buffer = [0; 8];

    let result = f.read(&mut buffer[..]);

    let n = match result {
        Ok(n) => n,
        Err(e) => return Err(e),
    };

    if n == 8 {
        Ok(buffer)
    } else {
        Err(Error::new(ErrorKind::Other, "Could not read 8 bytes!"))
    }
}

We're given a path and we're returning a Result<[u8; 8], io::Error>. That is, we're going to return Ok with eight bytes or Err with the reason we couldn't.

First, we open the path, which might fail. If it does, we return its io::Error. If it succeeds, we unwrap the File and try to read eight bytes from it. If that fails, we return its io::Error. If it succeeds, we check that we got eight bytes. If so, return them in an Ok. If not, return a custom error.

Both of those matches have the same shape: unwrap the Ok or return early with the Err. This is so common, that we can replace it with a single question mark!

use std::fs::File;
use std::io;
use std::io::{Error, ErrorKind};
use std::io::prelude::*;
use std::path::Path;

fn main() {
   let path = Path::new("dirk-gently.png");
   let first_eight = read_eight_bytes(&path);
   dbg!(&first_eight);
}

fn read_eight_bytes(path: &Path) -> Result<[u8; 8], io::Error> {
    let mut f = File::open(path)?;

    let mut buffer = [0; 8];

    let n = f.read(&mut buffer[..])?;

    if n == 8 {
        Ok(buffer)
    } else {
        Err(Error::new(ErrorKind::Other, "Could not read 8 bytes!"))
    }
}

Not bad, eh? We're doing all the proper error checking, but it's mostly just the "happy path" showing in our code.

Also, Result<T, io::Error> is so common that there's a type alias for it, io::Result<T>. That is, we could replace


fn read_eight_bytes(path: &Path) -> Result<[u8; 8], io::Error> {

with


fn read_eight_bytes(path: &Path) -> io::Result<[u8; 8]> {

if we wanted.
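
The question mark is not limited to Result. It also works on an Option inside a function that returns an Option: a None makes the function return None early, just as an Err returns early above. Here is a minimal sketch of that; the function name first_char_of_first_line is made up for illustration.

```rust
// Hypothetical helper, only here to show `?` on Option values.
fn first_char_of_first_line(text: &str) -> Option<char> {
    // `lines().next()` is None when the text has no lines at all.
    let line = text.lines().next()?;

    // `chars().next()` is None when that first line is empty.
    let ch = line.chars().next()?;

    Some(ch)
}

fn main() {
    dbg!(first_char_of_first_line("Hank\nDean"));
    dbg!(first_char_of_first_line(""));
    dbg!(first_char_of_first_line("\nDean"));
}
```

Only the first call yields a Some; the other two bail out at the first and second question mark respectively.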

Conclusion

It takes some getting used to perhaps, but Rust's error handling is really pretty nice. It may seem finicky, but at least we don't have to juggle undefined values or exceptions.

Iterators

One of the great things about being a Perl programmer is that we get to read Higher-Order Perl. And one of the great things about Higher-Order Perl is the explanation of iterators.

Rust iterators are a joy to use. They really make Rust fun. If you think a "low-level" systems language is going to be tedious to use, you're in for a treat. If you like to use things like map and grep in Perl, you're going to love Rust.

For loops

Earlier, I mentioned revisiting Rust's for loop. Let's do that now. It looks like this

fn main() {
    let names = vec!["Hank", "Dean", "Brock"];

    for name in names {
        println!("Hello, {}!", name);
    }
}

But this is really syntactic sugar for something like this

fn main() {
    let names = vec!["Hank", "Dean", "Brock"];

    let mut iterator = names.into_iter();

    while let Some(name) = iterator.next() {
        println!("Hello, {}!", name);
    }
}

That is, Rust for loops are not really a separate thing. They are just an alternate syntax for consuming iterators.

Rust has an Iterator trait that we can use to define our own iterators. Then we can use for loops on them.
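
As a rough sketch of what that looks like, here is a tiny countdown iterator. The Countdown struct and its remaining field are invented for this example; the only thing the Iterator trait requires of us is an Item type and a next method.

```rust
// A made-up iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    // Return Some(value) while there is more to give, then None forever.
    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

fn main() {
    // Because Countdown implements Iterator, a for loop can consume it.
    for n in (Countdown { remaining: 3 }) {
        println!("{}...", n);
    }
}
```

Implementing next is enough to get all the other iterator methods (map, filter, collect, and so on) for free.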

Lazy

Rust iterators are lazy. That means they must be consumed before they do anything. If we write this

fn main() {
    let names = vec!["Hank", "Dean", "Brock"];

    let hellos = names.iter().map(|name| format!("Hello, {}!", name));
    
    println!("{:#?}", hellos);
}

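then no greetings are produced. All that gets printed is the debug form of the unconsumed map adaptor, not a list of strings. We've set the machine up, but we haven't turned the crank. We have to do something like collect it into a vector to do that.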

fn main() {
    let names = vec!["Hank", "Dean", "Brock"];

    let hellos = names.iter().map(|name| format!("Hello, {}!", name)).collect::<Vec<_>>();
    
    println!("{:#?}", hellos);
}

Being lazy also means that iterators can be infinite.
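
For instance, an open-ended range like 1.. never stops on its own, yet we can still map over it as long as we only ask for a finite number of items. A minimal sketch, with first_squares being a name made up for this example:

```rust
// Hypothetical helper: the first `count` square numbers.
fn first_squares(count: usize) -> Vec<u64> {
    (1u64..)              // 1, 2, 3, ... with no upper bound
        .map(|n| n * n)   // still lazy: nothing has been computed yet
        .take(count)      // stop after `count` items
        .collect()        // this is what turns the crank
}

fn main() {
    println!("{:?}", first_squares(5)); // prints [1, 4, 9, 16, 25]
}
```

Without the take, the collect would try to gather squares until the arithmetic overflowed or memory ran out.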

Tests

In Perl, we put our tests in a t directory and then prove -l will run them. Similarly, Rust has a tests directory and cargo test will run the tests there.

Rust          Perl
tests         t
cargo test    prove -l
use testing;  use Test::Most;

But Rust does this mostly for integration tests. Unit tests are typically put right in the same file as the code.

For example, I wrote a Perl module called Time::Moment::Epoch which converts a bunch of different epoch times to Time::Moment times. It has a bunch of unit tests in t/Time-Moment-Epoch.t.

I wrote a similar Rust crate called epochs, which converts those same epoch times to chrono::NaiveDateTime times. It has similar unit tests at the bottom of the library file itself.

The #[cfg(test)] attribute tells the compiler to include those tests only when we run cargo test. They are left out entirely when we run cargo build or cargo run.
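
To give a feel for the layout, here is a minimal sketch of a library file with its unit tests at the bottom. The days_to_seconds function is a stand-in invented for this example; it is not taken from the epochs crate.

```rust
// src/lib.rs of a hypothetical crate

/// Converts a number of days to seconds (illustrative helper only).
pub fn days_to_seconds(days: i64) -> i64 {
    days * 86_400
}

// Everything in this module is compiled only for `cargo test`.
#[cfg(test)]
mod tests {
    // Bring the code under test into scope.
    use super::*;

    #[test]
    fn one_day() {
        assert_eq!(days_to_seconds(1), 86_400);
    }

    #[test]
    fn zero_days() {
        assert_eq!(days_to_seconds(0), 0);
    }
}
```

Each function marked #[test] is run as a separate test, much like each .t file's assertions under prove.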

Update (2024-05-15): I talked about testing at the Rust & C++ Cardiff book club. We are reading Rust for Rustaceans together and I did chapter 6. You can view my slides, some of which have links that I did not follow during the talk.

Documentation

Rust uses markdown pretty much everywhere that we would use POD in Perl.

Rust       Perl
//         #
//!, ///   =pod
markdown   POD
doctests   Test::Doctest
rustdoc    perldoc

Rust uses // for regular line comments, but /// for markdown comments. We can even use markdown's triple backticks ``` to include Rust source code. When we do this, it automatically becomes a doctest! That is, cargo test --doc runs the code we include in markdown comments. This helps us keep our docs up to date (cargo test includes --doc by default). I think perhaps the worst error in Perl is a SYNOPSIS section in our POD that doesn't work because the code changed, but the POD didn't. It's annoyingly easy to do.

For example, here is the apfs function from the epochs crate.


/// APFS time is the number of nanoseconds since the Unix epoch
/// (*cf.*, [APFS filesystem format](https://blog.cugu.eu/post/apfs/)).
///
/// ```
/// use epochs::apfs;
/// let ndt = apfs(1_234_567_890_000_000_000).unwrap();
/// assert_eq!(ndt.to_string(), "2009-02-13 23:31:30");
/// ```
pub fn apfs(num: i64) -> Option<NaiveDateTime> {
    epoch2time(num, 1_000_000_000, 0)
}

It contains a description of the function as well as a doctest. The compiler itself ignores it, like any other comment, but rustdoc will grab that description and run that test. Currently, rustdoc only generates HTML output that we must then view in a browser. There is no command line tool for looking up things in the docs like perldoc yet.

I've documented all of my functions in a similar way, so rustdoc generates this whole page for me. I didn't write that page. It was all generated from the source code when I uploaded it to crates.io!

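Now, /// documents whatever immediately follows it, like the function above. There is also //!, which documents the item that encloses it, so it goes inside that item (typically at the top of a crate or module file). That summary line for the whole crate came from one of those at the top.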
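
Here is a small sketch showing both kinds of doc comment side by side. The greetings module and hello function are invented for illustration; a module is used here only so that the //! comment has something to sit inside.

```rust
pub mod greetings {
    //! Helpers for saying hello.
    //!
    //! This `//!` comment documents the enclosing `greetings` module.

    /// Builds a greeting for `name`.
    ///
    /// This `///` comment documents the function that follows it.
    pub fn hello(name: &str) -> String {
        format!("Hello, {}!", name)
    }
}

fn main() {
    println!("{}", greetings::hello("Hank"));
}
```

At the top of src/lib.rs, the same //! form documents the crate as a whole.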

If you want to learn more, you already have everything you need. The documentation that comes with Rust is truly outstanding. In particular, there is the book.

The Rust Programming Language (aka, "the book") is a free guide to the language that comes with it. It is remarkably well done. Whether you read it cover-to-cover or jump around haphazardly, you should definitely read it. You can view a local copy with rustup docs --book, even when you're offline. You can also buy an ink-on-paper copy if you prefer.

You can see lots of other great docs that come with Rust by running rustup docs.

Other books