Introduction
Many introductions to Rust already exist. Most of them are aimed at C++ programmers. That makes sense, but lots of folks are coming to Rust from other languages now.
My current day job is mostly Perl. It occurred to me that an introduction to Rust aimed at people who already know Perl could be useful.
Rust is exciting to Perl programmers for a number of reasons.
- We can write faster programs. Rust is generally more performant than Perl. Often much more performant.
- We can write more memory-efficient programs. Rust gives us much greater control over the memory we use than does Perl. Our Perl programs are often memory hogs.
- We can write multi-threaded programs. Rust provides easy access to threads. In Perl, we are largely restricted to processes.
- We can write Perl extension modules in Rust. Rust interacts with the C ABI really well. If we have a slow Perl module that we were thinking of re-writing in C, we can consider writing it in Rust instead.
- We can target WebAssembly in Rust. If we care about web programming, we might be able to use Rust in place of much of our JavaScript. And with the advent of WASI, there are more reasons to target WebAssembly; it's not just for the web anymore. For example, with Krustlet we can substitute WebAssembly artifacts for Docker containers in the Cloud. And now there is Spin!
Rust is a big, complicated language. Where do we begin? It feels like we have to know everything at once. As Lisa Passing put it, the learning curve can be more like a wall you hit. But she goes on to say, "once you've learned how to walk through this wall...you can walk through walls."
I want to walk through walls!
Having to know everything at once makes it hard to teach Rust as well. It seems like no matter where we start, we are always touching on concepts that we haven't covered yet. This is quite the opposite of Perl, where it's fairly easy to learn as we go. But perhaps making this one assumption, that we all know Perl, will help us navigate the complexities of Rust. I don't know if this is going to work, but I thought I'd try it.
-- Tim Heaney
Update: I no longer have this job. If you would like to hire me, please get in touch!
Installation
If you don't already have Rust installed, we should probably start there. It's not entirely necessary. You could explore for some time on the Rust Playground without installing anything. But eventually, you'll probably want to install it.
System Rust
One way to install Rust might be through your operating system. I am writing this on a Debian 10 machine, so I could install Rust with
$ sudo apt install rustc
or, better, install Cargo with
$ sudo apt install cargo
which has rustc as a dependency.
Rustup
Another way to install Rust is with the amazing rustup. Just as we often install and manage multiple versions of Perl with perlbrew or plenv, we likely want to do the same for Rust. If we're doing this to ensure we always have the latest version, then it's even more important for Rust than for Perl. They release a new version of stable Perl about once a year. They release a new version of stable Rust every six weeks! That sounds like it could be painful, but it's usually no big deal. We run one command
$ rustup update stable
and a few seconds later, we have the latest Rust toolchain. I can handle that every six weeks.
Overview
Tools
As we've just seen, Rust includes some great tooling. Things like cargo and rustup make it a real pleasure to use. And we'll see more as we go.
Rust | Perl |
---|---|
rustc | perl |
cargo | cpanm , dzil , and more |
rustup | plenv , perlbrew |
rustfmt | perltidy |
clippy | perlcritic |
rustdoc | perldoc |
module | module |
crate | distribution |
crates.io | metacpan.org |
Rust Foundation | The Perl Foundation |
We've spent decades coming up with some of this stuff in Perl. In Rust, it's all here already!
Differences
As we'll see, there are a lot of huge differences between Perl and Rust.
Rust | Perl |
---|---|
static types | dynamic types |
strong types | weak types |
move, borrow | copy, reference |
immutable by default (mut ) | mutable |
private by default (pub ) | public |
expressions | statements, expressions |
Similarities
But some things will be familiar. Scope works pretty much the same. Both languages use the use keyword and :: in similar ways. And both have lots of "C-like" syntax in common.
Hello, Rust!
Let's start with the "hello world" program! This was originated by Brian Kernighan some 40 years ago. It's a brilliant piece of pedagogy in C. It's less interesting in Perl and Rust, but it will serve our purposes nonetheless. The "hello world" program in Perl might look like this
perl -E 'say "Hello, World!"'
Here we call perl directly, feeding it our Perl source code to be immediately interpreted. Alternatively, we could put our code in a file
#!/usr/bin/env perl
use v5.28;
use warnings;
say "Hello, World!";
and then feed that file to perl. If the file were called hello, then we could call perl directly
$ perl hello
Hello, World!
or through the shebang line
$ ./hello
Hello, World!
In Rust, the "hello world" program looks like this
fn main() {
    println!("Hello, World!");
}
To run it, we must first compile it. If it were in a file called hello.rs, we could call the Rust compiler directly with
$ rustc hello.rs
This would create another file, hello, which is an executable
$ ./hello
Hello, World!
Our Rust source code is no longer required. We have a stand-alone executable, specific to our machine.
$ file hello
hello: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=05fddacb48ac6fa3ac26ff897c61f31941a04c4d, with debug_info, not stripped
However, if we wanted it to run on some other machine, we might have to recompile it for that platform.
Contrast that with Perl. Our Perl source code can be run by any perl on any platform. Perl itself is a big ole C program that must be compiled to a perl executable for each platform. But once we have it (often it comes with the operating system, so we don't even have to compile it), we can run any Perl code from anywhere.
So that's the first big difference in our workflow. We run Perl immediately as we are developing, but we must first compile our Rust.
Hello Cargo
In truth, we rarely call the Rust compiler directly like that. Rust comes with a package manager called cargo which does all kinds of things for us, including calling rustc.
In fact, cargo will write the "hello world" program! If we use "cargo new" to create a new project
$ cargo new hello
Created binary (application) `hello` package
Then we get a directory called "hello" with a few things already in it
$ tree hello
hello
├── Cargo.toml
└── src
└── main.rs
1 directory, 2 files
If we look in that src/main.rs file, we see the "hello world" program
$ cat hello/src/main.rs
fn main() {
    println!("Hello, world!");
}
If we change to that directory, then "cargo run" will first call the compiler and then run the resulting executable
$ cd hello
$ cargo run
Compiling hello v0.1.0 (/home/tim/hello)
Finished dev [unoptimized + debuginfo] target(s) in 0.28s
Running `target/debug/hello`
Hello, world!
So this is more like a typical workflow in Rust. Not bad, right? Cargo really takes the rough edges off. It does a whole lot more too, so we'll be seeing it again.
Output
Let's go back to our "hello world" programs, in Perl
#!/usr/bin/env perl
use v5.28;
use warnings;
say "Hello, World!";
and Rust
fn main() {
    println!("Hello, World!");
}
We see that println! in Rust is like say in Perl. There are similar things for print and warn.
Rust | Perl |
---|---|
print! | print |
println! | say |
eprint! | warn |
eprintln! | warn |
But for anything more complicated than a simple string like "Hello, World!" we need to format things first. In that sense, maybe a better analogy is to Perl's printf and sprintf.
Rust | Perl | TMTOWTDI |
---|---|---|
print! | print | printf |
println! | say | |
eprint! | warn | |
eprintln! | warn | |
format! | interpolation | sprintf |
Instead of interpolating the values of variables as in Perl
my $name = 'Tim';
say "Hello, $name!";
we format them like so
let name = "Tim";
println!("Hello, {}!", name);
The curly brackets ({}) are a placeholder for the value of name. To insert more variables, add more pairs of brackets.
let name = "Tim";
let salutation = "Mr.";
println!("Hello, {} {}!", salutation, name);
(I don't think this is common elsewhere, but here in Baltimore I am much more likely to be called "Mr. Tim" than either "Mr. Heaney" or "Tim"; especially by younger people.) Anyway, there are lots more options, but you get the idea.
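To give a flavor of those options, here is a small sketch that pairs a few format specifiers with the sprintf patterns they resemble. The helper names are made up for illustration; only the format! syntax itself comes from the standard library.

```rust
// Hypothetical helpers; each pairs a Rust format spec with the
// Perl sprintf pattern it resembles.

// Perl: sprintf("%8.2f", $x)
fn fixed_width(x: f64) -> String {
    format!("{:8.2}", x)
}

// Perl: sprintf("%05d", $n)
fn zero_padded(n: i32) -> String {
    format!("{:05}", n)
}

// Perl: sprintf("%x", $n)
fn hex(n: u32) -> String {
    format!("{:x}", n)
}

// Perl: sprintf("%-10s|", $s)
fn left_aligned(s: &str) -> String {
    format!("{:<10}|", s)
}

fn main() {
    // Square brackets make the padding visible in the output.
    println!("[{}]", fixed_width(12.3456));
    println!("[{}]", zero_padded(42));
    println!("[{}]", hex(255));
    println!("[{}]", left_aligned("Tim"));
}
```

As in Perl, the width and precision live in the format string, but here the part after the colon replaces the letter-based conversions of sprintf.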
One of my favorite ways to show output while I'm working on Perl is with Data::Printer. We can achieve something similar with Rust's formatting.
let name = "Tim";
let salutation = "Mr.";
println!("Hello, {:?} {}!", salutation, name);
When we run this, we get something slightly different: the salutation is printed with its quotes, as in Hello, "Mr." Tim! The {} formatted the variable according to something called its Display trait. But the {:?} formats it according to its Debug trait. It doesn't make much difference here, but in practice I find it indispensable.
I often use Data::Printer in Perl to print out the values of complex data structures. It does that with clever introspection at run time. We can't do anything like that in Rust, but if we make sure the data structures we care about implement the Debug trait (and the #[derive(Debug)] attribute does exactly this for us), then we can just print them out with {:?}. If they're very complex, we can print them out with {:#?}, which is the same thing only prettier (arguably more like Data::Printer).
Rust | Perl |
---|---|
println!("{:#?}", foo) | use DDP; p $foo |
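Here is a minimal sketch of what that looks like for a data structure of our own. The Person struct and the helper functions are hypothetical, invented for this example; the #[derive(Debug)] attribute and the {:?} and {:#?} specifiers are the standard ones.

```rust
// A hypothetical struct for illustration. Deriving Debug is all it
// takes to make it printable with {:?} and {:#?}.
#[derive(Debug)]
struct Person {
    salutation: String,
    name: String,
}

// Build the example value used below.
fn tim() -> Person {
    Person {
        salutation: String::from("Mr."),
        name: String::from("Tim"),
    }
}

// Display-style output: we decide how the fields are laid out.
fn greeting(p: &Person) -> String {
    format!("Hello, {} {}!", p.salutation, p.name)
}

// Compact, single-line Debug output.
fn debug_line(p: &Person) -> String {
    format!("{:?}", p)
}

// Pretty-printed, multi-line Debug output.
fn debug_pretty(p: &Person) -> String {
    format!("{:#?}", p)
}

fn main() {
    let p = tim();
    println!("{}", greeting(&p));
    println!("{}", debug_line(&p));
    println!("{}", debug_pretty(&p));
}
```

The compact form fits on one line, while the pretty form puts each field on its own indented line, which is the one that feels most like Data::Printer.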
I've been using println! everywhere (which is generally true in practice as well), but this format syntax works for all of the macros above.
// Format the string and return it.
let greeting = format!("Hello, {} {}!", salutation, name);
// Print the line to stderr, rather than stdout.
eprintln!("Hello, {} {}!", salutation, name);
There is also a handy little dbg! macro, which prints out the variable name and value, along with the file name and line number, to stderr.
Rust | Perl |
---|---|
dbg!(foo) | warn "\$foo = $foo" |
Update (2022-01-13): With the release of Rust 1.58, we can put captured identifiers in format strings! This is kind of like interpolation in Perl!
let name = "Tim";
let salutation = "Mr.";
println!("Hello, {salutation} {name}!");
Macros
What are macros? When we see a "function" with an exclamation point in its name like println!, format!, and dbg!, we know it is not actually a function, but a macro.
Macros are how we do metaprogramming in Rust. A metaprogram is a program that generates a program. In Perl, we do this with things like source filters, Devel::Declare, Moose, and eval. In Rust, we have macros.
Macros are not ordinary Rust code, per se. Rather, they generate Rust code. They are expanded at compile time, before the generated code is type checked. As such, they can do things that functions can't. For example, you may have noticed that println! is variadic. We've already called it with one, two, and three arguments.
println!("Hello, World!");
println!("Hello, {}!", name);
println!("Hello, {} {}!", salutation, name);
You may not have noticed, because that's normal for Perl. But Rust does not have variadic functions. We couldn't make a println function like this. But we can do it with a macro.
There's a lot more to macros, but for now it's probably enough to know that the exclamation mark indicates we're looking at a macro rather than a function. Sort of like the sigils in Perl!
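To make the variadic point concrete, here is a sketch of a tiny macro of our own written with macro_rules!. The sum! macro is hypothetical, invented for this example; it simply shows that a macro can accept one, two, or any number of arguments.

```rust
// A hypothetical variadic macro: adds up any number of expressions.
macro_rules! sum {
    // Base case: a single expression is its own sum.
    ($x:expr) => {
        $x
    };
    // Recursive case: the first expression plus the sum of the rest.
    ($x:expr, $($rest:expr),+) => {
        $x + sum!($($rest),+)
    };
}

fn main() {
    println!("{}", sum!(1));
    println!("{}", sum!(1, 2));
    println!("{}", sum!(1, 2, 3));
}
```

Each call is rewritten into plain additions before the program is type checked, so there is no function taking a variable number of arguments anywhere in the result.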
Control Flow
Flow of control in Rust will look pretty familiar. In Rust, we don't use round brackets on the condition in if, while, and for as we do in Perl.
Rust | Perl |
---|---|
if | if , ternary operator |
if ! | unless |
while | while |
loop | while(1) |
for | for , foreach |
continue | next |
break | last |
'label: | LABEL: |
In Rust, nearly everything is an expression. That includes if, so we don't need a separate ternary operator.
In Perl, we might rewrite this if statement
my $name;
if ($formal) {
$name = 'Timothy';
} else {
$name = 'Tim';
}
with a ternary operator.
my $name = $formal ? 'Timothy' : 'Tim';
Many eschew the ternary operator, but I think that's better code. We replace a separate variable declaration, an if statement, and two assignment statements with a single assignment.
In Rust, we would probably never write the first version
let name;
if formal {
    name = "Timothy";
} else {
    name = "Tim";
}
because the second version
let name = if formal { "Timothy" } else { "Tim" };
just seems more natural. Since if is an expression, the first version is really going out of its way to ignore what it's returning just to repeat the name = assignment twice.
Note that neither the Perl nor the Rust versions need to stay on one line.
my $name = $formal
? 'Timothy'
: 'Tim';
let name = if formal {
    "Timothy"
} else {
    "Tim"
};
Loops
Loops are pretty much the same. If we want an infinite loop, we say loop instead of "while true".
while (1) {
    # do stuff
}
loop {
    // do stuff
}
We don't have a C-style three-part for-loop in Rust as we do in Perl.
for (my $i = 1; $i <= 10; $i++) {
say $i;
}
That's okay. I almost never use it in Perl, so I almost never miss it in Rust.
Rust's for is more like Perl's foreach back when we distinguished between for and foreach.
for my $i (1..10) {
say $i;
}
fn main() {
    for i in 1..=10 {
        println!("{}", i);
    }
}
If we needed a three-part for-loop, we would have to write the analogous while-loop with the three separate parts.
my $i = 1;
while ($i <= 10) {
say $i;
$i++;
}
fn main() {
    let mut i = 1;
    while i <= 10 {
        println!("{}", i);
        i += 1;
    }
}
I might have more to say about Rust's for later when we talk about iterators. It's really quite interesting how it works.
Maybelet
Rust also has if let and while let, which we will talk about more after we've discussed error handling. Paul Evans described them beautifully at FOSDEM 2021. At about 24:30 of this video, he proposes if(maybelet and while(maybelet for Perl 2025. That's pretty much how and why if let and while let work in Rust.
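As a small preview, here is a sketch of both forms using Option values from the standard library. The function names are hypothetical; the idea is only that the body runs when the pattern matches, and the matched value is bound to a new name inside it.

```rust
// Return the first word of a string, or a fallback if there is none.
// The body of `if let` runs only when next() yields Some(word).
fn first_word(s: &str) -> String {
    if let Some(word) = s.split_whitespace().next() {
        word.to_string()
    } else {
        String::from("(nothing)")
    }
}

// Pop values off a stack until it is empty, collecting them in the
// order they were popped. The loop ends when pop() returns None.
fn drain(mut stack: Vec<i32>) -> Vec<i32> {
    let mut popped = Vec::new();
    while let Some(top) = stack.pop() {
        popped.push(top);
    }
    popped
}

fn main() {
    println!("{}", first_word("Hank Venture"));
    println!("{}", first_word("   "));
    println!("{:?}", drain(vec![1, 2, 3]));
}
```

In Perl we would usually write the same loop as while (defined(my $top = pop @stack)), relying on undef to signal the end.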
Scope
Scope in Rust works much like scope in modern Perl. Every block (if, for, while, &c.) creates a new scope. We can add a bare block for a new scope in exactly the same way ({}).
These two do the same things for the same reasons.
my $first_name = "Dean";
my $last_name = "Venture";
{
my $first_name = "Hank";
say "$first_name $last_name";
}
say "$first_name $last_name";
let first_name = "Dean";
let last_name = "Venture";
{
    let first_name = "Hank";
    println!("{} {}", first_name, last_name);
}
println!("{} {}", first_name, last_name);
Both print
Hank Venture
Dean Venture
We first print "Hank Venture" because the first name is Hank in the new scope, but we can still see the Venture in the outer scope. Then we print "Dean Venture" because the inner scope is over and first name is still Dean in the outer scope.
One difference is that Rust encourages shadowing of variables (re-using a name in the same scope), but Perl does not.
my $first_name = "Dean";
my $first_name = "Hank"; # Warning!
let first_name = "Dean";
let first_name = "Hank";
// first_name is *rebound* to "Hank".
Code like this throws a warning in Perl, but is often seen in Rust code. Note that the re-used symbol gets a completely new binding, so it can even change type.
let first_name = "Dean";
let first_name = String::from("Hank");
Here first_name changes from a &str to a String.
Pattern Matching
Rust also has match, which is like match from many functional programming languages. It is very much dependent on types. Perl's attempt at a switch syntax with smart-matching is perhaps the closest analogy. Indeed, I think smart-matching failed because it tried to use information about the types of the operands and it didn't really have this information. Unlike every other operator in Perl, the smart-match operator was not the boss.
Update (2022-05-29): I talked about pattern matching at the Rust & C++ Cardiff book club. We are reading The Rust Programming Language together and I did the bit on chapter 18 (it starts about 50 minutes in). Also, you can view my slides.
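For a first taste of match, here is a minimal sketch on an integer. The describe function is hypothetical; it shows literal patterns, alternatives, ranges, a guard, and the catch-all underscore.

```rust
// Classify a number. The arms are tried from top to bottom and the
// first one that matches wins.
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",               // a literal
        1 | 2 | 3 => "small",      // alternatives
        4..=9 => "medium",         // an inclusive range
        _ if n < 0 => "negative",  // a catch-all with a guard
        _ => "large",              // everything else
    }
}

fn main() {
    for n in [0, 2, 7, -5, 100] {
        println!("{} is {}", n, describe(n));
    }
}
```

The compiler checks that the arms cover every possible value, so removing the final underscore arm would be a compile-time error rather than a silent fall-through.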
Regular Expressions
When we think about pattern matching in Perl, we probably think of regular expressions. Regular expressions aren't just built in to Perl, they are part of its DNA.
Rust has no built in regular expressions, nor even any in the standard library. To employ regular expressions in a Rust program, we have to import a library from crates.io.
But before reaching for a regular expression library, consider other solutions. We use regular expressions for all sorts of things in Perl because they are so fast and easy. A simple string match like
my $s = 'Timothy';
say $s if $s =~ /^Tim/;
might better be done in Rust with a string method
let s = "Timothy";
if s.starts_with("Tim") {
    println!("{s}");
}
No regular expression required.
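A few more string methods cover a surprising number of everyday regex uses. This is only a sketch with hypothetical helper names; the methods themselves (starts_with, ends_with, contains, strip_prefix, split, trim) are all on str in the standard library.

```rust
// Perl: $s =~ /^Tim/
fn starts_with_tim(s: &str) -> bool {
    s.starts_with("Tim")
}

// Perl: $s =~ /thy$/
fn ends_with_thy(s: &str) -> bool {
    s.ends_with("thy")
}

// Perl: $s =~ /mot/
fn contains_mot(s: &str) -> bool {
    s.contains("mot")
}

// Perl: my ($rest) = $s =~ /^Tim(.*)/s
// Returns None when the prefix is not there.
fn after_tim(s: &str) -> Option<&str> {
    s.strip_prefix("Tim")
}

// Perl: split /\s*,\s*/, $s
fn fields(s: &str) -> Vec<&str> {
    s.split(',').map(|field| field.trim()).collect()
}

fn main() {
    let s = "Timothy";
    println!("{}", starts_with_tim(s));
    println!("{}", ends_with_thy(s));
    println!("{}", contains_mot(s));
    println!("{:?}", after_tim(s));
    println!("{:?}", fields("alpha, beta ,gamma"));
}
```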
But when we do wish to use a regular expression library in Rust, there are several to choose from. Probably the most common choice is regex, which provides regular expressions like Perl's re::engine::RE2 regexes, not its built-ins. If we need any of Perl's fancier features like backreferences or look-arounds, we will need to choose a different library.
Here is a Perl regex I used in is_epoch
# Version 1 UUID's have timestamps in them. For example,
# 33c41a44-6cea-11e7-907b-a6006ad3dba0 => 1e76cea33c41a44
# -------- ---- --- $+{high}$+{mid}$+{low}
# low mid high
# 8 4 3
# and 1e76cea33c41a44 => 2017-07-20T01:24:40.472634Z
my $UUIDv1 = qr{(?<low>[0-9A-Fa-f]{8}) -?
(?<mid>[0-9A-Fa-f]{4}) -?
1 # this means version 1
(?<high>[0-9A-Fa-f]{3}) -?
[0-9A-Fa-f]{4} -?
[0-9A-Fa-f]{12} }mxs;
then later I use it like so
if ($arg =~ /^$UUIDv1$/) {
...
}
A similar Rust regex I used in epochs-cli looks like this
static RE: Lazy<Regex> = Lazy::new(|| {
    Regex::new(
        r"(?x)
        ([0-9A-Fa-f]{8}) -?
        ([0-9A-Fa-f]{4}) -?
        ([0-9]{1})
        ([0-9A-Fa-f]{3}) -?
        [0-9A-Fa-f]{4} -?
        [0-9A-Fa-f]{12}
        ",
    )
    .unwrap()
});
then later I used it like this
if let Some(cap) = RE.captures(text) {
    ...
}
We've already talked about the regex crate, but what's that Lazy business? Compiling a regex is not inexpensive, so we only want to do it once. Lazy compiles the regex the first time it is used at run time and then reuses the result. This is similar to what qr does in Perl. We'll do that with help from the once_cell crate. That is, at the top of this program we have
use once_cell::sync::Lazy;
use regex::Regex;
which provides the Lazy and Regex which appear in the code above.
So there is a bit more ceremony involved in using regular expressions in Rust, but it's not too bad. And we use them less often in Rust because of the rich set of string methods at our disposal, so it's really not a problem.
Types
Perl is dynamically typed. Rust is statically typed. In practice, I think this is the biggest difference for the programmer. In general, we have to be much more aware of the types of variables in Rust than we do in Perl.
Also, Perl is weakly typed, whereas Rust is strongly typed.
Rust | Perl |
---|---|
static | dynamic |
strong | weak |
That means that Rust is not going to do any casting or automatic conversion of types the way that Perl does.
Turning that table inside-out, we see that Rust and Perl couldn't be more different
static | dynamic | |
---|---|---|
strong | Rust | |
weak | Perl |
Let's fill in the rest of the table with other languages you may know
static | dynamic | |
---|---|---|
strong | Rust | Python |
weak | C | Perl |
Python is dynamic, like Perl, but its types are pretty strong. For example, we can't just use the string "3" as a number in Python as we do in Perl; we have to convert it first.
$ perl -E 'say "3" + 4'
7
$ python -c 'print("3" + 4)'
Traceback (most recent call last):
File "<string>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects
$ python -c 'print(int("3") + 4)'
7
C is static, like Rust, but its types are very weak. You want to add a character to an integer? Have at it!
char c = '3';
int i = 4;
printf("%d\n", c + i);
This may not print what you expected, but it will compile and print something.
So Rust types are both static, like C, and strong, like Python? That sounds like it might be a pain to use. And indeed Rust is a good deal more finicky than Perl. It will take some getting used to for sure.
The good news is, Rust has a really nice type system! Unlike the type systems found in most imperative programming languages, Rust has algebraic data types, something usually only found in functional programming languages.
Rust also has type inference, so we can often leave off the type and Rust will figure it out.
These two properties combined mean Rust's types become more of a tool we use, rather than a burden we endure.
Numbers
In Perl, something like "3" + 4 works because the + operator places each of its arguments in numeric context. In Rust, the types of the operands must match because there's a different + operator for each type.
In Perl, the operators are in charge. In Rust, the operands are. (More specifically, the types of the operands.)
And we don't just mean strings and numbers; Rust has lots of different numeric types. There are signed integers, unsigned integers, and floating-point numbers, each in various sizes. We cannot do arithmetic until all the types are the same.
The names of the types are letters i, u, and f followed by the number of bits. There are also types for the native size on the current platform. So we have
signed integers: i8, i16, i32, i64, i128, isize
unsigned integers: u8, u16, u32, u64, u128, usize
floating-point numbers: f32, f64
That's fourteen kinds of numbers, each with its own + operation. Yikes!
In practice, it's not as bad as you might think. This Perl code
my $x = 3;
my $y = 4;
my $z = $x + $y;
say $z;
looks nearly the same in Rust
let x = 3;
let y = 4;
let z = x + y;
println!("{}", z);
We have let instead of my and all of the dollar signs are missing, but that's about it. So where are all the types we were so worried about? In Perl, the dollar signs at least tell us we have scalars. In Rust, we get nothing. In fact, everything is implied. First, those numeric literals (the 3 and 4) have defaults. Since we didn't specify, Rust assumes they are 32-bit signed integers. We could have written them with the type appended, like so.
let x = 3i32;
let y = 4i32;
From there, the types of x and y are inferred. Since we're stuffing i32's in them, they must be of type i32. And then the type of z is inferred. Summing two i32's gives another i32 and we're stuffing that in z, so z must be of type i32.
Incidentally, we can put underscores in numeric literals anywhere we want, just as in Perl. That's handy for big numbers, like 1_000_000 in both Perl and Rust. But in Rust, it's also common to use one to set off these type suffixes.
let x = 3_i32;
let y = 4_i32;
Alternatively, we could specify the types of the variables, like so.
let x: i32 = 3;
let y: i32 = 4;
We could even do both, but that's starting to look silly.
let x: i32 = 3_i32;
let y: i32 = 4_i32;
If x and y were two different types, then trying to add them would be an error.
let x: i32 = 3;
let y: i64 = 4;
let z = x + y;
Note that this is a compile-time error; we never get a chance to run this. Here is some of the compiler output
...
error[E0308]: mismatched types
--> src/main.rs:12:17
|
12 | let z = x + y;
| ^ expected `i32`, found `i64`
error[E0277]: cannot add `i64` to `i32`
--> src/main.rs:12:15
|
12 | let z = x + y;
| ^ no implementation for `i32 + i64`
|
= help: the trait `std::ops::Add<i64>` is not implemented for `i32`
error: aborting due to 2 previous errors
...
One way to remedy this would be to use the as keyword to cast one of the types into the other.
let x: i32 = 3;
let y: i64 = 4;
let z = (x as i64) + y;
Here, z would be inferred to be an i64, as it's the sum of two i64s. This is perfectly safe, as every i32 is expressible as an i64. If we went the other way, namely
let z = x + (y as i32);
then we have to be a bit careful. In this case, we're fine because 4 is obviously expressible as an i32. But not every i64 is expressible as an i32. And the as keyword is naïve, so it could quietly give us the wrong answer. We will say more about this when we discuss error handling.
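To see the problem in action, here is a sketch comparing as with the checked conversion i32::try_from from the standard library. The two function names are hypothetical; the point is that one quietly wraps around while the other reports that the value does not fit.

```rust
use std::convert::TryFrom;

// Narrow with `as`: values that do not fit in an i32 wrap around.
fn narrow_with_as(y: i64) -> i32 {
    y as i32
}

// Narrow with a checked conversion: None when the value does not fit.
fn narrow_checked(y: i64) -> Option<i32> {
    i32::try_from(y).ok()
}

fn main() {
    let small: i64 = 4;
    let big: i64 = 3_000_000_000; // larger than i32::MAX

    println!("{}", narrow_with_as(small));
    println!("{}", narrow_with_as(big)); // a negative number!
    println!("{:?}", narrow_checked(small));
    println!("{:?}", narrow_checked(big));
}
```

The checked version hands back an Option here, so the caller has to decide what to do when the number is too big instead of carrying on with garbage.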
Strings
Strings are complicated. ("Anyone who says differently is selling something.")
In Perl, a lot of the complications are hidden away. This is an example of Perl's DWIM approach. For the most part, strings in Perl behave as we expect and we don't really think about it too much.
In Rust, all of the complications of strings are right in our faces. Rust's approach is not DWIM, rather Rust prefers to be explicit about most everything. One consequence of this is that we end up with multiple types for strings!
Let's look at our "hello name" examples again. In Perl, we had
#!/usr/bin/env perl
use v5.28;
use warnings;
my $name = "Tim";
say "Hello, $name!";
while in Rust, we had
fn main() {
    let name = "Tim";
    println!("Hello, {}!", name);
}
Again, we see that Rust's let is kind of like Perl's my and it's apparently inferring the type of the string literal. Writing it explicitly, we'd have the following
fn main() {
    let name: &str = "Tim";
    println!("Hello, {}!", name);
}
So the type of name is &str. What is that? It's a string slice, which is not very similar to a Perl string. The static string literal is somewhere in memory and we just get a view into it. Rust has another type called String which is more akin to a Perl string. A Rust String is a dynamic chunk of memory holding our string, but to get one from a string literal we have to convert it. One way to do that is like so
fn main() {
    let name: String = String::from("Tim");
    println!("Hello, {}!", name);
}
Again, we could let Rust infer the type (which truly feels redundant here) and just write
fn main() {
    let name = String::from("Tim");
    println!("Hello, {}!", name);
}
but it still seems like a lot of work for just a string. But wait, there's more!
but it still seems like a lot of work for just a string. But wait, there's more!
Decoded or not?
A String in Rust is a sequence of bytes that is guaranteed to be valid UTF-8. And a &str is a slice that always points to a valid UTF-8 sequence, so it can be used to view into a String as well as a static string literal. So these are akin to decoded strings in Perl.
In Perl, if we don't decode a string, explicitly or implicitly, then it's just a sequence of arbitrary bytes. The same thing in Rust would be a byte slice.
let name = b"Tim";
println!("{:?}", name);
Running this would produce [84, 105, 109], where 84 is the 'T', 105 is the 'i', and 109 is the 'm'. So b"Tim" contains all of the data to make a string, but it's not really a string yet.
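Turning those bytes into a string is the Rust counterpart of decoding in Perl. Here is a minimal sketch using std::str::from_utf8; the decode helper is a hypothetical name, and it throws away the error details that the real function returns.

```rust
// Try to view a byte slice as a string. This checks that the bytes
// are valid UTF-8 but does not copy them.
fn decode(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

fn main() {
    // Valid UTF-8 decodes fine.
    println!("{:?}", decode(b"Tim"));

    // The byte 0xFF can never appear in UTF-8, so this fails.
    println!("{:?}", decode(&[0xff, 0xfe]));
}
```

Unlike Perl, where a bad decode might give us a warning or replacement characters depending on how we asked, Rust makes us handle the failure case before we can touch the string at all.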
Characters
I guess now is a good time to mention that a character in Rust is not stored in a byte. A char is a single Unicode scalar value, stored like a UTF-32 code unit, so it takes four bytes. So a string in Rust is not a sequence of chars! A String is a UTF-8 sequence, but a char is a UTF-32 value.
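A quick sketch makes the difference visible. The helper names are hypothetical; len, chars, and size_of are from the standard library. The ë is written as an escape so there is no doubt about which code point is meant.

```rust
use std::mem::size_of;

// Length in bytes of the UTF-8 encoding (like Perl's length on
// an encoded byte string).
fn byte_len(s: &str) -> usize {
    s.len()
}

// Number of chars (like Perl's length on a decoded string).
fn char_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    // "Zoë": Z and o take one byte each in UTF-8, ë (U+00EB) takes two.
    let name = "Zo\u{eb}";
    println!("{} bytes", byte_len(name));
    println!("{} chars", char_count(name));
    println!("one char occupies {} bytes", size_of::<char>());
}
```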
Foreign strings
The Rust standard library also contains some string types for dealing with sequences of bytes that do not decode into valid UTF-8, but are still considered strings in other contexts.
For things like path names, we have operating system strings, std::ffi::OsString and std::ffi::OsStr. The OsString is like String, but it could contain, say, a Windows-1252 string with values that are not valid UTF-8. The OsStr is analogous to str, so we usually see it as &OsStr just as we usually see &str.
Rust also has types just for going back and forth between C code, std::ffi::CString and std::ffi::CStr. In C, strings are null-terminated sequences of bytes. It's not inexpensive to convert those to and from Rust Strings, so we sometimes use CString and &CStr instead.
Booleans
Rust has a Boolean type (bool) and things like if and while expect things of type bool only. Contrast this to Perl where things like if and while place whatever they're given in boolean context.
Additionally, everything in Rust must be initialized. There are no undefined values.
So in Perl we have a somewhat complicated set of rules about the truthiness and definedness of various expressions and values. In Rust, true is true, false is false, and anything else is going to have to get converted into a Boolean.
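In practice, "converting" usually just means writing the comparison we had in mind. Here is a sketch that spells out some of Perl's truthiness rules explicitly; the helper names are hypothetical, and Option stands in for a value that might be undef in Perl.

```rust
// Perl treats 0 as false and every other number as true.
fn number_is_true(n: i32) -> bool {
    n != 0
}

// Perl treats the empty string and "0" as false.
fn string_is_true(s: &str) -> bool {
    !s.is_empty() && s != "0"
}

// Perl's defined($value), with Option playing the part of undef.
fn is_defined(value: Option<i32>) -> bool {
    value.is_some()
}

fn main() {
    let count = 3;
    // `if count { ... }` would not compile: the condition must be a bool.
    if number_is_true(count) {
        println!("count is non-zero");
    }
    println!("{}", string_is_true("0"));
    println!("{}", string_is_true("0.0"));
    println!("{}", is_defined(None));
}
```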
Compound Types
In Perl, we essentially have just three types: scalar, array, and hash. Each of these is indicated by its sigil.
my $scalar = 3;
my @array = (1, 2, 3);
my %hash = (alpha => 1, beta => 2, gamma => 3);
They are so flexible, though, that we accomplish enormous amounts of work with just these three things. This flexibility comes from the dynamic nature of Perl. A scalar holds a single value. An array holds a list of scalars. A hash holds a list of key-value pairs. There are no other restrictions.
my $scalar = "\N{TOP HAT}";
my @array = (1, "2", ['5', "banana", 3]);
my %hash = (alpha => 1, beta => \@array, gamma => $scalar);
The static nature of Rust means compound data types are more restricted. The Rust compiler needs to know the type of everything.
Arrays
Arrays in Rust are fixed length sequences of things that are all the same type.
let array: [i32; 3] = [1, 2, 3];
What? That's not at all like an array in Perl! In truth, we don't often use arrays directly in Rust either; we more often access them through slices. This is similar to accessing Strings with &str.
You might have an occasion to use a Rust array while programming some time, but in the meantime just ignore the name similarity.
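Here is a small sketch of what accessing an array through a slice looks like. The total function is hypothetical; it takes a slice (&[i32]), so it works on a whole array, on part of one, and on a vector alike.

```rust
// Add up the numbers in a slice. A slice is a borrowed view into a
// run of elements, much as &str is a view into string data.
fn total(numbers: &[i32]) -> i32 {
    numbers.iter().sum()
}

fn main() {
    let array: [i32; 3] = [1, 2, 3];

    println!("{}", total(&array));       // the whole array
    println!("{}", total(&array[1..]));  // everything after the first element
    println!("{}", total(&vec![4, 5]));  // a vector works too
}
```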
Vectors
When you think you need a Perl array, you probably need a Rust vector.
let vector = vec![1, 2, 3];
These hold lists of things that are all of the same type too, but we can add and remove things from them much as we do in Perl.
my @array = (1, 2, 3);
push @array, 4;
let mut vector = vec![1, 2, 3];
vector.push(4);
In Rust, everything is immutable by default, so we must explicitly declare a vector to be mutable if we wish to change it (e.g., push onto it).
Tuples
Rust's vectors have to be all the same type, though. If you need a list of things of different types, you might need a Rust tuple. These can't change size or change types, but they can hold something akin to a given Perl list.
my @array = (1, '2', "banana");
let tuple = (1, '2', "banana");
This tuple has type (i32, char, &str). It's not comparable to tuples of other lengths or even to other triples of different component types.
HashMaps
When we need something like a Perl hash in Rust, we usually want a HashMap. We must pick a type for our keys and a type for our values, but other than that they are similar to use.
my %hash;
$hash{alpha} = 1;
$hash{beta} = 2;
$hash{gamma} = 3;
#![allow(unused_variables)] fn main() { let mut hash = HashMap::new(); hash.insert("alpha", 1); hash.insert("beta", 2); hash.insert("gamma", 3); }
There are other data structures in Rust's std::collections which we might choose for things that we would probably just use a hash for in Perl. For example, in Perl we might use an existence hash (where we only care about the keys and the values are always 1), whereas in Rust we might choose a HashSet.
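Here is a small sketch of the HashSet idea. The function name distinct_words is hypothetical; it plays the role of the Perl idiom $seen{$word} = 1.

```rust
use std::collections::HashSet;

// Returns the distinct words in `text`, much as we might collect keys
// into an existence hash in Perl.
fn distinct_words(text: &str) -> HashSet<&str> {
    let mut seen = HashSet::new();
    for word in text.split_whitespace() {
        seen.insert(word); // inserting a word twice leaves the set unchanged
    }
    seen
}
```

We can then ask seen.contains("alpha") where in Perl we would ask exists $seen{alpha}.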
Algebraic Data Types
As mentioned earlier, Rust's type system is algebraic. We can create new types from combinations of existing types. We can make both products (product types) and coproducts (sum types). Rust's products are tuples or structures (keyword struct) and its coproducts are enumerations (keyword enum).
Structures
Structures group things together; we have both this and that.
For example, here is a structure from epochs-cli
#![allow(unused_variables)] fn main() { struct Datelike { source: String, viewed_as: View, epochs: HashMap<String, NaiveDateTime>, } }
Each instance of a Datelike struct contains all of those three things.
Enumerations
Enumerations offer a choice; we have either this or that.
For example, here is an enum from epochs-cli
#![allow(unused_variables)] fn main() { enum View { Decimal, Float, Hexadecimal, UUIDv1, } }
Each instance of a View enum contains exactly one of those four things.
Tuples
We've already talked about tuples a bit, but they belong here too as they are products. Perhaps the simplest product type is a pair of other types like (char, i32). It's kind of like a struct without the labels. We refer to the elements by number. If t is a pair, then t.0 is the first thing and t.1 is the second.
Unit
The empty tuple, (), is Rust's unit type. In Rust, nearly everything is an expression; that is, nearly everything evaluates to a value. The unit type is the type returned when things "don't return anything."
Conclusion
Most type systems have things like tuples and structures, but many lack rich enumerations like Rust has. You may have used Perl's Types::Standard::Enum, which provides C-style enumerations. Rust's enumerations can also contain data, so they're a more proper coproduct.
A type system like Rust's can be used to make illegal states unrepresentable.
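As a rough sketch of what that means, here is an enumeration whose variants carry data. The Shape type and the area function are made up for this example (they are not from epochs-cli). There is simply no way to build a circle that has a width, and the match has to handle every variant or it won't compile.

```rust
// A hypothetical sum type: each variant carries its own data.
enum Shape {
    Circle { radius: f64 },
    Rectangle { length: f64, width: f64 },
}

// `match` must cover every variant, so forgetting one is a compile error.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { length, width } => length * width,
    }
}
```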
Generic Types
Rust also supports the notion of generic types. We can write code that supports a number of types by using a stand in, often just the letter T. This includes when defining structures and enumerations.
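A minimal sketch of what that looks like, using made-up names (Pair, Maybe, swap, and value_or are all hypothetical). Functions are covered properly in the next chapter; for now, just notice the T in angle brackets.

```rust
// `Pair` is a made-up structure; `T` stands in for whatever type we choose.
struct Pair<T> {
    first: T,
    second: T,
}

// A generic enumeration: either a value of type `T` or nothing at all.
enum Maybe<T> {
    Just(T),
    Nothing,
}

// Works for any `T`, because it only moves the fields around.
fn swap<T>(pair: Pair<T>) -> Pair<T> {
    Pair { first: pair.second, second: pair.first }
}

// Returns the wrapped value, or `fallback` if there is none.
fn value_or<T>(maybe: Maybe<T>, fallback: T) -> T {
    match maybe {
        Maybe::Just(value) => value,
        Maybe::Nothing => fallback,
    }
}
```

The same swap works for a Pair of integers and a Pair of string slices, but the two fields of any one Pair must still share a type.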
Rust does not have higher-kinded types (but see the note below). If you don't know what that means, don't worry about it. If you do, then you might be looking for them. You can stop.
Note: In Rust 1.65.0, generic associated types were stabilized. These are sort of generic over something that is itself generic, so it's rubbing up against one aspect of higher-kinded types.
Functions
Rust functions are like Perl subroutines. They don't have to be pure functions (mathematical functions).
Named functions in Rust are created with the keyword fn. We've already seen one of these; the "hello world" program used a main function.
fn main() { println!("Hello, World!"); }
In Perl, we don't have an explicit main, though it is there implicitly (we are in package main unless we say otherwise).
In Rust, there is also a separate syntax for closures. In Perl, we use the keyword sub for both named and anonymous subroutines.
Rust | Perl |
---|---|
fn | sub |
closures | sub |
traits | roles |
In Rust, function signatures are the one place where there is never any type inference. We always have to say the types of our arguments and of our return value. This is kind of nice, as it serves as a form of documentation.
If you've done any Haskell, you know that it will infer types in the function definitions as well. But having that information at hand is so valuable that Haskell programmers usually include a type signature declaration even though it's optional to do so.
In Haskell, it's a best practice. In Rust, it's required.
Hello Functions
Let's go back to our hello name example, but this time create a function to do the greeting. In Perl it might look like this.
#!/usr/bin/env perl
use v5.28;
use warnings;
greet("Tim");
sub greet {
my $name = shift;
say "Hello, $name!";
}
The same thing in Rust might look like this.
fn main() { greet("Tim"); } fn greet(name: &str) { println!("Hello, {}!", name); }
Here we see the type of the parameter that must be passed to greet in the function signature. Perl's subroutine signatures were still experimental when this was written (see the note at the end of this section), but if we use those then the two examples look more similar.
#!/usr/bin/env perl
use v5.28;
use warnings;
use experimental qw(signatures);
greet("Tim");
sub greet($name) {
say "Hello, $name!";
}
Here we see the Perl subroutine greet takes a single scalar as a parameter. The Rust function greet takes a single parameter also, but it must be a string slice (&str). No other type will do.
In Perl, we could add
greet("Tim");
greet(3);
and it would print
Hello, Tim!
Hello, 3!
In Rust, it wouldn't compile.
error[E0308]: mismatched types
--> src/main.rs:3:11
|
3 | greet(3);
| ^ expected `&str`, found integer
We would have to first convert the integer to a string slice.
fn main() { greet("Tim"); greet(&3.to_string()); } fn greet(name: &str) { println!("Hello, {}!", name); }
There's a tiny bit of magic here as we've actually passed it a reference to a string (&String), not a string slice (&str). Rust converts a &String to a &str for us automatically (this is called deref coercion); an entire string counts as a slice, so it's okay.
Hello, Tim!
Hello, 3!
There's an analogous situation with Rust's arrays, vectors, and slices.
Note: As of Perl 5.36, subroutine signatures are no longer experimental.
Closures
In Rust, we have a separate syntax for closures. In Perl, we re-use sub. For example, we can create an anonymous subroutine and call it like so.
my $greet2 = sub($name) {say "Hello, $name!"};
$greet2->("Tim");
The subroutine itself doesn't have a name. $greet2 is an ordinary Perl scalar that holds a coderef. The same thing in Rust looks like this.
fn main() { let greet2 = |name| {println!("Hello, {}!", name)}; greet2("Tim"); }
The name of the variable goes between those two pipes. You can think of both pipes as the lambda in lambda calculus (so |name| is like λ name). This is similar to the Ruby syntax, except in Rust the pipes go on the outside of the block and in Ruby they go on the inside. Ironically, this probably means that you are going to type it wrong a lot if you are familiar with Ruby, but you will learn it quickly and easily if you have never seen anything like it before. Seems kind of unfair!
We can also access variables in the outer scope. In Perl, an alternative way to write the above is
my $name = "Tim";
my $greet3 = sub {say "Hello, $name!"};
$greet3->();
Here the anonymous subroutine takes no argument. It is accessing the $name variable in the outer scope. In Rust, we have just the two pipes with nothing between.
fn main() { let name = "Tim"; let greet3 = || {println!("Hello, {}!", name)}; greet3(); }
All of the above print
Hello, Tim!
Methods
In Rust, we don't have classes or inheritance, but we can attach methods to structs and enums. This looks an awful lot like object-oriented Perl. Say we had a rectangle object that knew how to find its own area. In Perl, we might write something like this.
#!/usr/bin/env perl
use v5.28;
use warnings;
use experimental qw(signatures);
package Rectangle {
sub new($class, $length, $width) {
my $self = {
_length => $length,
_width => $width,
};
bless $self, $class;
return $self;
}
sub area($self) {
$self->{_length} * $self->{_width}
}
}
my $r = Rectangle->new(2, 3);
say "Area is ", $r->area;
In Rust, we'd create a Rectangle struct first and then separately implement new and area methods for that struct in an impl block. We use a dot (.) in Rust where we use an arrow (->) in Perl, but otherwise it looks the same.
struct Rectangle { length: f64, width: f64, } impl Rectangle { fn new(length: f64, width: f64) -> Self { Self{length, width} } fn area(&self) -> f64 { self.length * self.width } } fn main() { let r = Rectangle::new(2.0, 3.0); println!("Area is {}", r.area()); }
Indeed, in both Rust and Perl, these method calls are just sugar for function calls. That is, just as in Perl, these two are the same
say "Area is ", $r->area;
say "Area is ", Rectangle::area($r);
in Rust, these two are the same.
#![allow(unused_variables)] fn main() { println!("Area is {}", r.area()); println!("Area is {}", Rectangle::area(&r)); }
Traits
Rust traits are more akin to Perl roles; they describe "does-a" rather than "is-a" relationships.
If we define a role like this in Perl
package Area {
use Role::Tiny;
requires qw(area);
}
then anything that does the Area role must have a method called area.
Similarly, if we define a trait like this in Rust
#![allow(unused_variables)] fn main() { trait Area { fn area(&self) -> f64; } }
then anything that implements the Area trait must have a method called area with that exact function signature.
For example, in Perl we could create a Circle and a Rectangle that both do the Area role like so
#!/usr/bin/env perl
use v5.28;
use warnings;
use experimental qw(signatures);
package Area {
use Role::Tiny;
requires qw(area);
}
package Circle {
use Role::Tiny::With;
with 'Area';
sub new($class, $radius) {
my $self = {
_radius => $radius,
};
bless $self, $class;
return $self;
}
sub area($self) {
3.14159265358979 * $self->{_radius} ** 2
}
}
package Rectangle {
use Role::Tiny::With;
with 'Area';
sub new($class, $length, $width) {
my $self = {
_length => $length,
_width => $width,
};
bless $self, $class;
return $self;
}
sub area($self) {
$self->{_length} * $self->{_width}
}
}
my $c = Circle->new(1);
say "Area of circle is ", $c->area;
my $r = Rectangle->new(2, 3);
say "Area of rectangle is ", $r->area;
Alternatively, here it is again using Perl's Object::Pad.
Similarly, in Rust we could create a Circle and a Rectangle that both implement the Area trait like so
trait Area { fn area(&self) -> f64; } struct Circle { radius: f64, } impl Circle { fn new(radius: f64) -> Self { Self{radius} } } impl Area for Circle { fn area(&self) -> f64 { std::f64::consts::PI * self.radius.powi(2) } } struct Rectangle { length: f64, width: f64, } impl Rectangle { fn new(length: f64, width: f64) -> Self { Self{length, width} } } impl Area for Rectangle { fn area(&self) -> f64 { self.length * self.width } } fn main() { let c = Circle::new(1.0); println!("Area of circle is {}", c.area()); let r = Rectangle::new(2.0, 3.0); println!("Area of rectangle is {}", r.area()); }
Running either of these produces the following (the Rust version prints one more digit of pi, 3.141592653589793)
Area of circle is 3.14159265358979
Area of rectangle is 6
There's more to Rust's traits (they're sort of the key to all of the magic in Rust), but that'll do for now.
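As one small taste of that "more", here is a sketch of a trait that provides a method as well as requiring one, much as a Perl role can both require and supply methods. The describe method and the Square type are made up for this example.

```rust
trait Area {
    // Required: every implementor must supply this.
    fn area(&self) -> f64;

    // Provided: implementors get this for free, but may override it.
    fn describe(&self) -> String {
        format!("Area is {}", self.area())
    }
}

// A hypothetical shape, just for this sketch.
struct Square {
    side: f64,
}

impl Area for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}
```

A Square only implements area, yet we can still call describe on it.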
Hello Generics
Earlier we were looking at this greet function that only accepted a string slice.
fn main() { greet("Tim"); } fn greet(name: &str) { println!("Hello, {}!", name); }
If we wanted to greet integers, we'd need another function.
fn main() { greet(3); } fn greet(name: i64) { println!("Hello, {}!", name); }
How do we write a single function that accepts either? That's where generics come in. The basic syntax for declaring a generic function looks like this.
#![allow(unused_variables)] fn main() { fn greet<T>(name: &T) { println!("Hello, {}!", name); } }
That T in angle brackets stands for any type. Then our variable name is a reference to whatever that type is. But this isn't going to work because we're trying to format that variable with {}. Earlier we mentioned that this depends on the Display trait. Thus our function isn't completely generic. We can't accept any type T. We can only accept types that satisfy the Display trait. We can do that by adding a trait bound like so. The Display trait is defined in the standard library, so first we use it.
#![allow(unused_variables)] fn main() { use std::fmt::Display; fn greet<T: Display>(name: &T) { println!("Hello, {}!", name); } }
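Now, it turns out that every generic type parameter also carries an implicit Sized bound. The type behind a string slice, str, is not Sized, and we don't need that guarantee if we're just formatting through a reference, so we need to relax it like so.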
use std::fmt::Display; fn main() { greet("Tim"); greet(&3); } fn greet<T: Display + ?Sized>(name: &T) { println!("Hello, {}!", name); }
Now we can greet string slices, integer references, and anything else that implements Display.
Running this gives
Hello, Tim!
Hello, 3!
If the trait bounds get too unwieldy, there is an alternative syntax with a where clause added afterwards.
use std::fmt::Display; fn main() { greet("Tim"); greet(&3); } fn greet<T>(name: &T) where T: Display + ?Sized, { println!("Hello, {}!", name); }
TMTOWTDI!
Now, the Display trait was required for the {} in the println!. If we wanted to print it with {:?} instead, that would require the Debug trait.
use std::fmt::Debug; fn main() { greet("Tim"); greet(&3); } fn greet<T: Debug + ?Sized>(name: &T) { println!("Hello, {:?}!", name); }
Running this yields
Hello, "Tim"!
Hello, 3!
The 3 printed the same, but notice the quotes around Tim. That can be handy. I often add quotes around things like filenames in Perl.
say "Writing players to file '$path'..." if $opt->verbose;
In Rust, I can format them with {:?} instead of {}.
fn main() { let path = "players.json"; let verbose = true; if verbose { println!("Writing players to file {:?}...", path); } }
Ownership
Perl manages our memory for us at run time. Rust does virtually everything at compile time, yet it is still memory safe. How does that work? It does this through its ownership model.
Rust's ownership model is what sets it apart from every other programming language I've used. Earlier, we said Rust's type system was the biggest practical difference for Perl programmers. While I believe that's true, Rust's ownership model is the biggest theoretical difference. It's what allows Rust to do so much at compile time. It's what makes Rust Rust!
Ownership Rules
There are three rules to Rust's ownership
- Every value has a variable called its owner.
- There can only be one owner at a time.
- When the owner goes out of scope, the value is dropped.
So scope determines when values are dropped. That's nice, because earlier we said that Rust's scoping rules work much like Perl's. We're well on our way to understanding already!
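Here is a rough sketch of the third rule in action. The Noisy type is made up, and it leans on a couple of things we haven't covered (a lifetime annotation and RefCell), so don't worry about the details. The point is that implementing the Drop trait is a bit like writing a DESTROY method in Perl: it runs when the owner goes out of scope.

```rust
use std::cell::RefCell;

// A hypothetical type that records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which things happened.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
        } // `_inner` goes out of scope here and is dropped
        log.borrow_mut().push("between");
    } // `_outer` goes out of scope here and is dropped
    log.into_inner()
}
```

Calling drop_order() returns "inner", "between", "outer", in that order.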
Move Semantics
By default, whenever we assign a value to a variable, pass it to a function, or return it from a function, ownership is transferred.
fn main() { let name = String::from("Tim"); // String is owned by name greet(name); // Ownership is transferred } fn greet(name: String) { println!("Hello, {}!", name); }
This example works fine, but note that when we return from greet, name is no longer accessible in main. It has been dropped because ownership was transferred to the name in greet, which went out of scope when the function completed.
If we tried to call it again, it would fail.
fn main() { let name = String::from("Tim"); // String is owned by name greet(name); // Ownership is transferred greet(name); // use of moved value: `name` } fn greet(name: String) { println!("Hello, {}!", name); }
Note that this is a compile-time error. We never get a chance to run this. The full compiler message is
$ cargo run
Compiling ownership v0.1.0 (/home/tim/rust/ownership)
error[E0382]: use of moved value: `name`
--> src/main.rs:5:11
|
2 | let name = String::from("Tim");
| ---- move occurs because `name` has type `String`, which does not implement the `Copy` trait
3 |
4 | greet(name);
| ---- value moved here
5 | greet(name);
| ^^^^ value used here after move
error: aborting due to previous error
For more information about this error, try `rustc --explain E0382`.
error: could not compile `ownership`
To learn more, run the command again with --verbose.
Note the message, "move occurs because name has type String, which does not implement the Copy trait." That gives us a hint how we could fix this. If we make a copy of our string (String types have a clone method which does exactly this), then we could move that while retaining the original.
fn main() { let name = String::from("Tim"); // String is owned by name greet(name.clone()); // Make a copy and move that greet(name); // Original is still available } fn greet(name: String) { println!("Hello, {}!", name); }
Borrowing
Well, move semantics sounds like a pain! We really have to explicitly clone things all the time? No, this is where borrowing comes in.
Rather than take ownership of a value, we can borrow it.
fn main() { let name = String::from("Tim"); greet(&name); greet(&name); } fn greet(name: &String) { println!("Hello, {}!", *name); }
That ampersand in &String means greet doesn't want to take ownership of the String, it just wants a reference to it. And because it's not the owner, the string is not dropped when the function finishes executing. So we can call it a second time with no problems.
Since the name inside greet is not a String, but a string reference, we dereference it when we use it with *name.
This is analogous to the following in Perl.
#!/usr/bin/env perl
use v5.28;
use warnings;
use experimental qw(signatures);
my $name = "Tim";
greet(\$name);
sub greet($ref) {
say "Hello, ${$ref}!";
}
In fact, Rust will do the dereference for us. That is, this works just fine as well.
fn main() { let name = String::from("Tim"); greet(&name); greet(&name); } fn greet(name: &String) { println!("Hello, {}!", name); }
This is one of the few times Rust does not make us be explicit.
In Perl, the analogous thing
#!/usr/bin/env perl
use v5.28;
use warnings;
use experimental qw(signatures);
my $name = "Tim";
greet(\$name);
sub greet($name) {
say "Hello, $name!";
}
would print out something like
Hello, SCALAR(0xDEADBEEF)!
Shared and Unique Borrows
There are actually two kinds of borrows in Rust. The above is a shared (immutable) borrow. We can read the value, but we cannot change it. If we need to do that, we must use a unique (mutable) borrow. (I really like the terms "shared" and "unique", but it seems "immutable" and "mutable" have won out. I guess because of the mut keyword.)
We can have multiple immutable borrows. Lots of things can read a value at the same time. We can only have one mutable borrow. Only one thing at a time can change a value. Moreover, if there is a mutable borrow, there can be no shared borrows. If we're writing a value, then no one should be reading it. Indeed, if there is a mutable borrow, not even the owner can read the value.
fn main() { let mut name = String::from("Hank"); greet(&name); change(&mut name); greet(&name); } fn greet(name: &String) { println!("Hello, {}!", name); } fn change(name: &mut String) { *name = String::from("Dean"); }
In Perl, our references are always mutable. The above might look like this
#!/usr/bin/env perl
use v5.28;
use warnings;
use experimental qw(signatures);
my $name = "Hank";
greet($name);
change(\$name);
greet($name);
sub greet($name) {
say "Hello, $name!";
}
sub change($ref) {
$$ref = "Dean";
}
That's not really idiomatic in either language, but you get the idea.
The Rust compiler has a borrow-checker. It keeps track of all the borrows in our programs and notifies us when we've violated any of these rules.
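A small sketch of the borrow-checker at work. The function names shout and borrow_demo are made up. The commented-out line is the interesting part: putting it back would keep a shared borrow alive across the mutable one, and the program would no longer compile.

```rust
// Appends to the string through a unique (mutable) borrow.
fn shout(name: &mut String) {
    name.push('!');
}

fn borrow_demo() -> String {
    let mut name = String::from("Hank");

    let first = &name; // shared borrow
    let second = &name; // a second shared borrow is fine
    let both = format!("{} {}", first, second);

    shout(&mut name); // okay: `first` and `second` are not used again
    // println!("{}", first); // uncommenting this gives error E0502:
    // `name` would be borrowed as shared and as mutable at the same time

    format!("{} / {}", both, name)
}
```

Calling borrow_demo() gives us "Hank Hank / Hank!".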
Closures
When we talked about closures, we said they can access variables from the enclosing scope. There are three ways to do this: move and the two kinds of borrows. There are traits for these.
- FnOnce => move (like self)
- FnMut => borrow mutably (like &mut self)
- Fn => borrow immutably (like &self)
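Here is a sketch showing one closure of each kind. The names are made up, and which trait a closure gets is worked out by the compiler from how it uses what it captures; we never write the trait names ourselves here.

```rust
fn closure_kinds() -> (usize, i32, String) {
    let name = String::from("Tim");
    let mut count = 0;

    // Fn: only reads `name` (shared borrow), so it can be called repeatedly.
    let length = || name.len();
    let len = length() + length();

    // FnMut: changes `count` (unique borrow), so the closure must be `mut`.
    let mut bump = || count += 1;
    bump();
    bump();

    // FnOnce: `move` takes ownership of `name`, and returning it gives that
    // ownership away, so this closure can only be called once.
    let consume = move || name;
    let owned = consume();

    (len, count, owned)
}
```

Calling closure_kinds() gives us (6, 2, "Tim"). Trying to call consume a second time would be a compile-time error.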
Copy Types
Earlier we said, by default, ownership is transferred. If we don't want to use a borrow, we need to make a copy. Copy types do this automatically. Types such as i32 implement the Copy trait because it's just not worth the trouble of setting up a borrow. Why copy a reference to the stack when we can just copy our 32 bits to the stack?
So, for example, this works as we expect.
fn main() { let num = 3; greet(num); greet(num); } fn greet(num: i32) { println!("Hello, {}!", num); }
This is just like the following in Perl.
#!/usr/bin/env perl
use v5.28;
use warnings;
use experimental qw(signatures);
my $num = 3;
greet($num);
greet($num);
sub greet($num) {
say "Hello, $num!";
}
I didn't present a Copy type first--- even though it is more akin to what happens in Perl--- because I wanted to stress that this is the exception, not the rule. Move semantics are the norm in Rust; Copy types are the special case.
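We can opt our own small types into this special case. A minimal sketch, with a made-up Point type: deriving Copy (which also requires Clone) is only allowed because both fields are themselves Copy.

```rust
// A hypothetical type whose fields are all `Copy`, so it can be `Copy` too.
#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn sum(p: Point) -> i32 {
    p.x + p.y
}

fn copy_demo() -> i32 {
    let p = Point { x: 1, y: 2 };
    // `p` is copied into each call, so it is still usable afterwards.
    sum(p) + sum(p)
}
```

Without the derive line, the second sum(p) would fail with the same "use of moved value" error we saw with String.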
Data Races
Rust's ownership model gives memory safety guarantees like "no dangling pointers," and "no double-frees." It turns out, the things it does to ensure memory safety also prevent data races!
Earlier, we said that--- unlike Perl--- Rust gives us easy access to threads. But programming with multiple threads offers whole new challenges. If one thread wants to change a piece of memory, how do we ensure that no other thread can read it at the same time? Rust's ownership model already does exactly this!
There are other kinds of race conditions possible in Rust, but we don't have to worry about data races. Rust's ownership model is remarkable!
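Here is a rough sketch of several threads updating one counter. It uses Arc and Mutex from std::sync, which we haven't covered: Arc lets the threads share ownership of the counter, and Mutex hands out one unique borrow at a time. The function name count_in_parallel is made up.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Four threads each add 1000 to a shared counter.
fn count_in_parallel() -> i32 {
    let counter = Arc::new(Mutex::new(0));
    let mut handles = Vec::new();

    for _ in 0..4 {
        let counter = Arc::clone(&counter); // another owner of the same counter
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                *counter.lock().unwrap() += 1;
            }
        }));
    }

    // Wait for every thread to finish.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

This always returns 4000. If we tried to share a plain mutable integer between the threads instead, the program would not compile.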
Error Handling
Error handling in Rust is very different from Perl. Rust has no null value like Perl's undef. Rust doesn't really have exceptions either, though we can panic! which terminates safely and unwinds the stack. Instead, Rust leverages its type system to create two very common enumerations: Option and Result.
Option
An option is either Some, and has some data, or it is None, and has no data.
#![allow(unused_variables)] fn main() { pub enum Option<T> { Some(T), None, } }
Since we can have all sorts of optional types, we use a generic type T. An option is either the variant Some(T), in which case our data is of type T, or it is the variant None and there is no data.
Result
A result is either Ok, and has some good data, or it is Err, and has some error data.
#![allow(unused_variables)] fn main() { pub enum Result<T, E> { Ok(T), Err(E), } }
Now there are two generic types, good data of type T and error data of type E.
Options
Say we wanted to write an integer divide that avoided crashing on division by zero. In Perl, we might write this.
sub safediv($x, $y) {
return if $y == 0;
int($x / $y);
}
This returns an undefined value when we try to divide by zero. We might call it like so
my $d = safediv(1, 2);
say $d if defined $d;
The value that we get back might be undef, so we check first before printing it.
In Rust, we might write this.
#![allow(unused_variables)] fn main() { fn safediv(x: i32, y: i32) -> Option<i32> { if y == 0 { return None; } Some(x/y) } }
This returns an Option<i32>. It will be None if we try to divide by zero. Otherwise it will have our integer wrapped in a Some. We might call it like so
fn main() { let option_d = safediv(1, 2); match option_d { Some(d) => println!("{}", d), None => {}, } } fn safediv(x: i32, y: i32) -> Option<i32> { if y == 0 { return None; } Some(x/y) }
Here we match on our Option. If it is a Some, then we destructure it as part of the match; we can access our integer as d. If it is None, then we do nothing.
Another way would be with if let.
fn main() { if let Some(d) = safediv(1, 2) { println!("{}", d); } } fn safediv(x: i32, y: i32) -> Option<i32> { if y == 0 { return None; } Some(x/y) }
This implicitly does the destructuring match and only does what's inside if it succeeds.
We can also unwrap an option. If we unwrap a Some, we get what's inside. If we unwrap a None, it panics.
fn main() { let d = safediv(1, 2).unwrap(); println!("{}", d); } fn safediv(x: i32, y: i32) -> Option<i32> { if y == 0 { return None; } Some(x/y) }
This is kind of like this Perl
my $d = safediv(1, 2) // die;
say $d;
Similarly, we can expect, which is like unwrap with an added message.
fn main() { let d = safediv(1, 2).expect("Cannot divide by zero!"); println!("{}", d); } fn safediv(x: i32, y: i32) -> Option<i32> { if y == 0 { return None; } Some(x/y) }
This is kind of like this in Perl
my $d = safediv(1, 2) // die "Cannot divide by zero!";
say $d;
Results
Results are similar to options, but in addition to indicating that there is an error, we get a chance to say what kind of error it is. For example, we might write our safediv like this instead
#![allow(unused_variables)] fn main() { fn safediv(x: i32, y: i32) -> Result<i32, String> { if y == 0 { return Err(String::from("Division by zero!")); } Ok(x/y) } }
This returns a result which is either Ok with our integer or Err with our error. In this case the error is a String, but we can choose any type. Often we create a custom error type. I guess this is kind of analogous to string exceptions versus exception objects in Perl. But again, there are no exceptions here in Rust; a result is an ordinary enum with two different possibilities. As such, we can match on it, the same as with an option (or any other enum).
fn main() { match safediv(1, 2) { Ok(d) => println!("{}", d), Err(e) => println!("{}", e), } } fn safediv(x: i32, y: i32) -> Result<i32, String> { if y == 0 { return Err(String::from("Division by zero!")); } Ok(x/y) }
And, alternatively, we can use if let.
fn main() { if let Ok(d) = safediv(1, 2) { println!("{}", d); } } fn safediv(x: i32, y: i32) -> Result<i32, String> { if y == 0 { return Err(String::from("Division by zero!")); } Ok(x/y) }
Similarly, we can unwrap a result
fn main() { let d = safediv(1, 2).unwrap(); println!("{}", d); } fn safediv(x: i32, y: i32) -> Result<i32, String> { if y == 0 { return Err(String::from("Division by zero!")); } Ok(x/y) }
Or call expect on a result
fn main() { let d = safediv(1, 2).expect("Cannot divide by zero"); println!("{}", d); } fn safediv(x: i32, y: i32) -> Result<i32, String> { if y == 0 { return Err(String::from("Division by zero!")); } Ok(x/y) }
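The custom error type mentioned above might look something like this. It's only a sketch: the MathError name and its variants are made up, and real projects often reach for a helper crate instead of writing Display by hand.

```rust
use std::fmt;

// A made-up error type for this example.
#[derive(Debug, PartialEq)]
enum MathError {
    DivisionByZero,
    Overflow,
}

// Implementing Display lets us print the error with {}.
impl fmt::Display for MathError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            MathError::DivisionByZero => write!(f, "Division by zero!"),
            MathError::Overflow => write!(f, "Overflow!"),
        }
    }
}

fn safediv(x: i32, y: i32) -> Result<i32, MathError> {
    if y == 0 {
        return Err(MathError::DivisionByZero);
    }
    // `checked_div` returns None on overflow (i32::MIN / -1).
    x.checked_div(y).ok_or(MathError::Overflow)
}
```

Now a caller can match on the kind of error rather than comparing strings, which is the same reason we prefer exception objects to string exceptions in Perl.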
Question Mark
Lots of library functions--- including the standard library, as well as third-party libraries--- return either options or results, so you'll likely get used to using things like match, if let, while let, unwrap, and expect even before you make an Option or Result yourself.
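We haven't shown while let yet, so here is a quick sketch (the function name drain_names is made up). A vector's pop method returns an Option, which makes it a natural fit: the loop keeps going for as long as the pattern matches.

```rust
// Pops names off the end of a vector until it is empty.
fn drain_names() -> Vec<String> {
    let mut stack = vec!["Hank", "Dean", "Brock"];
    let mut greetings = Vec::new();
    // `pop` returns Some(name) until the vector is empty, then None.
    while let Some(name) = stack.pop() {
        greetings.push(format!("Hello, {}!", name));
    }
    greetings
}
```

This is kind of like while (defined(my $name = pop @stack)) in Perl. The greetings come out in reverse order: Brock, Dean, Hank.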
Example: Open
For example, when we open a file in Perl, it returns an undefined value when it fails, so we usually write something like this
open my $fh, '<', $file or die "Cannot open file '$file': $!";
Upon success, our file handle is in $fh. Upon failure, we retrieve the reason for the failure from $!.
The similar maneuver in Rust, File::open(&path), gives us a Result<File, Error>. Upon success, a std::fs::File is wrapped in an Ok. Among other things, it contains the file descriptor we need to read from the file. Upon failure, a std::io::Error is wrapped in an Err. Among other things, it contains the reason for the failure.
If we match on the result, we could inspect the File struct or the Error struct like so
use std::fs::File; use std::path::Path; fn main() { let path = Path::new("no_such_file"); let result = File::open(&path); match result { Ok(f) => println!("The open succeeded\n{:#?}", f), Err(e) => println!("The open failed\n{:#?}", e), } }
Again, we can call unwrap or expect on a Result and it will panic if it is Err, so perhaps a more direct analogue to
open my $fh, '<', $file or die "Cannot open file '$file': $!";
would be
use std::fs::File; use std::path::Path; fn main() { let path = Path::new("no_such_file"); let f = File::open(&path).expect("Cannot open file!"); println!("The open succeeded\n{:#?}", f); }
This unwraps our File on success and panics with our message on failure.
The ? (question mark) operator
Rust has a more idiomatic way of dealing with options and results. Perhaps this is best shown by example. Say we wanted to read the first eight bytes of a file (perhaps we want to check if it's a PNG file or something). We might write something like this
use std::fs::File; use std::io; use std::io::{Error, ErrorKind}; use std::io::prelude::*; use std::path::Path; fn main() { let path = Path::new("src/main.rs"); let first_eight = read_eight_bytes(&path); dbg!(&first_eight); } fn read_eight_bytes(path: &Path) -> Result<[u8; 8], io::Error> { let result = File::open(path); let mut f = match result { Ok(f) => f, Err(e) => return Err(e), }; let mut buffer = [0; 8]; let result = f.read(&mut buffer[..]); let n = match result { Ok(n) => n, Err(e) => return Err(e), }; if n == 8 { Ok(buffer) } else { Err(Error::new(ErrorKind::Other, "Could not read 8 bytes!")) } }
We're given a path and we're returning a Result<[u8; 8], io::Error>. That is, we're going to return Ok with eight bytes or Err with the reason we couldn't.
First, we open the path, which might fail. If it does, we return its io::Error. If it succeeds, we unwrap the File and try to read eight bytes from it. If that fails, we return its io::Error. If it succeeds, we check that we got eight bytes. If so, return them in an Ok. If not, return a custom error.
Both of those matches have the same shape: unwrap the Ok or return early with the Err. This is so common that we can replace it with a single question mark!
use std::fs::File; use std::io; use std::io::{Error, ErrorKind}; use std::io::prelude::*; use std::path::Path; fn main() { let path = Path::new("dirk-gently.png"); let first_eight = read_eight_bytes(&path); dbg!(&first_eight); } fn read_eight_bytes(path: &Path) -> Result<[u8; 8], io::Error> { let mut f = File::open(path)?; let mut buffer = [0; 8]; let n = f.read(&mut buffer[..])?; if n == 8 { Ok(buffer) } else { Err(Error::new(ErrorKind::Other, "Could not read 8 bytes!")) } }
Not bad, eh? We're doing all the proper error checking, but it's mostly just the "happy path" showing in our code.
Also, Result<T, io::Error> is so common that there's a type alias for it, io::Result<T>. That is, we could replace
fn read_eight_bytes(path: &Path) -> Result<[u8; 8], io::Error> {
with
fn read_eight_bytes(path: &Path) -> io::Result<[u8; 8]> {
if we wanted.
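The examples above all use results, but the question mark works on options too, as long as the function itself returns an Option. A small sketch with a made-up helper, last_initial:

```rust
// A hypothetical helper: the first character of the last word, uppercased.
fn last_initial(text: &str) -> Option<char> {
    // Return None early if there are no words at all.
    let word = text.split_whitespace().last()?;
    // `chars().next()` is also an Option, so `?` works here too.
    let first = word.chars().next()?;
    Some(first.to_ascii_uppercase())
}
```

In Perl we would probably sprinkle a couple of return unless defined checks through the subroutine to get the same effect.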
Conclusion
It takes some getting used to perhaps, but Rust's error handling is really pretty nice. It may seem finicky, but at least we don't have to juggle undefined values or exceptions.
Iterators
One of the great things about being a Perl programmer is that we get to read Higher-Order Perl. And one of the great things about Higher-Order Perl is the explanation of iterators.
Rust iterators are a joy to use. They really make Rust fun. If you think a "low-level" systems language is going to be tedious to use, you're in for a treat. If you like to use things like map and grep in Perl, you're going to love Rust.
For loops
Earlier, I mentioned revisiting Rust's for loop. Let's do that now. It looks like this
fn main() { let names = vec!["Hank", "Dean", "Brock"]; for name in names { println!("Hello, {}!", name); } }
But this is really syntactic sugar for something like this
fn main() { let names = vec!["Hank", "Dean", "Brock"]; let mut iterator = names.into_iter(); while let Some(name) = iterator.next() { println!("Hello, {}!", name); } }
That is, Rust for loops are not really a separate thing. They are just an alternate syntax for consuming iterators.
Rust has an Iterator trait that we can use to define our own iterators. Then we can use for loops on them.
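For instance, here is a sketch of a home-made iterator. The Countdown type and the launch function are made up. All the Iterator trait asks of us is an Item type and a next method that returns an Option.

```rust
// A made-up iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            return None; // the iterator is exhausted
        }
        let current = self.remaining;
        self.remaining -= 1;
        Some(current)
    }
}

fn launch() -> Vec<String> {
    let mut lines = Vec::new();
    let countdown = Countdown { remaining: 3 };
    // Our own type now works in a for loop like any built-in iterator.
    for n in countdown {
        lines.push(format!("{}...", n));
    }
    lines.push(String::from("Liftoff!"));
    lines
}
```

Calling launch() gives us "3...", "2...", "1...", "Liftoff!".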
Lazy
Rust iterators are lazy. That means they must be consumed before they do anything. If we write this
fn main() { let names = vec!["Hank", "Dean", "Brock"]; let hellos = names.iter().map(|name| format!("Hello, {}!", name)); println!("{:#?}", hellos); }
then no greetings are produced. The closure never runs; printing hellos only shows the unevaluated Map adapter. We've set the machine up, but we haven't turned the crank. We have to do something like collect it into a vector to do that.
fn main() { let names = vec!["Hank", "Dean", "Brock"]; let hellos = names.iter().map(|name| format!("Hello, {}!", name)).collect::<Vec<_>>(); println!("{:#?}", hellos); }
Being lazy also means that iterators can be infinite.
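A quick sketch of that (the function name first_squares is made up). The range 1.. has no upper bound, yet nothing runs away from us, because only as many items are computed as take asks for.

```rust
// The first `count` squares, drawn from an iterator that never ends.
fn first_squares(count: usize) -> Vec<u64> {
    (1u64..) // 1, 2, 3, ... with no upper bound
        .map(|n| n * n) // still lazy: nothing has been computed yet
        .take(count) // stop after `count` items
        .collect() // this is what turns the crank
}
```

Calling first_squares(5) gives us 1, 4, 9, 16, 25.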
Tests
In Perl, we put our tests in a t directory and then prove -l will run them. Similarly, Rust has a tests directory and cargo test will run the tests there.
Rust | Perl |
---|---|
tests | t |
cargo test | prove -l |
use testing; | use Test::Most; |
But Rust does this mostly for integration tests. Unit tests are typically put right in the same file as the code.
For example, I wrote a Perl module called Time::Moment::Epoch which converts a bunch of different epoch times to Time::Moment times. It has a bunch of unit tests in t/Time-Moment-Epoch.t.
I wrote a similar Rust crate called epochs, which converts those same epoch times to chrono::NaiveDateTime times. It has similar unit tests at the bottom of the library file itself.
The `#[cfg(test)]` attribute tells the compiler to include those tests only when we run `cargo test`, and to leave them out when we run `cargo build` or `cargo run`.
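To make that concrete, here is a minimal sketch of a file with its unit tests at the bottom. The `double` function is a made-up stand-in for real library code, and the `main` function is only there so the example runs on its own; a library file would not have one.

```rust
// A hypothetical function, standing in for real library code.
pub fn double(n: i32) -> i32 {
    n * 2
}

fn main() {
    println!("{}", double(21));
}

// This module is only compiled when we run `cargo test`.
#[cfg(test)]
mod tests {
    // Bring everything from the enclosing file into scope.
    use super::*;

    // Each function marked #[test] is run as a separate test.
    #[test]
    fn doubles_positive_numbers() {
        assert_eq!(double(21), 42);
    }

    #[test]
    fn doubles_negative_numbers() {
        assert_eq!(double(-4), -8);
    }
}
```

Since the tests live in a child module of the code they exercise, they can also reach private functions, which is handy for unit tests.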
Update (2024-05-15): I talked about testing at the Rust & C++ Cardiff book club. We are reading Rust for Rustaceans together and I did chapter 6. You can view my slides, some of which have links that I did not follow during the talk.
Documentation
Rust uses markdown pretty much everywhere that we would use POD in Perl.
Rust | Perl |
---|---|
// | # |
//! , /// | =pod |
markdown | POD |
doctests | Test::Doctest |
rustdoc | perldoc |
Rust uses `//` for regular line comments, but `///` for documentation comments, which are written in markdown. We can even use markdown's triple-backtick fences to include Rust source code. When we do this, it automatically becomes a doctest! That is, `cargo test --doc` runs the code we include in documentation comments, and plain `cargo test` includes `--doc` by default. This helps us keep our docs up to date. I think perhaps the worst error in Perl is a `SYNOPSIS` section in our POD that doesn't work because the code changed, but the POD didn't. It's annoyingly easy to do.
For example, here is the `apfs` function from the epochs crate.

    /// APFS time is the number of nanoseconds since the Unix epoch
    /// (*cf.*, [APFS filesystem format](https://blog.cugu.eu/post/apfs/)).
    ///
    /// ```
    /// use epochs::apfs;
    /// let ndt = apfs(1_234_567_890_000_000_000).unwrap();
    /// assert_eq!(ndt.to_string(), "2009-02-13 23:31:30");
    /// ```
    pub fn apfs(num: i64) -> Option<NaiveDateTime> {
        epoch2time(num, 1_000_000_000, 0)
    }
It contains a description of the function as well as a doctest. The compiler itself ignores it, like any other comment, but `rustdoc` will grab that description and run that test. Currently, `rustdoc` only generates HTML output that we must then view in a browser. There is no command line tool for looking up things in the docs like `perldoc` yet.
I've documented all of my functions in a similar way, so `rustdoc` generates this whole page for me. I didn't write that page; it was all generated from the source code when I uploaded it to crates.io!
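Now, `///` documents whatever immediately follows it, like the function above. There is also `//!`, which documents the item that contains it rather than the item that follows it. Placed at the top of a file, it documents the whole crate or module. That summary line for the whole crate came from one of those at the top.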
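To see the two kinds of doc comment side by side, here is a small sketch. The `shapes` module and `area` function are invented for illustration. Because `//!` documents whatever encloses it, it works at the top of a module body as well as at the top of a crate.

```rust
pub mod shapes {
    //! Inner doc comment: this documents the enclosing `shapes` module.

    /// Outer doc comment: this documents the function that follows.
    ///
    /// Returns the area of a rectangle with the given sides.
    pub fn area(width: u32, height: u32) -> u32 {
        // A regular comment: rustdoc ignores this line.
        width * height
    }
}

fn main() {
    println!("{}", shapes::area(3, 4));
}
```

Running `cargo doc` on a crate containing this would give `shapes` its own page, with the `//!` text as the module description and the `///` text under `area`.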
Next
If you want to learn more, you already have everything you need. The documentation that comes with Rust is truly outstanding. In particular, there is the book.
The Rust Programming Language (aka, "the book") is a free guide to the language that comes with it. It is remarkably well done. Whether you read it cover-to-cover or jump around haphazardly, you should definitely read it. You can view a local copy with `rustup docs --book`, even when you're offline. You can also buy an ink-on-paper copy if you prefer.
You can see lots of other great docs that come with Rust by running `rustup docs`.
Other books
- I really liked Programming Rust: Fast, Safe Systems Development by Jim Blandy and Jason Orendorff. I read the first edition, but now there's a second edition with a third author, Leonora Tindall.
  - Jim Blandy was on the CoRecursive podcast twice.
  - Jason Orendorff gave a really fun talk at Rust Belt Rust 2017.
  - Leonora Tindall was on the Building with Rust podcast.
- I started reading Rust in Action by Tim McNamara in the electronic early-access version, but it's out now. It takes a project-based approach, which is a little different.
  - Tim McNamara gave a nice talk at Linux Conf Australia 2020.