- Chapter 1. Getting Started
- Chapter 2. A real (small) program
- Chapter 3. Normal Programming Stuff
- Chapter 4. Ownership
- Chapter 5. Structs
- Chapter 6. Enums and match
- Chapter 7. Crates, packages and modules
- Chapter 8. Vectors, Strings and Hash Map Collections
- Chapter 9. Errors
- Chapter 10. Reference lifetimes, generics and traits
- Chapter 11. Testing
My journey learning the Rust programming language. Examples are probably not my own, but might be.
curl --proto 'https' --tlsv1.3 https://sh.rustup.rs -sSf | sh
This command is used to download and install Rust via curl
, a command-line tool for transferring data with URLs. It fetches a script and executes it immediately with sh
, the Unix shell.
Let's break down this command:
- curl: The command itself, a tool for transferring data from or to a server, here over the HTTPS protocol.
- --proto 'https': This option tells curl to use only the HTTPS protocol. It restricts curl from attempting to use any other protocol that might normally be attempted in other circumstances.
- --tlsv1.3: Specifies that curl should use TLSv1.3 as the cryptographic protocol for secure communication. TLS (Transport Layer Security) v1.3 is the latest version that provides security improvements over previous versions.
- https://sh.rustup.rs: This is the URL from which curl will fetch data. In this case, it's a script provided by the Rust language maintainers to install rustup, the Rust toolchain installer.
- -sSf: These are options combined together and passed to curl:
  - -s or --silent: Silent mode. Don't show progress meter or error messages. Makes curl mute.
  - -S or --show-error: When used with -s, it makes curl show an error message if it fails.
  - -f or --fail: Tells curl to fail silently on server errors (when HTTP servers return a 4xx or 5xx error), preventing scripts or other erroneous data from being executed or processed if the requested URL points to an error page.
- | sh: This part is known as a pipe (|). It takes the output of the preceding command (in this case, the script downloaded by curl) and passes it as input to the sh command, which is the command interpreter (or shell) that executes the script.
The overall command fetches the rustup
installation script securely using HTTPS and TLSv1.3, and if successful, passes the script directly to the shell for execution. The use of -sSf
ensures that the operation proceeds quietly but will show an error if something goes wrong, helping to maintain the cleanliness of the output and the security of the operation.
For more, see the Rust installation docs.
Let's create the classic "Hello, world!" program.
The origin of "Hello, World!" can be traced back to the seminal book The C Programming Language by Brian Kernighan and Dennis Ritchie, published in 1978. This book, often referred to simply as "K&R," was instrumental in popularizing the C programming language and served as its de facto standard for years.
Assuming you're in a directory in which you're tracking your Rust learning, mkdir hello_world && cd hello_world
to create a directory hello_world
and move into it immediately after creation.
Now create a file main.rs
inside of this directory. From now on, we'll use Vim since it's the editor I use.
vim main.rs
Now inside that file, type the following:
fn main() {
println!("Hello, world!");
}
Format it:
rustfmt main.rs
And compile then run it:
rustc main.rs && ./main
You should see Hello, world!
printed in your terminal.
Read more about what's happening under the hood in the docs.
Most Rust developers (Rustaceans) use the language's built-in package manager and build system called Cargo to build "real" programs within Rust.
So let's move from what we were working on to try out cargo, making the directory hello_cargo
in the process: cd ../ && cargo new hello_cargo && cd hello_cargo
.
Now open the Cargo.toml
file via vim Cargo.toml
. You'll see something like:
[package]
name = "hello_cargo"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
As of April 1st, 2024, the docs state that "three Rust editions are available: Rust 2015, Rust 2018, and Rust 2021. This book is written using Rust 2021 edition idioms." - read the docs.
The [dependencies]
section heading is where you'll add dependencies for your program, which you'll most certainly have writing anything of substance in the real world.
Close this file and notice that there's a src/main.rs
path/file that the cargo new
command created. This is the same Hello, world!
program as before. Let's build it with cargo
:
cargo build
Now let's run it:
./target/debug/hello_cargo
Let's do the same thing in one command:
cargo run
Use cargo check to verify that the code compiles without producing an executable. It is much faster than a full compilation of the project!
Releases are simple and only require the --release
flag:
cargo build --release
Now you'll find the executable in ./target/release/hello_cargo
instead of ./target/debug/hello_cargo
.
Rather than directly rehash what the official Rust book covers, let's dive directly into the full code example.
Start by running cargo new guessing_game && cd guessing_game
.
Then open src/main.rs
and add the following code.
use rand::Rng;
use std::cmp::Ordering;
use std::io;
fn main() {
println!("👋 Welcome to 'Guess the number!'");
let secret_number = rand::thread_rng().gen_range(1..=100);
loop {
println!("Input your guess:");
let mut guess = String::new();
io::stdin()
.read_line(&mut guess)
.expect("Failed to read line");
let guess: u32 = match guess.trim().parse() {
Ok(num) => num,
Err(_) => continue,
};
match guess.cmp(&secret_number) {
Ordering::Less => println!("⬆️ Guess a bigger number, yours was too small!"),
Ordering::Greater => println!("⬇️ Guess a smaller number, yours was too big!"),
Ordering::Equal => {
println!("🎯 Spot on!");
break;
}
}
}
}
If you run this you'll get an error because rand
is not a dependency yet.
To make a random number that we have to guess we should add the rand
crate from the Rust team.
Run the following or edit your Cargo.toml
directly:
cargo add rand@=0.8.5
Compile and run the program with the usual cargo run
command.
The first three lines import dependencies for the program.
use rand::Rng;
use std::cmp::Ordering;
use std::io;
Both cmp::Ordering
and io
are from the Standard Library, as you can see using the std::
to import them.
We import the Rng
trait from the rand
crate we previously added as a dependency.
Inside the main
function we then print out a welcome message and use the Rng
trait from the rand
crate to generate a random integer between 1 and 100 called secret_number
.
Next you see the loop
keyword which creates an infinite loop for the user to input a guess
and compare that guess with the secret_number
created above.
Inside the loop we first instantiate the guess
variable as a string, which we then read from the standard input:
let mut guess = String::new();
io::stdin()
.read_line(&mut guess)
.expect("Failed to read line");
Now our guess
variable is a string and we need to compare that to the secret_number
variable, a numeric type. You can see this if you hover over secret_number
in your editor.
To convert guess
to an unsigned 32-bit integer (this is 1-100, after all), we then do the following:
let guess: u32 = match guess.trim().parse() {
Ok(num) => num,
Err(_) => continue,
};
For now we'll leave out an explanation of the trim
and parse
methods, as it's somewhat self-explanatory.
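For readers who want a quick look anyway, here is a minimal sketch of what trim and parse do to raw input. The helper name parse_guess is hypothetical and not part of the guessing game; it only isolates the two calls.

```rust
// Hypothetical helper (not in the game): raw terminal input to a number.
fn parse_guess(input: &str) -> Option<u32> {
    // `trim` strips leading and trailing whitespace, including the newline
    // that `read_line` leaves at the end of the buffer.
    // `parse::<u32>` returns a `Result`; `ok()` turns it into an `Option`.
    input.trim().parse::<u32>().ok()
}

fn main() {
    println!("{:?}", parse_guess("42\n"));
    println!("{:?}", parse_guess("  7  "));
    println!("{:?}", parse_guess("forty-two"));
    println!("{:?}", parse_guess("-1"));
}
```

Negative numbers fail to parse here because u32 is unsigned.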
Next we use the cmp
module's Ordering
enum to match the user input guess
with the generated random number secret_number
:
match guess.cmp(&secret_number) {
Ordering::Less => println!("⬆️ Guess a bigger number, yours was too small!"),
Ordering::Greater => println!("⬇️ Guess a smaller number, yours was too big!"),
Ordering::Equal => {
println!("🎯 Spot on!");
break;
}
}
So let's dig into some of this code further and alter a few things, for the sake of example.
To start, comment out the following code, save your file, and attempt to compile the program.
//let guess: u32 = match guess.trim().parse() {
// Ok(num) => num,
// Err(_) => continue,
//};
You'll notice an error right away that looks something like the below:
cargo run
Compiling guessing_game v0.1.0 (/home/jason/repos/learning-rust/guessing_game)
error[E0308]: mismatched types
--> src/main.rs:24:25
|
24 | match guess.cmp(&secret_number) {
| --- ^^^^^^^^^^^^^^ expected `&String`, found `&{integer}`
| |
| arguments to this method are incorrect
|
= note: expected reference `&String`
found reference `&{integer}`
note: method defined here
--> /home/jason/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/cmp.rs:815:8
|
815 | fn cmp(&self, other: &Self) -> Ordering;
| ^^^
For more information about this error, try `rustc --explain E0308`.
error: could not compile `guessing_game` (bin "guessing_game") due to 1 previous error
This tells you that you have error[E0308]: mismatched types
and then spits those out exactly:
24 | match guess.cmp(&secret_number) {
| --- ^^^^^^^^^^^^^^ expected `&String`, found `&{integer}`
= note: expected reference `&String`
found reference `&{integer}`
Expected &String, found &{integer} in match guess.cmp(&secret_number).
Ah yes, that's right: without converting guess into an integer (which we commented out), guess is still a String, so we get a comparison error because &secret_number is a reference to an integer, &{integer}!
Notice how amazing Rust is with strong static typing, forcing you to fix the error before the program will run. And to boot, its errors are incredibly clear compared with many other statically typed languages, such as C and C++.
Now while we're on it, this guess
variable was declared twice. What's up with that? Well Rust has a feature called shadowing which allows the developer to reuse the variable name.
Let's mess with guess
some more. When you shadow the guess
variable, annotating its type to u32
, what happens to the type of secret_number
?
First, hover over secret_number
and notice that your editor says something like
// size = 4, align = 0x4
let secret_number: u32
Notice that this is the same type as guess
after you annotate it. Now change the annotation on guess
to i32
and hover over secret_number
again (you may need to save your file).
You get something like the following:
// size = 4, align = 0x4
let secret_number: i32
Notice that Rust automagically inferred the type change. Cool, right?
Staying on the guess
shadowing, notice that we have a match and then some curly brackets with Ok
and Err
inside of them.
This is how we handle errors in Rust. parse returns a Result type, which is an enum that has the variants Ok and Err. If the user input is a valid number then it will match Ok and the match evaluates to that number. If the user input is not a valid number, the Err(_) catchall arm runs continue, which skips the rest of this iteration and starts the loop again (asking the user for another number).
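To see the same match-and-continue pattern without needing a terminal, here is a small sketch that walks a list of canned inputs instead of reading stdin. The function name first_valid_guess and the sample inputs are made up for illustration.

```rust
// Hypothetical stand-in for the game loop: returns the first input that parses.
fn first_valid_guess(inputs: &[&str]) -> Option<u32> {
    for raw in inputs {
        let guess: u32 = match raw.trim().parse() {
            Ok(num) => num,
            // Skip the rest of this iteration and move to the next input.
            Err(_) => continue,
        };
        return Some(guess);
    }
    // Every input failed to parse.
    None
}

fn main() {
    println!("{:?}", first_valid_guess(&["abc", "", "17", "99"]));
    println!("{:?}", first_valid_guess(&["x"]));
}
```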
Here I suggest a thorough reading of Chapter 3 of The Rust Programming Language, as I won't dive too in-depth on anything covered. This chapter mainly covers standard idioms modern languages have on a high-level basis, setting up further exploration later.
For now, let's record a few facts about the Rust programming language that are useful at this stage.
By default, variables are immutable in the Rust programming language. If you want to change them, you need to specify that to the compiler using mut
.
For example,
let x: u32 = 42;
x = 41; // will not compile since you cannot re-assign to the immutable x
Here's how to make that mutable:
let mut x: u32 = 42;
x = 43;
Rust has constants as well. Simply use const and AN_ALL_CAPS_VARIABLE_NAME, and always annotate the type.
As mentioned before, the language allows developers to shadow variables. If you simply reuse let with the same name, the variable is shadowed.
Shockingly, the following will compile and the compiler will forget the first value of x
.
let x = 42;
let x = 42 + 42; // 2 * (meaning of life)
The variable x
is still immutable and the compiler will complain if we assign to it without using let
.
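Shadowing can also change a variable's type, which mut cannot do, and a shadow created inside a block ends with that block. A minimal sketch, with function names invented for this example:

```rust
// Shadowing may change the type: `spaces` goes from &str to usize.
fn shadow_changes_type() -> usize {
    let spaces = "   ";
    let spaces = spaces.len();
    spaces
}

// A shadow declared inside a block is gone once the block ends.
fn scoped_shadow() -> (i32, i32) {
    let x = 5;
    let inner = {
        let x = x * 2; // shadows the outer `x` only inside this block
        x
    };
    (x, inner)
}

fn main() {
    println!("{}", shadow_changes_type());
    println!("{:?}", scoped_shadow());
}
```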
Rust has integers, floating point numbers, booleans and a character type. Read about those via the docs.
Rust rejects an integer literal that is out of range for its type at compile time. Arithmetic that overflows at runtime panics in debug builds, e.g. when using cargo run
.
Here's what that looks like:
let overflower: u8 = 4200;
println!("{}", overflower);
cargo run
...
error: literal out of range for `u8`
...
Yet again, the lane bumpers save us. However, what happens when arithmetic overflows at runtime in a release build? Rust performs two's complement wrapping. All that means is that the value wraps around to 0 after the max has been reached.
For example, for a u8
type, 256 becomes 0, 257 becomes 1 and 258 becomes 2. Easy.
Do not rely on this behavior to write your programs. That's considered a design error.
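If you do want a specific overflow behavior, the standard library's integer types offer explicit methods for it. The sketch below bundles four of them into a hypothetical helper, overflow_report, purely to compare their results side by side.

```rust
// Hypothetical helper that shows four explicit ways to add two u8 values.
fn overflow_report(a: u8, b: u8) -> (Option<u8>, u8, u8, (u8, bool)) {
    (
        a.checked_add(b),     // None on overflow
        a.wrapping_add(b),    // wraps around modulo 256
        a.saturating_add(b),  // clamps at u8::MAX
        a.overflowing_add(b), // wrapped value plus an overflow flag
    )
}

fn main() {
    println!("{:?}", overflow_report(255, 3));
    println!("{:?}", overflow_report(1, 2));
}
```

These behave the same in debug and release builds, so the intent is visible in the code rather than depending on the build profile.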
String literals and character literals aren't the same. char literals use single quotes while string literals use double quotes.
The char type in Rust represents a Unicode scalar value, so it can hold accented letters, Japanese, Chinese and Arabic characters, and emojis. Anything Unicode.
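A short sketch of the difference in size: a char always occupies 4 bytes in memory, while the same character inside a string takes 1 to 4 bytes of UTF-8. The helper utf8_len is a made-up name, and the characters are written as Unicode escapes (U+00E9 is an accented e, U+1F980 is the crab emoji).

```rust
// Hypothetical helper: how many bytes a char needs when stored in a string.
fn utf8_len(c: char) -> usize {
    c.len_utf8()
}

fn main() {
    // In-memory size of the char type itself.
    println!("size of char: {}", std::mem::size_of::<char>());
    // Encoded sizes vary per character.
    println!("'a' in UTF-8: {} byte(s)", utf8_len('a'));
    println!("U+00E9 in UTF-8: {} byte(s)", utf8_len('\u{e9}'));
    println!("U+1F980 in UTF-8: {} byte(s)", utf8_len('\u{1F980}'));
    // `len` on a string counts bytes, `chars().count()` counts chars.
    println!("{} bytes, {} char", "\u{e9}".len(), "\u{e9}".chars().count());
}
```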
As usual, the tuple type has a fixed length and can hold different types in the same tuple. Like any other binding, a tuple is immutable unless declared with mut.
struct Apple;
struct Orange;
struct ResponsiblySourcedTrout;
let tup: (Apple, Orange, u32, ResponsiblySourcedTrout) = (Apple, Orange, 3, ResponsiblySourcedTrout);
The array is a fixed size array you are probably used to from other languages. Here's what that looks like with some syntax sugar, too.
let my_array = [1, 2, 3];
let my_array = [3; 1028]; // 1028 elements all with the value of 3
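Here is a sketch of how values come back out of tuples and arrays. The function name tuple_and_array and the sample values are assumptions for illustration.

```rust
fn tuple_and_array() -> (i32, f64, (i32, i32), usize, i32) {
    let tup: (i32, f64, char) = (500, 6.4, 'x');
    let (first, _, _) = tup; // destructuring
    let second = tup.1;      // access by position

    // A tuple bound with `mut` can have its fields reassigned.
    let mut pair = (1, 2);
    pair.0 = 10;

    let my_array = [3; 1028];
    (first, second, pair, my_array.len(), my_array[0])
}

fn main() {
    println!("{:?}", tuple_and_array());
}
```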
Rust is an expression-based language. We'll skip over basic "statements" like let x = 42
and jump right to the meat on the bone.
An "expression" is a fundamental concept that represents a sequence of operations that computes a value. Expressions can consist of literals, variable references, operators, function calls, and control flow constructs among other components. Unlike statements, which perform actions but do not necessarily return a value, expressions always evaluate to a value and can be a part of other expressions.
Let's gloss over a few examples. Here's a simple one, where the block that evaluates to 42 is the expression.
let the_meaning_of_life = {
42
};
And if we use the println!
macro to print it, that's an expression too!
println!("{}", the_meaning_of_life)
Of note, expressions don't have semicolons after them. Adding a semicolon makes it a statement. Nuff said.
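A few more expressions in action, as a sketch with invented function names: if is an expression, a block evaluates to its last expression, and a block whose last line ends in a semicolon evaluates to the unit value ().

```rust
// `if` is an expression, so it can be the value a function returns.
fn classify(n: i32) -> &'static str {
    if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    }
}

// The block evaluates to `x + 1` because that line has no semicolon.
fn block_value() -> i32 {
    let y = {
        let x = 3;
        x + 1
    };
    y
}

// With a trailing semicolon the block has no value left, so it is `()`.
fn unit_value() {
    let unit: () = {
        let _ = 42;
    };
    unit
}

fn main() {
    println!("{} {} {:?}", classify(-5), block_value(), unit_value());
}
```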
Ownership is a concept that is somewhat unusual and not present in other languages. Of the ones I know well -- Python, C++, JavaScript (TypeScript), Matlab, R -- I've been exposed to something like this only with respect to Smart Pointers within the C++ standard library.
So what is ownership?
"Ownership is a set of rules that govern how a Rust program manages memory. [Rust] memory is managed through a system of ownership with a set of rules that the compiler checks. If any of the rules are violated, the program won't compile. None of the features of ownership will slow down your program while it's running." - Page 59 of The Rust Programming Language, 2nd Edition
Without diving into the details of the modern computing machine, a Rust programmer needs to pay attention to and consider the stack versus the heap.
Let's hit up two heuristic ways to think about these things.
The stack is like a literal stack of empty, ready-to-use pizza boxes. Assume you make pizzas and the boxes are next to you. You just pulled a piping hot pepperoni pie out of the oven and need to put it somewhere. Well you grab a box right off the top of the pile next to you and plop that fresh pie right in it.
The stack is LIFO - last-in, first-out (like the pizza boxes on top).
The heap is like a giant apple bobbing bucket full of water where all the apples are your data. When you allocate on the heap it's like putting your data in the bucket and it's floating. But imagine that it has an address carved into it like 0x2001FFE4
.
0x2001FFE4
is super boring, right? I agree. But hexadecimal keeps addresses compact: 0x2001FFE4 is 537,001,956 in decimal representation. Every single value every single program stores has one of these. Imagine how quickly that runs up.
Because the allocator can just store something on the top, the stack is much faster at storing data than the heap, where the allocator has to search for a spot in the bucket to put its apples (figuratively, of course).
I like to think of things like this: if I can statically allocate space it's vastly faster. But sometimes we need to dynamically allocate space, such as with user input.
Yes, writing a program to input what things Donald Trump says would use a lot of heap allocation.
Rust automatically returns memory once the variable that owns that memory goes out of scope. It has no garbage collector and you don't need to manage the memory yourself.
Yes, this is like bowling with the bumpers off but you never go into the gutter.
You might recognize a pattern like this from C++ smart pointers called RAII - Resource Acquisition Is Initialization.
I know, I know...That's a Microsoft link. Use linux. Fck MSFT. Nah, they're cool. Cálmate! Satya made the company awesome.
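To watch memory being returned at the end of a scope, here is a sketch that records each drop in a shared log. The Noisy type is hypothetical, and Rc and RefCell appear only as scaffolding so the drop code can write to the log; they are not needed for ownership itself.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that writes to a shared log when it is dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

fn drop_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Noisy { name: "a", log: Rc::clone(&log) };
        let _b = Noisy { name: "b", log: Rc::clone(&log) };
        log.borrow_mut().push(String::from("end of inner scope"));
    } // `_b` and `_a` go out of scope here, in reverse declaration order
    log.borrow_mut().push(String::from("after inner scope"));
    let entries = log.borrow().clone();
    entries
}

fn main() {
    for line in drop_order() {
        println!("{line}");
    }
}
```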
Taking all the details out of it, which you should totally read from the actual proper rust book, let's dive into what you have to know.
- Primitive data types, like let x = 5 or let y = [3; 5], are on the stack and get copied when you do things like assign them to other variables.
- Complex and dynamic data types, on the other hand, keep their data on the heap. Assigning them to another variable moves ownership instead of copying, and the data is dropped when its owner goes out of scope.
// Stack-allocated variables
let x: i32 = 10; // `x` is an integer stored directly on the stack.
let y: bool = true; // `y` is a boolean value, also stored on the stack.
// Heap-allocated
let mut vec: Vec<i32> = Vec::new(); // `vec` is a struct with a pointer, length, and capacity, all of which are stored on the stack. The actual data of the vector is stored on the heap.
vec.push(42); // Add an element to the heap-allocated vector.
println!("Heap-allocated vector accessed through a stack-allocated struct: {:?}", vec);
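The difference between copying and moving can be sketched like this. The function name copy_and_move is invented for the example.

```rust
fn copy_and_move() -> (i32, i32, String, String) {
    // Integers are copied: `x` is still usable after the assignment.
    let x = 5;
    let y = x;

    let s1 = String::from("hello");
    // `clone` makes an explicit deep copy of the heap data.
    let s2 = s1.clone();
    // Plain assignment moves ownership: `s1` can no longer be used.
    let s3 = s1;
    // println!("{}", s1); // error[E0382]: borrow of moved value: `s1`

    (x, y, s2, s3)
}

fn main() {
    println!("{:?}", copy_and_move());
}
```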
Read this chapter thrice.
Creating a reference is called borrowing in Rust. It's really, really important. You do it by placing an ampersand before a variable or type, e.g.
fn myfunc(my_string: &str) {}
It's said best in the book so here goes:
"As in real life, if a person owns something, you can borrow it from them. When you’re done, you have to give it back. You don’t own it."
You can't modify things you borrow, you can only use them, unless the owner knows (and agrees that) you're going to modify things.
Here's an example of a mutable reference:
let mut nose = String::from("my nose");
operate_on_nose(&mut nose);
fn operate_on_nose(something_on_which_to_perform_plastic_surgery: &mut String) {
// push_str appends a string slice; push only accepts a single char
something_on_which_to_perform_plastic_surgery.push_str(" is different now");
}
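Putting immutable and mutable borrows side by side, as a runnable sketch with made-up function names:

```rust
// Immutable borrow: the function can read but not modify.
fn length_of(text: &str) -> usize {
    text.len()
}

// Mutable borrow: the owner explicitly hands out `&mut`.
fn shout(text: &mut String) {
    text.push('!');
}

fn borrow_demo() -> (usize, String) {
    let mut greeting = String::from("hello");
    let len = length_of(&greeting); // &String coerces to &str
    shout(&mut greeting);           // requires `greeting` to be declared `mut`
    // Ownership never left this function, so `greeting` is still usable.
    (len, greeting)
}

fn main() {
    println!("{:?}", borrow_demo());
}
```

At any given time a value can have either one mutable reference or any number of immutable references, which is why the two borrows above happen one after the other.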
A unique type of reference in Rust is the slice which is a representation of a sequence of elements that are contiguous.
You might be familiar with this concept from Python or other languages. Let's look at syntax and save the rest for later. In the meantime, read the actual book on the topic.
// Slice example
let s = String::from("hello world");
let hello = first_word(&s);
fn first_word(s: &str) -> &str {
let bytes = s.as_bytes();
for (i, &item) in bytes.iter().enumerate() {
if item == b' ' {
return &s[..i];
}
}
&s[..]
}
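Slices are not limited to strings. As a sketch, the hypothetical function first_two below borrows up to the first two elements of any i32 slice, array or vector, without copying them.

```rust
// Hypothetical helper: borrows at most the first two elements.
fn first_two(values: &[i32]) -> &[i32] {
    let end = values.len().min(2);
    &values[..end]
}

fn main() {
    let numbers = [10, 20, 30];
    let from_vec = vec![7];
    println!("{:?}", first_two(&numbers));
    println!("{:?}", first_two(&from_vec));
}
```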
Structs hold related attribute data where the attributes have names, similar to tuples but named, so that it's clear what the value(s) of each attribute means.
Here's what a struct looks like:
#[derive(Debug)] // this is for printing using "{:?}" or "{:#?}"
struct DisplayAd {
start_timestamp: i64,
budget: u32,
title: String,
copy: String,
call_to_action: String,
media_asset_urls: Vec<String>,
button_text: String,
target_url: String,
}
The DisplayAd
struct above can be used like the below.
use chrono;
fn main() {
let start_timestamp: i64 = chrono::Utc::now().timestamp();
let mut my_ad = DisplayAd {
start_timestamp,
budget: 5000,
title: String::from("My first ad"),
copy: String::from("Buy whatever I'm selling. It's great!"),
call_to_action: String::from("On sale today only!"),
button_text: String::from("Buy now"),
target_url: String::from("https://tincre.com/agency"),
media_asset_urls: vec![String::from("https://https://res.cloudinary.com/tincre/video/upload/v1708121578/nfpwzh1oslr8qhdyotzs.mov")],
};
}
There are a few things happening in the above. First of all, the usage of mutability mut
is arbitrary and not required. Secondly, we're using field init shorthand syntax to list the parameter/field name start_timestamp
.
We can only use field init shorthand when the variable name and the struct field name are exactly the same.
We can use some more shorthand syntax for instantiating another DisplayAd
, which has some quirks we'll cover.
let mut my_ad2 = DisplayAd {
start_timestamp,
budget: 1250,
title: my_ad.title,
call_to_action: my_ad.call_to_action,
button_text: String::from("Don't use it"),
target_url: String::from("https://truthsocial.com"),
..my_ad // no comma after this
};
Now firstly, notice that we used my_ad.title
, the title
field from the first my_ad
DisplayAd
instantiation. Importantly, when we do this, the ownership of my_ad.title is moved to my_ad2.title. That means you can't use my_ad as a whole anymore, only the fields that weren't moved!
Secondly, at the very end we use struct update syntax to add the remaining items from my_ad
that we didn't specify. This must come last and cannot have a trailing comma.
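The move behavior of struct update syntax is easier to see on a smaller struct. MiniAd below is a hypothetical, cut-down stand-in for DisplayAd.

```rust
// Hypothetical, smaller version of DisplayAd.
struct MiniAd {
    budget: u32,
    title: String,
    copy: String,
}

fn update_demo() -> (u32, u32, String, String) {
    let first = MiniAd {
        budget: 5000,
        title: String::from("My first ad"),
        copy: String::from("Buy it"),
    };
    // `title` and `copy` are moved out of `first`; `budget` is set explicitly.
    let second = MiniAd { budget: 1250, ..first };

    // `first` as a whole is gone, but its un-moved field is still readable.
    let first_budget = first.budget;
    // let t = first.title; // error[E0382]: use of moved value: `first.title`

    (first_budget, second.budget, second.title, second.copy)
}

fn main() {
    println!("{:?}", update_demo());
}
```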
We can also instantiate structs without field names. For example,
struct Coordinates(f64, f64);
fn main() {
let location = Coordinates(19.3937, 99.1746);
}
These can be useful if you want a tuple that comes with all the other goodies of structs.
One of the goodies that structs provide is the ability to define methods on them in an impl block, just like functions but called in the context of a struct instance.
This is available for enums and traits, too.
This can help massively with readability. Here's an example, assuming our DisplayAd
struct from above.
const AVG_CPM: f64 = 3.2;
impl DisplayAd {
    // CPM is the cost per thousand impressions, hence the factor of 1000.
    fn calculate_estimated_impressions(&self) -> f64 {
        (self.budget as f64 / AVG_CPM) * 1000.0
    }
}
fn main() {
    // my_ad2 is the DisplayAd instantiated above.
    println!(
        "{} impressions are expected for spend of {} USD",
        my_ad2.calculate_estimated_impressions(),
        my_ad2.budget
    );
}
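Functions defined in an impl block are called associated functions in the Rust language; those that take self, like the one above, are methods. It's common for a struct to implement an associated function named new that creates the struct. All the usual borrowing/ownership rules apply.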
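A minimal sketch of the new convention, reusing the Coordinates tuple struct from earlier. The methods is_northern_hemisphere and longitude are invented for this example.

```rust
struct Coordinates(f64, f64);

impl Coordinates {
    // Associated function: no `self`, called as `Coordinates::new(...)`.
    fn new(latitude: f64, longitude: f64) -> Self {
        Coordinates(latitude, longitude)
    }

    // Methods: take `&self`, called on an instance.
    fn is_northern_hemisphere(&self) -> bool {
        self.0 > 0.0
    }

    fn longitude(&self) -> f64 {
        self.1
    }
}

fn main() {
    let location = Coordinates::new(19.3937, 99.1746);
    println!("north: {}", location.is_northern_hemisphere());
    println!("longitude: {}", location.longitude());
}
```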
In covering enums we'll stick with our ads modeling from above. So what's an enum, you ask?
An enum (short for enumeration) in Rust allows you to define a type by enumerating its possible values. Each of these possible values is known as a variant. Variants of an enum can carry data (similar to fields in a struct) and can have different types and amounts of associated data.
Basically, use enums when you want to model the context of your data and can enumerate it. Then use structs to actually hold that data.
Revisiting our DisplayAd
struct from above, the Ad
enum below shows how we might use it and other structs.
enum Ad {
Display(DisplayAd),
Hover(HoverAd),
Feed(FeedAd),
Video(VideoAd),
InlineText(InlineTextAd),
}
Here we've added other *Ad
structs and enumerated them inside an enum. So we can use the Ad
enum and reason about what kind of ad we're dealing with, having the data separate from the actual reasoning mechanism itself.
Note: you don't have to use structs to store data inside an enum. You can store it directly. Here's what that looks like:
enum Fruit {
Apple(String),
Grapes(String),
}
fn main() {
let washington_apple = Fruit::Apple(String::from("Washington"));
let green_grapes = Fruit::Grapes(String::from("Green"));
let red_grapes = Fruit::Grapes(String::from("Red"));
}
Back on our enum to model types of ads, one advantage is that we can write methods, like with structs, but that operate on all the different types of ads.
And inside those methods we can use an extremely powerful control flow construct in Rust called match, which allows you to execute code based on pattern matches. Patterns can be made up of:
- Literals
- Destructured arrays, enums, structs, or tuples
- Variables
- Wildcards
- Placeholders
This is directly from the Rust book section.
So let's use methods with match
to do some setup for our Ad
s.
impl Ad {
fn init(&self) {
match self {
Ad::Display(ad) => {
println!("initializing: {:#?}", ad);
send_notification(&ad.title)
}
Ad::Hover(ad) => {
println!("initializing: {:#?}", ad);
send_notification(&ad.title)
}
Ad::Feed(ad) => {
println!("initializing: {:#?}", ad);
send_notification(&ad.title)
}
Ad::Video(ad) => {
println!("initializing: {:#?}", ad);
send_notification(&ad.title)
}
Ad::InlineText(ad) => {
println!("initializing: {:#?}", ad);
send_notification(&ad.title)
}
}
}
}
fn main() {
let start_timestamp: i64 = chrono::Utc::now().timestamp();
let my_display_ad = Ad::Display(DisplayAd {
start_timestamp,
budget: 5000,
title: String::from("My first ad"),
copy: String::from("Buy whatever I'm selling. It's great!"),
call_to_action: String::from("On sale today only!"),
button_text: String::from("Buy now"),
target_url: String::from("https://tincre.com/agency"),
media_asset_urls: vec![String::from("https://https://res.cloudinary.com/tincre/video/upload/v1708121578/nfpwzh1oslr8qhdyotzs.mov")],
});
my_display_ad.init();
// and another ad
let my_text_ad = Ad::InlineText(InlineTextAd {
start_timestamp,
budget: 5000,
title: String::from("My first ad"),
copy: String::from("Buy whatever I'm selling. It's great!"),
call_to_action: String::from("On sale today only!"),
target_url: String::from("https://tincre.com/agency"),
});
my_text_ad.init();
}
We created an init
method on the Ad
enum type that match
es the corresponding ad struct that actually holds our data. Now we have two ads ready to rock and initialized custom to the kind of ad each represents.
Rust has a built-in enum called Option<T>
to represent the presence or absence of value. It is designed to avoid null references, a common source of errors in other programming languages. It has two variants: Some(T)
, indicating the presence of a value of type T
, and None
, indicating the absence of a value.
Along with control flow like match
this can be very useful. For example,
fn did_eat_fruit(fruit: Option<&str>) -> bool {
match fruit {
None => false,
_ => true,
}
}
fn main() {
let apple = Some("Apple");
let banana = None;
let monkey_eating_status = if did_eat_fruit(apple) {
"ate"
} else {
"did not eat"
};
println!("The monkey {monkey_eating_status}.");
let monkey_eating_status = if did_eat_fruit(banana) {
"ate"
} else {
"did not eat"
};
println!("The monkey {monkey_eating_status}.");
}
If the match had a Some arm but no None arm (and no _ catchall), the compiler would have screamed at us at compile time, before the omission could turn into a runtime bug. This is a fantastic safety feature of Rust. It forces the developer to handle the type explicitly, always.
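Besides a plain match, Option values are often handled by binding the inner value in a Some arm or by chaining standard-library combinators such as map and unwrap_or. A sketch, with function names that are assumptions for this example:

```rust
// Binding the inner value in the `Some` arm.
fn fruit_label(fruit: Option<&str>) -> String {
    match fruit {
        Some(name) => format!("ate {name}"),
        None => String::from("did not eat"),
    }
}

// `map` transforms the inner value if present; `unwrap_or` supplies a fallback.
fn name_length(fruit: Option<&str>) -> usize {
    fruit.map(|name| name.len()).unwrap_or(0)
}

fn main() {
    println!("{}", fruit_label(Some("Apple")));
    println!("{}", fruit_label(None));
    println!("{}", name_length(Some("Apple")));
    println!("{}", name_length(None));
}
```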
Aside from a cursory review and demonstration here, we'll dive into these concepts in more depth with an actual project, later.
In a nutshell, Rust's module system consists of (these are directly from the docs):
- Packages: A Cargo feature that lets you build, test, and share crates
- Crates: A tree of modules that produces a library or executable
- Modules and use: Let you control the organization, scope, and privacy of paths
- Paths: A way of naming an item, such as a struct, function, or module
The smallest compiled piece of code Rust can consider is called a crate, which can contain modules, which can be defined in other files or places in the codebase that get compiled with the crate.
There are two types of crates, binary and library crates. A package can, and often does, have both.
A binary crate has a main
function that you can use to actually run an executable, typically located inside <my-project-name>/src/main.rs
.
You create a package using a binary crate via the cargo new
command, e.g. cargo new <my-project-name>
.
Library crates define functionality to be used elsewhere. Often these are published to crates.io to be shared publicly.
You create a package with a library crate using the --lib
flag with Cargo's new
command, e.g. cargo new <my-project-name> --lib
.
This creates a file under <my-project-name>/src/lib.rs
.
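Modules, paths and privacy can be sketched in a single file with inline mod blocks. The module names ads and pricing and the function names are hypothetical; AVG_CPM reuses the constant from the structs chapter.

```rust
mod ads {
    pub mod pricing {
        pub const AVG_CPM: f64 = 3.2;

        // Private: only code inside `pricing` can call this.
        fn cost_per_impression() -> f64 {
            AVG_CPM / 1000.0
        }

        // Public: reachable from outside via its path.
        pub fn estimated_impressions(budget: u32) -> f64 {
            f64::from(budget) / cost_per_impression()
        }
    }
}

// `use` brings the items into scope so the full path is not needed below.
use ads::pricing::{estimated_impressions, AVG_CPM};

fn main() {
    println!("CPM: {}", AVG_CPM);
    println!("impressions: {}", estimated_impressions(5000));
    // ads::pricing::cost_per_impression(); // error[E0603]: function is private
}
```

In a real project each mod block would usually live in its own file, e.g. src/ads.rs, instead of inline.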
The Rust standard library has a number of collections available for use, data structures that store data on the heap and can be grown or shrunk during runtime.
Vectors are data structures that store multiple values next to each other in memory. You should use them when you have a list of things to store.
Below are some snippets demonstrating how to use them.
// create and push to the vector
let mut my_vec: Vec<u8> = Vec::new();
my_vec.push(0);
my_vec.push(1);
my_vec.push(1);
my_vec.push(0);
println!("{:?}", my_vec);
Use the convenience macro:
let my_vec2: Vec<u8> = vec![0, 1, 1, 0];
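Reading values back out can be sketched as follows. Indexing panics when the index is out of bounds, while get returns an Option instead. The function name read_vector is made up for this example.

```rust
fn read_vector() -> (u8, Option<u8>, u32) {
    let my_vec: Vec<u8> = vec![0, 1, 1, 0];

    // Indexing: panics at runtime if the index is out of bounds.
    let second = my_vec[1];

    // `get` returns None instead of panicking.
    let missing = my_vec.get(10).copied();

    // Iterating by reference leaves the vector usable afterwards.
    let sum: u32 = my_vec.iter().map(|&bit| u32::from(bit)).sum();

    (second, missing, sum)
}

fn main() {
    println!("{:?}", read_vector());
}
```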
A vector holds values of a single type, but that type can be an enum, which lets one vector store different kinds of values.
#[derive(Debug)]
pub struct TextAd {
ad_text: String,
budget: u32,
target_url: String,
}
#[derive(Debug)]
pub struct VideoAd {
ad_title: String,
budget: u32,
target_url: String,
}
impl TextAd {
pub fn new(ad_text: String, budget: u32, target_url: String) -> TextAd {
TextAd {
ad_text,
budget,
target_url,
}
}
}
impl VideoAd {
pub fn new(ad_title: String, budget: u32, target_url: String) -> VideoAd {
VideoAd {
ad_title,
budget,
target_url,
}
}
}
#[derive(Debug)]
pub enum Ad {
Text(TextAd),
Video(VideoAd),
}
fn main() {
let mut ads = vec![
Ad::Video(VideoAd::new(
String::from("Test video title"),
1000,
String::from("https://tincre.com"),
)),
Ad::Text(TextAd::new(
String::from("Test text"),
1250,
String::from("https://tincre.com/agency"),
)),
];
println!("{:?}", ads);
}
Strings in Rust may seem strange to those coming from dynamic languages such as Python or JavaScript. If coming from C-family languages, the way Rust treats strings may seem refreshing, as C-family developers consistently deal with the complexities a "string" presents.
Rust has the primitive char
type which is defined to represent a Unicode scalar value. It is always 4 bytes long and its syntax is represented by two enclosed single quotes '
, e.g. 'c'
.
However, the char type is not how strings are represented in Rust; a String is better thought of as a vector, in fact, a Vec<u8> with some extras and restrictions, chiefly that its bytes are always valid UTF-8.
You can create a string using the familiar ::new
or the string-specific ::from
functions, if you'd like to create your string from a string literal directly.
fn main() {
let mut s1 = String::new();
s1.push_str("Hello, ");
let s2 = String::from("world!");
}
String literals also have a
to_string()
method developers can use to return a String.
Similarly to Vec<T>, a String can grow, shrink and have its contents modified. It's important to remember that borrow and move operations apply here. The example below demonstrates this, assuming our s1 and s2 variables from directly above.
fn main() {
let s3 = s1 + &s2; // note s1 has been moved and is no longer usable
println!("{}", s3);
//println!("{}", s1);
}
If you uncomment the //println! line, the compiler rejects the program with error[E0382]: borrow of moved value: `s1`. Formatting and pushing, by contrast, only borrow their arguments:
fn main() {
let mut s = format!("{s3} You are crazy, {s2}");
let not_owned = "blah";
s.push_str(not_owned);
println!("Pushed: {}", s);
println!("This isn't owned: {}", not_owned);
}
You can also slice a string to get particular bytes:
fn main() {
let slice_entire = &s3[..];
// Borrow a reference to part of the String
let slice_part = &s3[0..5];
println!("Entire slice: {}", slice_entire); // Prints "Hello, world!"
println!("Part of slice: {}", slice_part); // Prints "Hello"
}
Be very careful using ranges to slice strings: the range is in bytes, and a range that is out of bounds or that splits a multi-byte character will panic at runtime.
In particular, you can't index a String with a single integer (e.g. s3[0]) in Rust.
If you want to operate on pieces of String collections you should use iterators, of which there are two: chars and bytes.
fn main() {
for c in not_owned.chars() {
println!("Character: {}", c);
}
for b in not_owned.bytes() {
println!("Bytes: {}", b);
}
}
Key-value storage in Rust is typically accomplished with the HashMap standard library collection. Unlike Vec and String, HashMap is not in the prelude, so you first need to bring the collection into scope with use.
For example,
use std::collections::HashMap;
Specifically, HashMap<K, V>
maps keys (of type K
) to values (type V
) using, by default, the SipHash hashing function, which was created in 2012 to resist denial-of-service attacks on hash tables.
We create hash maps via the standard ::new
constructor.
use std::collections::HashMap;
fn main() {
let mut prices = HashMap::new();
}
Let's insert some stock ticker symbols and fake prices.
fn main() {
let stock_ticker_1 = "AAPL";
prices.insert(stock_ticker_1, 163.23);
prices.insert("GILD", 66.76);
}
These are real tickers for Apple (AAPL) and Gilead Sciences (GILD).
Now let's extract those values and do something with them, like print to the console.
fn main() {
let ticker_symbol = "GILD";
let gild_price = prices.get(&ticker_symbol).copied().unwrap_or(0.0);
println!("{}: {}", ticker_symbol, gild_price);
}
When you update values in a Rust HashMap
you need to choose what you want to happen.
You can choose to overwrite the value, insert a standard value or do nothing if there's something there already, or modify the value present in some way.
Overwriting values is simple. The hash map simply takes the last value given, in the "overwriting" case.
fn main() {
prices.insert("GILD", 66.77);
}
A common pattern is to insert a default value only when a value is not present, otherwise leaving the current value alone.
For our ticker case, imagine that there's a vector of enums that hold a string timestamp and a HashMap
of our tickers plus the price. This might be a nice way to organize data from various exchanges available.
In this case if we want, we can set a default value for the price, e.g. 0.
Note setting default 0s would be terrible practice in actual financial engineering applications!
fn main() {
prices.entry("AAPL").or_insert(0.0);
}
Now let's add a penny to a price.
fn main() {
let aapl_price = prices.entry("AAPL").or_insert(0.0);
*aapl_price += 0.01;
}
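Putting the hash map steps above together, here is a self-contained sketch of insert, overwrite, insert-if-absent and modify-in-place. Two things here are my own assumptions and not from the snippets above: prices are stored as integer cents so the results are exact, and MSFT is an extra ticker added only to show or_insert filling a missing key.

```rust
use std::collections::HashMap;

/// Builds the example price table step by step.
/// Prices are in cents (an assumption of this sketch) to avoid floats.
fn build_prices() -> HashMap<&'static str, u32> {
    let mut prices = HashMap::new();

    prices.insert("AAPL", 16_323);
    prices.insert("GILD", 6_676);

    // Overwrite: the last value inserted for a key wins.
    prices.insert("GILD", 6_677);

    // Insert only if absent: AAPL already exists, so it is left alone.
    prices.entry("AAPL").or_insert(0);
    // MSFT (hypothetical extra ticker) is not present, so it gets the default.
    prices.entry("MSFT").or_insert(0);

    // Modify in place: add a penny to AAPL.
    *prices.entry("AAPL").or_insert(0) += 1;

    prices
}

fn main() {
    let prices = build_prices();
    let gild = prices.get("GILD").copied().unwrap_or(0);
    println!("GILD: {gild}");
    println!("AAPL: {:?}", prices.get("AAPL"));
    println!("MSFT: {:?}", prices.get("MSFT"));
}
```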
Now that collections have been reviewed, the Rust book suggests three projects, as these tools allow developers to make much more complex programs.
- Given a list of integers, use a vector and return the median (when sorted, the value in the middle position) and mode (the value that occurs most often; a hash map will be helpful here) of the list.
  - We'll also include the mean here, both arithmetic and geometric.
- Convert strings to pig latin. The first consonant of each word is moved to the end of the word and “ay” is added, so “first” becomes “irst-fay.” Words that start with a vowel have “hay” added to the end instead (“apple” becomes “apple-hay”). Keep in mind the details about UTF-8 encoding!
- Using a hash map and vectors, create a text interface to allow a user to add ticker symbols to a portfolio in a fund. For example, “Add AAPL to Alpha Fund I” or “Add GILD to Global Value Fund II.” Then let the user retrieve a list of all tickers in a portfolio or all tickers in the fund by portfolio name, sorted alphabetically.
I edited the original project suggestion to be about stocks in portfolios in a fund, rather than employees in a department in a company.
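As a starting point for the first project, here is one possible sketch of median and mode. The tie-breaking choices (lower middle value for even-length lists, smallest value for a tied mode) are my own assumptions; the project description doesn't specify them.

```rust
use std::collections::HashMap;

/// Median of a list: the middle value once sorted. For an even count this
/// sketch returns the lower of the two middle values (an arbitrary choice).
fn median(values: &[i32]) -> Option<i32> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort();
    Some(sorted[(sorted.len() - 1) / 2])
}

/// Mode of a list: the most frequent value. Ties go to the smallest value
/// so the result does not depend on hash map iteration order.
fn mode(values: &[i32]) -> Option<i32> {
    let mut counts: HashMap<i32, usize> = HashMap::new();
    for &v in values {
        *counts.entry(v).or_insert(0) += 1;
    }
    counts
        .into_iter()
        // Compare by count first; on equal counts the smaller value wins.
        .max_by(|a, b| a.1.cmp(&b.1).then(b.0.cmp(&a.0)))
        .map(|(value, _)| value)
}

fn main() {
    let numbers = [4, 1, 3, 2, 2];
    println!("median: {:?}", median(&numbers));
    println!("mode: {:?}", mode(&numbers));
}
```

Both functions return an Option so that an empty list is handled without panicking.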
Rust employs a modern approach to handling errors by splitting them into recoverable and unrecoverable error types.
Many languages have exceptions but do not distinguish between these two kinds of errors. However, requiring that developers handle errors at compile time leads to more robust programs that are better for the end-user.
One of Rust's remarkable features is its ability to minimize undefined behavior which you might be familiar with from C-family languages. Handling unrecoverable errors with the panic!
macro is one way that the language achieves this software safety.
When Rust panics, by default it unwinds the stack and terminates the program. Though there are ways to stop this stack-unwinding behavior, we'll skip that for now.
panic!
is just a macro so let's call it.
fn main() {
panic!("Crash the program!");
}
Here's what this displays in the terminal when we run the program:
RUST_BACKTRACE=1 cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/errors_uncrecoverable`
thread 'main' panicked at src/main.rs:2:5:
Crash the program!
stack backtrace:
0: rust_begin_unwind
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/panicking.rs:647:5
1: core::panicking::panic_fmt
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/panicking.rs:72:14
2: errors_uncrecoverable::main
at ./src/main.rs:2:5
3: core::ops::function::FnOnce::call_once
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
When the program reaches this macro at runtime, it stops and unwinds the stack from top to bottom, cleaning up memory that the program used along the way. In my opinion, this is vastly better than in C-family languages, which leave the behavior undefined and often require the operating system to clean up the broken pieces of the program's runtime.
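To see the cleanup during unwinding for yourself, here is a small sketch. It assumes the default panic = "unwind" setting; Guard, CLEANED_UP and the function name are all hypothetical names for this illustration. It uses std::panic::catch_unwind only so the program can keep running long enough to report what happened, not as a general error-handling tool.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Records whether the guard's destructor ran. A static keeps the sketch simple.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

/// Panics while a `Guard` is alive, catches the unwind, and reports
/// whether the guard was dropped along the way.
fn cleanup_ran_during_unwind() -> bool {
    CLEANED_UP.store(false, Ordering::SeqCst);
    let result = panic::catch_unwind(|| {
        let _guard = Guard;
        panic!("Crash the program!");
    });
    // The closure panicked, and unwinding dropped `_guard` on the way out.
    result.is_err() && CLEANED_UP.load(Ordering::SeqCst)
}

fn main() {
    println!("cleanup ran: {}", cleanup_ran_during_unwind());
}
```

The panic message still shows up on stderr; the interesting part is that the Drop implementation ran before control came back to the caller.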
Let's create another type of error and watch it panic, with the backtrace included. For this, we'll divide by zero, undefined behavior even in mathematics!
#[allow(unconditional_panic)]
fn main() {
let y = 0;
let should_panic = 1 / y;
println!("{}", should_panic);
}
The compiler will catch this at build time if you are using a modern version of Rust (I'm using 1.77.1), via a deny-by-default lint, so we need to disable it with #[allow(unconditional_panic)].
The panic:
RUST_BACKTRACE=1 cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/errors_uncrecoverable`
thread 'main' panicked at src/main.rs:6:24:
attempt to divide by zero
stack backtrace:
0: rust_begin_unwind
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/panicking.rs:647:5
1: core::panicking::panic_fmt
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/panicking.rs:72:14
2: core::panicking::panic
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/panicking.rs:144:5
3: errors_uncrecoverable::main
at ./src/main.rs:6:24
4: core::ops::function::FnOnce::call_once
at /rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
As you can see, our attempt to divide by zero didn't work very well. However, the panic stopped the program in a controlled way rather than letting it carry on in a bad state.
To be clear, panic! should not be your primary tool for guarding against runtime errors that a program can anticipate; those are better modeled as recoverable errors.
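For the divide-by-zero case specifically, the standard library's checked_div lets the caller decide what to do instead of panicking. A minimal sketch, where safe_divide is a hypothetical wrapper name:

```rust
/// Divides without panicking: `None` signals division by zero (or overflow).
fn safe_divide(numerator: i32, denominator: i32) -> Option<i32> {
    numerator.checked_div(denominator)
}

fn main() {
    match safe_divide(1, 0) {
        Some(value) => println!("1 / 0 = {value}"),
        None => println!("cannot divide by zero"),
    }
    println!("{:?}", safe_divide(10, 2));
}
```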
Rust's other kind of error is called recoverable: a function propagates handleable errors up to the caller via the Result<T, E> enum that you may recall from earlier.
As a refresher, that enum is defined as follows:
enum Result<T, E> {
Ok(T),
Err(E),
}
In particular, recoverable errors are appropriately used when the caller can be expected to handle the error and do something useful in response. This is in contrast to panic!
from above, which should be used when it is reasonable for the program to completely quit, limiting the caller's ability to handle or do anything further.
Rust's matching system was designed to handle errors safely in a first-class manner.
Let's look at the following function which propagates its errors up to the caller and how matching makes this easy to use and read.
use std::io;
fn validate_username(username: &str) -> Result<String, io::Error> {
match username.is_ascii() {
true => Ok(username.to_string()),
_ => Err(io::Error::new(
io::ErrorKind::InvalidInput,
"Username must be ASCII",
)),
}
}
The function above checks if a username is valid ASCII and returns an okay Result
with the username as a string or an error indicating the problem.
Naively we could just handle this and print out the issue to the user, e.g.
fn main() {
let username = "my_username";
match validate_username(username) {
Ok(_) => {
print!("{} username is good to go!", username);
}
Err(e) => {
print!("{} username is not valid: {}", username, e);
}
}
}
Good output looks like
❯ cargo run
Compiling errors_recoverable v0.1.0 (/home/jason/repos/learning-rust/errors_recoverable)
Finished dev [unoptimized + debuginfo] target(s) in 0.11s
Running `target/debug/errors_recoverable`
my_username username is good to go!
And bad output like
❯ cargo run
Compiling errors_recoverable v0.1.0 (/home/jason/repos/learning-rust/errors_recoverable)
Finished dev [unoptimized + debuginfo] target(s) in 0.09s
Running `target/debug/errors_recoverable`
my_username ❤️ username is not valid: Username must be ASCII
That's okay but imagine that we're actually taking this input from the user inside of a loop. Wouldn't it be nice to do something such as
fn main() {
let username = "my_username ❤️";
match validate_username(username) {
Ok(_) => {
print!("{} username is good to go!", username);
}
Err(e) => match e.kind() {
io::ErrorKind::InvalidInput => {
// ask the user for another user name
println!("Handling InvalidInput error:");
println!("{} username is not valid: {}", username, e);
}
_ => {
println!("{} username is not valid: {}", username, e);
}
},
}
}
See how we handle that InvalidInput
error type explicitly?
❯ cargo run
Compiling errors_recoverable v0.1.0 (/home/jason/repos/learning-rust/errors_recoverable)
Finished dev [unoptimized + debuginfo] target(s) in 0.09s
Running `target/debug/errors_recoverable`
Handling InvalidInput error:
my_username ❤️ username is not valid: Username must be ASCII
And it shows the additional input error on the console.
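Here is a rough sketch of the "inside of a loop" idea. To keep it self-contained, the attempts come from a slice that stands in for lines typed by the user; first_valid_username is a hypothetical name, and validate_username is repeated from above (written with if/else).

```rust
use std::io;

// Same validation as in the text above.
fn validate_username(username: &str) -> Result<String, io::Error> {
    if username.is_ascii() {
        Ok(username.to_string())
    } else {
        Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "Username must be ASCII",
        ))
    }
}

/// Walks through the attempts in order and returns the first valid username.
/// `attempts` stands in for lines a real program would read from the user.
fn first_valid_username(attempts: &[&str]) -> Option<String> {
    for attempt in attempts {
        match validate_username(attempt) {
            Ok(name) => return Some(name),
            Err(e) if e.kind() == io::ErrorKind::InvalidInput => {
                // Recoverable: report the problem and try the next attempt.
                println!("{attempt} is not valid: {e}");
            }
            Err(e) => {
                // Anything else is unexpected here, so stop asking.
                println!("giving up: {e}");
                return None;
            }
        }
    }
    None
}

fn main() {
    let attempts = ["my_username \u{2764}\u{fe0f}", "my_username"];
    println!("{:?}", first_valid_username(&attempts));
}
```

Because the error is an ordinary value, the loop can simply move on to the next attempt instead of crashing.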
The ?
operator allows us to handle much of the boilerplate associated with propagating errors up the call stack.
Take, for example, the following simplification.
use std::fs::File;
use std::io::prelude::*;
use std::io;
struct Info {
name: String,
age: i32,
rating: i32,
}
fn write_info(info: &Info) -> io::Result<()> {
// Early return on error
let mut file = match File::create("my_best_friends.txt") {
Err(e) => return Err(e),
Ok(f) => f,
};
if let Err(e) = file.write_all(format!("name: {}\n", info.name).as_bytes()) {
return Err(e)
}
if let Err(e) = file.write_all(format!("age: {}\n", info.age).as_bytes()) {
return Err(e)
}
if let Err(e) = file.write_all(format!("rating: {}\n", info.rating).as_bytes()) {
return Err(e)
}
Ok(())
}
Turns into the following with the ?
operator.
use std::fs::File;
use std::io::prelude::*;
use std::io;
struct Info {
name: String,
age: i32,
rating: i32,
}
fn write_info(info: &Info) -> io::Result<()> {
let mut file = File::create("my_best_friends.txt")?;
// Early return on error
file.write_all(format!("name: {}\n", info.name).as_bytes())?;
file.write_all(format!("age: {}\n", info.age).as_bytes())?;
file.write_all(format!("rating: {}\n", info.rating).as_bytes())?;
Ok(())
}
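The above examples are directly from the Standard Library documentation.
The question mark operator can be applied to any expression of type Result<T, E>, as long as the enclosing function itself returns a compatible Result (it works the same way for Option).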
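A smaller sketch of the ? operator that doesn't touch the file system. Both function names are made up for illustration; the first uses ? on Result, the second on Option.

```rust
use std::num::ParseIntError;

/// Parses two integers and adds them. Each `?` returns early with the
/// parse error if that step fails.
fn add_parsed(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.trim().parse()?;
    let y: i32 = b.trim().parse()?;
    Ok(x + y)
}

/// `?` also works on `Option` inside a function that returns `Option`.
fn first_char_uppercase(s: &str) -> Option<char> {
    let c = s.chars().next()?;
    Some(c.to_ascii_uppercase())
}

fn main() {
    println!("{:?}", add_parsed("2", "40"));
    println!("{:?}", add_parsed("2", "forty"));
    println!("{:?}", first_char_uppercase("rust"));
}
```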
So where should you use the two? The Rust book covering the topic has a fantastic rule-of-thumb:
"It’s advisable to have your code panic when it’s possible that your code could end up in a bad state."
Tests, examples and quick prototypes should generally use the panic!
macro so that you get rapid feedback. You can also use .unwrap
and .expect
methods to achieve this result.
But production code, especially library code, should typically propagate an error if you can think of how to handle it. That said, one rare instance where you should use unrecoverable errors is when you have more information - via human context - than the compiler.
This example from the Rust book outlines where a human has more information than the compiler does.
use std::net::IpAddr;
fn main() {
let home: IpAddr = "127.0.0.1"
.parse()
.expect("Hardcoded IP address should be valid");
println!("{}", home);
}
We use expect
here because we know that "127.0.0.1"
is a valid IP address but there's no way for the compiler to know that.
To better organize and simplify validating contracts for your functions you should use custom types. This leverages the Rust type system at compile time, saving a lot of boilerplate error handling you otherwise would need to implement.
Remember our ads from earlier? Let's use those to show what we're talking about here by adding some validation to the inputs.
#[derive(Debug)]
pub struct VideoAd {
ad_title: String,
budget: u32,
target_url: String,
}
impl VideoAd {
pub fn new(ad_title: String, budget: u32, target_url: String) -> VideoAd {
if budget < 50 {
panic!("The budget must be at least $50 USD! Got {}", budget);
}
if !target_url.starts_with("https://") {
panic!(
"The target_url must start with the https:// protocol. Got {}",
target_url,
);
}
VideoAd {
ad_title,
budget,
target_url,
}
}
}
Now we've added some validation to the VideoAd type. If users input a budget less than 50 or a target_url that doesn't start with https:// we stop the program in its tracks with a useful panic message at runtime.
In further refactoring, we'd probably want to alter the return type of new
to return a Result<VideoAd, io::Error>
and properly return an error with the messages, rather than panicking.
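That refactoring might look roughly like the sketch below. The constructor name try_new and the choice of io::ErrorKind::InvalidInput are my assumptions; a real project would more likely define its own error type.

```rust
use std::io;

#[derive(Debug)]
pub struct VideoAd {
    ad_title: String,
    budget: u32,
    target_url: String,
}

impl VideoAd {
    /// Fallible constructor (`try_new` is a name chosen for this sketch).
    pub fn try_new(
        ad_title: String,
        budget: u32,
        target_url: String,
    ) -> Result<VideoAd, io::Error> {
        if budget < 50 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                format!("The budget must be at least $50 USD! Got {budget}"),
            ));
        }
        if !target_url.starts_with("https://") {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                format!("The target_url must start with the https:// protocol. Got {target_url}"),
            ));
        }
        Ok(VideoAd {
            ad_title,
            budget,
            target_url,
        })
    }
}

fn main() {
    // A budget of 10 is too small, so this takes the Err branch.
    match VideoAd::try_new("Launch".to_string(), 10, "https://example.com".to_string()) {
        Ok(ad) => println!("created {ad:?}"),
        Err(e) => println!("could not create ad: {e}"),
    }
}
```

Now the caller gets to decide whether a bad budget is fatal, which is the whole point of a recoverable error.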
Reference lifetimes, generics and traits are three language features that make Rust extremely extensible, as well as uniquely safe, for a low-level language.
Code reuse is a fundamental principle of quality software engineering. It reduces the error surface, speeds up debugging and allows others to better understand the code they're reading. Generics and traits are critical to being able to accomplish this in Rust.
Lastly, specifying how long a reference lives using function parameters and return values is incredibly useful, though likely new to most developers as this feature of the Rust language is somewhat unique versus other languages.
Put simply, reference lifetimes are the scope in which a reference lives. And as an amazing feature, Rust allows developers to define how long references to some addresses in memory live. And when there's ambiguity, Rust actually requires developers to annotate reference lifetimes.
Remember when we claimed Rust was made for safety? This is one of its defining features in achieving that objective by preventing dangling references. Let's dig into an example of a dangler.
// will not compile
fn main() {
let first;
{
let first_second = "Hello";
first = &first_second;
}
println!("{}, world!", first);
}
You should see an error in your editor that says something similar to the following, when hovering over the first = &first_second;
line:
`first_second` does not live long enough
borrowed value does not live long enough (rustc E0597)
──────────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0597
And the compiler is even more helpful when hovering over the last bracket defining the inner scope, which tells us exactly what's going on.
`first_second` dropped here while still borrowed (rustc E0597)
──────────────────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0597
Ultimately the code won't compile because we attempt to use, in the outer scope, a reference to a memory location that only lives in the inner scope.
Fixing this is simple; remove the inner scope.
fn main() {
let first;
let first_second = "Hello";
first = &first_second;
println!("{}, world!", first);
}
Annotating lifetimes simply tells the Rust compiler how multiple references should interact with one another.
The syntax for annotations starts with an apostrophe ' and the names are by convention very short, e.g. 'a. You can use them in function signatures, on both parameter and return types.
For example, the function below takes two vectors of integers and compares which has the greater sum over all their elements.
fn bigger_sum<'a>(first: &'a Vec<i32>, second: &'a Vec<i32>) -> &'a Vec<i32> {
let sum_first: i32 = first.iter().sum();
let sum_second: i32 = second.iter().sum();
if sum_first > sum_second {
first
} else {
second
}
}
The function above says a few things to the compiler. It says that for some lifetime 'a, defined by <'a>:
- the parameters first and second each must live at least as long as 'a, and
- the reference returned from the function will live at least as long as 'a.
Using our function from above, let's explore how this can be used and the types of errors to expect when misused.
fn main() {
let first: Vec<i32> = vec![1, 2, 3, 4];
let second: Vec<i32> = vec![-1, 2, 3, 4];
bigger_sum(&first, &second);
}
The above works without a hitch, and it is obvious that it does; after all, first
and second
clearly have the same lifespan.
fn main() {
let second: Vec<i32> = vec![-1, 2, 3, 4];
{
let third: Vec<i32> = vec![2, 3, 4, 5];
bigger_sum(&second, &third);
}
}
In the above we clearly have different lifetimes, i.e. third
has an inner scope that is clearly smaller than second
. The compiler substitutes the smaller of the lifetimes necessary into 'a
.
Now let's cause some breakage.
fn main() {
let second: Vec<i32> = vec![-1, 2, 3, 4];
let should_be_second: &Vec<i32>;
{
let forth: Vec<i32> = vec![2, 3, 4];
should_be_second = bigger_sum(&second, &forth);
}
println!("{:?}", should_be_second);
}
In the function above, compiler errors start on the declaration of forth
:
binding `forth` declared here (rustc E0597)
────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0597
They happen again on the usage of &forth
in the call to bigger_sum
:
`forth` does not live long enough
borrowed value does not live long enough (rustc E0597)
──────────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0597
And lastly the compiler warns us again on the inner-scoped bracket close:
`forth` dropped here while still borrowed (rustc E0597)
───────────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0597
Functions cannot return references with lifetimes that have nothing to do with the parameter lifetimes. The below will fail miserably:
fn failing_lifetime_function<'a>(x: &i32) -> &'a i32 {
let result: i32 = 42;
&result
}
Your compiler should complain about the &result reference: result is a local variable that is dropped when the function returns, so the reference has nothing to do with the lifetime of parameter x and would dangle.
cannot return reference to local variable `result`
returns a reference to data owned by the current function (rustc E0515)
───────────────────────────────────────────────────────────────────────
https://doc.rust-lang.org/error-index.html#E0515
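Two common ways out of this error, sketched below with hypothetical function names: return an owned value, or return a reference that actually comes from one of the parameters.

```rust
/// Option 1: return an owned value, so no reference (and no lifetime) is involved.
fn answer_owned(_x: &i32) -> i32 {
    let result: i32 = 42;
    result
}

/// Option 2: return a reference derived from the parameters, so the output
/// lifetime is tied to the inputs.
fn bigger_of<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x > y {
        x
    } else {
        y
    }
}

fn main() {
    let x = 7;
    let y = 3;
    println!("{}", answer_owned(&x));
    println!("{}", bigger_of(&x, &y));
}
```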
Lifetime annotations on function or method parameters are called input lifetimes and those on return values output lifetimes. Importantly, a developer does not always have to write lifetimes because Rust performs lifetime elision for common, predictable patterns.
The rules Rust uses are as follows:
- input lifetimes - each reference parameter gets its own lifetime, e.g. fn ltime(x: &i32, y: &str) becomes fn ltime<'a, 'b>(x: &'a i32, y: &'b str).
- output lifetimes - if there is exactly one input lifetime, it is assigned to every output lifetime, e.g. fn ltime(x: &i32) -> &i32 becomes fn ltime<'a>(x: &'a i32) -> &'a i32.
- output lifetimes - if one input is &self or &mut self, the lifetime of self is assigned to every output lifetime. If you sit and consider this rule, it makes a lot of sense: a method working on &self usually returns something borrowed from self, so whatever it returns a reference to is assumed to live as long as the reference to self.
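The elision rules are easier to believe when you see them compile. In this sketch (all names are hypothetical), first_word and first_word_explicit mean exactly the same thing to the compiler, and the Ticker method shows the &self rule: the result borrows from self, not from the other parameter.

```rust
// Rule 2: one input lifetime, so the elided and explicit forms are equivalent.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

struct Ticker {
    symbol: String,
}

impl Ticker {
    // Rule 3: the returned reference borrows from `&self`, not from `_fallback`.
    fn symbol_or(&self, _fallback: &str) -> &str {
        &self.symbol
    }
}

fn main() {
    let ticker = Ticker { symbol: String::from("AAPL") };
    let symbol;
    {
        let fallback = String::from("n/a");
        symbol = ticker.symbol_or(&fallback);
    } // `fallback` is dropped here, but `symbol` only borrows from `ticker`
    println!("{symbol}");
    println!("{}", first_word("hello world"));
    println!("{}", first_word_explicit("hello world"));
}
```

If the compiler had tied the return value to _fallback instead, main would fail with the same "does not live long enough" error we saw earlier.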
Generic types allow us to build structs, enums and function signatures that take multiple concrete types, greatly reducing code boilerplate, helping developers adhere to the DRY - don't repeat yourself - paradigm.
This makes code safer, easier to understand, maintain and debug.
Naming a type parameter in Rust is flexible; it can be anything you want. That said, by convention we start with T
. Let's see this in action
with our bigger_sum
function from the earlier section on reference lifetimes.
In particular, let's refactor that function to take Vectors of T
. Here's what it looked like before.
fn bigger_sum<'a>(first: &'a Vec<i32>, second: &'a Vec<i32>) -> &'a Vec<i32> {
// find sums of each
let sum_first: i32 = first.iter().sum();
let sum_second: i32 = second.iter().sum();
// return vector with larger sum
if sum_first > sum_second {
println!("The first vector has a larger sum: {}", sum_first);
first
} else {
println!("The second vector has a larger sum: {}", sum_second);
second
}
}
Now we're going to add some traits, which we'll cover in detail over the next section, and refactor one line to make this bad boy work with pretty much any non-floating point number.
fn bigger_sum<'a, T>(first: &'a Vec<T>, second: &'a Vec<T>) -> &'a Vec<T>
where
T: 'a + std::iter::Sum<&'a T> + std::cmp::PartialOrd + std::fmt::Display,
{
// find sums of each
let sum_first: T = first.iter().sum();
let sum_second: T = second.iter().sum();
// return vector with larger sum
if sum_first > sum_second {
println!("The first vector has a larger sum: {}", sum_first);
first
} else {
println!("The second vector has a larger sum: {}", sum_second);
second
}
}
What we need to focus on here is the T. The first and second input parameters are now references to a Vec<T>, the sum_first and sum_second local variables are of type T, and we return a reference to a Vec<T>.
We did this by declaring T in the signature, after the reference lifetime 'a and separated from it by a comma, and then replacing i32 with T in the parameters, the return type and the function body. If you're familiar with C++ or other languages that use generics, this syntax should be somewhat familiar to you.
In addition to the now generic types, we added bounds that restrict T to types with certain capabilities. Those bounds are listed after T: in the where clause:
- 'a (a lifetime bound rather than a trait: any references inside T must live at least as long as 'a)
- std::iter::Sum<&'a T>
- std::cmp::PartialOrd
- std::fmt::Display
In the next section we'll dig into these in more detail.
Now we can do things like the below, where we compare Vectors that store types like u16
, usize
and i32
from before.
We can use any type that satisfies all of the bounds above, not just the std::cmp::PartialOrd trait!
fn main() {
let first: Vec<u16> = vec![1, 2, 3, 4];
let second: Vec<u16> = vec![1, 2, 3, 4, 5];
bigger_sum(&first, &second);
{
let third: Vec<u16> = vec![2, 3, 4, 5];
bigger_sum(&second, &third);
}
{
let third: Vec<usize> = vec![2, 3, 4, 5];
let forth: Vec<usize> = vec![2, 3, 4, 5];
bigger_sum(&third, &forth);
}
}
Structs in Rust can use generic types so that they are more flexible. Let's say we're making an interactive video game ad for a draft beer restaurant.
First, for funsies, let's define our types of beer:
enum BeerType {
IPA,
Kolsch,
Lager,
Stout,
Sour,
}
Not exhaustive but that lineup should please pretty much any beer lover.
Next we'll use this in a struct to model a pint glass, a.k.a. the thing you drink beer from.
struct PintGlass<T> where T: std::cmp::PartialOrd {
beer: BeerType,
price: T,
is_empty: bool,
}
We've used the generic T
to allow the instantiator of the PintGlass
struct flexibility in using pretty much any integer type to model the price.
If you've ever built accounting paradigms into software, you know it's a bad idea to use floating point numbers for prices. Watch this famous movie and consider why.
Now we can use it like so, allowing for even the strangest pricing models. Let's assume the below drinks are from a place called "VC Bar".
fn main() {
let first_pint = PintGlass {
beer: BeerType::IPA,
price: 5,
is_empty: true,
};
let second_pint = PintGlass {
beer: BeerType::Stout,
price: 6,
is_empty: true,
};
// there's a deal with the third pint that the restaurant pays the customer
// 1 unit of currency
let third_pint = PintGlass {
beer: BeerType::Kolsch,
price: -1,
is_empty: true,
};
// then because the customer is drunk they double charge them
// let's call this establishment "VC Bar"
let forth_pint = PintGlass {
beer: BeerType::Lager,
price: 12,
is_empty: false,
};
}
Assuming the above is modeling one individual sitting at this miserably misleading establishment, the PintGlass
struct proves shockingly flexible, thanks to generic types.
Now we can add a method or two to make the PintGlass
struct even more powerful.
Let's add a set_to_empty
method on the struct.
impl<T> PintGlass<T>
where
T: std::cmp::PartialOrd,
{
fn set_to_empty(&mut self) {
self.is_empty = true;
}
}
Notice how we need to add the generic type to the impl<T> and restate the trait bounds. Now we can use it like so (slightly modifying the above to make forth_pint mutable).
fn main() {
forth_pint.set_to_empty();
// though shady, the business model obviously works
let fifth_pint = PintGlass {
beer: BeerType::IPA,
price: 12,
is_empty: false,
};
}
Now we're able to fully capture a misleading business model that makes a ton of money while endangering its customers, while maintaining code safety ourselves.
Isn't that why you're learning Rust?
Now we can finally dive into how to restrict generic types so that they're actually useful. Traits in Rust simply outline what things a type can do - functionality specific to a type.
We saw this when we implemented the PintGlass
struct and specified the std::cmp::PartialOrd
for type T
.
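Bounds like this are what let generic code rely on a capability of T. In the snippet below the prices are concretely i32, so we can do things such as the following.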
fn main() {
let pints = vec![first_pint, second_pint, third_pint, forth_pint, fifth_pint];
let mut total_sales: i32 = 0;
for pint in pints.iter() {
total_sales += pint.price;
}
println!("The customer has paid {} to get black out drunk", total_sales);
}
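The loop above only works because the prices happen to be i32. To total prices for any price type, the function itself needs trait bounds that say T can be copied and summed. A sketch, with PintGlass trimmed down to just the price field (my simplification) and total_sales as a hypothetical helper:

```rust
use std::iter::Sum;

// Trimmed-down version of the struct from the text: only the price matters here.
struct PintGlass<T> {
    price: T,
}

/// Totals the prices of any pint glasses whose price type can be copied and summed.
fn total_sales<T>(pints: &[PintGlass<T>]) -> T
where
    T: Copy + Sum<T>,
{
    pints.iter().map(|pint| pint.price).sum()
}

fn main() {
    let pints = [
        PintGlass { price: 5 },
        PintGlass { price: 6 },
        PintGlass { price: -1 },
        PintGlass { price: 12 },
        PintGlass { price: 12 },
    ];
    println!("The customer has paid {}", total_sales(&pints));
}
```

Without the Copy + Sum<T> bounds the body would not compile, because the compiler can't assume an arbitrary T knows how to be added up.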
Now thus far, we've only used traits, not defined them. Let's do the latter now by adding a trait Display
that will define a print
method so that we can output the contents of a PintGlass
.
Based on our earlier example, maybe it should be named "puke"!
To implement a trait you need to first define it and then implement it for the struct you want.
Define it like this.
pub trait Display {
fn print(&self);
}
Then we'll implement it using an impl ... for block, for example.
impl<T> Display for PintGlass<T>
where
T: std::cmp::PartialOrd + std::fmt::Display,
{
fn print(&self) {
println!(
"Beer {:?}, price {}, is empty? {}",
self.beer, self.price, self.is_empty
);
}
}
// add the Debug trait to your BeerType so it can be printed...
#[derive(Debug)]
enum BeerType {
...
}
Now you can use the print
method in calling code.
fn main() {
pints[4].print();
}
Now if we want to also make a WineGlass
struct, we can add a Display
trait to each type and use it with the print
method, so that each type has its own way of printing how we'd like to represent that particular struct.
Note: In real life, we'd want to implement the standard library
Display
trait, not our own!
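For reference, implementing the standard library's std::fmt::Display might look like the sketch below. The struct is repeated without the PartialOrd bound and the enum is trimmed to two variants, both to keep the example short; those are my simplifications.

```rust
use std::fmt;

#[derive(Debug)]
enum BeerType {
    IPA,
    Stout,
}

struct PintGlass<T> {
    beer: BeerType,
    price: T,
    is_empty: bool,
}

// Implementing the standard trait gives us `{}` formatting and `to_string()` for free.
impl<T: fmt::Display> fmt::Display for PintGlass<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "Beer {:?}, price {}, is empty? {}",
            self.beer, self.price, self.is_empty
        )
    }
}

fn main() {
    let pint = PintGlass { beer: BeerType::IPA, price: 12, is_empty: false };
    let stout = PintGlass { beer: BeerType::Stout, price: 6, is_empty: true };
    println!("{pint}");
    println!("{stout}");
}
```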
We probably want to have a basic default that at least says what struct we're printing. Here's how to do that, by adding to the trait definition.
pub trait Display {
fn print(&self) {
println!("Some type of glass");
}
}
Now every type that implements the Display trait without providing its own print method gets this default implementation.
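Here's a sketch of a default method next to an override. To make the result easy to check, I've used a hypothetical trait Describe whose method returns a String instead of printing; the mechanics are the same as for the print method above.

```rust
/// A stand-in for the custom trait in the text; it returns a String
/// instead of printing so the result is easy to inspect.
trait Describe {
    fn describe(&self) -> String {
        String::from("Some type of glass")
    }
}

struct WineGlass;

struct PintGlass {
    price: i32,
}

// Empty impl block: WineGlass falls back to the default method.
impl Describe for WineGlass {}

// PintGlass overrides the default with its own description.
impl Describe for PintGlass {
    fn describe(&self) -> String {
        format!("Pint glass, price {}", self.price)
    }
}

fn main() {
    println!("{}", WineGlass.describe());
    println!("{}", PintGlass { price: 12 }.describe());
}
```

Note that the type still needs an impl block, even an empty one, to opt in to the trait.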
We saw earlier in our implementation of the PintGlass
struct the usage of a where
clause, which is how we specify Trait Bounds in Rust. These allow us to specify the types and their traits that are allowed.
In particular, only those types with the implemented traits are allowed, when specified by the trait bound.
struct PintGlass<T>
where
T: std::cmp::PartialOrd,
{
beer: BeerType,
price: T,
is_empty: bool,
}
In the PintGlass
struct, type T
must have the std::cmp::PartialOrd
trait.
We can also specify a trait bound for a return type; however, we'll cover this in more detail in a later chapter.
For now, returning a simple type looks like the below, for example.
fn return_something_with_display_trait() -> impl Display {
PintGlass {
beer: BeerType::IPA,
price: 12,
is_empty: false,
}
}
Testing is probably the most important task in which a programmer engages during the software crafting process.
Though it is certainly no silver bullet for guaranteeing software quality, it is a fantastic way to show the presence of bugs for known/expected behavior. In addition, modern tooling can automatically display test coverage ratios in nearly any language.
Unit testing in Rust is quite straightforward. One thing that may be somewhat foreign from other languages is that test code is typically included alongside the module function code in Rust.
In particular, Rust uses the attribute #[cfg(test)], which tells the compiler to compile the code marked beneath it only when building tests. This way, test and functionality code are kept close to one another, but the test code is not included in a normal build.
Note: It's typical to use that test marker with test setup code and actual testing code.
Because Rust language design promotes the use of separation of concerns, you'll write most of your unit tests in src/lib.rs
files.
So let's write some trivial code for testing purposes, make it fail and then make it pass. Welcome to the wheel of development.
Let's add the test first.
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_multiply() {
assert_eq!(multiply(&2, &2), 4);
}
}
This won't work yet because we don't have the multiply
function even declared.
// src/lib.rs
pub fn multiply(lhs: &i32, rhs: &i32) -> i32 {
1
}
Now run the tests with cargo test
. Bask in that failure. And now let's clean it up with a correct implementation for multiply
.
// src/lib.rs
pub fn multiply(lhs: &i32, rhs: &i32) -> i32 {
lhs * rhs
}
Now run those tests again and bask in your passing glory. Your output should look something like the following.
❯ cargo test
Finished test [unoptimized + debuginfo] target(s) in 0.00s
Running unittests src/main.rs (target/debug/deps/test_examples-b2e0b55724fa4029)
running 1 test
test tests::test_multiply ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Though it's not recommended, because your functionality should live in another module and be tested there, you can certainly test code living in the main
function.
fn multiply(lhs: &i32, rhs: &i32) -> i32 {
lhs * rhs
}
fn main() {
let result = multiply(&2, &2);
println!("result: {}", result);
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_multiply() {
assert_eq!(multiply(&2, &2), 4);
}
}
Run it with cargo test
and watch that glorious passage.
Rust has many built-in tools you can and should use for organizing and communicating information from your test runs. It's common to have a CI/CD step in your build and/or development process that runs these, and communicating what went wrong, when and where is critical. Rust makes this simple.
Since your tests are just Rust code, a test fails whenever it panics. When the behavior you want to verify is the panic itself, as in, you expect the code to panic, all you have to do is mark the test with #[should_panic].
#[cfg(test)]
mod tests {
#[test]
#[should_panic]
fn ima_panic() {
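panic!("This panic is expected, so the test passes!");
}
}
Running this reports the test as passing, because it panicked as expected. Without #[should_panic] the same test would be reported as FAILED; either way, the remaining tests still run.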
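A plain #[should_panic] passes on any panic, even the wrong one. The attribute also accepts an expected substring so the test only passes when the panic message matches. A sketch, where checked_budget is a hypothetical function borrowing the $50 minimum from the ad example in the errors chapter:

```rust
/// Returns the budget unchanged, panicking when it is below the minimum.
/// Hypothetical helper; the $50 minimum mirrors the earlier VideoAd example.
fn checked_budget(budget: u32) -> u32 {
    if budget < 50 {
        panic!("The budget must be at least $50 USD! Got {}", budget);
    }
    budget
}

fn main() {
    println!("{}", checked_budget(75));
}

#[cfg(test)]
mod budget_tests {
    use super::*;

    #[test]
    #[should_panic(expected = "at least $50")]
    fn rejects_small_budget() {
        // Passes only if this panics with a message containing the expected text.
        checked_budget(10);
    }

    #[test]
    fn accepts_minimum_budget() {
        assert_eq!(checked_budget(50), 50);
    }
}
```

Run it with cargo test; both tests should be reported as ok.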
Other macros we can use include assert!
, which checks truthiness, assert_ne!
, which checks if something isn't equal, and of course, our already used assert_eq!
macro which checks if something is equal.
Let's see how we might use each of these.
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn this_is_not_equal() { // this will pass
assert_ne!(multiply(&2, &2), 5);
}
#[test]
fn this_is_true() { // this will pass
assert!(multiply(&2, &2) == 4);
}
}
One of the issues large code bases face is testing complexity. When something fails we need to communicate what was supposed to happen and why it failed.
Rust has a convenient ability to do that with its built-in test infrastructure.
#[test]
fn this_is_not_equal() { // this will pass
assert_ne!(multiply(&2, &2), 5, "2 * 2 != 5");
}
#[test]
fn this_is_not_true() { // this will not pass
assert!(multiply(&2, &2) == 5, "2 * 2 != 5");
}
Now when we run the above with cargo test
we get output that says what went on:
failures:
---- tests::this_is_not_true stdout ----
thread 'tests::this_is_not_true' panicked at src/lib.rs:29:9:
2 * 2 != 5
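The custom message is a format string, so it can carry the actual values involved in the failure. A small sketch of that, assuming the same multiply function from earlier (repeated here, and marked pub, so the example stands alone); the module and test names are made up for illustration.

```rust
pub fn multiply(lhs: &i32, rhs: &i32) -> i32 {
    lhs * rhs
}

#[cfg(test)]
mod message_examples {
    use super::*;

    #[test]
    fn multiply_reports_its_inputs() {
        let (lhs, rhs) = (3, 7);
        let actual = multiply(&lhs, &rhs);
        // Everything after the two compared values is a format string
        // plus its arguments, printed only when the assertion fails.
        assert_eq!(actual, 21, "multiply({}, {}) returned {}", lhs, rhs, actual);
    }

    #[test]
    fn multiply_by_zero_is_zero() {
        for n in [-2, 0, 5] {
            // Naming the loop value tells us which iteration broke.
            assert_eq!(multiply(&n, &0), 0, "failed for n = {}", n);
        }
    }
}
```

This pays off most in loops and table-driven tests, where the line number alone doesn't say which input caused the failure.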
Testing Rust code that returns a Result<T, E>
is extremely convenient, because a test function can itself return a Result. If it returns an Err
value, for example one propagated with the ? operator, the test fails. Let's see this in action.
fn a_function_using_result() -> Result<bool, String> {
Ok(true)
}
#[test]
fn test_using_result() -> Result<(), String> {
let ran_successfully = a_function_using_result()?; // Err value will fail here
if ran_successfully {
Ok(())
} else {
Err("Something terrible happened and we got an error. Spam your own phone number, developer, until someone picks up. Yell at them.".into())
}
}
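The same idea works with real error types from the standard library. Below is a hedged sketch using a hypothetical multiply_strs helper (not part of the code above) that parses two strings and multiplies them, so the ? operator has an actual ParseIntError to propagate.

```rust
use std::num::ParseIntError;

/// Hypothetical helper for illustration: parses two decimal strings
/// and multiplies the resulting integers.
pub fn multiply_strs(lhs: &str, rhs: &str) -> Result<i32, ParseIntError> {
    let lhs: i32 = lhs.trim().parse()?;
    let rhs: i32 = rhs.trim().parse()?;
    Ok(lhs * rhs)
}

#[cfg(test)]
mod result_examples {
    use super::*;

    // Any Err propagated by `?` is returned from the test, which fails it.
    #[test]
    fn parses_and_multiplies() -> Result<(), ParseIntError> {
        let product = multiply_strs("6", " 7 ")?;
        assert_eq!(product, 42);
        Ok(())
    }

    // To check that an error DOES occur, assert on the Result directly
    // instead of using `?`.
    #[test]
    fn rejects_non_numeric_input() {
        assert!(multiply_strs("six", "7").is_err());
    }
}
```

One thing to keep in mind: #[should_panic] can't be combined with a test that returns a Result, so asserting on is_err() is the way to test the failure path.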
Integration tests should test how the pieces of your code interact, and they live outside your library code. Normally, we use a tests
directory for these at the same level as your src
directory. Cargo looks for this when you run the test command.
Ideally you want to mimic how an external client or developer will use this code when writing integration tests.
Let's make that directory and add some integration testing code to it, which will be exactly the same as our unit test code at this point.
mkdir tests
// tests/integration.rs
use test_examples;
#[test]
fn it_multiplies_two_numbers() {
assert_eq!(test_examples::multiply(&2, &2), 4);
}
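Notice how we pull in test_examples
, the name of this crate? This is necessary because each file inside the tests
directory is compiled as its own crate, so it can only reach the library's public API. That means multiply has to be declared pub in src/lib.rs for this test to compile. Secondly, we don't need the cfg
configuration flag because Cargo treats this as a known directory for integration tests and only compiles it when you run cargo test.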
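For reference, here is a rough sketch of what src/lib.rs could look like so the integration test can see multiply. The square and cube functions are hypothetical additions, included only to show the difference between public and private items.

```rust
// src/lib.rs (sketch)

// `pub` makes multiply part of the crate's public API, so a file under
// tests/ can call it as test_examples::multiply.
pub fn multiply(lhs: &i32, rhs: &i32) -> i32 {
    lhs * rhs
}

// Private helper (hypothetical): reachable from unit tests in this file
// through `use super::*`, but invisible to anything under tests/.
fn square(n: &i32) -> i32 {
    multiply(n, n)
}

// Public function (hypothetical) built on the private helper.
pub fn cube(n: &i32) -> i32 {
    multiply(&square(n), n)
}

#[cfg(test)]
mod tests {
    use super::*;

    // Unit tests may exercise private items directly.
    #[test]
    fn test_square() {
        assert_eq!(square(&3), 9);
    }

    #[test]
    fn test_cube() {
        assert_eq!(cube(&3), 27);
    }
}
```

An integration test could call multiply and cube here, but trying to call square from tests/integration.rs would be a compile error, which is exactly the point: integration tests see what your users see.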
Let's run this and see what happens.
❯ cargo test
Finished test [unoptimized + debuginfo] target(s) in 0.00s
Running unittests src/lib.rs (target/debug/deps/test_examples-8cbec1295e96061c)
running 1 test
test tests::test_multiply ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running unittests src/main.rs (target/debug/deps/test_examples-0965484333ee197a)
running 1 test
test tests::test_multiply ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running tests/integration.rs (target/debug/deps/integration-90f1e0a063430ba2)
running 1 test
test it_multiplies_two_numbers ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Doc-tests test_examples
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Passing like a cop with flatulence.