2025-10-17, 2025-10-18
Exploring Rust Traits
Ferris the mascot crab is here to help, hopefully, 'cause I'm gonna need it.
Welcome to my first blog in my Rust Rabbit Holes series! In these I'll just try implementing something relatively simple and then pull on the various threads to see where they lead.
I'm a Rust noob. I've dabbled for a few years and written a small-but-non-trivial project, but I'm far, far from expert.
So I'm going to just mess around with the language a bit and see what it takes to do what. And see what I learn along the way. And if anything is wrong in this blog, it's because I'm human. 😊 Corrections and additional comments welcome, of course.
For this exploration, I had in mind to just compare buffered and unbuffered file I/O. But then it got into traits a bit. And I ran with that.
Reader skill level: beginner, like me.
Traits
If you aren't familiar with the concept of Rust traits, you can very loosely think of them as "interfaces" or "abstract classes" in other languages. They're a description of methods you must have in order to implement a particular trait.
For example, to implement the Read
trait in Rust, your
struct
must implement a read()
function with the proper signature.
There are plenty more details in the Rust Book.
Buffered and Unbuffered Reads
At a low level, the OS typically exposes an interface that just lets you
synchronously read a chunk of bytes into memory. On Unix-likes, this is
the read()
syscall.
This is inconvenient for a number of reasons, two of which are:
-
Each syscall is expensive, and if you do a lot of tiny reads, that's overhead. It's better to do one big actual OS read, and then all your smaller reads can take place in user space without that overhead.
-
The OS doesn't allow you to do fun things like "read a single line of data" because its interface is so simplistic. You'd have to micromanage that yourself. A buffered I/O system, since it already grabbed a big chunk of input, can scan ahead for newlines and return single lines.
In C, the stdio.h
I/O functions are all typically buffered. If you
want unbuffered, you either tell the C I/O subsystem to go unbuffered,
or you can call the OS directly.
In Rust, both modes are present in the standard library. In the modern style, you first open a file in unbuffered mode, and then you wrap that up in some buffering functionality if you want that.
Unbuffered Reads in Rust
Rust does unbuffered I/O with its struct File
type. It
implements the
Read
trait which
gives us this method, among others:
fn read(&mut self, buf: &mut [u8]) -> Result<usize>;
That's a hard parse if you're new to Rust, but the gist is that you're
going to pass in a mutable reference to an array of bytes in buf
(mutable so that read()
can put bytes in it!) and it returns a
Result
object telling you the number of bytes that have been read.
(Incidentally, this is exactly how Unix read()
syscall works.)
So after you call read()
on your file object, hopefully that buf
parameter gets filled up with some bytes and the returned Result
holds
the number of bytes that we read.
Here's a bit of a demo that does this:
use std::fs::File;
use std::error::Error;
use std::io::Read;
fn unbuffered_demo() -> Result<(), Box<dyn Error>> {
const BUFSIZE: usize = 128;
// Open the file
let mut file = File::open("input.txt")?;
// Buffer to read bytes into
let mut buf = [0u8; BUFSIZE];
// Call file.read repeatedly, trying to get BUFSIZE bytes
loop {
let bytes_read = file.read(&mut buf)?;
// Check for end-of-file
if bytes_read == 0 {
break;
}
// Otherwise, convert bytes to a string
let str_val = std::str::from_utf8(&buf[..bytes_read])?;
print!("{}", str_val);
}
Ok(())
}
Interesting that we had to bring in the std::io::Read
trait even
though it doesn't appear in our code. It's because File
implements it
that we have to do that, apparently.
Also check out that std::str::from_utf8()
call. I'm assuming the bytes
in that files are UTF-8 encoded
(they are in my input.txt
). I pass in &buf[..bytes_read]
which is a
slice of the buffer just composed of how ever many bytes I read.
Hopefully the number of bytes is BUFSIZE
, but the file is unlikely to
be an exact multiple of that so the last read is short.
When there is a short read, values in the [u8]
array past the end of
the read just have whatever was in there from the previous read. So we
slice those off.
Finally, the function return type is interesting:
fn unbuffered_demo() -> Result<(), Box<dyn Error>> {
That's the classic Result
, but what's the rest of it? My function
returns Ok(())
in the good case. But we have all those ?
operators
all over the place. Those basically say, "Unwrap the return value from
this function. If it's good, assign it to the variable on the left. If
it's bad, return with the error."
But all those functions with ?
after technically return different
types, so how do we handle them all?
Well, all of those error types implement the std::error::Error
trait.
So we can treat them all like Error
. Except that they aren't the same
types, and we won't know which exact type was returned until runtime, so
we have to use dynamic
dispatch when returning
it—that's the dyn
keyword you see.
Not only that, but we don't know how big the particular runtime error
object will be, so it has to be Box
ed. Rust needs to know the size of
everything at compile time, and a Box
is basically just a pointer to
something on the
heap, and
pointers are fixed size.
I was also wondering how the called functions know to Box
their errors
for our function because it seems like normally they wouldn't.
Apparently this happens through the magic of the
From
trait,
but I didn't pursue that rabbit hole this time.
Boxing errors like this is not the only way to handle multiple error types, and arguably isn't the best way (whatever that means). Rust by Example has more information.
OK, so that's unbuffered. How do we get buffered I/O?
Buffered Reads in Rust
In order to get buffered reads, we need to wrap our File
up in a
BufReader
.
The short of it looks like this:
// Open the file
let file = File::open("input.txt")?;
// Make a new buffered reader on the file
let mut reader = BufReader::new(file);
And now we have a reader
object that has all kinds of higher-level
BufReader
methods
on it. We can now use these to read the data instead of that
lower-level read()
method. In particular, it implements the BufRead
trait that gives
you more (and easier) control over what you read.
Notably, it has this method:
fn read_line(&mut self, buf: &mut String) -> Result<usize> { ... }
That will read a line at a time! That's fun! Let's write some code that
does it. I'll just copy the unbuffered code and modify it to use the
BufReader
for reading, instead:
use std::fs::File;
use std::error::Error;
use std::io::{BufRead, BufReader, Read};
fn buffered_demo() -> Result<(), Box<dyn Error>> {
// Open the file
let file = File::open("input.txt")?;
// Make a new buffered reader on the file
let mut reader = BufReader::new(file);
// Storage for the line we'll read
let mut line = String::new();
loop {
// Read a line!
let bytes_read = reader.read_line(&mut line)?;
// Check for end-of-file
if bytes_read == 0 {
break;
}
print!("{}", line);
line.clear();
}
Ok(())
}
Looks awfully similar to what we had before. There are a couple interesting things in this buffered version, though.
We're not reading into a byte array, but instead into a
String
.
This is nice because String
s are basically
Vec
s under the
hood so they grow and shrink as needed. No more fixed-size buffer to
manage!
In the unbuffered version, the File
variable had to be mut
, and here
it doesn't. That's because in the unbuffered version we were calling
file.read()
, which does a mutable borrow &mut self
as we see in its
signature.
In this version, we simply pass ownership of the file to the
BufReader
, and it presumably tucks it away in its struct
somewhere.
And then when we call reader.read_line()
is does its own &mut self
borrow and then then calls its file.read()
internally. I think.
We do have that line.clear()
bit at the end. Turns out read_line()
appends data on the end of the line
variable. So if we want to print
them one at a time, we actually have to clear the string so it's empty
for the next line.
I found myself wondering if that end-of-file check could be rolled into
a while let
similar to the idiomatic C while loop:
// C code
while ((bytes_read = read_line(line)) > 0) {
puts(line);
}
So in Rust that would be something like:
// INVALID Rust code
while (let bytes_read = reader.read_line(&mut line)?) > 0 {
print!("{}", line);
line.clear();
}
But that won't build due to the way let
in this instance is actually
doing pattern matching to look for an Ok()
. I speculate.
So are we stuck in terms of making our buffered reader prettier? Not quite!
There are a few ways to do this beyond what I thought of. You can
while
loop on a block (which I didn't realize was possible):while { let bytes_read = reader.read_line(&mut line)?; bytes_read > 0 } { print!("{line}"); line.clear(); }
[HT Joe at OSU-Cascades]
Digging around the BufRead
trait, we find a method called
lines()
.
This method returns an iterator that we can use to loop over the lines
of the file, and it'll quit when done!
fn buffered_demo_iter() -> Result<(), Box<dyn Error>> {
// Open the file
let file = File::open("input.txt")?;
// Make a new buffered reader on the file
let reader = BufReader::new(file);
for line_result in reader.lines() {
println!("{}", line_result?);
}
Ok(())
}
Much cleaner!
One difference is that we're getting a Result
type back in
line_result
from the for
loop. So we have to unwrap that, which I do
in the println!()
.
And it's println!()
this time, instead of print!()
. Apparently the
trailing newline is stripped off when using the lines()
iterator.
Here's another option, since
reader.lines()
is an iterator. We can just repeatedly call.next()
to extract lines:let mut lines = reader.lines(); while let Some(line) = lines.next() { println!("{}", line?); }
[HT Joe at OSU-Cascades]
Now—wait a moment—nothing is mut
in our for
-loop variant! But
clearly at some deep level the file
has to be mut
to read()
out of
it. Let's dig. What's reader.lines()
do?
fn lines(self) -> Lines<Self>
Since it's just self
, it's actually taking ownership of the
reader
BufReader
. Which means we can't use the reader
after the
loop. In this case that's fine.
But if we wanted to use reader
after the loop, we'd have to change the
call to .lines()
to be against a mutable borrow of reader
(and it
would have to be mut
.
let mut reader = BufReader::new(file);
for line_result in (&mut reader).lines() {
That works and we can then use reader
after the loop is complete if we
want.
lines()
is still taking ownership, but it's taking ownership of a
mutable reference to the reader
. So we could still use reader
later.
(Am I saying this correctly?)
However, this does not work:
let reader = BufReader::new(file);
for line_result in (&reader).lines() {
Why not?
I made the following toy program to mess with it where I implemented traits for various borrows:
struct Bar {
x: usize,
}
impl Foo for Bar { // Ownership
fn foo(self) {
println!("{}", self.x)
}
}
impl Foo for &Bar { // Borrow
fn foo(self) {
println!("{}", self.x)
}
}
impl Foo for &mut Bar { // Mutable borrow
fn foo(self) {
println!("{}", self.x)
}
}
trait Foo {
fn foo(self); // Ownership of self
}
fn main() {
let mut b = Bar { x: 12 };
(&mut b).foo(); // works
(&b).foo(); // works
b.foo(); // works
//b.foo(); // error (b is moved by the previous call)
}
And that works as long as I have impl Foo
for the various forms of
Bar
. So that means we need to have an impl BufRead
for all the
supported types, and apparently &BufReader
isn't one of them...?
Looking in the
BufRead
implementors section of the docs, we see the following (among others that aren't pertinent):impl<R: ?Sized + Read> BufRead for BufReader<R> // 1 impl<B: BufRead + ?Sized> BufRead for &mut B // 2
So it's defined for:
- Any
BufReader
that implementsRead
. That's myreader.lines()
.- Any
&mut
to a thing that implementsBufRead
. That's my(&mut reader).lines()
.But that's it. Nothing in there that implements it for a
&BufReader
. So(&reader).lines()
doesn't work.[HT Brian May]
There might be more to it than this, but that's my new mental model so far.
Implementing the Read
Trait
If we look at what BufReader::new()
takes, we see this:
impl<R: Read> BufReader<R>
pub fn new(inner: R) -> BufReader<R>
So that first line is saying "In the implementation of BufReader
for
any type R
as long as R
implements the Read
trait..."
And the second is saying "We'll have a function new
that takes an
argument of type R
(again R
must implement the Read
trait) and
returns a new BufReader
of that type."
It's generic! We can use any type to create our BufReader
as long as
it implements the Read
trait.
And I have one! For this, we'll have a DataProvider
object that has an
array of bytes and implements the Read
trait. If it implements that,
it will be good to use with BufReader
.
struct DataProvider {
data: &'static [u8],
offset: usize,
}
impl DataProvider {
fn new(data: &'static [u8]) -> DataProvider {
DataProvider {
data,
offset: 0,
}
}
}
All it's going to do it keep track of the current offset and copy bytes
out when its read()
method is called. We just have to make sure we
match the function signature in the Read
trait.
impl Read for DataProvider {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
let start = self.offset;
let mut end = self.offset + buf.len();
let maximum = self.data.len();
if end > maximum {
end = maximum;
}
let bytes_copied = end - start;
self.offset += bytes_copied;
let src_bytes = &self.data[start..start + bytes_copied];
buf[..bytes_copied].copy_from_slice(src_bytes);
Ok(bytes_copied)
}
}
A couple fun things there. We let src_bytes
be a slice of our array
from the starting point (which was the old offset
to the ending (which
might be the end of the data, or might be the end of the destination
buffer).
And then we use the copy_from_slice()
method to copy those bytes from
the source into the destination buffer. And notably, we might only want
to copy over some of buf
(in case we have fewer bytes remaining than
buf
is big). And that's why the receiver is the slice
buf[..bytes_copied]
.
And then we return Ok(bytes_copied)
, and bytes_copied
will be 0
once we reach the end of our data, indicating the equivalent of
"end-of-file".
We can use it just like this:
fn provider_demo() -> std::io::Result<()> {
let dp = DataProvider::new(b"line 0\nline 1\nline 2\nline 3\n");
let reader = BufReader::new(dp);
for line_result in reader.lines() {
println!("{}", line_result?);
}
Ok(())
}
Works!
Adding my two
usize
s forend
can overflow. This will either panic in a debug build or wrap around in a release build. Clamping it breaks the logic, so the best thing to do would be achecked_add()
that returns anOption
and return anError
if overflow happened.[HT Jonathan from Deschutes Tech Guild]
Technically line_result
is std::io::Result<String>
. But I didn't
want to return some Ok(String::from(""))
nastiness. So I changed it to
Result<()>
which is fine.
However I was still confused over why I'm allowed to do this when that's not the
Result
that we get fromline_result
. Apparently it's due to the fact that the?
really only has anything to do with the error portion of the result, and those are all compatible.[HT Jonathan from Deschutes Tech Guild]
To-Do List
- How the
From
trait works. - How to implement iterators.
- Everything else.