Learning distributed systems with BitTorrent

My journey building a BitTorrent client from scratch using Rust

I’m a simple man. One day I woke up thinking “How does BitTorrent work?” and decided to learn the protocol and implement it, just for fun and out of curiosity.

I thought I would spend around 1 week building a minimalistic client… it took me about 3 months.

The BitTorrent protocol is quite old: it was released in 2001, while Bitcoin was released in 2009.

We had other cryptocurrencies before Bitcoin, and other distributed file systems before BitTorrent, but they all died. These two protocols are survivors, and they only survived because they are decentralized.

Why

So, why did I build another BitTorrent client? The problem with most clients is that they are infested with ads that track you, malware, and deliberately slowed download rates, and you have to pay if you want to get rid of these things.

Of course, not all clients are like that, and I have used some good ones (like Transmission). But what saddened me is that I couldn’t find a client with a terminal UI, and since I use the terminal all the time, I want everything in the terminal.

The journey

I started development really ignorant about the world of distributed systems and its problems, so I faced fundamental issues like the CAP theorem first-hand.

I also did things that are really common in distributed systems without even knowing it, like heartbeat detection.

I have faced many blockers and almost quit multiple times. But there is a specific feeling that I just can’t describe (aesthetic satisfaction?) when you are making progress and seeing all the different nodes working that always kept me going until the end.

And finally, I have learned a lot, including a lot of low-level stuff that I truly lacked.

How the protocol works

This is only a high-level, summarized explanation; refer to the official documentation for the details.

Metainfo and Tracker

What we call a “.torrent file” is a metainfo file. This file contains information necessary to download the files, such as: piece_length, trackers, etc.
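As a rough illustration of what such a file holds, here is a minimal sketch of the metainfo as plain Rust structs. The field names mirror keys from the BitTorrent specification (`announce`, `piece length`, `pieces`, `files`), but the struct layout and the helper methods are my own assumptions for illustration, and parsing the bencoded file is left out entirely.

```rust
/// One file inside the torrent (hypothetical layout for illustration).
struct FileEntry {
    path: Vec<String>, // path components, e.g. ["dir", "file.txt"]
    length: u64,       // file size in bytes
}

/// The `info` dictionary of a metainfo file.
struct Info {
    name: String,
    piece_length: u64,     // size of every piece except possibly the last
    pieces: Vec<u8>,       // concatenated 20-byte SHA-1 hashes, one per piece
    files: Vec<FileEntry>, // a single-file torrent is treated as one entry here
}

/// The whole metainfo (".torrent") file.
struct Metainfo {
    announce: String, // tracker URL
    info: Info,
}

impl Info {
    /// Total size of all files in bytes.
    fn total_length(&self) -> u64 {
        self.files.iter().map(|f| f.length).sum()
    }

    /// Number of pieces according to the hash list.
    fn piece_count(&self) -> usize {
        self.pieces.len() / 20
    }

    /// Number of pieces implied by the sizes (rounded up).
    fn expected_piece_count(&self) -> u64 {
        (self.total_length() + self.piece_length - 1) / self.piece_length
    }

    /// The expected SHA-1 hash of piece `index`, if it exists.
    fn piece_hash(&self, index: usize) -> Option<&[u8]> {
        self.pieces.chunks_exact(20).nth(index)
    }
}
```

A client can compare `piece_count()` with `expected_piece_count()` as a cheap sanity check before it starts downloading.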

Trackers are servers that will give you a list of IPs of the peers in the network for a specific torrent.
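To make the tracker request a bit more concrete, here is a small sketch of how an HTTP announce URL could be assembled. The query parameter names (`info_hash`, `peer_id`, `port`, `uploaded`, `downloaded`, `left`) come from the BitTorrent specification; the function names and the tracker address used with them are hypothetical, and sending the request and decoding the response are not shown.

```rust
/// Percent-encodes raw bytes, leaving only RFC 3986 unreserved characters as-is.
/// This matters because `info_hash` is 20 raw bytes, not text.
fn percent_encode(bytes: &[u8]) -> String {
    let mut out = String::new();
    for &b in bytes {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Builds the announce URL for a download that has not transferred anything yet.
/// `left` is the number of bytes still missing.
fn announce_url(
    tracker: &str,
    info_hash: &[u8; 20],
    peer_id: &[u8; 20],
    port: u16,
    left: u64,
) -> String {
    format!(
        "{}?info_hash={}&peer_id={}&port={}&uploaded=0&downloaded=0&left={}",
        tracker,
        percent_encode(info_hash),
        percent_encode(peer_id),
        port,
        left
    )
}
```

For example, `announce_url("http://tracker.example/announce", &hash, &id, 6881, total)` yields a URL the client can fetch with any HTTP library; the tracker answers with the list of peers.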

Nowadays trackers are not really needed, as most clients implement a distributed hash table (DHT) to find peers. This makes the protocol even more decentralized.

Pieces and files

The files are divided into multiple pieces, and peers download and upload these pieces. The pieces are further divided into blocks. Sometimes the pieces do not align with file boundaries, which makes calculating the pieces and blocks a complete mess.
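The arithmetic described above can be sketched roughly like this. It is not the code from my client, just a minimal illustration: the 16 KiB block size is a common choice that I assume here, and `Block`, `blocks`, and `file_slices` are hypothetical names.

```rust
/// Block size assumed for this sketch (16 KiB).
const BLOCK_LEN: u64 = 16_384;

/// One block: which piece it belongs to, where it starts inside it, and its size.
#[derive(Debug, PartialEq)]
struct Block {
    piece: u64,
    begin: u64,
    len: u64,
}

/// Splits a torrent of `total_len` bytes into blocks, piece by piece.
fn blocks(total_len: u64, piece_len: u64) -> Vec<Block> {
    assert!(piece_len > 0, "piece length must be positive");
    let mut out = Vec::new();
    let mut piece = 0;
    let mut offset = 0; // absolute offset where the current piece starts
    while offset < total_len {
        // The last piece may be shorter than `piece_len`.
        let this_piece = piece_len.min(total_len - offset);
        let mut begin = 0;
        while begin < this_piece {
            // The last block of a piece may be shorter than BLOCK_LEN.
            let len = BLOCK_LEN.min(this_piece - begin);
            out.push(Block { piece, begin, len });
            begin += len;
        }
        offset += this_piece;
        piece += 1;
    }
    out
}

/// Maps the absolute byte range `[start, start + len)` onto the files it touches.
/// Returns (file index, offset inside that file, byte count) triples.
fn file_slices(file_lens: &[u64], start: u64, len: u64) -> Vec<(usize, u64, u64)> {
    let end = start + len;
    let mut out = Vec::new();
    let mut file_start = 0; // absolute offset where the current file begins
    for (i, &file_len) in file_lens.iter().enumerate() {
        let file_end = file_start + file_len;
        let lo = start.max(file_start);
        let hi = end.min(file_end);
        if lo < hi {
            out.push((i, lo - file_start, hi - lo));
        }
        file_start = file_end;
    }
    out
}
```

With files of 10 and 32768 bytes, `file_slices(&[10, 32768], 0, 16384)` returns two slices: the first 16 KiB block covers all 10 bytes of the first file plus the first 16374 bytes of the second. This is the kind of misalignment that makes writing a block to disk fiddly.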

/// Visualization of files of a Torrent
/// --------------------------------------
/// | f: 10 | f: 32768                   |
/// ----------------------------------p---
/// | b: 10 | b: 16384   | b: 16274  |110|
/// --------------------------------------
/// f: file
/// b: block length
/// p: piece

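This reads as: the torrent has two files, one of 10 bytes and one of 32768 bytes. Their bytes are split into blocks of 10, 16384, 16274, and 110 bytes, and the `p` marks where a piece ends.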

Peers exchange these blocks, not full pieces. A few problems come with this:

Peers

Every peer has state to keep track of. By state I mean the relationship between the peers themselves and between each peer and the torrent. Which pieces does Peer A have? Is Peer B interested in Peer A? And so on.
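A minimal sketch of that per-peer state could look like the following. The four flags and the bitfield layout (piece 0 is the highest bit of the first byte) follow the BitTorrent specification, where every connection starts choked and not interested; the struct itself and its method names are assumptions for illustration, not the ones from my client.

```rust
/// State kept for one peer connection.
#[derive(Debug, Clone, PartialEq)]
struct PeerState {
    am_choking: bool,      // we refuse to upload to the peer
    am_interested: bool,   // we want pieces the peer has
    peer_choking: bool,    // the peer refuses to upload to us
    peer_interested: bool, // the peer wants pieces we have
    /// One bit per piece; the highest bit of the first byte is piece 0.
    bitfield: Vec<u8>,
}

impl PeerState {
    /// Connections start out choked and not interested on both sides.
    fn new(piece_count: usize) -> Self {
        PeerState {
            am_choking: true,
            am_interested: false,
            peer_choking: true,
            peer_interested: false,
            bitfield: vec![0; (piece_count + 7) / 8],
        }
    }

    /// Does the peer have piece `index`?
    fn has_piece(&self, index: usize) -> bool {
        let mask = 0x80u8 >> (index % 8);
        match self.bitfield.get(index / 8) {
            Some(byte) => *byte & mask != 0,
            None => false,
        }
    }

    /// Records that the peer announced piece `index` (e.g. via a "have" message).
    fn set_piece(&mut self, index: usize) {
        let mask = 0x80u8 >> (index % 8);
        if let Some(byte) = self.bitfield.get_mut(index / 8) {
            *byte |= mask;
        }
    }

    /// We may only request a block if we are interested, the peer is not
    /// choking us, and the peer actually has the piece.
    fn can_request(&self, index: usize) -> bool {
        self.am_interested && !self.peer_choking && self.has_piece(index)
    }
}
```

Keeping all of this in one struct per connection makes questions like "which pieces does Peer A have?" a simple lookup.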

This is what happens when a Torrent download starts:

Results

I have built the client in Rust, a safe and fast language. It’s MIT-licensed and ad-free, and more:

You can access the client.