Building services in Rust

@kjuulh

2024-02-19

Building business services might sound like a boring topic, and in some instances it can be. But from my point of view, building business services is a test of how ready a language or tool is for mainstream appeal. With this article I hope to show that Rust is right around that point, if not already there, so let's jump right in.

But what is a business service

First of all we should define some criteria for what a business service is, otherwise we have nothing to measure against, even if we're gonna use half-baked, fluffy metrics to decide on anyway.

A business service is a long-running application, capable of serving multiple different endpoints at once: maybe an HTTP stack on one port, gRPC on another, sending logs somewhere, calling external services, putting stuff on a queue, you name it. It is a multifaceted application that serves an external need of some sort and is basically a shell around some business logic.

A business service can be anything from a microservice, which serves one of the above, to a monolith. It doesn't really matter, as those are orthogonal metrics, i.e. they are about scale, not capabilities: how much damned stuff you can cram in a box, and how many engineers to page once it goes down.

Most importantly of all, a business service should be testable. It should be relatively self-sufficient when it isn't receiving direct maintenance other than patch upgrades, and do its absolute damnedest to fulfill its requirements: serve a business need.

To me the most important is testability. Can it support:

  • Unit tests
  • Integration tests
  • Acceptance tests

If the above are cumbersome to do, then the language isn't ready for mainstream usage. It doesn't matter how capable, how fast, or how secure it is; if normal engineers can't write code in the language without arcane knowledge, then it isn't ready.

So to sum up the criteria:

  • Ergonomics
  • How easy is it to manage external dependencies
  • Testability

As you can probably tell, these are not among Rust's core values, maybe except for ergonomics, but I'll show that it is still possible to do great work in it.

Rust as a Service

Let's start from the top and go through a few architectural patterns that are common in business software, such as dependency injection, interfaces and concrete types, the strategy pattern, etc., and which tools you need to rely on to achieve them.

Dependency injection ~hell~

Dependency management, or injection as it is normally called for services, is simply a way for a function to take in some abstraction from outside and use its functionality, without having to deal with the complexities of how it actually implements said functionality. It is also extremely useful for testing a piece by itself.

I come from an object-oriented background, so that is usually how I go about solving these issues, especially as Rust's functional programming model has some ergonomic downsides that make it difficult to do dependency injection with it (for reasons I won't go into here).

Usually you do dependency injection via a constructor:

pub struct MealPlannerAPI {
	meal_planner: MealPlannerDatabase
}

impl MealPlannerAPI {
	pub fn new(meal_planner: MealPlannerDatabase) -> Self {
		Self {
			meal_planner
		}
	}

	pub async fn book_meal(&self) -> Result<()> {
		self.meal_planner.book_meal(/* some input */).await?;

		Ok(())
	}
}

Quite simply, we take in some struct or trait in the new function, which serves as our constructor, and we can now just call book_meal on the inner meal_planner. This has a few benefits. If the input is a trait, we can mock it out, or we can use a macro to mock a struct and swap a concrete value with it (I don't recommend that, but more on that later).

Let's say for now that MealPlannerDatabase is a trait:

use async_trait::async_trait;
use sqlx::Postgres;

#[async_trait]
pub trait MealPlannerDatabase {
	// Trait items inherit the trait's visibility, so no `pub` here
	async fn book_meal(&self) -> Result<()>;
}

pub struct MealPlannerPsqlDatabase {
	psql: sqlx::Pool<Postgres>
}

impl MealPlannerPsqlDatabase {
	pub fn new(psql: sqlx::Pool<Postgres>) -> Self {
		Self {
			psql
		}
	}
}

#[async_trait]
impl MealPlannerDatabase for MealPlannerPsqlDatabase {
	async fn book_meal(&self) -> Result<()> {
		// The pool is passed by shared reference, no &mut needed
		sqlx::query!("INSERT ... INTO ...").execute(&self.psql).await?;
		Ok(())
	}
}

This is a small example, which we'll make use of later, but notice that we've split the implementation into two parts: the interface (contract) and the concrete type (implementation). This helps us in a few ways, e.g. we can swap out the implementation for a different database, or for a mock if we want to test the MealPlannerAPI.

We can also add mockall's #[automock] attribute to our trait to automatically get mocks generated. This is quite convenient, but comes with some downsides in that it can reduce the feature set that you would normally have available. For example you cannot use impl in functions.

The keen-eyed among you may notice that the above code wouldn't actually compile. You cannot hold a bare trait as a field or take it by value without some kind of pointer, because the compiler doesn't know the size of said trait (it may be any of the possible implementations), so we need some abstraction around it. Secondly, the database might need to be shared between callers, or require exclusive access, so it may need an Arc or a Mutex, which we didn't deal with either.

For that we'll make use of the facade pattern, i.e. we're gonna create a facade such that our external code doesn't have to deal at all with us having a trait, a Mutex, an Arc, whatever. The only thing that matters is that it can depend on the functionality without too much hassle.


#[derive(Clone)]
pub struct MealPlannerDatabase(Arc<dyn traits::MealPlannerDatabase>);

impl MealPlannerDatabase {
	// Options we want to expose
	pub fn psql(psql: sqlx::Pool<Postgres>) -> Self {
		Self(Arc::new(MealPlannerPsqlDatabase::new(psql)))
	}

	// Escape hatch
	pub fn dynamic(concrete: Arc<dyn traits::MealPlannerDatabase>) -> Self {
		Self(concrete)
	}
}

impl std::ops::Deref for MealPlannerDatabase {
	type Target = Arc<dyn traits::MealPlannerDatabase>;

	// Deref hands out a reference to the inner value
	fn deref(&self) -> &Self::Target {
		&self.0
	}
}

Now you could technically have an Arc, a Mutex, whatever, and the consumer would be none the wiser. It still allows you to use the inner functions as you normally would: self.meal_planner.book_meal().await?.

You can even expand on it with an actual inner pattern if you need the Mutex, or something more complicated. The dynamic constructor means we can still use it in a test, as we can replace the internals with our mock.
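To make the escape hatch a bit more concrete, here is a minimal std-only sketch of the facade being swapped for a fake. It is synchronous to stay dependency free, and MealPlannerStore, InMemoryStore and FailingStore are hypothetical names I made up for the illustration, not something from a real codebase.

```rust
use std::ops::Deref;
use std::sync::Arc;

// The contract. Synchronous here only to keep the sketch std-only.
pub trait MealPlannerStore: Send + Sync {
    fn book_meal(&self, meal: &str) -> Result<String, String>;
}

// Hypothetical stand-in for the Postgres-backed implementation.
pub struct InMemoryStore;

impl MealPlannerStore for InMemoryStore {
    fn book_meal(&self, meal: &str) -> Result<String, String> {
        Ok(format!("stored {meal}"))
    }
}

// Hypothetical fake that always fails, useful for testing error paths.
pub struct FailingStore;

impl MealPlannerStore for FailingStore {
    fn book_meal(&self, _meal: &str) -> Result<String, String> {
        Err("database unavailable".to_string())
    }
}

// Facade: consumers see a plain, cloneable struct and never touch Arc or dyn.
#[derive(Clone)]
pub struct MealPlannerDatabase(Arc<dyn MealPlannerStore>);

impl MealPlannerDatabase {
    // Option we want to expose
    pub fn in_memory() -> Self {
        Self(Arc::new(InMemoryStore))
    }

    // Escape hatch, lets a test inject any implementation
    pub fn dynamic(concrete: Arc<dyn MealPlannerStore>) -> Self {
        Self(concrete)
    }
}

impl Deref for MealPlannerDatabase {
    type Target = Arc<dyn MealPlannerStore>;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// The consumer only depends on the facade.
pub struct MealPlannerApi {
    database: MealPlannerDatabase,
}

impl MealPlannerApi {
    pub fn new(database: MealPlannerDatabase) -> Self {
        Self { database }
    }

    pub fn book_meal(&self, meal: &str) -> Result<String, String> {
        // Resolves through Deref down to the trait method
        self.database.book_meal(meal)
    }
}
```

A test would build the API with MealPlannerDatabase::dynamic(Arc::new(FailingStore)) and assert on the error, while production code keeps calling the named constructor.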

Shared dependencies as a Service

The last pattern I want to show is shared dependency management. For that we'll use a few Rust features as well. The cornerstone of the pattern is to create a single shared resource, which we can use to new up all the dependencies we need.

pub struct App {
}

#[derive(Clone)]
pub struct SharedApp(Arc<App>);

impl SharedApp {
	pub fn new() -> Self {
		Self(Arc::new(App{}))
	}
}

impl std::ops::Deref for SharedApp {
	type Target = Arc<App>;

	fn deref(&self) -> &Self::Target {
		&self.0
	}
}

Again we use a custom Deref that lets us reach the inner value without having to wrap everything in Arcs and/or Mutexes. I forgot to mention why we do so: when you've got 10-100 dependencies, it gets tedious to wrap each and every one in an Arc, yet the SharedApp is a shared object and needs to be Clone.

Before we move on to how to actually use this pattern, I'd like to give a recommendation. The App should not contain every single struct you need; it should contain foundational IO resources, such as a database connection pool, queue manager, gRPC connection, logger instance, etc.: things that need setup from external configuration.

pub struct App {
	psql: sqlx::Pool<Postgres>,
	// ...
}

That means that it won't contain MealPlannerAPI or MealPlannerDatabase. We'll get to those in another way.

To actually get to the concrete types we'll use something called extension traits:

//file: meal_planner_api.rs

pub struct MealPlannerAPI {
	// skipped for brevity ...
}

pub mod extensions {
	pub trait MealPlannerAPIExt {
		fn meal_planner_api(&self) -> MealPlannerAPI;
	}

	impl MealPlannerAPIExt for SharedApp {
		fn meal_planner_api(&self) -> MealPlannerAPI {
			MealPlannerAPI::new(self.meal_planner_database())
		}
	}
}

This means that we can now call app.meal_planner_api() from the outside and get an instance of the concrete type. If you've got a high-volume service, you can either choose to move these values down into the shared struct itself, or cache them in the SharedApp using an object pool. In most cases the performance cost is negligible. In some cases Rust will even inline these functions, even though they go through a trait, to make them faster.

The database is the same, but uses values on self instead.

//file: meal_planner_database.rs

pub struct MealPlannerDatabase {
	// skipped for brevity ...
}

pub mod extensions {
	pub trait MealPlannerDatabaseExt {
		fn meal_planner_database(&self) -> MealPlannerDatabase;
	}

	impl MealPlannerDatabaseExt for SharedApp {
		fn meal_planner_database(&self) -> MealPlannerDatabase {
			MealPlannerDatabase::psql(self.psql.clone())
		}
	}
}

Notice that we use the psql constructor instead, and that this acts like a normal struct, even if it fronts for a trait. This is super convenient. This also means that you could technically create multiple Apps for different purposes and only implement the extensions for those that need said dependencies.
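Put together, the whole thing can be sketched without any external crates. This is a simplified assumption of how the pieces fit: database_url is a hypothetical stand-in for the connection pool, and describe only exists so there is something to observe.

```rust
use std::ops::Deref;
use std::sync::Arc;

// Foundational resources only. `database_url` is a hypothetical stand-in
// for something like a connection pool.
pub struct App {
    database_url: String,
}

#[derive(Clone)]
pub struct SharedApp(Arc<App>);

impl SharedApp {
    pub fn new(database_url: &str) -> Self {
        Self(Arc::new(App {
            database_url: database_url.to_string(),
        }))
    }
}

impl Deref for SharedApp {
    type Target = Arc<App>;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

pub struct MealPlannerDatabase {
    url: String,
}

impl MealPlannerDatabase {
    pub fn describe(&self) -> String {
        format!("database at {}", self.url)
    }
}

pub struct MealPlannerApi {
    database: MealPlannerDatabase,
}

impl MealPlannerApi {
    pub fn new(database: MealPlannerDatabase) -> Self {
        Self { database }
    }

    pub fn describe(&self) -> String {
        format!("api backed by {}", self.database.describe())
    }
}

// Extension traits: each dependency knows how to build itself from the app.
pub trait MealPlannerDatabaseExt {
    fn meal_planner_database(&self) -> MealPlannerDatabase;
}

impl MealPlannerDatabaseExt for SharedApp {
    fn meal_planner_database(&self) -> MealPlannerDatabase {
        // `self.database_url` is reached through Deref on SharedApp
        MealPlannerDatabase {
            url: self.database_url.clone(),
        }
    }
}

pub trait MealPlannerApiExt {
    fn meal_planner_api(&self) -> MealPlannerApi;
}

impl MealPlannerApiExt for SharedApp {
    fn meal_planner_api(&self) -> MealPlannerApi {
        MealPlannerApi::new(self.meal_planner_database())
    }
}
```

As long as the extension traits are in scope, any clone of the SharedApp can hand out a fresh MealPlannerApi, and nothing outside these modules needs to know how it was assembled.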

This should cover all of our needs for handling dependencies. If you'd like to see this in action, have a look at https://git.front.kjuulh.io/kjuulh/flux-releaser, where I use this pattern heavily both for a CLI and for a service in the same crate.

Dependencies all of them

We may have to run multiple different hot paths in our code, i.e. code paths which see high traffic, or where the main traffic comes through. This may be an HTTP runtime, gRPC, messaging, etc.

For that, right now, tokio is the name of the game. This is also why I marked nearly every function above as async without commenting on it. If you develop this kind of software, it is a given that nearly all functions will touch some IO, and as such will be async; if not, you will just have to go back afterwards and add async.

You want a fast, ergonomic, and stable runtime. In most languages these are built in. In Rust the de facto standard is tokio. There are multiple alternatives on the market, but for now tokio is what you'd probably choose if you build services. It may change in the future though, so don't take my word as gospel, and figure out what fits best for you. The only thing I ask is that you be consistent.

Tokio has the benefit of being able to spawn many lightweight tasks (think virtual threads), so even if we only have a single core, or part of one, we can still run work concurrently.

Running those tasks should most of the time be handled by a lifecycle management library: something that makes sure a bunch of parallel services are running at the same time, and that if one fails they all shut down. But we can just start by hacking our own together to illustrate how it works.

#[tokio::main]
async fn main() -> Result<()> {
	let app = SharedApp::new();

	// The first service to finish decides the outcome, `?` propagates its error
	tokio::select! {
		res = app.meal_planner_api().serve() => {
			res?
		},
		res = app.meal_planner_grpc().serve() => {
			res?
		},
		// .. As many as you'd like
	}

	Ok(())
}

This is a bit of a naive example, but it should illustrate that you can run multiple tasks at the same time serving requests. Do note that if one exits, all of them will terminate. But we can now share app between all the different runtimes and execution flows, like you'd normally do in any service.
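If the "first one to exit decides" behaviour feels a bit magic, here is a rough std-only analogy using threads and a channel. To be clear, this is not tokio and not how select! is implemented: threads are not cancelled the way dropped futures are. Service, run_until_first_exit and the two fake services are hypothetical names, made up just to mirror the control flow.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

pub type ServiceResult = Result<(), String>;

// A named unit of work, boxed so different services fit in one Vec.
pub struct Service {
    name: &'static str,
    run: Box<dyn FnOnce() -> ServiceResult + Send>,
}

impl Service {
    pub fn new(
        name: &'static str,
        run: impl FnOnce() -> ServiceResult + Send + 'static,
    ) -> Self {
        Self {
            name,
            run: Box::new(run),
        }
    }
}

// Runs every service on its own thread and returns the name and result of
// whichever finishes first. The remaining threads are simply left behind,
// which is acceptable when the process exits right afterwards.
pub fn run_until_first_exit(services: Vec<Service>) -> (&'static str, ServiceResult) {
    let (tx, rx) = mpsc::channel();

    for service in services {
        let tx = tx.clone();
        thread::spawn(move || {
            let Service { name, run } = service;
            // The receiver may already be gone if another service won the race
            let _ = tx.send((name, run()));
        });
    }
    drop(tx);

    rx.recv().expect("at least one service is required")
}

// Hypothetical long-running HTTP service
pub fn http_service() -> ServiceResult {
    thread::sleep(Duration::from_millis(300));
    Ok(())
}

// Hypothetical gRPC service that fails shortly after startup
pub fn grpc_service() -> ServiceResult {
    thread::sleep(Duration::from_millis(20));
    Err("grpc listener failed".to_string())
}
```

Calling run_until_first_exit with both services returns the gRPC error long before the HTTP service is done, which is the same decision main makes when it propagates the first result and exits.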

I will go into how to actually make a nice development environment in another article, so that you know which packages to provide as a standard development offering. But for now we'll just let our little service set up everything for itself. Just keep in mind that the database setup, APIs, runtimes, etc. could be provided by a dedicated team.

Testability

One of the most important criteria for me is being able to test a service. I usually default to writing fewer, more end-to-end tests rather than a lot of small unit tests. This is convenient, because Rust doesn't make it easy to write unit tests.

Let's start with integration tests and then move on to unit tests afterwards, because in Rust they're quite different.

Integration tests

I categorize an integration test as a test that spans an entire service, including its IO dependencies, but not other services. You'd include a database and a messaging broker, but not S3 or another service in your tests. It should poke the application from the outside, at least as much as possible, but still be able to introspect the state of the app using the libraries. So for me integration tests are greybox tests, somewhere between whitebox and blackbox.

Setting up integration tests for a service in Rust is a bit different from what you may be used to. First of all, you'll want to place the test files somewhere other than where tests normally live (the usual place is in the code, beside the functionality). As such you'd create a folder in your crate:

tests/ # new folder
src/

Each file under tests is compiled as its own crate, and thus its own test binary. This will become important later.

A test file looks like this:

#[tokio::test]
async fn can_book_a_meal() -> Result<()> {
	let (endpoints, app) = setup().await?; // TODO: more on this in a bit

	// reqwest only has a free function for GET, so POST goes through a Client
	let resp = reqwest::Client::new()
		.post(endpoints.meal_planner_http)
		.send()
		.await?;

	assert!(resp.status().is_success());

	let meal_bookings = app.meal_planner_database().get_meal_bookings().await?;

	// ... more asserts

	Ok(())
}

There are a few pieces here we haven't gone through before, but the first important one is the setup function. You'd want as few concurrent apps running as possible, so the setup can be shared across tests (this is only possible per file, as each file is a binary in and of itself, so they cannot share memory between them).

So the setup should set up an app once, let the tests do their thing, and once all of them are done, shut down.


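use std::sync::{Once, OnceLock};

// Globals shared by every test in this file
static INIT: Once = Once::new();
static ENDPOINTS: OnceLock<Endpoints> = OnceLock::new();
static APP: OnceLock<SharedApp> = OnceLock::new();

async fn setup() -> Result<(Endpoints, SharedApp)> {
	// You need a separate tokio runtime, as otherwise it would shutdown between each test

	// Once makes sure we only spawn the server a single time
	INIT.call_once(|| {
		std::thread::spawn(|| {
			let rt = tokio::runtime::Runtime::new().unwrap();
			rt.block_on(async move {
				// Init the server
				let server = Server::new().await.unwrap();

				// Set global options, OnceLock avoids `static mut` and unsafe
				let _ = ENDPOINTS.set(server.endpoints.clone());
				let _ = APP.set(server.app.clone());

				// Actually wait for the server, this should never terminate before the tests are done. I.e. start a webserver and stay blocking.
				server.start().await.unwrap();
			});
		});
	});

	// Wait for the server to come up, i.e. call a ping endpoint or something
	wait_for_server().await?;

	// The globals are set before the server starts, so they are present by now
	let endpoints = ENDPOINTS.get().cloned().expect("endpoints are set");
	let app = APP.get().cloned().expect("app is set");

	Ok((endpoints, app))
}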

Again, lots of technicalities (see flux_releaser for a more thorough example). Just remember that we start up the server once in a separate thread, which has its own runtime. We let the server run, and outside of that we wait.
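Stripped of tokio and the actual service, the "start once, share across tests" idea can be sketched with nothing but the standard library. This is an assumption-heavy toy: the server is a hypothetical TCP listener that answers every connection with pong, and setup and ping are names I picked for the illustration.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::sync::OnceLock;
use std::thread;

// Shared by every test in the same test binary.
static SERVER_ADDR: OnceLock<SocketAddr> = OnceLock::new();

// Starts the (hypothetical) server at most once and hands every caller the
// same address. Later calls just read the stored value.
pub fn setup() -> SocketAddr {
    *SERVER_ADDR.get_or_init(|| {
        // Port 0 lets the OS pick a free port
        let listener = TcpListener::bind("127.0.0.1:0").expect("bind test server");
        let addr = listener.local_addr().expect("read local address");

        // The server lives on its own thread for the rest of the process
        thread::spawn(move || {
            for stream in listener.incoming() {
                if let Ok(mut stream) = stream {
                    // Reply and close the connection by dropping the stream
                    let _ = stream.write_all(b"pong");
                }
            }
        });

        addr
    })
}

// Pokes the server from the outside, like a test would.
pub fn ping(addr: SocketAddr) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    let mut reply = String::new();
    stream.read_to_string(&mut reply)?;
    Ok(reply)
}
```

Because the listener is bound before setup returns, there is nothing to wait for here. A real service that needs time to boot would still want something like the wait_for_server step.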

A small disclaimer here: this is what I would consider arcane knowledge. Thankfully you only have to do this once, and it can be packaged up so that you don't have to deal with this complexity all the time. It is just too useful and essential for testing to not mention.

I will stop here for now with integration testing. If you'd like a follow-up, let me know at contact@kasperhermansen.com.

Unit testing

Depending on what you're doing in Rust, unit testing can either be a breeze or an absolute nightmare. Essentially, if you use structs all the way down with the dependency injection shown in the previous section, without using traits, it is very difficult to do proper unit testing, i.e. you have no way of slicing functionality. If you use traits all the way down, then it will require a lot of boilerplate, or excessive usage of macros, which I will touch on after this section.

What I recommend is:

  • Using traits for IO
  • Splitting functionality so the business logic is isolated and testable. This is not always applicable, but it does make things easier.

Split dat IO

IO, oh, IO. Without you we would just be a space heater; with you we're filled with heartbreak, and stupid prose somehow.

Not all IO is equal. When I say IO in this case, I pretty much mean side effects: not everything that happens external to the program, but any external part of your application that we've got no control over. From a testing point of view this means the database, sometimes the filesystem, other services, HTTP requests, etc.

This is pretty much the only place, outside of the strategy pattern, where I use traits, especially async traits.

#[async_trait]
pub trait MealPlannerDatabase {
	async fn book_meal(&self, meal_booking: &MealBooking) -> Result<()>;
}
pub type DynMealPlannerDatabase = Arc<dyn MealPlannerDatabase>;

#[async_trait]
pub trait MealPlannerEvents {
	async fn meal_booked(&self, meal_booking: &MealBooking) -> Result<()>;
}
pub type DynMealPlannerEvents = Arc<dyn MealPlannerEvents>;

As in the previous sections, this means that we can mock the external services, which allows us to focus on the business logic inside the MealPlannerAPI, or rather the MealPlannerService:


pub struct MealPlannerService {
	// Please use the wrapper pattern shown in a previous section, this is just an example
	database: DynMealPlannerDatabase,
	events: DynMealPlannerEvents
}

impl MealPlannerService {
	pub fn new(database: DynMealPlannerDatabase, events: DynMealPlannerEvents) -> Self {
		Self {
			database,
			events
		}
	}

	pub async fn book_meal(&self) -> Result<()> {
		let meal_booking = self.generate_meal_booking();

		self.database.book_meal(&meal_booking).await?;
		self.events.meal_booked(&meal_booking).await?;

		Ok(())
	}

	fn generate_meal_booking(&self) -> MealBooking {
		// ...
	}
}

As you can see there isn't a terrible amount of meat on this logic. I'd normally argue that this shouldn't even be unit tested, but for completeness' sake, let's just say that generate_meal_booking is unreasonably complicated and that tests are needed not just to lock down its functionality, but to help guide development.

You can now choose to implement your own mocks for the database and/or events, and test the book_meal function to make sure the database and events are called with what you expect them to be. Currently I'd recommend either rolling your own mocks or using mockall.
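Rolling your own mock doesn't have to be much work. Here is a minimal sketch, assuming synchronous traits to keep it std-only (the real ones are async), with a hypothetical Recorder that simply remembers what it was called with.

```rust
use std::sync::{Arc, Mutex};

#[derive(Clone, Debug, PartialEq)]
pub struct MealBooking {
    pub meal: String,
}

// Synchronous versions of the IO traits, only to keep the sketch std-only.
pub trait MealPlannerDatabase: Send + Sync {
    fn book_meal(&self, booking: &MealBooking) -> Result<(), String>;
}

pub trait MealPlannerEvents: Send + Sync {
    fn meal_booked(&self, booking: &MealBooking) -> Result<(), String>;
}

pub struct MealPlannerService {
    database: Arc<dyn MealPlannerDatabase>,
    events: Arc<dyn MealPlannerEvents>,
}

impl MealPlannerService {
    pub fn new(
        database: Arc<dyn MealPlannerDatabase>,
        events: Arc<dyn MealPlannerEvents>,
    ) -> Self {
        Self { database, events }
    }

    pub fn book_meal(&self, meal: &str) -> Result<(), String> {
        let booking = MealBooking {
            meal: meal.to_string(),
        };

        self.database.book_meal(&booking)?;
        self.events.meal_booked(&booking)?;

        Ok(())
    }
}

// Hand-rolled mock (hypothetical): records every booking it receives.
#[derive(Default)]
pub struct Recorder {
    calls: Mutex<Vec<MealBooking>>,
}

impl Recorder {
    pub fn calls(&self) -> Vec<MealBooking> {
        self.calls.lock().unwrap().clone()
    }
}

impl MealPlannerDatabase for Recorder {
    fn book_meal(&self, booking: &MealBooking) -> Result<(), String> {
        self.calls.lock().unwrap().push(booking.clone());
        Ok(())
    }
}

impl MealPlannerEvents for Recorder {
    fn meal_booked(&self, booking: &MealBooking) -> Result<(), String> {
        self.calls.lock().unwrap().push(booking.clone());
        Ok(())
    }
}
```

The test keeps its own Arc to each Recorder, passes clones into the service, calls book_meal, and then asserts on calls(). That is about as much mocking machinery as this service needs.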

Split dat class

It may be useful in Rust to simply split your functionality into multiple parts: the parts that call external services, and the isolated business logic.

impl MealPlannerService {
	pub fn new(database: DynMealPlannerDatabase, events: DynMealPlannerEvents) -> Self {
		Self {
			database,
			events
		}
	}

	pub async fn book_meal(&self) -> Result<()> {
		let meal_booking = self.generate_meal_booking();

		self.database.book_meal(&meal_booking).await?;
		self.events.meal_booked(&meal_booking).await?;

		Ok(())
	}

	fn generate_meal_booking(&self) -> MealBooking {
		// ...
	}
}

Now we can simply call generate_meal_booking directly in a test, simple as that. But now you may say: but, but, I don't get my precious 100% test coverage! And I'd like to ask if you're out here collecting points, or actually building software. Enough feathers ruffled. I'd highly recommend choosing wisely what to test. If you want 100% test coverage, you're gonna trade that for increased boilerplate and complexity, and unless you're building a rocket, it may not be warranted.
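As a rough illustration of what testing only the isolated logic can look like, here is a sketch where generate_meal_booking is a pure function. The rules in it (trimming, lowercasing, MAX_SERVINGS) are entirely made up for the example; the point is only that no database, events or async are involved.

```rust
#[derive(Debug, PartialEq)]
pub struct MealBooking {
    pub meal: String,
    pub servings: u32,
}

#[derive(Debug, PartialEq)]
pub enum BookingError {
    EmptyMeal,
    TooManyServings { max: u32 },
}

// Hypothetical business rule: a single booking covers at most this many servings.
pub const MAX_SERVINGS: u32 = 8;

// Pure business logic: same input, same output, no IO.
pub fn generate_meal_booking(meal: &str, servings: u32) -> Result<MealBooking, BookingError> {
    let meal = meal.trim();
    if meal.is_empty() {
        return Err(BookingError::EmptyMeal);
    }
    if servings > MAX_SERVINGS {
        return Err(BookingError::TooManyServings { max: MAX_SERVINGS });
    }

    Ok(MealBooking {
        // Normalize the name so "Lasagna" and "lasagna" are the same meal
        meal: meal.to_lowercase(),
        // A booking always covers at least one serving
        servings: servings.max(1),
    })
}
```

Every edge case here is a plain assert_eq! on inputs and outputs, with no mocks and no runtime, which is why I'd rather spend the unit tests here than on the IO glue.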

This is it for testing. Next we're gonna move into a few general points.

Ergonomics

To macro, or not to macro

Macros are useful, so much so that they're tempting to use everywhere. Proc macros are literally crack cocaine. I will provide a word of caution though: macros are another language inside Rust, and can do anything the heart desires. If you use macros, you trade actual complexity and developer experience for decreased perceived complexity. Sometimes it is needed, other times it is a convenience, so be sure to choose wisely.

For example:

  • async_trait is essential. Rust doesn't have object-safe async traits without it, or at least not without arcane knowledge and increased boilerplate. This is the only non-struct proc macro I regularly use for filling gaps in functionality.
  • mockall is quite useful for generating mocks, though be careful with it: it can introduce unexpected code and general limitations on your traits and structs. I only use it for traits.

You should definitely use proc macros if they're essential for your app, such as in rocket, clap, tracing, leptos, etc. A good rule of thumb is simply to really think about whether a proc macro is essential for your use case. Often it is, but I've overused them in the past and had a hell of a time cleaning them up.

Default to simplicity

Rust has enough tools and features to do a lot of things in 100 different ways. If you're serious about building services and products, default to simplicity and be consistent. You could take a stance and say that you won't use async, or never use clone, etc., but you'd end up taking on a whole load of complexity that would make the service quite unapproachable for further development. Raw-dogging channels for request/reply is a nice feature, but honestly, it is a foundational building block, not a great API.

Keep things simple, and resist the urge to create abstractions for everything. It is okay to have the same code in a few places, and don't use macros just to stay DRY. I've never seen it play out right.

Use crates, and build your own

Quite simply, if you're building services, build your own crates and tailor them to your needs. Develop a crate that automatically sets up a database connection, bundle your own logging setup that makes sure things are exported as JSON, implement your own context libraries for sharing variables throughout a call, etc. There are a lot of libraries that wouldn't be useful to others on crates.io, but if you choose to build small individual services, it can be quite useful to have easy-to-use, out-of-the-box functionality.

Workspaces, be aware

Workspaces are nice, and I actually default to them for my own services, but be careful. I've got a tendency to make small libraries in these workspaces alongside my app. This can make it difficult to know where a crate comes from, and gives the service multiple responsibilities, or reasons to be deployed or worked on. As such, remember to keep services and workspaces focused on the topic at hand. That is unless you use a monorepo approach, but that is quite difficult to do with Rust's compile times.

Conclusion

I hope I've shown you some currently good practices for developing services in Rust. We've covered everything I think is essential for building production-ready code, which trades some performance for increased ergonomics while keeping complexity at bay. It should be mentioned that these are just my own opinions and what feels right in 2024, where we're still missing crucial async features in Rust. So it could change quite a bit over the next few years.

If you feel like something was unclear, or you'd like a topic to be expanded upon, let me know at contact@kasperhermansen.com.

Thanks a lot for reading, and I hope to see you at some point at a Rust Aarhus Meetup.