Configuring the unconfigurable

In this article, we will be changing which address a pre-compiled game server binds to by writing a little bit of Rust.

The game

I love playing 7 Days to Die (7d2d) with my friends.

The game has been in an "alpha" state for more than 10 years now. It is definitely playable and very enjoyable, but the game engine is still not completely polished: the server calculations are pretty CPU-intensive, which usually means a less smooth experience for the person hosting the game server.

So running the server part of the game on a separate machine, AKA a "dedicated server", gives a smoother gaming experience.

Hosting

The game server doesn't need all of the power of a server in a data center, just a bit of CPU time and a few gigs of RAM. Luckily, there is a plethora of cloud providers that can rent these to us at a reasonable price. I briefly looked at DigitalOcean, which was recommended by friends. The glance was brief enough to completely overlook the fact that their droplets are priced hourly, so I wouldn't actually have had to pay for a whole month of renting an expensive VM, since we usually don't play more than a couple of hours a week.

Missing this crucial piece of information and being wary of "unknown unknowns", I decided to turn to the familiar provider fly.io, even knowing about its UDP networking issues.

Fly.io UDP networking issues

At the time of writing, fly.io has four gotchas about UDP communication, as their documentation puts it.

  • The socket needs to bind to a specific address.
  • No IPv6.
  • The port must be the same outside the container and inside.
  • The MTU is ~1300 bytes instead of 1500 as with most other servers.

I don't need to care about IPv6, and I haven't run into any issues with the MTU size yet. The 7d2d server socket port is easily configurable, but the bind address is not.

And the game doesn't work without a UDP connection.

I was almost ready to revisit the investigation into the alternative cloud providers, but then I remembered the excellent series of YouTube videos by Amos about messing with the compiled code.

Specifically the mention of LD_PRELOAD here

LD_PRELOAD

To recap: programs can call functions that are defined in external "dynamic libraries" and you can easily ask Linux to kindly look for those functions in a library that you provide, effectively "hijacking" the call.
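To make the "kindly ask Linux" part concrete: the request is made by setting the LD_PRELOAD environment variable for the process being started. Below is a minimal sketch in Rust of a launcher that does this. The function name and both paths are hypothetical and only serve as an illustration.

```rust
use std::process::Command;

/// Builds (without spawning) a command that starts `server_binary` with
/// `interceptor_lib` preloaded, so the dynamic linker resolves functions
/// such as `socket` and `bind` from that library first.
pub fn preloaded_command(server_binary: &str, interceptor_lib: &str) -> Command {
    let mut cmd = Command::new(server_binary);
    // LD_PRELOAD is read by the Linux dynamic linker when the process starts.
    cmd.env("LD_PRELOAD", interceptor_lib);
    cmd
}
```

Calling .status() or .spawn() on the returned command would actually start the program. Setting the variable in a shell or in a container config achieves the same thing.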

To start listening on a port, the program first needs to ask the operating system for a socket using a function appropriately called socket. The function returns a file descriptor, which is a number that is used in later calls to reference this socket. The bind address needs to be changed only for UDP sockets; messing up the address for other sockets will lead to the server not working. So this function needs to be intercepted first, to know which sockets are UDP. The goal is to go from

regular network socket creation

to

intercepted network calls

Writing a dynamic library in Rust

To start, invoke cargo with a --lib flag: cargo new interceptor --lib

Now this will create a crate that gets compiled as a "Rust" library, but the goal is to make a "dynamic C" library. To rectify this, a section needs to be added to Cargo.toml:

[lib]
crate-type = ["cdylib"]

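To export a Rust function from such a library, two things are required:

  • the #[no_mangle] attribute, so that the compiler keeps the function name exactly as written;
  • the extern "C" qualifier, so that the function uses the C calling convention.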

Overriding socket

First, we'll look at how to "override" the system socket function. Signatures of overridden functions should match the original ones to prevent issues. To get the parameter types we can use the libc crate. The definition will look something like

use libc::c_int;

#[no_mangle]
pub extern "C" fn socket(domain: c_int, ty: c_int, protocol: c_int) -> c_int {
    -1
}

Returning -1 is often a way to communicate a failure in the C world, so this makes our server get stuck in a boot loop, failing to call some external APIs with errors like

libcurl info message cache 4 (Closing connection 2)

What if we call the original instead?

#[no_mangle]
pub extern "C" fn socket(domain: c_int, ty: c_int, protocol: c_int) -> c_int {
    unsafe { libc::socket(domain, ty, protocol) }
}

This leads to the app just crashing with a segmentation fault (core dumped). Well, this is discouraging, just one unsafe block and we are already in the scary segfault land.

You might have guessed what happened here, but I had to turn to the friendly Rust Discord server, where people quickly pointed out that the libc::socket call just ends up in the same overridden function that we just wrote, creating infinite recursion.

A quick Stack Overflow search later, we find a way to get a pointer to the original function and call it, using the libc::dlsym function. I've also wrapped the call to that function in a macro, because there is some boilerplate involved that I wasn't keen on copy-pasting in two places. I failed to correctly express the type of a function pointer that can be cast from void *, so it ended up being a macro rather than a regular function.

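use std::ffi::CString;

macro_rules! get_original_function {
    ($name:expr, $ty:ty) => {{
        let orig_name = CString::new($name).unwrap();
        // RTLD_NEXT skips our own library and finds the next definition of the symbol.
        // as_ptr() borrows the string; into_raw() would leak it on every call.
        let orig_fn_ptr = unsafe { libc::dlsym(libc::RTLD_NEXT, orig_name.as_ptr()) };
        assert!(!orig_fn_ptr.is_null(), "original function not found");
        let f: $ty = unsafe { std::mem::transmute(orig_fn_ptr) };
        f
    }};
}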
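For what it's worth, the pointer cast can probably be expressed as a regular generic function as well. The sketch below is one possible way, not necessarily the best one: it swaps std::mem::transmute for std::mem::transmute_copy, which accepts a generic target type, and checks the size at run time. The names cast_symbol and add_one are made up for this example, and the pointer is passed in as a parameter so that the sketch does not depend on dlsym.

```rust
use std::mem::{size_of, transmute_copy};
use std::os::raw::{c_int, c_void};

/// Reinterprets a symbol address (for example, the result of `dlsym`) as the
/// function pointer type `F`.
///
/// # Safety
/// `ptr` must point to a function whose real signature and ABI match `F`.
pub unsafe fn cast_symbol<F: Copy>(ptr: *mut c_void) -> F {
    assert!(!ptr.is_null(), "symbol not found");
    // `transmute` rejects a generic target type because it checks sizes at
    // compile time; `transmute_copy` does not, so the check happens here.
    assert_eq!(size_of::<F>(), size_of::<*mut c_void>());
    unsafe { transmute_copy::<*mut c_void, F>(&ptr) }
}

/// Stand-in for a C function, used only to exercise `cast_symbol`.
pub extern "C" fn add_one(x: c_int) -> c_int {
    x + 1
}
```

In the interceptor, the argument would be the pointer returned by libc::dlsym, and F would be the matching extern "C" fn type. The trade-off compared to the macro is that a wrong F is caught only by the run-time size check, and only if its size differs from a pointer.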

Now we need to remember which of the file descriptors the system provided to us are for UDP sockets and when the server calls "bind" on them, override the address to which the bind is happening.

I am not looking forward to adding more unsafe code and since this isn't a high-load web server, I don't expect that the socket creation function will be called too many times, so I went the safest route by storing UDP file descriptors in a OnceLock<Mutex<HashSet<c_int>>>.

  • OnceLock allows initializing the value once, from any thread.
  • Mutex allows modifying the content from any thread.
  • HashSet allows fast* inclusion checks.

* Using a Vec instead of a HashSet is likely faster for a small number of file descriptors, because calculating hashes takes a non-trivial number of instructions, while the CPU can just load the whole vector into its cache and quickly traverse it, comparing integers. For the theoretical case of a large number of file descriptors (which will likely never happen in this application), the vector implementation will at some point become slower.

This function will help us with getting a release-mutex-on-drop mutable reference (aka Guard)

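use std::collections::HashSet;
use std::sync::{Mutex, MutexGuard, OnceLock};

// File descriptors of the UDP sockets created so far.
static UDP_SOCKETS_FDS: OnceLock<Mutex<HashSet<c_int>>> = OnceLock::new();

fn get_sockets() -> MutexGuard<'static, HashSet<c_int>> {
    UDP_SOCKETS_FDS
        .get_or_init(Default::default)
        .lock()
        .unwrap()
}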

Then we write the rest of our logic: we get the original function using the macro and immediately call it.

#[no_mangle]
pub extern "C" fn socket(domain: c_int, ty: c_int, protocol: c_int) -> c_int {
    // The original is a C function, so the pointer type must use the "C" ABI.
    let sock = get_original_function!("socket", extern "C" fn(c_int, c_int, c_int) -> c_int)(
        domain, ty, protocol,
    );

    if sock != -1 && domain == libc::AF_INET && ty == libc::SOCK_DGRAM {
        get_sockets().insert(sock);
    }

    sock
}
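One caveat about the ty == libc::SOCK_DGRAM comparison above: on Linux, the type argument of socket may also carry the SOCK_NONBLOCK and SOCK_CLOEXEC flags ORed into it, in which case a plain equality check would miss a UDP socket. Below is a sketch of a stricter check. The function name is hypothetical, and the constants are written out by hand (with the values used on x86_64 and aarch64 Linux) only to keep the example dependency-free; in the interceptor, the libc constants would be the better choice.

```rust
use std::os::raw::c_int;

// Values as defined on x86_64 and aarch64 Linux (assumption for this sketch;
// prefer libc::AF_INET, libc::SOCK_DGRAM, and so on in real code).
const AF_INET: c_int = 2;
const SOCK_DGRAM: c_int = 2;
const SOCK_NONBLOCK: c_int = 0o4000;
const SOCK_CLOEXEC: c_int = 0o2000000;

/// Returns true if the `socket` arguments describe an IPv4 UDP socket,
/// ignoring the optional flags that can be ORed into `ty`.
pub fn is_ipv4_udp(domain: c_int, ty: c_int) -> bool {
    let base_type = ty & !(SOCK_NONBLOCK | SOCK_CLOEXEC);
    domain == AF_INET && base_type == SOCK_DGRAM
}
```

Whether the 7d2d server actually passes these flags is something I have not verified, so treat this as a defensive measure rather than a required fix.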

Overriding bind

Overriding the address is a bit trickier, since the interceptor has to provide a const pointer to an address structure. The C type system does not allow authors to express how long this pointer needs to remain valid, and the documentation is of no help either. We can only hope that the authors of this function did the sensible thing and didn't store the pointer for later examination, so that we can get rid of the structure once the call returns. (Or we could go read the source code, but who has the time? My friend does: he actually went and read it, confirming that the kernel tries to get rid of the user-passed pointer as soon as possible, for the sake of the kernel devs' mental health.)

#[no_mangle]
pub extern "C" fn bind(
    sock: c_int,
    addr: *const libc::sockaddr,
    addr_len: libc::socklen_t,
) -> c_int {
    let mut addr_ptr = addr;
    let mut new_addr: libc::sockaddr;

    if get_sockets().contains(&sock) {
        todo!("override the address");
    }

    get_original_function!(
        "bind",
        extern "C" fn(c_int, *const libc::sockaddr, libc::socklen_t) -> c_int
    )(sock, addr_ptr, addr_len)
}

The address that the socket needs to be bound to can be queried using a small crate dns_lookup like this:

let fly_dns_bind_addr = dns_lookup::lookup_host("fly-global-services")
    .expect("to resolve the address")
    .into_iter()
    .filter_map(|a| match a {
        std::net::IpAddr::V4(v4) => Some(v4.octets()),
        std::net::IpAddr::V6(_) => None,
    })
    .next()
    .unwrap();
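If pulling in an extra crate is undesirable, a similar lookup can be sketched with the standard library alone. This swaps dns_lookup for std's ToSocketAddrs, which also resolves host names; the function name is hypothetical, and the port value is a placeholder because only the address is of interest.

```rust
use std::io;
use std::net::{SocketAddr, ToSocketAddrs};

/// Resolves `host` and returns the octets of its first IPv4 address, if any.
pub fn first_ipv4_octets(host: &str) -> io::Result<Option<[u8; 4]>> {
    // Port 0 is a placeholder: `to_socket_addrs` requires a port,
    // but only the resolved address is used here.
    let mut addrs = (host, 0u16).to_socket_addrs()?;
    Ok(addrs.find_map(|addr| match addr {
        SocketAddr::V4(v4) => Some(v4.ip().octets()),
        SocketAddr::V6(_) => None,
    }))
}
```

On fly.io the argument would be "fly-global-services". Returning io::Result<Option<..>> instead of unwrapping leaves the decision about what to do on failure to the caller.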

I am freely using unwrap everywhere, which might be a bad idea, since a panic would then cross an FFI boundary. I can only hope that the game will crash, since all these errors are indeed not recoverable.
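If crashing ever stops being acceptable, one possible alternative is to catch the panic before it reaches the C caller and report a failure the C way. The sketch below uses std::panic::catch_unwind and is not what the interceptor currently does; the helper name guard_ffi is made up.

```rust
use std::os::raw::c_int;
use std::panic::{catch_unwind, UnwindSafe};

/// Runs `body` and converts a panic into -1, the conventional C error value,
/// so that no unwinding leaves the exported function.
pub fn guard_ffi<F>(body: F) -> c_int
where
    F: FnOnce() -> c_int + UnwindSafe,
{
    catch_unwind(body).unwrap_or(-1)
}
```

An exported function would wrap its whole body in guard_ffi(|| { ... }). Two caveats: this sketch does not set errno, so the caller may see a stale error code, and closures that capture mutable references need std::panic::AssertUnwindSafe to satisfy the bound.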

If the socket is a UDP one, we can dereference its addr pointer, copy the structure, and change the copy a bit.

Does anyone know a better way to write "replace these bytes with those bytes" in Rust?

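        // Copy the caller's address, then overwrite the IPv4 bytes.
        // For AF_INET, sa_data[0..2] is the port and sa_data[2..6] is the address.
        new_addr = unsafe { *addr };
        for (target, source) in new_addr.sa_data[2..6].iter_mut().zip(fly_dns_bind_addr) {
            *target = source as libc::c_char;
        }
        addr_ptr = &new_addr;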
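As one possible answer to the "replace these bytes with those bytes" question: slices have a copy_from_slice method, which copies a whole range at once and panics if the lengths differ. The sketch below applies it to a hand-written mirror of the sockaddr layout, so that it compiles without the libc crate; SockAddr and set_ipv4 are names invented for this example.

```rust
use std::os::raw::c_char;

/// Mirror of the Linux `sockaddr` layout, defined here only so that the
/// sketch compiles without the `libc` crate.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct SockAddr {
    pub sa_family: u16,
    pub sa_data: [c_char; 14],
}

/// Overwrites the IPv4 address bytes of an AF_INET address in place.
pub fn set_ipv4(addr: &mut SockAddr, octets: [u8; 4]) {
    // For AF_INET, sa_data[0..2] holds the port and sa_data[2..6] the address.
    // The octets are converted to c_char first, since the element types must match.
    addr.sa_data[2..6].copy_from_slice(&octets.map(|b| b as c_char));
}
```

With the real libc::sockaddr, the same one-liner should work on new_addr.sa_data. Another common approach is to cast the pointer to libc::sockaddr_in and assign the sin_addr field directly.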

Just the final call remains and we are done!

    get_original_function!(
        "bind",
        extern "C" fn(c_int, *const libc::sockaddr, libc::socklen_t) -> c_int
    )(sock, addr_ptr, addr_len)

Screenshot of three players standing on top of each other in a "Seven Days to Die" game

Conclusion

The interceptor library works, but it could definitely use some fixes for wider applications: for example, removing the socket from the set of UDP sockets when the app closes it. The source code can be found here. In case you want your own 7d2d server, the repository also contains instructions, a fly.io config, and a Telegram bot for easy server management.
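The bookkeeping half of that fix could look roughly like the sketch below. It matters because the operating system reuses file descriptor numbers after a close, so a stale entry could cause a later, unrelated socket with the same number to have its bind address rewritten. The helper names are hypothetical, and the part that intercepts close itself (in the same way as socket and bind) is left out.

```rust
use std::collections::HashSet;
use std::os::raw::c_int;
use std::sync::{Mutex, MutexGuard, OnceLock};

static UDP_SOCKETS_FDS: OnceLock<Mutex<HashSet<c_int>>> = OnceLock::new();

fn get_sockets() -> MutexGuard<'static, HashSet<c_int>> {
    UDP_SOCKETS_FDS
        .get_or_init(Default::default)
        .lock()
        .unwrap()
}

/// Remembers `fd` as a UDP socket (what the `socket` override does).
pub fn track_socket(fd: c_int) {
    get_sockets().insert(fd);
}

/// Returns true if `fd` is currently remembered as a UDP socket.
pub fn is_tracked(fd: c_int) -> bool {
    get_sockets().contains(&fd)
}

/// Forgets `fd`; a `close` override would call this before calling the
/// original `close`. Returns true if the descriptor was tracked.
pub fn forget_socket(fd: c_int) -> bool {
    get_sockets().remove(&fd)
}
```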

Source code

PRs welcome!

Happy hacking and slashing!