Ridiculously Tiny, Auditable Images with Rust and Distroless
Table of Contents
- The White Rabbit - Trivy
- Phase 1 - Application Auditability
- Phase 2 - Getting rid of base image vulnerabilities
- Phase 3 - Static Compilation
- Phase 4 - Smaller! (aka using musl)
- Phase 5 - Even Smaller! (aka distrolessless)
- Phase 6 - Still Smaller! (aka compiler optimisation)
- Where next?
The White Rabbit - Trivy
I recently had fun building a little toy app in rust, and when the time came to containerise and deploy it, I found myself going down a rabbit hole of miniaturisation. This is the story of that rabbit hole.
TLDR: We’re going to wind up building a tiny image with a minimal surface area for vulnerabilities and full auditability - jump to Phase 5 for the final Containerfile[1].
This leporid journey starts with insecurity. That is, on initially building the application into an image based (arbitrarily) on a Debian image and scanning with Trivy, I find myself irrationally annoyed by a distressing number of vulnerabilities[2].
On further investigation, I’m further irrationally annoyed by the fact that not only are all of the vulnerabilities from my arbitrary base, but Trivy evidently has no knowledge of my own app’s dependencies.
Phase 1 - Application Auditability
Fortunately, letting Trivy know about what dependencies we use in an application is trivial[3], courtesy of some existing tooling - namely the cargo auditable plugin. This works by generating an SBOM (Software Bill Of Materials) and stuffing it right into the compiled binary.
To make use of this, we simply install the plugin and use that for building our binary in the build stage of our Containerfile:
# ...
RUN cargo install cargo-auditable
COPY Cargo.lock Cargo.toml ./
COPY src/ src/
RUN cargo auditable build --release
# ...
Phase 2 - Getting rid of base image vulnerabilities
Again existing tooling comes to the rescue, in the form of “Distroless” Container Images, designed specifically for the problem of runaway dependencies. Considering our application is a single binary, we don’t really need much at all from our base and we’re right in the target audience.
Swapping out the base for our final application image from debian:trixie-slim to distroless/cc-debian13:nonroot[4] is fairly straightforward, and immediately gets our image’s vulnerability count down to zero.
As a nice little side-effect, it also drops our image size from an already respectable 34MiB to an impressive 14.3MiB.
As a side-side-effect, it also gives me an irrational urge to see just how small we can make this thing.
Phase 3 - Static Compilation
What we really, really want is to[5] be able to use the distroless static base - but to do this we need to make sure our binary is truly statically compiled.
Static compilation with rust is easy-ish enough: we just need to set a magic incantation in RUSTFLAGS to enable the right target feature, `-C target-feature=+crt-static`.
And immediately we’re struck with a wall of linking errors to symbols in libcrypto-lib-c_zlib.o…
The Side Quest - rustls
The reqwest library, used extensively by our toy app considering it mostly just wraps around calling an API, has two available backends for dealing with TLS:
- native-tls - which wraps the OS TLS frameworks (or the ubiquitous OpenSSL library)
- rustls - a rust native library
rustls, for hopefully evident reasons, is what we want for our use case. It’s also the default choice with the default reqwest feature flags. However, watching compilation and digging through our Cargo.lock manifest, for some undocumented[6] reason it appears that when combined with the tokio asynchronous runtime (also used extensively by our toy app) the native-tls dependency is preferred instead.
Fortunately, it’s easy enough, if fiddly, to simply be very specific about our feature flags in our cargo dependencies:
reqwest = { version = "0.12.24", features = [
"json",
# Force rustls
"rustls-tls",
# Defaults except native-tls
"charset",
"http2",
"system-proxy",
], default-features = false }
Where were we?
And back from our side quest, we’re now compiling again. Swapping to the static distroless base for the final image, we’re now down to ~4.9MiB. Pow! Now we’re in truly tiny image territory!
Phase 4 - Smaller! (aka using musl)
We may be tiny already, but by this stage my irrational urge for miniaturisation has become a full-blown obsession. Next step, let’s try replacing that old, stodgy glibc std lib with musl - the C standard library used by the famously tiny (but not tiny enough) alpine based images.
If you care about such inane things as “performance” or “reliability” - you probably want to either ignore this section[7] or do some thorough benchmarking. musl is not a drop-in replacement for glibc with regards to compatibility or performance.
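For the “thorough benchmarking” part, here’s a minimal sketch of the kind of thing I’d start with: an allocation-heavy, multi-threaded loop timed with the standard library. Allocator behaviour under threads is one of the areas where musl and glibc are commonly reported to differ, but that’s an assumption to verify against your own workload, not a measurement from this app. The function name, thread count and iteration count are all made up for illustration.

```rust
use std::thread;
use std::time::Instant;

/// Allocates and drops `iters` small vectors on each of `threads` threads.
/// Returns the total number of bytes allocated so the work can't be optimised away.
fn churn(threads: usize, iters: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                let mut total = 0usize;
                for i in 0..iters {
                    // Vary the size a little so we aren't hitting a single size class.
                    let v = vec![1u8; 64 + (i % 64)];
                    total += std::hint::black_box(&v).len();
                }
                total
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    // Arbitrary numbers; tune them to resemble your real workload.
    let start = Instant::now();
    let bytes = churn(4, 1_000_000);
    println!("{} bytes allocated in {:?}", bytes, start.elapsed());
}
```

Build it once per target (glibc and musl) with the same release profile and compare the timings. A synthetic loop like this only hints at where to look; the real app is the benchmark that counts.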
There are a few steps to the switch, but nothing too onerous:
- Use a musl friendly alpine base for our build image
- Install the musl-dev package - this isn’t installed by default in the base as it’s not exactly a common requirement.
- Ensure we target the correct platform triplet
With these changes, we’re now down to a smidge under 4.5MiB. glibc added 400 whole KiB of bloat to our image.
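If you want the binary itself to confirm which libc and linkage it was built with, the compiler exposes both as compile-time cfg values. Here’s a small sketch of that idea; build_report is a hypothetical helper of my own, not something from the toy app.

```rust
/// Describes how this binary was built, using compile-time `cfg!` checks.
fn build_report() -> String {
    // `target_env` distinguishes the libc flavour of the target triple.
    let libc = if cfg!(target_env = "musl") {
        "musl"
    } else if cfg!(target_env = "gnu") {
        "glibc"
    } else {
        "other"
    };
    // `crt-static` is the target feature toggled by `-C target-feature=+crt-static`.
    let linkage = if cfg!(target_feature = "crt-static") {
        "static"
    } else {
        "dynamic"
    };
    format!(
        "arch={} os={} libc={} crt={}",
        std::env::consts::ARCH,
        std::env::consts::OS,
        libc,
        linkage
    )
}

fn main() {
    println!("{}", build_report());
}
```

Built with the RUSTFLAGS and musl target from this post, I’d expect this to report libc=musl and crt=static. If it says anything else, the flags aren’t reaching the compiler.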
Phase 5 - Even Smaller! (aka distrolessless)
While doing this write-up, a now obvious-in-hindsight thought came to me… Our binary is now fully self-contained, is there anything we even need from our static distroless base? Long answer, all it really provides are tzinfo (not needed, as long as we’re happy to keep everything in UTC) and ca-certificates (not used, as we’ve baked our trust store directly into the binary with rustls).
Short answer, no.
Replacing our base for the final image with FROM scratch to build it from a literal bare base, we now get a (yes, working) image with a total size of: 3.61MiB
# Image build base
FROM rust:1-alpine AS builder
WORKDIR /build
# Our musl build dependency
RUN apk add --no-cache musl-dev
# For our SBOM generation
RUN cargo install cargo-auditable
# Tell rustc to statically link everything
ENV RUSTFLAGS="-C target-feature=+crt-static"
COPY Cargo.lock Cargo.toml ./
COPY src/ src/
RUN cargo auditable build --release --target x86_64-unknown-linux-musl
# Runtime image "base"
FROM scratch
# The binary is literally the only thing we need
COPY --from=builder /build/target/x86_64-unknown-linux-musl/release/ai-operator /bin/
# Oh and a bit of metadata about how to actually run...
WORKDIR /app
EXPOSE 3000/tcp
CMD ["/bin/ai-operator"]
Phase 6 - Still Smaller! (aka compiler optimisation)
At this stage, there’s nothing more we can do to reduce this from an image build perspective. The image layers are literally just the binary itself and metadata - there’s simply nothing left to strip out.
Any further reductions will need to target shrinking the binary itself, but fortunately(?) the rust compiler defaults don’t optimise for binary size - and we have some levers to pull here.
Updating our release profile settings with a fairly naive set of size-oriented optimisations, we’re now down to 2.1MiB:
- Stripping all symbols out of the binary - yes these are injected by default in the release profile!
- Setting our optimisation level for size, not speed
- Enabling link time optimisation, and forcing a single codegen unit to eliminate optimisation boundaries.
[profile.release]
strip = "symbols"
opt-level = "z"
lto = true
codegen-units = 1
There’s something else we can do here to shave off a surprising number of bytes - eliminating runtime panic unwinds.
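By setting panic = "abort" in the release profile, a panic kills the process on the spot rather than unwinding the stack, so the unwinding machinery (and the stdlib stack traces I’d otherwise get from it) can be left out of the binary.[8]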
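To make the trade-off concrete, here’s a small sketch of what changes in behaviour. The helper names are made up for illustration. With the default unwind strategy a panic can be caught and recovered from; with abort the process simply dies at the panic.

```rust
use std::panic;

/// Runs `f`, returning `None` if it panicked.
/// This only works when the binary is built with the default `panic = "unwind"`.
fn try_run<F: FnOnce() -> i32 + panic::UnwindSafe>(f: F) -> Option<i32> {
    panic::catch_unwind(f).ok()
}

/// Reports which panic strategy this binary was compiled with.
fn panic_strategy() -> &'static str {
    if cfg!(panic = "abort") {
        "abort"
    } else {
        "unwind"
    }
}

fn main() {
    println!("panic strategy: {}", panic_strategy());
    println!("{:?}", try_run(|| 2 + 2));
    // With `panic = "abort"` the process is killed inside this call,
    // so the line below never gets to print `None`.
    println!("{:?}", try_run(|| panic!("boom")));
}
```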
This gives us a final score of… 1.89MiB for the entire image.
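For the scorekeepers, here’s the running tally as a quick bit of arithmetic. The sizes are the ones quoted in each phase above; the labels and the percent_saved helper are mine.

```rust
/// Image sizes in MiB as quoted in each phase of this post.
const SIZES: [(&str, f64); 7] = [
    ("debian slim base", 34.0),
    ("distroless cc", 14.3),
    ("distroless static", 4.9),
    ("musl", 4.5),
    ("scratch", 3.61),
    ("size-optimised profile", 2.1),
    // Not part of the final image, see the footnote on panic handling.
    ("panic = abort", 1.89),
];

/// Percentage saved going from `from` to `to`.
fn percent_saved(from: f64, to: f64) -> f64 {
    (from - to) / from * 100.0
}

fn main() {
    let start = SIZES[0].1;
    let mut prev = start;
    for (label, size) in SIZES {
        println!(
            "{:<24} {:>6.2} MiB  step: {:>5.1}%  total: {:>5.1}%",
            label,
            size,
            percent_saved(prev, size),
            percent_saved(start, size)
        );
        prev = size;
    }
}
```

End to end that works out to roughly a 94% reduction from the original Debian based image.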
Where next?
Aside from nerfing the aforementioned panic handling, we’re starting to run out of things to eliminate without taking a hatchet to the code itself.
If I were to really let the obsession take me well and truly beyond reason, there are some static assets bundled in the binary that would make obvious targets to explore eliminating:
- Ignoring Phase 1 and just not adding our SBOM. Who cares about security, anyway?
- Stripping superfluous trust certs - We’re currently bundling a full trust store with reqwest. Considering we’re only really hitting a single endpoint, we could get very fiddly and add ourselves some maintenance burden by manually building our own trust store consisting of the single CA at the root of our target endpoint (and pray the provider doesn’t change it).
[1] Or Dockerfile, if you’re one for brand names.
[2] Even if none of them are particularly high.
[3] One might even say… Trivyal 🥁
[4] Turns out while we don’t need much from our base, we still need something. The distroless base gives us glibc + libssl, and the cc variants also give us libgcc1 - all of which turn out to be required by our binary courtesy of dependencies.
[5] zig-a-zig-ah
[6] If anybody has any clue as to why, I would be glad to acquire this knowledge from you…
[7] Actually you should probably just ignore this entire blog
[8] This setting hasn’t made it into the final image though, as this is likely an optimisation beyond reason. Stack traces are too useful when things go pear shaped.