Build Configuration - The Rust Performance Book (2024)

You can drastically change the performance of a Rust program without changing its code, just by changing its build configuration. There are many possible build configurations for each Rust program. The one chosen will affect several characteristics of the compiled code, such as compile times, runtime speed, memory use, binary size, debuggability, profilability, and which architectures your compiled program will run on.

Most configuration choices will improve one or more characteristics while worsening one or more others. For example, a common trade-off is to accept worse compile times in exchange for higher runtime speeds. The right choice for your program depends on your needs and the specifics of your program, and performance-related choices (which is most of them) should be validated with benchmarking.

It is worth reading this chapter carefully to understand all the build configuration choices. However, for the impatient or forgetful, cargo-wizard encapsulates this information and can help you choose an appropriate build configuration.

Note that Cargo only looks at the profile settings in the Cargo.toml file at the root of the workspace. Profile settings defined in dependencies are ignored. Therefore, these options are mostly relevant for binary crates, not library crates.

Release Builds

The single most important build configuration choice is simple but easy to overlook: make sure you are using a release build rather than a dev build when you want high performance. This is usually done by specifying the --release flag to Cargo.

Dev builds are the default. They are good for debugging, but are not optimized. They are produced if you run cargo build or cargo run. (Alternatively, running rustc without additional options also produces an unoptimized build.)

Consider the following final line of output from a cargo build run.

Finished dev [unoptimized + debuginfo] target(s) in 29.80s

This output indicates that a dev build has been produced. The compiled code will be placed in the target/debug/ directory. cargo run will run the dev build.

In comparison, release builds are much more optimized, omit debug assertions and integer overflow checks, and omit debug info. 10-100x speedups over dev builds are common! They are produced if you run cargo build --release or cargo run --release. (Alternatively, rustc has multiple options for optimized builds, such as -O and -C opt-level.) A release build will typically take longer to compile than a dev build because of the additional optimizations.
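As a rough illustration of what "omit debug assertions and integer overflow checks" means in practice, the sketch below reports whether debug assertions are enabled and uses explicit overflow-handling methods so the arithmetic behaves the same in every profile. The function names build_kind and add_checked are made up for this example and are not from the book.

```rust
/// Reports the kind of build based on whether debug assertions are enabled.
/// With default profile settings they are on in dev builds and off in release builds.
fn build_kind() -> &'static str {
    if cfg!(debug_assertions) {
        "dev"
    } else {
        "release"
    }
}

/// Adds two bytes with an explicit overflow check, so the result does not
/// depend on whether the profile enables overflow checks.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

fn main() {
    println!("build kind: {}", build_kind());

    let sum = add_checked(1, 2);
    // debug_assert! is compiled out when debug assertions are disabled.
    debug_assert!(sum.is_some(), "only checked when debug assertions are on");

    // Explicit overflow handling behaves identically in dev and release builds.
    println!("checked:  {:?}", add_checked(250, 10));
    println!("wrapping: {}", 250u8.wrapping_add(10));
}
```

Running this once with cargo run and once with cargo run --release should show the build kind change while the arithmetic results stay the same.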

Consider the following final line of output from a cargo build --release run.

Finished release [optimized] target(s) in 1m 01s

This output indicates that a release build has been produced. The compiled code will be placed in the target/release/ directory. cargo run --release will run the release build.

See the Cargo profile documentation for more details about the differences between dev builds (which use the dev profile) and release builds (which use the release profile).

The default build configuration choices used in release builds provide a good balance between the abovementioned characteristics such as compile times, runtime speed, and binary size. But there are many possible adjustments, as the following sections explain.

Maximizing Runtime Speed

The following build configuration options are designed primarily to maximize runtime speed. Some of them may also reduce binary size.

Codegen Units

The Rust compiler splits crates into multiple codegen units to parallelize (and thus speed up) compilation. However, this might cause it to miss some potential optimizations. You may be able to improve runtime speed and reduce binary size, at the cost of increased compile times, by setting the number of units to one. Add these lines to the Cargo.toml file:

[profile.release]
codegen-units = 1

Example 1, Example 2.

Link-time Optimization

Link-time optimization (LTO) is a whole-program optimization technique that can improve runtime speed by 10-20% or more, and also reduce binary size, at the cost of worse compile times. It comes in several forms.

The first form of LTO is thin local LTO, a lightweight form of LTO. By default the compiler uses this for any build that involves a non-zero level of optimization. This includes release builds. To explicitly request this level of LTO, put these lines in the Cargo.toml file:

[profile.release]
lto = false

The second form of LTO is thin LTO, which is a little more aggressive, and likely to improve runtime speed and reduce binary size while also increasing compile times. Use lto = "thin" in Cargo.toml to enable it.

The third form of LTO is fat LTO, which is even more aggressive, and may improve performance and reduce binary size further while increasing build times again. Use lto = "fat" in Cargo.toml to enable it.

Finally, it is possible to fully disable LTO, which will likely worsen runtime speed and increase binary size but reduce compile times. Use lto = "off" in Cargo.toml for this. Note that this is different to the lto = false option, which, as mentioned above, leaves thin local LTO enabled.

Alternative Allocators

It is possible to replace the default (system) heap allocator used by a Rust program with an alternative allocator. The exact effect will depend on the individual program and the alternative allocator chosen, but large improvements in runtime speed and large reductions in memory usage have been seen in practice. The effect will also vary across platforms, because each platform’s system allocator has its own strengths and weaknesses. The use of an alternative allocator is also likely to increase binary size and compile times.
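To make the mechanism concrete, here is a minimal sketch using only the standard library. It installs a global allocator that forwards every request to the system allocator and counts the bytes requested. CountingAlloc, BYTES_REQUESTED, and bytes_requested are hypothetical names invented for this illustration; a real alternative allocator crate would supply its own type in place of CountingAlloc.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper allocator: delegates to the system allocator
/// and records how many bytes have been requested so far.
struct CountingAlloc;

static BYTES_REQUESTED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        BYTES_REQUESTED.fetch_add(layout.size(), Ordering::Relaxed);
        // Forward the actual allocation to the default system allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// This attribute is what swaps the allocator for the whole program.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Total bytes requested through the global allocator so far.
fn bytes_requested() -> usize {
    BYTES_REQUESTED.load(Ordering::Relaxed)
}

fn main() {
    let before = bytes_requested();
    let buffer: Vec<u8> = Vec::with_capacity(4096);
    // black_box keeps the optimizer from removing the unused allocation.
    std::hint::black_box(&buffer);
    let after = bytes_requested();
    println!("bytes requested for the buffer: {}", after - before);
}
```

The crates described below plug into the same #[global_allocator] attribute, so switching allocators is usually a two-line change plus a dependency.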

jemalloc

One popular alternative allocator for Linux and Mac is jemalloc, usable via the tikv-jemallocator crate. To use it, add a dependency to your Cargo.toml file:

[dependencies]
tikv-jemallocator = "0.5"

Then add the following to your Rust code, e.g. at the top of src/main.rs:

#[global_allocator]
static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

Furthermore, on Linux, jemalloc can be configured to use transparent huge pages (THP). This can further speed up programs, possibly at the cost of higher memory usage.

Do this by setting the MALLOC_CONF environment variable appropriately before building your program, for example:

MALLOC_CONF="thp:always,metadata_thp:always" cargo build --release

The system running the compiled program also has to be configured to support THP. See this blog post for more details.

mimalloc

Another alternative allocator that works on many platforms is mimalloc, usable via the mimalloc crate. To use it, add a dependency to your Cargo.toml file:

[dependencies]
mimalloc = "0.1"

Then add the following to your Rust code, e.g. at the top of src/main.rs:

#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

CPU Specific Instructions

If you do not care about the compatibility of your binary on older (or other types of) processors, you can tell the compiler to generate the newest (and potentially fastest) instructions specific to a certain CPU architecture, such as AVX SIMD instructions for x86-64 CPUs.

To request these instructions from the command line, use the -C target-cpu=native flag. For example:

RUSTFLAGS="-C target-cpu=native" cargo build --release

Alternatively, to request these instructions from a config.toml file (for one or more projects), add these lines:

[build]
rustflags = ["-C", "target-cpu=native"]

This can improve runtime speed, especially if the compiler finds vectorization opportunities in your code.

If you are unsure whether -C target-cpu=native is working optimally, compare the output of rustc --print cfg and rustc --print cfg -C target-cpu=native to see if the CPU features are being detected correctly in the latter case. If not, you can use -C target-feature to target specific features.
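The same information can also be inspected from inside a program. The sketch below (an illustration, not from the book) uses the standard cfg!(target_feature = "...") check to report which of a few example features were enabled at compile time. The function name compile_time_features and the particular features listed are arbitrary choices for the example.

```rust
/// Returns a few CPU features and whether each was enabled when this
/// program was compiled (for example via -C target-cpu or -C target-feature).
fn compile_time_features() -> Vec<(&'static str, bool)> {
    vec![
        ("sse2", cfg!(target_feature = "sse2")),
        ("avx", cfg!(target_feature = "avx")),
        ("avx2", cfg!(target_feature = "avx2")),
        ("neon", cfg!(target_feature = "neon")),
    ]
}

fn main() {
    for (name, enabled) in compile_time_features() {
        println!("{name}: {}", if enabled { "enabled" } else { "not enabled" });
    }
}
```

Building this with and without RUSTFLAGS="-C target-cpu=native" and comparing the output is one way to see whether the flag changed the set of enabled features on your machine.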

Profile-guided Optimization

Profile-guided optimization (PGO) is a compilation model where you compile your program, run it on sample data while collecting profiling data, and then use that profiling data to guide a second compilation of the program. This can improve runtime speed by 10% or more. Example 1, Example 2.

It is an advanced technique that takes some effort to set up, but is worthwhile in some cases. See the rustc PGO documentation for details. Also, the cargo-pgo command makes it easier to use PGO (and BOLT, which is similar) to optimize Rust binaries.

Unfortunately, PGO is not supported for binaries hosted on crates.io and distributed via cargo install, which limits its usability.

Minimizing Binary Size

The following build configuration options are designed primarily to minimize binary size. Their effects on runtime speed vary.

Optimization Level

You can request an optimization level that aims to minimize binary size by adding these lines to the Cargo.toml file:

[profile.release]
opt-level = "z"

This may also reduce runtime speed.

An alternative is opt-level = "s", which targets minimal binary size a little less aggressively. Compared to opt-level = "z", it allows slightly more inlining and also the vectorization of loops.

Abort on panic!

If you do not need to unwind on panic, e.g. because your program doesn’t use catch_unwind, you can tell the compiler to simply abort on panic. On panic, your program will still produce a backtrace.

This might reduce binary size and increase runtime speed slightly, and may even reduce compile times slightly. Add these lines to the Cargo.toml file:

[profile.release]
panic = "abort"
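To judge whether a program depends on unwinding, it helps to see what catch_unwind does. The following sketch (the function name try_divide is invented for this example) recovers from a panic when the program is built with the default unwinding strategy. With panic = "abort" the same panic would terminate the process instead, so the example only triggers it when unwinding is enabled.

```rust
use std::panic;

/// Divides 100 by `divisor`, converting a panic inside the closure into an Err.
/// This recovery only works when the program is built with panic = "unwind".
fn try_divide(divisor: i32) -> Result<i32, String> {
    let result = panic::catch_unwind(|| {
        if divisor == 0 {
            panic!("division by zero");
        }
        100 / divisor
    });
    result.map_err(|_| "the closure panicked".to_string())
}

fn main() {
    println!("{:?}", try_divide(4));

    // cfg!(panic = "unwind") is true only when unwinding is the panic strategy.
    if cfg!(panic = "unwind") {
        println!("{:?}", try_divide(0));
    } else {
        println!("built with panic = \"abort\": a panic would end the process");
    }
}
```

If nothing in your program or its dependencies relies on this kind of recovery, switching to panic = "abort" is less likely to change behaviour.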

Strip Debug Info and Symbols

You can tell the compiler to strip debug info and symbols from the compiled binary. Add these lines to Cargo.toml to strip just debug info:

[profile.release]
strip = "debuginfo"

Alternatively, use strip = "symbols" to strip both debug info and symbols.

Prior to Rust 1.77, the default behaviour was to do no stripping. As of Rust 1.77 the default behaviour is to strip debug info in release builds.

Stripping debug info can greatly reduce binary size. On Linux, the binary size of a small Rust program might shrink by 4x when debug info is stripped. Stripping symbols can also reduce binary size, though generally not by as much. Example. The exact effects are platform-dependent.

However, stripping makes your compiled program more difficult to debug and profile. For example, if a stripped program panics, the backtrace produced may contain less useful information than normal. The exact effects for the two levels of stripping depend on the platform.

Other Ideas

For more advanced binary size minimization techniques, consult the comprehensive documentation in the excellent min-sized-rust repository.

Minimizing Compile Times

The following build configuration options are designed primarily to minimize compile times.

Linking

A big part of compile time is actually linking time, particularly when rebuilding a program after a small change. It is possible to select a faster linker than the default one.

One option is lld, which is available on Linux and Windows. To specify lld from the command line, use the -C link-arg=-fuse-ld=lld flag. For example:

RUSTFLAGS="-C link-arg=-fuse-ld=lld" cargo build --release

Alternatively, to specify lld from a config.toml file (for one or more projects), add these lines:

[build]
rustflags = ["-C", "link-arg=-fuse-ld=lld"]

lld is not fully supported for use with Rust, but it should work for most use cases on Linux and Windows. There is a GitHub Issue tracking full support for lld.

Another option is mold, which is currently available on Linux and macOS. Simply substitute mold for lld in the instructions above. mold is often faster than lld. Example. It is also much newer and may not work in all cases.

Unlike the other options in this chapter, there are no trade-offs here! Alternative linkers can be dramatically faster, without any downsides.

Experimental Parallel Front-end

If you use nightly Rust, you can enable the experimental parallel front-end. It may reduce compile times at the cost of higher compile-time memory usage. It won’t affect the quality of the generated code.

You can do that by adding -Zthreads=N to RUSTFLAGS, for example:

RUSTFLAGS="-Zthreads=8" cargo build --release

Alternatively, to enable the parallel front-end from a config.toml file (for one or more projects), add these lines:

[build]
rustflags = ["-Z", "threads=8"]

Values other than 8 are possible, but that is the number that tends to give the best results.

In the best cases, the experimental parallel front-end reduces compile times by up to 50%. But the effects vary widely and depend on the characteristics of the code and its build configuration, and for some programs there is no compile time improvement.

Cranelift Codegen Back-end

If you use nightly Rust on x86-64/Linux or ARM/Linux, you can enable the Cranelift codegen back-end. It may reduce compile times at the cost of lower quality generated code, and therefore is recommended for dev builds rather than release builds.

First, install the back-end with this rustup command:

rustup component add rustc-codegen-cranelift-preview --toolchain nightly

To select Cranelift from the command line, use the -Zcodegen-backend=cranelift flag. For example:

RUSTFLAGS="-Zcodegen-backend=cranelift" cargo +nightly build

Alternatively, to specify Cranelift from a config.toml file (for one or more projects), add these lines:

[unstable]
codegen-backend = true

[profile.dev]
codegen-backend = "cranelift"

For more information, see the Cranelift documentation.

Custom profiles

In addition to the dev and release profiles, Cargo supports custom profiles. It might be useful, for example, to create a custom profile halfway between dev and release if you find the runtime speed of dev builds insufficient and the compile times of release builds too slow for everyday development.

Summary

There are many choices to be made when it comes to build configurations. The following points summarize the above information into some recommendations.

  • If you want to maximize runtime speed, consider all of the following: codegen-units = 1, lto = "fat", an alternative allocator, and panic = "abort".
  • If you want to minimize binary size, consider opt-level = "z", codegen-units = 1, lto = "fat", panic = "abort", and strip = "symbols".
  • In either case, consider -C target-cpu=native if broad architecture support is not needed, and cargo-pgo if it works with your distribution mechanism.
  • Always use a faster linker if you are on a platform that supports it, because there are no downsides to doing so.
  • Use cargo-wizard if you need additional help with these choices.
  • Benchmark all changes, one at a time, to ensure they have the expected effects.
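As a starting point for the last recommendation, here is a deliberately crude timing sketch using only the standard library. It is an illustration under stated assumptions rather than a proper benchmark harness: sum_of_squares is a stand-in workload and time_best_of is a hypothetical helper, both invented for this example. A dedicated benchmarking tool will give more reliable numbers.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Stand-in workload: sums the squares of 0..n.
fn sum_of_squares(n: u64) -> u64 {
    (0..n).map(|i| i * i).sum()
}

/// Runs `f` the given number of times and returns the fastest run,
/// which is less sensitive to background noise than a single measurement.
fn time_best_of<F: FnMut()>(runs: u32, mut f: F) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..runs {
        let start = Instant::now();
        f();
        best = best.min(start.elapsed());
    }
    best
}

fn main() {
    let best = time_best_of(5, || {
        // black_box stops the optimizer from precomputing or discarding the work.
        black_box(sum_of_squares(black_box(1_000_000)));
    });
    println!("best of 5 runs: {best:?}");
}
```

Rebuilding and rerunning a harness like this after each individual configuration change makes it easier to attribute a speedup or slowdown to that change.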

Finally, this issue tracks the evolution of the Rust compiler’s own build configuration. The Rust compiler’s build system is stranger and more complex than that of most Rust programs. Nonetheless, this issue may be instructive in showing how build configuration choices can be applied to a large program.
