Custom Measurements
By default, Criterion.rs measures the wall-clock time taken by the benchmarks. However, there are many other ways to measure the performance of a function, such as hardware performance counters or POSIX's CPU time. Since version 0.3.0, Criterion.rs has had support for plugging in alternate timing measurements. This page details how to define and use these custom measurements.
Note that as of version 0.3.0, only timing measurements are supported, and only a single measurement can be used for one benchmark. These restrictions may be lifted in future versions.
Defining Custom Measurements
For developers who wish to use custom measurements provided by an existing crate, skip to "Using Custom Measurements" below.
Custom measurements are defined by a pair of traits, both defined in criterion::measurement
.
Measurement
First, we'll look at the main trait, Measurement
.
#![allow(unused)] fn main() { pub trait Measurement { type Intermediate; type Value: MeasuredValue; fn start(&self) -> Self::Intermediate; fn end(&self, i: Self::Intermediate) -> Self::Value; fn add(&self, v1: &Self::Value, v2: &Self::Value) -> Self::Value; fn zero(&self) -> Self::Value; fn to_f64(&self, val: &Self::Value) -> f64; fn formatter(&self) -> &dyn ValueFormatter; } }
The most important methods here are start
and end
and their associated types, Intermediate
and Value
. start
is called to start a measurement and end
is called to complete it. As an
example, the start
method of the wall-clock time measurement returns the value of the system
clock at the moment that start
is called. This starting time is then passed to the end
function,
which reads the system clock again and calculates the elapsed time between the two calls. This
pattern - reading some system counter before and after the benchmark and reporting the difference -
is a common way for code to measure performance.
The next two functions, add
and zero
are pretty simple; Criterion.rs sometimes needs to be able
to break up a sample into batches that are added together (eg. in Bencher::iter_batched
) and so
we need to have a way to calculate the sum of the measurements for each batch to get the overall
value for the sample.
to_f64
is used to convert the measured value to an f64
value so that Criterion can perform its
analysis. As of 0.3.0, only a single value can be returned for analysis per benchmark. Since f64
doesn't carry any unit information, the implementor should be careful to choose their units to avoid
having extremely large or extremely small values that may have floating-point precision issues. For
wall-clock time, we convert to nanoseconds.
Finally, we have formatter
, which just returns a trait-object reference to a ValueFormatter
(more on this later).
For our half-second measurement, this is all pretty straightforward; we're still measuring
wall-clock time so we can just use Instant
and Duration
like WallTime
does:
#![allow(unused)] fn main() { /// Silly "measurement" that is really just wall-clock time reported in half-seconds. struct HalfSeconds; impl Measurement for HalfSeconds { type Intermediate = Instant; type Value = Duration; fn start(&self) -> Self::Intermediate { Instant::now() } fn end(&self, i: Self::Intermediate) -> Self::Value { i.elapsed() } fn add(&self, v1: &Self::Value, v2: &Self::Value) -> Self::Value { *v1 + *v2 } fn zero(&self) -> Self::Value { Duration::from_secs(0) } fn to_f64(&self, val: &Self::Value) -> f64 { let nanos = val.as_secs() * NANOS_PER_SEC + u64::from(val.subsec_nanos()); nanos as f64 } fn formatter(&self) -> &dyn ValueFormatter { &HalfSecFormatter } } }
ValueFormatter
The next trait is ValueFormatter
, which defines how a measurement is displayed to the user.
#![allow(unused)] fn main() { pub trait ValueFormatter { fn format_value(&self, value: f64) -> String {...} fn format_throughput(&self, throughput: &Throughput, value: f64) -> String {...} fn scale_values(&self, typical_value: f64, values: &mut [f64]) -> &'static str; fn scale_throughputs(&self, typical_value: f64, throughput: &Throughput, values: &mut [f64]) -> &'static str; fn scale_for_machines(&self, values: &mut [f64]) -> &'static str; } }
All of these functions accept a value to format in f64 form; the values passed in will be in the
same scale as the values returned from to_f64
, but may not be the exact same values. That is, if
to_f64
returns values scaled to "thousands of cycles", the values passed to format_value
and
the other functions will be in the same units, but may be different numbers (eg. the mean of all
sample times).
Implementors should try to format the values in a way that will make sense to humans. "1,500,000 ns" is needlessly confusing while "1.5 ms" is much clearer. If you can, try to use SI prefixes to simplify the numbers. An easy way to do this is to have a series of conditionals like so:
#![allow(unused)] fn main() { if ns < 1.0 { // ns = time in nanoseconds per iteration format!("{:>6} ps", ns * 1e3) } else if ns < 10f64.powi(3) { format!("{:>6} ns", ns) } else if ns < 10f64.powi(6) { format!("{:>6} us", ns / 1e3) } else if ns < 10f64.powi(9) { format!("{:>6} ms", ns / 1e6) } else { format!("{:>6} s", ns / 1e9) } }
It's also a good idea to limit the amount of precision in floating-point output - after a few
digits the numbers don't matter much anymore but add a lot of visual noise and make the results
harder to interpret. For example, it's very unlikely that anyone cares about the difference between
10.2896653s
and 10.2896654s
- it's much more salient that their function takes "about 10.290
seconds per iteration".
With that out of the way, format_value
is pretty straightforward. format_throughput
is also not
too difficult; match on Throughput::Bytes
or Throughput::Elements
and generate an appropriate
description. For wall-clock time, that would likely take the form of "bytes per second", but a
measurement that read CPU performance counters might want to display throughput in terms of "cycles
per byte". Note that default implementations of format_value
and format_throughput
are provided
which use scale_values
and scale_throughputs
, but you can override them if you wish.
scale_values
is a bit more complex. This accepts a "typical" value chosen by Criterion.rs, and a
mutable slice of values to scale. This function should choose an appropriate unit based on the
typical value, and convert all values in the slice to that unit. It should also return a string
representing the chosen unit. So, for our wall-clock times where the measured values are in
nanoseconds, if we wanted to display plots in milliseconds we would multiply all of the input
values by 10.0f64.powi(-6)
and return "ms"
, because multiplying a value in nanoseconds by 10^-6
gives a value in milliseconds. scale_throughputs
does the same thing, only it converts a slice of
measured values to their corresponding scaled throughput values.
scale_for_machines
is similar to scale_values
, except that it's used for generating
machine-readable outputs. It does not accept a typical value, because this function should always
return values in the same unit.
Our half-second measurement formatter thus looks like this:
#![allow(unused)] fn main() { struct HalfSecFormatter; impl ValueFormatter for HalfSecFormatter { fn format_value(&self, value: f64) -> String { // The value will be in nanoseconds so we have to convert to half-seconds. format!("{} s/2", value * 2f64 * 10f64.powi(-9)) } fn format_throughput(&self, throughput: &Throughput, value: f64) -> String { match *throughput { Throughput::Bytes(bytes) => format!( "{} b/s/2", f64::from(bytes) / (value * 2f64 * 10f64.powi(-9)) ), Throughput::Elements(elems) => format!( "{} elem/s/2", f64::from(elems) / (value * 2f64 * 10f64.powi(-9)) ), } } fn scale_values(&self, ns: f64, values: &mut [f64]) -> &'static str { for val in values { *val *= 2f64 * 10f64.powi(-9); } "s/2" } fn scale_throughputs( &self, _typical: f64, throughput: &Throughput, values: &mut [f64], ) -> &'static str { match *throughput { Throughput::Bytes(bytes) => { // Convert nanoseconds/iteration to bytes/half-second. for val in values { *val = (bytes as f64) / (*val * 2f64 * 10f64.powi(-9)) } "b/s/2" } Throughput::Elements(elems) => { for val in values { *val = (elems as f64) / (*val * 2f64 * 10f64.powi(-9)) } "elem/s/2" } } } fn scale_for_machines(&self, values: &mut [f64]) -> &'static str { // Convert values in nanoseconds to half-seconds. for val in values { *val *= 2f64 * 10f64.powi(-9); } "s/2" } } }
Using Custom Measurements
Once you (or an external crate) have defined a custom measurement, using it is relatively easy.
You will need to override the Criterion
struct (which defaults to WallTime
) by providing your
own measurement using the with_measurement
function and overriding the default Criterion
object
configuration. Your benchmark functions will also have to declare the measurement type they work
with.
#![allow(unused)] fn main() { fn fibonacci_cycles(criterion: &mut Criterion<HalfSeconds>) { // Use the criterion struct as normal here. } fn alternate_measurement() -> Criterion<HalfSeconds> { Criterion::default().with_measurement(HalfSeconds) } criterion_group! { name = benches; config = alternate_measurement(); targets = fibonacci_cycles } }