bench-show

Show, plot and compare benchmark results

https://github.com/composewell/bench-show

Version on this page:	0.3.1
LTS Haskell 20.26:	0.3.2
Stackage Nightly 2024-10-31:	0.3.2@rev:1
Latest on Hackage:	0.3.2@rev:1

BSD-3-Clause licensed by Harendra Kumar

Maintained by [email protected]

This version can be pinned in stack with: bench-show-0.3.1@sha256:92d2521108dd2674cd8516ed270638034e8dbaed6dfef622c312e48796a0b931,6635

Module documentation for 0.3.1

BenchShow
- BenchShow.Tutorial

Depends on 16 packages:

ansi-wl-pprint, base, bench-show, Chart, Chart-diagrams, csv, directory, filepath, mwc-random, optparse-applicative, optparse-simple, semigroups, split, statistics, transformers, vector

Used by 1 package in lts-14.27:

bench-show

Generate text reports and graphical charts from the benchmark results generated by gauge or criterion and stored in a CSV file. This tool is especially useful when you have many benchmarks or if you want to compare benchmarks across multiple packages. You can generate many interesting reports including:

- Show individual reports for all the fields measured, e.g. time taken, peak memory usage and allocations, among many other fields measured by gauge.
- Sort benchmark results on a specified criterion, e.g. to see the biggest CPU hoggers or the biggest memory hoggers on top.
- Across two benchmark runs (e.g. before and after a change), show all the operations that regressed by more than x%, in descending order, so that we can quickly identify and fix performance problems in our application.
- Across two (or more) packages providing similar functionality, show all the operations where the performance differs by more than 10%, so that we can critically analyze the packages and choose the right one.

Quick Start

Use gauge or criterion to generate a results.csv file, and then use either the bench-show executable or the library APIs to generate textual or graphical reports.

Executable

Use the bench-show executable with the report and graph sub-commands:

$ bench-show report results.csv
$ bench-show graph results.csv output

For advanced usage, control the generated report with the CLI flags.

Library

Use report and graph library functions:

report "results.csv"  Nothing defaultConfig
graph  "results.csv" "output" defaultConfig

For advanced usage, control the generated report by modifying defaultConfig.

Reports and Charts

report with Fields presentation style generates a multi-column report. We can select many fields from a gauge raw report. Units of the fields are automatically determined based on the range of values:

$ bench-show --presentation Fields report results.csv

report "results.csv" Nothing defaultConfig { presentation = Fields }

Benchmark     time(μs) maxrss(MiB)
------------- -------- -----------
vector/fold     641.62        2.75
streamly/fold   639.96        2.75
vector/map      638.89        2.72
streamly/map    653.36        2.66
vector/zip      651.42        2.58
streamly/zip    644.33        2.59
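
How the unit is chosen is not spelled out here. As a minimal sketch of one plausible rule, and not bench-show's actual implementation, the hypothetical helpers below pick a time unit from the smallest value in a column so that every entry is at least 1 in the chosen unit. The names timeUnit and scaleColumn, the thresholds, and the assumption that raw times are recorded in seconds are all illustrative. The sample inputs are the first three times from the table above converted to seconds.

```haskell
-- Hypothetical helpers, not part of the bench-show API.

-- | Pick a display unit and scale factor for a time given in seconds.
timeUnit :: Double -> (String, Double)
timeUnit seconds
    | seconds >= 1    = ("s", 1)
    | seconds >= 1e-3 = ("ms", 1e3)
    | seconds >= 1e-6 = ("μs", 1e6)
    | otherwise       = ("ns", 1e9)

-- | Scale a non-empty column by the unit of its smallest value.
-- Calling this with an empty list is an error (minimum is partial).
scaleColumn :: [Double] -> (String, [Double])
scaleColumn xs = (unit, map (* factor) xs)
  where
    (unit, factor) = timeUnit (minimum xs)

main :: IO ()
main = do
    let (unit, scaled) = scaleColumn [6.4162e-4, 6.3996e-4, 6.3889e-4]
    putStrLn ("time(" ++ unit ++ ")")
    mapM_ print scaled
```

Keying the unit on the smallest value keeps every entry in a column at or above 1, at the cost of large numbers when the values span several orders of magnitude.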

graph generates one bar chart per field:

$ bench-show --presentation Fields graph results.csv output

graph "results.csv" "output" defaultConfig

When the input file contains results from a single benchmark run, by default all the benchmarks are placed in a single benchmark group named “default”.

Grouping

Let’s write a benchmark classifier to put the streamly and vector benchmarks in their own groups:

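   import Data.List.Split (splitOn)

   -- Split "group/benchmark" into (group, benchmark)
   classifier :: String -> Maybe (String, String)
   classifier name =
       case splitOn "/" name of
           grp : bench -> Just (grp, concat bench)
           _          -> Nothing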
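
To see what such a classifier returns, here is a minimal base-only sketch. It is a variant of the definition above, not a copy: it uses break instead of splitOn, so it needs no extra package, it splits only at the first "/", and it returns Nothing for names that have no group prefix. That last point is a deliberate difference. splitOn never returns an empty list, so the definition above takes its first case for every name, and a name without a "/" becomes a group with an empty benchmark name.

```haskell
-- | Split a benchmark name at the first '/' into (group, benchmark).
-- Names with no '/', an empty group, or an empty benchmark part
-- are rejected with Nothing.
classifier :: String -> Maybe (String, String)
classifier name =
    case break (== '/') name of
        (grp, '/' : bench)
            | not (null grp) && not (null bench) -> Just (grp, bench)
        _ -> Nothing

main :: IO ()
main = mapM_ (print . classifier) ["vector/fold", "streamly/zip", "standalone"]
-- prints:
-- Just ("vector","fold")
-- Just ("streamly","zip")
-- Nothing
```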

Now we can show the two benchmark groups as separate columns. We can generate reports comparing different benchmark fields (e.g. time and maxrss) for all the groups:

   report "results.csv" Nothing
     defaultConfig { classifyBenchmark = classifier }

(time)(Median)
Benchmark streamly(μs) vector(μs)
--------- ------------ ----------
fold            639.96     641.62
map             653.36     638.89
zip             644.33     651.42

We can do the same graphically as well: just replace report with graph in the code above. Each group is placed as a cluster on the graph. Multiple clusters are placed side by side (i.e. on the same scale) for easy comparison.

Regression, Percentage Difference and Sorting

We can append benchmark results from multiple runs to the same file. These runs can then be compared. For example, we can run benchmarks before and after a change and then report the regressions by percentage change, in sorted order.

Given a results file with two runs, this code generates the report that follows:

   report "results.csv" Nothing
     defaultConfig
         { classifyBenchmark = classifier
         , presentation = Groups PercentDiff
         , selectBenchmarks = \f ->
              reverse
              $ map fst
              $ sortBy (comparing snd)
              $ either error id $ f (ColumnIndex 1) Nothing
         }

(time)(Median)(Diff using min estimator)
Benchmark streamly(0)(μs)(base) streamly(1)(%)(-base)
--------- --------------------- ---------------------
zip                      644.33                +23.28
map                      653.36                 +7.65
fold                     639.96                -15.63

It tells us that in the second run the worst affected benchmark is zip, taking 23.28 percent more time than the baseline.
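
The ordering step passed to selectBenchmarks can be tried on its own. The sketch below is base-only and assumes the generator function yields (benchmark name, value) pairs, as the use of fst and snd in the code above suggests. worstFirst is a hypothetical name, and the input values are the percentages from the report above.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- | Order benchmark names by their value, largest first.
-- Same pipeline as in the selectBenchmarks example: sort ascending
-- on the value, keep the names, then reverse.
worstFirst :: [(String, Double)] -> [String]
worstFirst = reverse . map fst . sortBy (comparing snd)

main :: IO ()
main = print (worstFirst [("fold", -15.63), ("map", 7.65), ("zip", 23.28)])
-- prints ["zip","map","fold"]
```

Dropping the reverse would list the biggest improvements first instead of the biggest regressions.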

Full Documentation and examples

See the haddock documentation on Hackage
See the comprehensive tutorial module in the haddock docs
For examples see the test directory in the package

Contributions and Feedback

Contributions are welcome! Please see the TODO.md file or the existing issues if you want to pick up something to work on.

Any feedback on improvements or the direction of the package is welcome. You can always send an email to the maintainer or raise an issue for anything you want to suggest or discuss, or send a PR for any change that you would like to make.

Changes

0.3.1

Bug Fixes

report and graph generation now work even when the number of samples is less than 3

0.3.0

Breaking Changes

The signature of selectBenchmarks has changed. Use ‘Nothing’ as the second argument of the benchmark generator function to port old code without any impact.
Removed the broken ‘Percent’ constructor from GroupStyle. Use PercentDiff instead to make relative comparisons.
The behavior of PercentDiff has changed: it now computes the % from the lower value instead of from the baseline.
The default diffStrategy has been changed to SingleEstimator instead of MinEstimator.

Deprecations

Config fields title and titleAnnotations have been deprecated, please use mkTitle instead.

Bug Fixes

GroupStyle Absolute now honors the MinEstimator setting. When MinEstimator is set, the groups being compared to the baseline now display the value based on the estimator that provides the closest estimate to the baseline.

Enhancements

Add a CLI executable to generate textual reports and graphs from a criterion or gauge CSV output file.
Add Multiples as a comparison option; the group being compared is shown as a multiple of the baseline.
Add ability to omit the baseline group from the results when we are doing a relative comparison among groups.
Add the ability to sort the benchmarks using a different criterion than the one used to present the benchmarks in the final report output.
Add mkTitle config option to use a function for generating a custom report title.

0.2.2

Allow additional annotations to title to be controlled via config
Better error handling

0.2.1

Use new version of statistics package.

0.2.0

Release Notes

Due to a bug in the statistics package, reporting may crash on certain inputs with a vector index out of bounds message. The bug has been fixed and will be available in an upcoming release.

Breaking Changes

The package bench-graph has been renamed to bench-show to reflect the fact that it now includes text reports as well. This includes the change of module name BenchGraph to BenchShow.
The bgraph API has been removed and replaced by graph
The way the output file name is generated has changed. The field name, the group name, or both may now be suffixed to the output file name automatically. The estimator type (e.g. mean or median) is also suffixed to the file name.
Changes to Config record:
- chartTitle field has been renamed to title.
- The type of outputDir is now a Maybe.
- comparisonStyle has been replaced by presentation
- ComparisonStyle has been replaced by Presentation
- sortBenchmarks has been replaced by selectBenchmarks. The new function can be defined as follows in terms of an older definition: selectBenchmarks = \g -> sortBenchmarks $ either error (map fst) $ g (ColumnIndex 0)
- sortBenchGroups has been replaced by selectGroups
- setYScale field has been broken down into two fields fieldRanges and fieldTicks. Now you also need to specify which fields’ scale you want to set.

Enhancements

A report API has been added to generate textual reports
More ways to compare groups have been added, including percent and percent difference
Now we can show multiple fields as columns in a single benchmark group report
Field units are now automatically selected based on the range of values
Additions to Config record type:
- selectFields added to select the fields to be plotted and to change their presentation order.
- selectBenchmarks can now sort the results based on values corresponding to any field or benchmark group.
- new fields added: diffStrategy, verbose, estimator, threshold

0.1.4

Fix a bug resulting in a bogus error, something like “Field [time] found at different indexes..” even though the field has exactly the same index at all places.

0.1.3

Add maxrss plotting support

0.1.2

Fixed a bug that caused missing graphs in some cases when multiple iterations of a benchmark are present in the benchmark results file.
Better error reporting to pinpoint errors when a problem occurs.

0.1.1

Support GHC 8.4

0.1.0

Initial release