Performance & Profiling
Profiling with pprof
- Query the pprof endpoint on the node host:
  - CPU:
    curl -X GET "localhost:6060/debug/pprof/profile?seconds=<number>" > <filename>
  - Heap:
    curl -X GET "localhost:6060/debug/pprof/heap?seconds=<number>" > <filename>
  - You can also query from your local machine by substituting localhost with the IP of the node, depending on your network setup. By doing this, you can skip the SCP step.
- If you queried on the node host, SCP the file to yourself:
    scp <filename> <user>@<host>:<path>
  - E.g. scp <filename> <user>@<host>:/home/roman/osmosis/pprof
  - Ensure that your ISP or firewall is not blocking the file transfer.
- Run a web server and open up a browser:
    go tool pprof -http=localhost:8080 <filename>
  - graphviz must be installed.
 
Memory
Causes
The following can cause memory issues in Go:
– Creating substrings and subslices.
– Wrong use of the defer statement.
– Unclosed HTTP response bodies (or unclosed resources in general).
– Orphaned hanging goroutines.
– Global variables.
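To illustrate the first cause, here is a minimal sketch of how a small subslice can keep a large backing array alive, and how copying avoids that. The function names headerShared and headerCopied are hypothetical and exist only for this example.

```go
package main

import "fmt"

// headerShared returns a subslice of buf. The result still points into
// buf's backing array, so the whole array stays reachable while it is used.
func headerShared(buf []byte) []byte {
	return buf[:4]
}

// headerCopied copies the first four bytes into a fresh slice, so the
// large buffer can be garbage collected once the caller drops it.
func headerCopied(buf []byte) []byte {
	out := make([]byte, 4)
	copy(out, buf[:4])
	return out
}

func main() {
	buf := make([]byte, 1<<20) // 1 MiB buffer

	shared := headerShared(buf)
	copied := headerCopied(buf)

	// The capacity reveals how much of the backing array each slice spans.
	fmt.Println(len(shared), cap(shared)) // 4 1048576
	fmt.Println(len(copied), cap(copied)) // 4 4
}
```

The same reasoning applies to substrings, which share the bytes of the original string.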
Interpreting Output
– inuse_space: pprof shows the amount of memory allocated and not yet released.
– inuse_objects: pprof shows the number of objects allocated and not yet released.
– alloc_space: pprof shows the total amount of memory allocated, regardless of whether it was released.
– alloc_objects: pprof shows the total number of objects allocated, regardless of whether they were released.
– flat: the memory allocated by a function and still held by that function.
– cum: the memory allocated by a function or by any function that it calls down the stack.
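The difference between the alloc_* and inuse_* views can be sketched with runtime.MemStats, whose TotalAlloc counter is cumulative while HeapAlloc only covers memory that is still live. This is an analogy rather than what pprof does internally (pprof works from sampled allocation records), and the helpers churn and measure are hypothetical names for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink is a package-level variable so the allocations land on the heap.
var sink []byte

// churn allocates n buffers of the given size; each new buffer replaces the
// previous one, so almost everything becomes garbage right away.
func churn(n, size int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, size)
	}
	sink = nil
}

// measure runs f and reports how many bytes were allocated in total
// (comparable to alloc_space) and how many are live afterwards
// (comparable to inuse_space).
func measure(f func()) (allocated, live uint64) {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)
	f()
	runtime.GC()
	runtime.ReadMemStats(&after)
	return after.TotalAlloc - before.TotalAlloc, after.HeapAlloc
}

func main() {
	allocated, live := measure(func() { churn(64, 1<<20) })
	fmt.Printf("allocated in total:  %d bytes\n", allocated)
	fmt.Printf("still live after GC: %d bytes\n", live)
}
```

A function that churns through short-lived buffers like this would rank high in alloc_space but low in inuse_space, which is why the two views answer different questions.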
Useful links
- Pprof Doc
- Graphviz Download
- Using SCP
- Advanced Go Profiling Talk (YouTube)
- Notes from the talk above
- Memory Leaking Scenarios
- Great blogpost about profiling heap
 
Benchmarking
Best practices
- Run the benchmarks on an idle machine that is not running on battery
- Use -benchmem to also get stats on allocated objects and space
- Use benchstat to compare performance across different git branches
- Add -run='$^' or -run=- to each go test command to avoid running the tests too
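For readers who have not written one before, the following is a minimal sketch of the kind of benchmark function that go test -bench selects. The render function and BenchmarkRender are hypothetical stand-ins, not code from any real package. In a real project the benchmark would live in a _test.go file; here testing.Benchmark drives it from main so the example runs on its own.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// render is a hypothetical stand-in for the code under test.
func render(n int) string {
	var sb strings.Builder
	for i := 0; i < n; i++ {
		fmt.Fprintf(&sb, "node-%d;", i)
	}
	return sb.String()
}

// BenchmarkRender has the signature that go test -bench looks for.
func BenchmarkRender(b *testing.B) {
	b.ReportAllocs() // reports allocations for this benchmark, like -benchmem
	for i := 0; i < b.N; i++ {
		_ = render(100)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of go test.
	res := testing.Benchmark(BenchmarkRender)
	fmt.Println(res.String(), res.MemString())
}
```

The printed line carries the same ns/op, B/op and allocs/op columns that go test -bench -benchmem writes, which is the input benchstat expects.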
 
Benchstat sample output for illustration:
name                old time/op    new time/op    delta
Decode-4               2.20s ± 0%     1.54s ± 0%   ~     (p=1.000 n=1+1)
For benchstat specifically:
- Use higher -count values if the benchmark numbers aren't stable
  - if you don't, your sample size will be too small and the delta might not be reported (as in the example above, which shows ~) because the difference is not statistically significant
  - if you do, the benchmarks take longer, since you need multiple runs to get a good sample size
  - people recommend 5 as a good enough sample size
Example
Let's assume that we are working on the branch osmosis/string and have added some performance improvements to tree.String().
As a result, we would like to benchmark the change with BenchmarkTreeString in iavl.
To get a nice benchmark summary, we would follow these steps:
- Check out the master branch and get the output of the benchmark:
    git checkout master
    go test -benchmem -run='^$' -bench '^BenchmarkTreeString$' -count 5 github.com/cosmos/iavl > bench_string_old.txt
- Check out our osmosis/string branch and get the output of the benchmark:
    git checkout osmosis/string
    go test -benchmem -run='^$' -bench '^BenchmarkTreeString$' -count 5 github.com/cosmos/iavl > bench_string_new.txt
- Compare the two outputs with benchstat:
    benchstat bench_string_old.txt bench_string_new.txt
- Evaluate the output and attach it to your PR, if needed
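To make the comparison step less of a black box, here is a rough sketch of the arithmetic behind a time/op delta: collect the ns/op values from each output, average them, and compute the relative change. This is not benchstat's implementation (benchstat also runs a significance test, which is where the p value comes from). The helper names and the sample lines in main are made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nsPerOp extracts every "<value> ns/op" measurement from benchmark output.
func nsPerOp(output string) []float64 {
	var vals []float64
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Fields(line)
		for i := 1; i < len(fields); i++ {
			if fields[i] != "ns/op" {
				continue
			}
			if v, err := strconv.ParseFloat(fields[i-1], 64); err == nil {
				vals = append(vals, v)
			}
		}
	}
	return vals
}

// mean returns the arithmetic mean, or 0 for an empty sample.
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// percentDelta returns the relative change from oldMean to newMean in percent.
func percentDelta(oldMean, newMean float64) float64 {
	return (newMean - oldMean) / oldMean * 100
}

func main() {
	// Made-up sample lines in the go test -bench output format.
	oldOut := "BenchmarkTreeString-4\t5\t200 ns/op\nBenchmarkTreeString-4\t5\t220 ns/op\n"
	newOut := "BenchmarkTreeString-4\t5\t150 ns/op\nBenchmarkTreeString-4\t5\t160 ns/op\n"

	o, n := mean(nsPerOp(oldOut)), mean(nsPerOp(newOut))
	fmt.Printf("old %.1f ns/op, new %.1f ns/op, delta %.1f%%\n", o, n, percentDelta(o, n))
}
```

With a single run per branch there is only one value on each side, which is why a higher -count is needed before a tool like benchstat can judge whether a difference is significant.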