http/pprof Profiling
Using the http/pprof support you can profile your web applications and services to see exactly where your performance or memory is being taken.
Building and Running the Project
We have a website that we will use to learn and explore more about profiling. This project is just creating a struct type value, marshaling it to JSON and using that as the response.
Build and run the example program.
$ go build
$ ./pprof
Test it is working.
http://localhost:4000/sendjson
To add load to the service while running profiling we can run these command.
// Send 1M request using 8 connections.
$ hey -m POST -c 8 -n 1000000 "http://localhost:4000/sendjson"
Raw http/pprof
We already added the following import so we can include the profiling route to our web service.
import _ "net/http/pprof"
Look at the basic profiling stats from the new endpoint.
http://localhost:4000/debug/pprof
Run a single search from the Browser and then refresh the profile information.
http://localhost:4000/sendjson
Put some load of the web application. Review the raw profiling information once again.
$ hey -m POST -c 8 -n 1000000 "http://localhost:4000/sendjson"
Looking at Heap Profiles
http://localhost:4000/debug/pprof/heap?debug=1
At the top of the heap profile you will see this information.
heap profile: 0: 0 [9318: 3545424] @ heap/1048576
[0:] Currently live objects,
[0] Amount of memory occupied by live objects
[9318:] Total number of allocations
[3545424] Amount of memory occupied by all allocations
At the bottom of the heap profile you will see this information.
General statistics.
Alloc uint64 // bytes allocated and not yet freed
TotalAlloc uint64 // bytes allocated (even if freed)
Sys uint64 // bytes obtained from system (sum of XxxSys below)
Lookups uint64 // number of pointer lookups
Mallocs uint64 // number of mallocs
Frees uint64 // number of frees
Main allocation heap statistics.
HeapAlloc uint64 // bytes allocated and not yet freed (same as Alloc above)
HeapSys uint64 // bytes obtained from system
HeapIdle uint64 // bytes in idle spans
HeapInuse uint64 // bytes in non-idle span
HeapReleased uint64 // bytes released to the OS
HeapObjects uint64 // total number of allocated objects
Low-level fixed-size structure allocator statistics.
// Inuse is bytes used now.
// Sys is bytes obtained from system.
StackInuse uint64 // bytes used by stack allocator
StackSys uint64
MSpanInuse uint64 // mspan structures
MSpanSys uint64
MCacheInuse uint64 // mcache structures
MCacheSys uint64
BuckHashSys uint64 // profiling bucket hash table
GCSys uint64 // GC metadata
OtherSys uint64 // other system allocations
Garbage collector statistics.
NextGC uint64 // next collection will happen when HeapAlloc ≥ this amount
PauseNs [256]uint64 // circular buffer of recent GC pause durations, most recent at [(NumGC+255)%256]
NumGC uint32 // Number of GC that have run
Interactive Profiling
Heap Profiling
Run the pprof tool.
$ go tool pprof -<PICK_MEM_PROFILE> ./pprof http://localhost:4000/debug/pprof/heap
Documentation of memory profile options.
// Useful to see pressure on heap over time.
-alloc_space : All allocations happened since program start ** default
-alloc_objects: Number of object allocated at the time of profile
// Useful to see current status of heap.
-inuse_space : Allocations live at the time of profile
-inuse_objects: Number of bytes allocated at the time of profile
If you want to reduce memory consumption, look at the -inuse_space
profile collected during normal program operation.
If you want to improve execution speed, look at the -alloc_objects
profile collected after significant running time or at program end.
CPU Profiling
Run the Go pprof tool in another window or tab to review cpu information.
$ go tool pprof http://localhost:4000/debug/pprof/profile
If you include the binary when using the browser tooling, you can get informatio down to the assembly.
$ go tool pprof -http :3000 ./pprof http://localhost:4000/debug/pprof/profile
Note that goroutines in “syscall” state consume an OS thread, other goroutines do not (except for goroutines that called runtime.LockOSThread, which is, unfortunately, not visible in the profile).
Note that goroutines in “IO wait” state do NOT consume an OS thread. They are parked on the non-blocking network poller.
Explore using the top, list, web and weblist commands from the command line. If you are using the browser based UI, these items on in the navigation.
Comparing Profiles
Take a snapshot of the current heap profile. Then do the same for the cpu profile.
$ curl -s http://localhost:4000/debug/pprof/heap > base.heap
After some time, take another snapshot:
$ curl -s http://localhost:4000/debug/pprof/heap > current.heap
Now compare both snapshots against the binary and get into the pprof tool:
$ go tool pprof -inuse_space -base base.heap current.heap