Part of running big distributed systems at scale is encountering issues that are hard to debug. Memory leaks, sudden crashes, hanging threads… they might all manifest under extreme production conditions, but never on our laptops or in our test environments.
That’s why we sometimes need to go straight to the source and profile a single JVM that is under real production load.
This guide aims to show how we can attach a profiler to a running application when the network, AWS permissions or even a layer of containerisation might be in the way.

We will achieve this by making use of a profiler agent running next to the remote JVM, which will send data to our profiler client. The two will be connected by an SSH tunnel.
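
As a rough illustration of the tunnel part of this setup, the sketch below builds a standard `ssh -L` local port-forwarding command, so that a profiler client connecting to a local port reaches the agent on the remote side. The host name, user and port `10001` are hypothetical placeholders, not values taken from this guide; the real agent port depends on the profiler being used.

```python
import shlex


def ssh_tunnel_command(ssh_target, local_port, agent_host, agent_port):
    """Build an ssh command that forwards a local port to the profiler agent.

    ssh_target: user@host of the machine we can SSH into (hypothetical here).
    agent_host/agent_port: where the agent listens, as seen from ssh_target.
    """
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:{agent_host}:{agent_port}",  # local port forwarding
        ssh_target,
    ]


# Hypothetical values: the agent is assumed to listen on port 10001
# on the same machine we SSH into.
cmd = ssh_tunnel_command("user@jvm-host.example.com", 10001, "localhost", 10001)
print(shlex.join(cmd))
```

With a tunnel like this open, the profiler client would be pointed at `localhost` and the local port rather than at the remote machine directly.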