Observability of the Java Virtual Machine

- May 23, 2026

The JVM is one of the most observable runtimes. It ships with many tools for troubleshooting a Java application in production.

1. Thread observability

Threads are how the JVM actually does work. When something goes wrong in production, the symptom almost always shows up in a thread: stopped, blocked, leaking, and so on. Thread dumps work on any JVM with no instrumentation, no agents, and no restarts.

// (1) Deadlock — two threads grab the same pair of locks in opposite order.

new Thread(() -> grab(LOCK_A, LOCK_B), "deadlock-A-then-B").start();

new Thread(() -> grab(LOCK_B, LOCK_A), "deadlock-B-then-A").start();

A thread dump can be fetched from the actuator endpoint:

http://localhost:8080/actuator/threaddump

To list the running JVMs, we can use the command below.

PS C:\observe-jvm> jps -lv

25296 jdk.jcmd/sun.tools.jps.Jps -Dapplication.home=C:\Program Files\Microsoft\jdk-21.0.3.9-hotspot -Xms8m -Djdk.module.main=jdk.jcmd

9588 com.observejvm.l01threads.ThreadDemo -javaagent:C:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2025.2\lib\idea_rt.jar=60522 -Dfile.encoding=UTF-8 -Dsun.stdout.encoding=UTF-8 -Dsun.stderr.encoding=UTF-8
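As an alternative to looking the PID up with jps, an application can report its own PID at startup. The sketch below assumes Java 9 or later; the class name PidLogger is hypothetical and not part of the sample project.

```java
final class PidLogger {

    /** Returns the PID of the JVM this code is running in (ProcessHandle exists since Java 9). */
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        // Print a ready-to-paste jcmd command for this process.
        System.out.println("jcmd " + currentPid() + " Thread.print");
    }
}
```

Logging this line once at startup means the PID is already in the application log when an incident happens.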

Now that we know the PID of the JVM process, we can get a thread dump with the command below:

PS C:\observe-jvm> jcmd 9588 Thread.print

...

Found one Java-level deadlock:

=============================

"deadlock-A-then-B":

waiting to lock monitor 0x000001c9c1a66340 (object 0x00000006043b0458, a java.lang.Object),

which is held by "deadlock-B-then-A"

"deadlock-B-then-A":

waiting to lock monitor 0x000001c9fe5f8390 (object 0x00000006043b0468, a java.lang.Object),

which is held by "deadlock-A-then-B"

Java stack information for the threads listed above:

===================================================

"deadlock-A-then-B":

at com.observejvm.l01threads.ThreadDemo.grab(ThreadDemo.java:62)

- waiting to lock <0x00000006043b0458> (a java.lang.Object)

- locked <0x00000006043b0468> (a java.lang.Object)

So jcmd detects the deadlock for us and shows the exact method call and line where each thread is stuck.
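The same deadlock check can also be run from inside the process through the standard ThreadMXBean. The sketch below is a minimal illustration, not the sample project's code: DeadlockProbe, its grab helper and the latch are assumptions added so that the deadlock forms reliably, and the threads are daemons so the JVM can still exit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

final class DeadlockProbe {

    // Hypothetical stand-in for the post's grab(): lock `first`, wait until the
    // other thread also holds its first lock, then block forever on `second`.
    private static void grab(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Unreachable once both threads are deadlocked.
            }
        }
    }

    /** Starts two daemon threads that take the same pair of locks in opposite order. */
    static void startDeadlock() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread a = new Thread(() -> grab(lockA, lockB, bothHoldFirst), "deadlock-A-then-B");
        Thread b = new Thread(() -> grab(lockB, lockA, bothHoldFirst), "deadlock-B-then-A");
        a.setDaemon(true);
        b.setDaemon(true);
        a.start();
        b.start();
    }

    /** Returns the names of all threads the JVM currently reports as deadlocked. */
    static List<String> deadlockedThreadNames() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when there is no deadlock
        List<String> names = new ArrayList<>();
        if (ids != null) {
            for (ThreadInfo info : threads.getThreadInfo(ids)) {
                if (info != null) {
                    names.add(info.getThreadName());
                }
            }
        }
        return names;
    }

    /** Polls until at least two deadlocked threads are reported or the timeout expires. */
    static List<String> awaitDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> names = deadlockedThreadNames();
        while (names.size() < 2 && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            names = deadlockedThreadNames();
        }
        return names;
    }

    public static void main(String[] args) {
        startDeadlock();
        System.out.println("Deadlocked threads: " + awaitDeadlock(5_000));
    }
}
```

A check like this could back a health endpoint or a periodic log line, so a deadlock is noticed even when nobody takes a thread dump by hand.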

The second case we'll look at is a thread waiting on a stuck downstream call, simulated here with a very long sleep:

new Thread(() -> sleep(Long.MAX_VALUE), "stuck-downstream-call").start();

PS C:\observe-jvm> jcmd 2616 Thread.print

"stuck-downstream-call" #49 [668] prio=5 os_prio=0 cpu=0.00ms elapsed=16.36s tid=0x000001ce690a9e70 nid=668 waiting on condition [0x00000048faeff000]

java.lang.Thread.State: TIMED_WAITING (sleeping)

at java.lang.Thread.sleep0(java.base@21.0.3/Native Method)

at java.lang.Thread.sleep(java.base@21.0.3/Thread.java:509)

at com.observejvm.l01threads.ThreadDemo.sleep(ThreadDemo.java:79)

at com.observejvm.l01threads.ThreadDemo.lambda$main$0(ThreadDemo.java:42)
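The state that jcmd prints is also visible from Java code. The sketch below looks a thread up by name and reads its state; ThreadStateProbe and its helper methods are hypothetical names, and the sleeper thread only imitates the stuck call from the sample.

```java
import java.util.Optional;

final class ThreadStateProbe {

    /** Looks up a live thread by name and returns its current state, if such a thread exists. */
    static Optional<Thread.State> stateOf(String threadName) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().equals(threadName))
                .map(Thread::getState)
                .findFirst();
    }

    /** Starts a daemon thread that sleeps practically forever, like the stuck call in the sample. */
    static void startSleeper(String name) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(true);
        t.start();
    }

    /** Polls until the named thread reaches the expected state or the timeout expires. */
    static Optional<Thread.State> awaitState(String name, Thread.State expected, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        Optional<Thread.State> state = stateOf(name);
        while (!state.equals(Optional.of(expected)) && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            state = stateOf(name);
        }
        return state;
    }

    public static void main(String[] args) {
        startSleeper("stuck-downstream-call");
        System.out.println(awaitState("stuck-downstream-call", Thread.State.TIMED_WAITING, 2_000));
    }
}
```

A thread that stays in TIMED_WAITING or WAITING with the same stack across several dumps is a strong hint that it is stuck rather than just slow.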

The third case is a pool of three threads, one of which uses a lot of CPU time while the others are mostly idle:

// (3) A small worker pool: one CPU-hot, two mostly idle.
var pool = Executors.newFixedThreadPool(3, r -> {
    Thread t = new Thread(r);
    t.setName("worker-" + t.threadId());
    return t;
});
pool.submit(ThreadDemo::burnHot);
pool.submit(ThreadDemo::idleLoop);

// Spins forever, so this thread shows up as RUNNABLE with high CPU time.
private static void burnHot() {
    long x = 1;
    while (true) {
        x = x * 3 + 1;
    }
}

// Sleeps in a loop, so this thread shows up as TIMED_WAITING with almost no CPU time.
private static void idleLoop() {
    while (true) sleep(1000);
}

PS C:\Users\bahad> jcmd 27236 Thread.print

"worker-49" #49 [10640] prio=5 os_prio=0 cpu=10531.25ms elapsed=10.60s tid=0x0000022d87610d30 nid=10640 runnable [0x000000a85e3fe000]

java.lang.Thread.State: RUNNABLE

at com.observejvm.l01threads.ThreadDemo.burnHot(ThreadDemo.java:77)

"worker-50" #50 [14692] prio=5 os_prio=0 cpu=0.00ms elapsed=10.60s tid=0x0000022d8760f2f0 nid=14692 waiting on condition [0x000000a85e4ff000]

java.lang.Thread.State: TIMED_WAITING (sleeping)

at java.lang.Thread.sleep0(java.base@21.0.3/Native Method)

at java.lang.Thread.sleep(java.base@21.0.3/Thread.java:509)

at com.observejvm.l01threads.ThreadDemo.sleep(ThreadDemo.java:86)

at com.observejvm.l01threads.ThreadDemo.idleLoop(ThreadDemo.java:82)
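The cpu= value in the dump can also be read per thread through ThreadMXBean.getThreadCpuTime. The sketch below compares a spinning thread with a sleeping one over a short window. CpuTimeProbe and its helpers are hypothetical names, and the code assumes a JVM where thread CPU time measurement is supported and enabled, which is typically the case on HotSpot.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.atomic.AtomicBoolean;

final class CpuTimeProbe {

    // Written once by the hot loop so the JIT cannot discard the computation.
    private static volatile long sink;

    private static Thread daemon(Runnable body, String name) {
        Thread t = new Thread(body, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** CPU time consumed by a live thread in milliseconds, or -1 if it is not available. */
    static long cpuMillis(Thread t) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled()) {
            return -1;
        }
        long nanos = bean.getThreadCpuTime(t.getId());
        return nanos < 0 ? -1 : nanos / 1_000_000;
    }

    /** Runs one spinning and one sleeping thread for the window; returns {hotMillis, idleMillis}. */
    static long[] measure(long windowMillis) {
        AtomicBoolean running = new AtomicBoolean(true);
        Thread hot = daemon(() -> {
            long x = 1;
            while (running.get()) {
                x = x * 3 + 1;
            }
            sink = x;
        }, "worker-hot");
        Thread idle = daemon(() -> {
            while (running.get()) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }, "worker-idle");
        try {
            Thread.sleep(windowMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // Read both values while the threads are still alive, then stop them.
        long hotMillis = cpuMillis(hot);
        long idleMillis = cpuMillis(idle);
        running.set(false);
        return new long[] {hotMillis, idleMillis};
    }

    public static void main(String[] args) {
        long[] result = measure(500);
        System.out.println("hot=" + result[0] + "ms idle=" + result[1] + "ms");
    }
}
```

In the dump above the same signal is visible directly: worker-49 has a cpu value almost equal to its elapsed time, while worker-50 has used practically none.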

Note: The JVM also runs a small number of housekeeping daemon threads, which appear in the thread dump as well. They are easy to recognize by name; some of them are listed below:

Reference Handler

Finalizer

Signal Dispatcher

Common-Cleaner

Notification Thread

Attach Listener

Service Thread

2. Memory observability

Every OutOfMemoryError (OOM) has a story, and the heap dump is its witness statement.

Memory issues are the second most common production incident after thread issues. An OOM often takes down the whole process. Auto-restart hides the leak, the cycle repeats, and nobody investigates until customers notice.
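One way to keep the evidence when the process dies and restarts is HotSpot's HeapDumpOnOutOfMemoryError option, which this post does not otherwise cover. It is usually passed as -XX:+HeapDumpOnOutOfMemoryError on the command line; the sketch below assumes a HotSpot JVM and switches it on at runtime through HotSpotDiagnosticMXBean instead. The class name OomDumpSwitch is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class OomDumpSwitch {

    /** Enables heap dumps on OutOfMemoryError and returns the option's value afterwards. */
    static String enableHeapDumpOnOom() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // HeapDumpOnOutOfMemoryError is a manageable HotSpot flag, so it can be changed at runtime.
        bean.setVMOption("HeapDumpOnOutOfMemoryError", "true");
        return bean.getVMOption("HeapDumpOnOutOfMemoryError").getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError=" + enableHeapDumpOnOom());
    }
}
```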

Heap dumps contain far more information than thread dumps: a byte-by-byte snapshot of every live object, every reference, and every field.

After running the sample application, we run the command below to take a heap dump:

PS C:\observe-jvm> jcmd 20484 GC.heap_dump C:\observe-jvm\leak.hprof

20484:

Dumping heap to C:\observe-jvm\leak.hprof ...

Heap dump file created [956318512 bytes in 0.678 secs]
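A heap dump can also be triggered from inside the application. The sketch below assumes a HotSpot JVM and uses HotSpotDiagnosticMXBean.dumpHeap; the class name HeapDumper and the temporary output path are hypothetical. The target file name should end in .hprof and the file must not exist yet.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

final class HeapDumper {

    /** Returns a path for a dump file inside a newly created temporary directory. */
    static Path freshDumpPath() {
        try {
            return Files.createTempDirectory("heapdump-sketch").resolve("sketch.hprof");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Dumps only live (reachable) objects to the target file and returns the file size in bytes. */
    static long dumpLiveHeap(Path target) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(target.toString(), true); // true = live objects only
            return Files.size(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path target = freshDumpPath();
        System.out.println("Wrote " + dumpLiveHeap(target) + " bytes to " + target);
    }
}
```

Whichever way the dump is produced, it holds the full contents of the heap, so it should be handled like any other sensitive production data.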

We can also get the heap dump from the actuator endpoint:

http://localhost:8080/actuator/heapdump

Then we can open the .hprof file in IntelliJ IDEA:

(To be continued)
