Observability of the Java Virtual Machine

Image
The JVM is one of the most observable runtimes. It provides us lots of tools for troubleshooting a JVM application in production. 1. Thread observability Threads are how the JVM actually does work. When something is wrong in production, the symptom is almost always a thread: stopped, blocked, leaking etc. Thread dumps work on any JVM with no  instrumentation, no agents, no restarts. <Example project link with /threaddump endpoint>         // (1) Deadlock — two threads grab the same pair of locks in opposite order.         new Thread(() -> grab(LOCK_A, LOCK_B), "deadlock-A-then-B").start();         new Thread(() -> grab(LOCK_B, LOCK_A), "deadlock-B-then-A").start(); http://localhost:8080/actuator/threaddump To list the JVMS, we can use the command below. PS C:\observe-jvm> jps -lv 25296 jdk.jcmd/sun.tools.jps.Jps -Dapplication.home=C:\Program Files\Microsoft\jdk-21.0.3.9-hotspot -Xms8m -Djdk.module.main=...

Java Bit Manipulation For Repeated DNA Sequences


Problem description from LeetCode:

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

Example:
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC", "CCCCCAAAAA"]

Solution:
The idea is that we can use 2 bits to store a letter. 
We use a map to count occurences of 10 letter sequences.
When we get a new letter, we shift 2 bits to the left and then append the new letter.
After appending 11th character, we must remove the first char on the left.
To remove it, since we store 10 letters in 20 bits, our bit mask is: 0xFFFFF

Source Code:

public class RepeatedDNA {

    public static List findRepeatedDnaSequences(String string) {

        if (string.isEmpty()) {
            return new ArrayList<>();
        }

        if (string.length() <= 10) {
            return new ArrayList<>();
        }

        List result = new ArrayList<>();

        char[] s = string.toCharArray();

        Map map = new HashMap<>();

        int code = 0;
        for (int i = 0; i < 10; ++i) {
            code = (code << 2) | mapping(s[i]);
            //    System.out.println(code);
        }

        map.put(code, 1);

        for (int i = 10; i < s.length; ++i) {

            code = ((code << 2) & (0xFFFFF)) | mapping(s[i]);

            // System.out.println(Integer.toBinaryString(code));

            Integer count = map.get(code);

            if (count == null) {
                map.put(code, 0);
                count = 0;
            }

            if (count == 1) {
                result.add(string.substring(i - 9, i + 1));
            }

            map.put(code, count + 1);
        }
        
        return result;
    }

    static int mapping(char s) {
        if (s == 'A') return 0;
        if (s == 'C') return 1;
        if (s == 'G') return 2;
        if (s == 'T') return 3;

        return -1;
    }

    public static void main(String[] args) {

        List result = findRepeatedDnaSequences(
                "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
        );

        for (String item : result) {
            System.out.println(item);
        }
    }
}

Comments

Popular posts from this blog

The WeakReference class, monitoring memory leak and garbage collection in a Java application

Simplescalar Simulator - Part 2: sim-outorder.c

Notes on Java Performance