DataStax Help Center

Taking Thread Dumps to Troubleshoot High CPU Utilization

Use the attached multidump.sh script to take thread dumps and capture top -H output at regular intervals while your nodes are experiencing high CPU utilization. The script must be run as the same user that owns the DSE Java process; otherwise, the jstack command will not work. The usage is:

multidump.sh pid interval count
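
For orientation, the kind of sampling loop described above can be sketched as follows. This is not the attached multidump.sh, whose contents may differ. The function names sample_threads and tid_to_nid are illustrative, and the sketch assumes that jstack (from the JDK) and a procps-style top are on the PATH.

```shell
#!/bin/sh
# Illustrative sketch only; the attached multidump.sh may work differently.

# sample_threads <pid> <interval-seconds> <count>
# Appends one thread dump and one per-thread CPU snapshot per iteration.
sample_threads() {
    pid=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        date >> jstack.out                     # timestamp each dump
        jstack "$pid" >> jstack.out 2>&1       # thread dump; jstack ships with the JDK
        top -H -b -n 1 -p "$pid" >> top.out    # -H threads, -b batch mode, -n 1 one snapshot
        i=$((i + 1))
        sleep "$interval"
    done
}

# tid_to_nid <decimal-thread-id>
# top -H lists thread IDs in decimal; JDK 7-era thread dumps show the same
# ID in hexadecimal as nid=0x<hex>, so convert before searching the dump.
tid_to_nid() {
    printf 'nid=0x%x\n' "$1"
}
```

With these definitions, sample_threads 12345 5 60 (12345 being a placeholder pid) would sample for about 5 minutes. A busy thread ID taken from top.out can then be looked up in the dump with, for example, grep "$(tid_to_nid 27825)" jstack.out.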

If DSE is running as a service, use the pid from /var/run/dse.pid. If DSE was started from the command line, use 'ps -ef | grep java' to determine the pid. The recommended parameters are an interval of 5 seconds and a count of 60, which gives a total run time of about 5 minutes. The script generates two files, jstack.out and top.out, which you should send to us.
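
The pid lookup and the recommended parameters can be combined in a small wrapper, sketched below. The helper names find_dse_pid and total_seconds are hypothetical, and the pgrep fallback is an assumption: it picks the oldest process whose command line mentions java, which may not be DSE on a host that runs other Java programs.

```shell
#!/bin/sh
# Sketch under stated assumptions. The pid file path comes from the article;
# the pgrep fallback is a guess and may match other Java processes.

PIDFILE=${PIDFILE:-/var/run/dse.pid}

# find_dse_pid: print the pid from the service pid file if it is readable,
# otherwise fall back to the oldest process whose command line mentions java.
find_dse_pid() {
    if [ -r "$PIDFILE" ]; then
        cat "$PIDFILE"
    else
        pgrep -o -f java
    fi
}

# total_seconds <interval> <count>: approximate duration of a sampling run.
total_seconds() {
    echo $(( $1 * $2 ))
}
```

A run with the recommended settings might then look like ./multidump.sh "$(find_dse_pid)" 5 60, and total_seconds 5 60 gives the expected duration in seconds.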

If you have difficulties using the multidump script, you can use the multidump-alt script instead. The usage is the same, but it uses kill -3 instead of jstack to take the thread dumps. This will not kill the process; the JVM interprets signal 3 (SIGQUIT) as a request to print a thread dump to stdout. When DSE is run as a service, stdout is redirected to /var/log/cassandra/output.log, so send us that file as well as top.out. If DSE is run as a standalone process, find out where its output is redirected and capture the thread dumps from there.
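
The kill -3 approach can be sketched in the same way. Again, this is not the attached multidump-alt script; the function name sample_threads_sigquit is illustrative, and a procps-style top is assumed.

```shell
#!/bin/sh
# Illustrative sketch only; the attached multidump-alt script may differ.

# sample_threads_sigquit <pid> <interval-seconds> <count>
# Sends SIGQUIT (signal 3) so the JVM writes a thread dump to its own stdout,
# and records per-thread CPU usage in top.out.
sample_threads_sigquit() {
    pid=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # Only send this to a JVM: for most other processes the default
        # action of SIGQUIT is to terminate the process.
        kill -3 "$pid"
        top -H -b -n 1 -p "$pid" >> top.out
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Because the dumps go to the JVM's stdout rather than to a file written by the script, they have to be collected afterwards from wherever that output is redirected, as described above.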


Comments

  • Philip Southam

    To save you some time: if you're using a modern version of Java, you probably won't have the jstack utility (it appears to be deprecated: http://docs.oracle.com/javase/7/docs/technotes/tools/share/jstack.html) and will have to use the multidump-alt.sh script.

  • J.B. Langston

    jstack is only included in the JDK, not the JRE. It is still available as of JDK 7, which is the version that Cassandra currently supports.

  • Philip Southam

    I stand corrected, thanks J.B.

  • José Martínez Poblete

    Attached is a version of the script that merges the two original scripts.

    Usage: ./multidump.sh -i <interval> -c <count> -pid <PID> -pgm [jstack|kill]

           Default interval: 5 secs
           Default count   : 60
           Default PID     : DSE java PID
           Default PGM     : jstack

  • José Martínez Poblete

    I have changed the script so that it collects the information we need.

    Attached is a newer version of the script. It needs the following in order to run.

    In /etc/passwd, make sure the cassandra user's login shell is /bin/bash rather than /bin/false:

    cassandra:x:106:111:Cassandra database,,,:/var/lib/cassandra:/bin/bash
    

    The script needs Oracle's jstack; it uses the path returned by this search:

    find /usr -name jstack -type f -perm /a=x | egrep -v openjdk | tail -1
    

    Become root, switch to the cassandra user, and then execute:

    ./multidump.sh
    

    A successful execution looks like this:

    automaton@ip-172-31-2-39:~$ ./multidump.sh
    Begin processing...
    PID   %CPU  Process
    ===== ===== =======
    27825 8.15 MemoryMeter:1
     6930 7.00 CompactionExecutor:25
     6929 6.50 FlushWriter:13
    27063 5.90 Gang worker#0 (Parallel GC Threads)
    27064 5.90 Gang worker#1 (Parallel GC Threads)
    27828 5.50 COMMIT-LOG-WRITER
    
    Collecting files...
    jstack.out
    top.out
    topThreads.out
    End processing, please collect /tmp/multidump.tgz
    automaton@ip-172-31-2-39:~$
    

     

     

  • Michael Keeney

    We've also run into several instances of this error:

    Unable to open socket file: target process not responding or HotSpot VM not loaded
    The -F option can be used when the target process is not responding

    The following fixed it (we also edited multidump.sh to use this command):

    sudo -u cassandra jstack -J-d64 -l <PID> >> jstack.out

    Edited by Michael Keeney