Tracing slow Java I/O in Solaris 10

We have a Java application that is significantly slower on a Solaris 10 server than on a Windows PC.

We have profiled the application (-Xprof) and observed that the native method UnixFileSystem.getBooleanAttributes0 accounts for about 40% of the CPU time.

How can we continue our investigation to identify the cause of the slow behaviour?

Update:

We have run dtrace and observed that the CPU time is 0.489 s, while the time command reports:


real 0m40.759s
user 0m47.669s
sys 0m2.287s

It seems the time is not spent on the CPU but somewhere else. How can we proceed with the investigation given these dtrace results? Are there other dtrace options we can use to show more information?

Thank you very much.


# /dtrace-toolkit/DTraceToolkit-0.99/procsystime -aTp 5695
Tracing... Hit Ctrl-C to end...
dtrace: 1032 dynamic variable drops with non-empty dirty list
^C

Elapsed Times for PID 5695,

         SYSCALL          TIME (ns)
           gtime               7799
       sysconfig               8018
        lwp_kill              10105
     getsockname              10215
      systeminfo              10984
       getrlimit              11093
        lwp_self              20210
        schedctl              26361
    lwp_continue              27349
     priocntlsys              33390
          unlink              53050
     lwp_sigmask             136303
      setcontext             143447
   so_socketpair             202427
      lwp_create             223626
             brk             227141
       so_socket             263276
         memcntl             461091
           ioctl            1094955
lwp_cond_broadcast            1946728
lwp_mutex_wakeup            2579492
 lwp_cond_signal            2903277
            mmap            3082973
          access            4446250
        mprotect            4534229
      getdents64            5713309
          mmap64            6272480
            fsat            6925239
            stat            6926115
            open            8324983
           fcntl            8840868
           close           11264404
          llseek           11678791
          munmap           11867401
         fstat64           17852129
        lwp_park           22098570
          open64           23549922
          stat64           39673214
            read           41876097
     resolvepath           81905859
           yield          175946072
           write          225507878
lwp_mutex_timedlock          887174392
         pollsys        20393318412
   lwp_cond_wait       345099369278
          TOTAL:       367108549202

CPU Times for PID 5695,

         SYSCALL          TIME (ns)
           gtime               1208
       sysconfig               1647
        lwp_self               2198
      systeminfo               2856
        lwp_kill               3186
     getsockname               3625
       getrlimit               4613
    lwp_continue              13839
        schedctl              14059
     priocntlsys              14498
     lwp_sigmask              20541
          unlink              45801
      setcontext              96874
             brk             121804
        lwp_park             158601
   so_socketpair             195178
      lwp_create             209017
       so_socket             241199
         memcntl             365972
           ioctl             606397
lwp_cond_broadcast            1367556
lwp_mutex_wakeup            1561103
 lwp_cond_signal            1803831
          access            1885436
            mmap            2655710
          llseek            3266410
            open            3614795
            fsat            4043369
      getdents64            4253373
        mprotect            4345864
           close            4547395
           fcntl            4871095
            stat            5183012
          mmap64            5308789
         fstat64            8141116
          munmap           10774529
         pollsys           11928811
lwp_mutex_timedlock           19212980
          open64           19443958
   lwp_cond_wait           23144761
            read           28043647
          stat64           31710269
     resolvepath           70641662
           yield           77731698
           write          137678654
          TOTAL:          489282936

Syscall Counts for PID 5695,

         SYSCALL              COUNT
       getrlimit                  1
     getsockname                  1
           gtime                  1
        lwp_kill                  1
           rexit                  1
   so_socketpair                  1
       sysconfig                  1
      systeminfo                  1
          unlink                  1
    lwp_continue                  2
      lwp_create                  2
        schedctl                  2
        lwp_self                  3
     priocntlsys                  3
       so_socket                  3
        lwp_exit                  4
      setcontext                  7
        lwp_park                 11
         memcntl                 14
             brk                 16
     lwp_sigmask                 19
        mprotect                 25
            mmap                 56
          access                 67
lwp_cond_broadcast                 86
           ioctl                 87
            open                100
            fsat                129
          mmap64                133
lwp_mutex_wakeup                148
          munmap                153
 lwp_cond_signal                162
      getdents64                224
            stat                283
         pollsys                348
   lwp_cond_wait                537
lwp_mutex_timedlock                558
          open64                590
           fcntl                625
           close                777
          stat64               1146
          llseek               1368
            read               1404
         fstat64               1559
     resolvepath               1644
           yield              10049
           write              13416
          TOTAL:              35769

Update 2:

Here is the source of the dtraced Java application. It just loads several thousand classes from a single .jar file, reproducing the same behaviour observed in the larger application.

import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.Properties;
public class TestCL {
    public static void main(String[] args) {
        Properties properties = new Properties();
        InputStream in = TestCL.class.getClassLoader().getResourceAsStream("TestCL.properties");
        if (in == null) {
            System.err.println("TestCL.properties not found");
            System.exit(-1);
        }
        try {
            properties.load(in);
        } catch (IOException e) {
            System.err.println("TestCL.properties loading error");
            e.printStackTrace();
            System.exit(-2);
        } finally {
            try {
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        Enumeration e = properties.keys();
        while (e.hasMoreElements()) {
            String className = (String) e.nextElement();
            try {
                Class.forName(className);
                System.out.println("OK " + className);
            } catch (Throwable t) {
               System.err.println("KO " + className + " " + t.getClass().getName() + " " + t.getMessage());
            }
        }
    }
}

Update 3:

All classes are packed in a single .jar file. We have also run the test with all classes unpacked, and there is no significant improvement in performance.

java -cp library.jar:. TestCL


If you're using Solaris, you are lucky to be able to use dtrace. It will allow you to profile down to the kernel level and get more clues about how the JVM is interacting with the kernel.

More information here

http://www.devx.com/Java/Article/33943

If you want to find out what your JVM is doing, run dtrace with the JVM probes.

http://java.sun.com/javase/6/docs/technotes/guides/vm/dtrace.html

http://www.solarisinternals.com/wiki/index.php/DTrace_Topics_Java#DTrace_Topics:_Java

This will give you much more meaningful output in relation to your program. Have a look at the 'Method Times' section.
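
For example, here is a minimal sketch of such a 'method times' script using the hotspot provider. It assumes a Java 6 JVM started with -XX:+ExtendedDTraceProbes; the file name methodtimes.d is just illustrative.

#!/usr/sbin/dtrace -s
/* methodtimes.d - sum wall-clock time per Java method via the hotspot provider */
#pragma D option quiet

hotspot$target:::method-entry
{
        self->depth++;
        self->ts[self->depth] = timestamp;
}

hotspot$target:::method-return
/self->ts[self->depth]/
{
        /* class and method names are passed as pointer + length, not NUL-terminated */
        this->cname = (char *)copyin(arg1, arg2 + 1);
        this->cname[arg2] = '\0';
        this->mname = (char *)copyin(arg3, arg4 + 1);
        this->mname[arg4] = '\0';
        @times[stringof(this->cname), stringof(this->mname)] =
            sum(timestamp - self->ts[self->depth]);
        self->ts[self->depth] = 0;
        self->depth--;
}

dtrace:::END
{
        printa("%-50s %-30s %@d ns\n", @times);
}

Run it with dtrace -s methodtimes.d -p followed by the java PID; if you see dynamic variable drops like in your procsystime run, raise the buffer with -x dynvarsize=64m.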

http://www.princeton.edu/~unix/Solaris/troubleshoot/diskio.html is a great guide for finding I/O bottlenecks.
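
To see which files the JVM is actually opening (and how often), a one-liner along these lines may help; 5695 is just the PID from your procsystime run, substitute the current one:

dtrace -n 'syscall::open*:entry /pid == $target/ { @[copyinstr(arg0)] = count(); }' -p 5695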

This may also help: http://prefetch.net/blog/index.php/2007/03/03/viewing-busy-code-paths-with-dtrace/

There are no hard and fast rules when tracking down problems like this, but information is the key! If you follow these guides, you are well on the way to becoming a ninja system engineer.

You may also be able to use JProfiler: http://www.ej-technologies.com/products/jprofiler/overview.html

It's not open source; however, I've had great success tracking down Java performance problems with it.

You should also run your Java VM and your application with full debug logging. If you have access to the kernel logs, check them for unusual events.
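
For instance, given your test program, something like the following adds class-loading and GC logging (the flags are standard HotSpot options; adjust to taste):

java -verbose:class -verbose:gc -cp library.jar:. TestCL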

Good luck.

Does anyone else on Server Fault have ninja tips for tracking down problems like this? I've got my own way of debugging, but it would be interesting to know what others think.

-----------------UPDATE--------------

I've looked at the trace again, and it seems you are doing a lot of resolvepath calls.

This needs to be fixed. Are you using very long paths, or running from a directory that is symlinked? Try removing symlinks, or using hard links, and see if you get better results.

If the app does lots of directory traversal, try using absolute paths rather than relative paths.
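
A quick way to see which paths are being resolved is a one-liner like this (5695 is the PID from your earlier trace; substitute the current one):

dtrace -n 'syscall::resolvepath:entry /pid == $target/ { @[copyinstr(arg0)] = count(); }' -p 5695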

Again, just an idea I had while eating cheesecake. Run dtrace with the Java probes; that should drill down enough to see which Java classes/methods are taking the most time.

Good luck (again). Don't give up now; I think we are very close to a solution.


On your Solaris system, just run sys_diag -G -I1 -l -v and it will aggregate all the performance information (CPU/memory/network/disk I/O/DTrace/kernel...) and analyze the output, producing a single colour-coded .html report of findings/bottlenecks that characterizes the workload by subsystem. This will show any and all bottlenecks, as well as any locking that might be occurring (lockstat, ...). The latest version is v8.1. HTH.


Just to post the final result.

The server (a Sun T-series) is simply that slow for a single-threaded Java application. Go figure.

Thank you all for the answers and support.


Your dtrace output shows that your application spends most of its time on writes. You should check (using dtrace, of course :-)) where these writes go. If they go to a Solaris filesystem, then you should check whether the filesystem is a bottleneck.
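
For example, a sketch like this sums the bytes written per file (it assumes your Solaris 10 release has the fds[] built-in array; if not, aggregate by the file descriptor arg0 instead; 5695 is the PID from the question's trace):

dtrace -n 'syscall::write:entry /pid == $target/ { @[fds[arg0].fi_pathname] = sum(arg2); }' -p 5695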