Monday, 26 December 2011

Java Sequential IO Performance

Many applications record a series of events to file-based storage for later use.  This can be anything from logging and auditing, through to keeping a transaction redo log in an event-sourced design or its close relative, CQRS.

Java has a number of means by which a file can be sequentially written to, or read back again.  This article explores some of these mechanisms to understand their performance characteristics.  For the scope of this article I will be using pre-allocated files because I want to focus on performance.  Constantly extending a file imposes a significant performance overhead and adds jitter to an application, resulting in highly variable latency.

"Why does a pre-allocated file perform better?", I hear you ask.  Well, on disk a file is made up of a series of blocks/pages containing the data.  Firstly, it is important that these blocks are contiguous to provide fast sequential access.  Secondly, meta-data must be allocated to describe this file on disk and saved within the file-system.  A typical large file will have a number of "indirect" blocks allocated, as part of this meta-data, to describe the chain of data-blocks containing the file contents.  I'll leave it as an exercise for the reader, or maybe a later article, to explore the performance impact of not pre-allocating the data files.  If you have used a database you may have noticed that it pre-allocates the files it will require.

The Test

I want to experiment with 2 file sizes.  One that is sufficiently large to test sequential access, but can easily fit in the file-system cache, and another that is much larger so that the cache subsystem is forced to retire pages so that new ones can be loaded.  For these two cases I'll use 400MB and 8GB respectively.  I'll also loop over the files a number of times to show the pre and post warm-up characteristics.
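As a quick check on the sizes, the listing's FILE_SIZE constant is PAGE_SIZE multiplied by a page count.  The sketch below works through that arithmetic.  The class and method names are hypothetical, and the page count used for the smaller run is an assumption, since the listing only shows the 8GB setting.

```java
final class FileSizeSketch
{
    static final int PAGE_SIZE = 1024 * 4;

    // The long parameter forces 64-bit multiplication. With int operands
    // the 8GB product would overflow a 32-bit int.
    static long fileSize(final long pages)
    {
        return PAGE_SIZE * pages;
    }

    public static void main(final String[] args)
    {
        // 2000L * 1000L pages, matching FILE_SIZE in the listing (the 8GB case).
        System.out.println("large=" + fileSize(2000L * 1000L));

        // Assumption: 100L * 1000L pages for the smaller (400MB) case.
        System.out.println("small=" + fileSize(100L * 1000L));
    }
}
```

Both sizes are whole multiples of the page size, so every write in the tests is a full page.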

I'll test 4 means of writing and reading back files sequentially:
  1. RandomAccessFile using a vanilla byte[] of page size.
  2. Buffered FileInputStream and FileOutputStream.
  3. NIO FileChannel with ByteBuffer of page size.
  4. Memory mapping a file using NIO and direct MappedByteBuffer.
The tests are run on a 2.0GHz Sandy Bridge CPU with 8GB RAM, an Intel 320 SSD on Fedora Core 15 64-bit Linux with an ext4 file system, and Oracle JDK 1.6.0_30.

The Code

import java.io.*;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

import static java.lang.Integer.MAX_VALUE;
import static java.lang.System.out;
import static java.nio.channels.FileChannel.MapMode.READ_ONLY;
import static java.nio.channels.FileChannel.MapMode.READ_WRITE;

public final class TestSequentialIoPerf
{
    public static final int PAGE_SIZE = 1024 * 4;
    public static final long FILE_SIZE = PAGE_SIZE * 2000L * 1000L;
    public static final String FILE_NAME = "test.dat";
    public static final byte[] BLANK_PAGE = new byte[PAGE_SIZE];

    public static void main(final String[] arg) throws Exception
    {
        preallocateTestFile(FILE_NAME);

        for (final PerfTestCase testCase : testCases)
        {
            for (int i = 0; i < 5; i++)
            {
                System.gc();
                long writeDurationMs = testCase.test(PerfTestCase.Type.WRITE,
                                                     FILE_NAME);

                System.gc();
                long readDurationMs = testCase.test(PerfTestCase.Type.READ,
                                                    FILE_NAME);

                long bytesReadPerSec = (FILE_SIZE * 1000L) / readDurationMs;
                long bytesWrittenPerSec = (FILE_SIZE * 1000L) / writeDurationMs;

                out.format("%s\twrite=%,d\tread=%,d bytes/sec\n",
                           testCase.getName(),
                           bytesWrittenPerSec, bytesReadPerSec);
            }
        }

        deleteFile(FILE_NAME);
    }

    private static void preallocateTestFile(final String fileName)
        throws Exception
    {
        RandomAccessFile file = new RandomAccessFile(fileName, "rw");

        for (long i = 0; i < FILE_SIZE; i += PAGE_SIZE)
        {
            file.write(BLANK_PAGE, 0, PAGE_SIZE);
        }

        file.close();
    }

    private static void deleteFile(final String testFileName) throws Exception
    {
        File file = new File(testFileName);
        if (!file.delete())
        {
            out.println("Failed to delete test file=" + testFileName);
            out.println("Windows does not allow mapped files to be deleted.");
        }
    }

    public abstract static class PerfTestCase
    {
        public enum Type { READ, WRITE }

        private final String name;
        private int checkSum;

        public PerfTestCase(final String name)
        {
            this.name = name;
        }

        public String getName()
        {
            return name;
        }

        public long test(final Type type, final String fileName)
        {
            long start = System.currentTimeMillis();

            try
            {
                switch (type)
                {
                    case WRITE:
                    {
                        checkSum = testWrite(fileName);
                        break;
                    }

                    case READ:
                    {
                        final int checkSum = testRead(fileName);
                        if (checkSum != this.checkSum)
                        {
                            final String msg = getName() +
                                " expected=" + this.checkSum +
                                " got=" + checkSum;
                            throw new IllegalStateException(msg);
                        }
                        break;
                    }
                }
            }
            catch (Exception ex)
            {
                ex.printStackTrace();
            }

            return System.currentTimeMillis() - start;
        }

        public abstract int testWrite(final String fileName) throws Exception;
        public abstract int testRead(final String fileName) throws Exception;
    }

    private static PerfTestCase[] testCases =
    {
        new PerfTestCase("RandomAccessFile")
        {
            public int testWrite(final String fileName) throws Exception
            {
                RandomAccessFile file = new RandomAccessFile(fileName, "rw");
                final byte[] buffer = new byte[PAGE_SIZE];
                int pos = 0;
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++)
                {
                    byte b = (byte)i;
                    checkSum += b;

                    buffer[pos++] = b;
                    if (PAGE_SIZE == pos)
                    {
                        file.write(buffer, 0, PAGE_SIZE);
                        pos = 0;
                    }
                }

                file.close();

                return checkSum;
            }

            public int testRead(final String fileName) throws Exception
            {
                RandomAccessFile file = new RandomAccessFile(fileName, "r");
                final byte[] buffer = new byte[PAGE_SIZE];
                int checkSum = 0;
                int bytesRead;

                while (-1 != (bytesRead = file.read(buffer)))
                {
                    for (int i = 0; i < bytesRead; i++)
                    {
                        checkSum += buffer[i];
                    }
                }

                file.close();

                return checkSum;
            }
        },

        new PerfTestCase("BufferedStreamFile")
        {
            public int testWrite(final String fileName) throws Exception
            {
                int checkSum = 0;
                OutputStream out = 
                    new BufferedOutputStream(new FileOutputStream(fileName));

                for (long i = 0; i < FILE_SIZE; i++)
                {
                    byte b = (byte)i;
                    checkSum += b;
                    out.write(b);
                }

                out.close();

                return checkSum;
            }

            public int testRead(final String fileName) throws Exception
            {
                int checkSum = 0;
                InputStream in = 
                    new BufferedInputStream(new FileInputStream(fileName));

                int b;
                while (-1 != (b = in.read()))
                {
                    checkSum += (byte)b;
                }

                in.close();

                return checkSum;
            }
        },


        new PerfTestCase("BufferedChannelFile")
        {
            public int testWrite(final String fileName) throws Exception
            {
                FileChannel channel = 
                    new RandomAccessFile(fileName, "rw").getChannel();
                ByteBuffer buffer = ByteBuffer.allocate(PAGE_SIZE);
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++)
                {
                    byte b = (byte)i;
                    checkSum += b;
                    buffer.put(b);

                    if (!buffer.hasRemaining())
                    {
                        buffer.flip();
                        channel.write(buffer);
                        buffer.clear();
                    }
                }

                channel.close();

                return checkSum;
            }

            public int testRead(final String fileName) throws Exception
            {
                FileChannel channel = 
                    new RandomAccessFile(fileName, "rw").getChannel();
                ByteBuffer buffer = ByteBuffer.allocate(PAGE_SIZE);
                int checkSum = 0;

                while (-1 != (channel.read(buffer)))
                {
                    buffer.flip();

                    while (buffer.hasRemaining())
                    {
                        checkSum += buffer.get();
                    }

                    buffer.clear();
                }

                channel.close();

                return checkSum;
            }
        },

        new PerfTestCase("MemoryMappedFile")
        {
            public int testWrite(final String fileName) throws Exception
            {
                FileChannel channel = 
                    new RandomAccessFile(fileName, "rw").getChannel();
                MappedByteBuffer buffer = 
                    channel.map(READ_WRITE, 0,
                                Math.min(channel.size(), MAX_VALUE));
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++)
                {
                    if (!buffer.hasRemaining())
                    {
                        buffer = 
                            channel.map(READ_WRITE, i,
                                        Math.min(channel.size() - i , MAX_VALUE));
                    }

                    byte b = (byte)i;
                    checkSum += b;
                    buffer.put(b);
                }

                channel.close();

                return checkSum;
            }

            public int testRead(final String fileName) throws Exception
            {
                FileChannel channel = 
                    new RandomAccessFile(fileName, "rw").getChannel();
                MappedByteBuffer buffer = 
                    channel.map(READ_ONLY, 0,
                                Math.min(channel.size(), MAX_VALUE));
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++)
                {
                    if (!buffer.hasRemaining())
                    {
                        buffer = 
                            channel.map(READ_ONLY, i,
                                        Math.min(channel.size() - i , MAX_VALUE));
                    }

                    checkSum += buffer.get();
                }

                channel.close();

                return checkSum;
            }
        },
    };
}

Results

400MB file
===========
RandomAccessFile    write=379,610,750   read=1,452,482,269 bytes/sec
RandomAccessFile    write=294,041,636   read=1,494,890,510 bytes/sec
RandomAccessFile    write=250,980,392   read=1,422,222,222 bytes/sec
RandomAccessFile    write=250,366,748   read=1,388,474,576 bytes/sec
RandomAccessFile    write=260,394,151   read=1,422,222,222 bytes/sec

BufferedStreamFile  write=98,178,331    read=286,433,566 bytes/sec
BufferedStreamFile  write=100,244,738   read=288,857,545 bytes/sec
BufferedStreamFile  write=82,948,562    read=154,100,827 bytes/sec
BufferedStreamFile  write=108,503,311   read=153,869,271 bytes/sec
BufferedStreamFile  write=113,055,478   read=152,608,047 bytes/sec

BufferedChannelFile write=228,443,948   read=356,173,913 bytes/sec
BufferedChannelFile write=265,629,053   read=374,063,926 bytes/sec
BufferedChannelFile write=223,825,136   read=1,539,849,624 bytes/sec
BufferedChannelFile write=232,992,036   read=1,539,849,624 bytes/sec
BufferedChannelFile write=212,779,220   read=1,534,082,397 bytes/sec

MemoryMappedFile    write=300,955,180   read=305,899,925 bytes/sec
MemoryMappedFile    write=313,149,847   read=310,538,286 bytes/sec
MemoryMappedFile    write=326,374,501   read=303,857,566 bytes/sec
MemoryMappedFile    write=327,680,000   read=304,535,315 bytes/sec
MemoryMappedFile    write=326,895,450   read=303,632,320 bytes/sec

8GB File
============
RandomAccessFile    write=167,402,321   read=251,922,012 bytes/sec
RandomAccessFile    write=193,934,802   read=257,052,307 bytes/sec
RandomAccessFile    write=192,948,159   read=248,460,768 bytes/sec
RandomAccessFile    write=191,814,180   read=245,225,408 bytes/sec
RandomAccessFile    write=190,635,762   read=275,315,073 bytes/sec

BufferedStreamFile  write=154,823,102   read=248,355,313 bytes/sec
BufferedStreamFile  write=152,083,913   read=253,418,301 bytes/sec
BufferedStreamFile  write=133,099,369   read=146,056,197 bytes/sec
BufferedStreamFile  write=131,065,708   read=146,217,827 bytes/sec
BufferedStreamFile  write=132,694,052   read=148,116,004 bytes/sec

BufferedChannelFile write=186,703,740   read=215,075,218 bytes/sec
BufferedChannelFile write=190,591,410   read=211,030,680 bytes/sec
BufferedChannelFile write=187,220,038   read=223,087,606 bytes/sec
BufferedChannelFile write=191,585,397   read=221,297,747 bytes/sec
BufferedChannelFile write=192,653,214   read=211,789,038 bytes/sec

MemoryMappedFile    write=123,023,322   read=231,530,156 bytes/sec
MemoryMappedFile    write=121,961,023   read=230,403,600 bytes/sec
MemoryMappedFile    write=123,317,778   read=229,899,250 bytes/sec
MemoryMappedFile    write=121,472,738   read=231,739,745 bytes/sec
MemoryMappedFile    write=120,362,615   read=231,190,382 bytes/sec

Analysis

For years I was a big fan of using RandomAccessFile directly because of the control it gives and the predictable execution.  I never found using buffered streams to be useful from a performance perspective and this still seems to be the case.

In more recent testing I've found that NIO FileChannel and ByteBuffer do much better.  With Java 7 the flexibility of this programming approach has been improved for random access with SeekableByteChannel.
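As a rough illustration of the random access that SeekableByteChannel offers in Java 7, the sketch below positions a channel at an arbitrary page and sums its bytes.  The class name, method name, and temporary file are hypothetical and are not part of the benchmark above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class SeekableChannelSketch
{
    static final int PAGE_SIZE = 1024 * 4;

    // Reads the page at pageIndex and returns the sum of its bytes.
    static int checkSumOfPage(final Path path, final long pageIndex)
        throws IOException
    {
        final ByteBuffer buffer = ByteBuffer.allocate(PAGE_SIZE);

        try (SeekableByteChannel channel =
                 Files.newByteChannel(path, StandardOpenOption.READ))
        {
            // Jump straight to the start of the requested page.
            channel.position(pageIndex * PAGE_SIZE);

            // A single read may return fewer bytes than requested, so keep
            // reading until the buffer is full or end-of-file is reached.
            while (buffer.hasRemaining() && -1 != channel.read(buffer))
            {
            }
        }

        buffer.flip();
        int checkSum = 0;
        while (buffer.hasRemaining())
        {
            checkSum += buffer.get();
        }

        return checkSum;
    }

    public static void main(final String[] args) throws IOException
    {
        final Path path = Files.createTempFile("seekable", ".dat");
        try
        {
            // Three pages filled with the same byte pattern the tests use.
            final byte[] data = new byte[PAGE_SIZE * 3];
            for (int i = 0; i < data.length; i++)
            {
                data[i] = (byte)i;
            }
            Files.write(path, data);

            System.out.println("page 1 checkSum=" + checkSumOfPage(path, 1));
        }
        finally
        {
            Files.deleteIfExists(path);
        }
    }
}
```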

It seems that for reading, RandomAccessFile and NIO do very well, with memory-mapped files winning for writes in some cases.

I've seen these results vary greatly depending on platform.  File system, OS, storage devices, and available memory all have a significant impact.  In a few cases I've seen memory-mapped files perform significantly better than the others but this needs to be tested on your platform because your mileage may vary...

A special note should be made for the use of large memory-mapped files when pushing for maximum throughput.  I've often found the OS can become unresponsive due to the pressure put on the virtual memory sub-system.

Conclusion

There is a significant difference in performance for the different means of doing sequential file IO from Java.  Not all methods are even remotely equal.  For most IO I've found the use of ByteBuffers and Channels to be the best optimised parts of the IO libraries.  If buffered streams are your IO libraries of choice, then it is worth branching out and getting familiar with the implementations of Channel and Buffer, or even falling back and using the good old RandomAccessFile.

30 comments:

  1. Interesting comparison, particularly because NIO seems to have a bad rep in some places (which I can't really judge).

    Two questions: a) is there a reason the ByteBuffer for case 3 wasn't allocated direct? does it make a performance difference? and b) do you have an explanation for the fact that BufferedChannelFile on the 400MB file performs much worse on the first two runs than on the last three, whereas RandomAccessFile doesn't show this behavior?

  2. Ingo,

    a) I've found very little difference between direct and normal ByteBuffers when writing 4K buffers byte at a time.

    b) I'd guess it is an optimisation issue. Probably something related to OSR like I discussed in my last post. The RandomAccessFile example is a very simple and thin layer over file access.

  3. Quick question: it seems seldom if ever worth using BufferedStreams over regular streams, unless you actually do single-byte reads/writes (which is unlikely if you actually care about performance)? Most code that needs buffering implements it at higher level anyway.

    So wouldn't it make more sense to just measure use with plain FileInputStream/FileOutputStream -- may not make a big difference in numbers, but still.

  4. To the last commenter, the benchmark does single-byte writes to the buffered stream:

    "byte b = (byte)i;
    checkSum += b;
    out.write(b);"

    I agree that it would be interesting to see the results of buffering to a byte[] and then calling FileOutputStream.write(byte b[]).

    Best,
    Ismael

  5. I wonder what would be the results on a Solaris system. Unfortunately I do not have access to one at the moment.

  6. > I'd guess it is an optimisation issue. Probably something related to OSR like I discussed in my last post. The RandomAccessFile example is very simple and thin layer over file access.

    You suppose overhead of not-inlined function call can make accountable influence on IO performance? Seems surprising to me

    By the way -- did you try different buffer sizes, other than 4K? My experiments show that the difference between NIO/IO tends to decrease, and becomes ~10% only with buffer size about 128-512K (for my hardware, sure).

  7. > I agree that it would be interesting to see the results of buffering to a byte[] and then calling FileOutputStream.write(byte b[]).

    If I use FileInputStream/FileOutputStream directly (no Buffered*Stream) with write/read(byte[PAGE_SIZE]) then the read is significantly better and almost RandomAccessFile levels. However the write only shows marginal improvement.
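
A minimal sketch of the variant described in this comment, assuming the same page-sized byte[] pattern as the RandomAccessFile test. The class and method names are hypothetical, the code actually used for this comment was not posted, and try-with-resources assumes Java 7 or later. Note that new FileOutputStream(fileName) truncates an existing file, so this variant does not write into the pre-allocated file.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class PlainStreamSketch
{
    static final int PAGE_SIZE = 1024 * 4;

    // Fills a page-sized byte[] and hands whole pages to an unbuffered stream.
    static int write(final String fileName, final long fileSize)
        throws IOException
    {
        final byte[] buffer = new byte[PAGE_SIZE];
        int pos = 0;
        int checkSum = 0;

        try (OutputStream out = new FileOutputStream(fileName))
        {
            for (long i = 0; i < fileSize; i++)
            {
                byte b = (byte)i;
                checkSum += b;

                buffer[pos++] = b;
                if (PAGE_SIZE == pos)
                {
                    out.write(buffer, 0, PAGE_SIZE);
                    pos = 0;
                }
            }

            // Write any trailing partial page.
            if (pos > 0)
            {
                out.write(buffer, 0, pos);
            }
        }

        return checkSum;
    }

    // Reads up to a page at a time and sums the bytes actually returned.
    static int read(final String fileName) throws IOException
    {
        final byte[] buffer = new byte[PAGE_SIZE];
        int checkSum = 0;
        int bytesRead;

        try (InputStream in = new FileInputStream(fileName))
        {
            while (-1 != (bytesRead = in.read(buffer)))
            {
                for (int i = 0; i < bytesRead; i++)
                {
                    checkSum += buffer[i];
                }
            }
        }

        return checkSum;
    }
}
```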

    > You suppose overhead of not-inlined function call can make accountable influence on IO performance?

    I think it is likely to be a case of what intrinsics get applied during JIT'ing and when.

    > By the way -- did you try different buffer sizes, other than 4K? My experiments show that the difference between NIO/IO tends to decrease, and becomes ~10% only with buffer size about 128-512K (for my hardware, sure).

    With a larger buffer, the chunking of this buffer down to pages/blocks of the storage will be happening outside of Java, i.e. in the kernel for the file system. So the amount of work on the Java side is relatively less with a larger buffer.

    "If I use FileInputStream/FileOutputStream directly (no Buffered*Stream) with write/read(byte[PAGE_SIZE]) then the read is significantly better and almost RandomAccessFile levels. However the write only shows marginal improvement."

    Thanks. Good to know.

    Best,
    Ismael

  10. Could you compare with the Path.copyTo() from java 7?
    At least in the unix case, it uses native code. It could be interesting to see the difference.

  11. id,

    Path.copyTo() does not exist as a method in my JDK 7. Can you post a snippet of code for what you expect to see?

    Even if I guess at what you are comparing, can you expand on how this is similar to the other tests if it is simply copying one file to another? I just don't understand how it is like-for-like.

  12. Repeated the test on:
    - Java 1.6.0_26 (HotSpot 64-Bit Server build 20.1-b02)
    - Xeon 2GHz cpu
    - Debian Squeeze
    - 48G RAM
    - hardware RAID1 with rotating disks (can't figure out which)

    RandomAccessFile write=71,871,627 read=1,022,083,593 bytes/sec
    RandomAccessFile write=91,133,607 read=1,021,446,384 bytes/sec
    RandomAccessFile write=85,629,468 read=976,516,867 bytes/sec
    RandomAccessFile write=87,691,879 read=981,430,454 bytes/sec
    RandomAccessFile write=87,300,318 read=977,915,721 bytes/sec

    BufferedStreamFile write=58,101,351 read=235,788,504 bytes/sec
    BufferedStreamFile write=141,629,639 read=223,130,141 bytes/sec
    BufferedStreamFile write=124,271,844 read=131,937,510 bytes/sec
    BufferedStreamFile write=146,568,381 read=132,372,426 bytes/sec
    BufferedStreamFile write=144,410,950 read=131,340,986 bytes/sec

    BufferedChannelFile write=326,569,663 read=325,596,184 bytes/sec
    BufferedChannelFile write=341,960,260 read=327,391,895 bytes/sec
    BufferedChannelFile write=346,839,408 read=1,081,309,398 bytes/sec
    BufferedChannelFile write=349,041,329 read=1,080,026,367 bytes/sec
    BufferedChannelFile write=346,531,302 read=1,066,250,162 bytes/sec

    MemoryMappedFile write=280,202,490 read=233,636,596 bytes/sec
    MemoryMappedFile write=190,188,749 read=262,479,974 bytes/sec
    MemoryMappedFile write=180,190,484 read=254,884,878 bytes/sec
    MemoryMappedFile write=189,673,535 read=238,236,491 bytes/sec
    MemoryMappedFile write=164,501,295 read=262,740,947 bytes/sec

    The system was mostly idle. Dips, however, can be explained by other activity.

  13. Very interesting. Do you have a similar analysis for sockets? Is it better to use basic sockets, as they have always existed in java (I realize the logic beneath the API has changed greatly)? Is it better to use NIO or even third-party tools such as netty?

    Replies
    1. Falcon,

      I have done this for sockets but not in a while. Good reminder to stay current. Last time I checked, basic blocking sockets gave greater throughput and lower latency than NIO sockets. You obviously need a thread per socket so 64-bit is required for a large number of clients. I've found that frameworks like Netty and Mina add a significant overhead.

  14. Hi,

    The write part of BufferedChannelFile is flawed because channel.write(buffer) really writes nothing since the buffer is not flipped. So buffer.flip() is in order before calling channel.write().

    Replies
    1. Seven,

      You are absolutely right. Good spot, thanks. I'll re-run the tests and post the results. Initial runs suggest write performance is closer to RandomAccessFile.

  15. Your RandomAccessTest stays completely inside of its loop until it needs to write a byte []. Only then does it make a method call and leave the loop. The other methods all have 4096 times as many external method invocations in the inner loop.

    A fairer test would be to assemble the byte[] the same way with each method, then write that. Or, actually write a byte at a time to the RandomAccessFile.

    Hotspot isn't going to inline complex methods.

    Replies
    1. A lot of the benefits of RandomAccessFile are the patterns it encourages based on its API design.

      However I cannot agree with your conclusions. If you look at the generated asm code you can see that Hotspot has no trouble inlining the use of ByteBuffer and even applying intrinsics. The -server JIT compiler is very aggressive about inlining. By default Hotspot will not inline a method greater than 35 bytecodes. You can experiment by setting -XX:MaxInlineSize=35. Note this is bytecodes and not bytes. The "external method invocations", as you put it, are well below this threshold.

    2. I am curious about the generated code; I'll take a look at it. You don't agree that it would be fairer across methods to assemble the byte[] and call each IO type the same number of times?

    3. Ross,

      I believe the point of using ByteBuffers and Channels, or BufferedStreams, is so that you do not need to build up a buffer in advance. They are providing that feature by their design, otherwise why use them? If you build up a buffer in advance, I think you will find it slower because the buffer gets copied twice. The important cost is the number of times you ultimately call the IO sub-system out of your user process into the kernel.

    4. I agree that with a Buffer you could do _less_ building up in advance, but you are testing what is essentially the worst case -- performing IO a single byte at a time. If you want to compare apples to apples, you should be calling RandomAccessFile.write(int), to write a single byte at a time.

      But nobody ever does that ;) I think what you're detecting here is that the JVM is really, really good at optimizing operations over local byte arrays.

      I created some additional tests and used byte-array methods instead. The results are pretty much as I expected. This is on Windows 7 64 bit. Buffered streams win slightly over RandomAccessFile. BufferedChannel is about the same. Memory mapping the file doubles write performance and quadruples read performance. Here are the 400MB file results:

      RandomAccessFile write=375,435,380 read=627,258,805 bytes/sec
      RandomAccessFile write=332,197,891 read=646,056,782 bytes/sec
      RandomAccessFile write=308,201,655 read=651,192,368 bytes/sec
      RandomAccessFile write=307,969,924 read=648,101,265 bytes/sec
      RandomAccessFile write=307,738,542 read=678,145,695 bytes/sec
      BufferedStreamFile write=192,481,203 read=249,603,900 bytes/sec
      BufferedStreamFile write=181,640,798 read=256,641,604 bytes/sec
      BufferedStreamFile write=178,009,561 read=155,859,969 bytes/sec
      BufferedStreamFile write=169,116,432 read=155,682,250 bytes/sec
      BufferedStreamFile write=174,446,337 read=153,236,064 bytes/sec
      BufferedStreamFile2 write=344,201,680 read=787,692,307 bytes/sec
      BufferedStreamFile2 write=314,592,933 read=827,474,747 bytes/sec
      BufferedStreamFile2 write=328,468,323 read=795,339,805 bytes/sec
      BufferedStreamFile2 write=322,265,932 read=811,089,108 bytes/sec
      BufferedStreamFile2 write=288,247,712 read=819,200,000 bytes/sec
      BufferedChannelFile write=167,937,679 read=330,322,580 bytes/sec
      BufferedChannelFile write=172,245,584 read=349,190,110 bytes/sec
      BufferedChannelFile write=146,077,032 read=622,492,401 bytes/sec
      BufferedChannelFile write=145,557,924 read=624,390,243 bytes/sec
      BufferedChannelFile write=145,248,226 read=626,299,694 bytes/sec
      BufferedChannelFile2 write=310,303,030 read=656,410,256 bytes/sec
      BufferedChannelFile2 write=313,629,402 read=662,783,171 bytes/sec
      BufferedChannelFile2 write=310,538,286 read=670,376,432 bytes/sec
      BufferedChannelFile2 write=306,586,826 read=672,577,996 bytes/sec
      BufferedChannelFile2 write=304,988,830 read=675,907,590 bytes/sec
      MemoryMappedFile write=248,845,686 read=300,513,573 bytes/sec
      MemoryMappedFile write=232,199,546 read=296,382,054 bytes/sec
      MemoryMappedFile write=299,196,493 read=358,355,205 bytes/sec
      MemoryMappedFile write=298,107,714 read=347,413,061 bytes/sec
      MemoryMappedFile write=297,026,831 read=341,049,125 bytes/sec
      MemoryMappedFile2 write=479,625,292 read=1,237,462,235 bytes/sec
      MemoryMappedFile2 write=482,449,941 read=1,047,570,332 bytes/sec
      MemoryMappedFile2 write=564,965,517 read=1,244,984,802 bytes/sec
      MemoryMappedFile2 write=563,411,279 read=1,233,734,939 bytes/sec
      MemoryMappedFile2 write=561,095,890 read=1,241,212,121 bytes/sec

    5. Can you post your code to github, or somewhere similar, so others can see the approach and confirm the findings? It also needs to be tested with a much larger file to take caching out of the equation.

  16. To me, if you are looking for the best IO performance then using NIO and a memory-mapped file in Java is best, if your application can use it. Thanks for the analysis. I also had the same opinion of RandomAccessFile and I still use it :)

  17. Memory mapped files are really fast for read when they are hot (loaded into memory by previous I/O).

  18. FileInputStream / FileOutputStream and RandomAccessFile share the same native code, so the speed difference is in byte vs byte[] read / write at the Java source level

  19. Why didn't you use flushing to disk and clearing of disk buffer?
    BufferedOutputStream close() method includes flushing to disk and is thus slower. If you append to the other tests

    for RandomAccessFile
    RandomAccessFile raf;
    raf.getFD().sync();

    and for FileChannel
    channel.force(true);

    and for MappedByteBuffer
    buffer.force();

    you'll see that write performance in general is the same (~10% difference only)

    And it is the same situation with read performance if you clear the disk buffer before the read (sync; echo 3 > /proc/sys/vm/drop_caches)
    Only the MappedByteBuffer read test is 70% faster than the others in my environment.
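
The three calls named in this comment can be sketched as below, purely to show where each one would sit relative to the write. The class and method names are hypothetical and this is not the commenter's code. Note that flushing a stream only hands bytes to the OS, whereas sync() and force() ask the OS to push them to the storage device.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

final class ForceToDiskSketch
{
    // RandomAccessFile: sync the file descriptor after writing.
    static void writeAndSync(final String fileName, final byte[] page)
        throws IOException
    {
        try (RandomAccessFile file = new RandomAccessFile(fileName, "rw"))
        {
            file.write(page, 0, page.length);
            file.getFD().sync();
        }
    }

    // FileChannel: force(true) also forces file meta-data.
    static void writeAndForce(final String fileName, final byte[] page)
        throws IOException
    {
        try (RandomAccessFile file = new RandomAccessFile(fileName, "rw");
             FileChannel channel = file.getChannel())
        {
            final ByteBuffer buffer = ByteBuffer.wrap(page);
            while (buffer.hasRemaining())
            {
                channel.write(buffer);
            }
            channel.force(true);
        }
    }

    // MappedByteBuffer: force() writes the mapped region's changes back.
    static void mapAndForce(final String fileName, final byte[] page)
        throws IOException
    {
        try (RandomAccessFile file = new RandomAccessFile(fileName, "rw");
             FileChannel channel = file.getChannel())
        {
            final MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, page.length);
            buffer.put(page);
            buffer.force();
        }
    }
}
```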

    Replies
    1. Normal behaviour is not to force the pages to disk. It would only make sense for a database transaction log write at commit points. I don't understand why you think it is relevant for this test?

      The point is not to test actual disk performance but to test the APIs from Java to the OS filesystem.

  20. Repeated test on Win10 x64, i5-4460, 16GB RAM and MX250 500GB SSD:
    JVM: HotSpot 1.8.0_91
    400MB

    RandomAccessFile write=414 155 712 read=1 190 697 674 bytes/sec
    RandomAccessFile write=538 947 368 read=1 190 697 674 bytes/sec
    RandomAccessFile write=417 959 183 read=1 183 815 028 bytes/sec
    RandomAccessFile write=418 813 905 read=1 204 705 882 bytes/sec
    RandomAccessFile write=419 672 131 read=1 219 047 619 bytes/sec

    BufferedStreamFile write=47 694 457 read=351 587 982 bytes/sec
    BufferedStreamFile write=282 093 663 read=360 246 262 bytes/sec
    BufferedStreamFile write=257 934 508 read=312 910 618 bytes/sec
    BufferedStreamFile write=258 912 768 read=312 910 618 bytes/sec
    BufferedStreamFile write=256 480 901 read=312 671 755 bytes/sec

    BufferedChannelFile write=395 748 792 read=425 779 625 bytes/sec
    BufferedChannelFile write=260 559 796 read=424 016 563 bytes/sec
    BufferedChannelFile write=308 433 734 read=1 058 397 932 bytes/sec
    BufferedChannelFile write=308 433 734 read=426 222 684 bytes/sec
    BufferedChannelFile write=306 816 479 read=428 451 882 bytes/sec

    MemoryMappedFile write=846 280 991 read=918 385 650 bytes/sec
    MemoryMappedFile write=476 279 069 read=505 055 487 bytes/sec
    MemoryMappedFile write=498 296 836 read=930 909 090 bytes/sec
    MemoryMappedFile write=495 883 777 read=928 798 185 bytes/sec
    MemoryMappedFile write=495 883 777 read=922 522 522 bytes/sec

    8GB

    RandomAccessFile write=399 707 245 read=1 216 332 590 bytes/sec
    RandomAccessFile write=527 393 291 read=1 123 885 306 bytes/sec
    RandomAccessFile write=337 870 164 read=1 311 139 564 bytes/sec
    RandomAccessFile write=350 775 027 read=1 436 436 963 bytes/sec
    RandomAccessFile write=334 503 879 read=1 415 831 316 bytes/sec

    BufferedStreamFile write=40 273 735 read=354 463 242 bytes/sec
    BufferedStreamFile write=251 550 697 read=349 622 295 bytes/sec
    BufferedStreamFile write=270 033 292 read=291 364 347 bytes/sec
    BufferedStreamFile write=269 093 059 read=307 530 595 bytes/sec
    BufferedStreamFile write=272 684 907 read=298 901 740 bytes/sec

    BufferedChannelFile write=337 605 604 read=432 980 972 bytes/sec
    BufferedChannelFile write=243 172 643 read=419 801 168 bytes/sec
    BufferedChannelFile write=295 058 348 read=1 134 940 426 bytes/sec
    BufferedChannelFile write=304 557 959 read=452 022 292 bytes/sec
    BufferedChannelFile write=301 597 820 read=438 450 010 bytes/sec

    MemoryMappedFile write=484 160 756 read=577 959 644 bytes/sec
    MemoryMappedFile write=389 631 391 read=438 145 156 bytes/sec
    MemoryMappedFile write=391 194 307 read=505 273 545 bytes/sec
    MemoryMappedFile write=400 175 858 read=516 943 270 bytes/sec
    MemoryMappedFile write=399 531 798 read=515 966 492 bytes/sec

  21. I changed this test so that each test writes a byte array instead of a single byte, and also added FileOutputStream and BufferedOutputStream. Results are for an Amazon EC2 c4.2xl instance.

    File size: 409 MB

    Test, followed by write speed in bytes/second
    RandomAccessFile.byte[]-rw 1,870,319,634
    RandomAccessFile.byte[]-rw 1,878,899,082
    RandomAccessFile.byte[]-rw 1,878,899,082
    BufferedOutputStream-FileOutputStream 156,634,799
    BufferedOutputStream-FileOutputStream 127,126,008
    BufferedOutputStream-FileOutputStream 127,047,146
    FileOutputStream 127,244,485
    FileOutputStream 127,126,008
    FileOutputStream 127,204,968
    RandomAccessFile.ByteBuffer-rw 1,587,596,899
    RandomAccessFile.ByteBuffer-rw 1,836,771,300
    RandomAccessFile.ByteBuffer-rw 1,870,319,634
    RandomAccessFile.DirectByteBuffer-rw 1,853,393,665
    RandomAccessFile.DirectByteBuffer-rw 1,969,230,769
    RandomAccessFile.DirectByteBuffer-rw 1,950,476,190
    RandomAccessFile.MappedByteBuffer-rw 2,202,150,537
    RandomAccessFile.MappedByteBuffer-rw 2,226,086,956
    RandomAccessFile.MappedByteBuffer-rw 2,226,086,956
    WritableByteChannel.MappedByteBuffer-rw 2,133,333,333
    WritableByteChannel.MappedByteBuffer-rw 2,214,054,054
    WritableByteChannel.MappedByteBuffer-rw 2,226,086,956
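The modified test code was not posted, so the following is only a guess at the kind of change described: writing a `byte[]` per call instead of one byte per call. The class `ChunkedWriteSketch`, its method names, and the chunk size are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; the commenter's actual code is not shown.
public class ChunkedWriteSketch {

    // One write(int) call per byte, relying on BufferedOutputStream to batch them.
    public static void writeSingleBytes(Path file, byte[] data) throws IOException {
        try (OutputStream out =
                new BufferedOutputStream(new FileOutputStream(file.toFile()))) {
            for (byte b : data) {
                out.write(b);
            }
        }
    }

    // One write(byte[], int, int) call per chunk on a plain FileOutputStream.
    public static void writeChunks(Path file, byte[] data, int chunkSize)
            throws IOException {
        try (OutputStream out = new FileOutputStream(file.toFile())) {
            for (int off = 0; off < data.length; off += chunkSize) {
                out.write(data, off, Math.min(chunkSize, data.length - off));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[8 * 1024 * 1024];
        Path file = Files.createTempFile("chunked-sketch", ".dat");
        try {
            long t0 = System.nanoTime();
            writeSingleBytes(file, data);
            long t1 = System.nanoTime();
            writeChunks(file, data, 4096);
            long t2 = System.nanoTime();
            System.out.printf("single bytes: %,d ns, 4096-byte chunks: %,d ns%n",
                    t1 - t0, t2 - t1);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Both methods produce the same file; only the number of calls into the stream differs, which is the variable the comment says was changed.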

  22. I know it is an old topic. Nevertheless, I attach my results, running on a virtualized Red Hat system (400MB size). I was interested in the effect of varying the block size. Perhaps somebody out there can clear up for me the effect of the larger block size (better IO vs. more CPU time).

    Start time:2020-10-28 10:49:19.225
    Nr.of iterations: 1 loops
    Page size : 1 KBytes
    RandomAccessFile write=231,177,333 read=733,524,355 (bytes/sec), CPUtime=11671 (millSec)
    BufferedStreamFile write=66,562,662 read=335,627,663 (bytes/sec), CPUtime=36884 (millSec)
    BufferedChannelFile write=215,533,571 read=334,859,385 (bytes/sec), CPUtime=15634 (millSec)
    MemoryMappedFile write=519,006,588 read=761,904,761 (bytes/sec), CPUtime=6647 (millSec)
    End time:2020-10-28 10:50:34.827


    Start time:2020-10-28 10:56:30.198
    Nr.of iterations: 1 loops
    Page size : 4 KBytes
    RandomAccessFile write=342,017,368 read=1,321,929,966 (bytes/sec), CPUtime=30170 (millSec)
    BufferedStreamFile write=137,500,419 read=332,481,026 (bytes/sec), CPUtime=84231 (millSec)
    BufferedChannelFile write=334,558,523 read=392,487,543 (bytes/sec), CPUtime=45373 (millSec)
    MemoryMappedFile write=351,905,150 read=478,364,963 (bytes/sec), CPUtime=40418 (millSec)
    End time:2020-10-28 10:59:58.12


    Start time:2020-10-28 11:01:24.82
    Nr.of iterations: 1 loops
    Page size : 8 KBytes
    RandomAccessFile write=341,504,085 read=1,449,911,504 (bytes/sec), CPUtime=59296 (millSec)
    BufferedStreamFile write=261,896,769 read=291,810,636 (bytes/sec), CPUtime=118721 (millSec)
    BufferedChannelFile write=329,730,926 read=396,726,233 (bytes/sec), CPUtime=91002 (millSec)
    MemoryMappedFile write=309,991,864 read=429,947,253 (bytes/sec), CPUtime=90976 (millSec)
    End time:2020-10-28 11:07:36.88
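How the CPU time above was measured is not stated. One way to collect throughput and CPU time per block size, assuming `ThreadMXBean.getCurrentThreadCpuTime()` is an acceptable measure, is sketched below. The class `BlockSizeSketch`, the method `timedWrite`, and the file size are hypothetical and not taken from the comment.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not the code that produced the results above.
public class BlockSizeSketch {

    // Returns {bytesWritten, elapsedNanos, cpuNanos}; cpuNanos is -1 when the
    // JVM cannot report CPU time for the current thread.
    public static long[] timedWrite(Path file, long fileSize, int blockSize)
            throws IOException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuAvailable = threads.isCurrentThreadCpuTimeSupported()
                && threads.isThreadCpuTimeEnabled();
        byte[] block = new byte[blockSize];
        long written = 0;
        long cpuStart = cpuAvailable ? threads.getCurrentThreadCpuTime() : -1;
        long start = System.nanoTime();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            while (written < fileSize) {
                int len = (int) Math.min(blockSize, fileSize - written);
                raf.write(block, 0, len);
                written += len;
            }
        }
        long elapsed = System.nanoTime() - start;
        long cpu = cpuAvailable ? threads.getCurrentThreadCpuTime() - cpuStart : -1;
        return new long[] {written, elapsed, cpu};
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("block-size-sketch", ".dat");
        try {
            for (int kb : new int[] {1, 4, 8}) {
                long[] r = timedWrite(file, 100L * 1024 * 1024, kb * 1024);
                long bytesPerSec = r[0] * 1_000_000_000L / Math.max(1, r[1]);
                System.out.printf("block=%dKB write=%,d bytes/sec cpu=%,d ns%n",
                        kb, bytesPerSec, r[2]);
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Measuring CPU time on the writing thread alone separates time spent in the JVM and system calls from time spent waiting on the device, which is the trade-off the question asks about.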
