Many applications record a series of events to file-based storage for later use. This can be anything from logging and auditing, through to keeping a transaction redo log in an event sourced design or its close relative CQRS.
Java has a number of means by which a file can be sequentially written to, or read back again. This article explores some of these mechanisms to understand their performance characteristics.
For the scope of this article I will be using pre-allocated files because I want to focus on performance. Constantly extending a file imposes a significant performance overhead and adds jitter to an application resulting in highly variable latency.
"Why is a pre-allocated file better performance?", I hear you ask. Well, on disk a file is made up from a series of blocks/pages containing the data. Firstly, it is important that these blocks are contiguous to provide fast sequential access. Secondly, meta-data must be allocated to describe this file on disk and saved within the file-system. A typical large file will have a number of "indirect" blocks allocated to describe the chain of data-blocks containing the file contents that make up part of this meta-data. I'll leave it as an exercise for the reader, or maybe a later article, to explore the performance impact of not preallocating the data files. If you have used a database you may have noticed that it preallocates the files it will require.
The Test
I want to experiment with 2 file sizes. One that is sufficiently large to test sequential access, but can easily fit in the file-system cache, and another that is much larger so that the cache subsystem is forced to retire pages so that new ones can be loaded. For these two cases I'll use 400MB and 8GB respectively. I'll also loop over the files a number of times to show the pre and post warm-up characteristics.
I'll test 4 means of writing and reading back files sequentially:
- RandomAccessFile using a vanilla byte[] of page size.
- Buffered FileInputStream and FileOutputStream.
- NIO FileChannel with ByteBuffer of page size.
- Memory mapping a file using NIO and direct MappedByteBuffer.
The Code
import java.io.*;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

import static java.lang.Integer.MAX_VALUE;
import static java.lang.System.out;
import static java.nio.channels.FileChannel.MapMode.READ_ONLY;
import static java.nio.channels.FileChannel.MapMode.READ_WRITE;

public final class TestSequentialIoPerf {
    public static final int PAGE_SIZE = 1024 * 4;
    public static final long FILE_SIZE = PAGE_SIZE * 2000L * 1000L;
    public static final String FILE_NAME = "test.dat";
    public static final byte[] BLANK_PAGE = new byte[PAGE_SIZE];

    public static void main(final String[] arg) throws Exception {
        preallocateTestFile(FILE_NAME);

        // Run each test case five times to show pre and post warm-up behaviour.
        for (final PerfTestCase testCase : testCases) {
            for (int i = 0; i < 5; i++) {
                System.gc();
                long writeDurationMs = testCase.test(PerfTestCase.Type.WRITE, FILE_NAME);

                System.gc();
                long readDurationMs = testCase.test(PerfTestCase.Type.READ, FILE_NAME);

                long bytesReadPerSec = (FILE_SIZE * 1000L) / readDurationMs;
                long bytesWrittenPerSec = (FILE_SIZE * 1000L) / writeDurationMs;

                out.format("%s\twrite=%,d\tread=%,d bytes/sec\n",
                           testCase.getName(), bytesWrittenPerSec, bytesReadPerSec);
            }
        }

        deleteFile(FILE_NAME);
    }

    // Writes blank pages so that the file is fully allocated before the tests run.
    private static void preallocateTestFile(final String fileName) throws Exception {
        RandomAccessFile file = new RandomAccessFile(fileName, "rw");
        for (long i = 0; i < FILE_SIZE; i += PAGE_SIZE) {
            file.write(BLANK_PAGE, 0, PAGE_SIZE);
        }
        file.close();
    }

    private static void deleteFile(final String testFileName) throws Exception {
        File file = new File(testFileName);
        if (!file.delete()) {
            out.println("Failed to delete test file=" + testFileName);
            out.println("Windows does not allow mapped files to be deleted.");
        }
    }

    public abstract static class PerfTestCase {
        public enum Type { READ, WRITE }

        private final String name;
        private int checkSum;

        public PerfTestCase(final String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }

        // Times one pass and verifies that the read checksum matches the last write.
        public long test(final Type type, final String fileName) {
            long start = System.currentTimeMillis();

            try {
                switch (type) {
                    case WRITE: {
                        checkSum = testWrite(fileName);
                        break;
                    }
                    case READ: {
                        final int checkSum = testRead(fileName);
                        if (checkSum != this.checkSum) {
                            final String msg = getName() + " expected=" + this.checkSum + " got=" + checkSum;
                            throw new IllegalStateException(msg);
                        }
                        break;
                    }
                }
            } catch (Exception ex) {
                ex.printStackTrace();
            }

            return System.currentTimeMillis() - start;
        }

        public abstract int testWrite(final String fileName) throws Exception;
        public abstract int testRead(final String fileName) throws Exception;
    }

    private static PerfTestCase[] testCases = {
        new PerfTestCase("RandomAccessFile") {
            public int testWrite(final String fileName) throws Exception {
                RandomAccessFile file = new RandomAccessFile(fileName, "rw");
                final byte[] buffer = new byte[PAGE_SIZE];
                int pos = 0;
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++) {
                    byte b = (byte)i;
                    checkSum += b;
                    buffer[pos++] = b;

                    // Only call into the file once a full page has been assembled.
                    if (PAGE_SIZE == pos) {
                        file.write(buffer, 0, PAGE_SIZE);
                        pos = 0;
                    }
                }

                file.close();
                return checkSum;
            }

            public int testRead(final String fileName) throws Exception {
                RandomAccessFile file = new RandomAccessFile(fileName, "r");
                final byte[] buffer = new byte[PAGE_SIZE];
                int checkSum = 0;
                int bytesRead;

                while (-1 != (bytesRead = file.read(buffer))) {
                    for (int i = 0; i < bytesRead; i++) {
                        checkSum += buffer[i];
                    }
                }

                file.close();
                return checkSum;
            }
        },

        new PerfTestCase("BufferedStreamFile") {
            public int testWrite(final String fileName) throws Exception {
                int checkSum = 0;
                OutputStream out = new BufferedOutputStream(new FileOutputStream(fileName));

                for (long i = 0; i < FILE_SIZE; i++) {
                    byte b = (byte)i;
                    checkSum += b;
                    out.write(b);
                }

                out.close();
                return checkSum;
            }

            public int testRead(final String fileName) throws Exception {
                int checkSum = 0;
                InputStream in = new BufferedInputStream(new FileInputStream(fileName));
                int b;

                while (-1 != (b = in.read())) {
                    checkSum += (byte)b;
                }

                in.close();
                return checkSum;
            }
        },

        new PerfTestCase("BufferedChannelFile") {
            public int testWrite(final String fileName) throws Exception {
                FileChannel channel = new RandomAccessFile(fileName, "rw").getChannel();
                ByteBuffer buffer = ByteBuffer.allocate(PAGE_SIZE);
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++) {
                    byte b = (byte)i;
                    checkSum += b;
                    buffer.put(b);

                    if (!buffer.hasRemaining()) {
                        // Flip before writing so the channel sees the filled page.
                        buffer.flip();
                        channel.write(buffer);
                        buffer.clear();
                    }
                }

                channel.close();
                return checkSum;
            }

            public int testRead(final String fileName) throws Exception {
                FileChannel channel = new RandomAccessFile(fileName, "rw").getChannel();
                ByteBuffer buffer = ByteBuffer.allocate(PAGE_SIZE);
                int checkSum = 0;

                while (-1 != (channel.read(buffer))) {
                    buffer.flip();
                    while (buffer.hasRemaining()) {
                        checkSum += buffer.get();
                    }
                    buffer.clear();
                }

                // Close the channel, as the other test cases do.
                channel.close();
                return checkSum;
            }
        },

        new PerfTestCase("MemoryMappedFile") {
            public int testWrite(final String fileName) throws Exception {
                FileChannel channel = new RandomAccessFile(fileName, "rw").getChannel();
                MappedByteBuffer buffer = channel.map(READ_WRITE, 0, Math.min(channel.size(), MAX_VALUE));
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++) {
                    // A single mapping is limited to Integer.MAX_VALUE bytes, so remap as needed.
                    if (!buffer.hasRemaining()) {
                        buffer = channel.map(READ_WRITE, i, Math.min(channel.size() - i, MAX_VALUE));
                    }

                    byte b = (byte)i;
                    checkSum += b;
                    buffer.put(b);
                }

                channel.close();
                return checkSum;
            }

            public int testRead(final String fileName) throws Exception {
                FileChannel channel = new RandomAccessFile(fileName, "rw").getChannel();
                MappedByteBuffer buffer = channel.map(READ_ONLY, 0, Math.min(channel.size(), MAX_VALUE));
                int checkSum = 0;

                for (long i = 0; i < FILE_SIZE; i++) {
                    if (!buffer.hasRemaining()) {
                        // Remap read-only, consistent with the initial mapping.
                        buffer = channel.map(READ_ONLY, i, Math.min(channel.size() - i, MAX_VALUE));
                    }

                    checkSum += buffer.get();
                }

                channel.close();
                return checkSum;
            }
        },
    };
}
Results
400MB file
===========
RandomAccessFile write=379,610,750 read=1,452,482,269 bytes/sec
RandomAccessFile write=294,041,636 read=1,494,890,510 bytes/sec
RandomAccessFile write=250,980,392 read=1,422,222,222 bytes/sec
RandomAccessFile write=250,366,748 read=1,388,474,576 bytes/sec
RandomAccessFile write=260,394,151 read=1,422,222,222 bytes/sec
BufferedStreamFile write=98,178,331 read=286,433,566 bytes/sec
BufferedStreamFile write=100,244,738 read=288,857,545 bytes/sec
BufferedStreamFile write=82,948,562 read=154,100,827 bytes/sec
BufferedStreamFile write=108,503,311 read=153,869,271 bytes/sec
BufferedStreamFile write=113,055,478 read=152,608,047 bytes/sec
BufferedChannelFile write=228,443,948 read=356,173,913 bytes/sec
BufferedChannelFile write=265,629,053 read=374,063,926 bytes/sec
BufferedChannelFile write=223,825,136 read=1,539,849,624 bytes/sec
BufferedChannelFile write=232,992,036 read=1,539,849,624 bytes/sec
BufferedChannelFile write=212,779,220 read=1,534,082,397 bytes/sec
MemoryMappedFile write=300,955,180 read=305,899,925 bytes/sec
MemoryMappedFile write=313,149,847 read=310,538,286 bytes/sec
MemoryMappedFile write=326,374,501 read=303,857,566 bytes/sec
MemoryMappedFile write=327,680,000 read=304,535,315 bytes/sec
MemoryMappedFile write=326,895,450 read=303,632,320 bytes/sec
8GB File
============
RandomAccessFile write=167,402,321 read=251,922,012 bytes/sec
RandomAccessFile write=193,934,802 read=257,052,307 bytes/sec
RandomAccessFile write=192,948,159 read=248,460,768 bytes/sec
RandomAccessFile write=191,814,180 read=245,225,408 bytes/sec
RandomAccessFile write=190,635,762 read=275,315,073 bytes/sec
BufferedStreamFile write=154,823,102 read=248,355,313 bytes/sec
BufferedStreamFile write=152,083,913 read=253,418,301 bytes/sec
BufferedStreamFile write=133,099,369 read=146,056,197 bytes/sec
BufferedStreamFile write=131,065,708 read=146,217,827 bytes/sec
BufferedStreamFile write=132,694,052 read=148,116,004 bytes/sec
BufferedChannelFile write=186,703,740 read=215,075,218 bytes/sec
BufferedChannelFile write=190,591,410 read=211,030,680 bytes/sec
BufferedChannelFile write=187,220,038 read=223,087,606 bytes/sec
BufferedChannelFile write=191,585,397 read=221,297,747 bytes/sec
BufferedChannelFile write=192,653,214 read=211,789,038 bytes/sec
MemoryMappedFile write=123,023,322 read=231,530,156 bytes/sec
MemoryMappedFile write=121,961,023 read=230,403,600 bytes/sec
MemoryMappedFile write=123,317,778 read=229,899,250 bytes/sec
MemoryMappedFile write=121,472,738 read=231,739,745 bytes/sec
MemoryMappedFile write=120,362,615 read=231,190,382 bytes/sec
Analysis
For years I was a big fan of using RandomAccessFile directly because of the control it gives and the predictable execution. I never found using buffered streams to be useful from a performance perspective and this still seems to be the case.
In more recent testing I've found that NIO FileChannel and ByteBuffer do much better. With Java 7 the flexibility of this programming approach has been improved for random access with SeekableByteChannel.
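As a minimal sketch of the Java 7 random-access style mentioned above, the following writes one page at a chosen offset and reads it back through a SeekableByteChannel. The class name, the method names, and the temporary file are hypothetical and only there for illustration; this is not part of the benchmark.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the benchmark above.
final class SeekableChannelSketch {
    static final int PAGE_SIZE = 4096;

    // Writes one page filled with 'value' at the given page index.
    static void writePage(Path path, long pageIndex, byte value) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(
                path, StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ByteBuffer buffer = ByteBuffer.allocate(PAGE_SIZE);
            while (buffer.hasRemaining()) {
                buffer.put(value);
            }
            buffer.flip();

            // Seek to the start of the page, then write until the buffer is drained.
            channel.position(pageIndex * PAGE_SIZE);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
    }

    // Reads the page at the given index and returns the sum of its bytes.
    static int readPageCheckSum(Path path, long pageIndex) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(PAGE_SIZE);
            channel.position(pageIndex * PAGE_SIZE);
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // Keep reading until the page is full or the end of the file is reached.
            }
            buffer.flip();

            int checkSum = 0;
            while (buffer.hasRemaining()) {
                checkSum += buffer.get();
            }
            return checkSum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("seekable", ".dat");
        try {
            writePage(path, 3, (byte) 1);
            System.out.println("checkSum=" + readPageCheckSum(path, 3));
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

The position() call is what RandomAccessFile.seek() would do in the older API; the rest is the same flip/write/clear pattern used in the channel test.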
It seems that for reading, RandomAccessFile and NIO do very well, with memory-mapped files winning for writes in some cases.
I've seen these results vary greatly depending on platform. File system, OS, storage devices, and available memory all have a significant impact. In a few cases I've seen memory-mapped files perform significantly better than the others but this needs to be tested on your platform because your mileage may vary...
A special note should be made for the use of memory-mapped large files when pushing for maximum throughput. I've often found the OS can become unresponsive due to the pressure put on the virtual memory sub-system.
Conclusion
There is a significant difference in performance for the different means of doing sequential file IO from Java. Not all methods are even remotely equal. For most IO I've found the use of ByteBuffers and Channels to be the best optimised parts of the IO libraries. If buffered streams are your IO libraries of choice, then it is worth branching out and getting familiar with the implementations of Channel and Buffer, or even falling back and using the good old RandomAccessFile.
Interesting comparison, particularly because NIO seems to have a bad rep in some places (which I can't really judge).
Two questions: a) is there a reason the ByteBuffer for case 3 wasn't allocated direct? does it make a performance difference? and b) do you have an explanation for the fact that BufferedChannelFile on the 400MB file performs much worse on the first two runs than on the last three, whereas RandomAccessFile doesn't show this behavior?
Ingo,
a) I've found very little difference between direct and normal ByteBuffers when writing 4K buffers a byte at a time.
b) I'd guess it is an optimisation issue. Probably something related to OSR like I discussed in my last post. The RandomAccessFile example is a very simple and thin layer over file access.
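For anyone who wants to try the direct versus heap comparison from a) themselves, a rough sketch follows. It reuses the byte-at-a-time fill from the channel test and only swaps the buffer allocation. The class name, the 64MB size, and the temporary file are assumptions for illustration, and a single timed pass like this is far too crude to draw conclusions from.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, not part of the benchmark above.
final class DirectVsHeapBufferSketch {
    static final int PAGE_SIZE = 4096;

    // Fills the supplied buffer a byte at a time and writes each full page to the channel.
    static int testWrite(String fileName, ByteBuffer buffer, long fileSize) throws IOException {
        int checkSum = 0;
        try (RandomAccessFile file = new RandomAccessFile(fileName, "rw");
             FileChannel channel = file.getChannel()) {
            buffer.clear();
            for (long i = 0; i < fileSize; i++) {
                byte b = (byte) i;
                checkSum += b;
                buffer.put(b);
                if (!buffer.hasRemaining()) {
                    drain(buffer, channel);
                }
            }
            // Write out a trailing partial page, if any.
            if (buffer.position() > 0) {
                drain(buffer, channel);
            }
        }
        return checkSum;
    }

    private static void drain(ByteBuffer buffer, FileChannel channel) throws IOException {
        buffer.flip();
        while (buffer.hasRemaining()) {
            channel.write(buffer);
        }
        buffer.clear();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("buffers", ".dat");
        long fileSize = 64L * 1024 * 1024; // assumed size, small enough for a quick run
        try {
            ByteBuffer[] buffers = {
                ByteBuffer.allocate(PAGE_SIZE),       // heap buffer, as in the article
                ByteBuffer.allocateDirect(PAGE_SIZE)  // direct buffer, as asked about
            };
            for (ByteBuffer buffer : buffers) {
                long start = System.nanoTime();
                testWrite(file.getPath(), buffer, fileSize);
                long micros = (System.nanoTime() - start) / 1_000L;
                System.out.println((buffer.isDirect() ? "direct" : "heap") + " write took " + micros + "us");
            }
        } finally {
            file.delete();
        }
    }
}
```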
Quick question: it seems seldom if ever worth using BufferedStreams over regular streams, unless you actually do single-byte reads/writes (which is unlikely if you actually care about performance)? Most code that needs buffering implements it at higher level anyway.
So wouldn't it make more sense to just measure use with plain FileInputStream/FileOutputStream -- may not make a big difference in numbers, but still.
To the last commenter, the benchmark does single-byte writes to the buffered stream:
"byte b = (byte)i;
checkSum += b;
out.write(b);"
I agree that it would be interesting to see the results of buffering to a byte[] and then calling FileOutputStream.write(byte b[]).
Best,
Ismael
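The variant suggested above, buffering into a byte[] and calling FileOutputStream.write(byte[]), could look roughly like the sketch below. It is not the author's code: the class name and the fileSize parameter are assumptions, and it mirrors the RandomAccessFile test case with plain streams swapped in.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the benchmark above.
final class PlainStreamPageSketch {
    static final int PAGE_SIZE = 4096;

    // Assembles each page in a byte[] and hands it to the stream in one call.
    static int testWrite(String fileName, long fileSize) throws IOException {
        int checkSum = 0;
        int pos = 0;
        byte[] buffer = new byte[PAGE_SIZE];

        // Note: FileOutputStream truncates an existing file when opened this way.
        try (OutputStream out = new FileOutputStream(fileName)) {
            for (long i = 0; i < fileSize; i++) {
                byte b = (byte) i;
                checkSum += b;
                buffer[pos++] = b;
                if (pos == PAGE_SIZE) {
                    out.write(buffer, 0, PAGE_SIZE);
                    pos = 0;
                }
            }
            // Write out a trailing partial page, if any.
            if (pos > 0) {
                out.write(buffer, 0, pos);
            }
        }
        return checkSum;
    }

    // Reads a page at a time and sums the bytes.
    static int testRead(String fileName) throws IOException {
        int checkSum = 0;
        byte[] buffer = new byte[PAGE_SIZE];
        try (InputStream in = new FileInputStream(fileName)) {
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                for (int i = 0; i < bytesRead; i++) {
                    checkSum += buffer[i];
                }
            }
        }
        return checkSum;
    }
}
```

With this shape each stream makes the same number of calls into the file as the RandomAccessFile test, which is the like-for-like comparison being asked for.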
I wonder what would be the results on a Solaris system. Unfortunately I do not have access to one at the moment.
> I'd guess it is an optimisation issue. Probably something related to OSR like I discussed in my last post. The RandomAccessFile example is a very simple and thin layer over file access.
You suppose the overhead of a non-inlined function call can have a measurable influence on IO performance? That seems surprising to me.
By the way, did you try different buffer sizes, other than 4K? My experiments show that the difference between NIO and IO tends to decrease, and becomes only ~10% with a buffer size of about 128-512K (for my hardware, sure).
> I agree that it would be interesting to see the results of buffering to a byte[] and then calling FileOutputStream.write(byte b[]).
If I use FileInputStream/FileOutputStream directly (no Buffered*Stream) with write/read(byte[PAGE_SIZE]) then the read is significantly better and almost at RandomAccessFile levels. However the write only shows marginal improvement.
> You suppose the overhead of a non-inlined function call can have a measurable influence on IO performance?
I think it is likely to be a case of which intrinsics get applied during JIT'ing and when.
> By the way, did you try different buffer sizes, other than 4K? My experiments show that the difference between NIO and IO tends to decrease, and becomes only ~10% with a buffer size of about 128-512K (for my hardware, sure).
With a larger buffer, the chunking of this buffer down to pages/blocks of the storage will be happening outside of Java, i.e. in the kernel for the file system. So the amount of work on the Java side is relatively less with a larger buffer.
"If I use FileInputStream/FileOutputStream directly (no Buffered*Stream) with write/read(byte[PAGE_SIZE]) then the read is significantly better and almost at RandomAccessFile levels. However the write only shows marginal improvement."
Thanks. Good to know.
Best,
Ismael
Could you compare with the Path.copyTo() from Java 7?
At least in the unix case, it uses native code. It could be interesting to see the difference.
id,
Path.copyTo() does not exist as a method in my JDK 7. Can you post a snippet of code for what you expect to see?
If I have guessed correctly what you are comparing, can you expand on how this is similar to the other tests if it is simply copying one file to another? I just don't understand how it is like-for-like.
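The released JDK 7 does have java.nio.file.Files.copy(Path, Path, CopyOption...), which may be what was meant here; that is an assumption, not something confirmed in the thread. A small sketch of timing such a copy is shown below, with a hypothetical class name and temporary files. As noted above, it copies an existing file rather than generating and checksumming data, so it is not like-for-like with the other tests.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the benchmark above.
final class FilesCopySketch {
    // Copies source over target and returns the elapsed time in milliseconds.
    static long copyAndTimeMs(Path source, Path target) throws IOException {
        long start = System.nanoTime();
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("copy-source", ".dat");
        Path target = Files.createTempFile("copy-target", ".dat");
        try {
            // 1000 blank pages of 4K as sample content.
            Files.write(source, new byte[4096 * 1000]);
            long ms = copyAndTimeMs(source, target);
            System.out.println("copied " + Files.size(target) + " bytes in " + ms + "ms");
        } finally {
            Files.deleteIfExists(source);
            Files.deleteIfExists(target);
        }
    }
}
```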
Repeated the test on:
- Java 1.6.0_26 (HotSpot 64-Bit Server build 20.1-b02)
- Xeon 2GHz cpu
- Debian Squeeze
- 48G RAM
- hardware RAID1 with rotating disks (can't figure out which)
RandomAccessFile write=71,871,627 read=1,022,083,593 bytes/sec
RandomAccessFile write=91,133,607 read=1,021,446,384 bytes/sec
RandomAccessFile write=85,629,468 read=976,516,867 bytes/sec
RandomAccessFile write=87,691,879 read=981,430,454 bytes/sec
RandomAccessFile write=87,300,318 read=977,915,721 bytes/sec
BufferedStreamFile write=58,101,351 read=235,788,504 bytes/sec
BufferedStreamFile write=141,629,639 read=223,130,141 bytes/sec
BufferedStreamFile write=124,271,844 read=131,937,510 bytes/sec
BufferedStreamFile write=146,568,381 read=132,372,426 bytes/sec
BufferedStreamFile write=144,410,950 read=131,340,986 bytes/sec
BufferedChannelFile write=326,569,663 read=325,596,184 bytes/sec
BufferedChannelFile write=341,960,260 read=327,391,895 bytes/sec
BufferedChannelFile write=346,839,408 read=1,081,309,398 bytes/sec
BufferedChannelFile write=349,041,329 read=1,080,026,367 bytes/sec
BufferedChannelFile write=346,531,302 read=1,066,250,162 bytes/sec
MemoryMappedFile write=280,202,490 read=233,636,596 bytes/sec
MemoryMappedFile write=190,188,749 read=262,479,974 bytes/sec
MemoryMappedFile write=180,190,484 read=254,884,878 bytes/sec
MemoryMappedFile write=189,673,535 read=238,236,491 bytes/sec
MemoryMappedFile write=164,501,295 read=262,740,947 bytes/sec
The system was mostly idle. Dips, however, can be explained by other activity.
Very interesting. Do you have a similar analysis for sockets? Is it better to use basic sockets, as they have always existed in Java (I realize the logic beneath the API has changed greatly)? Is it better to use NIO or even third-party tools such as Netty?
Falcon,
I have done this for sockets but not in a while. Good reminder to stay current. Last time I checked, basic blocking sockets had greater throughput and lower latency than NIO sockets. You obviously need a thread per socket, so 64-bit is required for a large number of clients. I've found that frameworks like Netty and Mina add a significant overhead.
Hi,
The write part of BufferedChannelFile is flawed because channel.write(buffer) really writes nothing since the buffer is not flipped. So buffer.flip() is in order before calling channel.write().
Seven,
You are absolutely right. Good spot thanks. I'll re-run the tests and post the results. Initial runs suggest write performance is closer to RandomAccessFile.
Your RandomAccessTest stays completely inside of its loop until it needs to write a byte []. Only then does it make a method call and leave the loop. The other methods all have 4096 times as many external method invocations in the inner loop.
A fairer test would be to assemble the byte[] the same way with each method, then write that. Or, actually write a byte at a time to the RandomAccessFile.
Hotspot isn't going to inline complex methods.
A lot of the benefits of RandomAccessFile are the patterns it encourages based on its API design.
However I cannot agree with your conclusions. If you look at the generated asm code you can see that Hotspot has no trouble inlining the use of ByteBuffer and even applying intrinsics. The -server JIT compiler is very aggressive about inlining. By default Hotspot will not inline a method greater than 35 bytecodes. You can experiment by setting -XX:MaxInlineSize=35. Note this is bytecodes and not bytes. The "external method invocations" as you put it are well below this threshold.
I am curious about the generated code; I'll take a look at it. You don't agree that it would be fairer across methods to assemble the byte[] and call each IO type the same number of times?
Ross,
I believe the point of using ByteBuffers and Channels, or BufferedStreams, is so that you do not need to build up a buffer in advance. They are providing that feature by their design, otherwise why use them? If you build up a buffer in advance, I think you will find it slower because the buffer gets copied twice. The important cost is the number of times you ultimately call the IO sub-system out of your user process into the kernel.
I agree that with a Buffer you could do _less_ building up in advance, but you are testing what is essentially the worst case -- performing IO a single byte at a time. If you want to compare apples to apples, you should be calling RandomAccessFile.write(int), to write a single byte at a time.
But nobody ever does that ;) I think what you're detecting here is that the JVM is really, really good at optimizing operations over local byte arrays.
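For completeness, the byte-at-a-time RandomAccessFile variant argued about above might be sketched as follows. The class name and the fileSize parameter are assumptions. RandomAccessFile does no buffering of its own, so each write(int) and read() here is a separate call into the file, and it is best tried on a small file first.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, not part of the benchmark above.
final class SingleByteRandomAccessSketch {
    // Writes one byte per call instead of assembling a page first.
    // Opening with "rw" does not truncate, so an existing longer file keeps its tail.
    static int testWrite(String fileName, long fileSize) throws IOException {
        int checkSum = 0;
        try (RandomAccessFile file = new RandomAccessFile(fileName, "rw")) {
            for (long i = 0; i < fileSize; i++) {
                byte b = (byte) i;
                checkSum += b;
                file.write(b);
            }
        }
        return checkSum;
    }

    // Reads one byte per call until the end of the file.
    static int testRead(String fileName) throws IOException {
        int checkSum = 0;
        try (RandomAccessFile file = new RandomAccessFile(fileName, "r")) {
            int b;
            while ((b = file.read()) != -1) {
                checkSum += (byte) b;
            }
        }
        return checkSum;
    }
}
```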
I created some additional tests and used byte-array methods instead. The results are pretty much as I expected. This is on Windows 7 64 bit. Buffered streams win slightly over RandomAccessFile. BufferedChannel is about the same. Memory mapping the file doubles write performance and quadruples read performance. Here are the 400MB file results:
RandomAccessFile write=375,435,380 read=627,258,805 bytes/sec
RandomAccessFile write=332,197,891 read=646,056,782 bytes/sec
RandomAccessFile write=308,201,655 read=651,192,368 bytes/sec
RandomAccessFile write=307,969,924 read=648,101,265 bytes/sec
RandomAccessFile write=307,738,542 read=678,145,695 bytes/sec
BufferedStreamFile write=192,481,203 read=249,603,900 bytes/sec
BufferedStreamFile write=181,640,798 read=256,641,604 bytes/sec
BufferedStreamFile write=178,009,561 read=155,859,969 bytes/sec
BufferedStreamFile write=169,116,432 read=155,682,250 bytes/sec
BufferedStreamFile write=174,446,337 read=153,236,064 bytes/sec
BufferedStreamFile2 write=344,201,680 read=787,692,307 bytes/sec
BufferedStreamFile2 write=314,592,933 read=827,474,747 bytes/sec
BufferedStreamFile2 write=328,468,323 read=795,339,805 bytes/sec
BufferedStreamFile2 write=322,265,932 read=811,089,108 bytes/sec
BufferedStreamFile2 write=288,247,712 read=819,200,000 bytes/sec
BufferedChannelFile write=167,937,679 read=330,322,580 bytes/sec
BufferedChannelFile write=172,245,584 read=349,190,110 bytes/sec
BufferedChannelFile write=146,077,032 read=622,492,401 bytes/sec
BufferedChannelFile write=145,557,924 read=624,390,243 bytes/sec
BufferedChannelFile write=145,248,226 read=626,299,694 bytes/sec
BufferedChannelFile2 write=310,303,030 read=656,410,256 bytes/sec
BufferedChannelFile2 write=313,629,402 read=662,783,171 bytes/sec
BufferedChannelFile2 write=310,538,286 read=670,376,432 bytes/sec
BufferedChannelFile2 write=306,586,826 read=672,577,996 bytes/sec
BufferedChannelFile2 write=304,988,830 read=675,907,590 bytes/sec
MemoryMappedFile write=248,845,686 read=300,513,573 bytes/sec
MemoryMappedFile write=232,199,546 read=296,382,054 bytes/sec
MemoryMappedFile write=299,196,493 read=358,355,205 bytes/sec
MemoryMappedFile write=298,107,714 read=347,413,061 bytes/sec
MemoryMappedFile write=297,026,831 read=341,049,125 bytes/sec
MemoryMappedFile2 write=479,625,292 read=1,237,462,235 bytes/sec
MemoryMappedFile2 write=482,449,941 read=1,047,570,332 bytes/sec
MemoryMappedFile2 write=564,965,517 read=1,244,984,802 bytes/sec
MemoryMappedFile2 write=563,411,279 read=1,233,734,939 bytes/sec
MemoryMappedFile2 write=561,095,890 read=1,241,212,121 bytes/sec
Can you post your code to github, or somewhere similar, so others can see the approach and confirm the findings? It also needs to be tested with a much larger file to take caching out of the equation.
To me, if you are looking for the best IO performance, then using NIO and memory-mapped files in Java is best, if your application can use them. Thanks for the analysis. I also had the same opinion of RandomAccessFile and I still use it :)
Memory mapped files are really fast for reads when they are hot (loaded into memory by previous I/O).
FileInputStream / FileOutputStream and RandomAccessFile share the same native code, so the speed difference is in byte vs byte[] read / write at the Java source level.
Why didn't you use flushing to disk and clearing of the disk buffer?
The BufferedOutputStream close() method includes flushing to disk and is thus slower. If you append to the other tests
for RandomAccessFile
RandomAccessFile raf;
raf.getFD().sync();
and for FileChannel
channel.force(true);
and for MappedByteBuffer
buffer.force();
you'll see that write performance in general is the same (~10% difference only).
And it is the same situation with read performance if you clear the disk buffer before the read (sync; echo 3 > /proc/sys/vm/drop_caches).
Only the MappedByteBuffer read test is 70% faster than the others in my environment.
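The three calls listed in this comment can be put into context with the sketch below, which writes a block of data through each API and then forces it to the storage device. The class and method names are hypothetical; only getFD().sync(), FileChannel.force(true), and MappedByteBuffer.force() come from the comment.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, not part of the benchmark above.
final class ForceToDiskSketch {
    // RandomAccessFile: sync the underlying file descriptor after writing.
    static void writeAndSync(String fileName, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(fileName, "rw")) {
            raf.write(data);
            raf.getFD().sync();
        }
    }

    // FileChannel: force(true) asks for content and metadata to be written through.
    static void writeAndForce(String fileName, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(fileName, "rw");
             FileChannel channel = raf.getChannel()) {
            ByteBuffer buffer = ByteBuffer.wrap(data);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(true);
        }
    }

    // MappedByteBuffer: force() asks for changes to the mapped region to be written through.
    static void mapAndForce(String fileName, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(fileName, "rw");
             FileChannel channel = raf.getChannel()) {
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
            buffer.put(data);
            buffer.force();
        }
    }
}
```

Timing these against the unforced versions measures the storage device as much as the Java API, which is the distinction drawn in the reply that follows.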
Normal behaviour is not to force the pages to disk. It would only make sense for a database transaction log write at commit points. I don't understand why you think it is relevant for this test?
The point is not to test actual disk performance but to test the APIs from Java to the OS filesystem.
Repeated test on Win10 x64, i5-4460, 16GB RAM and MX250 500GB SSD:
JVM: HotSpot 1.8.0_91
400MB
RandomAccessFile write=414 155 712 read=1 190 697 674 bytes/sec
RandomAccessFile write=538 947 368 read=1 190 697 674 bytes/sec
RandomAccessFile write=417 959 183 read=1 183 815 028 bytes/sec
RandomAccessFile write=418 813 905 read=1 204 705 882 bytes/sec
RandomAccessFile write=419 672 131 read=1 219 047 619 bytes/sec
BufferedStreamFile write=47 694 457 read=351 587 982 bytes/sec
BufferedStreamFile write=282 093 663 read=360 246 262 bytes/sec
BufferedStreamFile write=257 934 508 read=312 910 618 bytes/sec
BufferedStreamFile write=258 912 768 read=312 910 618 bytes/sec
BufferedStreamFile write=256 480 901 read=312 671 755 bytes/sec
BufferedChannelFile write=395 748 792 read=425 779 625 bytes/sec
BufferedChannelFile write=260 559 796 read=424 016 563 bytes/sec
BufferedChannelFile write=308 433 734 read=1 058 397 932 bytes/sec
BufferedChannelFile write=308 433 734 read=426 222 684 bytes/sec
BufferedChannelFile write=306 816 479 read=428 451 882 bytes/sec
MemoryMappedFile write=846 280 991 read=918 385 650 bytes/sec
MemoryMappedFile write=476 279 069 read=505 055 487 bytes/sec
MemoryMappedFile write=498 296 836 read=930 909 090 bytes/sec
MemoryMappedFile write=495 883 777 read=928 798 185 bytes/sec
MemoryMappedFile write=495 883 777 read=922 522 522 bytes/sec
8GB
RandomAccessFile write=399 707 245 read=1 216 332 590 bytes/sec
RandomAccessFile write=527 393 291 read=1 123 885 306 bytes/sec
RandomAccessFile write=337 870 164 read=1 311 139 564 bytes/sec
RandomAccessFile write=350 775 027 read=1 436 436 963 bytes/sec
RandomAccessFile write=334 503 879 read=1 415 831 316 bytes/sec
BufferedStreamFile write=40 273 735 read=354 463 242 bytes/sec
BufferedStreamFile write=251 550 697 read=349 622 295 bytes/sec
BufferedStreamFile write=270 033 292 read=291 364 347 bytes/sec
BufferedStreamFile write=269 093 059 read=307 530 595 bytes/sec
BufferedStreamFile write=272 684 907 read=298 901 740 bytes/sec
BufferedChannelFile write=337 605 604 read=432 980 972 bytes/sec
BufferedChannelFile write=243 172 643 read=419 801 168 bytes/sec
BufferedChannelFile write=295 058 348 read=1 134 940 426 bytes/sec
BufferedChannelFile write=304 557 959 read=452 022 292 bytes/sec
BufferedChannelFile write=301 597 820 read=438 450 010 bytes/sec
MemoryMappedFile write=484 160 756 read=577 959 644 bytes/sec
MemoryMappedFile write=389 631 391 read=438 145 156 bytes/sec
MemoryMappedFile write=391 194 307 read=505 273 545 bytes/sec
MemoryMappedFile write=400 175 858 read=516 943 270 bytes/sec
MemoryMappedFile write=399 531 798 read=515 966 492 bytes/sec
I changed this test so that each test writes a byte array instead of a single byte, and also added FileOutputStream and BufferedOutputStream. Results are for an Amazon EC2 c4.2xl instance.
File size 409mb
Test bytes/second write speed
RandomAccessFile.byte[]-rw 1,870,319,634
RandomAccessFile.byte[]-rw 1,878,899,082
RandomAccessFile.byte[]-rw 1,878,899,082
BufferedOutputStream-FileOutputStream 156,634,799
BufferedOutputStream-FileOutputStream 127,126,008
BufferedOutputStream-FileOutputStream 127,047,146
FileOutputStream 127,244,485
FileOutputStream 127,126,008
FileOutputStream 127,204,968
RandomAccessFile.ByteBuffer-rw 1,587,596,899
RandomAccessFile.ByteBuffer-rw 1,836,771,300
RandomAccessFile.ByteBuffer-rw 1,870,319,634
RandomAccessFile.DirectByteBuffer-rw 1,853,393,665
RandomAccessFile.DirectByteBuffer-rw 1,969,230,769
RandomAccessFile.DirectByteBuffer-rw 1,950,476,190
RandomAccessFile.MappedByteBuffer-rw 2,202,150,537
RandomAccessFile.MappedByteBuffer-rw 2,226,086,956
RandomAccessFile.MappedByteBuffer-rw 2,226,086,956
WritableByteChannel.MappedByteBuffer-rw 2,133,333,333
WritableByteChannel.MappedByteBuffer-rw 2,214,054,054
WritableByteChannel.MappedByteBuffer-rw 2,226,086,956
I know it is an old topic. Nevertheless, I attach my results, running on a virtualized Red Hat system (400MB size). I was interested in the effect of varying the block size. Perhaps somebody out there can clear up for me the effect of the larger block size (better IO vs. more CPU time).
Start time:2020-10-28 10:49:19.225
Nr.of iterations: 1 loops
Page size : 1 KBytes
RandomAccessFile write=231,177,333 read=733,524,355 (bytes/sec), CPUtime=11671 (millSec)
BufferedStreamFile write=66,562,662 read=335,627,663 (bytes/sec), CPUtime=36884 (millSec)
BufferedChannelFile write=215,533,571 read=334,859,385 (bytes/sec), CPUtime=15634 (millSec)
MemoryMappedFile write=519,006,588 read=761,904,761 (bytes/sec), CPUtime=6647 (millSec)
End time:2020-10-28 10:50:34.827
Start time:2020-10-28 10:56:30.198
Nr.of iterations: 1 loops
Page size : 4 KBytes
RandomAccessFile write=342,017,368 read=1,321,929,966 (bytes/sec), CPUtime=30170 (millSec)
BufferedStreamFile write=137,500,419 read=332,481,026 (bytes/sec), CPUtime=84231 (millSec)
BufferedChannelFile write=334,558,523 read=392,487,543 (bytes/sec), CPUtime=45373 (millSec)
MemoryMappedFile write=351,905,150 read=478,364,963 (bytes/sec), CPUtime=40418 (millSec)
End time:2020-10-28 10:59:58.12
Start time:2020-10-28 11:01:24.82
Nr.of iterations: 1 loops
Page size : 8 KBytes
RandomAccessFile write=341,504,085 read=1,449,911,504 (bytes/sec), CPUtime=59296 (millSec)
BufferedStreamFile write=261,896,769 read=291,810,636 (bytes/sec), CPUtime=118721 (millSec)
BufferedChannelFile write=329,730,926 read=396,726,233 (bytes/sec), CPUtime=91002 (millSec)
MemoryMappedFile write=309,991,864 read=429,947,253 (bytes/sec), CPUtime=90976 (millSec)
End time:2020-10-28 11:07:36.88