Previously, I’ve written about doing fun IO stuff in C#. I found out that some of my old tricks still worked in C#, but…
Having done a lot of C++, I knew about asynchronous IO, both buffered and un-buffered, and could have made unmanaged code calls to open or create the file and pass the handle back. But, just like it sounds, that is kind of a pain to set up, and if you are going down that path you might as well code it all up in C++ anyway.
I was mostly right. I have been working on a file sync tool for managing all my SQL Server backup files. Naturally, I wanted it to be as fast as humanly possible. Wanting that speed and getting it from the CLR are two completely different things. I know how to do asynchronous IO, and with a little trick, you can do un-buffered IO as well. The really crappy part is you can’t do both in the CLR.
From my previous post, you know that SQL Server does asynchronous, un-buffered IO on reads and writes. The CLR allows you to do asynchronous reads with a fun bit of coding and a callback structure. I took this code from one of the best papers on C# and IO: Sequential File Programming Patterns and Performance with .NET. I made some minor changes and cleaned up the code a bit.
using System;
using System.IO;
using System.Threading;

internal class AsyncFileCopy
{
    // globals
    private const int Buffers = 8;                        // number of outstanding requests
    private const int BufferSize = 8 * 1024 * 1024;       // request size, eight megabytes
    public static FileStream Source;                      // source file stream
    public static FileStream Target;                      // target file stream
    public static long TotalBytes;                        // total bytes to process
    public static long BytesRead;                         // bytes read so far
    public static long BytesWritten;                      // bytes written so far
    public static long Pending;                           // number of writes in flight
    public static object WriteCountMutex = new object();  // mutex to protect count

    // Array of buffers and async results.
    public static AsyncRequestState[] Request = new AsyncRequestState[Buffers];

    public static void AsyncBufferedFileCopy(string inputfile, string outputfile)
    {
        Source = new FileStream(inputfile,           // open source file
            FileMode.Open,                           // for read
            FileAccess.Read,
            FileShare.Read,                          // allow other readers
            BufferSize,                              // buffer size
            FileOptions.Asynchronous);               // use async
        Target = new FileStream(outputfile,          // create target file
            FileMode.Create,                         // overwrite if it exists
            FileAccess.Write,                        // will write the file
            FileShare.None,                          // exclusive access
            BufferSize,                              // buffer size
            FileOptions.Asynchronous);               // async, but still buffered

        TotalBytes = Source.Length;                  // size of source file
        Target.SetLength(TotalBytes);                // set target file length to avoid file growth
        var writeCompleteCallback = new AsyncCallback(WriteCompleteCallback);

        for (int i = 0; i < Buffers; i++)
            Request[i] = new AsyncRequestState(i);

        // launch initial async reads
        for (int i = 0; i < Buffers; i++)
        {
            // no callback on reads.
            Request[i].ReadAsyncResult = Source.BeginRead(Request[i].Buffer, 0, BufferSize, null, i);
            Request[i].ReadLaunched.Set();           // say that read is launched
        }

        // wait for the reads to complete in order, process buffer and then write it.
        for (int i = 0; BytesRead < TotalBytes; i = (i + 1) % Buffers)
        {
            Request[i].ReadLaunched.WaitOne();       // wait for flag that says buffer is reading
            int bytes = Source.EndRead(Request[i].ReadAsyncResult); // wait for read complete
            BytesRead += bytes;
            // process the buffer <your code goes here>
            Interlocked.Increment(ref Pending);      // count the write as in flight
            Target.BeginWrite(Request[i].Buffer, 0, bytes, writeCompleteCallback, i); // write it
        } // end of reader loop

        while (Interlocked.Read(ref Pending) > 0)
            Thread.Sleep(10);                        // wait for all the writes to complete

        Source.Close();
        Target.Close();                              // close the files
    }

    // Asynchronous callback completes writes and issues next read
    public static void WriteCompleteCallback(IAsyncResult ar)
    {
        lock (WriteCountMutex)                       // protect the shared variables
        {
            int i = Convert.ToInt32(ar.AsyncState);  // get request index
            Target.EndWrite(ar);                     // mark the write complete
            BytesWritten += BufferSize;              // advance bytes written (overcounts a final partial block)
            Request[i].BufferOffset += Buffers * BufferSize; // stride to next slot
            if (Request[i].BufferOffset < TotalBytes)
            {
                // if not all read, issue next read at that offset
                Source.Position = Request[i].BufferOffset;
                Request[i].ReadAsyncResult = Source.BeginRead(Request[i].Buffer, 0, BufferSize, null, i);
                Request[i].ReadLaunched.Set();
            }
            Interlocked.Decrement(ref Pending);      // this write is no longer in flight
        }
    }

    #region Nested type: AsyncRequestState

    // structure to hold IO request buffer and result.
    public class AsyncRequestState
    {
        // data that tracks each async request
        public byte[] Buffer;                        // IO buffer to hold read/write data
        public long BufferOffset;                    // buffer strides thru file Buffers*BufferSize
        public IAsyncResult ReadAsyncResult;         // handle for read requests to EndRead() on.
        public AutoResetEvent ReadLaunched;          // event signals start of read

        public AsyncRequestState(int i)
        {
            BufferOffset = i * BufferSize;           // offset in file where buffer reads/writes
            ReadLaunched = new AutoResetEvent(false); // semaphore says reading (not writing)
            Buffer = new byte[BufferSize];           // allocates the buffer
        }
    } // end AsyncRequestState declaration

    #endregion
}
The fun bit about this code is you don’t need to spawn your own threads to do the work. All of this happens from a single thread call and the async work happens in the background. I do make sure to set the target file to its final length up front, to prevent dropping back into synchronous mode on file growths.
This next bit is the un-buffered stuff.
internal class UnBufferedFileCopy
{
    public static int CopyBufferSize = 8 * 1024 * 1024;
    public static byte[] Buffer = new byte[CopyBufferSize];

    // The Win32 FILE_FLAG_NO_BUFFERING value; FileOptions has no named member for it.
    const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;

    public static int CopyFileUnbuffered(string inputfile, string outputfile)
    {
        // The FileStream buffer size is deliberately tiny (8 bytes); the 8 MB
        // Read and Write calls below are much larger than it.
        using (var infile = new FileStream(inputfile, FileMode.Open, FileAccess.Read,
            FileShare.None, 8, FileFlagNoBuffering | FileOptions.SequentialScan))
        using (var outfile = new FileStream(outputfile, FileMode.Create, FileAccess.Write,
            FileShare.None, 8, FileOptions.WriteThrough))
        {
            int bytesRead;
            while ((bytesRead = infile.Read(Buffer, 0, CopyBufferSize)) != 0)
            {
                outfile.Write(Buffer, 0, bytesRead);
            }
        } // disposing closes outfile first, then infile

        return 1;
    }
}
Since this is a synchronous call, I’m not worried about extending the file up front for performance. There is still the fragmentation issue to worry about, but without the pre-sizing the code is a bit cleaner. The secret sauce on this one is creating your own file option and passing it in.
const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;
I hear you asking now, where did this thing come from? Well, that is simple: it is a regular flag you can pass in when you create a file handle in C or C++. I got curious as to what the CLR was actually doing in the background. It has to make a call to the OS at some point, and that means unmanaged code.
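To make the connection concrete, here is a small sketch that prints the numeric values of a few FileOptions members next to the hand-made flag. The class name FileOptionsFlagsSketch is made up for illustration. The point it tries to show is that the named members carry the same bit values as the Win32 FILE_FLAG_* constants, so casting 0x20000000 simply fills in a bit that has no name in the enum.

```csharp
using System;
using System.IO;

// Illustrative only: the class name is hypothetical, the enum values are real.
internal static class FileOptionsFlagsSketch
{
    // Same cast used in the un-buffered copy code above.
    public const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;

    // Format a FileOptions value as an 8-digit hex string.
    public static string Hex(FileOptions options)
    {
        return "0x" + ((int)options).ToString("X8");
    }

    public static void Main()
    {
        Console.WriteLine(Hex(FileOptions.WriteThrough) + " WriteThrough");     // 0x80000000
        Console.WriteLine(Hex(FileOptions.Asynchronous) + " Asynchronous");     // 0x40000000
        Console.WriteLine(Hex(FileFlagNoBuffering) + " (no named member)");     // 0x20000000
        Console.WriteLine(Hex(FileOptions.SequentialScan) + " SequentialScan"); // 0x08000000
    }
}
```

Those are the same numbers that show up as FILE_FLAG_WRITE_THROUGH, FILE_FLAG_OVERLAPPED, FILE_FLAG_NO_BUFFERING and FILE_FLAG_SEQUENTIAL_SCAN in the unmanaged version.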
using System;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

internal class UnmanagedFileCopy
{
    public static int CopyBufferSize = 8 * 1024 * 1024;
    public static byte[] Buffer = new byte[CopyBufferSize];

    private const int FILE_FLAG_NO_BUFFERING = unchecked(0x20000000);
    private const int FILE_FLAG_OVERLAPPED = unchecked(0x40000000);
    private const int FILE_FLAG_SEQUENTIAL_SCAN = unchecked(0x08000000);
    private const int FILE_FLAG_WRITE_THROUGH = unchecked((int)0x80000000);
    private const int FILE_FLAG_NONE = unchecked(0x00000000);

    public static FileStream infile;
    public static SafeFileHandle inhandle;
    public static FileStream outfile;
    public static SafeFileHandle outhandle;

    [DllImport("KERNEL32", SetLastError = true, CharSet = CharSet.Auto, BestFitMapping = false)]
    private static extern SafeFileHandle CreateFile(String fileName,
        int desiredAccess,
        FileShare shareMode,
        IntPtr securityAttrs,
        FileMode creationDisposition,
        int flagsAndAttributes,
        IntPtr templateFile);

    public static void CopyUnmanaged(string inputfile, string outputfile)
    {
        // FileMode.Create and FileMode.Open have the same values as CREATE_ALWAYS and OPEN_EXISTING;
        // FileAccess.Read and FileAccess.Write equal FILE_READ_DATA and FILE_WRITE_DATA.
        // Note: with FILE_FLAG_NO_BUFFERING on the target, a final write that is not
        // a multiple of the sector size is expected to fail.
        outhandle = CreateFile(outputfile, (int)FileAccess.Write, FileShare.None, IntPtr.Zero,
            FileMode.Create, FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH, IntPtr.Zero);
        inhandle = CreateFile(inputfile, (int)FileAccess.Read, FileShare.None, IntPtr.Zero,
            FileMode.Open, FILE_FLAG_NO_BUFFERING | FILE_FLAG_SEQUENTIAL_SCAN, IntPtr.Zero);

        outfile = new FileStream(outhandle, FileAccess.Write, 8, false);
        infile = new FileStream(inhandle, FileAccess.Read, 8, false);

        int bytesRead;
        while ((bytesRead = infile.Read(Buffer, 0, CopyBufferSize)) != 0)
        {
            outfile.Write(Buffer, 0, bytesRead);
        }

        // Dispose closes each stream; disposing the handles afterwards is harmless.
        outfile.Dispose();
        outhandle.Dispose();
        infile.Dispose();
        inhandle.Dispose();
    }
}
If I were building my own unmanaged calls, this would be it. When you profile the managed code for object creates and destroys, you see that it is making calls to SafeFileHandle. Being the curious guy I am, I did a little more digging. For those of you who don’t know, there is an open source implementation of the Common Language Runtime called Mono. That means you can download the source code and take a look at how things are done. Poking around in the FileStream and associated code, I saw that it had all the file flags in the code but had commented out un-buffered…
Now I had a mystery on my hands. I tried to implement asynchronous un-buffered IO using all unmanaged code calls and couldn’t do it. There is a fundamental difference between a byte array in the CLR and what I can set up in native C++. One of the things you have to be able to do if you want asynchronous un-buffered IO is to sector align all reads and writes, including in and out of memory buffers. You can’t do that with a managed byte array in C#. You have to allocate an unmanaged segment of memory and handle the reads and writes through that buffer. At the end of the day, you have written all the C++ you need to do the file copy stuff and wrapped it in a managed code loop.
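For a feel of what "allocate an unmanaged segment of memory" involves, here is a minimal sketch of getting a sector-aligned buffer from C#. It is not the code from my tool. AlignedBufferSketch and AllocateAligned are hypothetical names, and the 4096-byte sector size is an assumption; a real tool would ask the volume for its actual sector size. The idea is to over-allocate with Marshal.AllocHGlobal and round the address up to the next aligned boundary.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration; names and sizes are assumptions.
internal static class AlignedBufferSketch
{
    // Over-allocates unmanaged memory and returns an address inside the block
    // that is a multiple of 'alignment' (which must be a power of two).
    // 'rawBlock' is the pointer that must later be passed to FreeHGlobal.
    public static IntPtr AllocateAligned(int size, int alignment, out IntPtr rawBlock)
    {
        rawBlock = Marshal.AllocHGlobal(size + alignment);
        long address = rawBlock.ToInt64();
        long aligned = (address + alignment - 1) & ~((long)alignment - 1);
        return new IntPtr(aligned);
    }

    public static void Main()
    {
        const int sectorSize = 4096;             // assumed sector size
        const int bufferSize = 8 * 1024 * 1024;  // same 8 MB request size as above

        IntPtr raw;
        IntPtr buffer = AllocateAligned(bufferSize, sectorSize, out raw);
        try
        {
            Console.WriteLine(buffer.ToInt64() % sectorSize == 0); // True
        }
        finally
        {
            Marshal.FreeHGlobal(raw); // free the original pointer, not the aligned one
        }
    }
}
```

Getting the aligned pointer is the easy part. Every read and write then has to go through native calls that take that pointer, which is where the "you have written all the C++ anyway" feeling comes from.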
So, you can do asynchronous OR un-buffered, but not both. From Sequential File Programming Patterns and Performance with .NET:
the FileStream class does a fine job. Most applications do not need or want un-buffered IO. But, some applications like database systems and file copy utilities want the performance and control un-buffered IO offers.
And that is a real shame. I’d love to write some high performance IO stuff in C#. I settled on doing un-buffered IO since these copies go from a SQL Server, which will always be under some kind of memory pressure, to the file server. If I could do both asynchronous and un-buffered, I could get close to wire speed, around 105 to 115 megabytes a second. Just doing un-buffered gets me around 80 megabytes per second. Not horrible, but not the best.
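To put those rates in perspective, here is a quick back-of-the-envelope sketch. The 100 GB backup size is a made-up figure for illustration, CopyTimeSketch and MinutesToCopy are hypothetical names, and 110 megabytes per second is just the midpoint of the wire-speed range above.

```csharp
using System;
using System.Globalization;

// Illustrative arithmetic only; the backup size is hypothetical.
internal static class CopyTimeSketch
{
    // Minutes needed to move 'sizeMegabytes' at a sustained 'megabytesPerSecond'.
    public static double MinutesToCopy(double sizeMegabytes, double megabytesPerSecond)
    {
        return sizeMegabytes / megabytesPerSecond / 60.0;
    }

    public static void Main()
    {
        const double backupMegabytes = 100 * 1024; // hypothetical 100 GB backup

        foreach (double rate in new[] { 80.0, 110.0 })
        {
            Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
                "{0} MB/s: {1:F1} minutes", rate, MinutesToCopy(backupMegabytes, rate)));
        }
        // 80 MB/s: 21.3 minutes
        // 110 MB/s: 15.5 minutes
    }
}
```

For a file that size, the gap between un-buffered only and near wire speed works out to roughly six minutes per copy.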