Sunday, December 6, 2009


Socket.SendFile - Implementing fast file transfer

Implementing fast file transfer with Socket.SendFile

When writing network applications, we often need to implement file transfer between two hosts. For example, imagine an FTP client that downloads a file from, or uploads a file to, an FTP server. Similarly, you could be uploading an image file (probably a photo) as an attachment to a blog or a website like Facebook or Flickr.

Usually, file transfer is implemented as a Read/Write pattern, where you read from the source stream and write into the destination stream. Here the source stream is the stream constructed from the Socket, and the target stream is the file, or vice versa if the file is being transferred to a destination server.

The simple Read/Write pattern for file transfer is implemented as follows.

// ns is the NetworkStream constructed from the connected Socket.
using (FileStream fs = File.OpenRead(filename))
{
    byte[] buffer = new byte[1024];
    int read = fs.Read(buffer, 0, buffer.Length);
    while (read > 0)
    {
        ns.Write(buffer, 0, read);
        read = fs.Read(buffer, 0, buffer.Length);
    }
}

In the .NET Framework, there is a better way to do file uploads: the Socket.SendFile method, which exposes the underlying Winsock TransmitFile API. TransmitFile lets the operating system read the file and send it on the socket directly, instead of copying every block through a buffer in the application, so it is typically much faster for large files.
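Before the full benchmark, here is a minimal sketch of a single Socket.SendFile call over a loopback connection. The class name SendFileSketch and the method SendOverLoopback are made up for this illustration. The file is kept small (a few KB) on the assumption that it fits in the socket buffers, so the sketch can send and then receive on one thread.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;

static class SendFileSketch
{
    // Sends a small temp file over a loopback connection with Socket.SendFile
    // and returns the number of bytes the receiving side read.
    public static long SendOverLoopback(int fileSize)
    {
        string path = Path.GetTempFileName();
        // Port 0 asks the OS to pick any free port.
        TcpListener listener = new TcpListener(IPAddress.Loopback, 0);
        try
        {
            File.WriteAllBytes(path, new byte[fileSize]);
            listener.Start();
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;

            using (Socket sender = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
            {
                sender.Connect(IPAddress.Loopback, port);
                using (Socket receiver = listener.AcceptSocket())
                {
                    // Hand the whole file to the socket in one call.
                    sender.SendFile(path);
                    // Signal end of data so the receive loop below terminates.
                    sender.Shutdown(SocketShutdown.Send);

                    long total = 0;
                    byte[] buffer = new byte[8192];
                    int read;
                    while ((read = receiver.Receive(buffer)) > 0)
                    {
                        total += read;
                    }
                    return total;
                }
            }
        }
        finally
        {
            listener.Stop();
            File.Delete(path);
        }
    }

    static void Main()
    {
        Console.WriteLine("Received {0} bytes", SendOverLoopback(4096));
    }
}
```

For larger files the receiver would need to read concurrently (for example on a separate thread), which is what the test program in this post does.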

To measure the difference, I wrote an application that compares the performance of the Read/Write pattern with that of the Socket.SendFile method.

Here is the test program:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading;

namespace socket_sendfile
{
    delegate TimeSpan SendFileDelegate(Socket client, String filename, long fileSize);

    class Header
    {
        public long FileSize { get; set; }
        public int FileNumber { get; set; }

        public void Serialize(Stream stream)
        {
            byte[] buffer = BitConverter.GetBytes(this.FileNumber);
            stream.Write(buffer, 0, buffer.Length);
            buffer = BitConverter.GetBytes(this.FileSize);
            stream.Write(buffer, 0, buffer.Length);
        }

        // Returns null if the stream ends before a complete header arrives.
        public static Header Deserialize(Stream stream)
        {
            byte[] buffer = new byte[sizeof(int) + sizeof(long)];
            int offset = 0;
            // Stream.Read may return fewer bytes than requested, so loop until the header is complete.
            while (offset < buffer.Length)
            {
                int read = stream.Read(buffer, offset, buffer.Length - offset);
                if (read == 0)
                    return null;
                offset += read;
            }
            Header header = new Header();
            header.FileNumber = BitConverter.ToInt32(buffer, 0);
            header.FileSize = BitConverter.ToInt64(buffer, sizeof(int));
            return header;
        }
    }

    class Program
    {
        private Random rand = new Random();

        static void Main(string[] args)
        {
            Program prog = new Program();
            try
            {
                prog.StartServer();
                using (Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
                {
                    client.Bind(new IPEndPoint(IPAddress.Any, 0));
                    client.Connect(new IPEndPoint(IPAddress.Loopback, 8080));
                    prog.Run(client, new SendFileDelegate(prog.SendFile1));
                    Console.WriteLine();
                    prog.Run(client, new SendFileDelegate(prog.SendFile2));
                }
            }
            catch (Exception e)
            {
                Console.Error.WriteLine(e);
            }
        }

        void StartServer()
        {
            // Start listening before the client connects, then accept on a background thread
            // so that the server never keeps the process alive.
            TcpListener listener = new TcpListener(IPAddress.Loopback, 8080);
            listener.Start();
            Thread serverThread = new Thread(() => this.Server(listener));
            serverThread.IsBackground = true;
            serverThread.Start();
        }

        void Run(Socket client, SendFileDelegate sendFileMethod)
        {
            foreach (long size in this.GetNextSize())
            {
                String filename = Path.GetTempFileName();
                try
                {
                    this.CreateFile(filename, size);
                    for (int i = 0; i < 10; i++)
                    {
                        TimeSpan ts = sendFileMethod(client, filename, size);
                        Console.WriteLine("{0} {1} {2}", i, size, ts.TotalMilliseconds);
                    }
                }
                finally
                {
                    // Do not leave the temp files behind.
                    File.Delete(filename);
                }
            }
        }

        IEnumerable<long> GetNextSize()
        {
            long[] sizes = { 1024, 4096, 8192, 16385, 65536, 1048576 };
            for (int i = 0; i < sizes.Length; i++)
            {
                yield return sizes[i];
            }
        }

        void CreateFile(string filename, long size)
        {
            byte[] buffer = new byte[16384];
            // First write out the file, filled with random bytes.
            using (FileStream tempStream = File.OpenWrite(filename))
            using (BinaryWriter bw = new BinaryWriter(tempStream))
            {
                long remaining = size;
                while (remaining > 0)
                {
                    rand.NextBytes(buffer);
                    int writeSize = (int)Math.Min(buffer.Length, remaining);
                    bw.Write(buffer, 0, writeSize);
                    remaining -= writeSize;
                }
            }
        }

        // Read/Write pattern: copy the file to the socket through a user-mode buffer.
        TimeSpan SendFile1(Socket client, String filename, long fileSize)
        {
            Stopwatch timer = new Stopwatch();
            timer.Start();
            // NetworkStream does not own the socket here, so disposing it leaves the connection open.
            using (NetworkStream ns = new NetworkStream(client))
            {
                Header header = new Header();
                header.FileSize = fileSize;
                header.FileNumber = 1;
                // Send the header.
                header.Serialize(ns);
                using (FileStream fs = File.OpenRead(filename))
                {
                    byte[] buffer = new byte[1024];
                    int read = fs.Read(buffer, 0, buffer.Length);
                    while (read > 0)
                    {
                        ns.Write(buffer, 0, read);
                        read = fs.Read(buffer, 0, buffer.Length);
                    }
                }
            }
            timer.Stop();
            return timer.Elapsed;
        }

        // Socket.SendFile: the header goes out as the pre-buffer, followed by the file itself.
        TimeSpan SendFile2(Socket client, String filename, long fileSize)
        {
            Stopwatch timer = new Stopwatch();
            timer.Start();
            Header header = new Header();
            header.FileSize = fileSize;
            header.FileNumber = 1;
            byte[] headerBuffer = null;
            using (MemoryStream ms = new MemoryStream())
            {
                header.Serialize(ms);
                headerBuffer = ms.ToArray();
            }
            // Send the header and the file in a single call.
            client.SendFile(filename, headerBuffer, null, TransmitFileOptions.UseDefaultWorkerThread);
            timer.Stop();
            return timer.Elapsed;
        }

        void Server(TcpListener listener)
        {
            byte[] buffer = new byte[1024];
            try
            {
                using (TcpClient client = listener.AcceptTcpClient())
                using (NetworkStream ns = client.GetStream())
                {
                    while (true)
                    {
                        // First get the header. The header has the file size.
                        Header header = Header.Deserialize(ns);
                        if (header == null)
                            break; // the client closed the connection
                        long remaining = header.FileSize;
                        while (remaining > 0)
                        {
                            int readSize = (int)Math.Min(buffer.Length, remaining);
                            int read = ns.Read(buffer, 0, readSize);
                            if (read == 0)
                                return; // connection closed in the middle of a file
                            remaining -= read;
                        }
                    }
                }
            }
            finally
            {
                listener.Stop();
            }
        }
    }
}

A couple of things to note about this implementation:

1) It uses message framing to delimit file transfers, since the same socket carries multiple files. Each file is preceded by a header that carries its size. I used the techniques from "Serializing data from .NET to Java" to do this; even though no Java app is involved here, the techniques are the same.

2) The server just drains the incoming stream. It does not save the incoming data to a file. Since we are just interested in benchmarking the performance between the two Send implementations, we should be ok here.

3) The program, which is basically a perf harness, uses a Strategy pattern to change the SendFile method used. That way everything else remains the same, and it just changes the SendFile method to get performance numbers.
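The message framing from point 1 can be sketched in isolation as follows. This is a simplified illustration rather than the exact code of the test program: the names FramingSketch, ReadExactly, WriteFrame and ReadFrame are made up, the payload is held in memory, and both ends are assumed to share the same byte order, because BitConverter uses the byte order of the machine it runs on.

```csharp
using System;
using System.IO;

static class FramingSketch
{
    // Reads exactly count bytes, looping because Stream.Read may return
    // fewer bytes than requested (which is common on network streams).
    public static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Stream ended before the frame was complete.");
            offset += read;
        }
        return buffer;
    }

    // Frame layout: 4-byte file number, 8-byte payload size, then the payload.
    public static void WriteFrame(Stream stream, int fileNumber, byte[] payload)
    {
        stream.Write(BitConverter.GetBytes(fileNumber), 0, sizeof(int));
        stream.Write(BitConverter.GetBytes((long)payload.Length), 0, sizeof(long));
        stream.Write(payload, 0, payload.Length);
    }

    public static byte[] ReadFrame(Stream stream, out int fileNumber)
    {
        fileNumber = BitConverter.ToInt32(ReadExactly(stream, sizeof(int)), 0);
        long size = BitConverter.ToInt64(ReadExactly(stream, sizeof(long)), 0);
        // The sketch keeps payloads in memory, so it assumes they fit in an int.
        return ReadExactly(stream, (int)size);
    }

    static void Main()
    {
        // Two frames written back to back on one stream, then read back in order.
        MemoryStream ms = new MemoryStream();
        WriteFrame(ms, 1, new byte[] { 10, 20, 30 });
        WriteFrame(ms, 2, new byte[] { 40 });
        ms.Position = 0;

        int number;
        byte[] first = ReadFrame(ms, out number);
        Console.WriteLine("file {0}: {1} bytes", number, first.Length);
        byte[] second = ReadFrame(ms, out number);
        Console.WriteLine("file {0}: {1} bytes", number, second.Length);
    }
}
```

Because every frame states its own length, the reader knows where one file ends and the next begins without closing the connection between transfers.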


Perf Comparison

The following graph shows the performance with the simple Read/Write pattern for file transfer.

[Figure: performance of file transfer with Socket.BeginSend]

The following chart shows the performance when Socket.SendFile is used.


[Figure: performance of file transfer with Socket.SendFile]

As you can see, there is a huge difference between the two, especially for the 1 MB file size. With Socket.SendFile, the upload takes at most 129 ms, whereas with the Read/Write pattern it takes about 1000 ms. For smaller file sizes, there is not much of a difference.

There is a large variance in the timings for SendFile() at the 1 MB file size, and I haven't been able to figure out the reason for it yet. Even so, it should not change the conclusion that Socket.SendFile() is faster.
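One way to look at the variance is to summarize the ten timings per file size as minimum, median and maximum, instead of reading the raw numbers. The helper below is a sketch of that idea; TimingSummary and Summarize are hypothetical names, and the sample values in Main are made up for illustration, not measurements from this post.

```csharp
using System;
using System.Collections.Generic;

static class TimingSummary
{
    // Returns { min, median, max } of the samples (in milliseconds).
    public static double[] Summarize(IList<double> samples)
    {
        if (samples.Count == 0)
            throw new ArgumentException("At least one sample is required.");

        // Sort a copy so the caller's list is left untouched.
        List<double> sorted = new List<double>(samples);
        sorted.Sort();

        int mid = sorted.Count / 2;
        double median = sorted.Count % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;

        return new double[] { sorted[0], median, sorted[sorted.Count - 1] };
    }

    static void Main()
    {
        // Made-up timings with a single outlier.
        double[] samples = { 10.0, 12.0, 11.0, 90.0, 13.0 };
        double[] s = Summarize(samples);
        Console.WriteLine("min={0} median={1} max={2}", s[0], s[1], s[2]);
    }
}
```

The median is much less sensitive to a single slow run than the maximum, so comparing medians across the two send methods gives a steadier picture when a few runs are outliers.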