Monday, July 8, 2013

Uploading Large files with HTML 5

The task was to fix a silverlight based web uploader.  The application was needed because so many companies are blocking port 21 for FTP, so we were looking for a web based solution to allow similar functionality through the firewall.

Since silverlight was end of life I started by upgrading the system to a free flash based plug in.  It worked beautifully in IE, but FF and Chrome both had issues with files larger than 500MB.  Most of my research on the Internet claimed this was a limitation/bug in flash; I thought it was odd that it worked correctly in IE though.  I later discovered it had nothing to do with the browser type but rather random chance in the available memory space that .NET had to allocate which just happened to occur at the same time I was testing the IE browser.

When I was unable to quickly find a solution to the flash issue, I turned to HTML 5.  Being new technology there were very few things built for it.  I found two HTML 5 uploaders, but neither of them seemed to work for me, probably because I did not fully understand the technology at the time.  So, I decided to roll my own solution using HTML 5, javascript, and chunking.

I got the entire solution working correctly, including having the file chunks uploaded into SQL Server and using varbinary(max) to combine them in the database.  As I started testing larger files I noticed that the code combining the chunks ran exponentially slower the more chunks there were.  I was expecting it to slow down at a linear rate, but with a 1GB file it took a couple of hours to combine 1024 1MB chunks; which was unacceptable to me.

Next I tried combining the chunks on the web server using an external command prompt "type" process.  It worked perfectly for small files.  However, the number of chunks ended up surpassing the parameter length that is allowed to be passed to a command line program, which brought that solution to a halt.

Then I tried combining the files in C#.NET, which ended up being hands down the fastest method yet.  However I then ran into the issue of the file size overloading the byte array for uploading into SQL Server.  This last discovery happened to be one of the most important ones I had made yet.  As it turns out the byte array erroring out was the root of my flash issue as well.  The error #2038 that flash had been returning to me was a generic error code that simply meant something had gone wrong on the server, but flash had no idea what the problem was.  For most people across the web this error seemed to stem from a permissions issue, but for me the byte array problem was the source of my issue.  Manually stepping through the .NET code when flash sent the request to it seems to be the only reliable method of tracking down errors like this.

So I used the OPENROWSET with BULK option in SQL Server.  This was working perfectly manually, but when I tried to automate it I ran into the issue that my sql service used a local account that did not exist on the web server and so could not authenticate to grab the file.  EXECUTE AS LOGIN did not work either to impersonate an Active Directory account because the local sql service did not have permissions to query the AD server.

Having .NET copy the newly combined file over to the database server was the key to solving the final problem.  After the database upload was completed I then had .NET clean up all the chunks and the merged file giving me a finished product that just needed some GUI polishing.

In Summary:

  • Using HTML5 I uploaded all the file chunks to the server.
  • In .NET I sorted and combined the newly uploaded chunks.
  • I moved the combined file to a location my SQL server could access.
  • Using OPENROWSET with the BULK option I copied the file into the database.