To offer HTTP file downloads from your own code [rather than redirecting elsewhere], it's often easy to rustle something up. However, the default behaviour in a lot of cases doesn't give users the best possible experience. With a bit of effort you can polish that right up, and here are 4 sets of HTTP headers that help you do just that.
content-length

If you are able to, set the content-length header with the total size of the file.
This simple header will cause browsers to show users [estimates of] the time-to-completion of downloads. [It's also required for browsers to attempt range requests, as in the following section.]
Note that if streaming a file to the client, e.g. using Django's StreamingHttpResponse, then at the point the HTTP headers are generated, the full bytes of the file are not available, so this can't happen automatically. You have to explicitly determine the length of the file and set the header. For example, if the file is stored in S3, S3 returns a content-length header with all responses, and you can take the value of this header and return it to the client.
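A minimal sketch of the pattern, using a local file for illustration (the function and parameter names here are made up; when proxying from S3, the length would come from S3's content-length header rather than the filesystem):

```python
import os

def stream_with_length(path, chunk_size=64 * 1024):
    # Determine the size up front so content-length can be sent even
    # though the body is streamed in chunks. With S3, use the
    # content-length from S3's response instead of os.path.getsize.
    headers = {"content-length": str(os.path.getsize(path))}

    def chunks():
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                yield chunk

    return headers, chunks()
```

In Django, the headers and chunk iterator would then be attached to a StreamingHttpResponse.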
accept-ranges / range / content-range [/ content-length]
By default, if the connection is interrupted, browsers will have to restart the download from the beginning. If you support HTTP range requests, which use the accept-ranges, range, and content-range headers [and the content-length header], then browsers can resume downloads from where they left off.
See the MDN docs on HTTP range requests for more information. Note that S3 supports range requests, so if your code is essentially a proxy to S3, you can proxy the headers to and from S3 to support this fairly easily.
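To give a flavour of what's involved server-side, here is a minimal sketch that handles a single-range header like `bytes=100-199` (the function name is made up; real servers must also handle multiple ranges, suffix ranges like `bytes=-500`, and unsatisfiable ranges with a 416 response):

```python
import re

def range_response(range_header, total):
    # Returns (status, headers, start, end) for a single-range request,
    # or a plain 200 response covering the whole file if no valid
    # range header was sent.
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header or "")
    if not match:
        return 200, {"accept-ranges": "bytes",
                     "content-length": str(total)}, 0, total - 1
    start = int(match.group(1))
    # An open-ended range like "bytes=100-" means "to the end".
    end = int(match.group(2)) if match.group(2) else total - 1
    headers = {
        "accept-ranges": "bytes",
        "content-range": f"bytes {start}-{end}/{total}",
        "content-length": str(end - start + 1),
    }
    return 206, headers, start, end
```

If proxying to S3, much of this disappears: forward the client's range header to S3, and forward S3's content-range and content-length back.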
content-disposition

By default, browsers will guess at a suitable filename for the downloaded file, typically using the last path-component from the URL. Instead, consider what would be a more helpful filename, and set something like content-disposition: attachment; filename="very-helpful-filename.csv".
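A sketch of building the header value (the helper name is made up). The plain filename parameter is ASCII-only, so a filename* parameter in the RFC 5987 form is included for anything outside ASCII:

```python
from urllib.parse import quote

def content_disposition(filename):
    # ASCII fallback for older clients; non-ASCII characters are
    # replaced, and the full name is carried in filename*.
    fallback = filename.encode("ascii", "replace").decode()
    return (
        f'attachment; filename="{fallback}"; '
        f"filename*=UTF-8''{quote(filename)}"
    )
```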
accept-encoding / content-encoding
Some files compress well, for example typical CSV files. Serving compressed versions of these would often make downloads much faster. However, if the most likely thing users will do is immediately and manually uncompress the file, you've just made their life a tiny bit harder.
However, the browser can do this for them, so the user need never even notice. For example, if the browser sends an accept-encoding header specifying it accepts gzip [which most modern browsers do], and the server returns a content-encoding: gzip header with gzipped data, the browser will automatically decompress this data on download. The user will notice nothing other than faster downloads.
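The negotiation can be sketched as below (the function name is made up, and a real implementation should parse q-values in accept-encoding rather than substring-match):

```python
import gzip

def negotiate_gzip(accept_encoding, body):
    # Compress only if the client said it accepts gzip; either way,
    # content-length reflects the bytes actually sent on the wire.
    if "gzip" in (accept_encoding or ""):
        compressed = gzip.compress(body)
        return {"content-encoding": "gzip",
                "content-length": str(len(compressed))}, compressed
    return {"content-length": str(len(body))}, body
```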
Unfortunately, S3 doesn't support this sort of content negotiation. If you're storing your data on S3, and want to support both gzipped and non-gzipped versions of an object, you'll have to store them under separate keys. To avoid this, you might be tempted to compress on-the-fly. However, you then won't be able to send a content-length header, or handle range requests.
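The separate-keys approach amounts to precomputing both variants at upload time, so each can be served with an accurate content-length and full range-request support. A sketch (the key names are illustrative; the resulting bytes would each be uploaded to S3 under their own key, with the gzipped one given content-encoding: gzip):

```python
import gzip

def make_variants(body):
    # Precompute both variants up front, keyed as they might be
    # stored in S3, rather than compressing on-the-fly per request.
    return {
        "reports/latest.csv": body,
        "reports/latest.csv.gz": gzip.compress(body),
    }
```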