The core S3 API is beautiful

Just PUT and GET: it couldn't be simpler

Tuesday December 31st, 2019

One of the beautiful, maybe even genius, things about the core S3 API is that storing an object is just an HTTP PUT, and fetching it is just an HTTP GET. You need a few headers, but that's it.

Structure of an S3 PUT request

PUT /key/of/the/object HTTP/1.1\r\n
host: my-example-bucket.s3-eu-west-1.amazonaws.com\r\n
authorization: ...\r\n
content-length: ...\r\n
x-amz-content-sha256: ...\r\n
x-amz-date: ...\r\n
\r\n
The bytes of the object

Structure of an S3 GET request

GET /key/of/the/object HTTP/1.1\r\n
host: my-example-bucket.s3-eu-west-1.amazonaws.com\r\n
authorization: ...\r\n
x-amz-content-sha256: ...\r\n
x-amz-date: ...\r\n
\r\n

This means that as long as you can inject the right headers, you can use any HTTP client to make the requests. Here's an example of a GET using the Python requests library.

import os
import requests

# from https://gist.github.com/michalc/ccb87856363a895fd1fadf52ab4cdcec
from aws_sig_v4_headers import aws_sig_v4_headers

host = 'my-example-bucket.s3-eu-west-1.amazonaws.com'
service = 's3'
region = 'eu-west-1'
path = '/key/of/the/object'
pre_auth_headers = {}
query = {}
data = b''
headers = aws_sig_v4_headers(
    os.environ['AWS_ACCESS_KEY_ID'], os.environ['AWS_SECRET_ACCESS_KEY'], pre_auth_headers,
    service, region, host, 'GET', path, query, data,
)
response = requests.get(f'https://{host}{path}', headers=headers)
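The helper imported above hides how the authorization, x-amz-content-sha256 and x-amz-date headers are produced. For a feel of what is involved, here is a minimal sketch of AWS Signature Version 4 for S3 using only the standard library. It is not the code from the gist: the names sketch_sig_v4_headers and put_object are hypothetical, and it assumes no query string, no extra signed headers, and a key that needs no special escaping. The PUT at the end uses http.client rather than requests, to show that the same headers work with a different client.

```python
import datetime
import hashlib
import hmac
import http.client
import urllib.parse


def _hmac_sha256(key, msg):
    return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()


def sketch_sig_v4_headers(access_key_id, secret_access_key, region, host,
                          method, path, body=b'', now=None):
    """Hypothetical helper: SigV4 headers for an S3 request without a query string."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime('%Y%m%dT%H%M%SZ')
    date_stamp = amz_date[:8]
    payload_hash = hashlib.sha256(body).hexdigest()

    # The headers that take part in the signature, in sorted lowercase order
    signed = {
        'host': host,
        'x-amz-content-sha256': payload_hash,
        'x-amz-date': amz_date,
    }
    signed_names = ';'.join(sorted(signed))
    canonical_headers = ''.join(f'{name}:{signed[name]}\n' for name in sorted(signed))

    canonical_request = '\n'.join([
        method,
        urllib.parse.quote(path, safe='/~'),  # slashes in the key are kept as-is
        '',                                   # empty canonical query string
        canonical_headers,                    # ends in \n, so the join leaves a blank line
        signed_names,
        payload_hash,
    ])

    scope = f'{date_stamp}/{region}/s3/aws4_request'
    string_to_sign = '\n'.join([
        'AWS4-HMAC-SHA256',
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode('utf-8')).hexdigest(),
    ])

    # Derive the signing key by chaining HMACs over date, region and service
    key = _hmac_sha256(('AWS4' + secret_access_key).encode('utf-8'), date_stamp)
    key = _hmac_sha256(key, region)
    key = _hmac_sha256(key, 's3')
    key = _hmac_sha256(key, 'aws4_request')
    signature = hmac.new(key, string_to_sign.encode('utf-8'), hashlib.sha256).hexdigest()

    return {
        'authorization': (
            f'AWS4-HMAC-SHA256 Credential={access_key_id}/{scope}, '
            f'SignedHeaders={signed_names}, Signature={signature}'
        ),
        'x-amz-content-sha256': payload_hash,
        'x-amz-date': amz_date,
    }


def put_object(host, region, path, body, access_key_id, secret_access_key):
    """Hypothetical helper: store bytes with a plain HTTP PUT, return the status code."""
    headers = sketch_sig_v4_headers(
        access_key_id, secret_access_key, region, host, 'PUT', path, body,
    )
    # http.client adds the host and content-length headers itself
    conn = http.client.HTTPSConnection(host)
    try:
        conn.request('PUT', path, body=body, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()
```

Nothing here is specific to one HTTP client: the function only returns a dictionary of headers, so the same dictionary could be passed to requests.put along with the object's bytes as the request body.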

If the core weren't so straightforward, I suspect S3 would not be anywhere near as popular, and S3-compatible storage providers might not even exist.

I don't often say this, but thank you, AWS.