S3 is not a filesystem
and that's OK
Something that occasionally catches developers out is the fact that S3 is not a filesystem, but a key-value store. Specifically:
- the keys can be any UTF-8 encoded string, between 1 and 1024 bytes long;
- the values can be any binary string, between 0 bytes and 5 terabytes long.
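To make the key rule concrete, here is a minimal sketch of a length check. The function name `is_valid_s3_key` is my own invention for illustration, and it only covers the byte-length rule stated above, not any other naming guidance.

```python
def is_valid_s3_key(key: str) -> bool:
    """Check the length rule: 1 to 1024 bytes once encoded as UTF-8."""
    try:
        encoded = key.encode("utf-8")
    except UnicodeEncodeError:
        # Strings containing lone surrogates cannot be encoded as UTF-8.
        return False
    # The limit is in bytes, not characters, so multi-byte characters count more.
    return 1 <= len(encoded) <= 1024


print(is_valid_s3_key("some-unique-name.txt/analysis1"))
print(is_valid_s3_key(""))
print(is_valid_s3_key("é" * 513))  # 1026 bytes, although only 513 characters
```

Note that the limit applies to the encoded bytes, so a key made of non-ASCII characters hits it sooner than its character count suggests.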
Yes, you can emulate certain features of a filesystem using slashes in the keys. In fact, the AWS console does this: it creates a folder by creating a 0-byte object with a trailing slash in its key.
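The folder emulation is really just grouping keys by a delimiter. Below is a rough in-memory sketch of that idea, loosely modelled on how listing with a prefix and a delimiter behaves. The function `list_with_delimiter` and the example keys are hypothetical, and this is not how S3 is implemented.

```python
def list_with_delimiter(keys, prefix="", delimiter="/"):
    """Split keys under `prefix` into direct "files" and "subfolders"."""
    contents = []
    common_prefixes = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Anything past the next delimiter is rolled up into one "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            contents.append(key)
    return contents, sorted(common_prefixes)


keys = [
    "photos/",  # a 0-byte "folder" marker, like the one the console creates
    "photos/cat.jpg",
    "photos/2024/dog.jpg",
    "readme.txt",
]

print(list_with_delimiter(keys))
print(list_with_delimiter(keys, prefix="photos/"))
```

In this sketch the "folder" only exists as a view over the keys: the marker object `photos/` is an ordinary key that happens to end in a slash.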
But you don't have to be limited by the key structure that a filesystem imposes. For example, if you want to:
- you can have objects, with data, at both of the keys "a" and "a/b";
- and you can have objects, with data, that have keys ending in "/".
Both of these are impossible in a traditional filesystem.
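Here is a small sketch of the difference, using a plain dict as a stand-in for a bucket and a temporary directory as the filesystem. The helper `try_write_to_filesystem` and the key "c/" are made up for the example, and the exact error raised depends on the operating system, so the code only checks for `OSError`.

```python
import os
import tempfile

# A plain dict stands in for a bucket: keys are just strings.
bucket = {
    "a": b"data at a",
    "a/b": b"data at a/b",
    "c/": b"data at a key ending in a slash",
}


def try_write_to_filesystem(root, key, data):
    """Try to write data at root/key; return True on success, False on OSError."""
    path = os.path.join(root, key)
    try:
        # Create parent directories, as a sync tool would have to.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
        return True
    except OSError:
        return False


with tempfile.TemporaryDirectory() as root:
    results = {
        key: try_write_to_filesystem(root, key, data)
        for key, data in bucket.items()
    }

print(results)
```

On a POSIX system I would expect only the write for "a" to succeed: "a/b" needs "a" to be a directory, and "c/" names a directory rather than a file. The dict, like S3, accepts all three keys without complaint.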
An example of using this is to store an uploaded file, some-unique-name.txt say, and then store derived data inside its "folder", for example some-unique-name.txt/analysis1 and some-unique-name.txt/analysis2.
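As a sketch of that layout, again with a dict standing in for the bucket: the helpers `derived_key` and `keys_under` are hypothetical names, and the stored values are placeholders.

```python
def derived_key(upload_key, name):
    """Build the key for a piece of derived data "inside" an upload."""
    return f"{upload_key}/{name}"


def keys_under(bucket, upload_key):
    """Return every key stored in the upload's "folder", in sorted order."""
    prefix = upload_key + "/"
    return sorted(key for key in bucket if key.startswith(prefix))


bucket = {}
upload = "some-unique-name.txt"

# The upload itself and its derived data live side by side.
bucket[upload] = b"original upload"
bucket[derived_key(upload, "analysis1")] = b"first analysis"
bucket[derived_key(upload, "analysis2")] = b"second analysis"

print(keys_under(bucket, upload))
```

Everything related to one upload shares a prefix, so finding the derived data for a file is a single prefix lookup.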
Now a warning
If you're about to take advantage of the fact that S3 is a key-value store by using keys that aren't filesystem-compatible, be warned: there are consequences.
- You won't be able to easily sync from your S3 bucket to a filesystem.
- Not everything that claims to be S3-compatible actually is, so migrating to those systems, or using them in a test environment, may be trickier. Specifically, minio (which I have used many times and generally really like) is not S3-compatible in this way.
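If you want to know whether a set of keys would run into these problems, a check along the following lines might help. This is a rough sketch: `filesystem_conflicts` is a name I made up, it conservatively flags every key ending in a slash (including 0-byte folder markers), and it ignores other filesystem restrictions such as forbidden characters or path length.

```python
def filesystem_conflicts(keys):
    """Return the keys that could not be written as plain files, sorted."""
    keys = set(keys)
    conflicts = set()
    for key in keys:
        if key.endswith("/"):
            # A path ending in a slash names a directory, which cannot hold data.
            conflicts.add(key)
        parts = key.split("/")
        for i in range(1, len(parts)):
            ancestor = "/".join(parts[:i])
            if ancestor in keys:
                # The ancestor would need to be both a file and a directory.
                conflicts.add(ancestor)
    return sorted(conflicts)


print(filesystem_conflicts(["a", "a/b", "c/", "d/e"]))
```

An empty result suggests the keys map onto a directory tree, at least as far as these two rules are concerned.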
So there are some valid, pragmatic reasons to make sure your keys are filesystem-compatible. But I can't help the niggling feeling that it's a shame to be limited by systems you're not using.