Amazon’s Simple Storage Service (S3) provides a very useful interface for storing objects in redundant cloud storage, where you don’t have to worry about the underlying hardware. Beyond being a service offered by Amazon, its API has become an industry standard, and there are many services compatible with it.
What Is S3 Compatible?
In many cases, if you move to another cloud provider, you will have to rework a lot of your application. But, if you’re using S3 as your object storage backend, you’ll be able to move seamlessly to many other services.
This is because S3 is an open API standard. AWS’s Simple Storage Service is just an implementation of this standard; it’s native, and obviously will have the best support, but there are other services that will offer acceptable performance and stability, often for lower cost.
Switching to these services is easy: you change the URL endpoint your application uses, and after some minor tweaks to key handling it’s usually good to go. You will also have to migrate your existing data, which a tool like rclone handles well; it’s not a hard process, just a long one in some cases.
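As a sketch of what both steps look like with rclone, you can define the old and new storage as two S3-type remotes and sync between them. The remote names, region, endpoint, and bucket names below are placeholders; check your provider’s docs for the real endpoint:

```shell
# rclone.conf — both remotes speak the S3 API; only provider/endpoint differ
# [aws]
# type = s3
# provider = AWS
# region = us-east-1
#
# [spaces]
# type = s3
# provider = DigitalOcean
# endpoint = nyc3.digitaloceanspaces.com

# Copy everything across (credentials supplied via config or environment):
rclone sync aws:my-bucket spaces:my-bucket --progress
```

After the sync finishes, pointing your application at the new endpoint is usually all that’s left.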
It’s no secret AWS is expensive, and S3 is no different: storing files is very cheap, but actually accessing those files is not. In a typical read/write-heavy workload serving live files to users, the highest costs are usually not storage but AWS data transfer charges and S3 request charges:
Seeing a Cost Explorer breakdown like this, you may be tempted to consider a third-party service that charges less for data transfer for your workload.
The two major competitors to AWS S3 are from Google and Microsoft. Google has their uncreatively named “Cloud Storage,” and Microsoft Azure has Azure Blob Storage. Google’s storage is S3 compatible, and is relatively easy to migrate to. Azure, on the other hand, is not S3 compatible, though there are tools like S3Proxy that can patch them together.
However, all of the storage services from the big three cloud providers will charge you high fees for data transfer. They’re designed for enterprise customers, and if you’re a small business trying to minimize your costs, you should look elsewhere. Alternative cloud providers like Digital Ocean and Vultr offer more streamlined pricing models with similar quality of service.
Digital Ocean is a cloud provider designed to be simple. While it doesn’t offer as many features as major providers like AWS, it usually does right by the services it does offer. One of these services is object storage, with buckets being called Spaces, and it’s what we will recommend if you’re looking to move away from AWS.
Spaces are pretty simple. The base rate is $5 a month, and includes 250 GB of storage along with a whole TB of outbound data transfer. This is an insanely good deal—the same usage would cost over $90 on AWS S3.
Additional data storage is $0.02 per GB, fairly standard compared to S3 (although higher if you plan to use cheaper archive storage), and additional data transfer is priced reasonably at $0.01 per GB transferred, roughly 90% cheaper than AWS pricing.
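To sanity-check those comparisons, here’s a quick back-of-the-envelope calculation. The AWS rates used are approximate published standard-tier list prices (~$0.023/GB-month storage, ~$0.09/GB egress), which vary by region and tier, so treat the exact figures as assumptions:

```python
# Rough comparison of DO Spaces' $5 bundle (250 GB storage + 1 TB egress)
# against AWS S3 list prices. Prices are approximate standard-tier rates.
AWS_STORAGE_PER_GB = 0.023   # USD per GB-month (assumed)
AWS_EGRESS_PER_GB = 0.09     # USD per GB transferred out (assumed)

def aws_monthly_cost(storage_gb: float, egress_gb: float) -> float:
    """Rough S3 bill: storage plus data transfer out (ignores free tiers)."""
    return storage_gb * AWS_STORAGE_PER_GB + egress_gb * AWS_EGRESS_PER_GB

bundle = aws_monthly_cost(storage_gb=250, egress_gb=1000)
print(f"AWS cost for Spaces' $5 bundle: ~${bundle:.2f}")  # well over $90

# Marginal egress: $0.01/GB on Spaces vs ~$0.09/GB on AWS
savings = 1 - 0.01 / AWS_EGRESS_PER_GB
print(f"Egress savings vs AWS: ~{savings:.0%}")
```

The storage line items are close to a wash; nearly all of the difference is egress.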
Of course, this comes with a few limits, and unfortunately there are a lot of downsides and strings attached to this great deal. The first is rate limits:
- 750 requests per second, per IP address, across all of your Spaces.
- 150 combined operations per second to any individual Space, not including GET requests.
- 240 total operations per second to any individual Space, including GET requests.
- 5 PUT or COPY requests per 5 minutes to any individual object in a Space.
While it isn’t great to have rate limits at all, these are fairly generous, and you’re likely not going to hit them. If you are close to going over, you can minimize their effect by spreading load across multiple Spaces. If you’re unsure, you can enable bucket metrics in S3 to check your current usage.
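If you’d rather do that from the CLI than the console, request metrics can be enabled with a one-liner; the bucket name and configuration ID here are placeholders, and metrics land in CloudWatch after a delay:

```shell
# Enable S3 request metrics for the whole bucket (ID is arbitrary)
aws s3api put-bucket-metrics-configuration \
  --bucket my-bucket \
  --id EntireBucket \
  --metrics-configuration '{"Id": "EntireBucket"}'
```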
Also, Spaces with over 3 million objects, or 1.5 million with versioning enabled, may require “intermittent maintenance periods” to ensure consistent performance. However, I personally have a bucket with over 2 million versioned objects that does not appear to have experienced any significant downtime over 6 months, so this may not be a common occurrence.
One major drawback of Spaces compared to S3 is the interface. Spaces is simple, and if you’re looking to just upload your website content or store some basic files, the web interface will allow uploads, downloads, and editing of most settings. However, if you’re storing lots of files or need advanced configuration, it’s quite frankly pretty bad, and you’ll mainly have to work with it over the S3 API.
For example, Spaces doesn’t even have a web editor for your lifecycle configuration, which controls how long old versions of objects (kept as backups in case of accidental deletion) are retained. That also means there’s no way of accessing or deleting versioned objects without listing the versions through the API and addressing them directly by version ID.
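For reference, that workflow goes through the standard S3 versions API. A sketch of the requests involved, with a made-up object key and version ID and the response trimmed down:

```
GET ?versions

<ListVersionsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Version>
    <Key>backups/db.dump</Key>
    <VersionId>Zb12example</VersionId>
    <IsLatest>false</IsLatest>
  </Version>
  ...
</ListVersionsResult>

DELETE /backups/db.dump?versionId=Zb12example
```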
They also don’t have much documentation. To turn on versioning, for example, we had to consult S3’s own documentation for the PutBucketVersioning endpoint, which thankfully is supported on Spaces despite going unmentioned in DO’s docs. You’ll need to enable it through this endpoint:
PUT ?versioning

<VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Status>Enabled</Status>
</VersioningConfiguration>
And then enable version expiration:
PUT ?lifecycle

<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Rule>
    <ID>Bucket</ID>
    <Prefix>*</Prefix>
    <Status>Enabled</Status>
    <NoncurrentVersionExpiration>
      <NoncurrentDays>90</NoncurrentDays>
    </NoncurrentVersionExpiration>
  </Rule>
</LifecycleConfiguration>
API keys are also very basic. You will not have granular control over individual buckets, objects, or anything else that comes with AWS IAM. This can be an issue if you plan to give keys to third parties.
Overall, the Digital Ocean experience is definitely nowhere near as polished as AWS’s S3. But if you’re fine with the limits, and don’t mind using the API for certain tasks, it can certainly save you a ton of money on bandwidth costs.
Since S3 is an open standard, it’s also something you can host yourself, which will be preferable for many people. There are lots of tools to do this, but one of the best is MinIO, which runs on Kubernetes (K8s).
Being on K8s means you can run it on any public cloud, including through managed K8s services like AWS EKS. But you’d still be subject to bandwidth costs in that case.
Where MinIO really shines is with dedicated servers, hybrid cloud solutions, and on-premises datacenters. If you’re paying for a dedicated network connection to a server, you won’t be nickel-and-dimed if you saturate that connection. This can make self-hosted storage very cheap if you’re planning on serving a lot of data to end users.
Also, running on your own hardware isn’t subject to the same limits as services like S3. You can host MinIO on blazing fast servers and get better performance in read/write heavy workloads (and you won’t be charged for requests). Of course, you will be required to pay the hardware costs for this performance.
Where it falls flat is redundancy: because S3 stores your data in so many different places, it’s basically guaranteed to always work and never lose your data, barring a giant meteor. MinIO, on the other hand, can be hosted on a single server or as a distributed deployment. If you’re hosting on a single server, you will be screwed if your instance goes down. It’s the cheapest option, but we highly recommend running multiple servers in a cluster, or at least doing some kind of backup to S3.
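If you do go the cluster route, MinIO’s distributed mode stripes erasure-coded data across nodes so a failed drive or server doesn’t take the deployment down. A sketch, assuming four hypothetical hosts with two drives each (run the same command on every node):

```shell
# Distributed MinIO across node1–node4, two drives per node
minio server http://node{1...4}.example.com/mnt/data{1...2}
```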
MinIO is free to host under the GNU AGPL license, but you won’t get any support. Corporate licenses start at $1000/month and provide 24/7 support as well as a “Panic Button,” which will have their data recovery engineers ready to help you fix serious server failures.