Block vs file vs object storage

11 September, 2020

This was a really confusing topic at first, but after much Googling, Gollwitzer made it much clearer.

General computing

We have to start with general computing, because terms like block storage can mean something different in the cloud.

DAS (Direct Attached Storage). The type of storage that everyone's most familiar with. It is connected directly to a single computer, and can be plugged in and removed as storage separate from the computer.

NAS (Network Attached Storage). If many users in the same location (like a small office) need to work on the same set of files, a NAS is usually the best fit. It's connected via the LAN.

A NAS typically houses multiple drives, which can be configured as RAID 10 for redundancy and high availability.

SAN (Storage Area Network). In a data center, servers and the devices that provide massive storage space are connected over a dedicated high-speed network, often using fibre optic cables. That dedicated network is what the name SAN refers to. It serves data to users at high speed.

File storage. Documents and files are stored in hierarchical folders, which we're also familiar with from our local devices. The computer needs the path to find a file.

Block storage. This is the granular view of all of the above. Data on a hard drive is stored in fixed-size blocks, traditionally 512 bytes each (many newer drives use 4096 bytes). A 128GB hard drive therefore consists of roughly 250 million such blocks. Taking the example from above, file storage is essentially an organization of block data.
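The block arithmetic can be sketched in a few lines. This is only an illustration, assuming 512-byte blocks and a decimal 128GB (128 × 10^9 bytes). The function names are made up for this post, and a real file system adds its own allocation units on top.

```python
BLOCK_SIZE = 512  # bytes per block, as in the text (assumption: many drives use 4096)


def block_count(capacity_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of whole blocks that fit in a drive of the given capacity."""
    return capacity_bytes // block_size


def blocks_for_file(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Blocks needed to hold a file; the last block may be only partly used."""
    return -(-file_size // block_size)  # ceiling division


def block_of_offset(byte_offset: int, block_size: int = BLOCK_SIZE) -> tuple:
    """Map a byte offset to (block number, offset inside that block)."""
    return divmod(byte_offset, block_size)


drive_blocks = block_count(128 * 10**9)  # 250,000,000 blocks
file_blocks = blocks_for_file(1300)      # 3 blocks (512 + 512 + 276 bytes)
location = block_of_offset(1300)         # (2, 276)
```

A file system's job is essentially to remember which block numbers belong to which file path, which is why file storage can be seen as a layer over block storage.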

Object storage. A newer way of storing data than the rest. Data is stored in blobs, each accompanied by a unique ID and metadata.
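As a rough mental model, an object store behaves like a big dictionary from ID to blob plus metadata. The toy class below is a sketch of that idea only. `TinyObjectStore` and its method names are invented here and do not correspond to any real product's API.

```python
import uuid


class TinyObjectStore:
    """Toy in-memory object store: ID -> blob + metadata. Illustration only."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        object_id = str(uuid.uuid4())  # unique ID generated by the store
        self._objects[object_id] = {"data": data, "metadata": metadata}
        return object_id

    def get(self, object_id: str) -> bytes:
        """Return the whole blob for the given ID."""
        return self._objects[object_id]["data"]

    def head(self, object_id: str) -> dict:
        """Return only the metadata, without the blob itself."""
        return dict(self._objects[object_id]["metadata"])


store = TinyObjectStore()
# Hypothetical object: a few placeholder bytes with two metadata fields.
photo_id = store.put(b"\x89PNG...", content_type="image/png", owner="alice")
```

Note that there is no path and no block number here. The caller only ever deals with the ID and gets the whole blob back.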

Cloud

Now when it comes to cloud, AWS for example offers EBS (Elastic Block Store) and S3 (Simple Storage Service). EBS is the block storage, and S3 is the object storage. It's best to describe them in terms of use cases. See Cloud Academy and Red Hat for reference.

Block storage. Independent disk volumes that are physically separate from the server. It provides low latency and can handle heavy workloads, but is more expensive than object storage. It is deployed in a SAN pattern and attached to a running server. A volume can only be accessed once it is mounted by an OS. Suitable for:

  • Databases
  • App processing

In AWS, snapshots of EBS are stored in S3.

Object storage. A data store that scales far more massively (virtually unlimited) than block storage, at a cheaper rate. Data is referenced by unique identifiers. Objects are stored in a flat address space, unlike file storage's hierarchical system. Object storage does have "folders", but they only categorize content within that flat space rather than letting you drill down like real folders do. Accessed via REST API (HTTPS), SDK, AWS CLI, or AWS Management Console. The old SOAP API is still available, but new features are no longer supported on it.
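The "folders in a flat space" idea is easier to see with plain strings. The sketch below assumes a handful of made-up keys and shows how a folder view can be derived purely by prefix matching. It only simulates the concept in memory and calls no AWS API.

```python
# Hypothetical keys in one bucket; the store sees plain strings, not directories.
keys = [
    "photos/2020/beach.jpg",
    "photos/2020/hike.jpg",
    "photos/2019/city.jpg",
    "notes.txt",
]


def list_by_prefix(all_keys, prefix):
    """Return keys that start with the prefix, which is how a 'folder' is listed."""
    return sorted(k for k in all_keys if k.startswith(prefix))


def top_level_folders(all_keys, delimiter="/"):
    """Derive the 'folders' a console might show from the first key segment."""
    return sorted(
        {k.split(delimiter, 1)[0] + delimiter for k in all_keys if delimiter in k}
    )


photos_2020 = list_by_prefix(keys, "photos/2020/")
folders = top_level_folders(keys)
```

Here `photos_2020` holds the two 2020 keys and `folders` is `["photos/"]`. Nothing was nested anywhere: the slash is just another character in the key.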

Data types (almost anything):

  • Files and documents
  • Media files
  • Big data
  • Unstructured data
  • Backup and Archives

AWS S3 stores data as objects, within folders (optional), within buckets. Each object has an object key and a unique URL. Bucket names must also be unique across all other buckets globally, because buckets share a single flat namespace. With an object store, we do not "mount" or "open" it, install an OS on it, or run a database in it.
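To show how bucket, key, and URL fit together, here is a small sketch that builds a virtual-hosted-style URL of the form `https://<bucket>.s3.<region>.amazonaws.com/<key>`. The bucket name, key, and default region are placeholders chosen for illustration, and the function neither signs the request nor checks that the object exists or is publicly readable.

```python
from urllib.parse import quote


def object_url(bucket: str, key: str, region: str = "us-east-1") -> str:
    """Build a virtual-hosted-style URL for an object (no signing, no access check)."""
    # quote() percent-encodes characters such as spaces but keeps "/" as-is.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


# Hypothetical bucket and key.
url = object_url("example-bucket", "reports/2020 summary.pdf")
```

Because the bucket name becomes part of the hostname, it is easy to see why two buckets cannot share a name, even across different accounts.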

