image source: http://cdn.ttgtmedia.com/rms/onlineImages/sidebyside_comparison.png
Introduction to Object Storage
OpenStack Documentation defines Object Storage as a robust, highly scalable and fault tolerant storage platform for unstructured data such as objects. Objects are stored bits, accessed through a RESTful, HTTP-based interface. You cannot access data at the block or file level. Object Storage is commonly used to archive and back up data, with use cases in virtual machine image, photo, video and music storage.
If you want to know more about object storage, take a look at this TechTarget SearchCloudStorage article.
OpenStack Swift Architecture
Object storage fits right into the current huge demand for storing web-based content. Everything stored in Swift is an object, and each object can be referenced by a URI, which makes the stored content easy to retrieve.
By far the best article on OpenStack Swift is this one by SwiftStack which has tons of useful information about Swift.
SwiftStack offers a commercial version of Swift and is a major contributor to and promoter of the OpenStack Swift project. On Oct 30 SwiftStack received $16M in series B funding.
A simple way to look at Swift is as a two-tier architecture:
image source: https://swiftstack.com/images/posts/swift-global-replication/replica-overview.png
Swift is composed of two types of nodes:
- Proxy Nodes. These nodes interface with "Swift clients" and handle all requests and processing. Clients interact only with the Proxy nodes.
- Storage Nodes. These nodes host the storage for the objects.
OpenStack Swift Terminology
image source: https://developers.seagate.com/download/attachments/1769521/SwiftStack%20Current%20Architecture.jpg?version=1&modificationDate=1381516991000&api=v2
- Partitions - A complete and non-overlapping set of key ranges such that each object, container and account is a member of exactly one partition as per the value of its key
- Ring - Maps each partition to a set of physical devices
- Objects - Key-value entries in the object store
- Containers - Groups of objects
- Accounts - Groups of containers
- Object/Storage Server - stores, retrieves and deletes objects on local devices
- Container Server - stores the listing of objects in a SQLite database
- Account Server - similar to the container server, but it stores the listing of containers
- Proxy Server - scalable API request handler that uses the request URL to determine which storage nodes hold an object
- Replicator - utility process that handles data replication
- Updater - processes updates that could not be performed successfully, so that the data in the Swift cluster stays consistent
- Auditor - runs on each node to check the integrity of the object, container and account information
These terms fall into four groups:
- Data access: ring and partition
- Data representation: account, container and object
- Server types: proxy, object, container and account server
- Utility processes: replicator, updater and auditor
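To make the Container Server entry more concrete, here is a minimal sketch of an object listing kept in SQLite. The table layout and the function names (`open_container_db`, `put_object_row`, `list_objects`) are hypothetical and far simpler than what Swift actually stores; the only point is that a container is, at its core, a sorted listing of object names.

```python
import sqlite3


def open_container_db():
    """Create an in-memory database holding one container's object listing."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified schema: one row per object in the container.
    conn.execute(
        "CREATE TABLE object (name TEXT PRIMARY KEY, size INTEGER, etag TEXT)"
    )
    return conn


def put_object_row(conn, name, size, etag):
    """Record (or overwrite) the listing entry for one object."""
    conn.execute(
        "INSERT OR REPLACE INTO object (name, size, etag) VALUES (?, ?, ?)",
        (name, size, etag),
    )


def list_objects(conn, prefix="", limit=10000):
    """Return object names in sorted order, optionally filtered by prefix."""
    rows = conn.execute(
        "SELECT name FROM object WHERE substr(name, 1, ?) = ? "
        "ORDER BY name LIMIT ?",
        (len(prefix), prefix, limit),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = open_container_db()
    put_object_row(db, "b.jpg", 10, "etag-b")
    put_object_row(db, "a/1.jpg", 20, "etag-1")
    put_object_row(db, "a/2.jpg", 30, "etag-2")
    print(list_objects(db, prefix="a/"))
```

An Account Server could be sketched the same way, with one row per container instead of one row per object.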
image source: https://www.mirantis.com/blog/object-storage-openstack-cloud-swift-ceph/
The above diagram shows the relationship between the Proxy server and the Account, Object and Container servers via the Ring. For a more detailed description of the various servers in Swift, visit this blog post.
All objects in Swift can be accessed via the RESTful API. The SwiftStack blog describes the API format as follows:
- The account storage location is a uniquely named storage area that contains the metadata (descriptive information) about the account itself as well as the list of containers in the account.
- Note that in Swift, an account is not a user identity. When you hear account, think storage area.
- The container storage location is the user-defined storage area within an account where metadata about the container itself and the list of objects in the container will be stored.
- The object storage location is where the data object and its metadata will be stored.
image source: https://swiftstack.com/static/global/images/swift_architecture_aco.jpg
A Swift API call follows this format:
Each object is represented by a URI: https://abc.com/v1/account/container/object_name
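As a rough illustration of that format, the sketch below builds and splits such a URI with Python's standard library. The helper names `object_url` and `split_object_url` are made up for this example, and the endpoint is the placeholder from the line above, not a real service.

```python
from urllib.parse import quote, unquote, urlsplit


def object_url(endpoint, account, container, obj, version="v1"):
    """Build <endpoint>/<version>/<account>/<container>/<object>."""
    parts = [
        version,
        quote(account, safe=""),
        # Container names cannot contain "/", so it is escaped here.
        quote(container, safe=""),
        # Object names may contain "/" (pseudo-folders), so it is kept.
        quote(obj, safe="/"),
    ]
    return endpoint.rstrip("/") + "/" + "/".join(parts)


def split_object_url(url):
    """Split a URL of the above form into (version, account, container, object)."""
    path = urlsplit(url).path.lstrip("/")
    version, account, container, obj = path.split("/", 3)
    return version, unquote(account), unquote(container), unquote(obj)


if __name__ == "__main__":
    url = object_url("https://abc.com", "account", "container", "object_name")
    print(url)
    print(split_object_url(url))
```

Because everything after the container name belongs to the object name, an object called `photos/2014/cat.jpg` looks like a file in nested folders even though Swift stores it as a single flat key.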
Swift is deployed as a cluster. Each cluster is made up of nodes. A node is a Linux machine that runs the Swift processes. Each cluster can be logically divided into regions, and each region into zones.
image source: https://swiftstack.com/static/global/images/swift_architecture_regions.jpg
Usually a Region represents a geographic location. Zones, also called Availability Zones, are defined to isolate failures.
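One way to picture why zones matter is to sketch how replicas of the same data could be spread so that no single zone failure takes out every copy. The `Device` class and the `spread_replicas` function below are hypothetical and much cruder than Swift's real placement logic; they only illustrate the idea of preferring zones that have not been used yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    region: str
    zone: str
    node: str


def spread_replicas(devices, replicas=3):
    """Pick up to `replicas` devices, preferring zones not used so far."""
    chosen = []
    used_zones = set()
    # First pass: take at most one device per (region, zone) pair.
    for dev in devices:
        key = (dev.region, dev.zone)
        if key not in used_zones:
            chosen.append(dev)
            used_zones.add(key)
        if len(chosen) == replicas:
            return chosen
    # Second pass: fewer zones than replicas, so some zones are reused.
    for dev in devices:
        if dev not in chosen:
            chosen.append(dev)
        if len(chosen) == replicas:
            break
    return chosen


if __name__ == "__main__":
    cluster = [
        Device("region1", "zone1", "node1"),
        Device("region1", "zone1", "node2"),
        Device("region1", "zone2", "node3"),
        Device("region2", "zone1", "node4"),
    ]
    for dev in spread_replicas(cluster):
        print(dev)
```

With the four example devices, the three replicas end up in three different zones, so losing any one zone still leaves two copies.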
The heart of Swift is the Ring
A ring is a static data structure in which each object name is mapped to a partition using a "modified MD5" hashing algorithm. Each partition, in turn, maps to a list of physical devices.
image source: https://julien.danjou.info/media/images/blog/2012/riak-ring.png
Melissa Palmer (@vmiss33) has a nice article on the Swift ring.
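The two-step lookup described above (name to partition, partition to devices) can be sketched in a few lines. This is a simplified illustration, not Swift's ring code: the partition power, the salt in `HASH_SUFFIX`, the device names and the round-robin table are all assumptions chosen to keep the example small.

```python
import hashlib
import struct

PART_POWER = 4                  # assumed: 2**4 = 16 partitions
PART_SHIFT = 32 - PART_POWER    # keep only the top PART_POWER bits
HASH_SUFFIX = b"changeme"       # hypothetical per-cluster salt

# Hypothetical devices and a round-robin partition -> devices table
# holding three replicas per partition.
DEVICES = ["node1/sdb1", "node2/sdb1", "node3/sdb1", "node4/sdb1"]
PART_TO_DEVICES = [
    [DEVICES[(part + replica) % len(DEVICES)] for replica in range(3)]
    for part in range(2 ** PART_POWER)
]


def partition_for(account, container, obj):
    """Hash the object's path with MD5 and map the digest to a partition."""
    path = f"/{account}/{container}/{obj}".encode("utf-8")
    digest = hashlib.md5(path + HASH_SUFFIX).digest()
    # Read the first 4 bytes as a big-endian unsigned integer and shift
    # so that only the top PART_POWER bits remain.
    return struct.unpack_from(">I", digest)[0] >> PART_SHIFT


def devices_for(account, container, obj):
    """Return the partition number and the devices that hold it."""
    part = partition_for(account, container, obj)
    return part, PART_TO_DEVICES[part]


if __name__ == "__main__":
    print(devices_for("account", "container", "object_name"))
```

Because the table is static and the hash is deterministic, any proxy holding the same table computes the same devices for a given name without asking a central lookup service.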
Swift Storage Policy
New in the Juno release is the Storage Policy. This makes OpenStack more "enterprise" ready by allowing users and application developers to decide how they want to store, replicate and access data across different backends and geographical regions. The Juno release notes describe Swift storage policies as follows:
Storage policies give users more control over cost and performance in terms of how they want to replicate and access data across different backends and geographical regions.
The Ring is the key to making the Storage Policy work without much change to the existing API.
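A small sketch may help show why the API barely changes. A policy is chosen per container when the container is created, through the `X-Storage-Policy` request header, and each policy has its own object ring file. The policy names and indexes in `POLICIES` and both helper functions are hypothetical; only the header name and the `object-N.ring.gz` naming convention come from Swift itself.

```python
# Hypothetical policy names mapped to their policy indexes.
POLICIES = {"gold": 0, "silver": 1}


def object_ring_filename(policy_index):
    """Return the object ring file used by a policy index."""
    # Policy 0 keeps the original file name; other policies add their index.
    if policy_index == 0:
        return "object.ring.gz"
    return f"object-{policy_index}.ring.gz"


def container_create_headers(token, policy_name):
    """Headers for a container PUT that selects a storage policy."""
    if policy_name not in POLICIES:
        raise ValueError(f"unknown storage policy: {policy_name}")
    return {"X-Auth-Token": token, "X-Storage-Policy": policy_name}


if __name__ == "__main__":
    headers = container_create_headers("example-token", "silver")
    print(headers)
    print(object_ring_filename(POLICIES["silver"]))
```

Object requests stay the same as before: the proxy looks up the policy of the container and consults the matching ring.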
OpenStack Series Part 1: How do you look at OpenStack?
OpenStack Series Part 2: What's new in the Juno Release?
OpenStack Series Part 3: Keystone - Identity Service
OpenStack Series Part 4: Nova - Compute Service
OpenStack Series Part 5: Glance - Image Service
OpenStack Series Part 6: Cinder - Block Storage Service
OpenStack Series Part 8: Neutron - Networking Service
OpenStack Series Part 9: Horizon - a web based UI Service
OpenStack Series Part 10: Heat - Orchestration Service
OpenStack Series Part 11: Ceilometer - Monitoring and Metering Service
OpenStack Series Part 12: Trove - Database Service
OpenStack Series Part 13: Docker in OpenStack
OpenStack Series Part 14: Sahara - Data Processing Service
OpenStack Series Part 15: Messaging and Queuing System in OpenStack
OpenStack Series Part 16: Ceph in OpenStack
OpenStack Series Part 17: Congress - Policy Service
OpenStack Series Part 18: Network Function Virtualization in OpenStack
OpenStack Series Part 19: Storage Policies for Object Storage
OpenStack Series Part 20: Group-based Policy for Neutron
"SwiftStackBlog." A Globally Distributed OpenStack Swift Cluster. N.p., n.d. Web. 06 Nov. 2014.
"OpenStack Swift Architecture." Software Defined Storage. N.p., n.d. Web. 06 Nov. 2014.
"OpenStack Swift - Kinetic - Developers.seagate.com." OpenStack Swift - Kinetic - Developers.seagate.com. N.p., n.d. Web. 06 Nov. 2014.