Showing posts with label Nutanix. Show all posts

Thursday, December 11, 2014

Nutanix - a web-scale solution provider

I got this t-shirt a few months back, and the material was so nice that I put it neatly in a drawer and forgot about it.

It was given to me for free for Web-Scale Wednesday, a global online event that brought together IT leaders, industry experts and enterprise customers to share their perspectives and experiences in adopting web-scale IT and bringing it to the enterprise.

What is Web-Scale?
According to this article from Gartner's blog, "Web-Scale" is a term that Gartner uses "in an effort to describe all of the things happening at large cloud services firms such as Google, Amazon, Rackspace, Netflix, Facebook, etc., that enables them to achieve extreme levels of service delivery as compared to many of their enterprise counterparts."  The article further identifies 6 elements that web scale has:
  • Industrial data centers,
  • Web-oriented architectures,
  • Programmable management,
  • Agile processes,
  • A collaborative organization style and
  • A learning culture.  
An interesting note on the word "scale": most people think of scaling in size, but Gartner also states that scale can refer to speed.

Nutanix, on the other hand, suggests that a web-scale infrastructure has these 5 essential elements:

  • Hyper-convergence on x86 servers
  • Intelligence in Software
  • Distributed Everything
  • Self-Healing System
  • API-based Automation and Rich Analytics
Nutanix has this video on "What Is Web-scale IT":

The main idea of a web-scale IT infrastructure is to follow how huge web companies such as Google, Facebook or Netflix build, deploy and manage their data centers.  Web-scale principles can be applied to enterprises and even SMBs (small to medium businesses) to provide agility, scaling and a better return on investment (ROI) on x86 hardware.

Nutanix
Nutanix was founded in 2009 and has its headquarters in San Jose.  Its first product shipped in 2011.

In marketing terms, Nutanix offers web-scale hyper-converged infrastructure solutions that are "revolutionizing the enterprise datacenter by delivering efficient, radically simple physical, virtual and cloud environments."

Nutanix's product offerings come as a mix and match of its software editions and hardware platforms.


Nutanix hardware platforms include:
  • NX-1000 series
  • NX-3000 series
  • NX-6000 series
  • NX-7000 series
  • NX-8000 series
  • NX-9000 series
For detailed specifications and a comparison, we can visit this page.

Nutanix software editions include:
  • Starter
  • Pro
  • Ultimate

For detailed specifications and descriptions of the software, we can visit this page.

The Nutanix Solutions
According to the Nutanix web page: "The Nutanix Virtual Computing Platform is a web-scale converged infrastructure solution that consolidates the compute (server) tier and the storage tier into a single, integrated appliance.

The Nutanix Virtual Computing Platform integrates high-performance server resources with enterprise-class storage in a cost-effective 2U appliance. It eliminates the need for network-based storage architecture, such as a storage area network (SAN) or network-attached storage (NAS). The scalability and performance that the world’s largest, most efficient datacenters enjoy are now available to all enterprises and government agencies."

From the above paragraph, I believe that "web-scale converged infrastructure" are the most important words describing Nutanix's solution: it is web scale, and it is a converged infrastructure.  It gives customers the ability to scale like big web companies such as Google, Facebook or Netflix, with a converged infrastructure that brings hypervisor, compute, storage and networking into a single appliance.

All the Nutanix hardware platforms can be "linked" together as a cluster.  The key to Nutanix's solution is the distribution of operations, which makes the infrastructure agile and resilient.

Here is a "Simple Explanation of How Nutanix Works"

Nutanix Innovations
Nutanix does not have any special hardware; all of its innovations are in the software, the Nutanix Controller Virtual Machine.  As of now there are 3 flavors of the virtual machine, each specially tuned to its respective hypervisor platform:
  • VMware vSphere
  • Microsoft Hyper-V
  • Linux KVM
The Nutanix Controller Virtual Machine has 2 main functions:
  • Nutanix Distributed File System
  • Cluster management

image source: http://cdn1.stevenpoitras.com/wp-content/uploads/2013/09/NDFS_NodeDetail2.png

Nutanix Distributed File System
A Nutanix cluster consists of one or more appliances and has a minimum of 3 nodes.  Together they form the Nutanix Distributed File System (NDFS).

image source: http://cdn.stevenpoitras.com/wp-content/uploads/2013/09/CVM_Dist.png

This distributed file system provides data efficiency and data protection.  To the virtual machines in this web-scale converged infrastructure, NDFS is a single datastore.  The data efficiency and protection are abstracted from the user.  With this architecture, there is no need for separate, dedicated hardware to perform inline deduplication and compression.  According to the Nutanix website, NDFS has the following advantages:
  • Self-healing
  • Built-in converged backup and disaster recovery
  • Scheduled snapshots to align with RPO and RTO
  • Data localization in which data moves with the VM
  • Elastic Deduplication Engine to perform deduplication in RAM
  • Array-side compression
This page has a more detailed description of NDFS.
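To make the deduplication idea in the list above more concrete, here is a minimal sketch of fingerprint-based deduplication held in memory. It is not Nutanix's implementation: the `DedupStore` class, the 4-byte chunk size and the choice of SHA-1 as the fingerprint are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> chunk bytes (stored once)
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name, data):
        fingerprints = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            # Assumption: SHA-1 is used here only as an example fingerprint.
            fp = hashlib.sha1(chunk).hexdigest()
            # Keep the chunk only if this fingerprint has not been seen yet.
            self.chunks.setdefault(fp, chunk)
            fingerprints.append(fp)
        self.files[name] = fingerprints

    def read(self, name):
        # Rebuild the file from its list of fingerprints.
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupStore(chunk_size=4)
store.write("vm1.img", b"AAAABBBBAAAA")
store.write("vm2.img", b"AAAACCCC")
logical_bytes = len(store.read("vm1.img")) + len(store.read("vm2.img"))
```

In this toy run the two images hold 20 logical bytes, but only three unique chunks (12 bytes) are stored, because the repeated `AAAA` chunk is shared.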

Cluster Management
The other main function of the Nutanix Controller Virtual Machine is the management, coordination and application of the key Nutanix technologies in the cluster.  This diagram shows the high-level components of a Nutanix cluster:
image source: http://cdn.stevenpoitras.com/wp-content/uploads/2013/09/NDFS_ClusterComponents.png
Nutanix has a good document on its technologies, the Nutanix Bible.  It is a continuously updated document on the Nutanix product, maintained by Steven Poitras, and it goes into great detail on many subjects.  It has a good description of each of these components, and I have extracted the text below from the Nutanix Bible:
Cassandra
  • Key Role: Distributed metadata store
  • Description: Cassandra stores and manages all of the cluster metadata in a distributed ring like manner based upon a heavily modified Apache Cassandra.  The Paxos algorithm is utilized to enforce strict consistency.  This service runs on every node in the cluster.  Cassandra is accessed via an interface called Medusa.
Zookeeper
  • Key Role: Cluster configuration manager
  • Description: Zookeeper stores all of the cluster configuration including hosts, IPs, state, etc. and is based upon Apache Zookeeper.  This service runs on three nodes in the cluster, one of which is elected as a leader.  The leader receives all requests and forwards them to the peers.  If the leader fails to respond a new leader is automatically elected.   Zookeeper is accessed via an interface called Zeus.
Stargate
  • Key Role: Data I/O manager
  • Description: Stargate is responsible for all data management and I/O operations and is the main interface from the hypervisor (via NFS, iSCSI or SMB).  This service runs on every node in the cluster in order to serve localized I/O.
Curator
  • Key Role: Map reduce cluster management and cleanup
  • Description: Curator is responsible for managing and distributing tasks throughout the cluster including disk balancing, proactive scrubbing, and many more items.  Curator runs on every node and is controlled by an elected Curator Master who is responsible for the task and job delegation.
Prism
  • Key Role: UI and API
  • Description: Prism is the management gateway for component and administrators to configure and monitor the Nutanix cluster.  This includes Ncli, the HTML5 UI and REST API.  Prism runs on every node in the cluster and uses an elected leader like all components in the cluster.
Genesis
  • Key Role: Cluster component & service manager
  • Description:  Genesis is a process which runs on each node and is responsible for any services interactions (start/stop/etc.) as well as for the initial configuration.  Genesis is a process which runs independently of the cluster and does not require the cluster to be configured/running.  The only requirement for genesis to be running is that Zookeeper is up and running.  The cluster_init and cluster_status pages are displayed by the genesis process.
Chronos
  • Key Role: Job and Task scheduler
  • Description: Chronos is responsible for taking the jobs and tasks resulting from a Curator scan and scheduling/throttling tasks among nodes.  Chronos runs on every node and is controlled by an elected Chronos Master who is responsible for the task and job delegation and runs on the same node as the Curator Master.
Cerebro
  • Key Role: Replication/DR manager
  • Description: Cerebro is responsible for the replication and DR capabilities of NDFS.  This includes the scheduling of snapshots, the replication to remote sites, and the site migration/failover.  Cerebro runs on every node in the Nutanix cluster and all nodes participate in replication to remote clusters/sites.
Pithos
  • Key Role: vDisk configuration manager
  • Description: Pithos is responsible for vDisk (NDFS file) configuration data.  Pithos runs on every node and is built on top of Cassandra.
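Several of the components above (Zookeeper, Curator, Prism, Chronos) rely on an elected leader that is replaced automatically when it fails. The following is a minimal sketch of that idea only. The `ToyCluster` class and the rule "the healthy node with the smallest ID leads" are assumptions for illustration, not how Nutanix or Apache Zookeeper actually elect a leader.

```python
class ToyCluster:
    """Toy leader election over a set of healthy node IDs."""

    def __init__(self, node_ids):
        self.healthy = set(node_ids)

    def leader(self):
        # Assumed rule: the healthy node with the smallest ID is the leader.
        return min(self.healthy) if self.healthy else None

    def fail(self, node_id):
        # A failed node drops out; the next call to leader() re-elects.
        self.healthy.discard(node_id)


cluster = ToyCluster(["node-a", "node-b", "node-c"])
first_leader = cluster.leader()
cluster.fail(first_leader)
second_leader = cluster.leader()
```

Here `node-a` leads first, and once it is marked as failed, `node-b` takes over without any manual step.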

Nutanix Use Cases
As a web-scale converged infrastructure, Nutanix has use cases that include, but are not limited to, the following:
  • VDI
  • Enterprise Branch Offices
  • Big Data
  • Private Cloud
  • Disaster Recovery


Reference:
"Cameron Haight." Cameron Haight RSS. N.p., n.d. Web. 09 Dec. 2014 
"Architecture | Nutanix." Nutanix. N.p., n.d. Web. 10 Dec. 2014.
"The Nutanix Bible - StevenPoitras.com." StevenPoitrascom. N.p., n.d. Web. 10 Dec. 2014.

Sunday, November 16, 2014

OpenStack Series: Part 16 – Ceph in OpenStack

On the Ceph home page, Ceph is described as a unified, distributed storage system designed for excellent performance, reliability and scalability.

It is easy to understand distributed but what about unified?

Unified means Ceph is able to deliver object, block and file storage in one system using commodity hardware.  Each piece of commodity hardware is usually called a node, and a cluster is a collection of nodes.

Ceph terminology can be found here.

Ceph is open source software-defined storage.  Inktank is the commercial company that delivers enterprise-ready Ceph.  Inktank was bought by Red Hat in May 2014.  DreamHost is also a major contributor to the Ceph open source software.

As a side note, the name Ceph comes from "cephalopod".  The name Inktank is related, because cephalopods can squirt ink.  Also, the management and monitoring system for Ceph is called Calamari.

The attraction of Ceph is its ability to scale on commodity hardware, with built-in resiliency and high availability.

Ceph is deployed as storage clusters built on RADOS (Reliable Autonomic Distributed Object Store), and the Ceph software uses CRUSH (Controlled Replication Under Scalable Hashing) to determine how and where to store data within the storage cluster.
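The key property of hash-based placement is that any client can compute where an object lives without asking a central lookup table. The sketch below illustrates that property with rendezvous (highest-random-weight) hashing, which is a much simpler stand-in and not the actual CRUSH algorithm. The `placement` function, the OSD names and the object name are hypothetical.

```python
import hashlib


def placement(object_name, osds, replicas=3):
    """Pick `replicas` OSDs for an object, deterministically.

    Stand-in technique: rendezvous hashing, NOT real CRUSH.
    """
    def score(osd):
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The highest-scoring OSDs hold the replicas.
    return sorted(osds, key=score, reverse=True)[:replicas]


osds = ["osd.0", "osd.1", "osd.2", "osd.3", "osd.4"]
chosen = placement("volume-1234", osds)

# Removing an OSD that holds no replica leaves the placement unchanged.
unused = [osd for osd in osds if osd not in chosen]
after_removal = placement("volume-1234", [o for o in osds if o != unused[0]])
```

Every client that knows the same list of OSDs computes the same answer, and removing an OSD that holds no replica of the object does not move the object. Real CRUSH goes further by taking the cluster map, device weights and failure domains into account.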


A Ceph Storage Cluster consists of two types of daemons:
  • A Ceph Monitor maintains a master copy of the cluster map. A cluster of Ceph monitors ensures high availability should a monitor daemon fail. Storage cluster clients retrieve a copy of the cluster map from the Ceph Monitor.
  • A Ceph OSD Daemon checks its own state and the state of other OSDs and reports back to monitors.

If CephFS is used there is also the Ceph Metadata Server (MDS).
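A cluster of monitors stays available only while a majority of them can still agree on the cluster map. The arithmetic behind that can be sketched as follows; the function names are hypothetical and only illustrate the majority rule.

```python
def has_quorum(total_monitors, alive_monitors):
    """A strict majority of the monitors must be alive."""
    return alive_monitors > total_monitors // 2


def tolerated_failures(total_monitors):
    """How many monitors can fail while a majority still remains."""
    return (total_monitors - 1) // 2


failures_with_three = tolerated_failures(3)
failures_with_four = tolerated_failures(4)
failures_with_five = tolerated_failures(5)
```

Three monitors tolerate one failure and five tolerate two. Four monitors still tolerate only one, which is why an odd number of monitors is the usual choice.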
 
Ceph Architecture
image source: http://ceph.com/docs/master/_images/stack.png

This diagram shows that RADOS is the base of a Ceph Storage Cluster.  On top of it there are:
  • LIBRADOS
  • RADOSGW
  • RBD (RADOS Block Device)
  • CephFS

Ceph and OpenStack

Ceph was integrated into OpenStack in the Folsom release.

Being a unified storage provider, Ceph is a storage solution of choice to be used in an OpenStack Infrastructure.

The diagram below shows how OpenStack interfaces with Ceph:


image source: http://www.inktank.com/wp-content/uploads/2013/03/Diagram_v3.0_CEEPHOPENTSTACK11-1024x455.png

The Inktank blog has a good description on how Ceph fits into an OpenStack environment:

Block Storage for OpenStack

  • Ceph serves as a native Cinder block provider for images and volumes, and integrates with the virtualization infrastructure to connect the block devices to the VM’s.
  • The Ceph RBD block device (RBD) enables instant thin provisioning and cloning of images and volumes used by OpenStack Nova.
  • This makes booting new VM’s with highly available, fault-tolerant disks fast, easy, and efficient.
  • Volumes can also be cloned from volume snapshots.
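Thin provisioning and instant cloning both rest on copy-on-write: a clone starts with no data of its own and stores only the blocks that are written to it. The sketch below is a toy model of that idea and not the RBD implementation (in real Ceph, clones are made from a protected snapshot of an image). The `Image` class and its block layout are assumptions for illustration.

```python
class Image:
    """Toy copy-on-write image: a clone stores only blocks it has written."""

    def __init__(self, blocks=None, parent=None):
        self.blocks = dict(blocks or {})  # block index -> bytes owned here
        self.parent = parent              # image this one was cloned from

    def clone(self):
        # "Instant": no block data is copied at clone time.
        return Image(parent=self)

    def read(self, index):
        if index in self.blocks:
            return self.blocks[index]
        if self.parent is not None:
            # Fall through to the parent for blocks never written here.
            return self.parent.read(index)
        return b"\0"  # never-written blocks read as zeros (thin provisioning)

    def write(self, index, data):
        # Writes land in the clone only; the parent is left untouched.
        self.blocks[index] = data


golden = Image({0: b"boot", 1: b"root"})
vm_disk = golden.clone()
vm_disk.write(1, b"data")
```

After the write, `vm_disk` owns a single block of its own and still reads block 0 from `golden`, while `golden` itself is unchanged. That is why booting many VMs from one image can be fast and space efficient.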

Object Storage for OpenStack
  • Ceph Object Gateway (RGW) provides complete compatibility with the Swift API, integrates into Keystone for authentication and can be used as a backend to Glance.
  • Full compatibility with the Amazon S3 API, a more scalable and easier to manage architecture, and the ability to run a single system for object and block,

Ceph at CERN
CERN is a huge nuclear research institute in Europe.  CERN deploys OpenStack in its production environment and received the "OpenStack Superuser Award" at the OpenStack Summit in Paris.  Check out their cloud infrastructure here.  For a research institute, storage is important.  CERN uses Ceph for image processing, for storing and archiving research data, and for quick data retrieval.  It has a 3 PB (petabyte) Ceph cluster in production.

Note: 1 PB = 1,000,000,000,000,000 B = 10^15 bytes = 1,000 terabytes.

image source: https://pbs.twimg.com/media/B0a6e1CCAAED1Pq.png:large


This is not exactly a reference architecture for Ceph, but the example shows that Ceph has a lot of potential to be used along with OpenStack.

Ceph use cases
Ceph can run on the same Linux cluster that KVM runs on.  With OpenStack Heat for autoscaling, it has all the right ingredients to be made into a hyperconverged unit.  Recently Nutanix and SimpliVity have been gaining momentum in the hyperconvergence space.  Two applications of hyperconvergence are VDI (Virtual Desktop Infrastructure) and big data.

According to Mirantis, OpenStack Sahara is planning to have native Ceph support in the Kilo release.

It seems to me that, because Ceph is able to support object, block and file system storage, it has huge potential for different applications and use cases.

Related Post:
OpenStack Series Part 1: How do you look at OpenStack?
OpenStack Series Part 2: What's new in the Juno Release?
OpenStack Series Part 3: Keystone - Identity Service
OpenStack Series Part 4: Nova - Compute Service
OpenStack Series Part 5: Glance - Image Service
OpenStack Series Part 6: Cinder - Block Storage Service
OpenStack Series Part 7: Swift - Object Storage Service
OpenStack Series Part 8: Neutron - Networking Service
OpenStack Series Part 9: Horizon - a Web Based UI Service
OpenStack Series Part 10: Heat - Orchestration Service
OpenStack Series Part 11: Ceilometer - Monitoring and Metering Service
OpenStack Series Part 12: Trove - Database Service
OpenStack Series Part 13: Docker in OpenStack
OpenStack Series Part 14: Sahara - Data Processing Service
OpenStack Series Part 15: Messaging and Queuing System in OpenStack
OpenStack Series Part 17: Congress - Policy Service
OpenStack Series Part 18: Network Function Virtualization in OpenStack
OpenStack Series Part 19: Storage Policies for Object Storage
OpenStack Series Part 20: Group-based Policy for Neutron

Reference:
"Architecture." Architecture — Ceph Documentation. N.p., n.d. Web. 31 Oct. 2014.
"Home Ceph." Ceph Home Comments. N.p., n.d. Web. 31 Oct. 2014. 
"Ceph for OpenStack." Inktank Ceph for OpenStack Comments. N.p., n.d. Web. 31 Oct. 2014.