Alex Chalkias
on 18 February 2020
Ceph is a compelling open-source alternative to proprietary software-defined storage solutions from traditional vendors, with a vibrant community collaborating on the technology. Ubuntu was an early supporter of Ceph and its community. That support continues today as Canonical maintains premier member status and serves on the governing board of the Ceph Foundation.
With many global enterprises and telco operators running Ceph on Ubuntu, organisations are able to combine block and object storage at scale while tapping into the economic and upstream benefits of open source.
Why use Ceph?
Ceph is unique because it makes data available in multiple ways: as a POSIX-compliant filesystem through CephFS, as block storage volumes via the RBD driver, and as an object store compatible with both the S3 and Swift protocols through the RADOS gateway.
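As a rough illustration of the filesystem interface, the sketch below uses the CephFS Python binding (the python3-cephfs package on Ubuntu) to mount the filesystem and write a file. The configuration path and file name are assumptions for the example, and a client keyring is assumed to be available on the host.

```python
# Minimal sketch: writing a file through CephFS using the libcephfs Python
# binding (python3-cephfs). Config path and file name are illustrative.
import cephfs

fs = cephfs.LibCephFS(conffile='/etc/ceph/ceph.conf')  # assumed config path
fs.mount()                                 # mount the filesystem at its root

fd = fs.open(b'/hello.txt', 'w', 0o644)    # create a file in CephFS
fs.write(fd, b'stored via CephFS\n', 0)    # write at offset 0
fs.close(fd)

print(fs.statfs(b'/'))                     # basic filesystem statistics
fs.shutdown()
```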
A common use case for Ceph is to provide block and object storage to OpenStack clouds via Cinder and as a Swift replacement. Kubernetes has similarly adopted Ceph as a popular backend for persistent volumes (PVs) through a Container Storage Interface (CSI) plugin.
Even as a stand-alone solution, Ceph is a compelling open-source alternative to closed-source, proprietary storage, as it reduces the operational expenditure organisations commonly accrue from licensing, upgrades and potential vendor lock-in fees.
How Ceph works
Ceph stores data in pools, which users or other services access to provide block, file or object storage. Each of these mechanisms is backed by a Ceph pool, and characteristics such as replication, data placement, ownership and access rights are expressed on a per-pool basis.
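As a minimal sketch of how clients address pools, the example below uses the rados Python binding (python3-rados on Ubuntu) to list the pools in a cluster and store an object in one of them. The pool name, object name and configuration path are hypothetical.

```python
# Minimal sketch using the librados Python binding (python3-rados).
# Pool name, object name and config path are illustrative assumptions.
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')  # assumed config path
cluster.connect()

print(cluster.list_pools())           # every pool defined in the cluster

ioctx = cluster.open_ioctx('mypool')  # I/O context bound to a single pool
ioctx.write_full('greeting', b'hello from RADOS')  # store an object
print(ioctx.read('greeting'))         # read it back

ioctx.close()
cluster.shutdown()
```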
The Ceph Monitors (MONs) are responsible for maintaining the cluster state, and they manage the location of data using the CRUSH map. MONs operate as a quorum-based, highly available cluster, while data is stored and retrieved via Object Storage Devices (OSDs).
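To see the quorum and OSD state that the MONs maintain, a client can send commands to the monitors. The sketch below does so through the same Python binding, assuming an admin keyring and the default configuration path are present on the host.

```python
# Minimal sketch: querying cluster state from the MONs via mon_command.
# Assumes /etc/ceph/ceph.conf and an admin keyring are available.
import json
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

# 'quorum_status' reports which MONs currently form the quorum.
ret, out, err = cluster.mon_command(
    json.dumps({'prefix': 'quorum_status', 'format': 'json'}), b'')
print(json.loads(out)['quorum_names'])

# 'osd stat' summarises how many OSDs exist and how many are up/in.
ret, out, err = cluster.mon_command(
    json.dumps({'prefix': 'osd stat', 'format': 'json'}), b'')
print(json.loads(out))

cluster.shutdown()
```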
There is a 1:1 mapping between a storage device and a running OSD daemon process. OSDs can make aggressive use of the CPU and RAM of their host, which is why it is important to carefully balance the number of OSDs against the available CPU cores and memory when architecting a Ceph cluster. This is especially true when aiming for a hyper-converged architecture (with OpenStack or Kubernetes, for example).
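As a back-of-the-envelope illustration of that balance, the sketch below estimates per-node CPU and memory needs from an OSD count, using commonly cited rules of thumb (roughly one core and 4-5 GB of RAM per OSD, in line with the default osd_memory_target of 4 GB). The figures and the reserve for co-located services are assumptions; real sizing depends on device type and workload.

```python
# Back-of-the-envelope OSD sizing sketch. The per-OSD figures below are
# commonly cited rules of thumb, not official requirements; adjust for
# device type (HDD vs NVMe) and workload.
CORES_PER_OSD = 1    # assumption: ~1 core per OSD
RAM_GB_PER_OSD = 5   # assumption: ~4-5 GB per OSD (osd_memory_target plus overhead)

def estimate_node_resources(osds_per_node, reserve_cores=4, reserve_ram_gb=16):
    """Estimate the CPU/RAM an OSD node needs, leaving headroom for
    co-located services (e.g. OpenStack or Kubernetes components)."""
    cores = osds_per_node * CORES_PER_OSD + reserve_cores
    ram_gb = osds_per_node * RAM_GB_PER_OSD + reserve_ram_gb
    return cores, ram_gb

print(estimate_node_resources(12))  # e.g. 12 OSDs -> (16 cores, 76 GB RAM)
```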
Using LXD as a container hypervisor helps enforce resource limits on the processes running on a given node. LXD is used extensively to provide the best economics in Canonical’s Charmed OpenStack distribution by isolating the Ceph MONs. Containerising the Ceph OSDs is currently not recommended.
Ceph storage mechanisms
Accessing a data pool means choosing the mechanism it backs. For example, one pool may be used to store block volumes, while another provides the storage backend for object store or filesystems. In the case of volumes, the host seeking to mount a volume needs to load the RBD kernel module, after which Ceph volumes can be mounted just as local volumes would be.
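For the block case, volumes are RBD images inside a pool. The sketch below creates one with the rbd Python binding (python3-rbd), after which it could be mapped on a host via the RBD kernel module and formatted like any local disk; the pool and image names are hypothetical.

```python
# Minimal sketch: creating a 10 GiB RBD image with the librbd Python
# binding (python3-rbd). Pool and image names are illustrative.
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')                 # pool that holds the images

rbd_inst = rbd.RBD()
rbd_inst.create(ioctx, 'myvolume', 10 * 1024**3)  # image name, size in bytes
print(rbd_inst.list(ioctx))                       # images in the pool

# The image can now be mapped on a client host (e.g. rbd map rbd/myvolume),
# then formatted and mounted like a local block device.
ioctx.close()
cluster.shutdown()
```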
Object buckets are generally not mounted: client-side applications can use overlay filesystems to simulate a ‘drive’, but no actual volume is mounted. Instead, the RADOS gateway (RADOSGW) enables access to object buckets through a REST API that speaks the S3 or Swift protocols. Filesystems are created and formatted using CephFS, after which they are exported in a similar fashion to NFS mounts and made available to local networks.
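Because RADOSGW speaks the S3 protocol, standard S3 tooling works against it. The sketch below uses boto3 with a hypothetical gateway endpoint and placeholder credentials to create a bucket and upload an object.

```python
# Minimal sketch: talking to the RADOS gateway with boto3 over the S3 API.
# Endpoint URL, credentials and bucket name are hypothetical placeholders.
import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://rgw.example.com:8080',   # assumed RADOSGW endpoint
    aws_access_key_id='ACCESS_KEY',               # placeholder credentials
    aws_secret_access_key='SECRET_KEY',
)

s3.create_bucket(Bucket='demo-bucket')
s3.put_object(Bucket='demo-bucket', Key='hello.txt', Body=b'stored via RADOSGW')

for obj in s3.list_objects_v2(Bucket='demo-bucket').get('Contents', []):
    print(obj['Key'], obj['Size'])
```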
Volume and object store use cases have been in production at scale for quite some time. Using Ceph to combine volume and object store provides many benefits to operators. Aside from the obvious support for multiple storage use cases, it also allows for the best density when properly architected and scaled.
Ceph storage support with Canonical
Canonical provides Ceph support as part of Ubuntu Advantage for Infrastructure with Standard and Advanced SLAs, corresponding to business hours and 24×7 support respectively. Each covered node includes support for up to 48TB of raw storage in a Ceph cluster.
This coverage derives from our reference hardware recommendation for OpenStack and Kubernetes in a hyper-converged architecture, which targets an optimal price per TB while preserving the best compute and network performance in an on-premise cloud. Where a deployment’s node-to-TB ratio exceeds this limit, Canonical offers per-TB pricing beyond the included allowance to accommodate scale-out storage customers.
Ceph is available on Ubuntu in the main repository, and as such, users receive free security updates for up to five years on an LTS version. An additional five years of paid, commercial support beyond the standard support cycle is available through UA Infrastructure.
Discover how organisations benefit from Canonical’s Ceph support with these case studies:
- The Wellcome Sanger Institute turns to Canonical for high level Ceph support
- Yahoo! Japan builds their IaaS environment with Canonical
Learn more about Canonical’s Ceph storage solution and get more details on our Ceph support offering.