With every advance in technology, user expectations rise. Once upon a time, we accessed the Internet over dial-up modems, and knew that downloading a movie would take longer than watching it. Today, a momentary interruption in HD streaming is annoying enough to make us change internet providers.
The same holds true in business. End users expect websites, applications, and files to load instantaneously. When pages load slowly, retail customers abandon their carts. But while all-flash arrays have made seemingly instantaneous service possible, many organizations simply haven’t been able to migrate all of their application workloads to purely flash environments.
Tiering solves that problem. With tiered storage, different categories of data are assigned to different types of storage media. Only the most mission-critical data is placed on fast, high-value flash storage. Less demanding applications reside on slower spinning disk, while archived data is placed on low-cost media for long-term retention. This tiered storage architecture saves money by ensuring that organizations purchase only as much expensive storage as the most critical workloads require.
Implementing tiered storage is challenging. Without strategic focus and a concrete understanding of the technology, the wrong data could end up in the wrong places, resulting in high cost, even higher latency, and possibly even data loss. Here are five best practices for optimizing storage tiers for high performance and cost efficiency.
1. Know Your Storage Tiers
Your applications and data need to be organized by business criticality and application performance requirements. Then, an appropriate storage solution needs to be selected for each tier.
Tier 1: For most businesses, the top storage tier is used for transactional data. This type of data requires fast, accurate reads and writes to support customer transactions or to run high-speed applications. One example is online retail, where delays result in abandoned purchases and lost revenue. Generally, latest-generation, high-speed flash systems are used for this tier, where the business needs and revenue justify the high costs.
Tier 2: This tier supports major business applications from email to ERP. It has to be fast and secure, but there’s no need for the subsecond response times of Tier 1. Email is a great example. Load times aren’t as business-critical as in e-commerce, but your colleagues will quickly lose patience with consistently slow loads. Tier 2 technologies balance cost and performance, offering larger storage volumes than Tier 1 and good-enough performance at a lower price point.
Tier 3: As data ages, users generally access it less frequently, if at all. Yet there is still a need to retain this data for historical or compliance reasons. For example, older financial transactions won’t be accessed every day, but still need to be accessible for analysis and financial reporting. Depending on exact business needs, this data might be stored on cheap JBOD or MAID technology or possibly even a storage system designed to support complex queries, such as the Sybase IQ systems.
Tier 4: Lastly, a huge volume of data needs to be kept for a very long time for compliance. Tapes have served this purpose in the past, but without proper indexing and archival organization, data retrieval can be inefficient or even impossible in real-world situations like litigation or compliance audits. The exceptionally low cost of tape storage is tempting, but even in lightly regulated markets, organizations find that these solutions fail to meet some of the basic requirements.
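To make the four tiers concrete, here is a minimal Python sketch of how datasets might be sorted into them. The field names and every threshold (10 ms, 90 days, two years) are assumptions chosen for illustration, not recommendations; real cutoffs should come from your own business and compliance requirements.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    max_latency_ms: float        # slowest response the business will tolerate
    days_since_last_access: int  # how recently anyone touched the data
    compliance_hold: bool        # must be retained for regulatory reasons


def assign_tier(ds: Dataset) -> int:
    """Return a storage tier from 1 to 4 using assumed, illustrative thresholds."""
    if ds.max_latency_ms < 10:
        return 1  # transactional data that justifies high-speed flash
    if ds.days_since_last_access <= 90:
        return 2  # active business applications such as email or ERP
    if ds.compliance_hold and ds.days_since_last_access > 730:
        return 4  # long-term compliance archive
    return 3      # aging data that still needs to be queryable


if __name__ == "__main__":
    for ds in [
        Dataset("checkout-orders", 5, 0, False),
        Dataset("email", 500, 1, False),
        Dataset("last-year-ledger", 5000, 400, True),
        Dataset("legacy-records", 60000, 3000, True),
    ]:
        print(ds.name, "-> Tier", assign_tier(ds))
```

In practice, a classification like this would be reviewed with application owners rather than applied blindly, since a single misclassified dataset can be costly in either direction.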
2. Don’t Underestimate Flash Caching
Keeping the most hotly needed data in the fastest type of storage available is an important guiding principle. In the past, flash storage has only been used for a standard Tier 1 approach. But when implemented properly, flash caching can greatly improve response times without a dramatic increase in cost.
With caching, data is held in the cache only for as long as it’s needed, then moved to a lower-tier storage solution. This stands in contrast to simply using a standard flash tier, where the data may sit on flash for months. There are several methods for identifying, writing, and transferring the most vital data, but the general outcome is the same. Because you’ve ensured that only the most vital data ever hits flash storage, you’ve maximized performance with a cost-efficient solution. Flash can then be used for more than just caching as the needs of the business grow and flash plays a larger role in your environment.
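One way to picture flash caching is as a small, fast store sitting in front of a larger, slower one. The Python sketch below uses a least-recently-used (LRU) eviction policy with write-through to the lower tier. LRU is only one of the several methods mentioned above and is an assumption here, as are the class name and the plain dictionary standing in for the slower tier.

```python
from collections import OrderedDict


class FlashCache:
    """Toy LRU cache in front of a slower backing store (hypothetical model)."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity      # number of items the "flash" can hold
        self.backing = backing_store  # dict standing in for the lower tier
        self.cache = OrderedDict()    # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.backing[key]        # slow path: fetch from lower tier
        self._admit(key, value)
        return value

    def write(self, key, value):
        self.backing[key] = value        # write-through keeps lower tier current
        self._admit(key, value)

    def _admit(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used item


if __name__ == "__main__":
    disk = {"a": 1, "b": 2, "c": 3}
    flash = FlashCache(capacity=2, backing_store=disk)
    for key in ["a", "b", "a", "c"]:
        flash.read(key)
    print(list(flash.cache), flash.hits, flash.misses)
```

Because writes go through to the backing store, evicting an item from the cache never loses data; it only means the next read of that item takes the slow path.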
3. Use Disk Storage Wisely
Disk storage, often referred to as standard storage or disk to disk (D2D), is one of the best ways to optimize operations like backups and restores from a performance perspective. But bandwidth and storage capacity for backups can be a major cost. Disk backup should be deployed based on criticality, usually for Tier 2 and Tier 3 data.
Because disk storage is used for large volumes of data, you need to get the most out of it. Minor inefficiencies become painful and expensive when they’re multiplied across the enormous datasets in Tier 2 and Tier 3. It’s crucial to maximize the available memory to optimize read/write speed and to use high-RPM drives for the best performance. Buffered reads and buffered writes may also increase performance, depending on the underlying disk structure.
4. Consider the Cloud
The cloud can no longer be ignored, as it offers storage tiering with a predictable OpEx model. IT departments can now deploy applications in the public cloud without major hardware investment, or eliminate hardware altogether by consuming the cloud using a pay-as-you-go software as a service (SaaS) model. Cloud storage is flexible, scalable, and often less expensive than comparable on-premises options. For many businesses, a hybrid architecture can be the most cost-effective. Data can be stored in multiple locations, on- and off-site, depending on performance and cost.
There are numerous storage options on the public cloud. Unstructured “raw” storage is available in the form of object, file, and block services, as is a range of structured products. Those structured options may be proprietary or may be compatible with traditional VMware or MS SQL platforms. The diversity of options means that there is likely to be a cloud solution for your architecture and needs, and the cloud should be taken seriously in any modern tiered storage strategy.
Cloud also offers advantages in implementation, as some cloud backup offerings provide automated tiering. With Amazon Web Services (AWS) S3, lifecycle policies can automatically move aging data to infrequent access service levels, then eventually send it to Glacier for long-term retention. Each step reduces the storage cost, and automation reduces, if not eliminates, administrative overhead.
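As a rough illustration of that kind of automated tiering, the Python sketch below builds an S3-style lifecycle configuration and models which storage class an object would land in at a given age. The rule ID, the "backups/" prefix, and the 30-day and 90-day thresholds are assumptions for the example. The helper function is a local model for reasoning about the policy, not part of any AWS SDK.

```python
# Hypothetical lifecycle policy; the day thresholds and prefix are assumptions.
# With boto3, a dict of this shape can be passed as the LifecycleConfiguration
# argument of the S3 client's put_bucket_lifecycle_configuration call.
LIFECYCLE = {
    "Rules": [
        {
            "ID": "age-out-backups",
            "Filter": {"Prefix": "backups/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}


def storage_class_for_age(age_days, config=LIFECYCLE):
    """Model which storage class an object of a given age would occupy."""
    current = "STANDARD"  # objects start in the default class
    transitions = sorted(
        config["Rules"][0]["Transitions"], key=lambda t: t["Days"]
    )
    for transition in transitions:
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


if __name__ == "__main__":
    for age in (10, 45, 400):
        print(age, "days ->", storage_class_for_age(age))
```

Modeling the policy locally like this is a cheap way to sanity-check thresholds against retention requirements before applying anything to a real bucket.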
5. Network and System Performance
The truth is, in a disaster situation, or even during periods of unusually high demand, this careful assessment and planning may come to nothing if the network isn’t up to the task. Network bandwidth and the tiered storage subsystem need to be capable of handling a huge amount of data. In a normal week, processes like system backups will be scattered and scheduled to avoid overtaxing the network. But what happens when there’s a disaster? In this worst-case scenario, a full restore may need to happen quickly and simultaneously across all systems. Networks need to be planned with this worst-case scenario in mind. In some cases, a cloud-based disaster recovery as a service (DRaaS) solution may be best, as it could allow you to immediately spin up and access application-dependent virtual machines (VMs) directly in the cloud, without having to pull them back into the data center.
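A quick back-of-the-envelope calculation shows why the worst case matters. The Python sketch below estimates how long a full restore would take over a given link. The function name and the 70 percent efficiency factor (to account for protocol overhead and contention) are assumptions; substitute figures measured on your own network.

```python
def restore_hours(data_tb, link_gbps, efficiency=0.7):
    """Estimate hours needed to move data_tb terabytes over a link_gbps link.

    efficiency is the assumed fraction of nominal bandwidth actually usable.
    Uses decimal units: 1 TB = 10**12 bytes, 1 Gbps = 10**9 bits per second.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    total_bits = data_tb * 1e12 * 8
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return total_bits / usable_bits_per_second / 3600


if __name__ == "__main__":
    hours = restore_hours(data_tb=50, link_gbps=10)
    print(f"Full restore of 50 TB over 10 Gbps: about {hours:.1f} hours")
```

Under these assumptions, restoring 50 TB over a 10 Gbps link takes roughly 15.9 hours. If that exceeds the recovery time the business can tolerate, the options are more bandwidth, less data to restore, or recovering in place in the cloud.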
Not all data is the same. Application requirements, business needs, and recovery time objective (RTO) and recovery point objective (RPO) goals should be set individually for every type of data your business touches. Storage solutions should then be selected based on those requirements.