The ComputerWeekly.com/TechTarget IT Priorities Survey for 2016, which questioned 194 IT professionals in the UK and Ireland, has just released its findings, and they show that flash storage deployment appears to have plateaued. Solid state storage (SSD) is a key project for 16% of respondents, down slightly from the near-plateau of 2015 (19%), 2014 (20%) and 2013 (18%).
So is this a surprise? Have flash storage priorities changed? Let’s take a look at what is driving flash in the storage ecosystem right now. A note on naming conventions: the terms flash storage and solid state disk (SSD) storage are often used interchangeably to describe a type of storage that has no moving parts and can be erased and reprogrammed. Strictly speaking, an SSD is a drive with no moving parts, and flash is the underlying memory technology that makes it possible. Modern SSDs are flash-based.
There are two main drivers for considering flash in the storage ecosystem: the increases in storage volumes required by higher definition in video and mobile computing, and the trend for higher performance for certain applications such as rapid SQL querying for customer online purchasing. Last year we saw more flash technology embedded in servers to increase response time for consumers, specifically those in transaction rich industries such as healthcare, finance and retail, where instant transactions can define the customer experience.
And flash is driving consolidation and disputes in the marketplace, with EMC acquiring XtremIO, NetApp acquiring SolidFire, EMC and Pure Storage continuing to wrangle over allegedly infringed deduplication technology, and Violin Memory recently undertaking a restructuring plan that reduced headcount by approximately 25%.
There is no argument in the datacenter that the first tier should incorporate flash from a performance perspective. Major storage suppliers already report that shipments of 15,000rpm disks have been almost entirely replaced by flash in the form of solid state drives (SSDs). The question is what should be chosen for the second (and perhaps third and fourth) tiers – disk, flash, hybrid flash combinations, or some other option? And how is this uncertainty impacting flash adoption?
The storage ecosystem model explained
What data sits where, and why? Tiered storage is the foundation of information lifecycle management (ILM). Data is stored appropriately based on performance, availability and recovery requirements. For example, data intended for restoration in the event of data loss or corruption could be stored locally — for fast recovery — while data for regulatory purposes could be archived to lower cost disks, either at a remote location or not in the production environment.
Like the “death of the mainframe”, the “death of disk” is exaggerated: there will always be a need for HDDs in some form in the overall storage ecosystem, whether for archiving, recovery, remote storage, cold storage and so on. The big theme in storage ecosystem design now is consolidation, but building out the infrastructure could still require adding hard disk drives (HDDs) for capacity, not performance.
Traditionally, datacenter storage is tiered on the basis of latency, IOPS and the relevance of the data to operations. With flash in the mix, datacenter storage is becoming even more tiered, with at least three tiers:
- Ultra-low or low latency: flash-based, if not an all-flash array (AFA)
- Low latency/high IOPS: hybrid flash or HDD
- “Cool/cold” long-term storage: mainly HDD
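To make the tiering idea concrete, here is a minimal Python sketch of how data might be routed to one of these three tiers. The function name, thresholds and tier labels are illustrative assumptions for this post, not figures from any vendor's sizing guide:

```python
# Hypothetical tier-selection sketch. Thresholds are invented for
# illustration; real sizing depends on the workload and the arrays in play.
def choose_tier(latency_ms, iops, accesses_per_day):
    """Map a workload's requirements onto one of three storage tiers."""
    if latency_ms < 1:                       # ultra-low latency demands flash
        return "AFA"
    if iops > 10_000 or accesses_per_day > 100:
        return "hybrid"                      # hot data: flash cache over HDD
    return "cold"                            # archive/long-term: mainly HDD

print(choose_tier(0.5, 50_000, 1000))   # sub-millisecond latency -> AFA
```

The point of even a toy model like this is that tier placement is a policy decision driven by measurable requirements, not by drive type alone.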
A report published in 2012 by the University of California, San Diego and Microsoft Research, The bleak future of NAND flash memory, painted a pessimistic picture of the future of flash: “Building larger-capacity flash-based SSDs that are reliable enough to be useful in enterprise settings and high-performance enough to justify their cost will become challenging. We show that future gains in density will come at significant drops in performance and reliability. As a result, SSD manufacturers and users will face a tough choice in trading off between cost, performance, capacity and reliability.”
Even with flash technology coming down in price, it will still be a while before datacenters start moving to all flash. Many think that going all flash would be a mistake, because not every application is going to need it. But if enterprises are forward looking with their storage footprint in their own ecosystem, where does flash fit in the tiers? Is it price dependent, or application specific?
Price and performance
New flash arrays are generally available and priced competitively with traditional hard drive-based arrays. SSD prices are dropping, and the declines track the roadmap of the HDD business, which maps out the industry’s planned areal densities (bits per square millimeter of disk surface) several years in advance. That roadmap is run by the Advanced Storage Technology Consortium (ASTC) under IDEMA (the International Disk Drive Equipment & Materials Association).
So price is a factor, but not the only one. For something like an all-flash array (AFA), scalability – particularly scale-out – becomes a primary differentiator. Even blended SSD-HDD combinations will favor flash: for hybrid arrays, the ratio of flash capacity continues to grow as customers seek a balance between the cost-effective capacity of hard disk drives and the business-aware performance of flash.
Trade-offs – the implementation of hybrid flash
The current shift to flash is pragmatic and mostly means going hybrid. A hybrid array combines flash memory with traditional disk storage, with flash mostly used for applications that demand a high number of IOPS and traditional spinning disks used for bulk storage. Hybrid combinations of flash drives and HDDs have become a viable option for organizations of all sizes, and there are many ways IT can architect SSDs into the storage mix. Many shops simply add flash cards to the PCIe slots of servers that also have hard disks (or configure new server purchases with flash drives), or add SSDs as direct-attached storage (DAS). Companies managing storage networks are also increasingly adding flash to their storage arrays. And in a growing number of shops where performance is critical, some are stepping up to pure flash-based SSD storage.
AFAs will likely only be needed for the most performance-sensitive application workloads. The move to hybrid came after IT departments started to realize the challenges an all-flash environment could face. For example, there are the “boot storms” that occur at the beginning of the work day, when many people perform scheduled tasks at once and software, patches or workloads are installed simultaneously.
Reliability of SSDs
One of the remaining questions is whether SSDs are reliable enough to replace disk drives completely. An SSD cannot overwrite data in place: the original data must first be erased before new data can be written. And flash memory cells ultimately wear out, limiting the number of times data can be erased and written. The more data stored on the drive and the greater the number of write operations, the sooner performance will begin to degrade. An important part of the management functionality provided by the controllers built into SSDs is wear leveling, a process that controls how data is written so that no one set of cells wears out before the others. In this way, the controllers can significantly extend the life of the drives. Still, SSDs remain susceptible to a number of vulnerabilities, and they tend to fail more spectacularly than HDDs. Data recovery can also be much trickier with an SSD, often requiring specialized expertise or software to retrieve lost data.
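As a rough illustration of the wear-leveling idea – not how a real SSD controller is implemented, since controllers also manage address mapping, garbage collection and more – the core policy can be sketched as a priority queue that always hands the next write to the least-erased block:

```python
import heapq

class WearLeveler:
    """Toy wear-leveling sketch: direct each write to the block with the
    fewest erase cycles, so no block wears out long before the others."""
    def __init__(self, num_blocks):
        # Min-heap of (erase_count, block_id); all blocks start unworn.
        self.heap = [(0, b) for b in range(num_blocks)]
        heapq.heapify(self.heap)

    def write(self):
        erases, block = heapq.heappop(self.heap)   # least-worn block
        heapq.heappush(self.heap, (erases + 1, block))
        return block

wl = WearLeveler(4)
blocks = [wl.write() for _ in range(8)]
# Writes spread evenly: every block is used twice before any is used a third time.
```

The even spread is what extends drive life: total erase cycles are fixed per cell, so distributing them uniformly pushes the first cell failure as far into the future as possible.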
How to best configure the ecosystem
The key is understanding the I/O profiles of your workloads, then using those profiles to generate workload models that can be used to evaluate hybrid offerings against AFA vendor offerings and to determine the optimal mix of HDDs and SSDs in the ecosystem. A focus on use cases can best drive the specialization of storage systems and new tiers of storage. The best course of action is to look into storage workload modeling tools and load generators. Tools such as Load DynamiX provide ways to determine the performance characteristics and limits of any storage system against your specific workload profiles. For any storage deployment, it is important to get a handle on peak workloads, specialized workloads such as backups and end-of-month/year patterns, and impactful events such as login/logout storms.
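As a toy illustration of turning an I/O trace into a workload profile – the sample values and field layout here are invented for the example, not output from Load DynamiX or any real tool:

```python
from statistics import mean

# Hypothetical per-interval trace samples: (iops, read_fraction, block_size_kb).
trace = [
    (4500, 0.9, 8),    # morning login storm: small, read-heavy I/O
    (1200, 0.6, 64),   # steady daytime mix
    (800, 0.2, 256),   # nightly backup: large, write-heavy I/O
]

def profile(samples):
    """Summarize a trace into the figures a workload model needs."""
    iops = [s[0] for s in samples]
    return {
        "peak_iops": max(iops),                # size flash tier for the peak
        "avg_iops": round(mean(iops)),         # size capacity tier for the average
        "read_fraction": round(mean(s[1] for s in samples), 2),
    }

print(profile(trace))
```

Even a crude summary like this makes the trade-off visible: the peak (the login storm) is what argues for flash, while the average and the backup pattern are what argue for keeping cheaper HDD capacity in the mix.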
There is a paradigm shift from standalone storage to software-defined storage (SDS), which is another blog post in itself. As enterprises become application- and use case-led in defining their infrastructure, they are starting to understand the dynamics between applications, compute, network and storage. Storage is becoming an integral part of the entire stack, tightly integrated with the business use case and the necessary performance/reliability trade-offs.