Disk and Tape Are Here to Stay – So Use Them Willingly

Introduction

Disk and tape are like paper: they are always going away but never quite do, because their advantages in cost and unpowered data retention are too large to ignore when designing infrastructures that support AI and LLMs, enable fine-grained backups, and satisfy difficult governance requirements. Before reflexively dismissing the value of disk and tape in your environment, note that disk and tape market revenues and capacity are growing; that leasing programs from vendors like NetApp, IBM, HPE, and Pure Storage include disk and tape; and that cloud service providers (CSPs) offer cold storage services based on disk and tape. This paper examines the continued relevance of disk and tape storage, focusing on economic considerations, technology retention periods, intra- versus inter-box tiering considerations, and software integrations.

Cost-Efficiency: A Compelling Economic Argument

One of the primary reasons disk and tape storage remain relevant in today’s data centers is their unmatched cost-efficiency for large-scale data storage. Figure 1, compiled from a variety of credible sources, shows enterprise-grade HDDs in 2025 costing approximately $6–$9 per terabyte and SSDs costing $20–$40 per terabyte. Including disk and tape in an organization’s storage infrastructure can influence storage vendor selection, the choice of AIOps tools, SLAs, and the capacity of data lakes used to train AI models and LLMs.

AIOps’ ability to integrate data from multiple monitoring and management tools into unified views of how data is exchanged, processed, and stored between applications will continue to reduce the ownership costs of disk and tape by automating provisioning, data tiering, and the repair of failed backups. Lowering management costs increases the share of ownership costs attributable to acquisition, which further ensures that HDDs will remain relevant until SSD $/TB costs fall below HDD $/TB costs.

Tape storage takes cost-efficiency even further. LTO-9 (Linear Tape-Open) cartridges offer up to 18TB of native capacity (45TB compressed) at a cost of roughly $2–$3 per terabyte uncompressed, making tape the most affordable option for long-term archival storage. Tape’s low cost is particularly appealing to organizations with massive datasets, such as those in healthcare or government, where data retention periods can span decades. Unlike cloud storage, which incurs recurring operational expenses (OpEx) and variable charges linked to retrievals, tape has minimal ongoing costs after the initial purchase, offering significant savings over time. For example, storing 1PB of data in the cloud for 10 years with only a single retrieval should cost around $221,800, versus about $41,500 for tape (see Figure 1, below).
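To make the per-terabyte figures concrete, the following minimal Python sketch scales the price ranges quoted above to one petabyte. It covers media acquisition only; tape drives, libraries, power, and cloud fees are deliberately left out, so it is not expected to reproduce the 10-year totals cited above. The function names and the decimal conversion of 1 PB = 1,000 TB are assumptions made for illustration.

```python
# Per-terabyte media price ranges quoted in this article (USD, 2025).
PRICE_PER_TB = {
    "SSD": (20, 40),
    "HDD": (6, 9),
    "LTO-9 tape (native)": (2, 3),
}

TB_PER_PB = 1000  # assumption: decimal units, 1 PB = 1,000 TB


def media_cost_range(capacity_tb, price_range):
    """Return (low, high) media-only cost in USD for the given capacity."""
    low, high = price_range
    return capacity_tb * low, capacity_tb * high


def lto9_cartridges_needed(capacity_tb, native_tb_per_cartridge=18):
    """Number of LTO-9 cartridges needed at native (uncompressed) capacity."""
    # Ceiling division using integer arithmetic only.
    return -(-capacity_tb // native_tb_per_cartridge)


if __name__ == "__main__":
    for medium, price_range in PRICE_PER_TB.items():
        low, high = media_cost_range(TB_PER_PB, price_range)
        print(f"{medium}: ${low:,} to ${high:,} per PB (media only)")
    print(f"LTO-9 cartridges for 1 PB: {lto9_cartridges_needed(TB_PER_PB)}")
```

On these assumptions, a petabyte of tape media costs $2,000 to $3,000 across 56 cartridges, against $6,000 to $9,000 for HDD and $20,000 to $40,000 for SSD.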

Figure 1: Bit Costs Versus Power Consumption/Bit Trade-off

Reliability and Longevity for Compliance

Since the birth of the enterprise AFA market circa 2012/2013, NAND flash vendors have focused on lowering bit costs by storing more bits per cell, shrinking cell size, and 3D stacking. Table 1 shows that lowering bit costs and power consumption per bit necessitates a trade-off against data durability. The reduction in durability is caused by charge leakage when the device is unpowered and by the narrower voltage gaps that separate bit states in multi-bit cells. These reductions in SSD data retention periods position disk and tape as the media of choice for long-term data retention. Table 2 shows typical unpowered retention periods and the technology refresh periods needed to ensure data integrity.

Table 1: Data Durability Trade-offs

Table 2: Unpowered Retention Periods

HDDs and tape excel in reliability and longevity, making them ideal for storing data that can no longer be recreated (such as oil field seismic data), data that would be too costly to restore, and data that requires lower storage costs and quick access times to justify keeping. Although HDDs have retention periods measured in decades, their practical retention period is limited to 5–7 years because vendors typically terminate their support of older storage systems after 7–8 years. The pain of these forced technology refreshes has been greatly diminished by the availability of non-disruptive upgrades, data-in-place upgrades, and migration services, the costs of which are negotiable.

Tape’s 15–30+ year shelf life, along with long tape library service lives, makes tape unmatched for long-term data preservation. Air-gapped security and write-protect switches or tabs on cartridges further increase their appeal by preventing accidental erasure or overwriting. Regulations like GDPR, HIPAA, and SEC 17a-4 mandate data retention for 7–30 years, and tape’s immutability ensures compliance. A 2024 ransomware incident at a healthcare provider underscored tape’s value, as air-gapped LTO tapes enabled recovery without data loss. IBM’s Storage Assurance program leverages tape libraries (e.g., TS4500) for compliance, offering lease terms that align with regulatory timelines, reducing upfront costs compared to purchasing.
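The retention argument can be sketched as a count of forced media migrations over a mandated retention period. The Python sketch below uses the practical HDD refresh interval (5–7 years) and the tape shelf life (15–30 years) quoted in this section. The function name and the simplifying assumption that data moves to fresh media exactly once per media lifetime are illustrative, not taken from any vendor's guidance.

```python
import math


def migrations_needed(retention_years, media_life_years):
    """Count the media migrations needed to hold data for retention_years.

    Assumes the data is copied to fresh media each time the current media
    reaches media_life_years; the initial write is not a migration.
    """
    if retention_years <= 0 or media_life_years <= 0:
        raise ValueError("years must be positive")
    return max(0, math.ceil(retention_years / media_life_years) - 1)


if __name__ == "__main__":
    retention = 30  # upper end of the 7-30 year regulatory range cited above
    for label, life in [("HDD, 5-year refresh", 5), ("HDD, 7-year refresh", 7),
                        ("Tape, 15-year shelf life", 15),
                        ("Tape, 30-year shelf life", 30)]:
        print(f"{label}: {migrations_needed(retention, life)} migrations")
```

Under these assumptions, a 30-year retention requirement implies four or five HDD refreshes but at most one tape migration. Tape drive and format generations may force earlier migrations than shelf life alone suggests, which this sketch does not model.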

Scalability for AI and Big Data

AI and big data have amplified the need to implement tiered storage infrastructures that align performance, capacity, and costs with specific application needs. This exercise frequently results in dedicating AFAs to high-performance inference and active model training; hybrid arrays to mixed workloads (including backup and restore) and to staging and preprocessing AI data; and tape to storing large amounts of historical data, retraining LLMs with older data, backups, and compliance. Commonly reported capacity ratios are 15% to 20% for AFAs, 65% to 70% for hybrid arrays, and 2% to 5% for HDD arrays, with tape used for cold storage.
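As a rough planning aid, the reported ratios can be turned into capacity ranges for a given estate size. The Python sketch below does only that arithmetic; the 100 PB total is a hypothetical figure chosen for illustration, and tape is omitted because no capacity share is given for it above.

```python
# Capacity shares quoted above, as (low %, high %) of total capacity.
TIER_SHARE_PCT = {
    "AFA": (15, 20),
    "Hybrid array": (65, 70),
    "HDD array": (2, 5),
}


def tier_capacity_ranges(total_pb):
    """Map each tier to its (low, high) capacity in PB for a given total."""
    return {
        tier: (total_pb * low / 100, total_pb * high / 100)
        for tier, (low, high) in TIER_SHARE_PCT.items()
    }


if __name__ == "__main__":
    # 100 PB is a hypothetical estate size chosen for illustration.
    for tier, (low, high) in tier_capacity_ranges(100).items():
        print(f"{tier}: {low:g} to {high:g} PB")
```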

Moving massive amounts of data between tiers, and ensuring that network bandwidth limitations do not bottleneck parallel file systems, strongly suggest the need for a robust, extensible networking infrastructure that may include InfiniBand (200–400 Gbps) or high-speed Ethernet (400 Gbps) for GPU-to-GPU communication. They also suggest the use of hybrid arrays wherever practical to reduce data tiering network traffic. In large AI and mixed-load data centers, networking can account for 10% to 20% of infrastructure costs.
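A back-of-the-envelope calculation shows why inter-tier traffic matters at these scales. The Python sketch below estimates how long a bulk transfer takes at the link speeds mentioned above, using decimal units. The `efficiency` parameter is a hypothetical allowance for protocol overhead and contention, not a measured value.

```python
def transfer_time_hours(data_tb, link_gbps, efficiency=1.0):
    """Hours needed to move data_tb terabytes over a link of link_gbps.

    Uses decimal units (1 TB = 10**12 bytes, 1 Gbps = 10**9 bits/s).
    efficiency is a hypothetical fraction of line rate actually achieved.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    bits = data_tb * 10**12 * 8
    seconds = bits / (link_gbps * 10**9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for gbps in (200, 400):
        hours = transfer_time_hours(1000, gbps)  # 1 PB = 1,000 TB
        print(f"1 PB over {gbps} Gbps at line rate: {hours:.1f} hours")
```

Even at full line rate, moving 1 PB takes about 5.6 hours over a single 400 Gbps link and about 11.1 hours at 200 Gbps, which is one reason to keep tiering inside a hybrid array where practical.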

Conclusion

Disk and tape remain vital in 2025 due to their cost-efficiency, reliability, scalability, and energy efficiency. Using disk and tape to keep storage affordable provides a competitive advantage by increasing the size of the data lakes used for training and AI-augmented analytics. Disk and tape also provide high performance when supporting applications with mostly serial access patterns. HDDs support secondary storage and hybrid cloud, while tape dominates archival with unmatched longevity and security.

---- ---- ---- ---- ---- ----

Stanley Zaffos

Advisor at Lionfish Tech Advisors, Inc.

_____________________________________

©2025 Lionfish Tech Advisors, Inc. All rights reserved.

Title image source: Modified to show rotating magnetic drives.
https://www.loc.gov/static/exhibitions/treasures-from-the-library-of-congress/images/objects/mechanics-of-memory/tr0065-69_standard.jpg