Using Virtualization Concepts to Eliminate Wasteful Redundant Data

John Flores
October 29, 2019

The relentless growth of data remains one of IT’s biggest challenges, with annual growth rates approaching 70 percent in some organizations. A good deal of this growth is the result of too many systems creating redundant copies of data for multiple purposes.

Various studies indicate that companies worldwide are spending about $50 billion every year on storage hardware and software to manage unnecessary “copy data.” Even with the deployment of deduplication and compression technologies, it is estimated that 85 percent of storage hardware purchases and 65 percent of storage software purchases are devoted to managing excess duplicate data.

Copy data virtualization (CDV) solutions are designed to address the cost and complexity issues created by redundant data. Introduced in 2009 by Actifio, CDV extends virtualization principles to data management, replacing physical copies of data stores with virtual copies to dramatically reduce storage footprints.

While a certain level of data redundancy is necessary to ensure operational resilience, unplanned redundancy can create significant burdens. The problem occurs as application infrastructures evolve to serve multiple business units. Each of these siloed apps requires its own backup, mirroring, replication, thin provisioning and snapshot functions. According to one study, a single production application such as Oracle, Exchange or SAP can create up to 120 excess copies of production data, even with deduplication technologies deployed within individual silos.

Why Deduplication Isn’t Enough

There are important distinctions between deduplication and data virtualization. For one, deduplication is an I/O-intensive operation, so it is typically reserved for backup and archived data. Production systems can’t spare those computational resources in real time without significant performance degradation. In addition, deduplicated data is usually written to disk using backup software, which means it must be restored to a host in its native format before it can be accessed again.
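To make the cost concrete, here is a minimal sketch of generic hash-based deduplication. It is an illustration, not any vendor's implementation: the chunk size, the function names `deduplicate` and `rehydrate`, and the in-memory dictionary standing in for a backup store are all assumptions. Every chunk has to be read and hashed on the way in, and the data is usable again only after it has been reassembled.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


def deduplicate(data: bytes, store: Dict[str, bytes]) -> List[str]:
    """Split data into fixed-size chunks and keep one physical copy per unique chunk.

    Returns a "recipe": the ordered list of chunk hashes needed to rebuild the data.
    """
    recipe: List[str] = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()  # every chunk is read and hashed
        store.setdefault(digest, chunk)             # stored only if not seen before
        recipe.append(digest)
    return recipe


def rehydrate(recipe: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the native-format data; required before an application can use it."""
    return b"".join(store[digest] for digest in recipe)


store: Dict[str, bytes] = {}
original = b"AAAABBBBAAAA"
recipe = deduplicate(original, store)
restored = rehydrate(recipe, store)
```

In this toy run the three chunks collapse to two stored chunks, but the application still cannot read the data until `rehydrate` has rebuilt it.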

A CDV solution such as Actifio’s Virtual Data Pipeline (VDP) captures data in its native format from production application sources in physical or virtual environments, on-premises or in the cloud. It creates a single “golden master” copy that is compatible with any storage infrastructure. Rather than writing copy data to a physical drive, Actifio creates an image or pointer that links back to a central archive of the original data.

This allows systems to access a virtual copy of application data from any point in time without having to touch the golden master. Accessing these virtual copies does not generate redundant physical copies or consume additional storage. The golden master is updated only with changed blocks through an “incremental forever” model.
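The idea of a golden master with pointer-based virtual copies can be sketched in a few lines. This is a conceptual toy under stated assumptions, not Actifio's implementation: the `GoldenMaster` class, its method names, and the four-byte block size are all hypothetical. Each capture stores only the blocks that differ from the previous capture, and every point-in-time copy is just a list of pointers into the shared block store.

```python
from typing import List

BLOCK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


class GoldenMaster:
    """Toy block store: shared physical blocks plus one pointer list per point in time."""

    def __init__(self) -> None:
        self.blocks: List[bytes] = []         # physical storage, append-only
        self.snapshots: List[List[int]] = []  # each snapshot holds indexes into self.blocks

    def ingest(self, data: bytes) -> int:
        """Capture a point-in-time image, storing only blocks changed since the last one."""
        chunks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
        previous = self.snapshots[-1] if self.snapshots else []
        pointers: List[int] = []
        for position, chunk in enumerate(chunks):
            if position < len(previous) and self.blocks[previous[position]] == chunk:
                pointers.append(previous[position])    # unchanged block: reuse the pointer
            else:
                self.blocks.append(chunk)              # changed or new block: store it once
                pointers.append(len(self.blocks) - 1)
        self.snapshots.append(pointers)
        return len(self.snapshots) - 1

    def virtual_copy(self, snapshot_id: int) -> bytes:
        """Present a point-in-time copy by following pointers; no blocks are duplicated."""
        return b"".join(self.blocks[i] for i in self.snapshots[snapshot_id])


master = GoldenMaster()
monday = master.ingest(b"AAAABBBBCCCC")
tuesday = master.ingest(b"AAAAXXXXCCCC")  # only the middle block changed
```

After the two captures above, the store holds four physical blocks rather than six, yet `master.virtual_copy(monday)` and `master.virtual_copy(tuesday)` each return the complete image for their point in time.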

Improving Data Governance

Copy data virtualization solutions such as Actifio are fast becoming essential elements of a strong data governance program. By decoupling data from physical storage, much the same way a hypervisor decouples compute from physical servers, CDV makes the process of managing, accessing and protecting data faster, simpler and more efficient. It also supports governance and regulatory compliance for sensitive data, and it helps guard against data leaks by limiting the number of physical copies that must be secured.

The need for strong data governance is being amplified as data analytics, mobile computing and Internet of Things initiatives drive continued worldwide data growth. Organizations simply can’t afford to waste money and resources storing and managing redundant data that has little or no value.

Of growing concern is the effect that poor data quality has on the decision-making process. Organizations today are looking for ways to use predictive analytics tools to evaluate large amounts of data in search of patterns and insights that can be used to guide corporate decisions and policies. In our next post, we’ll discuss how Actifio can accelerate data access and analytics to improve data-driven decision making.
