Halt Data Hoarding In 2025 With These Tips From An IT Exec
An IT director shares ways to manage ROT (redundant, obsolete and trivial) data in your organization.
In 2024, 402.89 million terabytes of data were created, captured, copied, or consumed every day, according to a report from SOAX. In 2025, 463 exabytes of data will be created each day globally, the World Economic Forum estimates.
This staggering volume can lead organizations to hoard what is known as “ROT” data: redundant, obsolete and trivial.
Heather Phelps is the director of IT and information security at Ribbon Communications, a provider of IP optical networking and communications solutions. Phelps shared with MES Computing the risks associated with allowing ROT in your infrastructure, how to create a clean digital storage policy, plus more data management tips. Phelps’ responses were sent to MES Computing via email.
ROT Risks
What are the risks associated with having ROT data throughout an organization?
In today's data-driven world, organizations are collecting massive amounts of information to uncover insights that drive growth and innovation. However, not all data is valuable, nor should it be accessed by all. Organizations face a growing challenge in managing all this data and identifying the information that is redundant, obsolete, and trivial: ROT data. While often overlooked, ROT data silently eats into operational budgets, hampers analytics, and exposes organizations to significant security risks. From skyrocketing storage expenses to preventable cyber breaches, the consequences of unchecked ROT data are far-reaching. Some of these consequences include the following:
Skyrocketing Storage Expenses
The promise of business analytics lies in extracting actionable insights from data. However, ROT data muddies the water, creating noise in datasets that leads to skewed analytics and wasted resources.
Increased Breach Exposure
The 2017 Equifax breach, which exposed the personal data of 147 million individuals, including SSNs, birth dates, addresses, and, in some cases, driver’s license numbers, is a notable example. Equifax had retained multiple versions of customer records from different reporting sources, integrated outdated data into newer systems, and kept records for inactive users beyond their business or regulatory needs. This breach ended up costing Equifax $1.4 billion in notification and remediation efforts.
Regulatory Non-Compliance
Contractual and regulatory obligations, including the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), along with others emerging worldwide, require organizations to manage personal and confidential data responsibly. This includes the secure deletion of unnecessary information. Failing to address ROT data can result in hefty fines and legal consequences. In 2020, British Airways was fined £20 million ($25.4 million) for GDPR violations, partly because it retained outdated customer data that was later compromised in a cyberattack.
Environmental Costs
Data storage has a significant impact on environmental sustainability, and retaining ROT data can exacerbate this issue. Data centers consume vast amounts of energy to power servers and maintain optimal operating conditions, and they often draw that energy from non-renewable sources. By storing ROT data, organizations increase their energy consumption and carbon footprint. This not only leads to higher operational costs but also contradicts sustainability goals and corporate social responsibility (CSR) initiatives.
Human and Technological Resource Drain
Managing and maintaining ROT data also consumes valuable human resources, leading to inefficiencies and increased costs. IT teams often find themselves spending a significant amount of time cataloging, organizing, and protecting irrelevant information, which takes attention from more strategic tasks.
Similarly, data analysts must frequently sift through large volumes of ROT data to find relevant information. This not only wastes time but also reduces the accuracy and effectiveness of their analyses. The presence of ROT data creates noise in datasets, making it difficult to identify genuine trends and patterns. This may lead to inaccurate conclusions and, in turn, poor decision-making.
On the technological side, overloaded systems struggle to manage vast amounts of ROT data, leading to performance issues such as downtime or lag. These performance issues can disrupt business operations and negatively impact productivity. Employees may experience delays in accessing critical information, which can frustrate them and hinder their ability to perform their tasks efficiently.
Impact On AI And ML
As we move into the era of AI, machine learning, and hyper-automation, ROT data becomes increasingly problematic. AI and ML rely on large data models for training and ‘learning,’ and are only as good as these models, making accuracy more essential than ever. Automated processes for data management will play a crucial role in maintaining data quality and relevance, thereby enhancing the effectiveness of AI and ML initiatives.
What is the most effective way to pinpoint this data, to find where it exists in your system?
To effectively manage data and mitigate risks associated with ROT data, organizations can leverage a combination of Data Security Posture Management (DSPM) and Data Loss Prevention (DLP) tools. These technologies, along with regular data audits, automated data classification and cleanup tools, and clear data management policies, form a comprehensive strategy for data governance.
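To make the idea of a data audit concrete, here is a minimal Python sketch of one small piece of it: scanning a file share for redundant and potentially obsolete files. It is not how any DSPM or DLP product works. The function names (`sha256_of`, `audit`) and the one-year staleness threshold are assumptions chosen for illustration. Byte-identical files are treated as "redundant" and files not modified within the threshold are treated as candidates for "obsolete".

```python
import hashlib
import time
from collections import defaultdict
from pathlib import Path
from typing import Optional


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(root: Path, stale_after_days: int = 365, now: Optional[float] = None):
    """Return (duplicate_groups, stale_files) for every file under root.

    stale_after_days is an illustrative threshold, not a recommendation.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400
    by_hash = defaultdict(list)
    stale = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Files with identical content share a hash: redundancy candidates.
        by_hash[sha256_of(path)].append(path)
        # Files untouched since the cutoff: obsolescence candidates.
        if path.stat().st_mtime < cutoff:
            stale.append(path)
    duplicates = [group for group in by_hash.values() if len(group) > 1]
    return duplicates, stale
```

Calling `audit(Path("/some/share"))` returns candidates only. A last-modified date says nothing about legal or business retention needs, so the output would feed a review step rather than an automatic delete.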
How do you define "trivial" data?
Trivial data refers to low-value information that offers little to no business utility. This type of data does not contribute to corporate knowledge, business insights, or record-keeping requirements. Examples of trivial data include casual emails, chat messages, draft documents, unstructured personal files (personal photos, music, or videos employees may store on company devices but are unrelated to business activities), and temporary files.
Managing trivial data is crucial for organizations to maintain efficient data storage and retrieval systems. By identifying and removing trivial data, companies can reduce storage costs, improve data accessibility, and enhance overall data security. This process also helps in minimizing the risk of data breaches by ensuring that only valuable and necessary data is retained.
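As a rough illustration of how the trivial-data categories above might be turned into an automated first pass, the following Python sketch flags files by name alone. The suffix lists, the draft markers, and the function name `trivial_reason` are all hypothetical. A real classification tool would look at content, ownership, and context, and a media file may well be legitimate business data.

```python
from pathlib import Path
from typing import Optional

# Hypothetical rule lists for illustration; tune to the organization.
TEMP_SUFFIXES = {".tmp", ".bak", ".swp"}
PERSONAL_MEDIA_SUFFIXES = {".jpg", ".jpeg", ".png", ".mp3", ".mp4", ".mov"}
DRAFT_MARKERS = ("draft", "copy of")


def trivial_reason(path: Path) -> Optional[str]:
    """Return why a file looks trivial, or None if no rule matches."""
    name = path.name.lower()
    suffix = path.suffix.lower()
    # "~$" is the prefix of lock files left behind by some office suites.
    if suffix in TEMP_SUFFIXES or name.startswith("~$"):
        return "temporary file"
    if suffix in PERSONAL_MEDIA_SUFFIXES:
        return "possible personal media"
    if any(marker in name for marker in DRAFT_MARKERS):
        return "draft document"
    return None
```

Under these assumed rules, `trivial_reason(Path("holiday.JPG"))` returns `"possible personal media"`, while `trivial_reason(Path("contract_2024.pdf"))` returns `None`. The result is a reason string rather than a yes/no flag so that a reviewer can see why each file was selected.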
How do organizations establish a Clean Digital Storage Policy?
Organizations should start by defining clear objectives and the scope of the policy. This involves conducting a thorough data inventory to understand what data is stored and where, followed by classifying data based on its value and relevance. Next, organizations need to develop or enhance comprehensive policies and procedures for data creation, storage, access, and disposal. Implementing data security measures such as DSPM and DLP is crucial to protect sensitive information and prevent unauthorized access. By following these steps, organizations can enhance data management, support compliance, and reduce the risks associated with ROT data.
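The classify-then-dispose steps described above can be sketched as a small retention table in Python. Everything here is an assumption for illustration: the data classes, the retention periods, and the names `RetentionRule`, `POLICY`, and `disposition` are hypothetical, and the periods are not legal or regulatory guidance.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    data_class: str
    retain_days: int


# Hypothetical classes and periods; a real policy comes from legal,
# compliance, and business owners.
POLICY = {
    "customer_record": RetentionRule("customer_record", 7 * 365),
    "chat_message": RetentionRule("chat_message", 90),
    "temporary_file": RetentionRule("temporary_file", 30),
}


def disposition(data_class: str, last_needed: date, today: date) -> str:
    """Decide what the policy says to do with one item of data."""
    rule = POLICY.get(data_class)
    if rule is None:
        # Unclassified data is routed to a person instead of being deleted.
        return "review"
    if today - last_needed > timedelta(days=rule.retain_days):
        return "dispose"
    return "retain"
```

For example, `disposition("chat_message", date(2025, 1, 1), date(2025, 6, 1))` returns `"dispose"` because 151 days exceeds the assumed 90-day period. Sending unknown classes to `"review"` reflects the article's emphasis on classifying data before acting on it.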