2024 - The Year We Need to Focus on Less
- Mark Molyneux, EMEA CTO at Cohesity
- 20.12.2023 01:00 pm #data #analytics
With each new year, we all receive predictions for more of something. In 2024 there will be more AI, more Cloud, more Cyber attacks, and more use of this technology or that. More, more, more… How about next year we focus on less?
Cheap computing and storage, increasing Hybrid Cloud adoption, and exponential growth of AI are leading to data volumes spiraling dangerously out of control. Data volumes are already growing by an average of 50 percent per year in more than half of all companies, and the majority of organizations, including financial organizations, have infrastructure crammed with data, where on average 70 percent of the content is completely unknown.
All that data requires power, and AI needs even more; ChatGPT, for example, uses as much as 10x more power than a standard data search, yet data centre efficiency (PUE) has not improved in line with increased workloads. We know that we live in a time of climate emergency, and yet there are no concerted efforts among enterprises or the IT industry to drive down those volumes of data. Efficiency and management alone do not solve the issue that we are simply storing too much of everything, for too long. If data were paper, we’d be buried under an Everest of it, but it’s out of sight and out of mind.
Financial organizations must start the year with resolutions to go on a data diet, cut the fat, and get compliant. Their first two actions should be:
Consolidate their data on a common platform instead of operating dozens or even hundreds of separate silos. There, this data can be further reduced using standard techniques such as deduplication and compression. Reduction rates of 96 percent are possible as a result.
Use AI to index and classify data according to its content and value for the company. Everything that is without value can be deleted.
Although energy-intensive, AI is proving to be a considerable help in clarifying the content and value of data, so it can automatically identify obsolete, orphaned, and redundant data that can be deleted immediately.
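To make the first action concrete, here is a minimal sketch of how deduplication and compression shrink a pile of redundant copies. It assumes whole-file deduplication keyed by a SHA-256 hash and zlib compression from the Python standard library; the function names and sample data are hypothetical, and commercial platforms typically work on smaller blocks with their own algorithms, so the rate it prints is illustrative only.

```python
import hashlib
import zlib


def dedupe_and_compress(blobs):
    """Store each distinct blob once (keyed by SHA-256), compressed with zlib."""
    store = {}
    for blob in blobs:
        digest = hashlib.sha256(blob).hexdigest()
        if digest not in store:  # identical content is stored only once
            store[digest] = zlib.compress(blob)
    return store


def reduction_rate(blobs, store):
    """Fraction of the original bytes saved after deduplication and compression."""
    original = sum(len(b) for b in blobs)
    stored = sum(len(c) for c in store.values())
    return 1 - stored / original


# Hypothetical sample: ten identical copies of a report plus one unique note.
report = b"quarterly figures, unchanged since 2019\n" * 500
blobs = [report] * 10 + [b"a single unique note"]
store = dedupe_and_compress(blobs)
print(f"{len(blobs)} files -> {len(store)} stored objects, "
      f"{reduction_rate(blobs, store):.1%} smaller")
```

The sample is deliberately repetitive, so it flatters the result; real reduction rates depend heavily on the type of data and on how many copies exist across silos.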
Most companies have their infrastructure crammed with data, and on average they do not know what 70 percent of it contains. In this unstructured dark data, cat videos can be found alongside the menu from the last Christmas party, aged copies of databases, and research results, all mixed with data that must be retained for regulatory and commercial purposes.
This data needs to be cleaned, and not only to reduce the risk of litigation. Anyone who cleans up and disposes of data waste will be able to feed their AI with high-quality content and free up space for new data. To do this, the data must be indexed and classified according to its content and value for the company. AI also plays a key role here, classifying the content accurately and at pace.
2024 should be the year when we don’t end up with more, but take responsibility to reach the end of the year with less. Far less:
Index the data, and enable reporting to provide accurate curation and empower decisions. Everything that is without value can be deleted. Obsolete data, duplicates of systems, orphans, outdated test systems. Say goodbye to things you don’t need.
Reduce data volumes using technology: deduplicate and compress data to eliminate redundant copies and/or lighten certain structures, automatically replacing original data with a thin version. The amount of data can be reduced by up to 97% depending on its type. Getting thin in the new year.
Classify that data: for data owners to be able to make the right decisions, the type, content, and value of the data must be crystal clear. Classify according to your Relevant Records policy. This will allow Defensible Deletion Decisions, the decisions you need to make but have been unable to make for lack of data intelligence. You will keep only what you need to keep, for the prescribed period, and then automatically delete it. This will reduce your mountains of data, and it will also give you strong intelligence when you experience a cyber event and need to know what has been compromised, encrypted, or taken. AI and machine learning can then be truly enabled to defuse complex problems, their LLMs empowered by solid data.
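The "keep only what you need, for the prescribed period, then delete" rule can be sketched in a few lines. Everything here is an assumption made for illustration: the category names, the retention periods (including the seven years for financial records), and the `Record` type are hypothetical, and a real policy would come from legal and compliance teams rather than a hard-coded table.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days; not taken from any real policy.
RETENTION_DAYS = {
    "financial_record": 7 * 365,
    "test_data": 90,
    "no_value": 0,  # classified as worthless: eligible for deletion right away
}


@dataclass
class Record:
    name: str
    category: str
    created: date


def is_deletable(record, today):
    """True once the record has outlived the retention period for its category."""
    days = RETENTION_DAYS.get(record.category)
    if days is None:
        return False  # unknown category: keep until someone classifies it
    return today >= record.created + timedelta(days=days)


def deletion_report(records, today):
    """Return (name, category) pairs so every deletion can be justified later."""
    return [(r.name, r.category) for r in records if is_deletable(r, today)]


records = [
    Record("ledger-2015.csv", "financial_record", date(2015, 3, 1)),
    Record("ledger-2022.csv", "financial_record", date(2022, 3, 1)),
    Record("old-test-db.bak", "test_data", date(2023, 1, 10)),
    Record("cat-video.mp4", "no_value", date(2023, 12, 1)),
    Record("mystery.bin", "not_yet_classified", date(2010, 1, 1)),
]
for name, category in deletion_report(records, today=date(2024, 1, 2)):
    print(f"delete {name} ({category})")
```

Note the design choice for unclassified data: the sketch keeps it rather than deleting it, which is why classification has to come first. Until a record has a category, no deletion decision about it is defensible.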
What each individual can do
Every user can also help reduce overall power consumption and slow down data growth, because everyone can search through their data in the cloud and delete what is useless. This can be multiple versions of the same photo, each from a slightly different perspective, or videos that you once found funny and haven't watched since. That cat video, perhaps.
Every bit we can save through reducing our stored data will reduce energy consumption. So let’s start cleaning up.
Technological innovations such as AI should also be approached as a tool to optimize on-premises and cloud storage, through a better understanding of the data they host. When integrated directly into a data management solution, AI can reduce the amount of data stored, and therefore the energy resources consumed.