What is Data Lineage and How Should it be Leveraged?

  • Philip Dutton, Co-Founder and Co-CEO at Solidatus

  • 17.11.2021 05:30 pm
  • #data

Precious gems and metals derive a substantial part of their value from their scarcity and the huge effort required to obtain them. Data is also a hugely valuable resource, critical to the success of any organization, from start-ups to global corporations to the public sector. But unlike platinum or pink diamonds, data isn’t in short supply.

According to Forbes, the volume of data created, captured, copied and consumed globally soared from 1.2 trillion gigabytes in 2010 to 59 trillion gigabytes in 2020 – a staggering increase of nearly 5,000%. If a business is unable to manage and understand the data that fuels it, mining actionable intelligence from that data will be impossible. But there is a solution: taking a ‘lineage-first’ approach to data management, which can increase efficiency by 60%.

What is data lineage? And why does it matter?

Data lineage encompasses far more than just data: it has to include the people, regulations, policies, applications, processes and controls that the data impacts and interacts with. Its real value lies not only in mapping the technical pathways that data travels through an organization, but in giving the business rapid understanding and insight so that it can be agile and effective in its actions. Next-generation lineage provides all of the above and also captures past, current and future organizational states, enabling accelerated transformation with demonstrable control. In short, it is the entire ecosystem within which data is consumed – the digital blueprint of an organization.

If a house is being built, there are different blueprints that each correspond to a different component of the house, from plumbing to electrics to structure. Each of those is tied together in the overall composition of the house. Data lineage is a similar blueprint, with each individual data set connecting to the others to form a picture of the overall organization.

But the value of data lineage doesn’t stop there. By creating transparency, lineage establishes confidence in data governance and the ability to demonstrate regulatory compliance. It is also essential for robust decision-making, which depends on the right people having access to accurate, high-quality data, whenever they need it. Using lineage in transformational projects that involve many systems across multiple departments accelerates gap analysis to mitigate risk. Additionally, by identifying redundancies, it improves operational efficiency. 


Data lineage is critical for understanding:

Who owns, edits and consumes your data.

What information the data represents, its metadata and its dependencies, and which data is compliance-related.

Why the data is needed and whether there is redundant data in the data ecosystem.

Where poor-quality data originates and where data is stored geographically.

When the data was created, modified or deleted, and whether it is linked to a specific time zone.
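
As a rough illustration, the who, what, why, where and when questions above could be captured as one record per data set. This is only a sketch: the `LineageRecord` class, its field names and the sample values are hypothetical and are not taken from any particular lineage product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class LineageRecord:
    """Hypothetical record answering who/what/why/where/when for one data set."""
    name: str
    owner: str                 # who owns the data
    consumers: List[str]       # who consumes it
    description: str           # what the data represents
    depends_on: List[str]      # what it is derived from
    compliance_related: bool   # what falls within regulatory scope
    purpose: str               # why the data is needed
    storage_region: str        # where it is stored geographically
    created_at: datetime       # when it was created (timezone-aware)


# Illustrative values only.
record = LineageRecord(
    name="customer_balances",
    owner="finance-data-team",
    consumers=["risk-reporting", "regulatory-returns"],
    description="End-of-day balance per customer account",
    depends_on=["core_banking.accounts", "core_banking.transactions"],
    compliance_related=True,
    purpose="Feeds the daily liquidity report",
    storage_region="eu-west",
    created_at=datetime(2021, 11, 17, 17, 30, tzinfo=timezone.utc),
)

print(record.owner, record.storage_region, record.created_at.isoformat())
```

Storing the timestamp as a timezone-aware value is one way to keep the "when" unambiguous across regions.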


Capture, visualize and manage your data

Data lineage enables organizations to create a ‘data landscape’ that dynamically connects and visualizes complex relationships. The ability to comprehend the organization’s entire data ‘estate’ in this way is fundamental for achieving the requisite level of understanding for driving business intelligence and demonstrating how you achieve results. 

Data lineage provides immediate value: it doesn’t require complete coverage at the enterprise level to start providing insights and actions. Starting small makes it possible to demonstrate ROI to business stakeholders quickly and to achieve buy-in for a larger collaborative exercise. The operational efficiency of a data lineage solution is also critical. Functional capability is only one part of the equation for success; the speed and accuracy with which that capability is executed are equally important.

Effective data lineage tools can enable organizations to develop a complete picture of their data landscape, giving them the ability to optimize and transform their organization more rapidly to address changing regulatory and market conditions. 

Subject matter experts, augmented by operationally efficient software, can discover and document lineage to the highest level of detail, reliably and within achievable timescales.

Once captured, the data lineage is rendered into diagrams that encompass all aspects of the data estate. It also connects with relevant contextual information such as regulatory models, business rules, business glossaries or enterprise modelling. To maintain the value of the lineage – and the trust in it – it is vital that any changes to it are reflected in the visualizations. 


What can data lineage do for your organization?

  • Deliver smart visualization of all data and its relationships

  • Optimize data management, including metadata, data quality and governance

  • Track data flow across the organization and over time

  • Build trust in data by identifying any gaps or errors and their impact

  • Accelerate location of required information

  • Empower more users across the enterprise to access, apply and understand data to drive better decisions
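
To make the ideas of tracking data flow and assessing the impact of an error more concrete, lineage can be thought of as a directed graph of data sets. The sketch below assumes a hypothetical `FLOWS` mapping and an illustrative `downstream` helper; real lineage tools will model far more than this.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the data sets listed as its value.
FLOWS = {
    "crm.customers": ["warehouse.customers"],
    "billing.invoices": ["warehouse.invoices"],
    "warehouse.customers": ["reports.churn", "reports.revenue"],
    "warehouse.invoices": ["reports.revenue"],
    "reports.churn": [],
    "reports.revenue": [],
}


def downstream(flows, source):
    """Return every data set reachable from `source`, sorted by name."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for target in flows.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(seen)


# Which data sets are affected if crm.customers contains an error?
impacted = downstream(FLOWS, "crm.customers")
print(impacted)
# ['reports.churn', 'reports.revenue', 'warehouse.customers']
```

A breadth-first walk is used here for simplicity; the same mapping could be reversed to trace a report back to its sources.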


Understanding the value of a lineage-first approach is an important first step towards effective data management, but this doesn’t necessarily mean that it will be implemented properly or optimally. For example, it is not uncommon for organizations to track – or attempt to track – their data lineage by means of spreadsheets. But these typically document only the lineage that is thought to exist, rather than the lineage that actually exists, so the record will always be incomplete and inaccurate.
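
The gap between documented and actual lineage can be illustrated with a simple comparison. The sketch below assumes two hypothetical sets of (source, target) pairs: one transcribed from a spreadsheet, and one observed in practice, for instance extracted from job or query logs. All names are invented for illustration.

```python
# Hypothetical edges recorded by hand in a spreadsheet: (source, target).
documented = {
    ("crm.customers", "warehouse.customers"),
    ("warehouse.customers", "reports.churn"),
    ("legacy.accounts", "reports.churn"),      # feed that was retired
}

# Hypothetical edges actually observed in the running systems.
observed = {
    ("crm.customers", "warehouse.customers"),
    ("warehouse.customers", "reports.churn"),
    ("billing.invoices", "reports.churn"),     # never written down
}

undocumented = sorted(observed - documented)   # exists but is not recorded
stale = sorted(documented - observed)          # recorded but no longer exists

print("Undocumented flows:", undocumented)
print("Stale entries:", stale)
```

Either list being non-empty suggests the spreadsheet no longer reflects how data really moves.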

As the world becomes increasingly digitized, organizations require solutions that can help accelerate their digital transformation. The success or failure of an enterprise can hinge on whether or not its data architecture is optimized. And the only way to demonstrate objectively that this architecture has been implemented correctly and is delivering real business value is through a comprehensive data lineage culture.

