Teradata announced a new “design pattern” approach for data lake deployment that leverages years of experience in big data consulting and optimisation to help clients build and benefit from data lakes. The new design pattern approach is an industry first and positions Teradata at the forefront of helping business users, data scientists and IT professionals establish data lakes that deliver exceptional business value.
Organisations are exploring data lakes to create insight and opportunity from exploding data volumes, yet serious problems entangle their IT teams, including a lack of best practices, a shortage of data scientists, and even confusion over the definition of a data lake. In addition to these challenges, technology choices are multiplying. For example, data lakes are often assumed to be synonymous with Hadoop, which is an excellent choice for many data lake workloads; however, a data lake can also be built on NoSQL, Amazon Simple Storage Service (S3), a relational database (RDBMS), or various combinations thereof. And while technology choices are critical to the outcome, a successful data lake needs a plan. A data lake design pattern is that plan. The design pattern consists of intellectual property based on enterprise-class best practices, combined with products co-developed through a stream of successful customer engagements.
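To make the technology-neutrality point concrete, the sketch below shows one common data lake convention: landing raw records into a date-partitioned “raw zone”. This is a minimal illustration, not a Teradata or Think Big product; the function name and layout are hypothetical. It writes to a local filesystem, but the same key structure applies unchanged on HDFS or as S3 object prefixes.

```python
import json
import tempfile
from datetime import date
from pathlib import Path

def land_raw_records(lake_root, source, records, as_of=None):
    """Write raw records into a date-partitioned raw zone of a data lake.

    Partitioning by source system and ingest date is a widely used
    layout; on HDFS or S3 the same path structure becomes the file
    or object key prefix.
    """
    as_of = as_of or date.today()
    partition = Path(lake_root) / "raw" / source / f"ingest_date={as_of.isoformat()}"
    partition.mkdir(parents=True, exist_ok=True)
    out_file = partition / "part-0000.json"
    with out_file.open("w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")  # newline-delimited JSON
    return out_file

# Example: land two clickstream events in a temporary lake root
root = tempfile.mkdtemp()
path = land_raw_records(root, "clickstream",
                        [{"user": 1, "page": "/home"},
                         {"user": 2, "page": "/cart"}])
print(path.name)  # prints part-0000.json
```

Because the layout, not the storage engine, carries the design, the same pattern ports across the technologies named above; that separation is the kind of decision a design pattern is meant to capture.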
“Teradata has moved ahead of the curve in defining implementation patterns for data lakes,” said Tony Baer, Senior Analyst, OVUM. “A data lake is different from an operational data store. Teradata’s value proposition stems from practical, on-the-ground experience helping customers cope with managing data in heterogeneous environments. With the acquisition of Think Big, Teradata has added valuable IP: design patterns that will help build transparent data lakes.”
By having access to new data such as customer service records, clickstream data, IP traffic, log information, and sensor data stored in a data lake, users can address use cases that generally require multiple, simultaneous interpretations of the data to be tested against each other.
“Who hasn’t heard about data lake implementation nightmares? This is why we’re growing: we are asked to step in and help companies turn around ugly, costly data lake failures,” said Ron Bodkin, president of Think Big, a Teradata company. “We tailor our data lake design pattern approach to each set of circumstances – and these patterns and supporting software frameworks are strong, proven value accelerators. Sadly, so many companies find big data landmines the hard way. We get customers out of crisis mode and help business, IT and data scientists plan, execute and benefit from data lakes that actually generate great business value – as they should and will, when built from experience.”
Think Big has been technology- and platform-neutral since its inception and is focused on producing tangible value through open-source technologies such as Apache™ Hadoop®, Apache Spark™, and NoSQL. A number of Data Lake Design Pattern services are available from Think Big, including: Data Lake Foundation, for teams just getting started with a data lake or seeking best practices consulting; Data Lake Architecture, designed for organisations looking for recommendations on data lake best practices and technology choices; and Data Lake Analytics, which supports data preparation for the execution of analytics cycles.
Think Big has helped many industry leaders and innovators establish data lakes and engineer Hadoop/big data implementations, including engagements at: HGST, a Western Digital Company; one of the world’s largest financial services providers; a leading maker of semiconductors; a top computer storage and data management company; a renowned maker of athletic wear; and a well-known global producer of soft drinks.
Teradata also offers a variety of products and technologies enhanced for use with data lake environments. These include Teradata Listener, an intelligent, self-service software solution that simplifies streaming big data into the data lake; Teradata Appliance for Hadoop, a low-cost choice for storing data; Presto, which provides a modern SQL-on-Hadoop architecture; and data lake accelerators built from IP, referred to as Pipeline Controller and Buffer Server, which together orchestrate efficient data movement from local servers into Hadoop.