A global leader in data-driven customer experience management amps up its data quality with Telmai
With Telmai and Snowflake, this Global Leader in Data-Driven Customer Experience Management builds a modern data stack that scales to 100,000,000 data points a day
With data being the backbone of their products and services, high data quality standards are critical to their clients’ success and retention. With Telmai, they can build a strong data quality foundation in their newly modernized data stack.
Key takeaways include:
- Savings from a buy vs. build decision
- Using ML/AI in Telmai to detect unknown issues in 3rd party data
- High trust in data quality
Overview
Our customer is a global leader in data-driven customer experience management (CXM) that specializes in the delivery of unique, personalized customer experiences across platforms and devices. Their managed analytics help retailers, brands, and market research firms transform big data into valuable insights.
Key Benefits
Pricing intelligence: delivering real-time insights and pricing recommendations to retailers by monitoring competitive prices, consumer demands, and availability of in-stock items of competitor products.
Category intelligence: blending demand data such as search volumes, consumer reviews, ratings, and social signals with clients’ internal product data to predict the demand for a given product category.
Assortment intelligence: insights about what products to keep, carry and drop, based on real time data collection and analysis of demand for a retailer’s assortment of product categories, gaps, and opportunities to help differentiate their offerings and decide whether to carry a certain inventory or pick fast-moving categories with higher demand.
The Challenge: 100M data points crawled daily from the web and 3rd party sources raised the stakes for a centralized and scalable data quality solution
With data being the backbone and core of their services and products, high standards of data quality have been critical to their success and customer retention.
Today they bring data together from websites such as Amazon, Walmart, and various brands, pulling products, categories, and pricing information from these sites along with clickstream data and user interactions with these marketplaces.
With a high number of retailers, sellers, and merchants selling the same or similar competing products, the types, frequency, and volume of data have increased dramatically in recent years. Today their data acquisition team receives about 100M data points on a daily basis from different components across distributed systems and stores everything in Hadoop and HBase.
With this much data coming at such a rapid pace, they built their data quality frameworks incrementally and across multiple teams and disciplines. This includes data quality checks and balances in their:
- Data acquisition layer: the web crawling and data ingestion layer
- Decision support systems: client-facing reports and analytics
- AI/ML platform JARVIS: an award-winning proprietary computer vision model built to automate client product listings and categorization.
As data volumes grew much larger and the data collected from websites became less predictable, they wanted a more scalable, AI/ML-enabled solution that did not require prior knowledge of data quality issues. This was critical because it would ensure that new data ingestion does not impact or break downstream workflows.
With this goal in mind, the VP of Cloud and Data Engineering and his team started looking for a solution that could replace their existing rules-based data quality engine.
The move to Snowflake became the catalyst for a complete modern data stack with data observability at its center
With a rule-based data quality engine established over the last 7-8 years and now maintained by a team of 5 data engineers, their move to an AI/ML-based solution was grounded in a vision to use advanced technology to detect data quality issues without having to code new rules for each new type of quality problem.
The catalyst for this change was their own data infrastructure migration. With Hadoop and HBase aging in the background, the VP of Cloud and Data Engineering and his team selected Snowflake as their new cloud data warehouse solution.
With this complete re-architecture, they decided to re-establish their data quality framework with Telmai as a lightweight solution. Because all layers of the data pipelines were being rebuilt, it was a great time to rethink the architecture and apply more advanced techniques to data quality.
In this new architecture, data acquisition is entirely streaming, not batch. Data is extracted and enriched from Kafka and placed into Snowflake staging tables. Telmai is embedded in the workflow to monitor the incoming data in Snowflake staging tables.
“We chose Telmai because of its flexibility and architecture. Telmai will remain our data observability platform as our data landscape changes. Today we are moving from Hadoop to Snowflake, tomorrow we might change to another system. With Telmai, we don’t need to recreate our data quality metrics and validation framework every time we change our stack.”
– VP of Cloud and Data Engineering
Engineering and business teams alike use Telmai to monitor their data
Previously, their data quality framework could only catch the types of data quality problems they had seen before and written tests for. For example, one rule validated whether a field contains an image by checking that the value starts with http or https and ends with a JPEG suffix or another file extension. It was common for novel issues to slip through, requiring business teams to do their own data sampling and spot checks and to raise errors with the engineering team when new issues were found.
In the new architecture, Telmai will be used by both business teams and engineering teams. Telmai’s no-code, low-code interface allows both groups to leverage data observability in ways that work for them.
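A rule of the kind described above might look like the following minimal Python sketch. The function name and the list of accepted extensions are assumptions made for illustration; the customer's actual rule engine is not shown in this case study.

```python
from urllib.parse import urlparse

# Hypothetical extension list: the text only names JPEG "or other file extensions".
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def looks_like_image_url(value: str) -> bool:
    """Return True if value is an http(s) URL whose path ends in a known image extension."""
    parsed = urlparse(value.strip())
    # urlparse lowercases the scheme, so "HTTP://" is accepted as well.
    if parsed.scheme not in ("http", "https"):
        return False
    # Check the path only, so a query string such as "?size=large" is ignored.
    return parsed.path.lower().endswith(IMAGE_EXTENSIONS)


if __name__ == "__main__":
    samples = [
        "https://example.com/img/shoe.jpg",
        "ftp://example.com/img/shoe.jpg",
        "https://cdn.example.com/img/12345",  # valid image link, but no extension
    ]
    for sample in samples:
        print(sample, looks_like_image_url(sample))
```

The last sample shows the limitation: an extensionless image link from a new source fails the rule even though nothing is wrong with the data, and someone has to notice and write a new rule.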
For data engineering, Telmai’s ML/AI algorithms will catch the new patterns in data without the developer having to think through potential changes or new patterns. This makes the system much easier to manage and catches data quality issues the first time they happen before they are passed along to business users.
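To illustrate the general idea of catching new patterns without hand-written rules, here is a deliberately simple Python sketch that learns the "shapes" of historical values and flags values whose shape has not been seen before. This is not Telmai's algorithm, which this case study does not describe; the function names, the shape encoding, and the `min_share` threshold are all assumptions made for illustration.

```python
import re
from collections import Counter


def shape(value: str) -> str:
    """Collapse a value into a coarse pattern: letter runs -> 'a', digit runs -> '9'."""
    collapsed = re.sub(r"[A-Za-z]+", "a", value)
    return re.sub(r"[0-9]+", "9", collapsed)


def learn_shapes(history, min_share=0.01):
    """Return the set of shapes that make up at least min_share of the historical values."""
    counts = Counter(shape(v) for v in history)
    total = sum(counts.values())
    return {s for s, c in counts.items() if c / total >= min_share}


def unseen_values(batch, known_shapes):
    """Return the values in batch whose shape was not learned from history."""
    return [v for v in batch if shape(v) not in known_shapes]


if __name__ == "__main__":
    # Hypothetical price strings; real inputs would come from the staging tables.
    known = learn_shapes(["$12.99", "$7.50", "$103.00"])
    print(unseen_values(["$15.25", "USD 15.25", "N/A"], known))
```

No rule about currency codes or missing-value markers was written here; the unfamiliar values stand out only because they differ from what the historical data looked like.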
With Telmai, engineering catches more issues upstream, so the business can now spend less time catching and debugging data quality issues and more time driving business decisions. They also save time and improve collaboration by using Telmai rather than Excel to review and sign off on data quality.
In total, Telmai is used by 80-100 software engineers, architects, analysts, and data scientists, and by about 20-30 people across business teams.
“We wanted to validate the data before it gets into our Snowflake production environment. Once the data gets into the analytics and decision support systems, we have to chase the data down the chain. Telmai catches quality issues early in the pipeline, empowering our business teams to make decisions faster and with more ease.”
– AVP of Cloud and Data Engineering
The Results: A buy vs. build decision to accelerate the path to trusted data
Telmai’s centralized platform, used by both their engineering and business teams, has created data observability across data acquisition, data processing, data warehousing, and decision making. Although building a system was possible for them, it was not their main focus or priority. The team decided to invest in Telmai to quickly get up and running with data quality on their new modern data stack.
With Telmai, they have been able to achieve:
– Higher productivity of engineering resources:
- To recreate in Snowflake the same rules-based data quality framework previously built in Hadoop, they had estimated the engineering and test resources required.
- While 5 engineers maintain a well-oiled machine today, they estimated that maintaining a newly built data quality environment would require additional engineers for the first year, which would add to the total cost of ownership (TCO).
– Greater operational efficiency:
- Today the cycle of incident report, fix, and revalidation between business and engineering happens a number of times a month, which costs additional resources.
- With Telmai, the team anticipates a reduction in these costs, because far fewer data quality issues are missed thanks to AI/ML auto-detection, and business teams are also able to use Telmai in a self-service manner.
Additionally, Telmai data observability provides more confidence and better trust in the data that serves their business use cases, such as pricing intelligence, category intelligence, and assortment intelligence.
“During the pilot with Telmai, we were impressed with the no-code, low-code, and visual interface that provided the right insights into our data over time. Moreover, we had excellent consulting support and fast turnaround times for the features that were requested. This helped us with the decision to go ahead and finalize Telmai as our centralized data observability solution.”
– VP of Cloud and Data Engineering
Why Telmai
They considered many options but ultimately selected Telmai for the following reasons:
- Number of integrations with data sources and technologies.
- Advanced architecture with a separate metrics calculation layer that does not require a rebuild if the data infrastructure or pipeline changes.
- Use of ML/AI to automatically understand data patterns, especially for data coming from external and 3rd party sources.
- Visual investigator that catches all data patterns without prior knowledge about the data.
- Ability to identify data quality issues in semi-structured data sources.
- Data drift and anomaly detection on 100M daily records.
- API integration to automate remediation and trigger corrective actions.
- The support and fast turnaround of requests by the Telmai team.
“We wanted to replace our rule-based system and leverage the power of ML/AI to automatically understand the data and its patterns and anomalies. Given that our data is coming from a variety of 3rd party sources that have a lot of unknowns, it is very hard to create and maintain all possible validation rules. With Telmai we are able to catch data patterns automatically and decide where to take an action.”
– Manager of Cloud and Data Engineering
See what’s possible with Telmai
Request a demo to see the full power of Telmai’s data observability tool for yourself.