Big Data Analytics Tracking Platform Processes Nearly 100 Million Events per Hour

Case Study

Business Objective

Our client is one of the world’s largest online payment processing companies with annual revenue of USD 10 billion.

The client wanted to build a robust, highly scalable application to process and store incoming click events in real time.

Challenges

  • Architectural challenges in combining stream and batch events to ensure resiliency of the input data
  • Extracting, transforming, and merging unstructured events from multiple sources
  • High volume of data (~15 TB daily)

Solution Methodology 

  • Used open-source technologies such as Apache Kafka and Hadoop HDFS for data ingestion from multiple sources
  • Processed the incoming data with the Apache Spark computing framework
  • Stored the processed data in an Apache Hive warehouse for querying and analysis
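
The three steps above correspond roughly to a Spark Structured Streaming job that reads from Kafka and writes to a Hive-backed warehouse location. This is a minimal sketch under stated assumptions: the topic name, paths, trigger interval, and event schema are hypothetical, as the case study names only the technologies, not the configuration. The `pyspark` imports are deferred into the function so the sketch can be read and unit-tested without a Spark installation:

```python
def build_pipeline(kafka_servers: str, checkpoint_dir: str):
    """Sketch of the Kafka -> Spark -> Hive streaming path."""
    # Deferred imports: requires a pyspark installation at call time.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, LongType

    spark = (SparkSession.builder
             .appName("click-event-pipeline")
             .enableHiveSupport()  # lets Spark talk to the Hive metastore
             .getOrCreate())

    # Assumed unified click-event schema.
    schema = StructType([
        StructField("event_id", StringType()),
        StructField("user_id", StringType()),
        StructField("url", StringType()),
        StructField("ts", LongType()),
    ])

    # Ingest: subscribe to a (hypothetical) Kafka topic and parse JSON payloads.
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", kafka_servers)
              .option("subscribe", "click-events")  # topic name is assumed
              .load()
              .select(F.from_json(F.col("value").cast("string"),
                                  schema).alias("e"))
              .select("e.*"))

    # Store: write Parquet files under a warehouse path (assumed) that a
    # Hive external table can query. Checkpointing provides the fault
    # tolerance: on restart, Spark resumes from the last committed offsets.
    return (events.writeStream
            .format("parquet")
            .option("path", "/warehouse/click_events")
            .option("checkpointLocation", checkpoint_dir)
            .trigger(processingTime="1 minute")
            .start())
```

In a real deployment the sink would typically be an external Hive table over the Parquet path, so analysts can query the landed data with HiveQL while the stream keeps appending.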

Business Impact

  • Processed a day's worth of data in 15 minutes, at a rate of 100 million real-time events per hour
  • Brought together data from several sources, creating a single source of truth
  • Enabled fault tolerance in the new streaming pipeline via checkpointing

©2020 Tiger Analytics. All rights reserved.
