Putting Apache Kafka to Use: A practical approach to get kick-started with Apache Kafka and build huge real-time data streaming pipelines by Himani Arora, Prabhat Kashyap
English | February 13, 2019 | ISBN: N/A | ASIN: B07NQJZKNJ | 81 pages | AZW3 | 0.35 Mb
Putting Apache Kafka to Use is a hands-on book for getting kick-started with Apache Kafka and using it to build large real-time data pipelines for your projects. The book is designed to familiarize readers with all the fundamental concepts of Apache Kafka and to guide them through working with it from scratch, with no prior knowledge assumed.
If you are unfamiliar with Apache Kafka, it is a distributed streaming platform that lets you publish and subscribe to streams of records in a scalable, fault-tolerant way. It is primarily used to build real-time streaming data pipelines that reliably move data between applications, and real-time streaming applications that process and transform records as they occur.
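To make the publish/subscribe model concrete, here is a toy in-memory sketch of the core abstraction Kafka is built on: a topic as a set of append-only partition logs, with each consumer tracking its own read offsets. This is purely illustrative (the `Topic` and `Consumer` classes below are made up for this sketch, not the real Kafka client API), but it shows why many consumers can read the same stream independently without interfering with one another.

```python
class Topic:
    """Toy model of a Kafka topic: one append-only log per partition."""

    def __init__(self, partitions=1):
        self.partitions = [[] for _ in range(partitions)]

    def publish(self, key, value):
        # Like Kafka, route records with the same key to the same partition,
        # so per-key ordering is preserved.
        p = hash(key) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)


class Consumer:
    """Each consumer keeps its own offset per partition, so any number of
    consumers can subscribe to the same topic and read at their own pace."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = [0] * len(topic.partitions)

    def poll(self):
        # Return every record published since the last poll, then advance.
        records = []
        for p, log in enumerate(self.topic.partitions):
            records.extend(log[self.offsets[p]:])
            self.offsets[p] = len(log)
        return records


clicks = Topic(partitions=2)
clicks.publish("user-1", "page_view")
clicks.publish("user-2", "click")

consumer = Consumer(clicks)
print(consumer.poll())  # both records, in partition order
print(consumer.poll())  # empty: offsets have advanced past the log end
```

The real Kafka adds the parts that matter in production, which this sketch deliberately omits: partitions replicated across brokers for fault tolerance, records persisted to disk, and consumer groups that share partitions among their members.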
Kafka was developed at LinkedIn around 2010 to solve the problem of low-latency ingestion of large amounts of event data from the LinkedIn website and infrastructure into a lambda architecture that harnessed Hadoop and real-time event processing systems. At the time, there were no existing solutions for this type of real-time processing in applications.
Kafka was built to be the ingestion backbone for this type of use case. Back in 2011, Kafka was already ingesting more than 1 billion events a day; a few years later, LinkedIn reported an ingestion rate of 1 trillion messages a day.
So naturally you are excited about using Apache Kafka and you would love to join the party!
Perhaps you would like to ingest data into your applications in real time? Process and transform your records on the fly?
Or maybe your company has large volumes of data that various applications need for log aggregation, analysis, and so on, and you want a single pipeline to do the whole job rather than building a separate pipeline for each use case.
What will you learn:
• What Apache Kafka is and its advantages
• Producing and Consuming Data from Apache Kafka
• Basic operations performed on an Apache Kafka cluster
• Integrating systems using Kafka Connect
• Stream Processing using Apache Kafka