
LiveLessons - Data Engineering Foundations Part 2: Building Data Pipelines with Kafka and NiFi

Posted By: lucky_aut

Duration: 4h 28m | .MP4 1280x720, 30 fps(r) | AAC, 48000 Hz, 2ch | 988 MB
Genre: eLearning | Language: English

Data Engineering Foundations Part 2: Building Data Pipelines with Kafka and NiFi provides over four hours of video that introduce you to creating data pipelines at scale with Kafka and NiFi. You learn to work with the Kafka message broker and discover how to establish a NiFi dataflow. You also learn about data movement and storage. All software used in the videos is open source and freely available for your use and experimentation on the included virtual machine.

About the Instructor

Doug Eadline, PhD, began his career as a practitioner and a chronicler of the Linux Cluster HPC revolution and now documents big data analytics. Starting with the first Beowulf How To document, Dr. Eadline has written hundreds of articles, white papers, and instructional documents covering virtually all aspects of HPC. Prior to starting and editing the popular ClusterMonkey.net website in 2005, he served as editor-in-chief for ClusterWorld Magazine and was Senior HPC Editor for Linux Magazine. Currently, he is a consultant to the HPC industry and writes a monthly column in HPC Admin Magazine. He has practical hands-on experience in many aspects of HPC, including hardware and software design, benchmarking, storage, GPU, cloud, and parallel computing. He is the co-author of the Apache Hadoop YARN book and author of Hadoop Fundamentals LiveLessons and Apache Hadoop YARN LiveLessons.

Skill Level

Beginner
Intermediate

Learn How To

Understand Kafka topics, brokers, and partitions
Implement basic Kafka usage modes
Use Kafka producers and consumers with Python
Utilize the KafkaEsque graphical user interface
Understand the core concepts of NiFi
Understand NiFi flow and web UI components
Understand direct data movement with HDFS
Use HBase with the Python HappyBase library
Use Sqoop for database movement

Who Should Take This Course

Users, developers, and administrators interested in learning the fundamental aspects and operations of data engineering and scalable systems

Course Requirements

Basic understanding of programming and development
A working knowledge of Linux systems and tools
Familiarity with Python

Lesson Descriptions

Lesson 7: Working with the Kafka Message Broker

In Lesson 7, Doug introduces the Kafka message broker concept and describes the producer-consumer model that enables input data to be reliably decoupled from output requests. Kafka producers and consumers are developed using Python, and internal broker operations are displayed using the KafkaEsque graphical user interface.
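
As a rough illustration of the producer-consumer decoupling described for Lesson 7, the sketch below models a single-partition topic as an append-only log with one read offset per consumer group. It is not taken from the course and does not use any Kafka client library; the TopicLog class and its produce and consume methods are hypothetical names chosen for this example.

```python
from collections import defaultdict


class TopicLog:
    """Hypothetical in-memory stand-in for a single-partition topic (not the Kafka API)."""

    def __init__(self):
        self._messages = []               # append-only message log
        self._offsets = defaultdict(int)  # next offset to read, per consumer group

    def produce(self, message):
        """Append a message and return the offset it was stored at."""
        self._messages.append(message)
        return len(self._messages) - 1

    def consume(self, group, max_messages=10):
        """Return unread messages for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._messages[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch


# The producer writes without knowing who will read, or when.
log = TopicLog()
for reading in ("temp=20", "temp=21", "temp=22"):
    log.produce(reading)

# Each consumer group reads at its own pace from its own offset.
first_batch = log.consume("dashboard", max_messages=2)
late_batch = log.consume("archiver")  # a separate group starts from offset 0
rest = log.consume("dashboard")
```

Because the log keeps the messages and each group tracks only its own position, a slow or late consumer does not block the producer or other consumers. A real broker adds persistence, partitioning, and replication on top of this idea.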

Lesson 8: Working with NiFi Dataflow

Lesson 8 begins with a description of NiFi flow-based programming and then provides several examples that write pipeline data first to the local file system, then to the Hadoop Distributed File System, and finally to Hadoop Hive tables. The entire flow process is constructed using the NiFi web graphical user interface. The creation of portable flow templates for all examples is also presented.
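
The flow-based programming idea behind Lesson 8 can be sketched conceptually in plain Python: small, independent processors are connected so that records stream from one to the next. This is only an analogy and not NiFi code, since NiFi flows are assembled in its web interface. The function names and the record layout (content plus an attributes dictionary, loosely modeled on a NiFi FlowFile) are assumptions made for this example.

```python
def generate_records(lines):
    """Source processor: wrap each input line in a record with empty attributes."""
    for line in lines:
        yield {"content": line, "attributes": {}}


def drop_empty(records):
    """Filter processor: pass along only records whose content is not blank."""
    for record in records:
        if record["content"].strip():
            yield record


def tag_length(records):
    """Enrichment processor: add the content length as an attribute."""
    for record in records:
        record["attributes"]["length"] = len(record["content"])
        yield record


def collect(records):
    """Sink processor: gather the records into a list (a stand-in for a file or table)."""
    return list(records)


# Connect the processors into one flow; records stream through one at a time.
raw_lines = ["alpha", "", "beta gamma"]
flow_output = collect(tag_length(drop_empty(generate_records(raw_lines))))
```

Each processor knows nothing about its neighbors beyond the record format, so steps can be reordered or swapped, for example replacing the sink with one that writes to a different destination.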

Lesson 9: Big Data Movement and Storage

Lesson 9 provides you with several methods for moving data to and from the Hadoop Distributed File System. Hands-on examples include direct web downloads and using the Python Pydoop package to move data. Basic data movement between Apache HBase, Hive, and Spark using the Python HappyBase library and Hive-SQL is also presented. Finally, movement of relational data to and from the Hadoop Distributed File System is demonstrated using Apache Sqoop.