    Hands-On With Hadoop 2: 3-In-1

    Posted By: ELK1nG
    Last updated 11/2020
    MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
    Language: English | Size: 5.37 GB | Duration: 10h 56m

    Run your own Hadoop clusters on your own machine or in the cloud

    What you'll learn

    Understand the Hadoop 2.x Architecture

    Create MapReduce jobs

    Plan, install and configure core Hadoop services on a Cluster

    Validate the Cluster using HDFS, MapReduce and Spark

    Understand Cluster Life-Cycle and Performance tuning of a Hadoop Cluster

    Hands-on solutions to your perplexing, real-world big data problems

    Requirements

    Good knowledge of Java

    Description

    Hadoop is one of the most popular, reliable, and scalable distributed computing and storage frameworks for Big Data solutions. It comprises components designed to run tasks at a distributed scale, across multiple servers and thousands of machines.
    This comprehensive 3-in-1 training course gives you a strong foundation by exploring the Hadoop ecosystem with real-world examples. You’ll discover how to set up an HDFS cluster, format it, and transfer data between your local storage and the Hadoop filesystem. You'll also get hands-on solutions to 10 real-world use cases using Hadoop.

    Contents and Overview

    This training program includes 3 complete courses, carefully chosen to give you the most comprehensive training possible.

    The first course, Getting Started with Hadoop 2.x, opens with an introduction to the world of Hadoop, where you will learn about nodes, data sets, and operations such as map and reduce. The second section deals with HDFS, Hadoop's file system used to store data. Further on, you’ll discover the differences between jobs and tasks, and get to know the Hadoop UI. After this, we turn our attention to storing data in HDFS and data transformations. Lastly, we will learn how to implement an algorithm the Hadoop MapReduce way and analyze its overall performance.


    The second course, Hadoop Administration and Cluster Management, starts by installing Apache Hadoop on a cluster and configuring the required services. You will learn various cluster operations such as validation and expanding and shrinking Hadoop services. You will then move on to gain a better understanding of administrative tasks like planning your cluster, monitoring, logging, security, troubleshooting and best practices. Techniques to keep your Hadoop clusters highly available and reliable are also covered in this course.


    The third course, Solving 10 Hadoop'able Problems, covers the core parts of the Hadoop ecosystem, giving you a broad understanding and getting you up and running fast. Next, it describes a number of common problems that Hadoop is able to solve, presented as case-study projects. The course is broken down into sections by project, each serving as a specific use case for solving big data problems.


    By the end of this Learning Path, you’ll be able to plan, deploy, manage, monitor, and performance-tune your Hadoop Cluster with Apache Hadoop.


    About the Author
    A K M Zahiduzzaman is a software engineer with NewsCred Dhaka. He is a software developer and technology enthusiast. He was a Ruby on Rails developer, but now works with NodeJS, AngularJS, and Python. He is also working with a much wider vision as a technology company. The next goal is introducing SOA within the current applications to scale development via microservices. Zahiduzzaman has a lot of experience with Spark and is passionate about it. He is also a guitarist and has a band too. He was also a speaker at an international event in Dhaka. He is very enthusiastic and loves to share his knowledge.


    Gurmukh Singh is a technology professional with 14+ years of industry experience in infrastructure design, distributed systems, performance optimization, and networks. He has worked in the big data domain for the last 5 years and provides consultancy and training on various technologies. He has worked with companies such as HP, JP Morgan, and Yahoo and has authored the book Monitoring Hadoop.


    Tomasz Lelek is a Software Engineer and Co-Founder of InitLearn. He mostly does programming in Java and Scala. He dedicates his time and effort to getting better at everything. He is currently delving into big data technologies. Tomasz is very passionate about everything associated with software development. He has been a speaker at a few conferences in Poland (Confitura and JDD) and at the Krakow Scala User Group. He has also conducted a live coding session at Geecon Conference. He was also a speaker at an international event in Dhaka. He is very enthusiastic and loves to share his knowledge.



    Overview

    Section 1: Getting Started with Hadoop 2.x

    Lecture 1 The Course Overview

    Lecture 2 Installing Hadoop in Local

    Lecture 3 Bring Process to Data

    Lecture 4 NameNode Versus DataNode

    Lecture 5 Map and Reduce Operations

    Lecture 6 Order of Execution and Parallel Thinking

    Lecture 7 Formatting a HDFS

    Lecture 8 Formatting a HDFS

    Lecture 9 Some Helpful Commands to Communicate with the HDFS

    Lecture 10 HDFS Protocol and Using It in Applications

    Lecture 11 Hadoop Jobs Versus Tasks

    Lecture 12 The Hadoop UI for Task Progress

    Lecture 13 Running a Couple of Example Jobs

    Lecture 14 Analyze the Work Flow/Data Flow/Process Flow

    Lecture 15 Introduction to the Movie Dataset

    Lecture 16 Data Transformation and Storing to HDFS

    Lecture 17 Devise a Simple Algorithm for Recommendation

    Lecture 18 Implement the Algorithm in Hadoop Map-Reduce Way and Analyze Performance

    Section 2: Hadoop Administration and Cluster Management

    Lecture 19 The Course Overview

    Lecture 20 Navigation of GitBash

    Lecture 21 Navigation of Vagrant

    Lecture 22 Navigation of VirtualBox

    Lecture 23 Planning a Single Node Setup

    Lecture 24 Install Apache Hadoop

    Lecture 25 Apache Hadoop Overview

    Lecture 26 Hadoop Distributed File System (HDFS)

    Lecture 27 YARN Overview

    Lecture 28 MapReduce

    Lecture 29 Planning Hadoop Services Placement

    Lecture 30 Planning ZooKeeper Placement

    Lecture 31 Planning HDFS Service Placement

    Lecture 32 Planning YARN

    Lecture 33 Planning Spark Services

    Lecture 34 HDFS Concepts

    Lecture 35 HDFS Data Movement

    Lecture 36 HDFS Admin Commands

    Lecture 37 MapReduce Jobs

    Lecture 38 Spark Jobs

    Lecture 39 Start/Stop Services

    Lecture 40 Manage Cluster Using Ambari

    Lecture 41 Hadoop Upgrade

    Lecture 42 Scaling Cluster – Part 1

    Lecture 43 Scaling Cluster – Part 2

    Lecture 44 HDFS Masters

    Lecture 45 HA Configuration

    Lecture 46 YARN Masters

    Lecture 47 Linux ACLs

    Lecture 48 HDFS ACLs Security – Part 1

    Lecture 49 HDFS ACLs Security – Part 2

    Lecture 50 Hadoop Users and Groups

    Lecture 51 NameNode UI

    Lecture 52 Apache Hadoop Auditing

    Lecture 53 Hadoop Metrics

    Lecture 54 Hadoop Logs and Monitoring

    Lecture 55 Hadoop Troubleshooting – Part 1

    Lecture 56 Hadoop Troubleshooting – Part 2

    Section 3: Solving 10 Hadoop'able Problems

    Lecture 57 The Course Overview

    Lecture 58 Hadoop Distributed File System (HDFS)

    Lecture 59 Distributed Compute Capability YARN

    Lecture 60 Apache Hive for ETL and SQL Like

    Lecture 61 Message Queuing and Data Ingestion Kafka

    Lecture 62 NoSQL Datastores – Hadoop HBase, Accumulo

    Lecture 63 Machine Learning – Spark and Spark MLlib

    Lecture 64 Stream Processing – Spark Streaming

    Lecture 65 Processing Payment Data from an Event Stream

    Lecture 66 Advanced Aggregations Using Streaming API – PaymentAnalyzer

    Lecture 67 Storing Time Series Data in HBase

    Lecture 68 Detecting BOT Traffic Using Spark Streaming

    Lecture 69 Make Web Log Data Queryable – Hive Sink

    Lecture 70 Investigating Customers Data in Hive

    Lecture 71 Trending Supply Chain – Finding Top Seller Item in a Streaming Way

    Lecture 72 Enriching Top Sellers with Additional Information

    Lecture 73 Analyzing Customer Churn (Quantitative) Using DataFrame Queries

    Lecture 74 Analyzing Customer Churn (Amounts) Using DataFrame Queries

    Lecture 75 Storing Low Granularity Structured Sensor Data in HBase

    Lecture 76 Consuming Sensor Data Stored in HBase – Scan and Count

    Lecture 77 Building Summaries on Data Streaming from Devices

    Lecture 78 Introducing Spark GraphX – How to Represent a Graph?

    Lecture 79 Perform Graph Operations Using GraphX

    Lecture 80 Counting Degree of Vertices

    Lecture 81 Neighborhood Aggregations – Collecting Neighbors

    Lecture 82 Structural Operators – Connected Components

    Lecture 83 Page Rank Using Spark GraphX

    Lecture 84 Anomaly Detection

    Lecture 85 Analyzing Web Logs for Suspicious Activity and Loading into Spark

    Lecture 86 Implementing Clustering – Choosing Number of Clusters

    Lecture 87 Detecting Anomalies in Network Traffic

    Lecture 88 Analyzing Post for an Author

    Lecture 89 Extracting Information from Unstructured Text

    Lecture 90 Extracting Information Via Spark DataFrame

    Lecture 91 Sentiment Analysis of Posts Using Logistic Regression

    Lecture 92 Finding an Author of a Post

    Lecture 93 Downloading and Setting Cloudera Sandbox

    Lecture 94 Finding What Products Users Wants to Buy Using Cloudera Sandbox Toolkit

    Lecture 95 Using Movies History to Suggest Interesting Content

    Lecture 96 Testing and Experimenting with Recommendation Engine

    This course is perfect for budding data scientists and data analysts who have a firm understanding of Java and want to get started with Hadoop.