    Big Data Processing And Machine Learning With Apache Spark

    Posted By: ELK1nG
    Last updated 4/2019
    MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
    Language: English | Size: 4.36 GB | Duration: 8h 54m

    Leverage the power of Apache Spark to perform data processing, analytics, and machine learning on your data in real time

    What you'll learn

    Query your structured data using Spark SQL and work with the Dataset API

    Uncover what RDDs (Resilient Distributed Datasets) are and how to perform operations on them

    Train machine learning models with streaming data, and use them for making real-time predictions

    Implement high-velocity streaming and data processing use cases with the streaming API

    Dive into MLlib, the machine learning library in Spark, and its highly scalable algorithms

    See analytical use case implementations using MLlib, GraphX, and Spark Streaming

    Examine a number of real-world use cases with hands-on projects

    Build Hadoop and Apache Spark jobs that process data quickly and effectively

    Requirements

    Knowledge of Python programming is assumed but prior experience of working with Apache Spark is not required.

    Description

    Apache Spark is highly configurable and is rapidly gaining popularity in the big data market because its in-memory data processing makes it a high-speed data processing engine. It also has well-built libraries for machine learning and graph analytics algorithms. This makes Apache Spark a good fit for scalable machine learning problems and for high-velocity, real-time streaming data. If you want to get the most out of this popular big data framework for all your data processing and machine learning needs, then this course is for you.

    This course focuses on performing data streaming, data analytics, and machine learning with Apache Spark. You will learn to load data from a variety of structured sources such as JSON, Hive, and Parquet using Spark SQL and schema RDDs. You will also build streaming applications and learn best practices for managing high-velocity streaming and external data sources. Next, you will explore the Spark machine learning libraries and GraphX, where you will perform graph processing and analysis. Finally, you will build projects that help you put what you have learned into practice and gain a firm grasp of the topic.

    Contents and Overview

    This training program includes 4 complete courses, carefully chosen to give you the most comprehensive training possible.

    The first course, Apache Spark in 7 Days, is designed to give you a fundamental understanding of and hands-on experience in writing basic code as well as running applications on a Spark cluster. You will work on interesting examples and assignments that will demonstrate and help you understand basic operations, querying, machine learning, and streaming.

    In the second course, Big Data Processing using Apache Spark, you will learn how to leverage Apache Spark to process big data quickly. You will learn the basics of the Spark API and its architecture in detail. You will then learn about data mining and data cleaning, where you will understand the input data structure and how input data is loaded. You will also write actual jobs that analyze data.

    The third course, Big Data Analytics Projects with Apache Spark, contains various projects built on real-world examples. The first project is to find top-selling products for an e-commerce business by efficiently joining data sets in Spark. Next, a Market Basket Analysis will help you identify items likely to be purchased together and find correlations between items in a set of transactions. Moving on, you will learn about probabilistic logistic regression by finding an author for a post. Next, you will build a content-based recommendation system for movies and train a model to predict whether an action will happen. Finally, you will use a MapReduce-style Spark program to calculate mutual friends on a social network.

    In the fourth course, Hands-On Machine Learning with Scala and Spark, you will go through day-to-day challenges that programmers face while implementing ML pipelines and consider different approaches and models to solve complex problems. You will learn about the most effective machine learning techniques and apply them to your advantage. You will also implement algorithms in practical hands-on projects, where you will build data models and understand how they work by using different types of algorithms.

    By the end of this course, you will be able to process large datasets, extract features from them, and apply a machine learning model that is well suited to your problem.

    Meet Your Expert(s):

    We have the best work of the following esteemed author(s) to ensure that your learning journey is smooth:

    Karen Yang has been a passionate self-learner in computer science for over 6 years. She has programming, big data processing, and engineering experience. Her recent interests include cloud computing. She previously taught for 5 years in a college evening adult program.

    Tomasz Lelek is a Software Engineer and Co-Founder of InitLearn. He mostly does programming in Java and Scala. He dedicates his time and effort to getting better at everything. He is currently diving into Big Data technologies. Tomasz is very passionate about everything associated with software development. He has been a speaker at a few conferences in Poland (Confitura and JDD) and at the Krakow Scala User Group. He has also conducted a live coding session at the Geecon Conference. He was also a speaker at an international event in Dhaka. He is very enthusiastic and loves to share his knowledge.
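
    The Market Basket Analysis project mentioned in the description counts how often items appear together in the same transaction. As a rough illustration only, here is a minimal plain-Python sketch of that map-and-reduce idea; it is not the course's code and does not use Spark. The transaction data and the function names (map_pairs, count_pairs, support) are hypothetical and were made up for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented purely for illustration.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
]


def map_pairs(basket):
    """Map step: emit every unordered pair of distinct items in one basket."""
    return combinations(sorted(set(basket)), 2)


def count_pairs(baskets):
    """Reduce step: sum how many baskets contain each pair."""
    counts = Counter()
    for basket in baskets:
        counts.update(map_pairs(basket))
    return counts


def support(pair_counts, n_baskets):
    """Support of a pair: the fraction of baskets that contain both items."""
    return {pair: count / n_baskets for pair, count in pair_counts.items()}


pair_counts = count_pairs(transactions)
pair_support = support(pair_counts, len(transactions))

# List pairs from most to least frequently bought together.
for pair, count in pair_counts.most_common():
    print(pair, count, round(pair_support[pair], 2))
```

    In a Spark job the same two steps would typically be expressed as a flat-map over baskets followed by a per-key reduction, so the counting can be distributed across a cluster.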

    Overview

    Section 1: Apache Spark in 7 Days

    Lecture 1 The Course Overview

    Lecture 2 Setting Up an AWS Account

    Lecture 3 Launching a Spark Cluster on EC2

    Lecture 4 Setting Up Your Environment

    Lecture 5 Running a Test Application

    Lecture 6 Creating RDDs

    Lecture 7 Actions

    Lecture 8 Transformations

    Lecture 9 Joins, Set, and Numeric Operations

    Lecture 10 Shared Variables

    Lecture 11 Installing Jupyter Notebook

    Lecture 12 RDDs and DataFrames

    Lecture 13 DataFrame Row Operations

    Lecture 14 DataFrame Column Operations

    Lecture 15 DataFrame Manipulation

    Lecture 16 Views

    Lecture 17 Schemas

    Lecture 18 SQL Operations

    Lecture 19 I/O Options

    Lecture 20 HIVE

    Lecture 21 Basic Statistics

    Lecture 22 Pipelines

    Lecture 23 Feature Extractors

    Lecture 24 Feature Transformers

    Lecture 25 Feature Selectors

    Lecture 26 Classification

    Lecture 27 Regression

    Lecture 28 Clustering

    Lecture 29 Collaborative Filtering

    Lecture 30 Model Selection and Tuning

    Lecture 31 DStreams

    Lecture 32 DStream Window Operations

    Lecture 33 Structured Streaming

    Lecture 34 Window Operations

    Lecture 35 Joining Batch and Streaming Data

    Section 2: Big Data Processing using Apache Spark

    Lecture 36 The Course Overview

    Lecture 37 Overview of the Apache Spark and Its Architecture

    Lecture 38 Start a Project Using Apache Spark, Look at build.sbt

    Lecture 39 Creating the Spark Context

    Lecture 40 Looking at API of Spark

    Lecture 41 Looking at the Input Data Structure

    Lecture 42 Using RDD API in the Data Mining Process

    Lecture 43 Loading Input Data

    Lecture 44 Cleaning Input Data

    Lecture 45 Logic for Counting Words

    Lecture 46 Using RDD API Transformations and Actions to Solve a Problem

    Lecture 47 Testing Spark Job

    Lecture 48 Summary of Data Processing

    Section 3: Big Data Analytics Projects with Apache Spark

    Lecture 49 The Course Overview

    Lecture 50 Explaining Ways of Joining Datasets

    Lecture 51 Developing Spark Algorithm for Joining/Windowing Datasets

    Lecture 52 Testing Logic in MapReduce Spark — Finding Top Sellers

    Lecture 53 Drawing Conclusions from Top Sellers Data

    Lecture 54 Market Basket Analysis Goals

    Lecture 55 Where MBA Algorithms Are Useful?

    Lecture 56 Implementing MBA MapReduce Algorithm in Spark

    Lecture 57 Finding Association Rules Between Products

    Lecture 58 Analyzing Post for an Author

    Lecture 59 Extracting Information from Unstructured Text

    Lecture 60 Extracting Information via Spark DataFrame

    Lecture 61 Sentiment Analysis of Posts Using Logistic Regression

    Lecture 62 Finding an Author of a Post

    Lecture 63 Content-Based Recommendation Systems Explanation

    Lecture 64 Finding Correlation Between Movies and Users

    Lecture 65 Testing Logic in MapReduce Spark

    Lecture 66 Finding Recommendation for Given User

    Lecture 67 Finding Common Friends Problem — Graph Approach

    Lecture 68 Creating a Graph Using GraphX and Property Graph

    Lecture 69 Solution — Examining Available Methods

    Lecture 70 Finding Closest Friend for Given User Using Page Rank

    Section 4: Hands-On Machine Learning with Scala and Spark

    Lecture 71 The Course Overview

    Lecture 72 Analyzing Text Input Data

    Lecture 73 Feature Generation from Text – Count Vectorizer, TFIDF, LDA

    Lecture 74 Extracting Features from Data – Transforming Text into Vector of Numbers

    Lecture 75 Bag-of-Words and Skip Gram

    Lecture 76 Training Classification Models – Implementing Word2Vect Using Apache Spark

    Lecture 77 Logistic Regression Explanation

    Lecture 78 Writing a Logistic Regression Model Per Author in Apache Spark

    Lecture 79 Training Regression Model

    Lecture 80 Key Concepts, Machine Learning Pipelines, and Operations

    Lecture 81 Learn How to Validate Models Using Cross-Validation

    Lecture 82 Analyzing Time of Post Using Clustering – (GMM Explanation)

    Lecture 83 Implementing GMM in Apache Spark

    Lecture 84 K-Means Clustering Explanation and Use Cases

    Lecture 85 Implementing K-Means Clustering in Apache Spark

    Lecture 86 Measure Accuracy Using Area Under ROC

    Lecture 87 Dimensionality Reduction Using Singular Value Decomposition (SVD)

    Lecture 88 Building Recommendation Engine in Spark Using Collaborative Filtering

    Lecture 89 Using Recommendation Engine to Get Top Recommendations

    Lecture 90 Dense and Sparse Vectors

    Lecture 91 LabeledPoints, Rating, and Other Data Types

    Lecture 92 The Spark versus Deep Learning Use Case

    Lecture 93 Spark for Parallelizing Deep Learning Evaluation

    Lecture 94 Deep Learning As a Feature Generator for Existing Spark ML Algorithms

    Lecture 95 Spark/Deep Learning Made Simple

    This course will be particularly useful if you are a developer, data analyst, data engineer, or data scientist. However, anyone interested in learning how to use Spark will also benefit from this course.