Learning Apache Spark 2

Posted By: AlenMiler

Learning Apache Spark 2 by Muhammad Asif Abbasi
English | 6 Jun. 2017 | ASIN: B01M7RO7US | 356 Pages | AZW3 | 16.22 MB

Key Features

Exclusive guide that covers how to get up and running with fast data processing using Apache Spark
Explore and exploit various possibilities with Apache Spark using real-world use cases in this book
Want to perform efficient data processing in real time? This book will be your one-stop solution.

Book Description

The Spark juggernaut keeps rolling and gains more momentum each day. The core challenge is understanding its key capabilities (Spark SQL, Spark Streaming, Spark ML, SparkR, GraphX). Having understood the key capabilities, it is important to understand how Spark can be deployed: installed as a standalone framework, or as part of an existing Hadoop installation and configured with YARN or Mesos.

The next part of the journey after installation is using the key components, APIs, clustering, machine learning APIs, data pipelines, and parallel programming. It is important to understand why each framework component is key, how widely it is used, how stable it is, and its pertinent use cases.

Once we understand the individual components, we will take a couple of real-life advanced analytics examples, such as:

Building a recommendation system
Predicting customer churn

The objective of these real-life examples is to give the reader confidence in using Spark for real-world problems.

What you will learn

Get an overview of Big Data analytics and its importance for organizations and data professionals.
Delve into Spark to see how it is different from existing processing platforms
Understand the intricacies of various file formats, and how to process them with Apache Spark.
Learn how to deploy Spark with YARN, Mesos, or a standalone cluster manager.
Learn the concepts of Spark SQL, SchemaRDD, Caching, Spark UDFs and working with Hive and Parquet file formats
Understand the architecture of Spark MLlib while discussing some of the off-the-shelf algorithms that come with Spark.
Introduce yourself to SparkR and walk through the details of data munging, including selecting, aggregating, and grouping data using RStudio.
Walk through the importance of Graph computation and the graph processing systems available in the market
Work through a real-world example by building a recommendation engine with Spark using collaborative filtering
Use a telco data set to predict customer churn using regression

About the Author

Asif Abbasi has worked in the industry for over 15 years in a variety of roles, ranging from engineering solutions to selling solutions and everything in between. Asif is currently working with SAS, a market leader in analytic solutions, as a Principal Business Solutions Manager for the Global Technologies Practice.

Based in London, Asif has extensive experience consulting for major organizations across the globe and running proofs of concept in industries including, but not limited to, Telecommunications, Manufacturing, Retail, Finance, Services, Utilities, and Government.

Asif has presented at various conferences and delivered workshops on topics such as Big Data, Hadoop, Teradata, and Analytics using Aster on Teradata and Hadoop. Asif is an Oracle Certified Java EE 5 Enterprise Architect, Teradata Certified Master, PMP, and Hortonworks Hadoop Certified Developer and Administrator. Asif also holds a Master's degree in Computer Science and Business Administration.