Beginning Apache Spark 2 : with resilient distributed datasets, Spark SQL, structured streaming and Spark Machine Learning library, Hien Luu
Resource Information
The item Beginning Apache Spark 2 : with resilient distributed datasets, Spark SQL, structured streaming and Spark Machine Learning library, Hien Luu represents a specific, individual, material embodiment of a distinct intellectual or artistic creation found in Sydney Jones Library, University of Liverpool.
This item is available to borrow from 1 library branch.
- Summary
- Develop applications for the big data landscape with Spark and Hadoop. This book explains the role of Spark in developing scalable machine learning and analytics applications with cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you'll discover resilient distributed datasets (RDDs), use Spark SQL for structured data, and learn stream processing by building real-time applications with Spark Structured Streaming. You'll also learn the fundamentals of Spark ML for machine learning, and much more. After reading this book, you will have the fundamentals to become proficient in using Apache Spark, and will know when and how to apply it to your big data applications.
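To give a concrete flavor of the Spark SQL material described above, here is a minimal Scala sketch, not taken from the book: it assumes a local Spark 2.x installation, and the file name people.json and the columns name and age are hypothetical placeholders.

    // A minimal Spark SQL sketch. Assumptions: local Spark 2.x;
    // people.json and its name/age columns are hypothetical placeholders.
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("beginning-spark-sketch")
      .master("local[*]")                // run locally while experimenting
      .getOrCreate()
    import spark.implicits._

    // Read a JSON data source into a DataFrame, then chain structured
    // transformations; show() is the action that triggers execution.
    val people = spark.read.json("people.json")
    people.select($"name", $"age")
      .filter($"age" > 21)
      .orderBy($"age".desc)
      .show()

The same read/transform/action pattern applies to the CSV, Parquet, ORC, and JDBC data sources listed in the contents below.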
- Language
- eng
- Extent
- 1 online resource.
- Note
- Includes index
- Contents
- Intro; Table of Contents; About the Author; About the Technical Reviewer
- Chapter 1: Introduction to Apache Spark; Overview; History; Spark Core Concepts and Architecture; Spark Clusters and the Resource Management System; Spark Application; Spark Driver and Executor; Spark Unified Stack; Spark Core; Spark SQL; Spark Structured Streaming and Streaming; Spark MLlib; Spark GraphX; SparkR; Apache Spark Applications; Example Spark Application; Summary
- Chapter 2: Working with Apache Spark; Downloading and Installing Spark; Downloading Spark; Installing Spark; Spark Scala Shell; Spark Python Shell; Having Fun with the Spark Scala Shell; Useful Spark Scala Shell Commands and Tips; Basic Interactions with Scala and Spark; Basic Interactions with Scala; Spark UI and Basic Interactions with Spark; Spark UI; Basic Interactions with Spark; Introduction to Databricks; Creating a Cluster; Creating a Folder; Creating a Notebook; Setting Up the Spark Source Code; Summary
- Chapter 3: Resilient Distributed Datasets; Introduction to RDDs; Immutable; Fault Tolerant; Parallel Data Structures; In-Memory Computing; Data Partitioning and Placement; Rich Set of Operations; RDD Operations; Creating RDDs; Transformations; Transformation Examples; map(func); flatMap(func); filter(func); mapPartitions(func)/mapPartitionsWithIndex(index, func); union(otherRDD); intersection(otherRDD); subtract(otherRDD); distinct(); sample(withReplacement, fraction, seed); Actions; Action Examples; collect(); count(); first(); take(n); reduce(func); takeSample(withReplacement, n, [seed]); takeOrdered(n, [ordering]); top(n, [ordering]); saveAsTextFile(path); Working with Key/Value Pair RDD; Creating Key/Value Pair RDD; Key/Value Pair RDD Transformations; groupByKey([numTasks]); reduceByKey(func, [numTasks]); sortByKey([ascending], [numTasks]); join(otherRDD); Key/Value Pair RDD Actions; countByKey(); collectAsMap(); lookup(key); Understand Data Shuffling; Having Fun with RDD Persistence; Summary
- Chapter 4: Spark SQL (Foundations); Introduction to DataFrames; Creating DataFrames; Creating DataFrames from RDDs; Creating DataFrames from a Range of Numbers; Creating DataFrames from Data Sources; Creating DataFrames by Reading Text Files; Creating DataFrames by Reading CSV Files; Creating DataFrames by Reading JSON Files; Creating DataFrames by Reading Parquet Files; Creating DataFrames by Reading ORC Files; Creating DataFrames from JDBC; Working with Structured Operations; Working with Columns; Working with Structured Transformations; select(columns); selectExpr(expressions); filter(condition), where(condition); distinct, dropDuplicates; sort(columns), orderBy(columns); limit(n); union(otherDataFrame); withColumn(colName, column); withColumnRenamed(existingColName, newColName); drop(columnName1, columnName2); sample(fraction), sample(fraction, seed), sample(fraction, seed, withReplacement); randomSplit(weights); Working with Missing or Bad Data
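As a rough illustration of how the Chapter 3 operations listed above compose, the following sketch could be pasted into the Spark Scala shell (bin/spark-shell), where the SparkContext is predefined as sc; the sample data is invented for illustration and is not from the book.

    // Transformations (map, filter, distinct, reduceByKey) build a lazy
    // lineage; actions (collect, count, take) trigger the computation.
    val words   = sc.parallelize(Seq("spark", "rdd", "spark", "sql"))
    val counts  = words.map(w => (w, 1)).reduceByKey(_ + _)   // key/value pair RDD
    val nonSql  = words.filter(_ != "sql")
    val uniques = words.distinct()

    println(counts.collect().mkString(", "))   // e.g. (spark,2), (sql,1), (rdd,1)
    println(nonSql.count())                    // 3
    println(uniques.take(2).mkString(", "))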
- Isbn
- 9781484235799
- Label
- Beginning Apache Spark 2 : with resilient distributed datasets, Spark SQL, structured streaming and Spark Machine Learning library
- Title
- Beginning Apache Spark 2
- Title remainder
- with resilient distributed datasets, Spark SQL, structured streaming and Spark Machine Learning library
- Statement of responsibility
- Hien Luu
- Cataloging source
- N$T
- http://library.link/vocab/creatorName
- Luu, Hien
- Dewey number
- 005.75/8
- Index
- index present
- LC call number
- QA76.9.D3
- Literary form
- non fiction
- Nature of contents
- dictionaries
- http://library.link/vocab/subjectName
- Distributed databases
- Label
- Beginning Apache Spark 2 : with resilient distributed datasets, Spark SQL, structured streaming and Spark Machine Learning library, Hien Luu
- Antecedent source
- unknown
- Carrier category
- online resource
- Carrier category code
- cr
- Carrier MARC source
- rdacarrier
- Color
- multicolored
- Content category
- text
- Content type code
- txt
- Content type MARC source
- rdacontent
- Dimensions
- unknown
- File format
- unknown
- Form of item
- online
- Level of compression
- unknown
- Media category
- computer
- Media MARC source
- rdamedia
- Media type code
- c
- Quality assurance targets
- not applicable
- Reformatting quality
- unknown
- Sound
- unknown sound
- Specific material designation
- remote
- System control number
- on1049171799
- (OCoLC)1049171799