Beginning Apache Spark 2 : with resilient distributed datasets, Spark SQL, structured streaming and Spark Machine Learning library, Hien Luu

Label
Beginning Apache Spark 2 : with resilient distributed datasets, Spark SQL, structured streaming and Spark Machine Learning library
Title
Beginning Apache Spark 2
Title remainder
with resilient distributed datasets, Spark SQL, structured streaming and Spark Machine Learning library
Statement of responsibility
Hien Luu
Creator
Author
Subject
Language
eng
Summary
Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you'll discover resilient distributed datasets (RDDs); use Spark SQL for structured data; and learn stream processing and build real-time applications with Spark Structured Streaming. Furthermore, you'll learn the fundamentals of Spark ML for machine learning and much more. After you read this book, you will have the fundamentals to become proficient in using Apache Spark and know when and how to apply it to your big data applications.
Cataloging source
N$T
Creator name
Luu, Hien
Dewey number
005.75/8
Index
index present
LC call number
QA76.9.D3
Literary form
non fiction
Nature of contents
dictionaries
Subject name
Distributed databases
Label
Beginning Apache Spark 2 : with resilient distributed datasets, Spark SQL, structured streaming and Spark Machine Learning library, Hien Luu
Instantiates
Publication
Distribution
Copyright
Note
Includes index
Antecedent source
unknown
Carrier category
online resource
Carrier category code
  • cr
Carrier MARC source
rdacarrier
Color
multicolored
Content category
text
Content type code
  • txt
Content type MARC source
rdacontent
Contents
  • Intro; Table of Contents; About the Author; About the Technical Reviewer; Chapter 1: Introduction to Apache Spark; Overview; History; Spark Core Concepts and Architecture; Spark Clusters and the Resource Management System; Spark Application; Spark Driver and Executor; Spark Unified Stack; Spark Core; Spark SQL; Spark Structured Streaming and Streaming; Spark MLlib; Spark GraphX; SparkR; Apache Spark Applications; Example Spark Application; Summary; Chapter 2: Working with Apache Spark; Downloading and Installing Spark; Downloading Spark; Installing Spark; Spark Scala Shell
  • Spark Python Shell; Having Fun with the Spark Scala Shell; Useful Spark Scala Shell Commands and Tips; Basic Interactions with Scala and Spark; Basic Interactions with Scala; Spark UI and Basic Interactions with Spark; Spark UI; Basic Interactions with Spark; Introduction to Databricks; Creating a Cluster; Creating a Folder; Creating a Notebook; Setting Up the Spark Source Code; Summary; Chapter 3: Resilient Distributed Datasets; Introduction to RDDs; Immutable; Fault Tolerant; Parallel Data Structures; In-Memory Computing; Data Partitioning and Placement; Rich Set of Operations; RDD Operations
  • Creating RDDs; Transformations; Transformation Examples; map(func); flatMap(func); filter(func); mapPartitions(func)/mapPartitionsWithIndex(index, func); union(otherRDD); intersection(otherRDD); subtract(otherRDD); distinct(); sample(withReplacement, fraction, seed); Actions; Action Examples; collect(); count(); first(); take(n); reduce(func); takeSample(withReplacement, n, [seed]); takeOrdered(n, [ordering]); top(n, [ordering]); saveAsTextFile(path); Working with Key/Value Pair RDD; Creating Key/Value Pair RDD; Key/Value Pair RDD Transformations; groupByKey([numTasks])
  • reduceByKey(func, [numTasks]); sortByKey([ascending], [numTasks]); join(otherRDD); Key/Value Pair RDD Actions; countByKey(); collectAsMap(); lookup(key); Understand Data Shuffling; Having Fun with RDD Persistence; Summary; Chapter 4: Spark SQL (Foundations); Introduction to DataFrames; Creating DataFrames; Creating DataFrames from RDDs; Creating DataFrames from a Range of Numbers; Creating DataFrames from Data Sources; Creating DataFrames by Reading Text Files; Creating DataFrames by Reading CSV Files; Creating DataFrames by Reading JSON Files; Creating DataFrames by Reading Parquet Files
  • Creating DataFrames by Reading ORC Files; Creating DataFrames from JDBC; Working with Structured Operations; Working with Columns; Working with Structured Transformations; select(columns); selectExpr(expressions); filter(condition), where(condition); distinct, dropDuplicates; sort(columns), orderBy(columns); limit(n); union(otherDataFrame); withColumn(colName, column); withColumnRenamed(existingColName, newColName); drop(columnName1, columnName2); sample(fraction), sample(fraction, seed), sample(fraction, seed, withReplacement); randomSplit(weights); Working with Missing or Bad Data [see the Scala sketches after this contents list]
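The chapter 3 contents above enumerate Spark's core RDD operations. As a rough orientation only, here is a minimal Scala sketch of a few of the listed calls, assuming the Spark Scala shell the book uses (a SparkContext is already in scope as sc); the input data is invented for illustration.

    // Transformations are lazy; actions trigger computation.
    val nums = sc.parallelize(Seq(1, 2, 3, 4, 5))

    val squares = nums.map(n => n * n)          // map(func)
    val evens   = squares.filter(_ % 2 == 0)    // filter(func)

    evens.collect()          // action: Array(4, 16)
    squares.count()          // action: 5
    squares.reduce(_ + _)    // action: 1 + 4 + 9 + 16 + 25 = 55

    // Key/value pair RDD: reduceByKey combines values per key.
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    pairs.reduceByKey(_ + _).collect()   // e.g. Array((a,4), (b,2))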
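Likewise for the chapter 4 contents: a short sketch of the DataFrame structured transformations named above, again assuming the Spark shell (a SparkSession is in scope as spark); the column names and rows here are made up for the example.

    import org.apache.spark.sql.functions.col

    // Invented sample data for illustration.
    val people = spark.createDataFrame(Seq(
      ("alice", 34), ("bob", 28), ("carol", 41)
    )).toDF("name", "age")

    people
      .select("name", "age")                         // select(columns)
      .where(col("age") > 30)                        // where(condition)
      .withColumn("age_next_year", col("age") + 1)   // withColumn(colName, column)
      .orderBy(col("age").desc)                      // orderBy(columns)
      .show()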
Dimensions
unknown
Extent
1 online resource.
File format
unknown
Form of item
online
Isbn
9781484235799
Level of compression
unknown
Media category
computer
Media MARC source
rdamedia
Media type code
  • c
Quality assurance targets
not applicable
Reformatting quality
unknown
Sound
unknown sound
Specific material designation
remote
System control number
  • on1049171799
  • (OCoLC)1049171799
