The Ultimate Apache Spark with Java Course – Hands On!
Learn how to slice and dice data using the next-generation big data platform – Apache Spark!
In this course you’ll learn what you need to know to use Apache Spark in your organization, working with its latest Java Datasets API.
Below are some of the things you’ll learn:
- How to develop Spark Java applications using Spark SQL DataFrames
- How the Spark Standalone cluster works behind the scenes
- How to use various transformations to slice and dice your data in Spark Java
- How to marshal/unmarshal Java domain objects (POJOs) while working with Spark Datasets
- Master joins, filters, and aggregations, and ingest data of various sizes and file formats (TXT, CSV, JSON, etc.)
- Analyze over 18 million real-world comments on Reddit to find the top trending words
- Develop Spark Streaming programs that process streamed stock market index files
- Stream network sockets and messages queued on a Kafka cluster
- Learn how to develop the most popular machine learning algorithms using Spark MLlib: Linear Regression, Logistic Regression, and K-Means Clustering