Implementing Predictive Analytics with Spark in Azure HDInsight

About This Course

Are you ready for big data science? In this course, learn how to implement predictive analytics solutions for big data using Apache Spark in Microsoft Azure HDInsight. See how to work with Scala or Python to cleanse and transform data and build machine learning models with Spark ML (the machine learning library in Spark).

Note: To complete the hands-on elements in this course, you will require an Azure subscription and a Windows client computer. You can sign up for a free Azure trial subscription (a valid credit card is required for verification, but you will not be charged for Azure services). Note that the free trial is not available in all regions.

Prerequisites

  • Familiarity with Azure HDInsight.
  • Familiarity with databases and SQL.
  • Some programming experience.
  • A willingness to learn actively in a self-paced manner.

What you'll learn

  • Use Spark to explore data and prepare it for modeling
  • Build supervised machine learning models
  • Evaluate and optimize models
  • Build recommenders and unsupervised machine learning models

Course Syllabus

  • Introduction to Data Science with Spark
    Get started with Spark clusters in Azure HDInsight, and use Spark to run Python or Scala code to work with data.

  • Getting Started with Machine Learning
    Learn how to build classification and regression models using the Spark ML library.

  • Evaluating Machine Learning Models
    Learn how to evaluate supervised learning models, and how to optimize model parameters.

  • Recommenders and Unsupervised Models
    Learn how to build recommenders and clustering models using Spark ML.

Meet the instructor

Graeme Malcolm

Senior Content Developer
Microsoft Learning Experiences


Graeme has been a trainer, consultant, and author for longer than he cares to remember, specializing in SQL Server and the Microsoft data platform. He is a Microsoft Certified Solutions Expert for the SQL Server Data Platform and Business Intelligence. After years of working with Microsoft as a partner and vendor, he now works in the Microsoft Learning Experiences team as a senior content developer, where he plans and creates content for developers and data professionals who want to get the best out of Microsoft technologies.

  1. Course code

    DAT202.3x
  2. Classes start

  3. Classes end

  4. Estimated effort

    18-24 hours in total