In this course, participants will learn about data science as it relates to working with batch and real-time analytics solutions using Azure data platform technologies. They will begin with the fundamentals of the core compute and storage technologies used to build an analytical solution, then learn how to interactively explore data stored in files in a data lake. They will learn the different ingestion techniques that can be used to load data with the Apache Spark capability in Azure Synapse Analytics or Azure Databricks, and how to ingest data using Azure Data Factory or Azure Synapse pipelines. Participants will also learn the different ways to transform data using the same technologies that are used to ingest it. They will understand the importance of implementing security to ensure that data is protected at rest and in transit. Finally, participants will learn how to build a real-time analytics system.
In this course, participants will gain the following skills:
- Explore compute and storage options for data science workloads in Azure
- Run interactive queries using serverless SQL pools
- Perform data exploration and transformation in Azure Databricks
- Explore, transform, and load data into the data warehouse using Apache Spark
- Ingest and load data into the data warehouse
- Transform data with Azure Data Factory or Azure Synapse pipelines
- Integrate data from notebooks with Azure Data Factory or Azure Synapse pipelines
- Support Hybrid Transactional Analytical Processing (HTAP) with Azure Synapse Link
- Implement end-to-end security with Azure Synapse Analytics
- Perform real-time stream processing with Stream Analytics
- Build a stream processing solution with Event Hubs and Azure Databricks