Module 1: Big Data Managed Services in the Cloud

    Big Data Managed Services - Introduction
    Hi, welcome to "You have data, but what are you going to do with it?" I'm Evan, a technical curriculum developer here at Google, and in this module you will learn how you can gain insight from data using managed big data services. So far in this Google Cloud Computing Foundations course, you have discussed what cloud computing is, the Google Cloud Platform, and using GCP to build apps. You then explored storage options, the role of APIs, cloud security, networking, and the role GCP can play in automating the creation and management of your GCP resources. In this module you will look at some of the managed services that Google offers to process your big data. The objective of this module is for you to:
    Discover a variety of managed big data services in the cloud
    The specific learning objectives to achieve this include being able to:
    Discuss big data managed services available in the cloud
    Describe the use of Cloud Dataproc to run Spark, Hive, Pig and MapReduce as a managed service in the cloud
    Explain the building of extract, transform and load (ETL) pipelines as a managed service using Cloud Dataflow
    Discuss BigQuery as a managed data warehouse and analytics engine
    This agenda shows the topics that make up the module. The module starts with an introduction to big data managed services in the cloud before moving on to how big data operations can be leveraged through Cloud Dataproc. You will then complete two labs where you will use the GCP console and then the gcloud command-line tool to create a Cloud Dataproc cluster and perform various tasks. After the labs, you will explore the use of Cloud Dataflow to perform extract, transform and load operations. The next two labs provide an opportunity to learn more about Cloud Dataflow: in the first, you will create a streaming pipeline using a Cloud Dataflow template, and in the second, you will set up a Python development environment, get the Cloud Dataflow SDK for Python, and run an example pipeline using the GCP console. In the final topic, you will learn about the role of BigQuery in the data warehouse. You will complete the module with a short quiz and a review of the main learning points of the module.
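
    To make the extract, transform and load idea concrete before the labs, here is a minimal sketch in plain Python. It does not use the Cloud Dataflow SDK or any GCP service: the sample data, the function names (extract, transform, load) and the JSON output are all assumptions made up for illustration, standing in for a real source, a real pipeline step and a real warehouse write.

```python
import csv
import io
import json

# Hypothetical raw input; in a real pipeline this would come from a file,
# a message queue, or a storage bucket.
RAW_CSV = """user,amount
alice,10.5
bob,not_a_number
alice,4.5
carol,7
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: skip rows with an invalid amount, then total per user."""
    totals = {}
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop malformed records instead of failing the run
        totals[row["user"]] = totals.get(row["user"], 0.0) + amount
    return totals


def load(totals):
    """Load: serialize the result (a stand-in for writing to a warehouse)."""
    return json.dumps(totals, sort_keys=True)


result = load(transform(extract(RAW_CSV)))
print(result)
```

    A managed service such as Cloud Dataflow takes care of running stages like these at scale; the sketch only shows the shape of the three steps, not how the service implements them.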