Fatskills
Practice. Master. Repeat.
Study Guide: Google Cloud Certified Data Engineer: 4. Designing a Data Processing Solution - Important Things To Know
Source: https://www.fatskills.com/google-cloud-certified-professional-data-engineer/chapter/google-cloud-certified-data-engineer-4-designing-a-data-processing-solution-important-things-to-know

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

1. Know the four main GCP compute products.
Compute Engine is GCP’s infrastructure-as-a-service (IaaS) product. With Compute Engine, you have the greatest amount of control over your infrastructure relative to the other GCP compute services.
Kubernetes is a container orchestration system, and Kubernetes Engine is a managed Kubernetes service. With Kubernetes Engine, Google maintains the cluster and assumes responsibility for installing and configuring the Kubernetes platform on the cluster. Kubernetes Engine deploys Kubernetes on managed instance groups.
App Engine is GCP’s original platform-as-a-service (PaaS) offering. App Engine is designed to allow developers to focus on application development while minimizing their need to support the infrastructure that runs their applications. App Engine has two versions: App Engine Standard and App Engine Flexible.
Cloud Functions is a serverless, managed compute service for running code in response to events that occur in the cloud. Events are supported for Cloud Pub/Sub, Cloud Storage, HTTP events, Firebase, and Stackdriver Logging (since renamed Cloud Logging).
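To make the event-driven model concrete, here is a minimal sketch of a handler for a Pub/Sub-style event. It assumes the event arrives as a dict whose "data" field is base64-encoded, which mirrors the shape used by Python background functions; the names handle_pubsub_event and make_test_event, and the "source" and "reading" fields, are hypothetical and chosen only for illustration.

```python
import base64
import json


def handle_pubsub_event(event, context=None):
    """Hypothetical handler: decode a Pub/Sub-style event and summarize it."""
    # Assumption: the message body is base64-encoded under the "data" key.
    raw = base64.b64decode(event.get("data", "")).decode("utf-8")
    payload = json.loads(raw) if raw else {}
    attributes = event.get("attributes") or {}
    # "source" and "reading" are illustrative field names, not a GCP schema.
    return f"{attributes.get('source', 'unknown')}: {payload.get('reading')}"


def make_test_event(payload, **attributes):
    """Build a local stand-in for the event dict so the handler can be tested offline."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {"data": data, "attributes": attributes}
```

The handler contains no server or scaling logic, which is the point of the serverless model: the platform decides when and where the function runs, and the code only reacts to the event it is handed.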
2. Understand the definitions of availability, reliability, and scalability. Availability is defined as the ability of a user to access a resource at a specific time. Availability is usually measured as the percentage of time a system is operational. Reliability is defined as the probability that a system will meet service-level objectives for some duration of time.
3. Reliability is often measured as the mean time between failures. Scalability is the ability of a system to meet the demands of workloads as they vary over time.
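The availability and reliability measures above reduce to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 30-day period used for the downtime budget is an assumption rather than anything the exam guide prescribes.

```python
def availability_percent(uptime_hours, downtime_hours):
    """Availability as the percentage of time the system was operational."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return 100.0 * uptime_hours / total


def mean_time_between_failures(operating_hours, failure_count):
    """MTBF: total operating time divided by the number of failures."""
    if failure_count <= 0:
        raise ValueError("need at least one failure to compute MTBF")
    return operating_hours / failure_count


def allowed_downtime_minutes(availability_target_percent, period_days=30):
    """Downtime budget implied by an availability target over a period."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - availability_target_percent / 100.0)
```

For example, a 99.9% availability target over 30 days leaves a downtime budget of about 43 minutes, and a system that ran 720 hours with 3 failures has an MTBF of 240 hours.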
4. Know when to use hybrid clouds and edge computing. The analytics hybrid cloud is used when transaction processing systems continue to run on premises and data is extracted and transferred to the cloud for analytic processing. A variation of hybrid clouds is an edge cloud, which uses local computation resources in addition to cloud platforms. This architecture pattern is used when a network may not be reliable or have sufficient bandwidth to transfer data to the cloud. It is also used when low-latency processing is required.
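One way to picture the edge pattern is a device that processes readings locally and forwards only a summary when the network link is usable. This is a rough sketch under stated assumptions: EdgeBuffer and its methods are hypothetical names, and the upload callable stands in for whatever transfer mechanism a real deployment would use.

```python
from collections import deque
from statistics import mean


class EdgeBuffer:
    """Hypothetical edge node: buffer readings locally, upload summaries when possible."""

    def __init__(self, upload, max_items=1000):
        # `upload` is an assumed callable that sends a summary dict to the
        # cloud and returns True on success, False if the link is down.
        self._upload = upload
        # Bounded buffer: the oldest readings are dropped if the link stays down.
        self._pending = deque(maxlen=max_items)

    def record(self, value):
        """Store a reading locally; no network access is needed."""
        self._pending.append(value)

    def pending_count(self):
        return len(self._pending)

    def flush(self):
        """Summarize locally and try to upload; keep the data if the upload fails."""
        if not self._pending:
            return False
        summary = {
            "count": len(self._pending),
            "mean": mean(self._pending),
            "max": max(self._pending),
        }
        if self._upload(summary):
            self._pending.clear()
            return True
        return False
```

Sending a summary rather than every raw reading addresses the bandwidth limit, and keeping the buffer on a failed upload addresses the unreliable network.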
5. Understand messaging. Message brokers are services that provide three kinds of functionality: message validation, message transformation, and routing. Message validation is the process of ensuring that messages received are correctly formatted. Message transformation is the process of mapping data to structures that can be used by other services. Routing is the process in which a broker receives a message and uses data in the message to determine where it should be sent; it is used with hub-and-spoke message brokers.
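The three broker functions can be sketched as a toy in-memory hub. This is not how Cloud Pub/Sub or any particular broker is implemented; the MessageBroker class, the "type" and "body" message fields, and the topic naming are all assumptions made for illustration.

```python
class MessageBroker:
    """Toy hub-and-spoke broker showing validation, transformation, and routing."""

    def __init__(self):
        self._routes = {}  # topic name -> list of subscriber callables

    def subscribe(self, topic, handler):
        self._routes.setdefault(topic, []).append(handler)

    @staticmethod
    def validate(message):
        # Validation: check that the message has the expected format.
        return (
            isinstance(message, dict)
            and isinstance(message.get("type"), str)
            and "body" in message
        )

    @staticmethod
    def transform(message):
        # Transformation: map the incoming structure to the one consumers expect.
        return {"topic": message["type"].strip().lower(), "payload": message["body"]}

    def publish(self, message):
        if not self.validate(message):
            raise ValueError("malformed message")
        envelope = self.transform(message)
        # Routing: data in the message decides which subscribers receive it.
        handlers = self._routes.get(envelope["topic"], [])
        for handler in handlers:
            handler(envelope["payload"])
        return len(handlers)
```

Publishers only talk to the hub and never to subscribers directly, which is what makes the topology hub-and-spoke.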
6. Know distributed processing architectures. Service-oriented architecture (SOA) is a distributed architecture that is driven by business operations and by delivering business value. Typically, an SOA system serves a discrete business activity. SOAs are self-contained sets of services. Microservices are a variation on SOA. Like other SOA systems, microservice architectures use multiple, independent components and common communication protocols to provide higher-level business services. Serverless functions extend the principles of microservices by removing the need to manage containers and runtime environments.
7. Know the steps to migrate a data warehouse. At a high level, the process of migrating a data warehouse involves four stages:
- Assessing the current state of the data warehouse
- Designing the future state
- Migrating data, jobs, and access controls to the cloud
- Validating the cloud data warehouse
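The validation stage often comes down to comparing the migrated tables with their sources. The sketch below shows one possible check, assuming each table can be read as an iterable of row tuples; table_fingerprint and validate_migration are hypothetical helpers, and a real migration would typically run equivalent count and checksum queries inside each warehouse rather than pulling rows into Python.

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, order-independent checksum) for an iterable of row tuples."""
    count = 0
    combined = 0
    for row in rows:
        digest = hashlib.sha256(repr(tuple(row)).encode("utf-8")).digest()
        # Summing per-row hashes modulo 2**64 makes the result independent of row order.
        combined = (combined + int.from_bytes(digest[:8], "big")) % (2 ** 64)
        count += 1
    return count, combined


def validate_migration(source_tables, target_tables):
    """Compare each source table with its migrated copy; return a status per table."""
    report = {}
    for name, source_rows in source_tables.items():
        if name not in target_tables:
            report[name] = "missing"
        elif table_fingerprint(source_rows) == table_fingerprint(target_tables[name]):
            report[name] = "ok"
        else:
            report[name] = "mismatch"
    return report
```

Row counts catch dropped or duplicated rows, while the checksum catches changed values. Matching fingerprints are strong evidence of a faithful copy, not a proof.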


