Study Guide: Compute and Containers
Source: https://www.fatskills.com/google-professional-cloud-architect-certification/chapter/compute-and-containers

Compute and Containers

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.


Computing enables us to surface our technological innovations to the world. We’ve leveraged computing to solve some of the most advanced and complex challenges that humanity has faced. As companies move to the cloud, the speed of problem-solving continues to grow at such a rapid pace that it outstrips Moore’s Law, the observation that the number of transistors on a microchip doubles roughly every two years. Moore’s Law has historically been an accurate projection for the computer industry, but as the pace of digital innovation has accelerated with cloud computing, hardware seems unable to keep up with it. For that reason, computing workloads are finding more ways to optimize and become more efficient, without depending solely on the growth of their hardware counterparts.
It started with buying servers and data centers and hiring the appropriate networking and technical staff to build, deploy, and manage all of this hardware. Then, we worked with third-party companies and placed orders directly through them, and their engineers would manage our physical infrastructures. Virtual machines then simplified this greatly by enabling organizations to do more with less, which translated to hiring more resources in-house to manage the hypervisor and the instances beneath it. Now, in the cloud, we are seeing a shift to fully managed services to solve some problems and semi-managed platforms to solve others, and customer-managed infrastructure to solve the remaining issues. As the strategy has evolved, the needs of businesses have evolved with new rapid development cycles and feedback loops. All of these solutions continue to optimize the rate at which computing can solve problems. In the last dozen years, we’ve seen the most significant evolutionary growth. What do the next dozen years hold for us? How can we pair the most effective computing technologies with the business challenges we are trying to solve to create innovations that remain unseen in society?

This chapter covers Google’s Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), and Function as a Service (FaaS) offerings; how you manage virtual machines, Google App Engine (GAE), Kubernetes, and other managed service offerings; and how you can effectively manage and secure your APIs in the Google Cloud Platform (GCP). Because there’s never been a one-size-fits-all solution when it comes to computing, think about where you can leverage each of these products in your organization to improve your computing power.

Google Compute Engine
Google Compute Engine (GCE) is Google Cloud’s IaaS solution that enables users to launch virtual machines on demand. With GCE, users must manage the entire underlying infrastructure associated with virtual machine (VM) instances, including the machine types. VMs can be launched on predefined or custom machine sizes. GCE supports live migrations, OS patch management, preemptible VMs (PVMs), and more, similar to Amazon Elastic Compute Cloud (EC2). GCE is a core element of computing in the cloud, and for most organizations, migrating completely to a container-based architecture is a major goal—but there is still a strong need for leveraging a standard computing infrastructure.
You can configure the underlying infrastructure on GCE to your liking. You don’t have to worry about calling SoftLayer anymore and putting an order in for a ton of CPU, memory, and storage just to be on the safe side. With GCE, you can make adjustments as you need to. You also get to decide on what kind of operating system you’d like to run, how you want to log in, which images you want to leverage, what type of software you want to run on startup, and more.

Virtual Machine Instances
It’s good to know the variety of possible configurations when you’re configuring your VM infrastructure. Let’s start with the basics. An instance is a virtual machine that is hosted on Google’s infrastructure. When you’re creating a VM instance in Google Cloud, you first assign an instance name, then assign key/value pair labels, assign the instance to a region and a zone, and then select the machine configuration. Each of your instances belongs to a project, and you can have one or more instances in each project.
Your machine type is a set of virtualized hardware resources that includes your system memory size, virtual CPU (vCPU) count, and persistent disk limits for your VM instances. Machine types are sorted according to families, and there are subtypes within each family. Each family is organized in a manner that enables you to understand what the machine types are optimized for—general-purpose computing, memory-intensive workloads, or compute-intensive workloads.
You can manage your instances through the Cloud Console, by using the gcloud command-line tool, and via the Representational State Transfer (REST) API. You can also connect to your instances via Secure Shell (SSH) in Linux and Remote Desktop Protocol (RDP) in Windows Server instances. There are a variety of states that an instance can move through in the instance life cycle.

Machine Types
General-purpose machine types offer the best price-to-performance ratio for various workloads. There are four families of general-purpose machines: E2, N2, N2D, and N1. The E2 machines are typically used for day-to-day computing at a low cost, often for serving web applications, backend applications, small to medium-sized databases, microservices, virtual desktops, and development environments. The N2, N2D, and N1 general-purpose machines offer a balanced price-to-performance ratio and can be used for web applications, backend applications, medium-sized to large databases, caching, and media/streaming.
Memory-optimized machine types are optimized for memory-intensive workloads, and they offer more memory per core than other machine types (up to 12TB of RAM). There are two families of memory-optimized machines: M1 and M2. M1 and M2 machines are optimized for ultra-high-memory workloads—think of large in-memory databases such as SAP HANA, Redis, or in-memory analytics.
Compute-optimized machine types are optimized for compute-intensive workloads. They offer more performance per core than other machine types. These machine types offer Intel Scalable processors and up to 3.8GHz of sustained all core turbo, which essentially means that the chips will be able to run all of their cores at a consistent maximum rate. There is one family of compute-optimized machine types, C2, and they’re typically used for high-performance computing, gaming, and single-threaded applications that are heavily CPU-intensive.
Shared-core machine types are optimized for cost. These are machine types that share a physical core and are often used for running very small, non–resource-intensive applications. Shared-core machines are available only in the N1 and E2 families. Within the E2 family are e2-micro, e2-small, and e2-medium shared-core machine types that have two vCPUs available for short periods of bursting. Within the N1 family are f1-micro and g1-small shared-core machine types that have only one vCPU available for short periods of bursting.
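As a rough illustration of how the families above show up in machine type names, here is a minimal Python sketch. It assumes predefined machine types follow a family-class-vCPUs naming pattern (for example n2-standard-8) and treats the five shared-core types named above as special cases; custom machine types use a different naming scheme and are not handled. The function and class names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Shared-core machine types named in the text, mapped to their family.
SHARED_CORE = {
    "e2-micro": "E2", "e2-small": "E2", "e2-medium": "E2",
    "f1-micro": "N1", "g1-small": "N1",
}


@dataclass(frozen=True)
class MachineType:
    name: str
    family: str            # e.g. "N2", "E2", "C2", "M1"
    shared_core: bool
    vcpus: Optional[int]   # None when the name does not encode a vCPU count


def parse_machine_type(name: str) -> MachineType:
    """Split a predefined machine type name into family and vCPU count.

    Assumes the family-class-vcpus pattern; raises ValueError otherwise.
    """
    if name in SHARED_CORE:
        return MachineType(name, SHARED_CORE[name], True, None)
    parts = name.split("-")
    if len(parts) != 3 or not parts[2].isdigit():
        raise ValueError(f"unrecognized machine type name: {name!r}")
    return MachineType(name, parts[0].upper(), False, int(parts[2]))
```

For example, parse_machine_type("n2-standard-8") reports family N2 with 8 vCPUs, while parse_machine_type("f1-micro") reports a shared-core machine in the N1 family.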

Take a look at the “Machine Types” public documentation page and get familiar with some of the common bounds of each machine type. You may see a question or two that has to do with optimizing your VM environment.
After you select your machine configuration, you can add GPUs for your compute workloads if you have a graphically intensive workload. These come in handy for things like processing rendering jobs, virtual applications, and machine learning. Google Cloud offers NVIDIA Tesla GPUs in its environments as of 2020. In Figure 7-2, you can see a few of the initial configuration options you have when creating your VM instance through the Cloud Console.

Preemptible VMs
You may have some workloads that don’t necessarily need to be run with four nines of uptime. Saving money should always be at the top of your mind when you’re building a cloud environment, and cloud computing can get very expensive. Running very powerful virtual machines can cost a lot of money over time, so think about some workloads that may not need extremely high availability. Preemptible VMs (PVMs) are highly cost-effective instances designed for fault-tolerant workloads that can withstand possible instance preemptions. Batch jobs are a great use case for PVMs.

If you see questions about the cost efficiency of your virtual machine environment paired up with no requirements for availability, think about how PVMs fit in.
It’s also a great idea not to use PVMs for workloads that cannot be terminated. Compute Engine might terminate PVMs at any time because of system events. Compute Engine also terminates PVMs after they run for 24 hours. Also, PVMs are not covered by any SLA, and they are explicitly excluded from the GCE SLA.

When you’re using PVMs, it’s always wise to consider leveraging shutdown scripts so that a proper procedure can be followed when the instances are preempted. This way, you can ensure that your application is properly shut down and any cleanup actions are performed before the instance stops. You can add this shutdown script to your instance metadata.
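A shutdown script can be any executable, so the cleanup logic can be sketched in Python. The example below is only an illustration under stated assumptions: the time budget is passed in by the caller (preemption leaves only a short window, so check the Compute Engine documentation for the current limit), and the task names in the usage note are hypothetical. It runs cleanup tasks in priority order and stops starting new ones once the budget is spent.

```python
import time
from typing import Callable, List, Sequence, Tuple

Task = Tuple[str, Callable[[], None]]


def run_cleanup(
    tasks: Sequence[Task],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[str], List[str], List[str]]:
    """Run cleanup tasks in order until the time budget is used up.

    Returns three lists of task names: (completed, failed, skipped).
    """
    deadline = clock() + budget_seconds
    completed: List[str] = []
    failed: List[str] = []
    skipped: List[str] = []
    for name, task in tasks:
        if clock() >= deadline:
            # Out of time: do not start work that may be cut off midway.
            skipped.append(name)
            continue
        try:
            task()
            completed.append(name)
        except Exception:
            # Best effort: one failing task should not block the rest.
            failed.append(name)
    return completed, failed, skipped
```

A script built on this might call run_cleanup([("flush-logs", flush_logs), ("save-checkpoint", save_checkpoint)], budget_seconds=25), putting the most important task first so that it is the least likely to be skipped.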

Shielded VMs
Shielded VMs are a security feature designed to offer verifiable integrity of your VM instances, so that you can be sure your instances are not compromised by boot- or kernel-level malware or rootkits. They are designed for highly sensitive workloads or for organizations that have stricter compliance requirements. Shielded VMs leverage Secure Boot, a virtual Trusted Platform Module (vTPM), and integrity monitoring to ensure that your virtual machine has not been tampered with.

Confidential VMs
Confidential VMs are a Google Cloud offering that encrypts data while it is in use. Encryption has traditionally been available only for data at rest or data in transit. When an application is actively processing data, the data has to be decrypted in CPU and memory for your system to do anything with it. With Confidential VMs, the data stays encrypted in memory while the VM works on it and is decrypted only inside the CPU. This is possible by leveraging the Secure Encrypted Virtualization feature of second-generation AMD EPYC CPUs. Basically, the CPU natively encrypts and decrypts all the in-process memory. So if a bad actor were able to get a memory dump from your system, they would not be able to forensically make sense of anything.

You won’t see any questions about Confidential VMs on the exam, but if you’re working for a highly sensitive organization or you have highly sensitive workloads, think about how you can leverage them. It’s incredibly simple to migrate to them, as there’s no need for any special refactoring. When creating a VM, you just have to check a box. When migrating, it’s a simple lift and shift.

Sole-Tenant Nodes
Some organizations, particularly those in the public sector, often require dedicated hardware to ensure that even the most sophisticated of attackers cannot access their computing infrastructure by exploiting Google Cloud and the VM’s host environment. In standard GCE environments, your VM may run alongside other customers’ VMs, all on the same hardware. This is called multitenancy. There is no reason to be alarmed, though, because Google Cloud has very strong security controls to ensure that other tenants cannot laterally move through or exploit the hypervisor to get access to another tenant’s data or environment. For highly sensitive workloads, hardware isolation is important to ensure that other customer workloads are not running on the same physical server as yours. Sole-tenant nodes may also be beneficial for some workloads, such as gaming, where the performance requirements may benefit from being isolated on their own entire hardware stack.

Images
You’ve set up your instances, but now you need to think about your actual virtual machine images. GCE offers both public images and the ability to leverage custom images. Public images are provided and maintained by Google, the open source community, or third-party vendors; they are available upon image selection and also in the GCP Marketplace. There are a variety of public images, each with its own flavor, from various operating systems such as Debian, Red Hat Enterprise Linux, and Windows Server, to images optimized for SQL Server. Public images are typically hardened to an extent, with minimal services running on them. The benefit of leveraging public images is that the maintainers have already done most of the work for you, and the images are maintained through their life cycle, which also includes security updates and other updates for the operating system.
Custom images are boot disk images that you create, own, and control access to. You’ve probably heard the term “Golden Image” used to refer to the hardened, secure, and optimized image used for respective applications in your organization. Most companies opt to leverage custom images, as they typically have their own software, services, and configuration that these images are optimized for. A lot of organizations would also like to manage the image life cycle themselves, often storing their images in a secure image repository (or image factory) so that they can maintain who has access to the image, who can authorize the image to be deployed, what characteristics are validated before an image is deployed, and the process of identifying and remediating vulnerabilities in the image. When you’re building a custom image, your goal is to harden the image by reducing its services, functionality, and configuration to only what your business needs.

The Center for Internet Security produces “CIS Hardened Images,” which are images that are hardened based on their benchmarks. Their benchmarks are known in the industry as the go-to vendor agnostic and internationally recognized secure configuration guidelines. They offer CIS hardened images for most major public cloud computing companies.

Instance Templates and Instance Groups
Instance templates are sets of configurations that define the machine type, boot disk image, labels, and other instance properties that VM instances and managed instance groups (MIGs) can conveniently use to deploy the right configuration as needed. Instance templates are seen as resources in Google Cloud. You want to use instance templates when you need to create VMs from pre-existing configurations. It may be easy to use the Cloud Console to create a VM in 30 seconds or less, but don’t forget the importance of infrastructure as code. When you standardize your deployments, it makes you more operationally efficient, and it enables you to manage change effectively. Since these are resources, you can control access to the instance templates, you can monitor them to prevent anyone from changing the configuration and to detect any misconfigurations, and you can store them in a repository to stay organized when you have a massive infrastructure that requires a variety of configurations for instance groups.
You would also need to use an instance template if you want to create a MIG. Instance groups are a collection of VM instances that you can manage as a single entity. There are two kinds of instance groups, managed and unmanaged. MIGs enable you to operate multiple identical virtual machines to make your workloads scalable and highly available. Unmanaged instance groups enable you to load-balance across a fleet of VMs that you manage yourself.
With MIGs, you can take advantage of features such as autoscaling, autohealing, regional deployment, and automatic updating. Autoscaling MIGs automatically scale out by adding instances to meet demand or scale in by removing instances to reduce your costs. Autohealing is a policy that relies on application-based health checks, similar to load balancing health checks, which periodically check whether the instances in your MIG are responding. If an instance is not responding, the MIG deletes and re-creates it. You can also launch MIGs across multiple zones to protect against zonal failures; that way, you achieve higher availability and improve reliability. MIGs can also use load balancing services to distribute traffic evenly across the instances in the group. You should be careful with health checks, though, because if you push an update that breaks connectivity to your instances, the health checks might cause all of them to be re-created repeatedly until they start working again.
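The autohealing decision can be modeled in a few lines of Python. This is a simplified sketch, not how Compute Engine implements it: it assumes each instance has a recorded history of health probe results and that an instance is re-created once a configurable number of consecutive probes have failed. The instance names and the function name are hypothetical.

```python
from typing import Dict, List


def instances_to_recreate(
    probe_history: Dict[str, List[bool]],
    unhealthy_threshold: int = 3,
) -> List[str]:
    """Return instances whose latest `unhealthy_threshold` probes all failed.

    probe_history maps an instance name to its probe results, oldest first,
    where True means the health check got a healthy response.
    """
    if unhealthy_threshold < 1:
        raise ValueError("unhealthy_threshold must be at least 1")
    unhealthy = []
    for name, probes in probe_history.items():
        recent = probes[-unhealthy_threshold:]
        # Require a full window of failures so that one blip is tolerated.
        if len(recent) == unhealthy_threshold and not any(recent):
            unhealthy.append(name)
    return sorted(unhealthy)
```

If a bad update makes every instance fail its probes, this function returns the whole group, which is the repeated re-creation scenario that the warning about health checks describes.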

You can create MIGs with a minimum and maximum instance count set to 1. The MIG then re-creates the instance if it fails, so that one instance of your VM is kept running within a region without manual intervention. This is a very cost-effective way to improve availability without incurring the extra costs of having to keep two or more instances running at all times.

Storage Options
You can leverage a variety of storage options in your VM instances, all of which have their own benefits and caveats. It’s recommended that you dive into the various storage options on the public documentation site so you can get a deeper understanding of how your storage choices vary depending on the type of machines you leverage and their best use cases. But don’t worry about diving too deep into the technicalities of storage for the exam.
Following are storage options you should be familiar with:
- Persistent disks: These disks offer reliable, high-performance block storage for your virtual machines. You can attach persistent disks to provide storage for your instances. You can leverage zonal persistent disks or regional persistent disks that replicate your storage over two zones.
- Local SSD: This local solid-state drive offers the highest performance, with transient, local block storage.
- Cloud Storage buckets: This option gives your instances affordable object-based storage. To mount a bucket as a file system on your VM, you configure Cloud Storage FUSE within the operating system.
- Filestore: This high-performance, network-attached file storage, like a traditional network file system, can be attached to your instances.

OS Login
OS Login is a mechanism to simplify SSH access management by linking your SSH users in Linux to their respective Google identities in Cloud Identity.
If you aren’t using OS Login, your users will need to have separate credentials to log in to their respective VMs. You use OS Login for the same reason you’d want to use single sign-on (SSO) to simplify access management for your users. It enables you to manage the full life cycle of your Linux accounts through the governance of your Google identities. That way, you can manage identity and access management (IAM) permissions centrally through Cloud IAM, you can do automatic permission updates, and you can also import existing Linux accounts from your on-premises Active Directory and LDAP servers to ensure that they are synchronized for VMs across your environments.
You can also enable an organization policy to enforce OS Login to prevent a malicious user who does not have proper authorizations through a Google identity from getting direct SSH access to your VMs. It’s a lot more difficult to manage the full SSH key life cycle if you are manually provisioning privileged users on your VM instances.

Google App Engine
Google App Engine (GAE) is a PaaS solution that offers a fully managed serverless application platform on which users can build and deploy applications without having to manage the underlying infrastructure. There is no server management and no configuration deployments, enabling developers to focus on building. GAE supports popular development languages such as Go, Ruby, PHP, Java, Node.js, Python, C#, and .NET, and you can bring your own language runtimes and frameworks.
GAE is a great solution for development teams that want to build an application that is deployed and managed by Google. It’s incredibly simple to leverage, and it was one of Google Cloud’s major offerings that a lot of organizations and development teams were initially interested in using. However, as computing continues to evolve and new offerings are introduced, more organizations are moving toward models in which they can still maintain the right amount of control and also leverage fully serverless and ZeroOps solutions for workloads or events that don’t require a platform. PaaS solutions often create vendor lock-in, making it difficult to port applications between cloud service providers in the event that certain vendors no longer meet new business objectives.
In GAE, you create apps that are composed of one or more services. Each service uses different runtime configurations and can operate with varying performance settings. Within each service, you can deploy versions of that service to manage version control, and within each version, there are instances that your user traffic can be routed to accordingly. This level of version control is great for testing, managing rollbacks, and other temporary events. Also, your instances will scale up or down on demand. In App Engine, you can also leverage a memory cache (memcache) service to minimize the number of queries to your database backend.
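To make the routing of user traffic between versions of a service concrete, here is a minimal Python sketch of a weighted traffic split. It is an illustration only and not App Engine’s actual algorithm: it assumes each request carries some stable user key (such as a cookie value), hashes that key to a point between 0 and 1, and picks a version according to the configured weights so that the same user keeps landing on the same version. The version names and the function name are hypothetical.

```python
import hashlib
from typing import Dict


def pick_version(split: Dict[str, float], user_key: str) -> str:
    """Choose a version for a user according to a weighted traffic split.

    split maps version names to fractions of traffic that must sum to 1.
    """
    if abs(sum(split.values()) - 1.0) > 1e-9:
        raise ValueError("traffic split weights must sum to 1")
    digest = hashlib.sha256(user_key.encode("utf-8")).digest()
    # 48 bits fit exactly in a float, so the point is always below 1.0.
    point = int.from_bytes(digest[:6], "big") / 2**48
    cumulative = 0.0
    chosen = ""
    for version, weight in split.items():
        chosen = version
        cumulative += weight
        if point < cumulative:
            break
    # Falling out of the loop only happens through floating-point rounding
    # of the weights, in which case the last version is used.
    return chosen
```

With a split such as {"v1": 0.9, "v2": 0.1}, roughly one user in ten would be routed to v2, which is the kind of gradual rollout and rollback testing that versions make possible.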
App Engine Flex vs. App Engine Standard
App Engine offers the ability to run applications in two environments, App Engine Flex and App Engine Standard. You can run your applications in either environment or in both at the same time, depending on your needs.

Here’s when you’d use the standard environment:
- Instances run in a sandbox, using the runtime of a supported programming language
- Sudden spikes of traffic require rapid scaling
- You pay for only what you need and when you need it, and your application can scale to zero when there’s no traffic

And you’d use the flexible environment in these conditions:
- Instances run within Docker containers on GCE VMs
- You have consistent traffic that deals with gradual scaling
- You’re using a custom runtime and/or using an unsupported GAE Standard programming language
- You need better integration with your GCE environments
- You need to route traffic from another cloud or on-premises environment to a specific instance through a VPN

The standard environment can scale to zero, but the flexible environment must have at least one instance running for each active version.

Google Kubernetes Engine
Google Kubernetes Engine (GKE) is a managed Kubernetes solution for deploying, managing, and scaling containerized applications on GCP.
Kubernetes is an open source container orchestration system intended for automating application deployment, scaling, and management. Google had been running Borg, a large-scale cluster-management system, for its internal applications long before the advent of Kubernetes. Drawing on this experience, Google originally designed Kubernetes (commonly abbreviated as K8s), and it is now maintained by the open source community.
Turning the open source K8s product into an enterprise-grade solution is where GKE comes into the picture. This is where Google manages several aspects of the underlying GKE architecture while still giving organizations the responsibility to own the necessary elements of Kubernetes to build, deploy, and manage their applications securely. Kubernetes falls into an interesting space between IaaS and PaaS in that it’s more of a hybrid between the two. As a cloud architect, you still have to concern yourself with the deployment, the services, and the overall architecture. However, applications are built cloud-native into containers such as Docker. Moreover, the development pipelines and solutions are supported across all cloud options, allowing for easy application portability and no vendor lock-in. This is a main business reason why Kubernetes is growing in popularity.
Before we dive into the cluster architecture, let’s talk through the shared responsibility model for GKE. Some of the details may be hazy until you get through the rest of the GKE section, so feel free to return here for a review. At a high level, Google is responsible for protecting the following:
- The underlying GKE infrastructure that powers Google Cloud, similar to the underlying infrastructure Google manages in Compute Engine.
- The GKE node operating system, particularly for the public images, such as Container-Optimized OS (COS) images. If you have auto-upgrade enabled, your nodes will automatically upgrade as Google releases new patches to these images.
- The overall K8s distribution. Google takes the open source K8s distributions and provides many updates to them as they’re leveraged across GKE.
- The control plane, including the master VMs, the API server, the etcd database, and other components running on the master VMs.
- Integrations into other Google Cloud services, such as IAM, Audit Logging, Cloud Key Management Service (KMS), and so on.
At a high level, GKE customers are responsible for protecting the following:
- The nodes that run your Kubernetes workloads, including any extra software or configuration you run on the nodes. You’re responsible for keeping these nodes up-to-date, whether you’re leveraging COS images or not.
- The workloads themselves, including your applications, container images, IAM configurations, network configurations, and any of the features you deploy across your GKE environment.

Cluster Architecture
A lot of the lingo used previously won’t make sense until you understand what the GKE architecture looks like.
- A cluster is the foundation of GKE. All Kubernetes objects run on top of your GKE cluster. Within each cluster is at least one cluster master.
- A cluster master runs all of the K8s control plane processes, including the API server, scheduler, and resource controllers. This includes the updates to the K8s version running on your cluster master.
- The K8s API server is the endpoint for your cluster, where you can interact with the cluster via HTTP/gRPC API calls through the K8s command-line client kubectl. This K8s API server is the central hub for all communications within your cluster, including all requests sent to your cluster nodes, controllers, and system components.
- A node is known as your worker machine and runs your containerized applications and other workloads. Your node essentially is a Compute Engine VM instance. You can choose your machine type and your OS images, just as you do in Compute Engine.
- Within your nodes you will run pods, which are groups of one or more containers that are run by a container runtime such as Docker, containerd, or rkt. Each pod has its own IP address and can have shared storage volumes and other configuration information.
- You can leverage GKE’s Google-managed native integrations with other tools such as persistent disks, load balancers, and Cloud Monitoring.

Remember the command syntax kubectl [command] [TYPE] [NAME] [flags] that you use when interfacing with your Kubernetes environment. The gcloud container command group is what you use when you interact with GKE’s clusters, node pools, images, subnets, and operations. These are the Google-specific infrastructure configuration parameters. Know when to use these versus kubectl.
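To make the two command shapes easier to tell apart, the following Python sketch assembles argument lists for each tool without running anything. The helper names are hypothetical, as are the resource and cluster names in the usage note; the sketch only mirrors the kubectl [command] [TYPE] [NAME] [flags] pattern and the gcloud container command group described above.

```python
from typing import List, Optional, Sequence


def kubectl(
    command: str,
    resource_type: str,
    name: Optional[str] = None,
    flags: Sequence[str] = (),
) -> List[str]:
    """Build argv following kubectl [command] [TYPE] [NAME] [flags]."""
    argv = ["kubectl", command, resource_type]
    if name:
        argv.append(name)
    argv.extend(flags)
    return argv


def gcloud_container(group: str, action: str, *args: str) -> List[str]:
    """Build argv for the gcloud container command group."""
    return ["gcloud", "container", group, action, *args]
```

For instance, kubectl("describe", "deployment", "web") targets a Kubernetes object inside the cluster, while gcloud_container("node-pools", "list", "--cluster", "demo-cluster") targets the Google-managed infrastructure that the cluster runs on. Either list could be handed to subprocess.run on a machine where the tools are installed and authenticated.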

Configuration
GKE is a revolutionary product that enables businesses to build highly customizable, scalable, and resilient applications while taking away much of the burden from managing the underlying elements of the GKE infrastructure. When you’re building out your GKE environment, most of the work you’ll do will involve designing a strong network, identity and access model, and operational management process for your GKE environment. Do a proper GKE deep dive to understand most of the work that goes into building your GKE environment.
For the context of the exam, there are a few key configuration items you should be aware of that can be related to cluster-level configuration or pod-level configuration hosted in pod templates. Here are a few examples of items that would be useful to know:
- You can attach persistent disks to your cluster nodes and determine which pods can access these shared storage volumes as well.
- You can leverage StatefulSets, which give pods unique, persistent identities and stable host names that are maintained by GKE, to deploy stateful applications and clustered applications that save data to your persistent storage.
- Your Kubernetes clusters can leverage the Horizontal Pod Autoscaler, which is an autoscaling feature that enables your workloads to increase or decrease the number of pods automatically based on demand.
- You can also scale your pods vertically using the Vertical Pod Autoscaler, which will make changes to your pods’ CPU and memory as needed.
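The scaling decision made by the Horizontal Pod Autoscaler can be sketched with the ratio-based formula given in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The Python below is a simplified illustration that assumes a single metric and ignores details such as the tolerance band and stabilization windows; the function and parameter names are hypothetical.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Return the pod count suggested by the basic HPA ratio formula.

    desired = ceil(current_replicas * current_metric / target_metric),
    clamped to the range [min_replicas, max_replicas].
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

With 4 pods averaging 90 percent CPU against a 60 percent target, the formula asks for 6 pods; at 30 percent it asks for 2. The Vertical Pod Autoscaler works on a different axis, adjusting CPU and memory per pod instead of the pod count.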
 

Node Upgrades
When you’re doing rolling updates of your nodes, you can control how disruptive upgrades are to your workloads by leveraging two parameters, MaxSurge and MaxUnavailable. MaxSurge sets how many additional nodes GKE can add to the node pool during an upgrade, and MaxUnavailable sets the maximum number of existing nodes that can be unavailable at the same time. Together they determine how many nodes GKE can upgrade at one time.
Remember that reliability is one of the most important metrics for your business. When you’re doing upgrades, you’ll need to determine your performance needs and right-size to understand how you can provide minimally disruptive upgrades for the nodes that host all of your applications. Most organizations don’t trust themselves enough to do auto-upgrades, but if you can build a habit of using auto-upgrades and maintaining your optimal surge configurations, you’ve solved one of the biggest challenges of patch management.

All new node pools are automatically configured to use maxSurge=1 and maxUnavailable=0 configuration specifications. You can modify these yourself.
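The effect of these two settings can be worked through with simple arithmetic. The Python sketch below is a simplified model, not GKE’s implementation: it assumes that the number of nodes upgraded in parallel is maxSurge plus maxUnavailable, that surge nodes are extra capacity added on top of the pool, and it ignores any platform-imposed cap on parallelism. The function name and the returned field names are hypothetical.

```python
import math
from typing import Dict


def upgrade_plan(
    node_count: int, max_surge: int = 1, max_unavailable: int = 0
) -> Dict[str, int]:
    """Estimate the shape of a surge upgrade for one node pool."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if max_surge < 0 or max_unavailable < 0:
        raise ValueError("settings cannot be negative")
    parallel = max_surge + max_unavailable
    if parallel == 0:
        raise ValueError("max_surge and max_unavailable cannot both be 0")
    return {
        # How many nodes are being replaced at the same moment.
        "nodes_upgraded_in_parallel": min(parallel, node_count),
        # How many passes it takes to get through the whole pool.
        "rounds": math.ceil(node_count / parallel),
        # Largest pool size reached while surge nodes exist (affects cost).
        "peak_node_count": node_count + max_surge,
        # Smallest number of original-capacity nodes still serving.
        "min_available_nodes": max(0, node_count - max_unavailable),
    }
```

With the defaults on a five-node pool, the upgrade proceeds one node at a time over five rounds, briefly runs six nodes, and never drops below five available nodes. Raising the values, for example to maxSurge=2 and maxUnavailable=1 on six nodes, finishes in two rounds at the cost of temporarily reduced capacity.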

Cloud Functions
Not having to manage servers is great. Not managing platforms is even better. But not managing a single thing and paying only for the time your code executes and the duration it takes to run—well, that’s priceless. Cloud Functions is a FaaS offering that is an event-driven serverless execution environment. With Cloud Functions, you can run your code locally or in the cloud without having to provision any servers. It scales up or down on demand, so it is cost-effective and you pay only for what you use. Developers write the code, and Google Cloud does the rest. Cloud Functions is based on an open source FaaS framework that enables you to run functions across multiple environments to prevent lock-in, including local development environments, on-premises, Cloud Run, and other Knative-based serverless environments.
FaaS has its limitations, though, and Cloud Functions suits certain use cases best. It’s event-driven, so you can use Cloud Functions to trigger single-purpose functions that are attached to events emitted from your environment. The code itself executes in a fully managed environment that falls under Google Cloud’s purview in the shared responsibility matrix. You can use JavaScript, Python 3, Go, or Java runtimes to write your functions. Some of the use cases for which you’d want to consider using a Cloud Function include extract, transform, and load (ETL) jobs; building webhooks; creating lightweight APIs; or more common use cases like triggering a Cloud Function to perform an action when a certain message has been received on a Cloud Pub/Sub topic.
One of the most popular use cases for Cloud Functions is simply to extend the capabilities of GCP services. Basically, you can tap into any GCP event you want to react to. By default, Cloud Functions supports events natively from HTTP, Cloud Storage, Cloud Pub/Sub, Cloud Firestore, and Firebase. With Cloud Logging, you can forward log entries of interest to a Pub/Sub topic by creating a sink and then triggering a Cloud Function off that event. This lets you react to almost anything that generates a log entry within your environment. Think of all the possibilities of building automations that react to incidents or any unplanned events.
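The log sink pattern described above can be sketched as follows. This is a hedged example rather than a reference implementation: it assumes the sink publishes each log entry to Pub/Sub as JSON, that the function receives it base64-encoded in `event["data"]`, and that the entry uses the standard `severity`, `logName`, and payload fields. The function name `handle_log_entry` and the set `SEVERITIES_OF_INTEREST` are hypothetical.

```python
import base64
import json

# Hypothetical choice: only react to entries at ERROR severity or above.
SEVERITIES_OF_INTEREST = {"ERROR", "CRITICAL", "ALERT", "EMERGENCY"}


def handle_log_entry(event, context=None):
    """React to a log entry forwarded through a Pub/Sub sink.

    Returns a small summary dict for entries worth acting on, or None
    for entries that should be ignored.
    """
    entry = json.loads(base64.b64decode(event["data"]).decode("utf-8"))

    severity = entry.get("severity", "DEFAULT")
    if severity not in SEVERITIES_OF_INTEREST:
        return None

    summary = {
        "severity": severity,
        "log_name": entry.get("logName"),
        # A log entry carries one kind of payload; take whichever is present.
        "message": (
            entry.get("textPayload")
            or entry.get("jsonPayload")
            or entry.get("protoPayload")
        ),
    }
    # A real automation would open a ticket, page someone, or remediate here.
    print(f"Reacting to log entry: {summary}")
    return summary
```

Filtering in the sink itself is usually the first line of defense, since it keeps uninteresting entries from triggering the function at all; the severity check here is a second, cheap safeguard.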

Talk about scaling up or down. Cloud Functions can literally scale to zero so you don’t incur any costs when there is no activity. This is incredibly cost-efficient.

Cloud Run
Cloud Run is a serverless compute platform, built on the open source Knative framework, that runs stateless containers and abstracts away all of the infrastructure management. Essentially, it brings serverless to containers: you don’t have to manage the underlying container infrastructure. You probably won’t see any questions about Cloud Run on the exam.
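To show what a stateless container workload can look like, here is a minimal sketch of an HTTP service using only the Python standard library. It assumes the platform tells the container which port to listen on through the `PORT` environment variable, with 8080 used as the fallback here. The names `resolve_port`, `HelloHandler`, and `main` are hypothetical.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def resolve_port(environ=None):
    """Read the listening port from the environment, defaulting to 8080."""
    environ = os.environ if environ is None else environ
    return int(environ.get("PORT", "8080"))


class HelloHandler(BaseHTTPRequestHandler):
    """Stateless handler: nothing is kept in memory between requests."""

    def do_GET(self):
        body = b"Hello from a stateless container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main():
    """Start the server; call this from the container's entry point."""
    server = HTTPServer(("0.0.0.0", resolve_port()), HelloHandler)
    server.serve_forever()
```

Calling `main()` starts the server and blocks. Because the handler keeps no state between requests, any instance can serve any request, which is what allows the platform to add or remove instances freely.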

API Management
Creating, publishing, and managing APIs is quite a challenge for many organizations. If you’re in the business of creating APIs, whether internal or external, it’s not easy to govern and manage all of your API endpoints in a consistent manner. This is where API management platforms come in handy. Modern API management platforms offer the tools to develop, secure, publish, and manage your APIs in a consistent and often policy-based manner. API management platforms such as Apigee are fully developed platforms that handle the full API life cycle, whereas other solutions such as Cloud Endpoints do not offer the same breadth of features and functionality.

Apigee
Apigee, a Google acquisition, is a full end-to-end OpenAPI-compliant API management platform that enables you to manage the full API life cycle in any cloud, including multi- and hybrid-cloud environments. This incredibly robust product was ranked by Forrester as a leader in API management solutions in Q3 2020.
Apigee’s most useful capability is that it acts as an API proxy. For API clients, it presents your business services as managed “facades” for backend services. Technically, your backend need not support HTTP, be modernized, or even understand the concept of microservices. Apigee can act as a translation layer between the modern client-facing REST API presentation layer your business is exposing to its clients and whatever new or old technologies you have lurking in the back corners of your enterprise. Furthermore, as a proxy, it can control the end-to-end security, transaction rates, access controls, and so on, for your business-exposed APIs, enabling a company to modernize its business face without necessarily having to reinvent its backend.

Cloud Endpoints
Cloud Endpoints is an API management platform that enables you to secure, manage, and monitor your APIs on Google Cloud. This greatly reduces the need to manage all of your APIs manually in Google Cloud. You can build all of your API documentation in a developer-accessible portal. With Cloud Endpoints, you can choose among three options: OpenAPI, gRPC, or Cloud Endpoints Frameworks for the App Engine standard environment. This offering gives your development teams the ability to focus on developing their APIs instead of building custom frameworks.

Secure Your APIs
APIs expose application logic and sensitive data by design, which makes them a continuing target for attackers. Products such as Apigee and Cloud Endpoints can prevent many of these flaws by default or give you the ability to configure your APIs in a consistent, secure manner. Here are some recommendations to consider when you’re thinking about securing your APIs:
- Classify your APIs and design a reference architecture for the required controls and approved patterns based on the classification of the API. For example, you can have public-facing APIs, internal APIs, and partner-facing APIs, each with a different level of security controls.
- Implement rate limiting to prevent denial-of-service attacks.
- Be aware of excessive data exposure, and think about how you can filter unnecessary data before it’s displayed back to the end user of the API.
- Use a common configuration and monitor your APIs for security misconfigurations.
- Validate your inputs: ensure that your API is not vulnerable to injection flaws, such as SQL injection, so that an attacker cannot execute malicious code.
- Leverage IP whitelisting where you can.
- If you’re not using an API management platform that manages performance, scalability, and distribution, put your APIs behind a load balancer.
- Be very wary of how you’re managing authentication tokens and other authentication mechanisms.
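To make the rate-limiting recommendation more concrete, here is a minimal sketch of a token bucket, one common way to limit request rates. It is for illustration only: an API management platform would typically enforce limits like this for you through configuration, and nothing here reflects how Apigee or Cloud Endpoints implement it. The class name `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, refilling at `rate_per_second`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        if rate_per_second <= 0 or capacity <= 0:
            raise ValueError("rate_per_second and capacity must be positive")
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True if a request may proceed, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the time that has passed, never beyond the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Usage example: five requests per second with bursts of up to ten.
bucket = TokenBucket(rate_per_second=5, capacity=10)
if not bucket.allow():
    print("Too many requests; respond with HTTP 429")
```

In practice you would keep one bucket per client, for example per API key or source IP address, so that one noisy caller cannot exhaust the budget of everyone else.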

Additional References
If you’d like more information about the topics discussed in this chapter, check out these sources:
- GKE Overview https://cloud.google.com/kubernetes-engine/docs/concepts/kubernetes-engine-overview
- GCE Concepts https://cloud.google.com/compute/docs/concepts
- Machine Types https://cloud.google.com/compute/docs/machine-types
- Storage Options for Compute Engine https://cloud.google.com/compute/docs/disks


