By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.
DevOps is not just a process improvement or a combination of development and operations teams.
Life before DevOps
1. Gather features, enhancements, and bug fixes, and plan your development, testing, and deployment phases around your monthly deployment cycle.
2. Hand off your code to quality assurance (QA) after your development team has done enough coding for the month.
3. Watch the QA team scramble to test everything and send back defects. Then scramble to fix the defects and try to get the code back to QA to retest before the release timeframe.
4. During release, watch as your QA and Operations teams bicker a bit back and forth on the quality of code and whether or not it’s ready for deployment.
5. Listen to QA swear they did their job properly, and if there’s anything they didn’t identify, it’s the developers’ fault. When they finally agree to deploy the code, the Operations team deploys it.
6. Pray that it doesn’t blow up. Otherwise, your Operations team is all hands on deck trying to minimize the impact, your Dev team is trying to fix the bugs, and your QA team is scrambling to retest.
7. Spend a cycle trying to do a postmortem to figure out what went wrong and how to prevent it next time. This, of course, takes time away from your new development cycle.
8. Rinse and repeat this around every monthly release cycle time, and battle to the death with your QA and Ops colleagues about who is to blame for the last month’s issues.
9. Deal with Security, who finally musters up the courage to get involved. They’re demanding you redesign the entire application from scratch because it violates all of their policies. Uh oh.
On the other extreme are organizations that are so fearful of change (because change introduces the potential for new issues) that the cycle time to release new functionality is measured not in days or months, but in years. In those companies, the enemy of good enough is perfection. The idea that nothing should ever go wrong introduces a blame culture, where everyone is doing CYA (cover your “rear”) rather than innovating to create business value. As an architect, you need to understand your organization’s risk tolerance and look for opportunities that enable your company to be both fast and highly available. These requirements are not necessarily mutually exclusive.
The DevOps Philosophy
The philosophy of DevOps challenges decades of software development, where the people, process, and technology were built on longstanding belief systems based on clearly defined roles for developers, QA, and operations teams, and a rigid structure around the deployment process. Traditional software development uses a factory model of moving work through a conveyor belt of roles (processes), which is assumed to produce a consistent level of quality as an output. This has frequently proved to be a false axiom, however, as the individuals involved were often the reason why processes either succeeded or failed.
DevOps includes the philosophies, practices, and tools that empower your organization to deliver experiences to its consumers at high velocity and improve service reliability. DevOps is not a role, and it doesn’t eliminate the traditional roles of developers, testing, and operations teams. In fact, a job posting for a “DevOps engineer” is antithetical to the idea of DevOps, unless the role is for a DevOps evangelist or someone who can help build the organization’s cultural DevOps practice. Instead, DevOps eliminates the rigidity around the development, testing, and production operations teams by including feedback loops across each team and providing more influence over the entire development life cycle, so that team members are no longer siloed by knowing only what goes on in their individual role.
In DevOps, the combination of creating feedback loops and shifting left (i.e., shifting tasks as early in the life cycle as possible) enables teams to operate in a more dynamic and fluid fashion. Rather than moving work through people performing steps on a conveyor belt, DevOps development focuses on optimizing the entire end-to-end workflow. The repetitive work of traditional development and operations came to be seen as toil, which created an opportunity for automation: humans should focus on continuous improvement and value-creating activities rather than on repetitive tasks. Thus, automating as much of the development, testing, and operational process as possible came to be accepted as the ideal, so that organizations could focus on building great applications rather than performing repetitive work to create applications that were only good enough.
There are five key pillars of success in the DevOps philosophy:
- Reduce organizational silos
- Accept failure as normal
- Implement gradual changes
- Leverage tooling and automation
- Measure everything
DevOps does not eliminate the role of a developer, QA, or production support team. Instead, this philosophy is focused on bringing visibility into the entire life cycle and eliminating as many pitfalls as possible so that teams can quickly build, test, deploy, and identify clear, actionable feedback through the entire life cycle to minimize code defects and improve the velocity of development.
Continuous Integration and Continuous Deployment
The two halves of the DevOps infinity loop are the continuous integration and the continuous deployment of code. Continuous integration/continuous deployment (CI/CD) is the combined practice of integrating and deploying code in a way that bridges the gap between development and operations activities by enforcing automation, seamless handoffs between teams, and continuous feedback throughout each phase. This concept describes the best practices for delivering code with more velocity, fewer defects, and more business value to your consumers.
To build an end-to-end CI/CD pipeline workflow, you should do the following:
1. Plan your development cycle.
2. Write and manage your code.
3. Orchestrate the build process.
4. Test your build to identify defects and resolve them.
5. Prepare your release for deployment.
6. Deploy your code.
7. Operate and manage your application.
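As a rough illustration of the workflow above, the following Python sketch models a pipeline as an ordered list of stages that halts at the first failure. The stage names mirror the steps in the list; `run_pipeline` and the placeholder stage bodies are hypothetical and do not correspond to any real CI/CD tool.

```python
from typing import Callable, List, Tuple

# Each stage is a (name, action) pair; the action returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, action in stages:
        if not action():
            # A failed stage blocks everything downstream, so a broken
            # build never reaches the deploy stage.
            break
        completed.append(name)
    return completed


# Hypothetical stage bodies: real ones would invoke build and test tools.
example_stages: List[Stage] = [
    ("plan", lambda: True),
    ("code", lambda: True),
    ("build", lambda: True),
    ("test", lambda: False),  # simulate a failing test run
    ("release", lambda: True),
    ("deploy", lambda: True),
    ("operate", lambda: True),
]
```

With the simulated test failure, `run_pipeline(example_stages)` completes only the plan, code, and build stages, which is the behavior you want from a pipeline: nothing is released or deployed until testing passes.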
Get familiar with some of the common DevOps tools at a high level and know what they do: Jenkins, Travis CI, GitHub, Chef, Puppet, Ansible, Terraform, and Spinnaker. Because the exam is geared toward seasoned professionals, you’re expected to be familiar with a wide variety of technologies that aren’t necessarily Google Cloud technologies.
Regardless of your application, whether it is a simple web application or a fully baked customer-facing application, you probably have a massive amount of code that is organized into functions and modules and that is constantly going through updates and iterations based on customer feedback. It’s incredibly difficult to manage all of your code if it doesn’t follow a central proofing and validation process. The purpose of continuous integration is to integrate code into a shared repository so that developers can collaborate effectively to write code, builds can be automated, tests can be automatically performed, bugs can be fixed, and your release can be prepared for an automated and continuous deployment. Basically, you’re optimizing every stage of development up to the point of validating your update in a staging environment and having your release ready to be pushed into production.
You should be aware of two key tenets for agile development:
- Many smaller code changes are easier to manage than a few huge changes.
- Following a DevOps model will exercise existing and new code much more, because each small change is put through the whole end-to-end QA process.
In the end, you let the computers do most of the work for you.
Continuous Integration
Building the CI aspect of your pipeline involves centralizing the tools leveraged to plan your development cycles (such as Jira and Asana) and then providing your team with tools to manage source code (such as Git) and store code in repositories (such as GitHub, GitLab, or Cloud Source Repositories). Next you orchestrate your build process, compiling and packaging your code into a build with tools such as Maven, and store your binaries in a repository such as Artifactory. After the build is created, your QA team will likely conduct tests. You should be familiar with the following types of tests:
- Unit tests: These tests ensure that the smallest testable aspect of your code works as expected, even in isolation, by running tests against individual units of code without the full environment. External dependencies are mocked with fake versions; for instance, a database would be mocked during a unit test to ensure that the code does exactly what it’s intended to do, even with a fake integration. Unit tests are typically run by developers regularly throughout the development life cycle, often immediately before a code commit or as part of the build process after every commit. As developers write new code, they create new tests and/or update existing tests to validate the new code changes. Good developers often develop their tests before they make code changes (but that’s a conversation for another time).
- Integration tests: These tests ensure that the components and modules of code integrate and work properly with one another and are typically run before major commits that involve many components or the builds for new releases. Integration tests are based on the environment, including all the integrations. For instance, if your application depends on a database and your integration test fails, the test results could identify issues with any of the variables in the environment. Integration tests are the responsibility of your overall team. QA individuals can develop and build them, but be aware that as developers create and modify code, integration tests can break, either because the new code is faulty or because the tests need to be updated to incorporate the newly introduced changes. In high-functioning DevOps organizations, developers run published integration tests to ensure that their code changes don’t cause problems with other components of the architecture. They don’t wait for someone else to uncover an integration bug included in their code changes. Some organizations are able to allocate complete end-to-end testing capability to developers, but this is possible only when everyone works together to focus on operational efficiency!
- End-to-end tests: In these tests, an application is run from beginning to end to test the flow of the application and to ensure that the system can be validated for integration and data integrity.
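To make the unit-test description above concrete, here is a minimal sketch using Python's standard `unittest` and `unittest.mock` modules. The function `get_display_name` and the `fetch_user` method on the database object are hypothetical names invented for this example; the point is only that the database is replaced by a fake so the unit can be tested in isolation.

```python
import unittest
from unittest.mock import Mock


def get_display_name(db, user_id):
    """Return a user's display name, or 'unknown' if the user is missing."""
    row = db.fetch_user(user_id)  # hypothetical database call
    if row is None:
        return "unknown"
    return row["name"].title()


class GetDisplayNameTest(unittest.TestCase):
    def test_formats_name(self):
        # The real database is replaced by a mock with a canned answer.
        fake_db = Mock()
        fake_db.fetch_user.return_value = {"name": "ada lovelace"}
        self.assertEqual(get_display_name(fake_db, 42), "Ada Lovelace")
        fake_db.fetch_user.assert_called_once_with(42)

    def test_missing_user(self):
        fake_db = Mock()
        fake_db.fetch_user.return_value = None
        self.assertEqual(get_display_name(fake_db, 7), "unknown")
```

An integration test for the same function would pass in a connection to a real (test) database instead of a `Mock`, so a failure could come from the code, the schema, or the environment.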
Many other types of tests are often used throughout this phase, triggered either manually by humans or automatically by applications. After testing, you can usually do static code analysis by running code-scanning applications to scan for security vulnerabilities, resource leaks, null pointer references, and many other problems in the code that may have been overlooked. Assuming that all these steps are successful, your build is pushed into a repository, and your automation server is notified that your release is ready to be deployed.
Continuous Deployment
In the continuous deployment phase, your pipeline is designed to get all the changes to your builds into production, including new features, configuration changes, and bug fixes. By the time your code is in this phase, the pipeline will have validated that it is in a deployable state. Although it’s possible that mistakes will be missed and a build that shouldn’t have been deployed gets deployed, in the DevOps model the number of defects is reduced far below what traditional development approaches achieve. Organizations that have not embraced DevOps or high-velocity IT often don’t realize that properly designed CI/CD frameworks actually “overtest” code. Code is continuously being tested in the background after each little change is introduced. Each developer is responsible for testing and validating their code changes as thoroughly as possible. Pushing out responsibility for testing to people other than developers can slow things down: other people may not share your priorities and deadlines. The best practice is to automate as much of the end-to-end testing and validation as possible.
Jenkins and Spinnaker are a great combination for building CI/CD pipelines: Jenkins handles the continuous integration, and Spinnaker handles the continuous deployment.
Infrastructure as Code
You’ve done a great job of conceptually building your pipelines, but before you share your application with the world, you need appropriate environments in which to develop it, test it, and run it in production. Applications require computing power, storage, and databases using Google Cloud products that are either managed or unmanaged. The cloud surely makes it easier to right-size infrastructure and leverage a variety of services, with native integrations, scalability, and near-instant deployments. But you can imagine that all of this infrastructure will get incredibly complex to manage, and a single misconfiguration could jeopardize your entire business. This is where leveraging infrastructure as code (IaC) comes in. IaC is the practice of writing the elements of your infrastructure in code form, which can be interpreted by tools such as Terraform, Ansible, and Google Cloud Deployment Manager. You’re basically treating your infrastructure as you would treat your software: with clear code, source code repositories, approved patterns and strong change management, misconfiguration detection and prevention, and rapid deployment.
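The core idea behind declarative IaC can be sketched in a few lines of Python: you declare the resources you want, the tool compares that declaration with what exists, and it derives the actions needed to close the gap. This is a simplified illustration only, not how Terraform, Ansible, or Deployment Manager is actually implemented; the `plan` function and the resource dictionaries are hypothetical.

```python
from typing import Dict, List, Tuple

# Resource name -> configuration, both for the declared and the real state.
Resources = Dict[str, dict]


def plan(desired: Resources, actual: Resources) -> List[Tuple[str, str]]:
    """Compare declared resources with what exists and list needed actions."""
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif desired[name] != actual[name]:
            # Configuration drift or an intentional template change.
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            # Anything not declared in the template should not exist.
            actions.append(("delete", name))
    return actions
```

Because the desired state lives in a file, it can be reviewed, versioned, and scanned like any other source code before a single resource is touched.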
There are many reasons why you should codify your infrastructure:
- Having the ability to spin up infrastructure of any scale by deploying a template speeds up the entire task of deploying infrastructure by an unmatched margin.
- Managing configurations for thousands of servers, services, and beyond by hand will always lead to human errors and misconfigurations, which could cause a whole slew of issues for your applications, from operational issues to security issues. Using IaC enables you to centrally manage these configurations, scan them for deviations, and govern them centrally.
- Eliminating the need to manually provision new servers and services minimizes the time to deploy your applications. By using IaC, if your application goes through a massive iteration and you need to attach a new database, Pub/Sub sink, or VMs, you can modify your templates rather than having to plan to perform these activities as part of your deployment.
- Freeing up development time for your team to focus on building, testing, and managing your applications, rather than constantly provisioning infrastructure manually, saves money.
- Stop looking at infrastructure as permanent. For high-velocity IT development teams to work as efficiently as possible, give them their own temporary environments to test their new code. Then destroy the environment when they are done. The whole point of public cloud is on-demand infrastructure. This can be achieved only via automation and IaC.
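The temporary-environment idea in the last bullet above maps naturally onto a context manager: provision on entry, always destroy on exit. The sketch below assumes hypothetical `provision` and `destroy` callables that you would back with your IaC tool of choice; nothing here calls a real cloud API.

```python
from contextlib import contextmanager


@contextmanager
def temporary_environment(name, provision, destroy):
    """Provision an environment, hand it to the caller, then tear it down.

    `provision` and `destroy` are hypothetical callables supplied by the
    caller, for example thin wrappers around an IaC tool.
    """
    env = provision(name)
    try:
        yield env
    finally:
        # Runs even if the tests inside the block raise, so short-lived
        # environments do not linger and accumulate cost.
        destroy(env)
```

A developer could then write `with temporary_environment("feature-x", provision, destroy) as env:` around a test run and rely on the environment being removed afterwards.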
In 2020 and beyond, using IaC tools such as Terraform is a no-brainer. While the benefits of IaC far outweigh those of provisioning by hand, you will still have risks that need to be managed. Think about it: You use a tool to create templates, you store templates in a repository, and you have a service account that has godlike access to provision the infrastructure across your entire GCP environment. If your templates are improperly modified and you do not have proper prevention and detection controls in place, a template could be deployed that compromises your entire infrastructure. In addition, your repository itself needs to be monitored, tightly access controlled, and protected against tampering. Lastly, your service accounts and their service account keys need to be controlled and monitored very rigorously to ensure that they aren’t compromised. If a service account with access to provision infrastructure across your whole stack is compromised, you’ve given an attacker the keys to the kingdom. Two approaches can be used to limit risk here. First, you could completely lock down the underlying OS that hosts IaC platforms such as Terraform or Ansible, or, better yet, use a GCP-managed service such as Deployment Manager instead. Second, you could create multiple service accounts, each with the least privileged access needed to run its specific playbooks/workflows. Thus, if a service account is compromised, the attacker could, at most, access or modify what that specific workflow job had privileges to. The key here is never to make life easier for a hacker who manages to gain some access to your environment. Godlike account powers should be under the highest level of protection, or just avoided.
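The least-privilege approach described above can be illustrated with a small Python sketch. The service account names and the mapping below are hypothetical, and the permission strings are used purely as illustrative labels; in practice this policy lives in IAM, not in application code.

```python
from typing import Dict, List, Set

# Hypothetical mapping: one narrowly scoped service account per workflow,
# instead of a single account that can provision everything.
SERVICE_ACCOUNT_PERMISSIONS: Dict[str, Set[str]] = {
    "network-deployer": {"compute.networks.create", "compute.firewalls.create"},
    "vm-deployer": {"compute.instances.create", "compute.disks.create"},
}


def is_allowed(account: str, permission: str) -> bool:
    """Return True only if the account was explicitly granted the permission."""
    return permission in SERVICE_ACCOUNT_PERMISSIONS.get(account, set())


def blast_radius(account: str) -> List[str]:
    """List everything an attacker could do with this account if it leaked."""
    return sorted(SERVICE_ACCOUNT_PERMISSIONS.get(account, set()))
```

Here a compromised `vm-deployer` key cannot touch firewall rules, because `is_allowed("vm-deployer", "compute.firewalls.create")` is false; the damage is limited to what that one workflow was allowed to do.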
Protect your service accounts, templates, and repositories to ensure that every action is deliberately taken and validated before being authorized to commit.
Deployment Strategies
There is no one-size-fits-all approach to deploying applications and changes into production. Oftentimes, based on availability requirements, performance requirements, complexities of your application, and your knowledge and success in deploying applications, you may need to leverage different strategies to deploy code at different times. For the purpose of your exam, you’ll need to be familiar with a few of the main deployment strategies.
Blue-Green Deployment
Blue-green deployment involves creating two identical environments (a blue environment and a green environment) so that when a release is deployed to one environment, the other environment can be held as a reserve. The idea is that you can deploy the release to one environment and switch all your users over to the new release, while still maintaining your old environment in case you need to fall back to it, without having to do a full rollback. As you can imagine, the infrastructure costs double, and if the application footprint is too large, this is not always a feasible deployment strategy.
Blue-green deployments are a very effective way to avoid having to do unplanned rollbacks by having another production environment available in the event things go wrong. Rollbacks take a lot of time and are very disruptive. It’s easier to push your users onto your green deployment if the blue one goes bad. On the exam, you’ll see scenario-based questions that ask for the recommended deployment strategy, so ensure you have a high-level understanding of these strategies.
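The blue-green mechanics described above can be sketched as a tiny router that tracks which environment is live. The `BlueGreenRouter` class is a hypothetical illustration; in a real system the switch would be a load balancer or DNS change rather than an attribute assignment.

```python
class BlueGreenRouter:
    """Track two identical environments and which one serves users."""

    def __init__(self, initial_version):
        self.versions = {"blue": initial_version, "green": None}
        self.live = "blue"

    @property
    def idle(self):
        """Name of the environment that is not currently serving users."""
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version):
        # New releases always go to the idle environment, so users are
        # unaffected until the switch.
        self.versions[self.idle] = version

    def switch(self):
        # Flip all traffic at once; calling switch() again is the fallback.
        self.live = self.idle

    def serve(self):
        return self.versions[self.live]
```

Falling back is just a second call to `switch()`, because the previous release is still running untouched in the other environment.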
Rolling Deployment
In a rolling deployment, you maintain one production environment that may consist of many servers with a load balancer in front of them. When you deploy your application, you stagger the deployment across servers, so that some servers run the new application version and others continue to host the old version. This enables you to test real-life traffic and load and potentially identify issues before the application is fully deployed. If you do have an issue, you can just divert all of your users to the servers that do not have the latest release (rather than having to roll back servers entirely). This can be a complicated process, especially around major changes, because the support team managing your application will have to understand how to troubleshoot both users on the older version and users who’ve been routed to the new version.
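A rolling deployment can be sketched as a loop that updates servers in batches and pauses as soon as a batch looks unhealthy. The function names, the `healthy` callback, and the server names below are hypothetical; a real rollout would be driven by your deployment tooling and load balancer.

```python
from typing import Callable, Dict, List


def rolling_deploy(
    servers: Dict[str, str],
    new_version: str,
    batch_size: int,
    healthy: Callable[[str], bool],
) -> Dict[str, str]:
    """Update servers batch by batch; stop after the first unhealthy batch.

    `servers` maps server name to its current version. A new mapping is
    returned; the input is left unchanged.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    fleet = dict(servers)
    names = sorted(fleet)
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        for name in batch:
            fleet[name] = new_version
        if not all(healthy(name) for name in batch):
            # Leave the remaining servers on the old version.
            break
    return fleet


def servers_on(fleet: Dict[str, str], version: str) -> List[str]:
    """Servers the load balancer could still route to for a given version."""
    return sorted(name for name, v in fleet.items() if v == version)
```

If the rollout stops early, `servers_on(fleet, old_version)` gives the servers that users can be diverted to while the problem is investigated.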
Canary Deployment
Canary deployment involves making the new release available to a subset of users before other users. It is similar to a rolling deployment in the sense that some of your users will get access to the new release before others. But in the canary deployment, you’re targeting users, not servers. Your infrastructure costs will be higher with this type of deployment because you are maintaining two sets of infrastructure, though your usage on the infrastructure where you target your canary users probably won’t be too high if your application is designed to scale on demand.
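One common way to target users rather than servers is to hash each user ID into a bucket and send a fixed percentage of buckets to the new release. The sketch below is one possible approach under that assumption, not a prescribed method; `in_canary` is a hypothetical helper name.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Decide whether a user belongs to the canary group.

    SHA-256 is used instead of Python's built-in hash() because the
    built-in string hash is randomized per process, and a user must land
    in the same group on every request.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < percent
```

Raising `percent` gradually widens the canary group without moving anyone out of it, since a user whose bucket is below 10 is also below 50.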
A/B Deployment
A/B deployment is more focused on testing different changes on end users to understand which they prefer. The idea here is to have half of your users work with version A, while the other half works with version B. It’s a way for you to understand how your customers are using your new version and derive insights from their usage patterns to drive customer happiness.
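The "derive insights" step of an A/B deployment usually comes down to comparing a metric between the two groups. As a minimal sketch, assuming each recorded event is a hypothetical `(variant, converted)` pair, the per-variant conversion rate could be computed like this:

```python
from typing import Dict, Iterable, Tuple


def conversion_rates(events: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute the fraction of converting users for each variant."""
    totals: Dict[str, int] = {}
    wins: Dict[str, int] = {}
    for variant, converted in events:
        totals[variant] = totals.get(variant, 0) + 1
        wins[variant] = wins.get(variant, 0) + (1 if converted else 0)
    return {variant: wins[variant] / totals[variant] for variant in totals}
```

A real analysis would also check that the difference between the variants is statistically meaningful before picking a winner; that step is omitted here.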
In Tesla vehicles, users can opt in to an “advanced” software delivery method, which is similar to A/B deployment. This gives some customers early access to software, ahead of others. This is Tesla’s way of offering more risk-tolerant users the ability to get their hands on software sooner than other users, and it helps Tesla ensure that its software is ready before deploying it to everybody.
Deployment Tools
Don’t underestimate the power of Google-managed tools within GCP over those you have to self-administer. Many of these are serverless, which makes them operationally and administratively more cost-effective than even open source solutions. There are a few Google-native tools you can leverage for various aspects of the CI/CD process, all of which have their pros and cons.
Google Cloud Deployment Manager
Cloud Deployment Manager enables you to create and manage cloud resources using deployment templates by treating your infrastructure as code and simplifying the deployment process. It’s similar to AWS CloudFormation templates or Terraform configurations. Cloud Deployment Manager is Google’s homegrown tool to help you do IaC provisioning and management in GCP. This tool isn’t that popular, however. Most organizations opt to use Terraform because of its multi-cloud nature, and because using an open source tool makes it easier for developers to code and helps prevent vendor lock-in. Cloud Deployment Manager does not integrate across multi-cloud environments (nor does AWS CloudFormation). Cloud Deployment Manager uses YAML (a recursive acronym for “YAML Ain’t Markup Language,” originally “Yet Another Markup Language”) to describe a deployment, similar to Ansible playbooks and Kubernetes (K8s) manifests. Deployment Manager configurations can also import templates written in Jinja2 or Python, which makes deployments very flexible and powerful.
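As an illustration of the Python template support, the sketch below follows the general shape of a Deployment Manager Python template: a `GenerateConfig(context)` function that returns a dictionary of resources. Treat the specifics as assumptions: the `zone` and `machineType` property names are chosen for this example, and the resource properties shown (image family, default network) have not been validated against a live project.

```python
COMPUTE_URL_BASE = "https://www.googleapis.com/compute/v1/"


def GenerateConfig(context):
    """Return the resources for this deployment as a plain dictionary.

    `context.env` holds deployment metadata and `context.properties`
    holds the values passed in from the YAML configuration.
    """
    project = context.env["project"]
    zone = context.properties["zone"]  # assumed property name
    machine_type = context.properties["machineType"]  # assumed property name

    vm = {
        "name": context.env["deployment"] + "-vm",
        "type": "compute.v1.instance",
        "properties": {
            "zone": zone,
            "machineType": "{}projects/{}/zones/{}/machineTypes/{}".format(
                COMPUTE_URL_BASE, project, zone, machine_type
            ),
            "disks": [{
                "boot": True,
                "autoDelete": True,
                "initializeParams": {
                    # Illustrative image choice, not a recommendation.
                    "sourceImage": COMPUTE_URL_BASE
                    + "projects/debian-cloud/global/images/family/debian-11",
                },
            }],
            "networkInterfaces": [{
                "network": COMPUTE_URL_BASE
                + "projects/" + project + "/global/networks/default",
            }],
        },
    }
    return {"resources": [vm]}
```

Because the template is ordinary Python, loops and conditionals can generate many similar resources from a handful of properties, which is where templates go beyond static YAML.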
You can leverage Kubernetes deployment files as a way to create a declarative template to provision your Pod infrastructure using gcloud to provision your clusters and kubectl to run your deployment template.
Cloud Build
Cloud Build is a serverless CI/CD platform that has curated the steps to build, test, and deploy code into GCP. You can build software using all programming languages, have complete control over the CI/CD workflow, and deploy across multiple environments, whether they are VMs, K8s, or managed services. Cloud Build integrates with many third parties across the workflow, from build tools such as Apache Maven, to continuous delivery platforms such as Spinnaker, as well as across multiple cloud service providers. Although most organizations have not yet hit the point of using cloud-native CI/CD tools, Cloud Build is a simple and cutting-edge product that should certainly be explored by organizations as an easy way to manage their pipelines.
Spinnaker is an open source continuous delivery platform originally developed by Netflix and then picked up by Google. It has been validated by thousands of teams with millions of deployments, and it has consistently proved its worth in improving velocity and deployment confidence. Spinnaker supports multiple cloud providers.
Cloud Source Repositories
Cloud Source Repositories provides private Git repositories that you can use to design, develop, and securely manage your code. It enables you to extend your Git workflow by connecting to other tools such as Pub/Sub, Cloud Monitoring/Logging, and more. You can mirror code from GitHub or Bitbucket to get powerful code search, browsing, and diagnostic capabilities. You can also use regular expressions to refine your search across the directories.
TIP For those security-conscious architects out there, please don’t underestimate the value of mirroring repositories from public repositories. You can create an effective air-gapped, change management–controlled architecture. This can prevent unauthorized access and the use of dangerous and untested code that often lives in these public repositories, and it’s an effective way to prevent the exfiltration of data out of your environment to these repositories.
Container Registry
Container Registry is a private Docker repository in which you can store, manage, and secure your Docker container images. You can also perform vulnerability analysis and manage access control to the container images. With Container Registry, you can integrate your CI/CD pipelines to design fully automated Docker pipelines. When you’re using Container Registry, you can automatically build and push images to your private registries immediately upon committing code to a source code repository such as Cloud Source Repositories, GitHub, or Bitbucket.