Fatskills
Study Guide: Principles of Product Management: Cloud Computing (IaaS, PaaS, SaaS – AWS, GCP, Azure Fundamentals for PMs)
Source: https://www.fatskills.com/product-management/chapter/product-management-cloud-computing-iaas-paas-saas-aws-gcp-azure-fundamentals-for-pms

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.



What This Is

Cloud computing is the on-demand delivery of computing resources (servers, storage, databases, networking, software) over the internet, so teams no longer need to buy and operate their own physical infrastructure. For PMs, it is the backbone of scalable, cost-efficient, and flexible product development: it enables rapid iteration, global reach, and pay-as-you-go pricing. Example: Slack runs on AWS, which lets it serve millions of concurrent users without operating its own data centers. Without cloud fundamentals, PMs risk over-engineering, underestimating costs, or missing opportunities to use managed services (e.g., AWS Lambda for serverless functions).


Key Terms & Frameworks

  1. IaaS (Infrastructure as a Service)
    • Definition: Raw computing resources (virtual machines, storage, networking) delivered over the cloud. You manage the OS, apps, and data; the provider handles hardware.
    • Example: AWS EC2, Azure Virtual Machines, Google Compute Engine.

  2. PaaS (Platform as a Service)
    • Definition: A managed platform for developing, testing, and deploying apps without managing underlying infrastructure. Includes OS, middleware, and runtime.
    • Example: Heroku, Google App Engine, AWS Elastic Beanstalk.

  3. SaaS (Software as a Service)
    • Definition: Fully managed software delivered via the cloud (no installation or maintenance). Users access it via a browser or API.
    • Example: Salesforce, Dropbox, Zoom.

  4. Serverless Computing
    • Definition: Event-driven execution of code (e.g., AWS Lambda, Google Cloud Functions) where you pay per invocation, not for idle servers.
    • Use Case: Processing file uploads (e.g., resizing images when a user uploads to a SaaS app).

  5. CAPEX vs. OPEX
    • CAPEX (Capital Expenditure): Upfront costs for physical infrastructure (e.g., buying servers).
    • OPEX (Operational Expenditure): Recurring cloud costs (e.g., AWS bills). Cloud shifts spending from CAPEX to OPEX.

  6. Shared Responsibility Model
    • Definition: Cloud providers secure the infrastructure (physical data centers, networking), while customers secure their data, apps, and access (e.g., IAM policies, encryption).
    • Trap: Assuming the cloud provider handles all security (e.g., misconfigured S3 buckets leaking data).

  7. Multi-Cloud vs. Hybrid Cloud
    • Multi-Cloud: Using multiple cloud providers (e.g., AWS + GCP) to avoid vendor lock-in or leverage best-of-breed services.
    • Hybrid Cloud: Mixing on-premise infrastructure with cloud (e.g., sensitive data on-prem, scalable apps in the cloud).

  8. Cost Optimization Formula (Cloud Unit Economics)
    • Formula: Cost per User = (Total Cloud Spend) / (Active Users)
    • Variables:
      • Total Cloud Spend: Sum of all cloud services (compute, storage, bandwidth).
      • Active Users: DAU/MAU or other relevant metric.
    • Use Case: Justify cloud costs to finance teams or prioritize cost-saving features (e.g., auto-scaling).
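
As a minimal sketch of the Cost per User formula above, the snippet below divides one month of cloud spend by monthly active users. The spend categories and all dollar and user figures are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def cost_per_user(total_cloud_spend: float, active_users: int) -> float:
    """Return cloud spend per active user for one billing period."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return total_cloud_spend / active_users


# Hypothetical month of spend by category, in USD.
monthly_spend = {"compute": 12_000.0, "storage": 3_000.0, "bandwidth": 5_000.0}
total = sum(monthly_spend.values())

# Hypothetical 40,000 monthly active users.
print(f"Cost per MAU: ${cost_per_user(total, 40_000):.2f}")
```

Tracking this number month over month is usually more informative than the absolute bill, because it shows whether spend is growing faster than usage.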

  9. Well-Architected Framework (AWS/GCP/Azure)
    • 6 Pillars (AWS):
      1. Operational Excellence (automate deployments, monitor systems).
      2. Security (least privilege, encryption, IAM).
      3. Reliability (multi-AZ deployments, backup strategies).
      4. Performance Efficiency (right-size resources, use caching).
      5. Cost Optimization (reserved instances, spot instances).
      6. Sustainability (reduce the environmental impact of workloads; added in 2021).

  10. SLOs (Service Level Objectives) & SLAs (Service Level Agreements)
    • SLO: Internal target for reliability (e.g., "99.9% uptime for the API").
    • SLA: Contractual promise to customers (e.g., "99.5% uptime or we refund 10%").
    • Example: If your SLO is 99.9%, you can have up to 8.76 hours of downtime per year; 99.95% allows 4.38 hours.
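
The downtime figures in the SLO example follow directly from the uptime percentage. A small sketch, assuming a 365-day year (8,760 hours) and a hypothetical helper name:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(slo_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted by an uptime SLO over the given period."""
    return period_hours * (1 - slo_percent / 100)


for slo in (99.5, 99.9, 99.95):
    print(f"{slo}% uptime -> {allowed_downtime_hours(slo):.2f} hours/year")
```

Passing a different `period_hours` (for example 730 for a month) gives the budget for a shorter reporting window.
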
  11. Cloud Migration Strategies (6 R’s)
    • Rehost ("Lift and Shift"): Move apps to the cloud without changes (fastest, but captures the fewest cloud benefits).
    • Replatform: Make targeted optimizations for the cloud (e.g., move from a self-managed DB to RDS).
    • Repurchase: Switch to a SaaS alternative (e.g., replace on-prem CRM with Salesforce).
    • Refactor: Rewrite apps to be cloud-native (e.g., microservices, serverless).
    • Retire: Decommission unused apps.
    • Retain: Keep some apps on-prem (e.g., legacy systems).

  12. Cloud Pricing Models
    • On-Demand: Pay per use with no commitment (e.g., $0.10/hour for an EC2 instance).
    • Reserved Instances: Commit to 1- or 3-year terms for discounts (up to 72% on AWS).
    • Spot Instances: Run on the provider’s spare capacity at up to 90% off; the provider can reclaim the instance on short notice, so use them for interruptible work.
    • Savings Plans: Flexible discounts in exchange for a committed hourly spend (e.g., AWS Savings Plans).
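
To make the on-demand versus reserved trade-off concrete, here is a rough sketch. It assumes a 730-hour month, the $0.10/hour example rate from the list above, a hypothetical 40% reserved discount, and no upfront payment; real pricing varies by instance type, term, and payment option.

```python
HOURS_PER_MONTH = 730  # common billing approximation


def on_demand_cost(hourly_rate: float, hours_used: float) -> float:
    """On-demand is billed only for the hours actually used."""
    return hourly_rate * hours_used


def reserved_cost(hourly_rate: float, discount: float,
                  hours_in_period: float = HOURS_PER_MONTH) -> float:
    """Reserved capacity is billed for every hour of the term, used or not."""
    return hourly_rate * (1 - discount) * hours_in_period


def break_even_utilization(discount: float) -> float:
    """Fraction of the period an instance must run before reserving is cheaper."""
    return 1 - discount


rate = 0.10      # $/hour, example rate from the text
discount = 0.40  # hypothetical reserved discount

for hours in (200, 438, 730):
    od = on_demand_cost(rate, hours)
    ri = reserved_cost(rate, discount)
    print(f"{hours:>3} h used: on-demand ${od:.2f} vs reserved ${ri:.2f}")
print(f"Break-even utilization: {break_even_utilization(discount):.0%}")
```

The design point is that a reservation only pays off when the instance runs for more than `1 - discount` of the period; below that, on-demand is cheaper despite the higher hourly rate.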

Step-by-Step / Process Flow

How to Apply Cloud Knowledge in a Product Scenario

Example: Your startup wants to launch a new AI-powered feature (e.g., real-time document translation) but needs to scale globally while controlling costs.

  1. Assess Requirements & Constraints
    Actions:
    • Define non-functional requirements (NFRs): latency (<200ms), uptime (99.9%), data residency (GDPR compliance).
    • Identify user segments (e.g., free vs. enterprise users) and their expected usage (e.g., 10K requests/day).
    • List technical dependencies (e.g., GPU for AI inference, CDN for global delivery).

  2. Choose the Right Cloud Service Model
    Actions:
    • SaaS? If the feature is a standalone product (e.g., a translation API), consider selling it as a SaaS (e.g., Google Translate API).
    • PaaS? If you’re building a custom app, use PaaS (e.g., Google App Engine) to avoid managing servers.
    • IaaS? If you need fine-grained control (e.g., custom GPU clusters), use IaaS (e.g., AWS EC2).
    Example: For the translation feature, use a serverless, PaaS-style stack (AWS Lambda + API Gateway) so capacity scales with request volume.

  3. Design for Scalability & Cost Efficiency
    Actions:
    • Auto-scaling: Lambda scales out automatically with request volume; set reserved concurrency to cap it, or provisioned concurrency to keep instances warm.
    • Caching: Use Amazon ElastiCache (Redis) to cache frequent translations.
    • Multi-Region Deployment: Deploy in us-east-1 (Virginia) and eu-west-1 (Ireland) to reduce latency.
    • Cost Controls: Set AWS Budgets alerts to avoid bill shock (e.g., notify if spend exceeds $1K/month).
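
The caching action above usually follows a cache-aside pattern: check the cache first and call the expensive model only on a miss. The sketch below uses a small in-memory class as a stand-in for a shared cache such as Redis; `TTLCache`, `translate_with_cache`, and `fake_translate` are hypothetical names, not part of any AWS SDK.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory stand-in for a shared cache such as Redis."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def get(self, key: Tuple[str, str]) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: Tuple[str, str], value: str) -> None:
        self._store[key] = (self._clock() + self._ttl, value)


def translate_with_cache(text: str, target_lang: str, cache: TTLCache,
                         translate_fn: Callable[[str, str], str]) -> str:
    """Cache-aside: look in the cache first, call the model only on a miss."""
    key = (text, target_lang)
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = translate_fn(text, target_lang)
    cache.set(key, result)
    return result


calls = []


def fake_translate(text: str, target_lang: str) -> str:
    """Hypothetical placeholder for the real (expensive) translation call."""
    calls.append((text, target_lang))
    return f"[{target_lang}] {text}"


cache = TTLCache(ttl_seconds=300)
first = translate_with_cache("hello", "fr", cache, fake_translate)
second = translate_with_cache("hello", "fr", cache, fake_translate)
print(first, second, f"model calls: {len(calls)}")
```

The second request is served from the cache, so the model is invoked only once. For a PM, the cache hit rate translates directly into lower compute cost and lower latency.
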
  4. Implement Security & Compliance
    Actions:
    • IAM: Follow least privilege (e.g., the Lambda role can access only the S3 bucket that holds translations).
    • Encryption: Encrypt data at rest with AWS KMS keys and require TLS 1.2+ for data in transit.
    • Compliance: Use AWS Artifact to download audit reports (e.g., SOC 2, ISO 27001) that support your own GDPR compliance work.
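
As an illustration of the least-privilege action above, the sketch below builds an IAM policy document that allows only object reads and writes under one prefix of one bucket. The bucket name, prefix, and `Sid` are hypothetical; the `Version`, `Effect`, `Action`, and `Resource` keys follow the standard IAM policy JSON layout.

```python
import json


def least_privilege_s3_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy limited to reading and writing objects under a prefix."""
    resource = f"arn:aws:s3:::{bucket}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "TranslationObjectsOnly",  # hypothetical statement id
                "Effect": "Allow",
                # Only the two actions the function needs, not "s3:*".
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": resource,
            }
        ],
    }


# Hypothetical bucket and prefix for the translation feature.
policy = least_privilege_s3_policy("example-translations-bucket", "translations/")
print(json.dumps(policy, indent=2))
```

The contrast to watch for in reviews is a policy with `"Action": "s3:*"` or `"Resource": "*"`, which grants far more than the feature needs.
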
  5. Monitor & Optimize
    Actions:
    • CloudWatch: Set up dashboards for latency, error rates, and cost.
    • SLOs: Define SLOs (e.g., "99.9% of requests <200ms") and set up alerts.
    • Cost Optimization: Use AWS Cost Explorer to identify idle resources (e.g., unused EBS volumes).
    • User Feedback: Monitor NPS or support tickets for performance issues (e.g., "Translations are slow in Asia").
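
A latency SLO such as "99.9% of requests <200ms" is checked by counting the share of requests under the threshold. A minimal sketch with made-up latency samples (in practice these would come from your monitoring system):

```python
from typing import Sequence


def slo_compliance(latencies_ms: Sequence[float],
                   threshold_ms: float = 200.0) -> float:
    """Fraction of requests that finished faster than the threshold."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    good = sum(1 for ms in latencies_ms if ms < threshold_ms)
    return good / len(latencies_ms)


# Hypothetical sample: 998 fast requests and 2 slow ones.
samples = [120.0] * 998 + [450.0] * 2
target = 0.999

compliance = slo_compliance(samples)
meets_slo = compliance >= target
print(f"compliance={compliance:.3%} target={target:.1%} meets_slo={meets_slo}")
```

Here two slow requests out of a thousand are enough to miss a 99.9% target, which is why tight SLOs need large sample windows and careful alert thresholds.
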
  6. Plan for Failure
    Actions:
    • Chaos Engineering: Use AWS Fault Injection Simulator to test failure scenarios (e.g., kill a Lambda function).
    • Backup & DR: Enable S3 versioning and cross-region replication for critical data.
    • Fallback Mechanism: If the AI model fails, return a cached translation or a "Try again later" message.
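
The fallback mechanism above can be sketched as "retry, then serve a cached result, then degrade to a message". The function and variable names are hypothetical, and a plain dict stands in for whatever cache the product uses.

```python
from typing import Callable, Dict, Tuple

FALLBACK_MESSAGE = "Translation is temporarily unavailable. Please try again later."


def translate_with_fallback(text: str, target_lang: str,
                            translate_fn: Callable[[str, str], str],
                            cache: Dict[Tuple[str, str], str],
                            retries: int = 2) -> str:
    """Try the model a few times, then fall back to cache, then to a message."""
    key = (text, target_lang)
    for _ in range(retries + 1):
        try:
            result = translate_fn(text, target_lang)
            cache[key] = result  # remember the last good answer
            return result
        except Exception:
            # Sketch only: production code should catch specific errors
            # and wait between attempts.
            continue
    if key in cache:
        return cache[key]
    return FALLBACK_MESSAGE


def broken_model(text: str, target_lang: str) -> str:
    """Hypothetical model call that always fails, to exercise the fallback."""
    raise RuntimeError("model unavailable")


stale_cache = {("hello", "fr"): "bonjour"}
print(translate_with_fallback("hello", "fr", broken_model, stale_cache))
print(translate_with_fallback("goodbye", "fr", broken_model, stale_cache))
```

The first call returns the cached translation and the second returns the "try again later" message, so users see a degraded answer rather than an error page.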

Common Mistakes

  1. Mistake: Assuming cloud = infinite scalability (e.g., "We’ll just throw more Lambda functions at it!").
    • Correction: Cloud services have quotas and hard limits (e.g., a Lambda invocation can run for at most 15 minutes with up to 10 GB of memory, and concurrent executions are capped by an adjustable per-region quota). Design for throttling (e.g., queue requests with SQS) and cost spikes (e.g., set budget alerts).
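
The idea behind queueing requests is that a burst gets buffered and worked off at a rate the backend can sustain, instead of being rejected. The sketch below uses an in-process `deque` as a stand-in for a managed queue such as SQS; the function name and the per-tick limit are hypothetical.

```python
from collections import deque
from typing import List, Sequence


def drain_in_batches(requests: Sequence[str], max_per_tick: int) -> List[List[str]]:
    """Work off a backlog at a fixed rate; return the batch handled on each tick."""
    if max_per_tick <= 0:
        raise ValueError("max_per_tick must be positive")
    queue = deque(requests)
    batches = []
    while queue:
        size = min(max_per_tick, len(queue))
        batches.append([queue.popleft() for _ in range(size)])
    return batches


# Hypothetical burst of 7 requests against a backend that handles 3 per tick.
burst = [f"req-{i}" for i in range(7)]
batches = drain_in_batches(burst, max_per_tick=3)
print([len(batch) for batch in batches])
```

Every request is eventually processed in arrival order; the cost of the burst shows up as added wait time rather than as errors.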

  2. Mistake: Ignoring data egress costs (e.g., "We’ll just move data between AWS and GCP for free!").
    • Correction: Cloud providers typically charge roughly $0.01–$0.12/GB for data leaving their network. Minimize cross-cloud transfers, or serve traffic through a CDN that sits in front of both clouds (e.g., Cloudflare).
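
A quick back-of-the-envelope calculation shows why egress matters. The sketch below applies the two ends of the per-GB price range quoted above to a hypothetical 50 TB of monthly cross-cloud traffic.

```python
def egress_cost(gb_transferred: float, price_per_gb: float) -> float:
    """Monthly charge for data leaving a provider's network."""
    return gb_transferred * price_per_gb


monthly_gb = 50_000  # hypothetical: about 50 TB moved between clouds per month

for price in (0.01, 0.12):
    print(f"${price:.2f}/GB -> ${egress_cost(monthly_gb, price):,.2f}/month")
```

At the upper end of the range, the same traffic costs twelve times as much, so the exact transfer path (same region, cross-region, or out to the internet) is worth confirming before committing to a multi-cloud design.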

  3. Mistake: Over-provisioning resources (e.g., "Let’s use a 32-core EC2 instance just in case!").
    • Correction: Start with small instances and use auto-scaling. Use AWS Compute Optimizer to right-size resources.

  4. Mistake: Treating cloud costs as "someone else’s problem" (e.g., "Engineering handles the AWS bill").
    • Correction: PMs must own cloud unit economics. Track cost per user and cost per feature to justify spend to finance teams.

  5. Mistake: Not designing for failure (e.g., "Our app is 100% reliable because it’s in the cloud!").
    • Correction: Cloud services do fail (e.g., AWS outages in 2021). Design for redundancy (multi-AZ), retries, and graceful degradation.

PM Interview / Practical Insights

  1. Tricky Distinction: IaaS vs. PaaS vs. SaaS
    • Interviewer Probe: "Should we build our new analytics dashboard on IaaS, PaaS, or SaaS?"
    • Answer:
      • SaaS: If you want a turnkey solution (e.g., Tableau, Power BI) with no dev work.
      • PaaS: If you need customization (e.g., AWS QuickSight + Lambda) but don’t want to manage servers.
      • IaaS: If you need full control (e.g., custom Kubernetes cluster for real-time analytics).
    • Why It Matters: Wrong choice leads to technical debt (e.g., building on IaaS when PaaS would suffice) or vendor lock-in (e.g., SaaS with no export options).

  2. Cost Trade-offs: On-Demand vs. Reserved Instances
    • Interviewer Probe: "Our app has predictable traffic. Should we use on-demand or reserved instances?"
    • Answer:
      • Reserved Instances (RIs): If traffic is stable (e.g., 100 EC2 instances 24/7), RIs save up to 72% on AWS.
      • On-Demand: If traffic is spiky (e.g., Black Friday sales), use on-demand + auto-scaling.
    • Trap: RIs are a firm commitment. You cannot cancel them, so if your app pivots you keep paying for unused capacity (on AWS, Standard RIs can at best be resold on the Reserved Instance Marketplace).

  3. Vendor Lock-in: Multi-Cloud vs. Single Cloud
    • Interviewer Probe: "Should we use AWS, GCP, or both for our new product?"
    • Answer:
      • Single Cloud: Simpler, better discounts, and deeper integration (e.g., AWS’s AI/ML services).
      • Multi-Cloud: Avoids lock-in, but adds complexity (e.g., managing IAM across providers) and costs (e.g., data transfer fees).
    • Rule of Thumb: Start with one cloud, but design for portability (e.g., use Terraform for infrastructure-as-code).

  4. Security: Shared Responsibility Model
    • Interviewer Probe: "Who’s responsible if our customer data is leaked from an S3 bucket?"
    • Answer:
      • AWS: Secures the infrastructure (physical servers, networking).
      • You: Secure your data (e.g., enable S3 encryption, restrict IAM policies).
    • Real-World Example: In 2017, a misconfigured S3 bucket operated by a Verizon vendor exposed records for millions of customers (reports cited up to 14M). AWS wasn’t at fault.
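
Misconfigurations like the one in this example are often visible in the bucket policy itself. The sketch below flags `Allow` statements that grant access to the wildcard principal with no condition attached. It is a deliberately simplified check with hypothetical names and sample policies; real exposure also depends on ACLs and account-level public access settings.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> list:
    """IAM JSON allows a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def public_statements(policy: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return unconditioned Allow statements whose principal is the wildcard '*'."""
    findings = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            is_public = True
        elif isinstance(principal, dict):
            is_public = "*" in _as_list(principal.get("AWS", []))
        else:
            is_public = False
        if is_public and "Condition" not in stmt:
            findings.append(stmt)
    return findings


# Hypothetical policies for illustration.
risky_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-customer-data/*",
    }],
}
safe_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:role/example-app-role"},
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-customer-data/*",
    }],
}

print(len(public_statements(risky_policy)), len(public_statements(safe_policy)))
```

A check like this sits on the customer side of the shared responsibility line: the provider supplies the policy engine, but what the policy allows is up to you.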

Quick Check Questions

  1. Scenario: Your team wants to launch a new feature that requires GPU-intensive processing (e.g., video transcoding). Should you use IaaS, PaaS, or SaaS?
    • Answer: IaaS (e.g., AWS EC2 with GPU instances) or a managed service (e.g., AWS Elemental MediaConvert). SaaS isn’t viable unless you’re using a purpose-built third-party service (e.g., Mux).
    • Why: GPU-heavy work needs either direct control over the hardware (IaaS) or a managed service built for that workload; a general-purpose SaaS product offers neither.

  2. Scenario: Your cloud bill spiked 300% this month. What’s the first thing you check?
    • Answer: AWS Cost Explorer to identify the top cost drivers (e.g., unused EBS volumes, over-provisioned RDS instances, or data transfer fees).
    • Why: Cost spikes are usually due to zombie resources (unused but still running) or unexpected traffic (e.g., a DDoS attack).
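
The same "find the top cost drivers" step can be done on an exported bill. The sketch below ranks services by month-over-month increase; the service names and dollar figures are hypothetical and chosen so the total rises from $7,000 to $28,000, a 300% increase.

```python
from typing import Dict, List, Tuple


def top_cost_increases(previous: Dict[str, float], current: Dict[str, float],
                       top_n: int = 3) -> List[Tuple[str, float]]:
    """Rank services by how much their spend grew since the previous month."""
    services = set(previous) | set(current)
    deltas = {s: current.get(s, 0.0) - previous.get(s, 0.0) for s in services}
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return [(service, delta) for service, delta in ranked[:top_n] if delta > 0]


# Hypothetical per-service spend in USD.
last_month = {"EC2": 4000.0, "RDS": 2000.0, "S3": 500.0, "Data Transfer": 500.0}
this_month = {"EC2": 4200.0, "RDS": 2100.0, "S3": 750.0, "Data Transfer": 20950.0}

top = top_cost_increases(last_month, this_month)
for service, delta in top:
    print(f"{service}: +${delta:,.2f}")
```

In this made-up bill, data transfer accounts for almost the entire spike, which points the investigation at traffic patterns rather than at instance sizes.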

  3. Scenario: Your app’s latency is high in Asia. How do you fix it?
    • Answer: Deploy in a closer region (e.g., AWS ap-southeast-1 (Singapore)) and use a CDN (e.g., CloudFront) to cache static assets.
    • Why: Latency is primarily driven by physical distance and network hops.
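
The distance argument can be quantified with a lower bound on round-trip time. The sketch assumes light in optical fiber travels at roughly two thirds of its speed in a vacuum, and uses rough, illustrative distances (about 15,500 km from Virginia to Singapore, 500 km for a nearby region); real routes are longer and add switching delay.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_FACTOR = 2 / 3  # light in optical fiber travels at roughly two thirds of c


def min_round_trip_ms(distance_km: float) -> float:
    """Physical lower bound on round-trip time over fiber, in milliseconds."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_seconds * 1000


# Approximate, illustrative distances in km.
routes = {"Virginia -> Singapore": 15_500, "nearby region": 500}

for name, km in routes.items():
    print(f"{name}: at least {min_round_trip_ms(km):.0f} ms round trip")
```

With these assumptions, a single round trip to a distant region already consumes most of a 200 ms latency budget before any processing happens, which is why moving the workload closer helps more than tuning the code.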

Last-Minute Cram Sheet

  1. IaaS = Infrastructure (EC2, S3), PaaS = Platform (Heroku, App Engine), SaaS = Software (Salesforce, Zoom).
  2. Serverless = Pay per use (Lambda, Cloud Functions), not per server.
  3. Shared Responsibility Model: Cloud secures infrastructure, you secure data/apps.
  4. CAPEX (upfront) → OPEX (recurring) with cloud.
  5. Multi-Cloud = Multiple providers; Hybrid = Cloud + on-prem.
  6. Cost per User = Total Cloud Spend / Active Users.
  7. AWS Well-Architected Framework: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, Sustainability.
  8. SLO (internal) vs. SLA (contractual). 99.9% uptime = 8.76h downtime/year.
  9. Cloud Migration (6 R’s): Rehost (lift & shift), Replatform (optimize), Repurchase (switch to SaaS), Refactor (rewrite), Retire, Retain.
  10. Data egress fees ($0.01–$0.12/GB) can kill your budget if you move data between clouds.