Study Guide: Cloud ML - Google Cloud Professional Machine Learning Engineer: GCP AI Services Cheat Sheet (Which service to use when)
Source: https://www.fatskills.com/hesi/chapter/cloud-ml-cert-gcp-ml-gcp-ai-services-cheat-sheet-which-service-to-use-when

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

GCP AI Services Cheat Sheet: Which Service to Use When

(Google Cloud Professional Machine Learning Engineer Exam – Highly Practical Study Guide)


What This Is

This cheat sheet helps you pick the right GCP AI/ML service for common real-world scenarios, such as deploying a low-latency fraud detection model (Vertex AI Endpoints), extracting text from invoices (Document AI), or building a recommendation engine (Vertex AI Matching Engine). The exam tests your ability to match business needs to GCP services, so this guide focuses on decision rules, trade-offs, and exam traps (e.g., when to use AutoML vs. custom training, or BigQuery ML vs. Vertex AI).


Key Terms & Services

Core AI/ML Services

  • Vertex AI: GCP’s unified ML platform for training, deploying, and monitoring models (replaces older services like AI Platform). Best for end-to-end ML workflows (custom training, AutoML, pipelines, endpoints).
  • AutoML (Vertex AI AutoML): No-code/low-code model training for vision, NLP, tabular, and video data. Best for quick prototyping or when you lack ML expertise.
  • Vertex AI Training: Custom model training (TensorFlow, PyTorch, XGBoost) with managed infrastructure (GPUs/TPUs). Use when AutoML isn’t flexible enough (e.g., custom loss functions).
  • Vertex AI Prediction (Endpoints): Deploy models behind a REST endpoint for online (real-time, low-latency) inference. Batch prediction is a separate job type that does not require an endpoint.
  • Vertex AI Pipelines: Orchestrate ML workflows (data prep → training → deployment) using Kubeflow Pipelines (KFP) or TFX. Best for reproducible, automated ML.
  • Vertex AI Feature Store: Centralized repository for ML features (like AWS SageMaker Feature Store). Reduces training-serving skew and feature duplication across training and inference.
  • Vertex AI Matching Engine (since renamed Vertex AI Vector Search): Vector database for low-latency similarity search (e.g., recommendations, RAG, semantic search). Uses approximate nearest neighbor (ANN) for speed.
  • BigQuery ML: Train ML models directly in BigQuery (SQL-based). Best for simple models (linear regression, classification) on structured data without moving data out of BQ.
  • Document AI: Pre-trained models for document processing (invoices, receipts, forms, contracts). Extracts structured data from unstructured documents (OCR + NLP).
  • Vision AI / Natural Language API / Speech-to-Text / Translation API: Pre-trained APIs for image analysis, text sentiment, speech recognition, and translation. Use when you don’t need custom models.
  • Recommendations AI: Pre-built recommendation engine for e-commerce (like Amazon Personalize). Trains on user-item interactions (clicks, purchases).
  • TensorFlow Enterprise: Optimized TensorFlow runtime on GCP (faster training, better GPU/TPU support). Use for large-scale deep learning.

Key Concepts

  • Online vs. Batch Prediction:
    • Online (Vertex AI Endpoints): Real-time, low-latency (e.g., fraud detection).
    • Batch (Vertex AI Batch Prediction): Scheduled, high-throughput (e.g., nightly churn predictions).
  • Data Drift vs. Concept Drift:
    • Data drift (also called feature drift): The distribution of input features changes over time (e.g., user behavior shifts).
    • Concept drift: The relationship between inputs and the target changes, so model performance degrades even when inputs look similar (e.g., fraud patterns change).
    • Training-serving skew: Serving inputs differ from the training data from the start. Vertex AI Model Monitoring can check for both skew and drift.
  • Cold Start vs. Warm Start:
    • Cold start: Model takes time to load (first request is slow).
    • Warm start: Model is pre-loaded (faster, but costs more).
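
The drift idea above can be made concrete with a small sketch. It compares a training-time histogram of one feature with a serving-time histogram using Jensen-Shannon divergence, a common distance for this purpose. The bucket counts, function names, and the 0.3 threshold are illustrative assumptions, not values taken from this guide or from a specific Vertex AI configuration.

```python
import math

DRIFT_THRESHOLD = 0.3  # illustrative assumption; tune per feature


def normalize(counts):
    """Turn raw bucket counts into a probability distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; buckets with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(train_counts, serving_counts):
    """Jensen-Shannon divergence between two histograms over the same buckets.

    With log base 2 the result lies between 0 (identical) and 1 (disjoint).
    """
    p = normalize(train_counts)
    q = normalize(serving_counts)
    # The mixture m is non-zero wherever p or q is non-zero, so KL is defined.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def has_drifted(train_counts, serving_counts, threshold=DRIFT_THRESHOLD):
    """Flag a feature whose serving distribution moved past the threshold."""
    return js_divergence(train_counts, serving_counts) > threshold
```

A managed monitoring service computes comparable statistics per feature on a schedule; the sketch only shows the kind of comparison involved.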

Step-by-Step: How to Choose the Right GCP AI Service

1. Define Your Use Case

Ask:

  • Is this a pre-built task (e.g., OCR, sentiment analysis)? → Use pre-trained APIs (Vision AI, NLP API).
  • Do I need a custom model? → Use Vertex AI Training or AutoML.
  • Is the data structured (tables) or unstructured (images/text)?
    • Structured → BigQuery ML or Vertex AI Tabular.
    • Unstructured → AutoML Vision/NLP or custom training.
  • Do I need real-time or batch predictions?
    • Real-time → Vertex AI Endpoints.
    • Batch → Vertex AI Batch Prediction or BigQuery ML.
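
The questions above amount to a small decision procedure, sketched here as plain Python. The `UseCase` fields, function names, and rule ordering are assumptions made for illustration; real exam scenarios usually add constraints such as cost, team skills, and latency.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    """Hypothetical description of a scenario, one field per question."""
    prebuilt_task: bool              # e.g. OCR, sentiment, translation
    structured_data: bool            # tables rather than images/text
    data_in_bigquery: bool           # data already lives in BigQuery
    needs_custom_architecture: bool  # custom loss, novel model, etc.
    realtime: bool                   # online rather than batch predictions


def choose_training_service(uc):
    """Apply the guide's rules in order, most specific first."""
    if uc.prebuilt_task:
        return "Pre-trained API"
    if uc.needs_custom_architecture:
        return "Vertex AI Training"
    if uc.structured_data and uc.data_in_bigquery:
        return "BigQuery ML"
    return "Vertex AI AutoML"


def choose_serving_service(uc):
    """Real-time needs an endpoint; everything else can run as a batch job."""
    if uc.realtime:
        return "Vertex AI Endpoints"
    return "Vertex AI Batch Prediction"
```

For example, a SQL-savvy analyst predicting sales from BigQuery tables with nightly scoring maps to BigQuery ML for training and batch prediction for serving.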

2. Check Data Requirements

  • Data size:
    • Tables of any size already in BigQuery, simple model? → BigQuery ML (SQL-based, no data movement; it is not limited to small datasets).
    • Large dataset that needs custom code or distributed deep learning? → Vertex AI Training (supports distributed training).
  • Data type:
    • Images? → AutoML Vision or Vertex AI Training (TF/PyTorch).
    • Text? → AutoML NLP or Vertex AI Training (BERT, etc.).
    • Tabular? → AutoML Tables or Vertex AI Training (XGBoost, etc.).
  • Data location:
    • Already in BigQuery? → BigQuery ML (avoid data movement).
    • In Cloud Storage? → Vertex AI Training/AutoML.

3. Evaluate Cost & Complexity

Service              | Best For                       | Cost                                | Complexity
Pre-trained APIs     | Quick, no ML expertise needed  | Pay-per-use (cheap for low volume)  | Low
AutoML               | Fast prototyping, low-code     | Higher than custom training         | Medium
BigQuery ML          | Simple models, SQL users       | Low (BQ pricing)                    | Low
Vertex AI Training   | Custom models, full control    | High (GPU/TPU costs)                | High
Vertex AI Endpoints  | Real-time inference            | Pay-per-use + instance costs        | Medium

4. Deploy & Monitor

  • Real-time inference? → Vertex AI Endpoints (with autoscaling).
  • Batch inference? → Vertex AI Batch Prediction or BigQuery ML.
  • Monitor drift? → Vertex AI Model Monitoring (tracks training-serving skew and prediction drift on input features).
  • Retrain models? → Vertex AI Pipelines (automate retraining).
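
As a rough sketch of the real-time path, the snippet below deploys an already registered model to an endpoint with autoscaling bounds using the Vertex AI Python SDK (`google-cloud-aiplatform`). The project, location, model ID, machine type, and replica counts are placeholders, and the helper names are hypothetical; treat it as an outline under those assumptions, not a production deployment script.

```python
def replica_settings(min_replicas, max_replicas):
    """Validate autoscaling bounds before passing them to the SDK."""
    if min_replicas < 1:
        raise ValueError("this sketch assumes at least one always-on replica")
    if max_replicas < min_replicas:
        raise ValueError("max_replicas must be >= min_replicas")
    return {"min_replica_count": min_replicas, "max_replica_count": max_replicas}


def deploy_for_online_prediction(project, location, model_id,
                                 machine_type="n1-standard-4",
                                 min_replicas=1, max_replicas=3):
    """Deploy a registered model to a new endpoint and return the endpoint."""
    # Imported lazily so replica_settings() works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    model = aiplatform.Model(model_name=model_id)  # model_id is a placeholder
    return model.deploy(
        machine_type=machine_type,
        **replica_settings(min_replicas, max_replicas),
    )
```

Once deployed, the returned endpoint accepts online requests through `endpoint.predict(instances=[...])`, where the instance format depends on the model's serving container.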

5. Optimize for Performance

  • Low-latency needs? → Vertex AI Endpoints (keep replicas warm) or Vertex AI Matching Engine (for vector search).
  • High throughput? → Batch prediction or distributed training.
  • Cost-sensitive? → Spot VMs, the successor to preemptible VMs (for interruption-tolerant training), or BigQuery ML (for simple models).

Common Mistakes

Mistake 1: Using AutoML for Everything

  • Why it’s wrong: AutoML can be costly on large datasets and limits flexibility (e.g., no custom loss functions or architectures).
  • Correction: Use AutoML for quick prototyping and standard tasks; move to Vertex AI Training when you need control AutoML does not offer or when training cost at scale becomes the constraint.

Mistake 2: Ignoring BigQuery ML for Structured Data

  • Why it’s wrong: Moving data from BigQuery to Vertex AI adds complexity and cost.
  • Correction: If your data is already in BigQuery and you need a simple model (linear regression, classification), use BigQuery ML (no data movement, SQL-based).
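
To show what "SQL-based, no data movement" looks like, the sketch below builds BigQuery ML statements as strings. The dataset, model, table, and column names are hypothetical placeholders; running the SQL requires a BigQuery client or console, which is outside this snippet.

```python
def build_create_model_sql(dataset, model_name, source_table, label_column,
                           model_type="LOGISTIC_REG"):
    """Return a BigQuery ML CREATE MODEL statement that trains inside BigQuery."""
    return (
        f"CREATE OR REPLACE MODEL `{dataset}.{model_name}`\n"
        f"OPTIONS (model_type = '{model_type}', "
        f"input_label_cols = ['{label_column}']) AS\n"
        f"SELECT * FROM `{dataset}.{source_table}`"
    )


def build_predict_sql(dataset, model_name, source_table):
    """Return an ML.PREDICT query that scores rows without exporting them."""
    return (
        f"SELECT * FROM ML.PREDICT(MODEL `{dataset}.{model_name}`, "
        f"TABLE `{dataset}.{source_table}`)"
    )


if __name__ == "__main__":
    # Hypothetical names, used only to show the generated statements.
    print(build_create_model_sql("sales_ds", "churn_model", "training_rows", "churned"))
    print(build_predict_sql("sales_ds", "churn_model", "new_rows"))
```

Both statements run where the data already lives, which is the main reason to prefer BigQuery ML for simple models on warehouse tables.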

Mistake 3: Deploying Every Model as a Real-Time Endpoint

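  • Why it’s wrong: Online endpoints bill for deployed nodes around the clock, even when idle, so they are wasteful for workloads that only need periodic scoring.
  • Correction:
    • Batch predictions? → Use Vertex AI Batch Prediction (cheaper, scheduled).
    • Real-time but low traffic? → Deploy on the smallest suitable machine type with a minimum of 1 replica. Standard Vertex AI online endpoints have historically required at least one replica and do not scale to zero by default, so if idle cost matters, weigh batch prediction or a scale-to-zero platform such as Cloud Run.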
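
The cost argument can be checked with simple arithmetic. The sketch below compares an always-on endpoint with a scheduled batch job; the hourly rate and run lengths are made-up numbers for illustration, not GCP prices.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def endpoint_monthly_cost(node_hourly_rate, min_replicas=1):
    """Always-on cost: every minimum replica is billed for every hour."""
    return node_hourly_rate * min_replicas * HOURS_PER_MONTH


def batch_monthly_cost(node_hourly_rate, hours_per_run, runs_per_month, nodes=1):
    """Batch cost: nodes are billed only while a job is running."""
    return node_hourly_rate * nodes * hours_per_run * runs_per_month


if __name__ == "__main__":
    rate = 0.5  # hypothetical price per node hour
    print("endpoint:", endpoint_monthly_cost(rate))   # 0.5 * 1 * 730
    print("batch:", batch_monthly_cost(rate, 2, 30))  # 0.5 * 1 * 2 * 30
```

With these assumed numbers, a nightly two-hour batch job costs a small fraction of one idle replica, which is why the exam favors batch prediction when no request needs an immediate answer.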

Mistake 4: Not Using Vertex AI Feature Store for Shared Features

  • Why it’s wrong: Computing features separately for training and inference invites training-serving skew and wastes compute.
  • Correction: Use Vertex AI Feature Store to serve the same feature definitions to training and serving (reduces skew and duplication).

Mistake 5: Choosing Vertex AI Matching Engine for Exact Searches

  • Why it’s wrong: Matching Engine is optimized for approximate nearest neighbor (ANN), not exact matches.
  • Correction:
    • Exact search? → Use BigQuery or Cloud SQL.
    • Semantic search (RAG, recommendations)? → Use Vertex AI Matching Engine.
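
To see what ANN is approximating, here is a minimal exact (brute-force) nearest-neighbor search over embeddings using cosine similarity. This is not how Matching Engine works internally; it is the baseline that scans every vector on each query, which ANN indexes avoid by accepting a small loss in recall. The item IDs and vectors are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_top_k(query, items, k):
    """Score every item against the query and return the k most similar IDs.

    Cost grows linearly with the number of items, which is the scaling
    problem that approximate indexes are designed to sidestep.
    """
    scored = [(cosine_similarity(query, vec), item_id)
              for item_id, vec in items.items()]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]


if __name__ == "__main__":
    catalog = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}  # toy embeddings
    print(exact_top_k([1.0, 0.1], catalog, k=2))
```

Key lookups and filters (find the row with this ID, this SKU, this exact string) are a different problem entirely and belong in a database, as the correction above says.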

Certification Exam Insights

1. Service Selection Traps

  • AutoML vs. Vertex AI Training:
    • AutoML: Best for quick, low-code models (e.g., a marketing team needs a churn model fast).
    • Vertex AI Training: Best for custom models (e.g., a research team needs a novel architecture).
  • BigQuery ML vs. Vertex AI:
    • BigQuery ML: Best for SQL users with structured data (e.g., a data analyst wants to predict sales).
    • Vertex AI: Best for unstructured data (images, text) or complex models (deep learning).
  • Vertex AI Endpoints vs. Batch Prediction:
    • Endpoints: Real-time (e.g., fraud detection).
    • Batch: Scheduled (e.g., nightly churn predictions).

2. Key Constraints to Know

  • AutoML Tables:
    • Max dataset size: 100M rows (for training).
    • Max columns: 1,000.
  • Vertex AI Training:
    • Max running time: set by the job's timeout (7 days by default for custom jobs; confirm against current documentation).
    • GPU/TPU limits: Depends on quota (request increases if needed).
  • Vertex AI Endpoints:
    • Cold start latency: roughly 5-10s, varying with model size and container (mitigate with min instances).
    • Max model size: 10GB (for online prediction).

3. "Which Service?" Scenarios

  • Scenario: A retail company wants to deploy a product recommendation engine with low-latency vector search.
    • Answer: Vertex AI Matching Engine (ANN for recommendations).
  • Scenario: A bank needs to extract data from loan applications (PDFs) and store it in BigQuery.
    • Answer: Document AI (OCR + NLP) → BigQuery.
  • Scenario: A data scientist wants to train a simple logistic regression model on sales data in BigQuery.
    • Answer: BigQuery ML (no data movement, SQL-based).
  • Scenario: A gaming company needs real-time fraud detection with sub-100ms latency.
    • Answer: Vertex AI Endpoints (with warm start) + Vertex AI Model Monitoring.

Quick Check Questions

1. A healthcare startup needs to classify X-ray images with a custom deep learning model. They have 50,000 labeled images and need full control over the architecture. Which GCP service should they use?

  • Answer: Vertex AI Training (custom model, full control, supports TF/PyTorch).
  • Why? AutoML is too restrictive; Vertex AI Training allows custom architectures.

2. A logistics company wants to predict package delivery times using historical data in BigQuery. The model should be simple (linear regression) and maintained by SQL-savvy analysts. Which service is best?

  • Answer: BigQuery ML (SQL-based, no data movement, simple models).
  • Why? BigQuery ML is cheaper and easier for SQL users than Vertex AI.

3. An e-commerce site needs to deploy a real-time recommendation system that suggests products based on user behavior (clicks, purchases). Latency must be <100ms. Which service should they use?

  • Answer: Vertex AI Matching Engine (low-latency ANN for recommendations).
  • Why? Matching Engine is optimized for vector similarity search (better than Vertex AI Endpoints for this use case).

Last-Minute Cram Sheet

  1. AutoML vs. Vertex AI Training:
    • AutoML = quick, low-code (but expensive).
    • Vertex AI Training = custom, full control (cheaper for large datasets).

  2. BigQuery ML vs. Vertex AI:
    • BigQuery ML = SQL users, structured data, simple models.
    • Vertex AI = unstructured data, complex models.

  3. Real-time vs. Batch Prediction:
    • Real-time → Vertex AI Endpoints (pay for uptime).
    • Batch → Vertex AI Batch Prediction (cheaper, scheduled).

  4. Vertex AI Matching Engine:
    • ANN for similarity search (RAG, recommendations).
    • Not for exact matches (use BigQuery instead).

  5. Document AI:
    • Pre-trained OCR + NLP for documents (invoices, receipts, forms).
    • For custom document layouts, train a custom Document AI processor rather than switching to a general image model.

  6. Vertex AI Feature Store:
    • Centralized features (reduces training-serving skew, avoids recomputation).
    • Overkill for small projects (use BigQuery views instead).

  7. Vertex AI Pipelines:
    • Orchestrate ML workflows (Kubeflow/TFX).
    • Not for simple jobs (use Cloud Scheduler + Cloud Functions).

  8. Cold Start Mitigation:
    • Min instances = 1 (keeps model warm, reduces latency).
    • Costs more (always-on instance).

  9. AutoML Limits:
    • Max dataset size: 100M rows (for tabular).
    • Training time is capped by the node-hour budget you set; limits vary by data type.

  10. Vertex AI Endpoints:
    • Max model size: 10GB.
    • Autoscaling runs between the min and max replica counts set at deployment (min is at least 1 on standard endpoints).
    • Cold starts add latency (~5-10s).