By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.
Topic: Vertex Pipelines, Kubeflow, Cloud Build
MLOps (Machine Learning Operations) is the practice of automating and scaling ML workflows—from data prep to training, deployment, and monitoring—while ensuring reproducibility, governance, and CI/CD (Continuous Integration/Continuous Deployment). In GCP, Vertex AI Pipelines (managed Kubeflow Pipelines) and Cloud Build are the backbone for orchestrating ML workflows, while Kubeflow (open-source) provides a portable alternative for hybrid/multi-cloud setups.
Real-world scenario: A fintech company trains a fraud-detection model nightly on fresh transaction data. They use Vertex Pipelines to automate:
1. Data validation (BigQuery and Dataflow for cleaning)
2. Feature engineering (Vertex AI Feature Store)
3. Model training (Vertex AI Training)
4. A/B testing (Vertex AI Endpoints)
5. Rollback if drift exceeds 5% (Vertex AI Model Monitoring)
Cloud Build triggers the pipeline on Git pushes, ensuring code changes are tested before deployment.
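The rollback rule in step 5 can be sketched as a plain threshold check. This is a minimal illustration and not the Vertex AI Model Monitoring API: the names `should_roll_back` and `DRIFT_THRESHOLD`, and the assumption that drift arrives as a fraction between 0 and 1, are made up for the example.

```python
# Hypothetical helper: decide whether to roll back based on a drift score.
# Assumes drift is reported as a fraction (0.05 == 5%); not a Vertex AI API.
DRIFT_THRESHOLD = 0.05


def should_roll_back(drift_score: float, threshold: float = DRIFT_THRESHOLD) -> bool:
    """Return True when drift strictly exceeds the threshold."""
    if not 0.0 <= drift_score <= 1.0:
        raise ValueError("drift_score must be a fraction between 0 and 1")
    return drift_score > threshold


if __name__ == "__main__":
    for score in (0.02, 0.05, 0.08):
        action = "roll back" if should_roll_back(score) else "keep serving"
        print(f"drift={score:.2f} -> {action}")
```

Note that "exceeds" is read here as strictly greater than, so a score of exactly 5% keeps the current model.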
Vertex AI Pipelines: GCP’s managed service for running Kubeflow Pipelines (KFP) or TensorFlow Extended (TFX) workflows. Handles scheduling, artifact tracking, and serverless execution, so there is no Google Kubernetes Engine (GKE) cluster to provision or manage. Best for production-grade MLOps with minimal DevOps overhead.
Kubeflow Pipelines (KFP): Open-source framework for building portable ML pipelines (works on GKE, AWS EKS, or on-prem). Uses Python SDK to define pipelines as DAGs (Directed Acyclic Graphs) of components. Ideal for hybrid/multi-cloud or teams needing customization.
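To make the DAG idea concrete, here is a small sketch that uses only the Python standard library rather than the KFP SDK. The step names are hypothetical; the point is that each step declares what it depends on, and a valid execution order falls out of the graph.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step names; each key maps to the set of steps it depends on.
PIPELINE_DAG = {
    "validate_data": set(),
    "engineer_features": {"validate_data"},
    "train_model": {"engineer_features"},
    "evaluate_model": {"train_model"},
    "deploy_model": {"evaluate_model"},
}


def execution_order(dag: dict[str, set[str]]) -> list[str]:
    """Return one valid run order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    print(execution_order(PIPELINE_DAG))
```

A pipeline orchestrator does the same kind of ordering, with the difference that steps without a dependency between them can run in parallel.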
Cloud Build: GCP’s serverless CI/CD service for automating builds, tests, and deployments. Triggers pipelines on Git events (e.g., git push to Cloud Source Repositories) or schedules. Supports Docker, Terraform, and custom scripts.
Vertex AI Components: Reusable building blocks for pipelines (e.g., Vertex AI Training, Vertex AI Hyperparameter Tuning, Vertex AI Batch Prediction). Each component is a containerized step with inputs/outputs tracked in Vertex ML Metadata.
Artifact Registry: GCP’s managed container registry for storing Docker images (e.g., custom training containers). Replaces Container Registry (deprecated). Critical for reproducible builds in pipelines.
Vertex ML Metadata: GCP’s lineage tracking service for ML artifacts (datasets, models, metrics). Automatically logs inputs/outputs of pipeline steps. Enables auditability and reproducibility.
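Lineage tracking is easier to picture with a toy model. The sketch below is an in-memory stand-in for the concept and does not call the Vertex ML Metadata service; `StepRecord`, `LineageLog`, and the `gs://example-bucket/...` URIs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StepRecord:
    """One pipeline step execution with the artifact URIs it read and wrote."""
    step: str
    inputs: list[str]
    outputs: list[str]


@dataclass
class LineageLog:
    """Toy in-memory lineage store; illustrates the idea, not the service."""
    records: list[StepRecord] = field(default_factory=list)

    def log(self, step: str, inputs: list[str], outputs: list[str]) -> None:
        self.records.append(StepRecord(step, inputs, outputs))

    def upstream(self, artifact: str) -> list[str]:
        """Walk backwards from an artifact to every step that led to it."""
        steps: list[str] = []
        seen: set[str] = set()
        frontier = [artifact]
        while frontier:
            current = frontier.pop()
            if current in seen:
                continue
            seen.add(current)
            for record in self.records:
                if current in record.outputs:
                    if record.step not in steps:
                        steps.append(record.step)
                    frontier.extend(record.inputs)
        return steps


if __name__ == "__main__":
    log = LineageLog()
    log.log("preprocess", ["gs://example-bucket/raw.csv"], ["gs://example-bucket/clean.csv"])
    log.log("train", ["gs://example-bucket/clean.csv"], ["gs://example-bucket/model"])
    print(log.upstream("gs://example-bucket/model"))
```

Answering "which steps and datasets produced this model?" is the audit question that lineage tracking exists to serve.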
TFX (TensorFlow Extended): Google’s open-source end-to-end ML platform for production pipelines. Integrates with Vertex Pipelines for orchestration. Best for TensorFlow-centric workflows (e.g., TF Serving, TFX libraries like ExampleGen).
Kubeflow on GKE: Self-managed Kubeflow deployment on GKE. Offers more control than Vertex Pipelines but requires DevOps expertise (e.g., managing Istio, Katib for HPO). Useful for custom ML frameworks (e.g., PyTorch, XGBoost).
Cloud Scheduler: GCP’s cron service for triggering pipelines on a schedule (e.g., nightly retraining). Works with Cloud Functions or Cloud Build to start Vertex Pipelines.
Vertex AI Feature Store: Managed feature repository for serving pre-computed features to training/inference. Reduces training-serving skew and feature duplication. Integrates with BigQuery and Dataflow.
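Why a shared feature repository reduces training-serving skew can be shown with a toy example. This is a conceptual sketch only and does not use the Vertex AI SDK: `ToyFeatureStore`, the feature names, and the entity ID are hypothetical. Both code paths read the same pre-computed values, so they cannot drift apart through duplicated feature logic.

```python
class ToyFeatureStore:
    """In-memory stand-in for a feature repository (not the Vertex AI SDK)."""

    def __init__(self) -> None:
        self._rows: dict[str, dict[str, float]] = {}

    def write(self, entity_id: str, features: dict[str, float]) -> None:
        # Features are computed once, upstream, and stored per entity.
        self._rows.setdefault(entity_id, {}).update(features)

    def read(self, entity_id: str, names: list[str]) -> list[float]:
        row = self._rows[entity_id]
        return [row[name] for name in names]


FEATURE_NAMES = ["txn_count_7d", "avg_amount_7d"]  # hypothetical features


def training_vector(store: ToyFeatureStore, entity_id: str) -> list[float]:
    """Training path: reads stored features instead of recomputing them."""
    return store.read(entity_id, FEATURE_NAMES)


def serving_vector(store: ToyFeatureStore, entity_id: str) -> list[float]:
    """Serving path: reads the same stored features in the same order."""
    return store.read(entity_id, FEATURE_NAMES)


if __name__ == "__main__":
    store = ToyFeatureStore()
    store.write("card_123", {"txn_count_7d": 14.0, "avg_amount_7d": 52.5})
    print(training_vector(store, "card_123") == serving_vector(store, "card_123"))
```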
CI/CD for ML: The practice of automating ML workflows (testing, training, deployment) using tools like Cloud Build, GitHub Actions, or GitLab CI. Key steps:
Deploy and monitor (A/B test, rollback if needed).
Pipeline Triggers: Mechanisms to start pipelines automatically:
main
tfx.orchestration.pipeline.Pipeline
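```python
from kfp import dsl  # KFP SDK 2.x

# `preprocess` and `train` are components defined elsewhere with @dsl.component.
@dsl.pipeline
def fraud_detection_pipeline():
    data = dsl.importer(...)  # importer arguments are elided in the source
    # Pass each task's output artifact (not the task object) to the next step.
    processed_data = preprocess(data=data.output)
    model = train(data=processed_data.output)
```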
gcloud builds submit --tag gcr.io/PROJECT_ID/preprocess:v1
python:3.9-slim
kfp.compiler.Compiler().compile(pipeline_func, 'pipeline.json')
```python
from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=REGION)
job = aiplatform.PipelineJob(
    display_name="fraud-detection",
    template_path="pipeline.json",
    parameter_values={"param1": "value1"},
)
job.run()
```
data_path
model_version
cloudbuild.yaml
push
data_path: str
```python
@dsl.pipeline
def pipeline(data_path: str):
    data = dsl.importer(artifact_uri=data_path, ...)
```
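```python
# KFP SDK 2.x imports; the 1.8 SDK exposed the same names under kfp.v2.dsl.
from kfp import dsl
from kfp.dsl import Dataset, Input, Model, Output

@dsl.component
def train(data: Input[Dataset], model: Output[Model]):
    # Training logic
    model.metadata["accuracy"] = 0.95  # Log metrics
```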
gcloud
```python
job = aiplatform.PipelineJob(..., enable_caching=True)
```
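What step caching buys you can be shown with a conceptual sketch. This is not how Vertex AI Pipelines computes its cache key internally; `cache_key`, `StepCache`, and the image tag are hypothetical, and the only idea carried over is that a step with an unchanged definition and unchanged inputs can reuse its earlier result instead of running again.

```python
import hashlib
import json


def cache_key(component: str, image: str, inputs: dict) -> str:
    """Fingerprint a step from its name, container image, and input values."""
    payload = json.dumps(
        {"component": component, "image": image, "inputs": inputs},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class StepCache:
    """Toy cache: runs a step only when its fingerprint has not been seen."""

    def __init__(self) -> None:
        self._results: dict[str, object] = {}
        self.executions = 0  # counts real (non-cached) runs

    def run(self, component: str, image: str, inputs: dict, fn):
        key = cache_key(component, image, inputs)
        if key not in self._results:
            self.executions += 1
            self._results[key] = fn(**inputs)
        return self._results[key]


if __name__ == "__main__":
    cache = StepCache()
    double = lambda x: x * 2
    cache.run("double", "example-image:v1", {"x": 2}, double)  # executes
    cache.run("double", "example-image:v1", {"x": 2}, double)  # cache hit
    cache.run("double", "example-image:v2", {"x": 2}, double)  # new image, executes
    print(cache.executions)
```

The third call shows why a changed container image or parameter value triggers a fresh run even though the step name is the same.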
Trainer
A retail company wants to retrain its recommendation model daily using fresh user clickstream data. The pipeline must run automatically on a schedule, log all artifacts, and support rollback if model performance degrades. Which GCP services should they use?
Answer: Vertex AI Pipelines + Cloud Scheduler + Vertex ML Metadata + Vertex AI Model Monitoring.
- Why? Vertex Pipelines orchestrates the workflow, Cloud Scheduler triggers it daily, ML Metadata tracks artifacts, and Model Monitoring detects drift.
A data scientist wants to test a new preprocessing step in their pipeline without affecting production. They use GitHub for version control. What’s the most efficient way to implement this?
Answer: Cloud Build trigger on a feature branch + Vertex Pipelines with parameterized inputs.
- Why? Cloud Build can run the pipeline on a feature branch, and parameters allow testing without modifying production code.
A team is migrating from AWS to GCP. Their current pipeline uses SageMaker Pipelines and ECR. Which GCP services should they use for equivalent functionality?
Answer: Vertex AI Pipelines + Artifact Registry.
- Why? Vertex Pipelines replaces SageMaker Pipelines, and Artifact Registry replaces ECR.
kfp.compiler.Compiler()
enable_caching=True