Fatskills
Practice. Master. Repeat.
Study Guide: Google Cloud Certified Data Engineer: 11. Measuring, Monitoring, and Troubleshooting Machine Learning Models - Important Things To Know
Source: https://www.fatskills.com/google-cloud-certified-professional-data-engineer/chapter/google-cloud-certified-data-engineer-11-measuring-monitoring-and-troubleshooting-machine-learning-models-important-things-to-know


By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.


1. Know the three types of machine learning algorithms: supervised, unsupervised, and reinforcement learning. Supervised algorithms learn from labeled examples. Unsupervised algorithms start with unlabeled data and identify salient features, such as groups or clusters, and anomalies in a data stream. Reinforcement learning is distinct from both: it trains a model by having it interact with its environment and receive feedback on the decisions that it makes.
2. Know that supervised learning is used for classification and regression. Classification models assign discrete values to instances. The simplest form is a binary classifier, which assigns one of two values, such as fraudulent/not fraudulent or has malignant tumor/does not have malignant tumor. Multiclass classification models choose among more than two values.
3. Regression models map continuous variables to other continuous variables.
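To make the regression idea concrete, here is a minimal sketch of fitting a straight line to one continuous input by ordinary least squares. The function name `fit_line` and the sample data are illustrative assumptions, not part of the exam guide.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)


def predict(x):
    """Map a continuous input to a continuous prediction."""
    return slope * x + intercept
```

Unlike a classifier, `predict` returns a value on a continuous scale rather than one of a fixed set of labels.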
4. Understand how unsupervised learning differs from supervised learning. Unsupervised learning algorithms find patterns in data without using predefined labels. Three types of unsupervised learning are clustering, anomaly detection, and collaborative filtering.
5. Clustering, or cluster analysis, is the process of grouping instances together based on common features. Anomaly detection is the process of identifying unexpected patterns in data.
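The two techniques in point 5 can be sketched without any labels. The example below assumes k-means for clustering and a z-score rule for anomaly detection; both are common choices but are not named in the guide, and the function names, starting centroids, and threshold are illustrative.

```python
import statistics


def kmeans_1d(values, centroids, iterations=10):
    """Group 1-D values around centroids (a bare-bones k-means)."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: each value joins its nearest centroid.
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


def z_score_anomalies(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    return [v for v in values
            if spread > 0 and abs(v - mean) / spread > threshold]


# No labels are supplied anywhere; structure comes from the data itself.
centroids, clusters = kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 10.0], [0.0, 5.0])
anomalies = z_score_anomalies([10, 11, 9, 10, 12, 10, 50])
```

With this made-up data, the values fall into a low group and a high group, and the reading of 50 is the only one flagged as unexpected.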
6. Understand how reinforcement learning differs from supervised and unsupervised techniques. Reinforcement learning is an approach to learning that uses agents interacting with an environment and adapting behavior based on rewards from the environment. This form of learning does not depend on labels. Reinforcement learning is modeled as an environment, a set of agents, a set of actions, and a set of probabilities of transitioning from one state to another after a particular action is taken. A reward is given after the transition from one state to another following an action.
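The state, action, transition, and reward loop described in point 6 can be sketched with tabular Q-learning, one common reinforcement learning algorithm that the guide does not name. Everything here is an assumption for illustration: a four-state corridor, deterministic transitions (every transition probability is 0 or 1), a reward of 1 for reaching the last state, and hand-picked learning parameters.

```python
import random

N_STATES = 4       # states 0..3; state 3 is the goal and ends the episode
ACTIONS = [0, 1]   # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: return (next_state, reward) for an action."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from rewards alone; no labels are used."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore by acting at random
            next_state, reward = step(state, action)
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy: in each non-terminal state, pick the higher-valued action.
policy = [max(ACTIONS, key=lambda a: q_table[s][a])
          for s in range(N_STATES - 1)]
```

After training, the greedy policy moves right in every non-terminal state, which is the shortest route to the reward in this toy corridor.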
7. Understand the structure of neural networks, particularly deep learning networks. Neural networks are systems roughly modeled after neurons in animal brains. They consist of artificial neurons, or nodes, linked together into a network; the links between neurons are called connections. A single neuron is limited in what it can learn. A multilayer network, however, is able to learn more functions. A multilayer neural network consists of an input layer, one or more hidden layers, and an output layer.
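A classic illustration of why layers matter is XOR: no single neuron of the kind below can compute it, but a network with one hidden layer can. This sketch uses hand-picked weights and a step activation purely for illustration; a real network would learn its weights during training.

```python
def step_activation(x):
    """Fire (1) if the weighted input is positive, otherwise stay off (0)."""
    return 1 if x > 0 else 0


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs, then activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return step_activation(total)


def xor_network(a, b):
    """Two inputs -> two hidden neurons -> one output neuron."""
    # Hidden layer (weights chosen by hand for this example).
    h_or = neuron([a, b], [1, 1], -0.5)    # fires if a OR b is on
    h_and = neuron([a, b], [1, 1], -1.5)   # fires only if a AND b are on
    # Output layer: on when h_or fires and h_and does not.
    return neuron([h_or, h_and], [1, -1], -0.5)


results = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
```

The hidden neurons each compute a simple function (OR, AND); the output neuron combines them into something neither could compute alone.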
8. Know machine learning terminology. This includes general machine learning terminology, such as baseline and batches; feature terminology, such as feature engineering and bucketing; training terminology, such as gradient descent and backpropagation; and neural network and deep learning terms, such as activation function and dropout. Finally, know model evaluation terminology such as precision and recall.
9. Know common sources of errors, including data-quality errors, unbalanced training sets, and bias. Poor-quality data leads to poor models. Some common data-quality problems are missing data, invalid values, inconsistent use of codes and categories, and data that is not representative of the population at large. Unbalanced datasets are ones that have significantly more instances of some categories than of others. There are several forms of bias, including automation bias, reporting bias, and group attribution bias.
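Precision and recall matter most on unbalanced datasets, where accuracy alone can mislead. The sketch below uses made-up labels (5 positives out of 100 instances) and two hypothetical sets of predictions to show the effect; the helper name `evaluate` is an assumption for illustration.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    accuracy = (tp + tn) / len(pairs)
    # Precision: of the instances flagged positive, how many really were.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the real positives, how many were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Unbalanced data: 5 positive instances, 95 negative.
y_true = [1] * 5 + [0] * 95

# A "model" that always predicts the majority class.
always_negative = [0] * 100

# A model that finds 4 of the 5 positives and raises 2 false alarms.
better_model = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93

naive_scores = evaluate(y_true, always_negative)
better_scores = evaluate(y_true, better_model)
```

The always-negative predictor reaches 95% accuracy while finding none of the positives (recall of 0), whereas the second model has a recall of 0.8 and a precision of about 0.67.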


