Fatskills
Practice. Master. Repeat.
Study Guide: All The Useful Machine Learning Interview Questions & Answers - Part 2
Source: https://www.fatskills.com/hesi/chapter/all-the-useful-machine-learning-interview-questions-answers-part-2

All The Useful Machine Learning Interview Questions & Answers - Part 2

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.


Q 78. How would you handle an imbalanced dataset?
Sampling techniques can help with an imbalanced dataset. There are two basic ways to resample: undersampling and oversampling.

In undersampling, we reduce the size of the majority class to match the minority class. This lowers storage and run-time costs, but it potentially discards useful information.

In oversampling, we upsample the minority class, which avoids the information loss, but duplicating minority examples increases the risk of overfitting.

There are other techniques as well:
Cluster-Based Over Sampling - The K-means clustering algorithm is applied independently to the minority and majority class instances to identify clusters in the dataset. Each cluster is then oversampled so that all clusters of the same class have an equal number of instances and all classes have the same size.

Synthetic Minority Over-sampling Technique (SMOTE) - Instead of duplicating minority examples, SMOTE takes a minority-class instance, picks one of its nearest minority-class neighbours, and creates a new synthetic instance on the line segment between them. The synthetic instances are added to the original dataset. Because it interpolates between feature values, this technique works well for numerical data.
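The two basic resampling strategies can be sketched with the standard library alone. This is a minimal illustration, not a replacement for a dedicated library; the function names random_oversample and random_undersample and the seed parameter are assumptions made for this example. It shows plain random resampling only, not cluster-based oversampling or SMOTE.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect the rows belonging to each class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: 8 majority rows (class 0) and 2 minority rows (class 1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
print(Counter(over_labels), Counter(under_labels))
```

Resampling should normally be applied to the training split only, so that the evaluation data keeps the original class balance.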

Q 79. Mention some of the EDA Techniques.
Exploratory Data Analysis (EDA) helps analysts to understand the data better and forms the foundation of better models.

Visualization

Univariate visualization
Bivariate visualization
Multivariate visualization

Missing Value Treatment - Replace missing values with the mean or the median (or the mode for categorical features)

Outlier Detection - Use a boxplot to see the distribution of outliers, then apply the IQR rule to set the boundaries beyond which a value counts as an outlier
Transformation - Based on the distribution, apply a transformation to the features
Scaling the Dataset - Apply min-max scaling or standardization (z-score scaling, as done by a standard scaler) to scale the data
Feature Engineering - Domain and SME (subject-matter expert) knowledge helps the analyst derive new fields that reveal more about the nature of the data
Dimensionality Reduction - Helps in reducing the number of features without losing much information
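As an illustration of the outlier and scaling steps, here is a minimal sketch using only the standard library. The helper names iqr_bounds, iqr_outliers and min_max_scale are hypothetical, and the multiplier k=1.5 is the conventional boxplot choice rather than something fixed by the text above.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if v < low or v > high]


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical measurements with one obvious outlier.
data = [10, 12, 11, 13, 12, 11, 100]
print(iqr_bounds(data))
print(iqr_outliers(data))
print(min_max_scale([2, 4, 6]))
```

Quartile conventions differ slightly between tools, so fences computed elsewhere may not match these exactly on small samples.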

Q 80. Mention why feature engineering is important in model building and list out some of the techniques used for feature engineering.
Algorithms need features with specific characteristics to work properly. Data usually starts in a raw form, and features must be extracted from it before it is supplied to the algorithm. This process is called feature engineering. With relevant features, a simpler algorithm is often sufficient, and even a non-ideal algorithm can produce reasonably accurate results.

Feature engineering primarily has two goals:

Prepare the suitable input data set to be compatible with the machine learning algorithm constraints.
Enhance the performance of machine learning models.
Some of the techniques used for feature engineering include Imputation, Binning, Outliers Handling, Log transform, grouping operations, One-Hot encoding, Feature split, Scaling, Extracting date.
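A few of these techniques can be sketched in plain Python. The function names (one_hot, log_transform, bin_value, date_features) and the sample inputs are made up for this illustration; in practice a library such as pandas or scikit-learn would normally do this work.

```python
import math
from datetime import date


def one_hot(values):
    """One-hot encode a list of categories; returns (encoded rows, category order)."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


def log_transform(values):
    """Apply log(1 + x), which compresses large values and is defined at 0."""
    return [math.log1p(v) for v in values]


def bin_value(value, edges):
    """Return the index of the bin that value falls into, given ascending edges."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def date_features(d):
    """Extract simple calendar features from a date (weekday: Monday is 0)."""
    return {"year": d.year, "month": d.month, "weekday": d.weekday()}


encoded, categories = one_hot(["red", "green", "red"])
print(categories, encoded)
print(log_transform([0, 9, 99]))
print(bin_value(25, [18, 40, 65]))     # ages: <18, 18-39, 40-64, 65+
print(date_features(date(2024, 1, 15)))
```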

Q 81. Differentiate between Statistical Modeling and Machine Learning.
Machine learning models are about making accurate predictions, such as the footfall in a restaurant or a stock price, whereas statistical models are designed for inference about the relationships between variables, such as what drives sales in a restaurant: the food or the ambience.

Q 82. Differentiate between Boosting and Bagging?
Bagging and Boosting are variants of Ensemble Techniques.

Bootstrap aggregation, or bagging, is a method used to reduce the variance of algorithms that have very high variance. Decision trees are a family of classifiers that are particularly susceptible to high variance.

Decision trees are very sensitive to the data they are trained on, so their results are often hard to generalize even after extensive fine-tuning. The results can vary greatly if the training data is changed.

Hence bagging trains multiple decision trees, each on a bootstrap sample (drawn with replacement) of the original data, and the final result is the average (or, for classification, the majority vote) of these individual models.
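The bagging idea can be sketched with a deliberately tiny base learner. The example below assumes a one-feature dataset and uses a decision stump (a single threshold) in place of a full decision tree; fit_stump, bagging_fit and bagging_predict are hypothetical names, and the toy data is invented for illustration.

```python
import random


def fit_stump(xs, ys):
    """Pick the threshold t that minimises errors for the rule 'predict 1 if x > t'."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds


def bagging_predict(thresholds, x):
    """Aggregate the stumps by majority vote."""
    votes = sum(x > t for t in thresholds)
    return 1 if votes * 2 > len(thresholds) else 0


# Hypothetical toy data: small values are class 0, large values are class 1.
xs = [1, 2, 3, 4, 10, 11, 12, 13]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

thresholds = bagging_fit(xs, ys)
print(bagging_predict(thresholds, 0), bagging_predict(thresholds, 20))
```

Each stump sees a slightly different sample, so individual thresholds vary; the vote smooths out that variation, which is the variance reduction bagging aims for.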

Boosting is the process of using a system of n weak classifiers for prediction, trained in sequence so that every weak classifier compensates for the weaknesses of its predecessors. By weak classifier, we mean a classifier that performs only slightly better than random guessing on a given data set.

Boosting is therefore not a single algorithm but a process. The weak classifiers used are generally logistic regression, shallow decision trees etc.

Many algorithms make use of the boosting process, but three are mainly used: AdaBoost, Gradient Boosting and XGBoost.

Q 83. What is the significance of Gamma and Regularization in SVM?
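Gamma defines how far the influence of a single training example reaches, with low values meaning 'far' and high values meaning 'close'. If gamma is too large, the radius of the area of influence of the support vectors only includes the support vector itself, and no amount of regularization with C will be able to prevent overfitting. If gamma is very small, the model is too constrained and cannot capture the complexity of the data.

The regularization parameter (usually written C; some formulations use lambda, which acts inversely to C) sets the degree of importance given to misclassifications. A large C penalises misclassified training points heavily and fits the training data more closely, while a small C tolerates more misclassifications in exchange for a wider margin. It is used to manage the trade-off with overfitting.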

Q 84. Explain how the ROC curve works.
The ROC curve is a plot of the true positive rate against the false positive rate at various classification thresholds. It is used to visualise the trade-off between true positives and false positives.
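To make the thresholding concrete, here is a small sketch that computes the points of a ROC curve by hand. The names roc_points and auc are assumptions for this example, labels are assumed to be 0 or 1 with at least one of each, and the scores are invented.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    A sample is predicted positive when its score is >= the threshold.
    Thresholds go from strictest to loosest, so the curve runs from (0, 0) to (1, 1).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical scores from a classifier that ranks both positives first.
labels = [1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.1]
points = roc_points(labels, scores)
print(points)
print(auc(points))
```

A classifier that ranks every positive above every negative reaches the top-left corner and has an area of 1.0, while random scoring tends towards the diagonal and an area of about 0.5.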

Q 85. What is the difference between a generative and discriminative model?
A generative model learns how the data in each category is distributed (the joint distribution of inputs and labels), so it can in principle generate new examples. A discriminative model learns only the distinctions (decision boundaries) between the categories. Discriminative models often perform better than generative models on classification tasks, particularly when plenty of labelled data is available.

Q 86. What are hyperparameters and how are they different from parameters?
A parameter is a variable that is internal to the model and whose value is estimated from the training data. They are often saved as part of the learned model. Examples include weights, biases etc.

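A hyperparameter is a variable that is external to the model and whose value is not learned from the training data. It is set before training (often tuned by searching over candidate values on validation data) and controls how the model parameters are estimated. The best choice of hyperparameters depends on the problem and the implementation. Examples include the learning rate, the number of hidden layers etc.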

Q 87. What is shattering a set of points? Explain VC dimension.
In order to shatter a given configuration of points, a classifier must be able to, for all possible assignments of positive and negative for the points, perfectly partition the plane such that positive points are separated from negative points. For a configuration of n points, there are 2^n possible assignments of positive or negative.

The VC dimension measures the capacity of a classifier, which helps when choosing one for a given type of data. It is defined as the cardinality of the largest set of points that the classifier can shatter. To have a VC dimension of at least n, it is enough for the classifier to shatter one configuration of n points; it need not shatter every configuration.
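Shattering can be checked by brute force for very simple classifier families. The sketch below works on points on a line rather than in the plane, which is an assumption made to keep it short: it compares one-sided thresholds ("x > t is positive") with closed intervals ("a <= x <= b is positive"). All function names are hypothetical.

```python
from itertools import product


def threshold_labelings(points):
    """All labelings of 1-D points achievable by rules of the form 'x > t is positive'."""
    candidates = [min(points) - 1] + sorted(points)
    return {tuple(1 if x > t else 0 for x in points) for t in candidates}


def interval_labelings(points):
    """All labelings achievable by rules of the form 'a <= x <= b is positive'."""
    labelings = {tuple(0 for _ in points)}  # the empty interval labels everything negative
    for a in points:
        for b in points:
            if a <= b:
                labelings.add(tuple(1 if a <= x <= b else 0 for x in points))
    return labelings


def shatters(labelings, n):
    """True if every one of the 2^n assignments of n points is achievable."""
    return all(assignment in labelings for assignment in product((0, 1), repeat=n))


print(shatters(threshold_labelings([1]), 1))        # one point: both labels are possible
print(shatters(threshold_labelings([1, 2]), 2))     # (1, 0) is impossible for thresholds
print(shatters(interval_labelings([1, 2]), 2))      # intervals handle all 4 assignments
print(shatters(interval_labelings([1, 2, 3]), 3))   # (1, 0, 1) is impossible for intervals
```

Under these definitions one-sided thresholds have VC dimension 1 and intervals have VC dimension 2, since no set of 2 (respectively 3) distinct points on a line can be shattered.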

Q 88. What are some differences between a linked list and an array?
Arrays and linked lists are both used to store linear data of similar types. However, there are a few differences between them.

Array Vs Linked List
Elements are indexed, so any element can be accessed directly in constant time - Elements must be accessed sequentially, starting from the head
Insertion and deletion are slower in an array because the following elements must be shifted - Insertion and deletion are faster in a linked list once the position is known, since only links are updated; reaching the position takes linear time
A (static) array has a fixed size - Linked lists are dynamic and flexible
Memory for a static array is assigned at compile time - Memory is allocated during execution (runtime) in a linked list
Elements are stored consecutively in arrays - Elements can be stored anywhere in memory in a linked list, connected by pointers
A fixed-size array can waste memory when it is only partly filled - A linked list allocates only the nodes it needs, although each node carries the overhead of a pointer
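The access and insertion trade-off is easier to see with a small singly linked list. This is a minimal sketch; the Node and LinkedList classes and their method names are invented for illustration.

```python
class Node:
    """One element of a singly linked list: a value plus a link to the next node."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        """Insert at the head in constant time: only one link changes."""
        self.head = Node(value, self.head)

    def get(self, index):
        """Access by position takes linear time: we must walk from the head."""
        node = self.head
        for _ in range(index):
            if node is None:
                break
            node = node.next
        if node is None or index < 0:
            raise IndexError("linked list index out of range")
        return node.value

    def to_list(self):
        """Copy the values into a Python list, in order."""
        values, node = [], self.head
        while node is not None:
            values.append(node.value)
            node = node.next
        return values


linked = LinkedList()
for value in (3, 2, 1):
    linked.push_front(value)

print(linked.to_list())
print(linked.get(1))
```

A Python list, by contrast, gives constant-time access by index, but inserting at the front with list.insert(0, value) has to shift every existing element.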

Q 89. What is the meshgrid() method and the contourf() method? State some uses of both.
The meshgrid() function in numpy takes as input the range of x-values and the range of y-values of the grid and returns coordinate matrices. The grid must be built before calling the contourf() function in matplotlib, which takes several inputs: x-values, y-values, the values to contour (the height of the surface at each grid point), contour levels, colours etc.

meshgrid() is used to create a rectangular grid from 1-D arrays of x-axis and y-axis inputs, using either Cartesian or matrix indexing. contourf() is used to draw filled contours from the given x-axis inputs, y-axis inputs, surface values, colours etc. Together they are commonly used to plot, for example, the decision regions of a classifier.
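A short sketch of how the two functions fit together, assuming numpy is installed. The surface z = x^2 + y^2 and the helper name plot_filled_contours are made up for this example. The plotting helper imports matplotlib lazily and is not called here, so the grid part runs even without matplotlib.

```python
import numpy as np

# 1-D coordinate arrays for each axis.
x = np.linspace(-1.0, 1.0, 3)   # three x-values: -1, 0, 1
y = np.linspace(0.0, 1.0, 2)    # two y-values: 0, 1

# meshgrid expands them into 2-D coordinate matrices.
# With the default Cartesian indexing, both have shape (len(y), len(x)).
xx, yy = np.meshgrid(x, y)

# Evaluate a function at every grid point in one vectorised expression.
zz = xx ** 2 + yy ** 2


def plot_filled_contours(xx, yy, zz):
    """Draw filled contours of zz over the grid (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so the grid code has no plotting dependency

    fig, ax = plt.subplots()
    filled = ax.contourf(xx, yy, zz, levels=10, cmap="viridis")
    fig.colorbar(filled)
    return fig


print(xx.shape, yy.shape)
print(zz)
```

For a classifier, zz would hold the predicted class or score at each grid point instead of a formula.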

Q 90. Describe a hash table.
Hashing is a technique for identifying a specific object within a group of similar objects. A hash function converts large keys into small keys (hash values). The values are then stored in a data structure called a hash table, using the hash value as the index, which makes lookup fast on average.
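The idea can be sketched as a small hash table that resolves collisions by chaining. The class name HashTable and its methods are hypothetical; Python's built-in dict is the production-quality equivalent and is not implemented this way.

```python
class HashTable:
    """A minimal hash table: a fixed list of buckets, each holding (key, value) pairs."""

    def __init__(self, n_buckets=8):
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # The hash function maps an arbitrary key to a small bucket index.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))        # new key (or a collision): chain it

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default


table = HashTable()
table.put("apple", 1)
table.put("banana", 2)
table.put("apple", 3)

print(table.get("apple"), table.get("banana"), table.get("cherry"))
```

Keys that land in the same bucket are kept in a short list, so lookups stay fast as long as the buckets remain small relative to the number of keys.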

Q 91. List the advantages and disadvantages of using neural networks.

Advantages:
Information is stored across the entire network rather than in a database. A neural network can still produce reasonably accurate output with incomplete information. It also has parallel processing ability and distributed memory.

Disadvantages:
Neural networks require processors capable of parallel processing. Their behaviour is hard to explain, which reduces trust in the network, for instance when we need to understand why it produced a particular output. The training duration is mostly unknown in advance. We only know that training has finished by looking at the error value, and reaching a low error does not guarantee optimal results.

Q 92. You have to train a 12GB dataset using a neural network with a machine which has only 3GB RAM. How would you go about it?
We can use NumPy memory-mapped arrays to solve this issue. A memory-mapped array (numpy.memmap) maps the complete dataset on disk without loading it completely into memory. We can then index into the array to divide the data into batches, read only the batch required, and pass it to the neural network. Be careful to keep the batch size small enough to fit comfortably in the available RAM.
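A minimal sketch of batch-wise reading with numpy.memmap, assuming the dataset is stored as a raw binary file of float32 values with a known shape. The file name, shape, batch size and the function name iter_batches are placeholders for this example; a tiny temporary file stands in for the 12GB dataset.

```python
import os
import tempfile

import numpy as np


def iter_batches(path, n_rows, n_cols, batch_size, dtype=np.float32):
    """Yield consecutive batches from a raw binary file without loading it all."""
    data = np.memmap(path, dtype=dtype, mode="r", shape=(n_rows, n_cols))
    for start in range(0, n_rows, batch_size):
        # np.array(...) copies just this slice from disk into RAM.
        yield np.array(data[start:start + batch_size])


# Create a small stand-in dataset on disk (10 rows, 4 features).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "features.dat")
writer = np.memmap(path, dtype=np.float32, mode="w+", shape=(10, 4))
writer[:] = np.arange(40, dtype=np.float32).reshape(10, 4)
writer.flush()
del writer

batch_shapes = []
for batch in iter_batches(path, n_rows=10, n_cols=4, batch_size=4):
    # In a real training loop the batch would be passed to the network here.
    batch_shapes.append(batch.shape)

print(batch_shapes)
```

Only one batch is held in memory at a time, so peak memory use is governed by batch_size rather than by the size of the file.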

Q 93. Write a simple code to binarize data.
Binarizing data means converting it into binary values on the basis of a threshold. Values at or below the threshold are set to 0 and values above it are set to 1, which is useful for feature engineering.

Code:

from sklearn.preprocessing import Binarizer
import pandas
import numpy

# url must be set to the path or URL of a headerless CSV file with numeric columns;
# it is not given in the source.
names_list = ['Alaska', 'Pramod', 'Pierce', 'Sandra', 'Soundarya', 'Meredith', 'Richard', 'Jackson', 'Tom', 'Joe']
data_frame = pandas.read_csv(url, names=names_list)
array = data_frame.values
# Splitting the array into input and output
A = array[:, 0:7]
B = array[:, 7]
# Values greater than the threshold become 1, all others become 0
binarizer = Binarizer(threshold=0.0).fit(A)
binaryA = binarizer.transform(A)
numpy.set_printoptions(precision=5)
print(binaryA[0:7, :])
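Because the listing above depends on an external CSV file, a dependency-free sketch of the same thresholding rule may be easier to try out. The function name binarize and the sample rows are invented; the rule (strictly greater than the threshold maps to 1) mirrors the behaviour of Binarizer.

```python
def binarize(rows, threshold=0.0):
    """Map each value to 1 if it is strictly greater than threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in rows]


# Hypothetical feature rows.
rows = [[-1.5, 0.0, 2.3],
        [4.0, -0.1, 0.5]]

print(binarize(rows))
print(binarize(rows, threshold=1.0))
```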

Q 94. What is an Array?
An array is defined as a collection of similar items, stored in a contiguous manner. Arrays are an intuitive concept, as the need to group similar objects together arises in our day-to-day lives, and arrays satisfy the same need. How are they stored in memory? An array occupies a block of memory in which each element consumes one unit. The size of the unit depends on the data type being used. For example, if the elements of the array are of type int, then typically 4 bytes are used to store each element. For the character data type, 1 byte is used. This is implementation specific, and the above units may change from computer to computer.

Example:

fruits = ['apple', 'banana', 'pineapple']

In the above case, fruits is a list that comprises three fruits. To access them individually, we use their indexes. Python and C are 0-indexed languages, that is, the first index is 0. MATLAB, on the contrary, starts from 1 and is thus a 1-indexed language.

Q 95. What are the advantages and disadvantages of using an Array?
Advantages:
Random access is enabled
Saves memory
Cache friendly
Predictable compile timing
Helps in re-usability of code
Disadvantages:
Addition and deletion of records are time consuming, even though we get the element of interest immediately through random access. This is because the elements need to be shifted after an insertion or deletion.
If contiguous blocks of memory are not available, there is an overhead in searching for a suitable contiguous location that meets the requirement.
Now that we know what arrays are, we shall understand them in detail by solving some interview questions. Before that, let us see the functions that Python provides for lists, its built-in dynamic array type.

append() - adds an element at the end of the list
copy() - returns a copy of the list
reverse() - reverses the elements of the list
sort() - sorts the elements in ascending order by default

Q 96. What are Lists in Python?
Lists are an effective data structure provided in Python, with many built-in operations. Consider the scenario where we want to copy a list to another list. In the C programming language, we would have to write our own function to do this.

In contrast, Python provides a method called copy. We can copy a list to another just by calling the copy method.

new_list = old_list.copy()
We need to be careful while using this method. copy() makes a shallow copy: Python creates a new list object, but for everything inside the old list only the reference is copied. Essentially, the new list consists of references to the same elements as the old list.

Hence, changing a mutable element (such as a nested list) through the original list also changes what the new list shows. This can be dangerous in many applications. Therefore, Python provides another function called deepcopy. Intuitively, we might expect deepcopy() to follow the same paradigm, with the only difference being that it is called recursively on each element. In practice, it does more than that.

deepcopy() preserves the graph structure (the shared references) of the original compound data. Let us understand this better with the help of an example:

from copy import deepcopy
a = [1, 2]
b = [a, a]  # there's only 1 object a
c = deepcopy(b)

# check the result by executing these lines
c[0] is a     # returns False, a new object a' is created
c[0] is c[1]  # returns True, c is [a', a'] not [a', a'']
This is the tricky part: during deepcopy(), a hash table (implemented as a dictionary in Python) is used to map each old object reference onto its new object reference.

This prevents unnecessary duplicates and thus preserves the structure of the copied compound data. In this case, c[0] is not the same object as a, since their addresses differ internally, although the two lists are equal in value.

Normal copy
>>> a = [[1, 2, 3], [4, 5, 6]]
>>> b = list(a)
>>> a
[[1, 2, 3], [4, 5, 6]]
>>> b
[[1, 2, 3], [4, 5, 6]]
>>> a[0][1] = 10
>>> a
[[1, 10, 3], [4, 5, 6]]
>>> b # b changes too -> Not a deepcopy.
[[1, 10, 3], [4, 5, 6]]

Deep copy

>>> import copy
>>> b = copy.deepcopy(a)
>>> a
[[1, 10, 3], [4, 5, 6]]
>>> b
[[1, 10, 3], [4, 5, 6]]
>>> a[0][1] = 9
>>> a
[[1, 9, 3], [4, 5, 6]]
>>> b # b doesn't change -> Deep Copy
[[1, 10, 3], [4, 5, 6]]

Q 97. Given an array of integers where each element represents the maximum number of steps that can be made forward from that element, find the minimum number of jumps needed to reach the end of the array (starting from the first element). If an element is 0, we cannot move forward from that element.
Solution: This is commonly known as the end-of-array (minimum jumps) problem. We want to determine the minimum number of jumps required to reach the end. Each element in the array gives the maximum number of steps that can be taken forward from that position.

Let us understand how to approach the problem initially.

We need to reach the end. Therefore, let us have a count that tells us how near we are to the end. Consider the array A=[1,2,3,1,1]

In the above example we can go from
1 -> 2 -> 3 -> 1 -> 1 - 4 jumps
1 -> 2 -> 1 -> 1 - 3 jumps
1 -> 2 -> 3 -> 1 - 3 jumps
Hence, we have a fair idea of the problem. Let us come up with a logic for the same.

Let us start from the end and move backwards, as that is more intuitive. We will use the variables right and prev_r (previous right) to keep track of the jumps.

Initially, right = prev_r = the last index. For each index j to the left of prev_r, we check whether j + arr[j] >= prev_r, that is, whether prev_r can be reached from j in a single jump. We keep the leftmost such j as the new target, count one jump, and repeat until we reach index 0. If no index can reach the current target, the end is unreachable and we return -1. Try it out using pen and paper first. The logic is straightforward to implement. Then implement it on your own and verify your result.

def min_jmp(arr):

    n = len(arr)
    right = prev_r = n - 1
    count = 0

    # We start from the rightmost index and traverse the array to find the
    # leftmost index from which we can reach index 'prev_r'
    while True:
        for j in range(prev_r - 1, -1, -1):
            if j + arr[j] >= prev_r:
                right = j

        if prev_r != right:
            prev_r = right
        else:
            break

        count += 1

    # If we could not work our way back to index 0, the end is unreachable
    return count if right == 0 else -1

# Enter the elements separated by a space
arr = list(map(int, input().split()))
print(min_jmp(arr))

Q 98. Given a string S consisting only 'a's and 'b's, print the last index of the 'b' present in it.
When we are given a string of a's and b's, we can easily find the first location at which a character occurs. Therefore, to find the last occurrence of a character, we reverse the string and find the first occurrence, which corresponds to the last occurrence in the original string.

Here, the input is a string. We begin by splitting it into a list of characters using the helper function split. We then reverse the list, find the position of the first 'b', and compute the original index as len - position - 1, where position is the index in the reversed list.

def split(word):
    return [char for char in word]

a = input()
a = split(a)
a_rev = a[::-1]  # reversed copy of the list
pos = -1
for i in range(len(a_rev)):
    if a_rev[i] == 'b':
        # convert the index in the reversed list to an index in the original
        pos = len(a_rev) - i - 1
        print(pos)
        break
    else:
        continue
if pos == -1:
    print(-1)
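For comparison, Python strings already provide a method that searches from the right, so the same result can be sketched in one line. The function name last_index_of_b is made up for this example; str.rfind returns -1 when the character is absent, which matches the behaviour of the loop above.

```python
def last_index_of_b(s):
    """Return the last index of 'b' in s, or -1 if 'b' does not occur."""
    return s.rfind("b")


print(last_index_of_b("abab"))
print(last_index_of_b("aaa"))
```

In an interview the manual version is still worth knowing, since the point of the question is usually the index arithmetic.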

Q 99. Rotate the elements of an array by d positions to the left. Let us initially look at an example.
A = [1,2,3,4,5]
A <<2
[3,4,5,1,2]
A<<3
[4,5,1,2,3]
There is a pattern here: the first d elements move to the end, behind the last n-d elements. We could build the result by copying the two parts into a new array, but that needs extra memory proportional to the size of the array, which can be a concern for very large arrays. Instead, we rotate the elements in place, one position at a time, which needs only one temporary variable (at the cost of more element moves).

# Rotate all the elements left by 1 position (in place)
def rot_left_once(arr):
    n = len(arr)
    if n == 0:
        return
    tmp = arr[0]
    for i in range(n - 1):  # i runs over [0, n-2]
        arr[i] = arr[i + 1]
    arr[n - 1] = tmp

# Use the above function to repeat the process d times.
def rot_left(arr, d):
    for i in range(d):
        rot_left_once(arr)

arr = list(map(int, input().split()))
rot = int(input())
rot_left(arr, rot)

for i in range(len(arr)):
    print(arr[i], end=' ')
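When extra memory is not a concern, the rotation can also be sketched with slicing. This is a different technique from the in-place version above: it builds a new list and leaves the input unchanged. The function name rotated_left is hypothetical.

```python
def rotated_left(arr, d):
    """Return a new list equal to arr rotated left by d positions."""
    if not arr:
        return []
    d %= len(arr)            # rotating by len(arr) is a no-op
    return arr[d:] + arr[:d]


A = [1, 2, 3, 4, 5]
print(rotated_left(A, 2))
print(rotated_left(A, 3))
```

Slicing moves each element once, whereas the one-step-at-a-time version moves every element d times, so this trades memory for speed.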

Q 100. Water Trapping Problem:
Given an array arr[] of N non-negative integers which represent the heights of blocks at each index i, where the width of each block is 1, compute how much water can be trapped in between the blocks after raining.

# Structure is like below:

# | |

# |_|

# answer is we can trap two units of water.

Solution: We are given an array where each element denotes the height of a block. One unit of height holds one unit of water, provided there are taller blocks on both sides to contain it. The water above a block is therefore limited by the shorter of the tallest blocks to its left and right. We need to take care of the possible cases:

There should be no overlap of water saved
Water should not overflow
Therefore, let us start with the extreme elements and move towards the centre.

n = int(input())
arr = [int(i) for i in input().split()]
left, right = [arr[0]], [0] * n
# left = [arr[0]]
# right = [0, 0, 0, ..., 0] (n terms)
right[n-1] = arr[-1]  # rightmost element
# We use two arrays: left[i] holds the tallest block in arr[0..i] and
# right[i] holds the tallest block in arr[i..n-1].

for elem in arr[1:]:
    left.append(max(left[-1], elem))
for i in range(len(arr) - 2, -1, -1):
    right[i] = max(arr[i], right[i + 1])
water = 0
# Once we have left and right, the water above block i is limited by the
# shorter of the tallest blocks on either side of it.

for i in range(1, n - 1):
    add_water = min(left[i - 1], right[i]) - arr[i]
    if add_water > 0:
        water += add_water
print(water)
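The idea of starting at the extremes and moving towards the centre can also be sketched with two pointers, which avoids the two helper arrays. This is an alternative formulation, not the listing above; the function name trapped_water is made up for the example.

```python
def trapped_water(heights):
    """Return the units of water trapped between blocks of the given heights."""
    left, right = 0, len(heights) - 1
    left_max = right_max = 0
    water = 0
    while left < right:
        if heights[left] < heights[right]:
            # The right side is at least as tall, so left_max limits the water here.
            left_max = max(left_max, heights[left])
            water += left_max - heights[left]
            left += 1
        else:
            # The left side is at least as tall, so right_max limits the water here.
            right_max = max(right_max, heights[right])
            water += right_max - heights[right]
            right -= 1
    return water


print(trapped_water([2, 0, 2]))
print(trapped_water([0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1]))
```

The pointer on the shorter side always advances, because the water level there is already determined by the tallest block seen on that side.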

Also see:
All The Useful Machine Learning Interview Questions & Answers - Part 1

All The Useful Machine Learning Interview Questions & Answers - Part 3


Without work one finishes nothing. - Ralph Waldo Emerson
© 2026 Fatskills.com

All trademarks, logos and brand names are the property of their respective owners. All company, product and service names used in this website are for identification purposes only. Use of these names, trademarks and brands does not imply endorsement.
