# Machine learning MCQs with answers [21-40]

1. The square root of the variance is called the ________ deviation

A. Empirical

B. Mean

C. Continuous

D. Standard

Answer is D) Standard
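
The relationship in question 1 can be checked numerically. The following is a minimal sketch using only the Python standard library; the sample values in `data` are made up for illustration, and the population (not sample) variance is assumed.

```python
import math
import statistics

# Hypothetical sample values, chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population variance: mean of squared deviations from the mean.
variance = statistics.pvariance(data)

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(variance, std_dev)
```

`statistics.pstdev(data)` computes the same standard deviation directly.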

2. For continuous random variables, the CDF is the derivative of the PDF.

A. True

B. False

Answer is B) False
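
The statement in question 2 has the relationship reversed: the PDF is the derivative of the CDF. The sketch below illustrates this for the standard normal distribution, which is an assumed example; the helper names `normal_cdf`, `normal_pdf`, and `numerical_derivative` are hypothetical.

```python
import math

def normal_cdf(x):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    """PDF of the standard normal distribution."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def numerical_derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Differentiating the CDF numerically should recover the PDF.
for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, numerical_derivative(normal_cdf, x), normal_pdf(x))
```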

3. Cumulative distribution functions are used to specify the distribution of multivariate random variables.

A. True

B. False

Answer is A) True

4. Consider the results of a medical experiment that aims to predict whether someone is going to develop myopia based on some physical measurements and heredity. In this case, the input dataset consists of the person’s medical characteristics and the target variable is binary: 1 for those who are likely to develop myopia and 0 for those who aren’t. This can be best classified as

A. Regression

B. Decision Tree

C. Clustering

D. Association rule

Answer is B) Decision Tree

5. The purpose of a machine learning model is to approximate an unknown function that associates input elements to output ones.

A. True

B. False

Answer is A) True

6. A training set is normally a representation of a global distribution.

A. True

B. False

Answer is A) True

7. When working in a supervised scenario, we define a non-negative error measure e_m which takes two arguments and allows us to compute a total error value over the whole dataset. Those two arguments are

A. Expected and predicted output

B. Calculated and predicted output

C. Calculated and measured output

D. None of the above

Answer is A) Expected and predicted output
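
As an illustration of question 7, the sketch below sums a non-negative error measure over a whole dataset. The squared error is an assumed choice of measure, and the function names and sample values are hypothetical.

```python
def squared_error(expected, predicted):
    """A non-negative error measure taking the expected and predicted output."""
    return (expected - predicted) ** 2

def total_error(error_measure, expected_outputs, predicted_outputs):
    """Sum the per-sample error over the whole dataset."""
    return sum(
        error_measure(y, y_hat)
        for y, y_hat in zip(expected_outputs, predicted_outputs)
    )

# Hypothetical targets and model predictions.
expected_outputs = [1.0, 0.0, 1.0]
predicted_outputs = [0.5, 0.0, 1.0]

loss = total_error(squared_error, expected_outputs, predicted_outputs)
print(loss)
```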

8. An initial value represents a starting point over the surface of an n-variable function. A generic training algorithm has to find the global minimum or a point quite close to it (there's always a tolerance to avoid an excessive number of iterations and a consequent risk of overfitting). The error measure being minimized is also called the

A. Loss function

B. Predicted output

C. Measured output

D. Mean square error

Answer is A) Loss function

9. In 1984, the computer scientist L. Valiant proposed a mathematical approach to determine whether a problem is learnable by a computer. The name of this technique is

A. Max likelihood

B. Zero one loss error

C. Probably approximately correct

D. All of the above

Answer is C) Probably approximately correct

10. A concept is a subset of input patterns X which determine the same output element.

A. True

B. False

Answer is A) True

11. An exponential running time could lead to computational explosions when the datasets are too large or the optimization starting point is very far from an acceptable minimum. Moreover, it's important to remember the so-called

A. Curse of dimensionality

B. Hughes phenomenon

C. Probably approximately correct

Answer is A) Curse of dimensionality

12. In many cases, in order to capture the full expressivity of a model, it's necessary to have a very large dataset; without enough training data, the approximation can become problematic. This is called the

A. Curse of dimensionality

B. Hughes phenomenon

C. Probably approximately correct

Answer is B) Hughes phenomenon

13. Which of the following steps or assumptions in regression modeling impacts the trade-off between underfitting and overfitting the most?

A. The polynomial degree

B. Whether we learn the weights by matrix inversion or gradient descent

C. The use of a constant term

Answer is A) The polynomial degree

14. Suppose a linear regression model perfectly fits the training data (the training error is zero). Which of the following statements is true?

A. You will always have test error zero

B. You cannot have test error zero

C. None of the above

Answer is C) None of the above

15. Which one of the following statements is true regarding residuals in regression analysis?

A. Mean of residuals is always zero

B. Mean of residuals is always less than zero

C. Mean of residuals is always greater than zero

D. There is no such rule for residuals.

Answer is A) Mean of residuals is always zero
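
The answer to question 15 can be illustrated with a small least-squares fit. This sketch assumes ordinary least squares with an intercept term, which is the setting in which the residuals average to zero; the data points and the helper name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical noisy observations of a roughly linear relationship.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)

# Residual = observed value minus fitted value.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / len(residuals)

print(mean_residual)  # zero up to floating-point rounding
```

If the intercept is dropped from the model, this property is no longer guaranteed.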

16. Which of the following is true about heteroskedasticity?

A. Linear Regression with varying error terms

B. Linear Regression with constant error terms

C. Linear Regression with zero error terms

D. None of the above

Answer is A) Linear regression with varying error terms

17. To test for a linear relationship between continuous variables y (dependent) and x (independent), which of the following plots is best suited?

A. Scatter plot

B. Bar chart

C. Histograms

D. None of these

Answer is A) Scatter plot

18. Which of the following is true about the “Ridge” and “Lasso” regression methods with respect to feature selection?

A. Ridge regression uses subset selection of features

B. Lasso regression uses subset selection of features

C. Both use subset selection of features

D. All of the above

Answer is B) Lasso regression uses subset selection of features

19. If two variables are correlated, is it necessary that they have a linear relationship?

A. Yes

B. No

Answer is B) No

20. Correlated variables can have a zero correlation coefficient. True or False?

A. True

B. False

Answer is A) True
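
Questions 19 and 20 can be illustrated with a nonlinear relationship whose linear (Pearson) correlation coefficient is zero. The sketch below assumes the Pearson coefficient is the one meant; the data (y = x² on points symmetric around zero) and the helper name `pearson_correlation` are hypothetical.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# y is fully determined by x, but the relationship is not linear.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

r = pearson_correlation(xs, ys)
print(r)
```

Here y depends entirely on x, yet the coefficient vanishes because the positive and negative halves of the parabola cancel out.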