Maximum Likelihood Estimation (MLE) and Maximum a Posteriori (MAP)

Consider the theory and the notation provided in the MLE/MAP section (https://devangelista2.github.io/statistical-mathematical-methods/regression_classification/MLE_MAP.html). Let $f_\theta(x)$ be a polynomial regression model as in the previous Homework, and let the poly_regression_small.csv from Virtuale be the training set. Then, sample 20% of the data in the poly_regression_large.csv dataset to use as the test set.
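
The 20% sampling step could be sketched as follows. This is only an illustration: the helper name `sample_test_set`, the `seed` argument, and the column layout of the CSV file in the commented usage are assumptions, not part of the assignment.

```python
import numpy as np


def sample_test_set(x, y, frac=0.2, seed=0):
    """Return a random subset holding `frac` of the pairs (x, y), drawn without replacement."""
    x, y = np.asarray(x), np.asarray(y)
    rng = np.random.default_rng(seed)
    n_test = int(round(frac * len(x)))
    idx = rng.choice(len(x), size=n_test, replace=False)
    return x[idx], y[idx]


# Hypothetical usage; the column layout of the CSV file is an assumption:
# data = np.genfromtxt("poly_regression_large.csv", delimiter=",", skip_header=1)
# x_test, y_test = sample_test_set(data[:, 0], data[:, 1])
```

Fixing the seed keeps the test set identical across runs, so the errors of the different models are computed on the same points.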

  • For a given value of $K$, write three Python functions computing $\theta_{MLE}$, i.e. the optimal parameters obtained by optimizing the MLE-related loss function with Gaussian assumption on the likelihood $p_\theta(y \mid x)$, by Gradient Descent, Stochastic Gradient Descent (with batch_size = 5), and the Normal Equations method with Cholesky Decomposition.

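  • Compare the performance of the three regression models computed above. In particular, if $(X_{\text{test}}, Y_{\text{test}})$ is the test set from the poly_regression_large.csv dataset, for each of the models, compute $\mathrm{Err} = \frac{1}{N_{\text{test}}} \sum_{i=1}^{N_{\text{test}}} \left( f_\theta(x^i) - y^i \right)^2$, where $N_{\text{test}}$ is the number of elements in the test set and $(x^i, y^i)$ are the input and output elements in the test set. Comment on the performance of the three models.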

  • For different values of $K$, plot the training datapoints and the test datapoints with different colors and visualize (as a continuous line) the three learned regression models $f_\theta(x)$. Comment on the results.

  • For increasing values of $K$, compute the training and test error as discussed above. Plot the two errors with respect to $K$. Comment on the results.

  • Repeat the same experiments by considering the MAP formulation with Gaussian assumption on the prior term $p(\theta)$. Fix $K$ and test different values of $\lambda$ in the experiments. Comment on the results, comparing:

      • the three optimization methods used to obtain $\theta_{MAP}$ (i.e. GD, SGD and Normal Equations),
      • the different values of $\lambda$ tested,
      • the results obtained by $\theta_{MLE}$ vs $\theta_{MAP}$.
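
The three solvers and the error measure requested above could be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution: $K$ is taken to be the number of polynomial coefficients (degree $K-1$), the feature matrix is stored with shape (N, K), and all function names, step sizes, and iteration counts are hypothetical choices. Setting `lam=0.0` gives the least-squares problem of the Gaussian MLE, while `lam > 0` adds the penalty that a Gaussian prior produces in the MAP formulation.

```python
import numpy as np


def vandermonde(x, K):
    """Feature matrix of shape (N, K) with columns x^0, ..., x^(K-1) (assumed convention)."""
    return np.vander(np.asarray(x, dtype=float), K, increasing=True)


def fit_normal_equations(x, y, K, lam=0.0):
    """Solve (Phi^T Phi + lam * I) theta = Phi^T y with a Cholesky factorization."""
    Phi = vandermonde(x, K)
    A = Phi.T @ Phi + lam * np.eye(K)
    b = Phi.T @ y
    L = np.linalg.cholesky(A)        # A = L @ L.T, with L lower triangular
    z = np.linalg.solve(L, b)        # solve L z = b
    return np.linalg.solve(L.T, z)   # solve L^T theta = z


def fit_gd(x, y, K, lam=0.0, lr=0.1, n_iter=5000):
    """Full-batch gradient descent on the objective scaled by 1/N."""
    Phi = vandermonde(x, K)
    N = len(y)
    theta = np.zeros(K)
    for _ in range(n_iter):
        grad = (Phi.T @ (Phi @ theta - y) + lam * theta) / N
        theta = theta - lr * grad
    return theta


def fit_sgd(x, y, K, lam=0.0, lr=0.1, n_epochs=500, batch_size=5, seed=0):
    """Mini-batch SGD: one pass over a reshuffled training set per epoch."""
    Phi = vandermonde(x, K)
    y = np.asarray(y, dtype=float)
    N = len(y)
    rng = np.random.default_rng(seed)
    theta = np.zeros(K)
    for _ in range(n_epochs):
        order = rng.permutation(N)
        for start in range(0, N, batch_size):
            idx = order[start:start + batch_size]
            residual = Phi[idx] @ theta - y[idx]
            grad = Phi[idx].T @ residual / len(idx) + lam * theta / N
            theta = theta - lr * grad
    return theta


def mse(theta, x, y):
    """Average squared prediction error of f_theta on the pairs (x, y)."""
    return float(np.mean((vandermonde(x, len(theta)) @ theta - y) ** 2))
```

All three functions target the same minimizer for a given `lam`, so differences in the resulting errors reflect the optimizer rather than the model. With `lam=0.0` and large $K$, the matrix `Phi.T @ Phi` can become numerically non-positive-definite, in which case `np.linalg.cholesky` raises `LinAlgError`; `scipy.linalg.solve_triangular` could replace the two generic solves to exploit the triangular structure.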

Other assignments: