Lecture 4#

Topic#

In this lecture we look at the basics of machine learning and work with sklearn for the first time.

Lecture Slides#

Download the slides

Exercises#

Suggested Homework#

  • Re-do or finish the exercises from the class

  • Do hyperparameter tuning for logistic regression, random forests and boosting

  • Experiment with other datasets
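As a starting point for the tuning exercise, here is a minimal sketch using `GridSearchCV`. The dataset (`load_breast_cancer`), the parameter grids and the `candidates` and `results` names are assumptions chosen for illustration; they are not the setup from the class, so swap in the data and grids you actually want to explore.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a small built-in dataset stands in for the data used in class.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Each entry pairs a model with a deliberately small, illustrative grid.
candidates = {
    "logistic regression": (
        # Scaling inside the pipeline keeps the test folds out of the scaler fit.
        Pipeline([
            ("scale", StandardScaler()),
            ("clf", LogisticRegression(max_iter=1000)),
        ]),
        {"clf__C": [0.01, 0.1, 1.0, 10.0]},
    ),
    "random forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, None]},
    ),
    "gradient boosting": (
        GradientBoostingClassifier(random_state=0),
        {"learning_rate": [0.05, 0.1], "n_estimators": [50, 100]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # 5-fold cross-validation on the training data only.
    search = GridSearchCV(model, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_score": search.best_score_,
        # The held-out test set is touched once, after tuning.
        "test_score": search.score(X_test, y_test),
    }
    print(name, results[name])
```

For larger grids, `RandomizedSearchCV` from the same module is often a cheaper alternative to an exhaustive search.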

Additional materials#

Introduction to Machine Learning with Python#

A slightly outdated but very accessible introduction to machine learning is the book Introduction to Machine Learning with Python by Andreas Müller and Sarah Guido.

In the lecture we only had time to cover supervised learning (and there only classification). The book would be a good source for anyone who wants to read up on unsupervised learning methods (e.g. clustering and dimensionality reduction). It also contains a lot more background on the things we covered here.

Chapter 5 of the Python Data Science Handbook#

We have used the book before. It starts by introducing some fundamental concepts and continues with in-depth chapters for different methods. Definitely have a look if you want to use a method we did not cover in enough detail here.

JEP Paper#

This paper by Mullainathan and Spiess (Journal of Economic Perspectives) is a great summary of the similarities and differences between econometrics and machine learning.

Online courses#

If you want to get much deeper into machine learning, you can have a look at Sebastian Raschka’s online course. He also has a very recent course on deep learning.

Summary paper about scores for multiclass classification#

A comprehensive paper on the different scores for multiclass classification.
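To get a feel for how such scores can differ, the following sketch computes a few of them with `sklearn.metrics`. The labels in `y_true` and `y_pred` are made up for illustration (an imbalanced three-class problem in which the rare class is never predicted correctly), and the selection of scores is an assumption, not a summary of the paper.

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score

# Made-up labels: class 0 is common, class 2 is rare and always misclassified.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 2, 0]

scores = {
    "accuracy": accuracy_score(y_true, y_pred),
    # Mean of the per-class recalls, so every class counts equally.
    "balanced_accuracy": balanced_accuracy_score(y_true, y_pred),
    # Micro: pool all decisions; equals accuracy for single-label problems.
    "f1_micro": f1_score(y_true, y_pred, average="micro"),
    # Macro: unweighted mean of the per-class F1 scores.
    "f1_macro": f1_score(y_true, y_pred, average="macro"),
    # Weighted: per-class F1 scores weighted by class frequency.
    "f1_weighted": f1_score(y_true, y_pred, average="weighted"),
}

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

On these labels the macro-averaged F1 comes out lower than the micro-averaged one, because the missed rare class contributes a per-class F1 of zero with the same weight as the common classes.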