Chapter 1 Introduction

This is a collection of Python code examples, designed as a companion to machine learning lecture notes. A (reasonably) up-to-date version of these notes is on the UW faculty page; the original is in the BitBucket repository.

1.1 Topics

These notes cover broadly the following topics:

  • Python, and handling data in Python
  • linear and logistic regression
  • a selection of machine learning models using sklearn
  • convolutional networks with keras.

1.2 Scope of these notes

These notes focus on Python-specific coding concepts, on how to solve certain data science tasks in Python, and on the basic functionality of the more popular libraries. More detailed functionality is not covered. This is a deliberate choice: many questions are easier to answer with a quick web search than by reading textbooks like this one. This applies especially to the use of library functions, most of which have a large number of arguments that are easy enough to look up in the corresponding documentation.

However, not everything can be easily looked up. In particular, if the underlying concepts are not clear, then neither documentation nor examples on Stack Overflow may be of much help. So these notes focus on concepts, such as indexing and slicing, indexes in data frames, or the basic functionality of the statsmodels library.
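As a small illustration of what "concept" means here, the following is a minimal sketch of indexing and slicing on a plain Python list. The variable names and values are made up for this example and are not taken from the lecture notes.

```python
# Made-up example data: a plain Python list of numbers.
temperatures = [12.5, 14.0, 15.5, 13.0, 11.0]

first = temperatures[0]          # indexing: a single element, counted from 0
last = temperatures[-1]          # negative indices count from the end
middle = temperatures[1:4]       # slicing: elements 1, 2, 3 (end index excluded)
every_other = temperatures[::2]  # a step of 2 picks elements 0, 2, 4

print(first, last)
print(middle)
print(every_other)
```

Once it is clear that indices start at 0 and that a slice excludes its end point, the details of any particular function are easy to look up.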

Second, these notes do not attempt to provide a comprehensive review of any aspect of Python. Such reviews are left to well-written books, such as Lubanovic's Introducing Python: Modern Computing in Simple Packages or McKinney's Python for Data Analysis. Instead, these notes are intended to be a quick, easy, but superficial introduction. There is much more functionality in Python than what is covered here!

Finally, these notes barely discuss the underlying mathematical concepts. Again, this is a deliberate choice, as they are designed to be a companion to the lecture notes, not a substitute for them.