Written by Katherine Giron Pe
on March 25, 2018

Data Science Math Skills

About a month ago, I completed the Data Science Math Skills course, an introductory course on Mathematics for Data Science. Because I have free access to some premium courses as a mentor, this is not the first time I have encountered a course like it. In fact, much of the course content is part of our high school, undergraduate and post-graduate curriculum. If you studied Discrete Mathematics or Discrete Structures in Computer Science, you probably already know most of what Data Science Math Skills aims to teach. The familiarity from prior Computer Science courses and a lot of Mathematics in my early years helped a lot, but I still struggled with connecting these concepts to reality. This is why I am writing this post, and why I am sharing how I am learning Mathematics for Data Science.

Data Science Math Skills Notes

Many people gave me very positive feedback about my Data Science Math Skills notes. They are, without a doubt, helpful for me, because the process of writing helps me remember. I am even happier that they are helpful to someone else. I found it fun to write the notes in R Markdown and LaTeX. They can later be compiled to HTML and other formats, but for now, I just want the PDF version. Feel free to download that and read it. The repo will be very active.

Discrete Mathematics and Its Applications

This may repeat my answer to the Quora question “As a programmer with weak math skills, what is the best path to learn discrete mathematics?” but I have a few other notes on why you should study Discrete Mathematics before studying Data Science Mathematics.

Of all the Mathematics courses I took, Discrete Mathematics has neither the tediousness of Mathematics of Finance nor the lack of challenge of Statistics. It is very interesting Mathematics that looks a lot like Logic and programming theory.

Probability, Propositional Logic, Set Theory, Functions, Combinatorics, Graph Theory and Computational Complexity are all necessary for understanding both Data Science and Programming, and Data Science does involve a lot of programming. The good news is that this branch of Mathematics is the least boring of them all. I have taken a lot of Mathematics courses, and this was certainly the most interesting one.

I highly recommend reading the book Discrete Mathematics and Its Applications by Kenneth Rosen. If you’ve taken the Data Science Math Skills course and still feel very clueless, reading the book will help you out.

Think Bayes

Someone can teach you Bayes' Theorem in ten minutes. Watch a few more YouTube videos, and it will sink in. It is most helpful for understanding classification algorithms like Naive Bayes. In fact, that is probably the most practical application of learning Bayes' Theorem. I used this classifier in the past for intuitively classifying some information. Apparently, its accuracy depends on how you optimize the training data sets for this algorithm.
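To make the theorem concrete, here is a minimal sketch of the calculation that a classifier like Naive Bayes repeats for every feature. The function name and the spam numbers are made up for illustration; they are not from the course or from any real data set.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) using Bayes' Theorem.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # P(E), from the law of total probability
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 20% of messages are spam, the word "free"
# appears in 60% of spam and in 5% of non-spam messages.
p_spam_given_free = bayes_posterior(
    prior=0.2, likelihood=0.6, false_positive_rate=0.05
)
print(round(p_spam_given_free, 3))
```

With these assumed numbers, seeing the word "free" raises the probability of spam from 20% to 75%.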

There are other approaches, like Frequentist inference, and other classifiers, like Logistic Regression, but from what I have read, many research papers favor Bayesian Thinking. Bayesian Thinking starts from your prior beliefs and updates them with new data to arrive at a posterior belief. Frequentist Thinking does not consider your initial beliefs.
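The difference can be sketched with a coin-flip example. This is only an illustration under my own assumptions: the prior is a Beta(2, 2) distribution (a mild belief that the coin is fair), the data are 7 heads in 10 flips, and all the variable names are hypothetical.

```python
# Assumed prior belief about the chance of heads: Beta(2, 2)
prior_heads, prior_tails = 2, 2

# Assumed new data: 7 heads out of 10 flips
heads, flips = 7, 10
tails = flips - heads

# Frequentist estimate: uses only the observed data
frequentist_estimate = heads / flips

# Bayesian estimate: the Beta prior plus the data gives a Beta posterior,
# and its mean blends the prior belief with the observations
posterior_heads = prior_heads + heads
posterior_tails = prior_tails + tails
bayesian_estimate = posterior_heads / (posterior_heads + posterior_tails)

print(frequentist_estimate)
print(round(bayesian_estimate, 3))
```

The Bayesian estimate (9/14, about 0.643) sits between the prior belief of 0.5 and the frequentist estimate of 0.7. With more flips, the data outweigh the prior and the two estimates move closer together.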

I have been reading the book Think Bayes by Allen B. Downey, and it is probably the most definitive book about Bayes' Theorem.

More Notes

This is the first of many posts about learning Data Science and Computer Science. My notes are more helpful than the posts. I may have to curate and organize all these soon.

I make a lot of notes because I feel that a lot of the resources given out in university hide the more important lessons we need to learn. Even if you take a course at a good university, I think it will only upgrade your skills from 0 to 0.1 if you rely on a single resource.
