This article gives you everything you need to get started with machine learning. I have crawled the web for hours to find these 15 best cheat sheets in machine learning. Each cheat sheet link points directly to the PDF file. So don’t lose any more time, and start learning faster with these 15 ML cheat sheets.
Why should you care about cheat sheets?
Have you ever studied Pareto’s 80/20 principle: 20% of the causes are responsible for 80% of the effects? Here are a few real-world statistics showing the Pareto principle at work:
- “The top 100 words build about 50% of our language” (source).
- “20% of the world’s population controls 82.7% of the world’s income” (source).
- “20% of the customers are responsible for the 80% of the profit earned” (source).
Cheat sheets are the 80/20 principle applied to coding: learn 80% of the relevant material in 20% of the time.
This article compiles the list of all the best cheat sheets for machine learning. Are you a practitioner and want to move towards machine learning and data science? Are you a young data scientist just starting out with your career? Or are you a computer science student struggling to find a clear path of how to master the intimidating area of machine learning? Then check out these cheat sheets to make your life easier.
1. Supervised Learning (Afshine Amidi)
This cheat sheet is the first part of a series of cheat sheets created for the Stanford Machine Learning Class. It gives you a short and concise introduction to supervised learning.
Topics include the following:
- Supervised learning notations,
- Linear regression,
- Classification,
- Logistic regression,
- Generalized linear models,
- Support vector machines,
- Generative learning,
- Gaussian discriminant analysis,
- Naive Bayes,
- Tree-based and ensemble methods, and
- General learning theory.
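To make one of these topics concrete, here is a minimal sketch of linear regression fitted by least squares with NumPy. The synthetic data, the true slope of 3, and the true intercept of 2 are assumptions chosen for illustration; they do not come from the cheat sheet.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 3x + 2 plus small noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.1, size=100)

# Add a column of ones so the model can learn an intercept.
X_design = np.column_stack([np.ones(len(X)), X])

# Solve the least-squares problem min ||X_design @ theta - y||^2.
theta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
intercept, slope = theta

print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

The fitted values should land close to the parameters used to generate the data, which is a quick sanity check you can apply to any regression code you write.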
2. Unsupervised Learning (Afshine Amidi)
This cheat sheet is the second part of the introductory series for the Stanford Machine Learning Class. It provides a concise introduction to unsupervised learning.
You will learn about these topics:
- Expectation-maximization (EM),
- K-means clustering,
- Hierarchical clustering,
- Clustering assessment metrics,
- Principal component analysis, and
- Independent component analysis.
3. Deep Learning (Afshine Amidi)
This is the third part of the cheat sheet series provided by the Stanford Machine Learning Class. It is densely packed with information and offers a promising kickstart into the hot topic of deep learning.
The cheat sheet addresses topics such as
- Introduction to neural networks,
- Entropy,
- Convolutional neural networks,
- Recurrent neural networks,
- Reinforcement learning, and
- Control.
Of course, this covers only a subspace of the broad field of deep learning, but it will give you a short and effective start into this attractive area.
4. Machine Learning Tips and Tricks (Afshine Amidi)
The fourth part of the cheat sheet series provided as part of the Stanford Machine Learning Class promises small tips and tricks in machine learning. Although the author calls it “Tips and Tricks”, I believe this is an understatement. In reality, this cheat sheet gives you valuable insights from a highly skilled practitioner in the field.
The topics include, but are not limited to,
- Metrics,
- Classification,
- Regression,
- Model selection, and
- Diagnostics.
A must-read for upcoming data scientists.
5. Probabilities and Statistics (Afshine Amidi)
The fifth part of the cheat sheet series of the Stanford Machine Learning Class gives you a quick start (they call it a “refresher”) in the crucial area of probability theory and statistics. No matter in which field you will end up working, statistics will always help you on your path to becoming a machine learning professional. This refresher is definitely worth a read (and an investment of your printer ink).
Here are the topics addressed in this cheat sheet:
- Introduction to probability and combinatorics,
- Conditional probability,
- Random variables,
- Joint distributions, and
- Parameter estimation.
Get this cheat sheet now!
6. Linear Algebra and Calculus (Afshine Amidi)
Although the sixth part of the popular cheat sheet series of the Stanford Machine Learning Class does not sound too sexy, it teaches a fundamental area each machine learning professional knows well: linear algebra.
Do you struggle to understand this critical topic? Your lack of understanding will cost you weeks as soon as you start implementing practical machine learning algorithms. Simply put: you have to master linear algebra. There is no way around it. So do it now and do it well.
What are the precise topics included in this cheat sheet?
- Standard matrix notation,
- Matrix operations,
- Matrix properties, and
- Matrix calculus (gradient operations).
You see, it’s all about matrices. Before you even consider diving into practical libraries used in machine learning (such as Python’s NumPy, check out my HUGE NumPy tutorial), study this cheat sheet first.
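As a small taste of what these matrix topics look like in practice, here is a hedged NumPy sketch. The matrix `A`, the vector `x`, and the function `f` are made-up examples, not taken from the cheat sheet. The sketch shows a few basic matrix operations and checks the analytic gradient of the quadratic form f(x) = xᵀAx, which is (A + Aᵀ)x, against a numerical approximation.

```python
import numpy as np

# Example values (assumptions for illustration only).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = np.array([1.0, -1.0])

# Basic matrix operations.
product = A @ x               # matrix-vector product
transpose = A.T               # transpose
inverse = np.linalg.inv(A)    # inverse (A is invertible here)

def f(v):
    """Quadratic form f(v) = v^T A v."""
    return v @ A @ v

# Matrix calculus: the gradient of v^T A v is (A + A^T) v.
analytic_grad = (A + A.T) @ x

# Central finite differences along each coordinate axis as a sanity check.
eps = 1e-6
numeric_grad = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(len(x))
])

print("A @ x         =", product)
print("analytic grad =", analytic_grad)
print("numeric grad  =", numeric_grad)
```

Comparing an analytic gradient with a finite-difference estimate like this is a common way to catch mistakes before they end up inside a training loop.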
7. Comprehensive Stanford Master Cheat Sheet (Afshine Amidi)
This cheat sheet combines the six cheat sheets of the Stanford Machine Learning Class. It is an awesome resource, packed with information on many important subfields of machine learning. I highly recommend downloading this resource and studying it for a whole day. It will boost your machine learning skills in little time.
The widely distributed topics of this 16-page cheat sheet include
- Supervised learning,
- Unsupervised learning,
- Deep learning,
- Machine learning tips and tricks,
- Probabilities and statistics, and
- Linear algebra and calculus.
Don’t lose any more time reading the rest of this article and download this cheat sheet. Thanks, Afshine, for this awesome resource!
8. Data Science Cheat Sheet (DataCamp)
The DataCamp cheat sheets are always worth a look. However, I would recommend this cheat sheet only for absolute beginners in the field of data science. If you focus on learning core machine learning concepts and you already have some experience, please skip this cheat sheet. But if you are just starting out with data science and machine learning – and you want to use Python as your programming language – this 1-page data science cheat sheet is for you.
The basic topics of this cheat sheet are
- Installing Python,
- Python variables and data types,
- Strings and string operations,
- Lists and list methods, and
- Basic NumPy functionality (NumPy is the Python library for basic linear algebra and matrix operations).
9. Keras Cheat Sheet (DataCamp)
This 1-page cheat sheet is worth your time if you are looking into the specialized machine learning tool Keras. I have not yet used Keras myself, but it is widely considered to be one of the best abstraction layers for deep learning and neural networks.
Wikipedia defines Keras as follows. “Keras is an open source neural network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, or Theano. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular, and extensible”.
With such broad applicability, I am convinced: I will check out Keras after finishing this blog post. Will you, too?
The Keras Cheat Sheet addresses the following points (from a code-centric perspective).
- Basic usage,
- Data and data structures,
- Preprocessing,
- Multilayer perceptron,
- Convolutional neural networks,
- Recurrent neural networks, and
- Model training, inference, & fine-tuning.
10. Deep Learning with Keras Cheat Sheet (RStudio)
Simply put: I love this cheat sheet. It’s about deep learning with the open-source neural network library Keras. It is visual, to the point, comprehensive, and understandable. I highly recommend checking out this cheat sheet!
- The 2-page cheat sheet gives you a quick overview of the Keras pipeline for deep learning.
- It shows you how to work with models (e.g. definition, training, prediction, fitting, and evaluation).
- Furthermore, it gives you a visual overview of how to access the diverse layers in the neural network.
- Finally, it provides a short but insightful example of the standard demo problem of handwriting recognition.
11. Visual Guide to Neural Network Architectures (Asimov Institute)
This 1-page visual guide gives you a quick overview of the most common neural network architectures that you will find in the wild. The sheet showcases 27 different architectures. As a machine learning newbie, you will not get much out of this sheet. However, if you are a practitioner in the field of neural networks, you will like it.
The cheat sheet shows 27 neural network architectures including
- Perceptron,
- Feedforward, radial basis network, deep feedforward,
- Recurrent neural network, long short-term memory (LSTM), gated recurrent unit,
- Autoencoder, variational autoencoder, denoising autoencoder, sparse autoencoder,
- Markov chain, Hopfield network,
- Boltzmann machine, restricted Boltzmann machine, deep belief network, and
- Deep convolutional network, deconvolutional network, deep convolutional inverse graphics network, generative adversarial network, liquid state machine, extreme learning machine, echo state network, deep residual network, Kohonen network, support vector machine, and neural Turing machine.
Phew, what a list!
12. Scikit-Learn Python Cheat Sheet (DataCamp)
This is another 1-page PDF cheat sheet that gives you a head start in scikit-learn, Python’s library for machine learning. It is one of the best single-CPU, general-purpose libraries for machine learning in Python. Python is the most popular programming language in the field of machine learning, so this cheat sheet gives you a lot of value. Get this cheat sheet if you use Python for machine learning.
The topics include
- Basic functionality such as loading and preprocessing the training data,
- Creating the model,
- Model fitting,
- Prediction and inference, and
- Evaluation metrics such as classification metrics, regression metrics, clustering metrics, cross-validation, and model tuning.
Be warned that these concepts are not explained in detail. The cheat sheet only shows how to use them in the scikit-learn library.
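To give you a feeling for how these steps fit together, here is a minimal sketch of the workflow from loading data to evaluation. The choice of the Iris dataset, the k-nearest-neighbors classifier, and all parameter values are my assumptions for illustration; the cheat sheet itself may use different examples.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# 1. Load the data and split it into training and test sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 2. Preprocess: fit the scaler on the training data only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 3. Create the model and fit it.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train_scaled, y_train)

# 4. Predict on unseen data.
y_pred = model.predict(X_test_scaled)

# 5. Evaluate with a classification metric.
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

Most scikit-learn estimators follow the same `fit` and `predict` pattern, so swapping in a different model usually means changing only the line that creates it.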
13. Scikit-learn Cheat Sheet: Choosing the Right Estimator (Scikit-learn.org)
This cheat sheet is so valuable – I cannot even describe it in words. Thanks, scikit-learn creators, for posting this awesome piece of art!
It helps you figure out which algorithm to use for which kind of problem. You simply follow the questions in the cheat sheet. As a result, you will reach the recommended algorithm for your problem at hand. This is why I love cheat sheets – they can deliver complex information in little time.
The cheat sheet divides the estimators into four classes:
- Classification,
- Clustering,
- Regression, and
- Dimensionality reduction.
Although those classes are not explored in depth, you will already know in which direction to look further. Of course, if you are already an experienced practitioner, the provided information may be too simplistic – but isn’t this true for every cheat sheet?
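The idea of following questions to reach one of the four classes can be sketched in a few lines of Python. This is a heavily simplified version based on my reading of the top-level questions in the flowchart; the function name `suggest_estimator_family`, its parameters, and the threshold of 50 samples are assumptions for illustration and not part of scikit-learn.

```python
def suggest_estimator_family(n_samples, goal, has_labels=False):
    """Return a rough estimator class for a problem (simplified, hypothetical helper).

    goal is one of "category", "quantity", or "just_looking".
    """
    # With very little data, the first advice is to collect more.
    if n_samples < 50:
        return "get more data"
    if goal == "category":
        # Labeled categories -> classification; unlabeled -> clustering.
        return "classification" if has_labels else "clustering"
    if goal == "quantity":
        return "regression"
    if goal == "just_looking":
        return "dimensionality reduction"
    raise ValueError(f"unknown goal: {goal!r}")


print(suggest_estimator_family(1000, "category", has_labels=True))
print(suggest_estimator_family(1000, "category", has_labels=False))
print(suggest_estimator_family(1000, "quantity"))
print(suggest_estimator_family(1000, "just_looking"))
```

The real cheat sheet goes further and recommends concrete algorithms within each class, depending on questions such as the number of samples.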
Form your own opinion now! (Do it.)
14. TensorFlow Cheat Sheet (Altoros)
Although this cheat sheet is not the most sophisticated one, it is still valuable as one of the few TensorFlow cheat sheets out there.
You know TensorFlow, don’t you? TensorFlow is one of the most popular GitHub projects, and it was created by Google. Its machine learning API is tailored to deep learning in a heterogeneous computing environment (including GPUs). Nowadays, if you work in the field of deep learning, there is no way you can avoid TensorFlow.
Get a first impression with this cheat sheet and then dive into Google’s TensorFlow system. By the way, you can also use Keras on top of TensorFlow as a more high-level abstraction layer. Check out the Keras cheat sheet described earlier.
The cheat sheet gives you hints about
- The correct installation method,
- Helper functions,
- The names of some important functions in TensorFlow, and
- Estimators.
To be frank, I would not recommend learning TensorFlow with this cheat sheet. Why? Because it is not focused on education. Yet, I felt obliged to include the link because there are no better alternatives for TensorFlow. If you know a better resource, please let me know.
15. Machine Learning Test Cheat Sheet (Cheatography)
Do you know Cheatography? It’s like Wikipedia for cheat sheets. Everybody can submit cheat sheets (user-generated content).
After going through most machine learning cheat sheets at Cheatography, I found that this one will be most helpful for most of our readers. It is a well-structured overview of some important machine learning algorithms.
- It shows you that there are three common problems in machine learning: regression, clustering, and classification.
- It gives you the general steps for training a model.
- Finally, it glances over a collection of specific algorithms that you should know when starting out in the field of machine learning. Those are logistic regression, decision tree, random forest, k-means, naive Bayes, k nearest neighbors, and support vector machines.
I know that it is only a first dip into the ocean. But if you are a beginner or intermediate machine learning practitioner, this may be just what you have been looking for.
Did you enjoy this collection of the 15 best machine learning cheat sheets on the web? I recommend downloading all 15 sheets, printing them, and working through each of them. This will give you a first overview of the field of machine learning. Later, you can decide which area to dive into further.
Bonus: Many hot machine learning systems (e.g. TensorFlow) require excellent Python programming skills. Do you know all the features, tips, and tricks of Python? If not, I recommend checking out this free Python cheat sheet email course.
The email course will not only provide you with 5 Python cheat sheets (80% of the learning in 20% of the time, remember?) but also with a constant stream of Python programming lectures. It’s 100% free, you can unsubscribe at any time, and I will not spam you. It’s pure value (and occasionally I will send you information about my books and courses). So check it out!