Twelve Most Popular Machine Learning Algorithms Described

Overview: Many algorithms can be used to build machine learning models. Listed below are 12 of the most popular, each with a relatively simple definition, the type of algorithm (regression, classification, dimensionality reduction, or clustering), and an example use case.


1) Linear Regression (Ordinary Least Squares, OLS) – OLS seeks to draw a line through a scatter plot of data points in such a way that the total sum of squared differences between the observed values and the values predicted by the line is as small as possible.

Algorithm type: Regression

Example use: Predicting housing prices based on features such as size, location, and number of bedrooms, a fundamental task in real estate valuation.
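As a rough illustration of the least-squares idea described above, here is a minimal sketch for a single feature. The function name `fit_ols` and the housing numbers are hypothetical; the data is deliberately perfectly linear so the result is easy to check by hand.

```python
def fit_ols(xs, ys):
    """Fit y = intercept + slope * x by minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for a single feature.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: house size in square meters, price in thousands of dollars.
sizes = [50, 80, 100, 120, 150]
prices = [150, 240, 300, 360, 450]

intercept, slope = fit_ols(sizes, prices)
predicted_price = intercept + slope * 110  # estimate for a 110 m^2 house
print(f"price = {intercept:.2f} + {slope:.2f} * size")
print(f"predicted price for 110 m^2: {predicted_price:.1f}")
```

With several features (location, bedrooms, and so on) the same idea applies, but the solution is usually computed with matrix operations from a numerical library rather than by hand.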


2) Linear Regression (Stochastic Gradient Descent, SGD) – SGD is a method for finding the minimum of a function, in this case the cost function of a linear regression. Unlike Ordinary Least Squares (OLS), which calculates the regression coefficients using all the data at once, SGD updates the coefficients (or weights) incrementally, using a single example or a small batch of data at each step. This makes SGD particularly useful for large datasets that are impractical to fit into memory all at once.

Algorithm type: Regression

Example use: Large-scale financial forecasting, such as predicting stock prices, where SGD can handle vast datasets efficiently.
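The incremental updates described above might look like the following sketch, which adjusts the coefficients after every single data point. The function name `fit_sgd`, the learning rate, the number of passes, and the data are all assumptions chosen for illustration.

```python
import random


def fit_sgd(xs, ys, learning_rate=0.1, epochs=2000, seed=0):
    """Fit y = intercept + slope * x, updating the weights one example at a time."""
    rng = random.Random(seed)
    intercept, slope = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a new random order each pass
        for i in order:
            error = (intercept + slope * xs[i]) - ys[i]
            # Step against the gradient of this one example's squared error
            # (the constant factor 2 is folded into the learning rate).
            intercept -= learning_rate * error
            slope -= learning_rate * error * xs[i]
    return intercept, slope


# Hypothetical, noise-free data generated from y = 3 * x.
xs = [0.5, 0.8, 1.0, 1.2, 1.5]
ys = [1.5, 2.4, 3.0, 3.6, 4.5]

intercept, slope = fit_sgd(xs, ys)
print(f"intercept ~ {intercept:.3f}, slope ~ {slope:.3f}")
```

Because only one example is needed per update, the data could just as well be streamed from disk in pieces instead of being held in a list.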


3) Logistic Regression (Binary) – Logistic regression is a statistical method for classifying data into two categories, such as predicting whether an email is spam or whether a customer will churn (cancel their service). Unlike linear regression, which predicts continuous values, logistic regression passes its result through the sigmoid function to produce a probability between 0 and 1. This makes it well suited to situations with two distinct outcomes, like “yes” or “no.”

Algorithm type: Classification

Example use: Email spam detection by classifying emails into spam or not spam. It’s widely used in filtering unwanted emails.
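To make the role of the sigmoid function concrete, here is a small sketch that fits a one-feature logistic regression with plain gradient descent. The feature (a count of suspicious keywords per email), the labels, and the names `fit_logistic` and `spam_probability` are hypothetical.

```python
import math


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, labels, learning_rate=0.1, epochs=3000):
    """Fit P(label = 1 | x) = sigmoid(bias + weight * x) by gradient descent on log loss."""
    bias, weight = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(bias + weight * x) - y for x, y in zip(xs, labels)]
        bias -= learning_rate * sum(errors) / n
        weight -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
    return bias, weight


# Hypothetical data: suspicious keywords per email, and 1 = spam, 0 = not spam.
keyword_counts = [0, 1, 2, 3, 5, 6, 7, 8]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

bias, weight = fit_logistic(keyword_counts, is_spam)


def spam_probability(count):
    return sigmoid(bias + weight * count)


print(f"P(spam | 1 keyword)  = {spam_probability(1):.3f}")
print(f"P(spam | 7 keywords) = {spam_probability(7):.3f}")
```

An email is then labeled spam when its probability exceeds a chosen cutoff, commonly 0.5.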


4) Logistic Regression (Multiclass One-vs-Rest, OVR) – This is a strategy for handling more than two classes with logistic regression. It trains a separate binary model for each class, treating that class as the positive outcome and combining all other classes into a single negative outcome. For classes like ‘cat’, ‘dog’, and ‘bird’, you’d have one model that tells cat from not-cat, another for dog versus not-dog, and a third for bird versus not-bird. To classify a new item, every model scores it and the class whose model reports the highest probability wins. This reduces the multiclass problem to several binary classification problems.

Algorithm type: Classification

Example use: Medical diagnosis, such as classifying the stages of a disease (none, mild, moderate, severe) based on symptoms and test results.
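The one-model-per-class idea can be sketched as follows, reusing the cat, dog, and bird classes from the description. The two numeric features, the training settings, and the function names are assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_binary(points, targets, learning_rate=0.2, epochs=5000):
    """Binary logistic regression; returns weights as [bias, w1, w2, ...]."""
    weights = [0.0] * (len(points[0]) + 1)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for point, target in zip(points, targets):
            features = [1.0] + list(point)
            error = sigmoid(sum(w * f for w, f in zip(weights, features))) - target
            for j, f in enumerate(features):
                grads[j] += error * f
        weights = [w - learning_rate * g / len(points) for w, g in zip(weights, grads)]
    return weights


def fit_one_vs_rest(points, labels):
    """Train one 'this class vs. everything else' model per class."""
    models = {}
    for cls in sorted(set(labels)):
        targets = [1.0 if label == cls else 0.0 for label in labels]
        models[cls] = fit_binary(points, targets)
    return models


def predict(models, point):
    """Pick the class whose model reports the highest probability."""
    features = [1.0] + list(point)
    scores = {cls: sigmoid(sum(w * f for w, f in zip(weights, features)))
              for cls, weights in models.items()}
    return max(scores, key=scores.get)


# Hypothetical two-feature measurements for three kinds of animal.
points = [(0, 0), (0.5, 0.5), (1, 0),      # cats
          (4, 0), (4.5, 0.5), (5, 0),      # dogs
          (2, 4), (2.5, 4.5), (3, 4)]      # birds
labels = ["cat"] * 3 + ["dog"] * 3 + ["bird"] * 3

models = fit_one_vs_rest(points, labels)
print(predict(models, (0.3, 0.2)), predict(models, (4.5, 0.2)), predict(models, (2.5, 4.2)))
```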


5) Decision Tree – An algorithm that makes decisions by splitting data into branches at certain points, similar to how a tree splits into branches. At each branch, it asks a simple yes-or-no question about the data, and it keeps branching until it reaches a final decision, much like following a path from the trunk to a leaf. It’s like a flowchart that leads to a conclusion based on the answers given along the way.

Algorithm type: Classification (decision trees can also be used for regression)

Example use: Credit scoring by financial institutions to decide whether to grant a loan based on customer attributes.
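Here is one way the yes-or-no splitting could be sketched. It chooses each question by Gini impurity, which is one common criterion and an assumption here, as are the loan data and the function names.

```python
from collections import Counter


def gini(labels):
    """Impurity of a group: 0 when every label is the same."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Find the (feature, threshold) question that best separates the labels."""
    best, best_score = None, gini(labels)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    feature, threshold = split
    left = [(r, l) for r, l in zip(rows, labels) if r[feature] <= threshold]
    right = [(r, l) for r, l in zip(rows, labels) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [l for _, l in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the questions from the root until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Hypothetical loan applicants: (income, existing debt), both in thousands.
applicants = [(30, 5), (40, 10), (45, 40), (60, 10), (70, 20), (80, 50), (90, 15), (65, 45)]
decisions = ["deny", "deny", "deny", "approve", "approve", "deny", "approve", "deny"]

tree = build_tree(applicants, decisions)
print(predict(tree, (75, 12)), predict(tree, (35, 10)), predict(tree, (85, 48)))
```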


6) Random Forest Classifier – An algorithm that combines the decisions of many decision trees to improve accuracy. Each tree is trained on a different random sample of the data (and considers a random subset of the features), so the trees make different mistakes. It’s like consulting a group of experts instead of just one; each tree in the “forest” casts a vote, and the most popular result becomes the final decision. This makes the final answer more reliable than that of a single decision tree.

Algorithm type: Classification

Example use: Predicting customer churn in telecommunications by analyzing customer data to identify likely churners.
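The voting idea can be sketched with a deliberately simplified forest. As a loud assumption, each "tree" below is only a one-question stump trained on a random resample of the data and one randomly chosen feature, which stands in for the full trees a real random forest would grow. The churn data and all names are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, feature):
    """Fit a one-question tree: 'is row[feature] <= threshold?'."""
    values = sorted({row[feature] for row in rows})
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2  # candidate cut halfway between neighbors
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # the sample had a single value for this feature
        return feature, float("inf"), majority(labels), majority(labels)
    return feature, best[1], best[2], best[3]


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]  # resample with replacement
        feature = rng.randrange(len(rows[0]))              # random feature for this tree
        forest.append(fit_stump([rows[i] for i in sample],
                                [labels[i] for i in sample], feature))
    return forest


def predict(forest, row):
    """Every tree votes; the most popular label wins."""
    votes = Counter(
        left_label if row[feature] <= threshold else right_label
        for feature, threshold, left_label, right_label in forest
    )
    return votes.most_common(1)[0][0]


# Hypothetical customers: (support calls per month, months as a customer).
customers = [(0, 36), (1, 48), (1, 30), (2, 60), (0, 24),
             (6, 3), (7, 6), (5, 2), (8, 4), (6, 8)]
outcomes = ["stay"] * 5 + ["churn"] * 5

forest = fit_forest(customers, outcomes)
print(predict(forest, (7, 5)), predict(forest, (1, 40)))
```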


7) Support Vector Machines (SVM) – An algorithm that finds the best boundary separating different groups of data points. Imagine drawing a line on the ground with cats on one side and dogs on the other; SVM finds the line that keeps the cats and dogs apart. It not only separates the groups but also looks for the line with the widest gap (the margin) on both sides, making the separation as clear as possible.

Algorithm type: Classification

Example use: Image classification tasks, such as distinguishing between different types of objects in satellite images.
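One simple way to approximate the widest-gap line is to train a linear SVM by repeatedly nudging the line whenever a point falls inside the margin. The sketch below assumes that approach (sub-gradient descent on the hinge loss); the learning rate, regularization strength, data, and names are illustrative choices, with cats labeled -1 and dogs +1.

```python
def fit_linear_svm(points, labels, learning_rate=0.01, reg=0.01, epochs=2000):
    """Linear SVM trained by sub-gradient descent on the regularized hinge loss.

    Labels must be -1 or +1.
    """
    weights = [0.0] * len(points[0])
    bias = 0.0
    for _ in range(epochs):
        for point, label in zip(points, labels):
            margin = label * (sum(w * x for w, x in zip(weights, point)) + bias)
            # Shrinking the weights favors a wider gap between the groups.
            weights = [w - learning_rate * reg * w for w in weights]
            if margin < 1:
                # The point is on the wrong side or inside the gap: push the line away.
                weights = [w + learning_rate * label * x for w, x in zip(weights, point)]
                bias += learning_rate * label
    return weights, bias


def decision(weights, bias, point):
    """Positive means the 'dog' side of the line, negative the 'cat' side."""
    return sum(w * x for w, x in zip(weights, point)) + bias


# Hypothetical two-feature measurements: cats are -1, dogs are +1.
animals = [(1, 1), (2, 1), (1, 2), (4, 4), (5, 4), (4, 5)]
sides = [-1, -1, -1, 1, 1, 1]

weights, bias = fit_linear_svm(animals, sides)
print(decision(weights, bias, (1, 1)), decision(weights, bias, (5, 5)))
```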


8) K-Nearest Neighbor – Classifies a new item based on the most common category among its closest, existing neighbors. If you’re trying to figure out if a fruit is an apple or an orange, KNN looks at the nearest fruits to it; if most are oranges, the new fruit is likely an orange too. It’s a straightforward method that decides the group of a new point by the majority vote of its ‘k’ nearest points in the data.

Algorithm type: Classification

Example use: Recommending products on e-commerce platforms by finding products like those a user has liked in the past.
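The majority vote among neighbors is short enough to sketch directly. The fruit measurements and the name `knn_predict` are hypothetical, and the features are assumed to be on comparable scales so that plain distance is meaningful.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical fruit: (diameter in cm, redness on a 0-10 scale).
fruits = [(7.0, 8), (7.5, 9), (6.8, 7), (8.0, 2), (7.8, 3), (8.2, 1)]
names = ["apple", "apple", "apple", "orange", "orange", "orange"]

print(knn_predict(fruits, names, (7.9, 2.5)))  # orange
print(knn_predict(fruits, names, (7.1, 8.5)))  # apple
```

There is no training step at all: the algorithm simply stores the data and does its work when a new item arrives.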


9) Naive Bayes – Makes predictions by using probabilities based on past knowledge. It’s like guessing the outcome of a game based on which team has won more in the past; Naive Bayes uses the frequency of past events to predict future events. It’s called ‘naive’ because it assumes that, within each category, every feature of the data is independent of the others, which simplifies the calculation but is not always true in real-life situations.

Algorithm type: Classification

Example use: Naive Bayes algorithms are often used in email services to filter out spam. By analyzing the frequency and patterns of words in messages, the algorithm calculates the probability of an email being spam or legitimate. If an incoming message has many characteristics of known spam (like certain keywords or phrases), the Naive Bayes classifier is likely to flag it and move it to the spam folder, helping keep users’ inboxes free of unwanted mail.
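The word-frequency reasoning described above can be sketched as a tiny spam filter. This version counts words per category and applies add-one smoothing so unseen words do not zero out a probability; the messages, labels, and function names are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def fit_naive_bayes(documents, labels):
    """Count how often each word appears in each category."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(documents, labels):
        word_counts[label].update(text.lower().split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the category with the highest (log) probability for the text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, doc_count in class_counts.items():
        total_words = sum(word_counts[label].values())
        log_prob = math.log(doc_count / total_docs)  # how common the category is
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one smoothing; words are treated as independent given the category.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            log_prob += math.log(likelihood)
        scores[label] = log_prob
    return max(scores, key=scores.get)


# Hypothetical training messages.
messages = [
    "win free prize now", "free money offer", "claim your free prize",
    "meeting at noon tomorrow", "project status report attached", "lunch tomorrow at noon",
]
kinds = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = fit_naive_bayes(messages, kinds)
print(predict(model, "free prize offer"))         # spam
print(predict(model, "status meeting tomorrow"))  # ham
```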


10) Principal Component Analysis (PCA) – An algorithm that simplifies a complex dataset by combining many variables into just a few new ones, called principal components, chosen to keep as much of the variation in the data as possible. Think of it like summarizing a long book into a few key sentences that capture the main points. This makes it easier to analyze and visualize the data by focusing on the most important patterns.

Algorithm type: Dimensionality reduction

Example use: Managing and simplifying the complexity of large investment portfolios. By identifying patterns and correlations among various financial instruments, PCA can distill these into a set of principal components, which are the underlying factors that drive the majority of the portfolio’s movement. This simplification helps investors and portfolio managers identify the key risks and return drivers, aiding in making more informed investment decisions and in the diversification of investment portfolios to minimize risk.
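As a small illustration of combining variables, the sketch below reduces two correlated variables to one. It finds the direction of greatest variation with power iteration, a simple technique assumed here for brevity (numerical libraries typically use a matrix decomposition instead). The return figures and function names are hypothetical.

```python
import math


def first_principal_component(points, iterations=100):
    """Return (mean, direction): direction is the unit vector of greatest variance."""
    n, dims = len(points), len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dims)]
    centered = [[p[d] - mean[d] for d in range(dims)] for p in points]
    cov = [[sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dims)]
           for i in range(dims)]
    # Power iteration: repeatedly multiplying by the covariance matrix pulls the
    # vector toward the dominant direction. Assumes the starting vector is not
    # exactly perpendicular to that direction.
    direction = [1.0] * dims
    for _ in range(iterations):
        product = [sum(cov[i][j] * direction[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(v * v for v in product))
        direction = [v / norm for v in product]
    return mean, direction


def project(points, mean, direction):
    """Collapse each point to one number: its position along the direction."""
    return [sum((value - m) * d for value, m, d in zip(point, mean, direction))
            for point in points]


# Hypothetical returns (%) of two assets that tend to move together.
returns = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]

mean, direction = first_principal_component(returns)
projected = project(returns, mean, direction)
print("direction:", [round(v, 3) for v in direction])
print("one-number summary per day:", [round(v, 2) for v in projected])
```

Because the two assets move almost in lockstep, the direction weights them nearly equally, and a single number per day captures most of what the original two columns said.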


11) t-SNE (t-Distributed Stochastic Neighbor Embedding) – Used to make sense of data by visually grouping similar items together when plotted on a graph. Imagine having a messy room full of different toys; t-SNE would be like organizing these toys into clusters where similar toys are placed closer together, making it easier to see what types of toys you have. It’s especially useful for understanding complex datasets with many variables by creating a 2D or 3D map, simplifying the data into a form that’s easy to visualize and interpret.

Algorithm type: Dimensionality reduction

Example use: t-SNE is particularly powerful in the field of bioinformatics, especially in the analysis of genetic data. For instance, it’s used to visualize the gene expression profiles of different types of cells in cancer research. By applying t-SNE to high-dimensional genetic data, researchers can plot these cells in a two- or three-dimensional space, where cells with similar gene expression are grouped together. This visualization helps identify previously unknown or unclassified cell types based on their genetic similarities, providing insights into cancer progression and response to treatment and pointing to new targets for therapy. It’s a tool that turns complex genetic information into a map that can reveal new pathways for understanding diseases.
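A full t-SNE implementation is too long to show here, so the sketch below covers only its core idea: measure how similar points are in the original data, measure how similar they are on the 2D map, and score the map by how badly the two disagree. It assumes a fixed neighborhood width `sigma` (real t-SNE tunes this per point from a "perplexity" setting) and it only scores candidate maps rather than optimizing one. All names and data are hypothetical.

```python
import math


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def high_dim_affinities(points, sigma=1.0):
    """Gaussian similarities between the original points (symmetric, sums to 1)."""
    n = len(points)
    conditional = []
    for i in range(n):
        row = [0.0 if i == j
               else math.exp(-squared_distance(points[i], points[j]) / (2 * sigma ** 2))
               for j in range(n)]
        total = sum(row)
        conditional.append([v / total for v in row])
    return [[(conditional[i][j] + conditional[j][i]) / (2 * n) for j in range(n)]
            for i in range(n)]


def low_dim_affinities(embedding):
    """Heavy-tailed (Student-t) similarities between points on the map (sums to 1)."""
    n = len(embedding)
    kernel = [[0.0 if i == j
               else 1.0 / (1.0 + squared_distance(embedding[i], embedding[j]))
               for j in range(n)] for i in range(n)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def kl_divergence(p, q):
    """Mismatch between the two sets of similarities; t-SNE tries to make this small."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row) if pij > 0)


# Hypothetical 4-dimensional data with two obvious groups of three points each.
data = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0),
        (5, 5, 5, 5), (6, 5, 5, 5), (5, 6, 5, 5)]
p = high_dim_affinities(data)

# Two candidate 2D maps: one keeps the groups together, one swaps two points.
good_map = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
mixed_map = [(0, 0), (10, 10), (0, 1), (1, 0), (11, 10), (10, 11)]

kl_good = kl_divergence(p, low_dim_affinities(good_map))
kl_mixed = kl_divergence(p, low_dim_affinities(mixed_map))
print(f"mismatch of good map:  {kl_good:.4f}")
print(f"mismatch of mixed map: {kl_mixed:.4f}")
```

The map that keeps similar points together scores a much lower mismatch. The actual algorithm starts from a random map and moves the points step by step to reduce this score.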


12) KMeans Clustering – The algorithm picks a specified number of points, called centroids, and assigns each data point to the centroid it is closest to. Each centroid is then moved to the average position of the points assigned to it, and the two steps repeat until the groups stop changing. This process creates clusters of data points that are similar to each other, making it easier to understand and analyze large sets of data by dividing them into manageable groups. KMeans Clustering is like organizing a large group of people into smaller groups based on their interests or characteristics, without knowing any details about the groups beforehand.

Algorithm type: Clustering

Example use: Market segmentation by grouping customers into clusters based on purchasing behavior and preferences to tailor marketing strategies.
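The assign-then-move loop can be sketched in a few lines. The customer data, the choice of starting centroids, the fixed number of rounds, and the name `k_means` are assumptions made for the example.

```python
import math


def k_means(points, initial_centroids, iterations=10):
    """Alternate between assigning points to centroids and moving the centroids."""
    centroids = [list(c) for c in initial_centroids]
    assignments = []
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        # Step 2: move each centroid to the average of the points assigned to it.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments


# Hypothetical customers: (orders per month, average basket size in tens of dollars).
customers = [(1, 2), (1.5, 1.8), (1, 1.5), (8, 8), (9, 8.5), (8.5, 9)]

# Start from two of the data points; in practice the start is usually random.
centroids, assignments = k_means(customers, [customers[0], customers[3]])
print("cluster of each customer:", assignments)  # [0, 0, 0, 1, 1, 1]
print("centroids:", [[round(v, 2) for v in c] for c in centroids])
```

The number of clusters has to be chosen up front, and different starting centroids can lead to different groupings, so the algorithm is often run several times.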