Introduction to Machine Learning

Updated: Sep 5

The first in a ten-part series on Machine Learning and its implications for investors.

Machine Learning (ML) automates prediction. The amount and variety of data, the power and usability of models, and the implications for investing will continue to increase. A key implication for investors is that ML is not an industry. ML is a utility that is increasingly embedded inside the global digital infrastructure. By utility we mean that it is exceedingly cheap and available to everyone through natural monopolies, just like water and electricity. However, unlike traditional utilities, ML-driven forecasts are, by design, not fully understood by their developers. As a result, everyone in the world will increasingly be reliant upon, and incentivized to change behavior by, a global utility that is increasingly incomprehensible.

This article is the first of ten in a series on Machine Learning published by IntuitEcon. If you are reading this on our public blog then you received it a month after publication. Subscribers may also submit questions and hear our live podcast every Sunday at 7pm EST. Learn more through the links below. [UPDATED: Podcast now on Monday 6PM EST]

Our introduction to Machine Learning is organized into five sections...

  1. The Rise of Machine Learning

  2. What is Machine Learning?

  3. How is ML Different from Statistics?

  4. Where is ML most useful?

  5. Where is ML Impacting Markets and the Economy?

1. The Rise of Machine Learning

Machine Learning (ML) is an application of artificial intelligence (AI) that uses computers to automate prediction. Humans have been interested in AI for many decades, but interest in ML is more recent. Google searches for “Machine Learning” grew by an order of magnitude over the past ten years and are now double those for its broader discipline, “Artificial Intelligence”. ML is everywhere: recommending new friends and products, and even driving cars. ML applications are so ubiquitous that tools are now available to help children understand how they work.


ML applications have been led by platform companies like Google and Facebook. The finance industry, including banks and Financial Technology (FinTech) firms, followed. Now ML is being broadly applied by leaders across essentially all industries to help improve decision making across business lines. We don’t even necessarily perceive it anymore. For example, the cheap availability of natural language processing (NLP) technology is making it easy for just about any company to employ chatbots to help retrieve information and improve customer experience. Forecasting models have been at the heart of data intensive businesses such as banking for decades, in operations like underwriting and fraud detection. ML is now extending beyond traditionally data intensive industries because of the rapid expansion of datafication.

Adoption of ML provides many benefits and new challenges. ML makes prediction cheaper and easier which may in turn reduce costs. But ML methods automate prediction in ways that can reduce transparency. Reducing transparency can increase risks if not properly appreciated and understood. But before diving into these issues and implications we first provide an overview of ML to help put these implications in context.

Deep Learning (DL) is the process used to train Deep Neural Networks, which are neural networks with more than one “hidden layer” separating features from predictions. Hidden layers are used to capture interactions between features. The invention of DL in 2006 was the major breakthrough that re-ignited interest in machine learning (ML). By 2010, DL started to surpass all other ML applications when given enough data. Much of what we see as rapid progress in ML is really incremental improvements in Deep Learning. As the amount of data we create and processing power continue to grow and become cheaper, so will the usefulness of Deep Learning and other ML methods.
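
To make the idea of a hidden layer capturing interactions more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a trained model: the weights are hand-picked, the function names are hypothetical, and the network has a single hidden layer (so it is not "deep"). The output depends on the interaction between the two inputs (one or the other, but not both), which no single weighted sum of the raw inputs can reproduce.

```python
def step(z):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def tiny_network(x1, x2):
    """Two inputs, one hidden layer with two units, one output.

    Weights are hand-set for illustration only; a real network would
    learn them from data.
    """
    h_or = step(x1 + x2 - 0.5)    # hidden unit: at least one input is on
    h_and = step(x1 + x2 - 1.5)   # hidden unit: both inputs are on
    # The output combines the hidden units: "or, but not and".
    return step(h_or - h_and - 0.5)


if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, "->", tiny_network(x1, x2))
```

Deep networks stack many such layers and learn the weights automatically, which is where both their power and their opacity come from.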

Reinforcement Learning allowed AlphaGo to beat the world’s leading Go champion in 2016. Reinforcement Learning is distinct from Deep Learning in that it is useful even when there is little data upon which to train. Reinforcement Learning is particularly useful in fields where the machine can learn from trial-and-error, such as a board game like “Go”.
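
Learning by trial-and-error can be sketched in a few lines. The example below is a deliberately simple stand-in (a two-option "bandit" problem with an epsilon-greedy rule), not how AlphaGo works. The payoff probabilities, the function name run_bandit, and all parameter values are assumptions made up for illustration. The machine starts with no data at all and generates its own by trying actions and observing rewards.

```python
import random


def run_bandit(win_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error over a handful of actions."""
    rng = random.Random(seed)
    counts = [0] * len(win_probs)     # times each action was tried
    values = [0.0] * len(win_probs)   # running average reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(win_probs))
        else:
            # Exploit: pick the action that has paid best so far.
            action = max(range(len(win_probs)), key=lambda a: values[a])
        reward = 1.0 if rng.random() < win_probs[action] else 0.0
        counts[action] += 1
        # Incremental update of the average reward for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


if __name__ == "__main__":
    values, counts = run_bandit([0.2, 0.8])   # hypothetical payoff odds
    print("estimated payoffs:", [round(v, 2) for v in values])
    print("times each action was tried:", counts)
```

The small share of random exploration is what lets the machine discover the better action without being told which one it is.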

Both Deep Learning and Reinforcement Learning still fall into the category of Artificial Narrow Intelligence (ANI) which means the machine cannot generalize what is learned. Deep Neural Networks trained to predict mortgage default rates cannot generalize this knowledge to commercial loans, or know when to use a different model for subprime. The underlying code is still simply finding correlations between numerical input data and defaults.

2. What is Machine Learning?

ML is commonly defined as the field of study within AI that gives computers the ability to learn from data without being explicitly programmed. A key source of confusion stems from the word “learning”, which is analogous to “training” or “calibrating”. All statistical models require training, but traditional statistics involves a lot of manual and often tedious steps. For this reason, we describe ML more simply as any model that uses computers to automate prediction.

The term “automation” in the context of ML is a matter of degree. Likewise, the degree to which a model is “machine learning” is also a matter of degree dependent on the extent to which humans are manually involved in the prediction process. Model development, implementation, monitoring, and recalibration are all part of the prediction process. One or more of these parts can be automated.

For example, a developer may judgmentally determine which features to include in the model, apply a Random Forest to structure how these features are mapped to a forecast, then recalibrate the model every quarter if forecasting errors breach some threshold. In this case, only a portion of the development process, the application of the Random Forest, is automated. The degree to which the developer manually sets “hyper-parameters”, such as the number of trees in the forest, is another aspect of automation. A purer ML application of a Random Forest would automate everything from feature selection to forecast.
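
The recalibration rule in this example can be sketched as follows. This is a minimal illustration under loud assumptions: the "model" is just the average of the last fitted quarter (a stand-in for something richer such as a Random Forest), and the data, the threshold, and the function names are all hypothetical. The point is only that the "refit when errors breach a threshold" step is something a computer can do without a human in the loop.

```python
def mean(values):
    return sum(values) / len(values)


def run_quarters(quarters, threshold):
    """Forecast each quarter with the current model; refit only on a breach."""
    forecast = mean(quarters[0])          # initial calibration on quarter one
    log = []
    for observed in quarters[1:]:
        # Mean absolute forecasting error for the quarter.
        error = mean([abs(y - forecast) for y in observed])
        breached = error > threshold
        log.append((round(forecast, 2), round(error, 2), breached))
        if breached:
            forecast = mean(observed)     # automated recalibration step
    return log


if __name__ == "__main__":
    # Hypothetical quarterly observations (made-up numbers).
    data = [[2.0, 2.2, 1.8], [2.1, 1.9, 2.0], [3.0, 3.2, 2.8], [3.1, 2.9, 3.0]]
    for forecast, error, breached in run_quarters(data, threshold=0.5):
        print("forecast", forecast, "error", error, "recalibrated", breached)
```

Whether the threshold itself is set by judgment or chosen automatically is one more point along the same spectrum of automation.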

Experts sometimes disagree on which tools fall under the header of ML. For example, a common problem in macroeconomic forecasting is the need to combine many highly correlated features using Principal Component Analysis (PCA). James Stock and Mark Watson have been developing PCA methods using traditional statistics since at least 2002. They didn’t call their approach “Machine Learning” back then, but their approach is essentially an automated means of identifying relevant features and turning them into a prediction. Is PCA a type of ML? The answer seems to depend on who you ask. That’s where it helps to think of ML in terms of automating prediction: If it automates prediction…then it is machine learning.
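
For readers who want to see what "combining highly correlated features" looks like, here is a minimal two-feature sketch of PCA. It is not Stock and Watson's method, which handles many series at once; the data are made up, and the function name first_component is hypothetical. With only two features the first principal component has a closed form, so no library is needed.

```python
import math


def first_component(x, y):
    """First principal component of two features via their 2x2 covariance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    # Largest eigenvalue of [[sxx, sxy], [sxy, syy]] in closed form.
    lam = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    # Matching eigenvector (assumes sxy != 0), scaled to unit length.
    vx, vy = sxy, lam - sxx
    norm = math.hypot(vx, vy)
    weights = (vx / norm, vy / norm)
    share = lam / (sxx + syy)   # fraction of total variance explained
    # The combined feature: one score per observation.
    scores = [weights[0] * (a - mx) + weights[1] * (b - my) for a, b in zip(x, y)]
    return weights, share, scores


if __name__ == "__main__":
    # Two hypothetical indicators that move almost in lockstep.
    indicator_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    indicator_b = [1.1, 1.9, 3.2, 3.9, 5.1]
    weights, share, scores = first_component(indicator_a, indicator_b)
    print("weights:", [round(w, 3) for w in weights])
    print("share of variance explained:", round(share, 4))
```

The single combined score can then be used as a feature in a forecast in place of the two original, nearly redundant series.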

3. How is ML Different from Statistics?

ML methods are not altogether different from traditional statistics (TS). Both ML and TS attempt to turn data into value, and use many of the same statistical building blocks. A key difference is in the applications most suited to ML methods (i.e. high-dimensional and complex problems).

Much confusion is caused by terminology. For example, ML practitioners refer to “labels” and “features” while TS practitioners use the terms “dependents” and “independents”. Likewise with the ML terms “learning”, “classifier”, and “instance” which are analogous to the traditional terms “estimation”, “hypothesis”, and “data point”. Practical applications of ML methods still require professionals to understand their objectives, choose the appropriate methods, compensate for method strengths and weaknesses (such as over-fitting), and target the modeling solution to solve real business problems. ML is no magic bullet for bad data. Garbage in still results in garbage out.

ML models emphasize prediction accuracy over inference. Traditional Statistical models are explicitly programmed in a manner that generically reduces to the form y = βX + e where our dependent variable (y) is assumed to be a linear function of independent variables (X) with coefficients (β) and some error term (e). Explicit formulations help to infer the relationship between X and y.
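
As a small worked illustration of the explicit form y = βX + e, the sketch below estimates a one-feature version by ordinary least squares. The data are made up and the function name fit_line is hypothetical. We include an intercept (alpha), which is equivalent to adding a column of ones to X. The payoff of the explicit form is that beta has a direct reading: the estimated change in y for a one-unit change in x.

```python
def fit_line(x, y):
    """Ordinary least squares for y = alpha + beta * x + e (one feature)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Slope: covariance of x and y divided by the variance of x.
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    alpha = my - beta * mx
    return alpha, beta


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    x = [1.0, 2.0, 3.0, 4.0]
    y = [2.1, 3.9, 6.2, 7.8]
    alpha, beta = fit_line(x, y)
    residuals = [b - (alpha + beta * a) for a, b in zip(x, y)]
    print("alpha:", round(alpha, 3), "beta:", round(beta, 3))
    print("residuals (e):", [round(r, 3) for r in residuals])
```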

ML and Big Data are often referred to in the same breath because ML methods tend to outperform Traditional Statistics when applied to Big Data. This is a direct consequence of relaxing assumptions (priors) about the relationship between a label and its potential features. Technically speaking, ML allows for far greater “degrees of freedom”, which simply means that assumptions about how features and labels interact are relaxed to allow more potential solutions.

Today, there are so many ML methods that the term “Machine Learning” might seem difficult to define. Some ML methods, such as Reinforcement Learning, don’t require much data. Others, such as Deep Learning, are more practical for problems involving Big Data. Random Forests and clustering models can work well with modest (traditional) amounts of data. What matters is that more prediction problems are being automated, making some harder to fully understand but allowing for potentially better predictions and new statistical tools, each with their own use cases, strengths and weaknesses.

4. Where is ML most useful?

ML methods are most useful for high-dimensional and complex (non-linear) problems. No fine line exists to differentiate when to use ML, but practically it is the point where it becomes overly restrictive to impose a theory or structure on a prediction problem. Some prediction problems are too complex to manually test all potentially useful relationships. For these problems, we relax the assumptions of traditional statistics (TS) and let the data speak. The result is a black box that may improve performance.
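
A toy comparison shows what "letting the data speak" can buy on a non-linear problem. The sketch below is an assumption-laden illustration: the data follow a made-up curve (y = x squared), the linear model stands in for traditional statistics, and a simple nearest-neighbor average stands in for a more flexible ML method. All function names are hypothetical.

```python
def fit_straight_line(x, y):
    """Least-squares line: imposes a linear structure on the problem."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - beta * mx, beta


def knn_predict(train_x, train_y, query, k=2):
    """Average the labels of the k closest training points (no structure imposed)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[:k]
    return sum(label for _, label in nearest) / k


def mse(predictions, truth):
    return sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth)


if __name__ == "__main__":
    train_x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
    train_y = [x ** 2 for x in train_x]            # made-up non-linear relationship
    test_x = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]     # points not seen in training
    test_y = [x ** 2 for x in test_x]

    alpha, beta = fit_straight_line(train_x, train_y)
    linear_preds = [alpha + beta * x for x in test_x]
    knn_preds = [knn_predict(train_x, train_y, x) for x in test_x]

    print("linear model error:", round(mse(linear_preds, test_y), 4))
    print("nearest-neighbor error:", round(mse(knn_preds, test_y), 4))
```

The flexible method wins here only because the true relationship is curved. The trade-off is that it offers no coefficient to interpret, which is the transparency cost discussed throughout this article.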

Digitization is increasing the portion of problems for which ML methods can derive superior solutions. As a result, bankers and regulators will increasingly be forced to confront these black boxes or fall behind the artificial intelligence arms race. Regulations designed in an era of explicit underwriting and risk management systems such as Fair Lending, Know Your Customer, and Model Risk Management are already confronting challenges posed by ML methods that by design are not fully comprehensible.

We’ve already discussed two key factors that distinguish problems best suited for ML, namely high dimensionality and non-linearity. Other factors include domain knowledge and pace of regime change. Take commercial credit modeling as an example. Experienced credit model developers know the features that drive default prediction tend to fall into one of the Five Cs; namely character, capacity, collateral, capital and conditions. The same modelers also know the direction in which these features should influence default rates. As a result, it may make sense to constrain the estimation process in a manner consistent with the modeler’s priors. Doing so narrows the degrees of freedom to what the developer doesn’t already know, such as the magnitude of the relationships. Moreover, the Five Cs are unlikely to go away anytime soon.
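
One simple way to constrain estimation with a directional prior is sketched below. It is a one-feature illustration only: the leverage and default-rate figures are made up, and the function name fit_with_sign_prior is hypothetical. The modeler states the expected direction, the data determine the magnitude, and a slope with the wrong sign is set to zero (for a single feature, zero is the best-fitting slope that respects the constraint).

```python
def fit_with_sign_prior(x, y, expected_sign):
    """Least-squares line whose slope must match the modeler's prior.

    expected_sign is +1 (feature should raise y) or -1 (should lower y).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    if beta * expected_sign < 0:
        beta = 0.0            # wrong direction: the data lose, the prior wins
    alpha = my - beta * mx    # the intercept absorbs the rest
    return alpha, beta


if __name__ == "__main__":
    leverage = [1.0, 2.0, 3.0, 4.0]               # hypothetical borrower leverage
    noisy_defaults = [0.04, 0.03, 0.03, 0.02]     # small sample, counter-intuitive
    clean_defaults = [0.01, 0.02, 0.02, 0.04]     # sample that agrees with the prior

    # Prior: higher leverage should not lower default rates.
    print("noisy sample:", fit_with_sign_prior(leverage, noisy_defaults, +1))
    print("clean sample:", fit_with_sign_prior(leverage, clean_defaults, +1))
```

An unconstrained model fit to the noisy sample would conclude that leverage reduces defaults, which is exactly the kind of non-intuitive relationship a domain expert would reject.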

Sometimes the fundamental nature of a dynamic process changes. The technical term for this is “Regime Change”. Regime change causes historical data and previously useful domain knowledge to become obsolete. Inserting domain knowledge goes directly against the grain of ML. Credit prediction models provide a useful example of where regime change is probably moving more slowly.

Domain knowledge surrounding the nature of when and how a borrower will default is unlikely to change radically any time soon. Given enough data, an unconstrained ML model may still produce superior results to constrained traditional statistics, but practitioners should weigh this improvement against a loss of transparency and potentially non-intuitive relationships.

Traditional statistical methods still dominate low frequency problems such as macroeconomic forecasting and the pricing of illiquid assets such as commercial credit. While ML methods may help, the high degree of domain knowledge combined with the low frequency of relevant data makes the application of ML methods more difficult. This is especially true for operations with less standardized data such as commercial credit underwriting. At the other extreme are applications of ML in markets today such as High Frequency Trading (HFT). Understanding why is helpful as we dive into ML impacts in markets and the economy.

5. Where is ML Impacting Markets and the Economy?

The short answer is everywhere, but the reason comes down to “Big Data”. Most ML applications in most industries use data. Big Data is a hyped term that simply means very large datasets. By “very large” we mean so large that traditional models and analytics become impractical. The amount of data generated from some mobile applications, payment systems, trading platforms, and even some consumer credit products is incomprehensibly large. Digitization is driving an exponential rise in data across nearly everything we do … hence the near complete scope of impacts that ML is having on markets and the economy.

High Frequency Trading (HFT) provides a good introduction to the incomprehensible nature of ML on markets. HFT strategies that are driven by ML are difficult for the human mind to comprehend for several reasons: price movements occur over nanoseconds, the volume of data involved is large, and multiple HFT ML strategies interact with each other in some highly liquid markets. This can result in rapid regime change, especially when the underlying estimation processes driving these ML strategies are dynamic.

Try to comprehend for a moment a dynamic HFT ML model that is automatically re-estimating its trading strategy over very short periods of time (e.g. daily or hourly). Now imagine multiple dynamic HFT ML models trading with and learning from each other. Humans simply can’t react to and learn from each other as fast as computers. We have to manually take in new data inputs, analyze, re-estimate, and put new models into production. Automating this process makes regime change potentially much more rapid, and it’s a big reason to be skeptical of any human’s ability to compete against machines at higher frequencies.

One important consideration for investors is the potential consequence to market dynamics and regime changes. Historical volatility, skew, return distributions, and other moments of price behavior might be changing faster today than before. Standardized measures of “riskiness” calibrated to the past decade or two might not accurately reflect their current risk and liquidity. Electronic trading is only a few decades old and has undergone major changes such as the rise of ETFs, pricing to the penny (and smaller), HFT, ML, and an increasing proportion of trades performed between computers that learn from each other.

Understanding the sources of Big Data helps clarify where ML applications are headed. Four major data sources include Natural Language Processing (NLP), mobile applications, digital payments, and financial markets.

NLP is a field within ML that is opening up entirely new opportunities for prediction. By turning spoken and written words into features, NLP can turn records, text, and speech from customers (and employees) into valuable insights. Major applications of NLP include information retrieval, intent parsing, sentiment analysis, speech recognition and classification. Information retrieval occurs anytime customers are searching for information such as the use of keywords to find documents on a website. Intent parsing includes the sometimes frustrating experience we have talking to chatbots and other automated customer service applications. Companies are just scratching the surface of opportunities to save time, reduce risk, and improve customer services via harnessing NLP.
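
The phrase "turning words into features" can be illustrated with the simplest possible approach, a word-count table (often called a bag of words). The sketch below is a minimal illustration, far cruder than what production NLP systems use; the customer messages are invented and the function names are hypothetical. Each message becomes a row of numbers that a prediction model can consume.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(texts):
    """Turn each text into a row of word counts over a shared vocabulary."""
    vocab = sorted({word for text in texts for word in tokenize(text)})
    rows = []
    for text in texts:
        counts = Counter(tokenize(text))
        rows.append([counts[word] for word in vocab])   # missing words count as 0
    return vocab, rows


if __name__ == "__main__":
    # Hypothetical customer messages.
    messages = [
        "I want to close my account",
        "Please help me open an account",
    ]
    vocab, rows = bag_of_words(messages)
    print(vocab)
    for row in rows:
        print(row)
```

Once text is in this numeric form, the same methods discussed earlier (from linear models to deep networks) can be applied to tasks such as intent parsing or sentiment analysis.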

Mobile applications are revolutionizing the way consumers receive nearly all services. By creating apps, firms are also able to generate new proprietary data sources that are tailored to their particular services. Once a consumer downloads the app, firms might gain access to customer locations, the apps they use, and other data generated from mobile phones that just might have weak relationships with labels of interest. For example, battery life was found to be a weak feature in explaining default probability.

Digital payments between businesses (B2B), between peers (P2P), and between businesses and their customers (B2C) are increasing exponentially. This is creating a wealth of data on spending habits and cash flows previously obscured by cash. China has largely skipped over the use of credit cards and gone straight to mobile payment solutions like WeChat Pay or Alipay, effectively reducing settlement time to zero. US firms like Square, PayPal, Venmo, Apple, and Facebook are helping the USA to catch up in digital payments and are competing with banks in a wide variety of other financial services.

Financial markets are largely and increasingly digital. Digital prices and automated markets have opened up many ML opportunities for traditional and new financial services such as P2P lending and other forms of automated underwriting, automated high frequency trading (HFT), and real-time asset pricing and forecasting.

There are certainly other sources of data, but these four compose the lion’s share and are driving impacts to the economy and markets. Understanding these data sources and the economics of ML helps to assess where ML will spread and change the investing landscape.

-- END --

Thank you for subscribing to IntuitEcon!

IntuitEcon will publish each installment of the Machine Learning series on Saturdays and host a private Sunday podcast for Brain Trust members and subscribers.

  • Saturday Publishing - This series will be posted each Saturday starting on July 17th, 2021 and continuing through all ten topics listed above.

  • Sunday Podcast - Podcasts will be held at 7:00pm EDT every Sunday after publishing and are free for Brain Trust members and subscribers to join. These podcasts will be recorded and released to the public at a later date.



IntuitEcon Leadership Team