Everything you wanted to know about machine learning (but were afraid to ask).
If you work in tech (or are even thinking about it), you’ve probably come across the term “machine learning.” Google Trends shows that the popularity of this as a search term has grown by about 200 percent in the last three years, which indicates a growing interest in machine learning – but what exactly is it?
In this post, we’ll take a closer look at what is meant by machine learning and explain why it matters so much to modern businesses.
What Is Machine Learning?
There is a famous definition by the computer scientist Tom M. Mitchell that has often been used, and which we will adopt here. We say that a computer program is learning how to perform a task if it gets better at performing the task as it accumulates experience. Computer programs that learn this way are said to fall under the umbrella of machine learning.
It might help to understand this definition by considering a program that does not satisfy it. An important problem in some areas of math and computer science is finding the prime factors of a number. If you input the number 1081 into a program that solves this problem, it would output the numbers 23 and 47, because 1081 = 23 × 47. It’s not too hard to write a simple program that solves this problem: an obvious one simply tries dividing 1081 by every number smaller than 1081. This program accomplishes the task, but it does not learn: it does not get better at factoring numbers with experience. You can run the program on a million inputs and it will never factor the number 1081 any faster or better than it did the first time you ran it.
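To make this concrete, here is a minimal sketch of such a program in Python. The function name `prime_factors` is our own choice, and as a small shortcut the loop stops once the candidate divisor exceeds the square root of what is left, rather than trying every smaller number. Notice that nothing in it changes from one run to the next.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n (with repeats) by trial division."""
    factors = []
    divisor = 2
    # Any composite number has a factor no larger than its square root.
    while divisor * divisor <= n:
        # Divide out this divisor as many times as it goes in evenly.
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    # Whatever remains (if greater than 1) is itself prime.
    if n > 1:
        factors.append(n)
    return factors


print(prime_factors(1081))  # [23, 47]
```

The program keeps no memory of earlier inputs, so the millionth call does exactly the same work as the first.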
Now, let’s think about a different kind of task: recognizing handwriting. Suppose we wish to write a program that takes as its input an image of a handwritten digit and gives an output digit from 0 to 9. Coming up with a simple solution to this problem is not as straightforward as the integer factoring problem above, but we might be able to come up with some ideas. You might write a program that says something like:
- If the image is an oval, return 0.
- If the image is a vertical line, return 1.
- If the image is two circles on top of each other, return 8.
The challenge here is that different people write differently. It is not remotely feasible to capture every acceptable variation in even a single digit by these kinds of rules, let alone all 10 digits. Even if you could, this type of solution would not count as machine learning: the rules don’t change or adapt as experience grows.
The machine learning approach attacks the problem in a completely different way. Rather than trying to impose rules from the start, a machine learning algorithm seeks to discover the rules by looking at examples. In machine learning, instead of trying to come up with rules, we try to come up with data. We gather as many pre-labeled images of digits as we can into what’s called a training set, which is used to literally train the computer program. We take all the images that we have of ones, show them to the computer, and tell it that they’re ones. And then we do the same with the images of twos, and so on. For each digit, the computer tries to figure out on its own what that digit’s images have in common.
We’re hand-waving over the details a little bit, but you can see how this approach would tend to improve with experience, making it fit into our working definition of machine learning. Some people cross their sevens. If the set of images that I start with does not contain any crossed sevens, my resulting program might not be able to recognize a crossed seven as a seven. But as I increase the number of examples that it has to look at, eventually it will end up with some crossed sevens and will learn that sometimes they are crossed. The same would happen with other common variations.
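The crossed-seven story can be sketched with a toy example. What follows is only an illustration under our own assumptions, not how real handwriting systems work: the "images" are hand-drawn 7-by-5 grids we made up, and the learner is a simple nearest-neighbor rule that labels a new image with the label of the most similar training image (counting differing pixels). The names `parse`, `hamming` and `predict` are ours.

```python
def parse(glyph: str) -> tuple[int, ...]:
    """Flatten a '#'/'.' drawing (rows separated by spaces) into 0/1 pixels."""
    return tuple(1 if ch == "#" else 0 for row in glyph.split() for ch in row)


def hamming(a: tuple[int, ...], b: tuple[int, ...]) -> int:
    """Count the pixels on which two images differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(training_set, image):
    """Return the label of the training image closest to `image`."""
    _, label = min(training_set, key=lambda pair: hamming(pair[0], image))
    return label


# Made-up training images: a one, an uncrossed seven, and a nine.
training_set = [
    (parse("..#.. ..#.. ..#.. ..#.. ..#.. ..#.. ..#.."), 1),
    (parse("##### ....# ....# ....# ....# ....# ....#"), 7),
    (parse("##### #...# #...# ##### ....# ....# ....#"), 9),
]

# A new image: a seven with a bar through the middle.
crossed_seven = parse("##### ....# ....# ##### ....# ....# ....#")

before = predict(training_set, crossed_seven)
print(before)  # 9, because the bar makes it look more like the nine

# More experience: add a (slightly different) crossed seven to the training set.
training_set.append((parse("##### ....# ....# .#### ....# ....# ....#"), 7))

after = predict(training_set, crossed_seven)
print(after)  # 7
```

The prediction code never changes between the two calls. The only thing that changes is the data, which is the sense in which the program improves with experience.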
It turns out that cleverly designed machine learning programs can become incredibly good at this kind of task. A common introductory project for learning how to do machine learning is to perform exactly this task on a well-known dataset of images called the MNIST database. Very simple machine learning algorithms can learn to classify these images correctly with better than 90 percent accuracy, and researchers have used more advanced machine learning tools to achieve better than 99.7 percent accuracy.
How Does Machine Learning Work?
Why is the handwriting recognition problem well-suited to a machine learning solution while the integer factoring problem is not? There are a few key differences.
One is in the complexity of the rules governing the relationship between input and output. The integer factoring problem is very difficult in a certain technical sense, but the relationship between the input and output of the factoring problem is very straightforward: if the numbers output by the program are prime and multiply together to give the input, then you’ve got the right answer. The rules that link images of handwriting to the digits they represent are much more complex and fuzzy and difficult to capture.
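The check described above for the factoring problem is simple enough to write down in full. Here is a minimal sketch in Python, where `is_prime` and `is_correct_factorization` are names we made up for illustration. It is hard to imagine an equally short and exact check for "this image really is a seven."

```python
from math import prod


def is_prime(n: int) -> bool:
    """Return True if n is prime, using trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_correct_factorization(n: int, factors: list[int]) -> bool:
    """An answer is right if every factor is prime and their product is n."""
    return bool(factors) and all(is_prime(f) for f in factors) and prod(factors) == n


print(is_correct_factorization(1081, [23, 47]))  # True
print(is_correct_factorization(1081, [1081]))    # False: 1081 is not prime
```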
A related difference is that in the factoring problem, we are looking for an exact solution, whereas in the handwriting recognition problem, we are satisfied with a very good approximate solution. In fact, an exact solution to the handwriting recognition problem would not be feasible even in theory. Some threes look like fives and some fours look like nines, and the only way to tell for sure what the correct label is would be to ask the person who wrote down the digit in the first place. All we can reasonably expect out of a solution to the handwriting recognition problem is that it is right most of the time.
Finally, it seems that handwriting recognition is inherently a statistical or probabilistic task. As humans, we don’t actually know with certainty whether we’re looking at a nine or a four. We think that a digit is probably a nine because it looks more like nines we’ve seen in the past than fours we’ve seen in the past. Most of the time we have a lot of certainty about our guess, but we are still taking a guess. We shouldn’t expect the computer to be able to do any better than that either.
When Machine Learning Works
Machine learning is well suited to problems that have the characteristics of the handwriting recognition problem – that is, problems that are highly complex, where approximate solutions will suffice, and that are inherently statistical or probabilistic. Businesses are increasingly discovering that many of their problems have these traits. Consider the problem of flagging fraudulent credit card transactions.
- Complexity: The rules that identify fraudulent credit card transactions are complex and ever-changing.
- Approximations suffice: We are flagging transactions for further review, so it is acceptable if the program is wrong sometimes.
- Solutions are probabilistic: We are never certain that a transaction is fraudulent until we verify by contacting the customer.
And what do we need to implement a machine learning solution to a business problem like this? Data – a commodity that modern businesses have in high supply. For these reasons, businesses are discovering that machine learning tools fit quite naturally in their activities and objectives, which is why we are seeing such a dramatic rise in the application of machine learning tools and technologies in the business world.
Interested in learning more about machine learning? Check out BrainStation’s Machine Learning Certificate Course.