Technical machine learning terms don’t normally make it out of research papers or academic courses, but the phrases ‘deep learning’ and ‘neural networks’ are starting to crop up more and more in popular press articles.
Deep learning appears in MIT Tech Review’s 10 Breakthrough Technologies of 2013, Microsoft recently announced their success using deep neural networks for speech recognition, and Google have done the same for image classification. It can be difficult to describe machine learning methods to the average non-technical person, but neural networks are loosely based on the way the human brain works, which gives an easy and intuitive explanation of what they are. And, of course, teaching a computer to think in the same way as we do is accepted almost without question as a good way of doing things.
Deep learning methods aren’t new – neural networks were first popularised in the 1950s but soon fell out of favour as they proved slow and difficult to train. With an increase in both the amount of data and the computational power available, they’ve undergone a revival in recent years and have had a big impact on machine learning, significantly improving accuracy on some difficult problems.
So what is ‘deep learning’? In a special issue of IEEE Trans. on Audio, Speech and Language Processing (Jan 2012), it’s defined as
“a machine learning method that involves at least three, adaptive nonlinear processing steps from the input to the output”.
That means that you take an input, such as the pixels in an image, transform it at least three times using nonlinear functions, and the result is the output of the network. The nonlinear functions are learnt from data (i.e. they adapt) so that the output is meaningful. If the input is an image of a handwritten digit, the output might be a value that tells you whether the image corresponds to a handwritten ‘1’ or a ‘0’. It’s both the number of processing steps and the complexity of the nonlinear functions that give deep learning its advantage over some of the simpler machine learning techniques that have been successful in the past.
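To make the definition concrete, here is a minimal sketch of that forward pass in Python with NumPy: pixel values go in, pass through three nonlinear processing steps, and two class scores come out. The weights here are randomly initialised stand-ins for the adaptive functions a real network would learn from data, and all the sizes (64 pixels, two classes) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    """A simple nonlinearity: zero out negative values."""
    return np.maximum(0.0, x)

# Toy input: a flattened 8x8 "image" of 64 pixel values.
x = rng.random(64)

# Three sets of adaptive weights. In a trained network these are
# learnt from data; here they are random placeholders.
W1 = rng.standard_normal((32, 64)) * 0.1
W2 = rng.standard_normal((16, 32)) * 0.1
W3 = rng.standard_normal((2, 16)) * 0.1

# Three nonlinear processing steps from input to output.
h1 = relu(W1 @ x)
h2 = relu(W2 @ h1)
h3 = relu(W3 @ h2)

# Turn the two output values into probabilities, e.g. for
# "is this a handwritten '0' or a '1'?"
probs = np.exp(h3) / np.exp(h3).sum()
print(probs)
```

Training would adjust `W1`, `W2`, and `W3` so that the output probabilities match the correct labels; without that, the sketch only shows the shape of the computation, not a useful classifier.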