Machine Learning Driven Technological Revolution

Milos Dunjic, AVP, Payments Innovation Technology Solutions at TD Bank Group

29.01.2018 07:45 am

Machine learning is a software programming technique for automatically detecting meaningful and valuable patterns in data and making decisions based on those patterns. One thing that all machine learning applications have in common is that they are able to ‘learn’ and improve their accuracy over time, based on previously observed, historical data. The patterns that need to be detected are usually too complex for traditional computer programs, which are built from deterministic specifications and ‘if-then’ statements. In many ways, machine learning applications behave like people at a very early age, who acquire many of their skills and then improve them (i.e. learn) over time through personal experience, rather than by following a prescribed ‘life script’ or specification. Machine learning is, therefore, all about applications that are able to ‘learn’ and adapt, rather than requiring reprogramming every time a new variation of a pattern (from the same pattern class) needs to be identified.

Machine learning has been around for some time. It is embedded in modern search engines to rank the most relevant search results, it is at the heart of systems that detect likely spam messages, and it guards payment card transaction authorization systems by detecting potential fraud. Thanks to machine learning, modern digital cameras are able to detect human faces and properly focus the lens on them. NLP-enabled smartphone applications like chat-bots and voice assistants (like Alexa and Google Home) are capable of fairly accurately recognizing and acting on text and voice commands. Kasisto’s chat-bots, for example, are able to mimic bank personnel, having been trained to ‘converse’ very effectively with consumers using the lingo of the financial services sector.

The High-Level Mechanics of Machine Learning

What are the underlying mechanics of machine learning? How can the process of learning be automated so that it is suitable for a ‘dumb’ machine like a computer? How can we evaluate the success of a particular machine learning process, i.e. how do we (and the machine) know when it has ‘learned enough’ and is ready to process non-training data?

Every machine learning application is based on:

The model: The correct mapping rule between inputs and outputs. This is the ‘knowledge’ that the application is trying to acquire. The exact model function is always unknown to the machine learning algorithm, and the algorithm is basically trying to figure it out.
Training data: The set of input data points that the machine learning algorithm is exposed to during the training phase, together with the set of corresponding output values, which are ‘labeled’ (i.e. pre-classified) according to the ‘correct’ mapping rule (i.e. the model). Training data is encoded ‘knowledge’.
The predictor: The prediction rule according to which an output value is generated (or guessed) from an arbitrary input value. The predictor is also called the hypothesis or classifier. The model (mentioned above) is in fact the predictor that defines the perfectly ‘correct’ prediction rule.
The error: The difference between the output value that the predictor generates from a given input value and the output value that the model generates from the same input value. In practice, it is measured by comparing what the current iteration of the predictor ‘thinks’ is correct against the actual correct value, given by the training sample’s ‘label’ (i.e. correct output value).
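To make these four ingredients concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than something taken from a real system: the ‘correct’ model is taken to be a simple straight line, the function names (model, make_predictor, mean_squared_error) are hypothetical, and the error is measured as a mean squared difference, which is only one of several common choices.

```python
# Illustrative sketch only: the straight-line model, the names and the
# numbers are all assumptions made for this example.

def model(x):
    """The 'correct' mapping rule. A real learning algorithm never sees it."""
    return 2.0 * x + 1.0

# Training data: input points paired with labels produced by the model.
inputs = [0.0, 1.0, 2.0, 3.0]
training_data = [(x, model(x)) for x in inputs]

def make_predictor(slope, intercept):
    """Build a predictor (hypothesis) with two adjustable parameters."""
    def predictor(x):
        return slope * x + intercept
    return predictor

def mean_squared_error(predictor, data):
    """The error: average squared gap between predictions and labels."""
    return sum((predictor(x) - label) ** 2 for x, label in data) / len(data)

# Three candidate predictors, from a rough guess to the exact model.
rough_guess = make_predictor(slope=1.0, intercept=0.0)
better_guess = make_predictor(slope=2.0, intercept=0.5)
perfect_guess = make_predictor(slope=2.0, intercept=1.0)

error_rough = mean_squared_error(rough_guess, training_data)
error_better = mean_squared_error(better_guess, training_data)
error_perfect = mean_squared_error(perfect_guess, training_data)

print(error_rough, error_better, error_perfect)
```

The closer a predictor’s parameters are to those of the hidden model, the smaller its error on the labeled training samples, which is the quantity a learning algorithm tries to drive down.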

The main goal of every machine learning algorithm is to minimize the magnitude of the error of its underlying predictor function. Basically, each machine learning algorithm is programmed to keep iteratively adjusting the parameters of its underlying predictor function until it behaves as closely as possible to the model (i.e. the ‘correct’ predictor). What exactly is being adjusted? It depends on the internal representation of the predictor function. If it is a regression-type predictor, the algorithm keeps adjusting the coefficients of the underlying polynomial model equation. In the case of a neural network predictor, it keeps adjusting the weights of the edges between the input, hidden and output ‘e-neuron’ nodes.

Regardless of the predictor type, this iterative adjustment of the predictor’s parameters is carried out by feeding it a large enough set of pre-labeled training data, until the magnitude of its error becomes smaller than some pre-configured small value. At that point, we declare that the algorithm has reached the state of being Probably Approximately Correct (PAC) and is ready to process real-life, unlabeled input data sets with a high degree of expected accuracy.
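The iterative adjustment described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any particular product: the predictor is a straight line with two coefficients, the adjustment rule is plain gradient descent on the mean squared error, and the function name train, the learning rate and the tolerance (standing in for the ‘pre-configured small value’) are all hypothetical choices.

```python
# Illustrative sketch: gradient descent on a two-coefficient linear predictor.
# The data, learning rate and tolerance are assumptions made for this example.

def train(data, learning_rate=0.05, tolerance=1e-6, max_iterations=100_000):
    """Adjust slope and intercept until the training error drops below tolerance."""
    slope, intercept = 0.0, 0.0  # start from an uninformed predictor
    n = len(data)
    for iteration in range(max_iterations):
        # Gap between the current predictor's output and each label.
        residuals = [(slope * x + intercept) - label for x, label in data]
        error = sum(r * r for r in residuals) / n
        if error < tolerance:
            break  # error is below the pre-configured small value
        # Gradient of the mean squared error with respect to each coefficient.
        grad_slope = 2.0 * sum(r * x for r, (x, _) in zip(residuals, data)) / n
        grad_intercept = 2.0 * sum(residuals) / n
        # Nudge each coefficient in the direction that reduces the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept, error, iteration

# Labeled training samples generated by a hidden rule: label = 2 * x + 1.
training_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

slope, intercept, error, iterations = train(training_data)
print(f"slope={slope:.3f} intercept={intercept:.3f} "
      f"error={error:.2e} after {iterations} iterations")
```

A neural network is trained in the same spirit, except that the adjusted parameters are the connection weights rather than two regression coefficients.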

All Machine Learning Systems Are Highly Specialized

As exciting, cool and powerful as they already are, current machine learning applications are still somewhat limited in what they can and can’t do. They are very good at ‘learning’ one particular set of ‘skills’ and mastering it to perfection - assuming the availability of a large enough quantity of high-quality, properly ‘labeled’ training data. Whether it is robots rearranging boxes on shelves in a warehouse or vacuuming the floors of your house, fraud-detection engines detecting fraudulent activity, robo-advisors optimizing investment portfolios for tax, or computer vision systems recognizing specific objects in a picture - the predictors in all of these AI engines were ‘trained’ to excel at just a single task.

Of course, multiple ‘single-specialized-task’ mini AI engines can be creatively combined into more complex ‘AI organisms’, like robotic ‘animals’ or ‘humanoid robots’, which are able to execute basic human-like operations such as walking, opening doors, climbing stairs, keeping balance, driving a car, etc. But even those Hollywood-style ‘scary’ AI examples are just more complex combinations and assemblies of highly specialized sub-component AI engines, each trained in a similar way for a single specialized task, working together and exchanging data.

In other words, all AI systems available today, no matter how simple, complex or even impressive, are just computer systems powered by a specific mathematical equation with self-adjustable parameters that simply maps a set of inputs to a set of outputs (i.e. predicting or classifying). All predictors powering the AI applications in existence today are highly trained and specialized for executing a single set of tasks only.

Who Should Be Concerned?

In my personal opinion, systems with this level of perfected specialization will undeniably affect many professions in existence today, especially those that involve speedy analysis of large sets of data for decision making or those involving lots of repetitive tasks. In most cases, human labor will not be able to compete with such a level of automation, although I would argue that many of those professions and workers have already been significantly affected by cheaper labor markets and early data entry automation systems for almost a decade. However, machine learning will enable a more extreme version of that trend, and will quickly go beyond it by starting to affect many other professions that have been relatively safe so far.

AlphaGo is probably the most prominent and brutal example of this new type of ‘inequality’. AlphaGo is a neural network with multiple hidden layers, highly specialized at doing just one thing - playing the ancient game of Go. It brutally destroyed the best Go player in the world. No emotions or human-type creativity … just the brute-force ability to quickly look ahead 50 ‘moves’ in the set of available moves (based on the opponent’s ‘moves’ as input data) and quickly classify the best one, based on ‘learned’ probabilities, which are encoded as the weights of the connections between its ‘e-neurons’. Those probabilities were determined by the neural network itself, through lengthy training on historical data. Will any future Go tournament have the same appeal after AlphaGo’s historic win? What does it mean now to be a human Go world champion, when no one can match the computer anymore?

Who Are the Potential Winners in This Process?

On the other side of the new inequality equation are technological ingenuity and engineering creativity. People who have the knowledge and skills to identify areas and use cases where machine learning can be used effectively, combined with the skills to choose and train the underlying mathematical model, will benefit the most. Those highly qualified knowledge workers are quickly emerging as the probable biggest winners of the latest, AI-driven technological revolution.

Can these types of systems also be used by fraudsters for unethical and fraudulent activities? Will they also benefit? Sure they can, and they probably already are. We are likely already experiencing a hidden war between AI engines for detecting fraud and those trying to trick them, with very capable software and AI engineers and mathematicians madly working on both sides.

What To Do To Stay Relevant?

In the end, the only wisdom and recipe in all this is - the more personal time you invest in learning about the underlying principles of machine learning and the types of models that could power various AI systems, the better equipped and prepared you will be to use that knowledge in designing the next (AI-powered) killer app, service or robot, rather than becoming a victim and being replaced by it.