Machine Learning (ML) has become a near-essential capability that firms harness to improve business performance. Yet not many are aware of what it really does. Here, we bust ML myths, illuminate its uses and help you decide which type of ML to adopt for your firm.
- Misconceptions about ML
- Difference between ML and Traditional Analytics
- ML Algorithm and Classification
- Supervised ML
- Supervised ML Categorisation
- Unsupervised ML
- Unsupervised ML Types
- Industry Use Cases of Implementing ML - Uber, Gmail, Amazon and more
- Considerations for ML Implementation
Machine Learning (ML) has penetrated almost every aspect of modern life, with many enterprises implementing ML in the products and services they offer. Although it has become a buzzword, it is not as novel a concept as it is made out to be.
In technical terms, ML enables computers to learn by themselves without being explicitly programmed. Put simply, machines take a proactive approach to learning, much like the way humans learn from experience. As it is exposed to more data, an ML model generally becomes more accurate.
Some experts see ML as an application of artificial intelligence (AI). ML deals with the development of a self-learning program where the computer accesses data to progressively learn patterns on its own.
Before we continue, let us do away with some common misconceptions attached to ML:
1) ML is superhuman intelligence that will replace human beings
This myth has been prevalent for quite some time due to exaggerations by the media. Fortunately, machines will not be replacing human beings any time soon. On the contrary, ML helps with repetitive human jobs by performing tasks in a structured, systematic and objective manner. It does so consistently, although not entirely without error. There are also many grey areas where the implementation of ML is difficult and requires human intervention. This is because ML lacks irreplaceable soft skills such as emotional intelligence, intuition and creativity that only humans can offer.
2) ML is the same as Artificial Intelligence
ML is not equal to Artificial Intelligence (AI). While some people see ML as an application of AI, the two terms are not interchangeable. AI can be viewed as the objective, while ML is one of the means by which that objective is achieved.
3) Data Scientists, Data Engineers, AI and ML Engineers are all the same
This misconception is quite prevalent in the industry, but it is like comparing apples to oranges. These roles are different and require different skillsets. Some skills are shared across the fields, but each role is distinct. Data Scientists are specialists in mathematics, statistics and programming, and are expected to work on business problems. Data Engineers build the infrastructure for application deployment. The AI Engineer role is quite broad and mostly covers reasoning, knowledge representation and natural language processing. Finally, the ML Engineer trains and develops models that can be deployed on a computer.
So what sets ML apart from traditional analytics? ML relies heavily on massive computing power, whereas traditional analytics was developed at a time when computing power was scarce and human effort was needed to gain insight into data. Traditional analytics is still helpful in generating reports and explaining data at large, but more often than not, it requires assumptions about the problem and its associated data distribution for efficiency’s sake. ML, on the other hand, applies its computing power to large data sets and needs fewer assumptions about the existing data.
ML algorithms allow a model to learn from data inputs and outputs, and to improve its accuracy over time as more data accumulates. There are two main classifications of ML models:
a) Supervised Machine Learning
b) Unsupervised Machine Learning
However, some industry experts also believe that ML models should be classified into two more categories:
a) Semi-Supervised Machine Learning
b) Reinforcement Machine Learning
This is contentious, however, and many feel these are more or less generalised forms of supervised or unsupervised machine learning.
Let’s take a look at Supervised Machine Learning.
In supervised ML, the computer is taught by example. It learns from past data and applies what it has learned to new data to predict future events. Both the input data and the desired output are provided during training, and the model learns the relationship between them. For accurate predictions, the input data must be labelled, or tagged, with the right answer.
A typical supervised learning algorithm
Image Source: IBM
It is important to remember that all supervised ML models, however complex, are categorised as either classification or regression models.
1) Classification Models - Classification models are used for problems where the output variable is a category, such as "Yes" or "No", or "Pass" or "Fail". They predict the category to which a data point belongs. Real-life examples include spam detection, sentiment analysis and predicting exam outcomes, to name but a few.
2) Regression Models - Regression models are used for problems where the output variable is a real (continuous) value, such as a price in dollars, a salary, a weight or a pressure. They are most often used to predict numerical values based on previous observations. Some of the more familiar regression algorithms include linear regression, polynomial regression and ridge regression. Logistic regression, despite its name, is typically used for classification.
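To make the two categories concrete, here is a minimal Python sketch using only the standard library. The data, the feature choices and the function names (`fit_centroids`, `classify`, `fit_line`) are all hypothetical and chosen for illustration. The classifier is a simple nearest-centroid rule and the regressor is an ordinary least-squares line; real projects would normally use an established ML library instead.

```python
from statistics import mean

# --- Classification: the output is a category ("Pass" or "Fail") ---
# Hypothetical labelled examples: (hours studied, attendance %) -> outcome.
labelled = [
    ((2.0, 60.0), "Fail"),
    ((3.0, 55.0), "Fail"),
    ((8.0, 90.0), "Pass"),
    ((9.0, 85.0), "Pass"),
]

def fit_centroids(examples):
    """Average the feature vectors of each class (nearest-centroid rule)."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }

def classify(centroids, features):
    """Return the label whose centroid lies closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

centroids = fit_centroids(labelled)
print(classify(centroids, (7.5, 88.0)))  # Pass

# --- Regression: the output is a number (salary in thousands) ---
# Hypothetical observations: years of experience -> salary.
years = [1.0, 2.0, 3.0, 4.0]
salary = [40.0, 50.0, 60.0, 70.0]

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line(years, salary)
print(slope * 5.0 + intercept)  # 80.0
```

Both halves follow the same pattern: learn from examples where the answer is known, then predict for a new input. The only difference is whether the prediction is a label or a number.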
Supervised ML has many practical real-life applications, including:
- Text categorisation
- Face Detection
- Signature recognition
- Customer discovery
- Spam detection
- Weather forecasting
- Predicting housing prices based on the prevailing market price
- Stock price predictions
Unsupervised ML, on the other hand, trains machines on data that is neither classified nor labelled. This means no labelled training data is provided, and the machine must organise the data without any prior information about it. The idea is to expose the machine to large volumes of varied data and allow it to learn from that data, providing insights that were previously unknown and identifying hidden patterns. As such, there are not necessarily predefined outcomes in unsupervised ML. Rather, it determines what is different or interesting relative to the rest of a dataset.
The machine needs to be programmed to learn by itself. The computer needs to understand and provide insights from both structured and unstructured data.
1) Clustering is one of the most common unsupervised ML methods. It involves organising unlabelled data into groups called clusters, where a cluster is a collection of similar data items. The primary goal is to find similarities between data points and group similar points into the same cluster.
2) Anomaly detection is the identification of rare items, events or observations that differ significantly from the majority of the data. We generally look for anomalies or outliers in data because they may indicate something suspicious. Anomaly detection is often utilised in bank fraud and medical error detection.
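As an illustration of clustering, the sketch below implements a basic k-means loop in plain Python. K-means is only one of several clustering algorithms, and the customer data, the function name `k_means` and the choice to seed the centroids with the first k points are assumptions made for this example, not a recommended setup.

```python
from statistics import mean

def k_means(points, k, iterations=10):
    """Group 2-D points into k clusters with a basic k-means loop."""
    # Assumption for this sketch: the first k points seed the centroids.
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        assignments = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Step 2: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = (
                    mean(m[0] for m in members),
                    mean(m[1] for m in members),
                )
    return assignments, centroids

# Hypothetical unlabelled customers: (visits per month, average spend).
customers = [
    (1.0, 2.0), (1.5, 1.8), (1.2, 2.2),
    (8.0, 8.0), (8.5, 7.5), (9.0, 8.2),
]
assignments, centroids = k_means(customers, k=2)
print(assignments)  # [0, 0, 0, 1, 1, 1]
```

Note that no labels were supplied at any point. The two groups emerge purely from how close the points are to one another.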
Some practical applications of unsupervised ML include:
- Fraud detection
- Malware detection
- Identification of human errors during data entry
- Conducting accurate basket analysis
- Exploratory data analysis
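Several of these applications, such as spotting data entry errors, come down to anomaly detection. One simple way to sketch it is with a modified z-score based on the median absolute deviation (MAD). The transaction amounts and the function name `find_anomalies` are hypothetical, and the 3.5 threshold is a commonly used rule of thumb rather than a fixed standard.

```python
from statistics import median

def find_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    centre = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # more than half the values are identical; nothing to scale by
    return [v for v in values if 0.6745 * abs(v - centre) / mad > threshold]

# Hypothetical transaction amounts; 1000.0 could be a typo for 100.0.
amounts = [100.0, 102.0, 98.0, 101.0, 99.0, 1000.0]
anomalies = find_anomalies(amounts)
print(anomalies)  # [1000.0]
```

The median is used instead of the mean because a single extreme value can distort the mean and standard deviation enough to hide itself.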
3) Cybersecurity is an area where both unsupervised and supervised learning find a great many uses. New types of malware, viruses and vulnerabilities are discovered daily and need to be tackled immediately. ML helps not only to detect these threats, but also to predict the trend and direction in which they are moving.
4) Powered by ML, Amazon Forecast helps businesses accurately predict future outcomes such as product demand, resource requirements and financial performance, based on past data such as purchasing patterns. Using more than 2,000 real-time and historical data points per order, Amazon’s Buyer Fraud Service system also employs ML algorithms to detect and prevent fraudulent transactions, saving customers approximately US$1 million every week. As with most e-commerce firms, the company provides recommendations and promotions on its products by analysing browsing and purchasing data.
It can be a daunting task to select the correct algorithm for the problem at hand. Generally, supervised ML is preferred when we have existing data that includes the target values we wish to predict. To build such a model, there must be a subset of data points for which the target value is already known. The model is then applied to other data for which the target values are unknown.
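The supervised workflow described above can be sketched as follows. The housing records, the field names (`area_sqm`, `price_k`) and the helper `predict_price` are hypothetical, and the model is a deliberately simple nearest-neighbour average rather than anything production-grade.

```python
from statistics import mean

# Hypothetical housing records; price_k is None where the target is unknown.
records = [
    {"area_sqm": 50.0, "price_k": 200.0},
    {"area_sqm": 60.0, "price_k": 240.0},
    {"area_sqm": 80.0, "price_k": 320.0},
    {"area_sqm": 100.0, "price_k": 400.0},
    {"area_sqm": 55.0, "price_k": None},
    {"area_sqm": 90.0, "price_k": None},
]

# Step 1: separate the subset with known targets from the rest.
known = [r for r in records if r["price_k"] is not None]
unknown = [r for r in records if r["price_k"] is None]

def predict_price(area_sqm, training_rows, k=2):
    """Average the prices of the k training rows closest in floor area."""
    nearest = sorted(
        training_rows, key=lambda r: abs(r["area_sqm"] - area_sqm)
    )[:k]
    return mean(r["price_k"] for r in nearest)

# Step 2: apply the model to the rows whose target is unknown.
predictions = {
    r["area_sqm"]: predict_price(r["area_sqm"], known) for r in unknown
}
print(predictions)  # {55.0: 220.0, 90.0: 360.0}
```

If no rows had a known price, there would be nothing to learn the mapping from, which is the situation where unsupervised methods come in.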
We use unsupervised machine learning when no target values are available and we want the machine to discover structure in the data by itself, such as natural groupings or unusual observations.
Many are averse to machine learning due to a lack of understanding and knowledge, as well as the many obstacles to implementing ML within a business. However, there are many practical uses for this technology that should not be dismissed, as it is an invaluable business tool for long-term digital transformation. To help businesses take the first step in implementing ML, DataVLT is offering a complimentary assessment of how ML can best be integrated into your business. Alternatively, you can enjoy 12 months of free analytics development through DataVLT’s Pilot Partnership Program, based on the problem that you would like to solve.
DataVLT is an affordable, on-demand analytics platform secured by blockchain technology. It is designed to simplify the complexity of data science. Backed by artificial intelligence and machine learning capabilities, DataVLT empowers enterprises to make meaningful sense of their big data and scale cost efficiently. Essentially, it is an end-to-end data/information management platform.
Learn more at www.datavlt.com