Machine learning (ML) is a branch of artificial intelligence (AI) that gives machines the ability to learn automatically from data and past experience, identifying patterns in order to make predictions with minimal human intervention. Machine learning methods allow computers to work without explicit programming: ML applications are fed new data and can learn, grow, evolve and adapt on their own. Machine learning extracts insightful information from large volumes of data by using algorithms that identify patterns and learn in an iterative process. ML algorithms use computational methods to learn directly from data rather than relying on a predetermined equation as a model, and their performance improves adaptively as the number of available training samples increases. For example, deep learning is a subfield of machine learning that trains computers to mimic natural human characteristics, such as learning from examples. The three main types of machine learning are:

Supervised learning (SL) is a type of machine learning in which an algorithm learns to map inputs to outputs from labeled examples. The training data consist of input-output pairs, and from these pairs the algorithm derives a function that allows it to make predictions or decisions on new data. The labeled data are split into a training set and a test set, and the algorithm learns from the training set to predict or classify the output variable in the test set. SL algorithms need external help for learning, and the patterns learned from the training data are then applied to new data for prediction or classification [O26]. Supervised learning problems fall into two types - classification and regression (a short code sketch of both follows the list):

  1. Classification uses an algorithm to assign test data accurately to specific categories. It recognizes specific entities in a data set and attempts to draw conclusions about how those entities should be labeled or defined. Common classification algorithms are linear classifiers, support vector machines (SVM), decision trees, k-nearest neighbors and random forests.
  2. Regression is used to understand the relationship between dependent and independent variables. It is commonly used to make forecasts such as sales revenue for a given business. Linear regression, logistic regression, and polynomial regression are popular regression algorithms.
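
As a rough illustration of the two problem types, the following sketch uses scikit-learn; the library choice, the decision tree and linear regression models, and the toy data sets are assumptions made for this example, not prescribed by the text.

```python
# Illustrative sketch: supervised classification and regression with scikit-learn
# (library, models and data sets are assumptions for this example).
from sklearn.datasets import load_iris, make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LinearRegression

# Classification: labeled examples are split into a training set and a test set,
# and the model learns a mapping from inputs to class labels on the training set.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = DecisionTreeClassifier().fit(X_train, y_train)
print("classification accuracy on the test set:", clf.score(X_test, y_test))

# Regression: the model learns the relationship between independent and dependent
# variables and predicts a continuous value for new inputs.
Xr, yr = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(Xr, yr, test_size=0.25, random_state=0)
reg = LinearRegression().fit(Xr_train, yr_train)
print("regression R^2 on the test set:", reg.score(Xr_test, yr_test))
```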

Unsupervised learning (UL) refers to machine learning tasks where there is no labeled data or teacher to provide correct answers. In other words, we have input data but no output data.

In unsupervised learning, algorithms are free to explore the data and discover interesting structures on their own; it is not known in advance what the output will look like. These algorithms typically learn a few key features of the data, which are then used to classify new data. The main applications of such learning are data clustering and feature reduction [O26]. Unsupervised learning encompasses algorithms such as Principal Component Analysis (PCA), Independent Component Analysis (ICA), K-Means and the Expectation Maximization Algorithm (EMA) [O27]. UL models can perform more complex tasks than SL models, but they are also more unpredictable. Unsupervised learning can be divided into two types:

  1. Clustering, in which we find hidden patterns in data based on their similarities or differences. These patterns can relate to shape, size or color and are used to group data items into clusters. There are several types of clustering algorithms, such as exclusive, overlapping, hierarchical and probabilistic (a short K-Means sketch follows this list).
  2. Association, in which we find the relationship of one data item to another and then use these dependencies, mapping them in a way that is beneficial to us. An association rule is used to determine the probability of co-occurrence of items in a collection. These techniques are often used to analyze customer behavior on e-commerce websites.
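
As a minimal sketch of the clustering idea, the following example groups unlabeled points with K-Means using scikit-learn and NumPy; the synthetic data and parameter values are assumptions made purely for illustration.

```python
# Illustrative sketch: clustering with K-Means (scikit-learn assumed).
# No labels are given; the algorithm groups points by similarity on its own.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic 2-D data drawn around three centers (the data are an assumption).
data = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(5, 5), scale=0.5, size=(50, 2)),
    rng.normal(loc=(0, 5), scale=0.5, size=(50, 2)),
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(data)
print("cluster labels of the first 10 points:", kmeans.labels_[:10])
print("cluster centers:\n", kmeans.cluster_centers_)
```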

One of the most famous types of artificial intelligence today are large language models (LLMs). These models use unsupervised machine learning and are trained on huge amounts of text to learn how human language works. These texts include scientific articles, books, websites and many other sources.

Reinforcement learning (RL) is a learning principle that uses feedback from evaluated results to strengthen valid rules and weaken ineffective or bad ones. In reinforcement learning there is no labeled training data, but this does not mean that there is no supervision information at all. The system runs according to the reinforcement learning program, and when the desired result is achieved it receives a signal called a reward, which in turn provides feedback for the training itself. RL is a field of machine learning that emphasizes how to act in an environment in order to maximize the expected benefit. The idea comes from behaviorist theory in psychology, according to which an organism, stimulated by rewards or punishments provided by the environment, gradually builds expectations about the stimuli; this leads to habitual behavior and ultimately to the maximization of benefit [O28].
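
A minimal sketch of this reward-feedback loop is tabular Q-learning on a toy "corridor" environment; the environment, reward values and hyperparameters below are invented purely for illustration and are not taken from the text.

```python
# Minimal Q-learning sketch on a toy 1-D "corridor" environment
# (environment, rewards and hyperparameters are illustrative assumptions).
import random

N_STATES = 5          # states 0..4; reaching state 4 yields the reward
ACTIONS = [-1, +1]    # move left or right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for episode in range(500):
    state = 0
    while state != N_STATES - 1:
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # The reward signal strengthens the value of actions that lead to the goal.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state

# Best learned action per non-terminal state (should point toward the goal).
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)})
```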

Neural networks are a means of machine learning in which a computer learns to perform a task by analyzing training examples, which are usually labeled by hand in advance. For example, an object recognition system can be fed thousands of labeled images of cars, houses, cups of coffee and so on, and it finds visual patterns in them that consistently correlate with specific labels.

A neural network, loosely modeled after the human brain, consists of thousands or even millions of simple processing nodes that are densely interconnected. Most of today's neural networks are organized into layers of nodes and are "feed-forward", meaning that data moves in only one direction. An individual node can be connected to several nodes in the layer below it, from which it receives data, and to several nodes in the layer above it, to which it sends data.

A node assigns a number known as a "weight" to each of its incoming connections. When the network is active, the node receives a different data item - a different number - over each of its connections and multiplies it by the assigned weight. It then adds the resulting products together to obtain a single number. If this number is below a threshold, the node does not forward any data to the next layer. If the number exceeds the threshold, the node "fires", which in today's neural networks generally means sending the number - the sum of the weighted inputs - along all of its output connections.
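
This weighted-sum-and-threshold behavior of a single node can be sketched in a few lines of Python; the input values, weights and threshold below are arbitrary numbers chosen only to illustrate the computation.

```python
# Sketch of a single feed-forward node as described above:
# a weighted sum of inputs compared against a threshold (all numbers are made up).
def node_output(inputs, weights, threshold):
    # Multiply each incoming value by the weight of its connection and sum them.
    total = sum(x * w for x, w in zip(inputs, weights))
    # The node "fires" (passes the sum on) only if the threshold is exceeded.
    return total if total > threshold else 0.0

print(node_output(inputs=[0.5, 0.8, 0.2], weights=[0.9, -0.3, 0.6], threshold=0.2))
```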

When a neural network is trained, all its weights and thresholds are initially set to random values. The training data are fed to the bottom layer - the input layer - and pass through the following layers, being multiplied and added together in complex ways, until they finally arrive, radically transformed, at the output layer. During training, the weights and thresholds are continuously adjusted until training data with the same labels produce similar outputs [O35].
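
The following sketch illustrates this adjust-until-consistent idea with a deliberately simplified perceptron-style update rule on a tiny labeled data set; real networks use gradient-based training, and the data, learning rate and update rule here are assumptions made only for illustration.

```python
# Minimal sketch of the training idea: weights start random and are nudged
# whenever the output disagrees with the label (simplified perceptron-style rule;
# data, learning rate and epoch count are illustrative assumptions).
import random

random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(2)]
bias = random.uniform(-1, 1)      # plays the role of the (negative) threshold
LEARNING_RATE = 0.1

# Tiny labeled training set: logical AND of two binary inputs.
training_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

for epoch in range(50):
    for (x1, x2), label in training_data:
        output = 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0
        error = label - output
        # Adjust weights and bias so that same-labeled inputs give the same output.
        weights[0] += LEARNING_RATE * error * x1
        weights[1] += LEARNING_RATE * error * x2
        bias += LEARNING_RATE * error

print("learned weights:", weights, "bias:", bias)
```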

Deep neural networks are typically composed of more than one hidden layer, organized into deeply nested network architectures. In addition, they usually contain more advanced neurons than simple ANNs, meaning they can use advanced operations (e.g., convolutions) or multiple activations in a single neuron instead of a simple activation function. These properties allow deep neural networks to be fed raw input data and to automatically discover the representation needed for the learning task at hand. DL is particularly useful in domains with high-dimensional data, which is why deep neural networks outperform shallow ML algorithms in most applications that process text, image, video, speech and audio data [O38].
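
As a sketch of such an architecture, the following example stacks convolutional and fully connected layers using PyTorch; the framework choice and the layer sizes are assumptions made for illustration, not taken from the text.

```python
# Illustrative sketch of a deep network with convolutional layers (PyTorch assumed;
# layer sizes and input dimensions are arbitrary choices for this example).
import torch
import torch.nn as nn

model = nn.Sequential(
    # Convolutional layers learn representations directly from raw image input.
    nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    # Stacked ("deep") fully connected hidden layers before the output layer.
    nn.Linear(32 * 7 * 7, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# A batch of four 28x28 grayscale images yields one 10-way score vector each.
dummy_images = torch.randn(4, 1, 28, 28)
print(model(dummy_images).shape)   # torch.Size([4, 10])
```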