Predictive learning


Predictive learning is a machine learning (ML) technique where an artificial intelligence model is fed new data to develop an understanding of its environment, capabilities, and limitations. This technique finds application in many areas, including neuroscience, business, robotics, and computer vision. This concept was developed and expanded by French computer scientist Yann LeCun in 1988 during his career at Bell Labs, where he trained models to detect handwriting so that financial companies could automate check processing.

The mathematical foundation for predictive learning dates back to the 17th century, when the British insurance market Lloyd's used early predictive analytics to price risk profitably. What began as a statistical method later expanded the possibilities of artificial intelligence. Predictive learning is an attempt to learn with a minimum of pre-existing mental structure. It was inspired by Jean Piaget's account of children constructing knowledge of the world through interaction. Gary Drescher's book Made-up Minds was crucial to the development of this concept.

The idea that the brain uses predictions and unconscious inference to construct a model of the world, in which it can identify the causes of percepts, goes back even further, to Hermann von Helmholtz's work on unconscious inference. These ideas were further developed by the field of predictive coding. Another related predictive learning theory is Jeff Hawkins' memory-prediction framework, which is laid out in his book On Intelligence.

Mathematical procedures

Training process

As in other supervised ML settings, predictive learning aims to predict the value of an unknown dependent variable Y given independent input data X = (x_1, x_2, …, x_n). A set of attributes can be classified into categorical data (discrete factors such as race, sex, or affiliation) or numerical data (continuous values such as temperature, annual income, or speed). Every set of input values is fed into a neural network to predict a value y. In order to predict the output accurately, the weights of the neural network (which represent how much each predictor variable affects the outcome) must be incrementally adjusted via backpropagation to produce estimates closer to the actual data.

Once an ML model is given enough adjustments through training to predict values closer to the ground truth, it should be able to correctly predict outputs of new data with little error.
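The incremental weight adjustment described above can be sketched in a minimal form. The following illustrative example (the data, learning rate, and single-weight linear model are assumptions, not part of the source) trains a one-variable linear predictor by gradient descent on squared error, the simplest case of the adjustment loop that backpropagation generalizes to deep networks:

```python
# Minimal sketch of the training loop: a single-weight linear predictor
# whose weight and bias are nudged against the prediction error.
# Data, learning rate, and epoch count are illustrative assumptions.

def train(xs, ys, lr=0.01, epochs=1000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            y_hat = w * x + b      # forward pass: predicted value
            err = y_hat - y        # prediction error
            w -= lr * err * x      # gradient step on the weight
            b -= lr * err          # gradient step on the bias
    return w, b

# Noiseless data generated from y = 2x + 1; training should
# recover w close to 2 and b close to 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]
w, b = train(xs, ys)
```

Because the data are noiseless and the true parameters make the error zero on every sample, the updates contract toward w = 2, b = 1.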

Maximizing accuracy

In order to ensure maximum accuracy, a predictive learning model's predicted values ŷ = F(x) must stay within an acceptable error of the actual values y, as measured by the risk

R(F) = E_{x,y} L(y, F(x)),

where L is the loss function, y is the ground truth, and F(x) is the prediction. This error measure is used to make incremental adjustments to the model's weights until the model approaches the optimal predictor

F*(x) = argmin_{F} E_{x,y} L(y, F(x)).

Once the error is negligible or considered small enough after training, the model is said to have converged.
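In practice the expectation in R(F) is estimated by averaging the loss over observed data. This short sketch (the squared-error loss, toy model, and data points are assumed for illustration) computes such an empirical risk:

```python
# Empirical estimate of the risk R(F) = E[L(y, F(x))]: average a
# loss L over observed (x, y) pairs. Squared error is one assumed
# choice of loss; the formula holds for any L.

def empirical_risk(loss, model, data):
    return sum(loss(y, model(x)) for x, y in data) / len(data)

squared_error = lambda y, y_hat: (y - y_hat) ** 2
model = lambda x: 2 * x           # hypothetical trained predictor F(x)
data = [(1, 2), (2, 4), (3, 7)]   # last pair deviates from the model

risk = empirical_risk(squared_error, model, data)  # (0 + 0 + 1) / 3
```

A risk near zero on held-out data is the practical signal that the model has converged in the sense described above.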

Ensemble learning

In some cases, a single machine learning approach is not enough to create an accurate estimate for certain data. Ensemble learning is the combination of several ML algorithms to create a stronger model. The combined model is represented by the function

F(x) = a_0 + Σ_{m=1}^{M} a_m f_m(x),

where M is the number of models in the ensemble, a_0 is the bias (intercept), a_m is the weight of the m-th model, and f_m(x) is the prediction of the m-th constituent model. The ensemble's output is thus a linear combination of its constituent models' predictions, with the weights a_m chosen to minimize a penalized empirical loss:

â_m = argmin_{a_m} Σ_{i=1}^{N} L(y_i, a_0 + Σ_{m=1}^{M} a_m f_m(x_i)) + λ Σ_{m=1}^{M} |a_m|,

where y_i is the i-th actual value, the second argument of L is the ensemble's prediction for input x_i, and λ is a regularization coefficient that penalizes large weights (a lasso-style penalty), shrinking the influence of less useful constituent models.
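The linear combination F(x) = a_0 + Σ a_m f_m(x) can be sketched directly. In this illustrative example the two base learners and their weights are assumptions; in practice the weights would come from the penalized minimization above:

```python
# Sketch of an ensemble prediction F(x) = a_0 + sum_m a_m * f_m(x).
# The base learners f_m and weights a_m here are illustrative
# assumptions, not fitted values.

def ensemble_predict(x, a0, weights, learners):
    return a0 + sum(a * f(x) for a, f in zip(weights, learners))

learners = [lambda x: x, lambda x: x * x]   # two toy base models
weights = [0.5, 0.25]                       # assumed weights a_m
a0 = 1.0                                    # bias term a_0

y_hat = ensemble_predict(2.0, a0, weights, learners)  # 1 + 1 + 1 = 3
```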

Applications

Cognitive development

[Figure: Dr. Yukie Nagai's predictive learning architecture for predicting sensorimotor signals]

Sensorimotor signals are the neural signals exchanged between the brain and the body during perception and movement. Using predictive learning to model sensorimotor signals plays a key role in early cognitive development, as the human brain represents these signals in a predictive manner: it attempts to minimize the prediction error between incoming sensory signals and its top-down predictions. An untrained predictor has no inherent prediction ability and must therefore be trained through sensorimotor experience. In a recent research paper, Dr. Yukie Nagai proposed a predictive learning architecture for predicting sensorimotor signals based on a two-module approach: a sensorimotor system that interacts with the environment and a predictor that simulates the sensorimotor system in the brain.

Spatiotemporal memory

Computers use predictive learning in spatiotemporal memory to generate future frames of an image sequence given the preceding frames. This implementation uses predictive recurrent neural networks, which are neural networks designed to work with sequential data such as a time series. Using predictive learning in conjunction with computer vision enables computers to create images of their own, which can be helpful when modeling sequential phenomena such as DNA replication, face recognition, or X-ray image generation.
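The core idea of predicting the next element of a sequence from its past can be illustrated without a full recurrent network. This toy sketch (the AR(1) linear model and the doubling sequence are assumptions standing in for a real predictive RNN) fits a one-step predictor s[t+1] ≈ a·s[t] by least squares and extrapolates the next "frame":

```python
# Toy stand-in for sequence prediction: fit a one-step linear
# (AR(1)) predictor s[t+1] ~ a * s[t] by least squares, then
# extrapolate the next value. A real spatiotemporal model would
# use a recurrent neural network; this only illustrates the idea.

def fit_ar1(seq):
    num = sum(seq[t] * seq[t + 1] for t in range(len(seq) - 1))
    den = sum(seq[t] ** 2 for t in range(len(seq) - 1))
    return num / den

seq = [1.0, 2.0, 4.0, 8.0]   # each "frame" doubles the last
a = fit_ar1(seq)             # least squares recovers a = 2.0
next_frame = a * seq[-1]     # predicted next value: 16.0
```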

Social media consumer behavior

In a recent study, data on consumer behavior was collected from various social media platforms such as Facebook, Twitter, LinkedIn, YouTube, Instagram, and Pinterest. Applying predictive learning analytics to this data let researchers uncover trends in consumer behavior, such as estimating how successful a campaign could be, setting a fair price to attract consumers, assessing data security, and identifying the specific audiences to target for particular products.

References

  1. "Yann LeCun "Predictive Learning: The Next Frontier in AI"". Nokia Bell Labs. 2017-02-17. Retrieved 2023-11-04.
  2. Predictive Success Corporation (2019-05-06). "A Brief History of Predictive Analytics". Medium. Retrieved 2023-10-27.
  3. Drescher, Gary L. (1991). Made-up Minds: A Constructivist Approach to Artificial Intelligence. MIT Press. ISBN 978-0-262-04120-1.
  4. Friedman, Jerome H.; Popescu, Bogdan E. (2008). "Predictive learning via rule ensembles". The Annals of Applied Statistics. 2 (3): 916–954. arXiv:0811.1679. doi:10.1214/07-AOAS148. ISSN 1932-6157.
  5. Nagai, Yukie (2019). "Predictive learning: its key role in early cognitive development". Philosophical Transactions of the Royal Society B: Biological Sciences. 374 (1771): 20180030. doi:10.1098/rstb.2018.0030. ISSN 0962-8436. PMC 6452246. PMID 30852990.
  6. Chaudhary, Kiran; Alam, Mansaf; Al-Rakhami, Mabrook S.; Gumaei, Abdu (2021). "Machine learning-based mathematical modelling for prediction of social media consumer behavior using big data analytics". Journal of Big Data. 8 (1): 73. doi:10.1186/s40537-021-00466-2. ISSN 2196-1115.