In this post we consider the fundamental building block of all neural networks and AI applications: the perceptron.

Mathematically, a perceptron is a function y = f(x1, …, xN) with many inputs and one binary output.
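Concretely, the standard formulation computes a weighted sum of the inputs and applies a hard threshold. A minimal sketch (the weights `w` and bias `b` are illustrative parameters, not part of the definition above):

```python
def perceptron(x, w, b):
    """Classic perceptron: weighted sum of inputs followed by a step function."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s > 0 else 0
```

For example, `perceptron([1.0, 2.0], w=[0.5, -0.25], b=0.1)` returns 1, because the weighted sum 0.5 - 0.5 + 0.1 = 0.1 is above the threshold.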

Physically, it is a device (mechanical, electric, electronic, optical, etc.) with many inputs and one binary (two-state) output.

It is often used as a classifier that sorts its inputs into two classes at the output. Today, some people define the perceptron simply as an algorithm for classification into two groups.

For example, when we classify people into two groups by height, tall or short, we perform the basic function of a perceptron.
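This height classifier is a one-input perceptron: subtract a threshold and check the sign. A sketch, with an assumed cutoff of 170 cm:

```python
def classify_height(height_cm, cutoff_cm=170.0):
    # One-input perceptron: weight 1, bias -cutoff_cm, step output.
    return "tall" if height_cm - cutoff_cm > 0 else "short"
```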

Such devices have been known since the earliest times. For example, in ancient Greece the decisions of a group of priests were classified into two groups, favored by the gods or not favored by the gods, by devices called oracles (which today we would call perceptrons).

In modern times the perceptron was used as a model of a neuron in the nervous system, and its invention is credited to McCulloch and Pitts in 1943. This model was later implemented by Frank Rosenblatt in custom-built hardware as the "Mark I Perceptron".

Single-layer feed-forward neural networks became a very popular topic among researchers, but it was quickly recognized that such networks have serious limitations. This caused the field of neural networks to stagnate for many years.

Only when scientists discovered that feedback loops and multi-layered structures can resolve the limitations of the initial models did the field attract the attention of researchers and the public again.

In 1982 John Hopfield suggested using bidirectional connections, and in 1986 three independent groups of researchers came up with similar ideas that are now called backpropagation networks.
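The classic illustration of the single-layer limitation is the XOR function, which is not linearly separable: a single perceptron cannot learn it, while a network with one hidden layer can. A sketch using scikit-learn (assuming it is installed; the layer size and solver are illustrative choices):

```python
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

# XOR truth table: not separable by any single straight line
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# A single-layer perceptron can get at most 3 of the 4 points right.
single = Perceptron(max_iter=1000, tol=None).fit(X, y)
print(single.score(X, y))  # < 1.0

# One hidden layer is enough to represent XOR.
mlp = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                    random_state=0, max_iter=2000).fit(X, y)
print(mlp.score(X, y))
```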

The book *Perceptrons* by Marvin Minsky and Seymour Papert (1969) also played a major role in this history: by proving the limitations of single-layer networks, it contributed to the earlier decline of interest in the field.

Today we are obsessed with ChatGPT. This program is, at bottom, a collection of perceptrons. In fact, all neural networks are collections of perceptrons, in the same way that all things around us are collections of atoms. The only differences are in the structure of the connections and the number of elements.

Let us consider a simple example from [1-2] of using a perceptron for classification.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import Perceptron

X, y = load_digits(return_X_y=True)
clf = Perceptron(tol=1e-3, random_state=0)
clf.fit(X, y)
clf.score(X, y)  # 0.939...
```

First we import the dataset loader and the Perceptron class (the first two lines). Then we load the data into the arrays X and y. Next we create the perceptron and train it on the data so that it can make classifications. On the last line we output how accurate the model is (about 0.94 on the training data).

References:

[1] https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Perceptron.html

[2] https://python-course.eu/machine-learning/perceptron-class-in-sklearn.php