Neural Networks
This site is intended as a guide to neural network technologies, which I believe will be an essential foundation for what awaits us in the future.
The site is divided into three sections:
The first contains technical information about known neural network architectures; this section is purely theoretical.
The second section is a set of topics related to neural networks, such as artificial intelligence, genetic algorithms, and DSPs, among others.

Introduction

What is an artificial neural network? An artificial neural network is a system based on the operation of biological neural networks; in other words, it is an emulation of a biological neural system. Why would it be necessary to implement artificial neural networks?
Although computing these days is truly advanced, there are certain tasks that a program written for a common microprocessor is unable to perform; even so, a software implementation of a neural network can be made, with its own advantages and disadvantages.
Artificial neural networks (ANNs) are among the newest signal-processing technologies in the engineer's toolbox. The field is highly interdisciplinary, but our approach will restrict the view to the engineering perspective. In engineering, neural networks serve two important functions: as pattern classifiers and as nonlinear adaptive filters. We will provide a brief overview of the theory, learning rules, and applications of the most important neural network models.

Definitions and Style of Computation

An artificial neural network is an adaptive, most often nonlinear system that learns to perform a function (an input/output map) from data. Adaptive means that the system parameters are changed during operation, normally called the training phase. After the training phase the artificial neural network parameters are fixed and the system is deployed to solve the problem at hand (the testing phase). The artificial neural network is built with a systematic step-by-step procedure to optimize a performance criterion or to follow some implicit internal constraint, which is commonly referred to as the learning rule. The input/output training data are fundamental in neural network technology, because they convey the necessary information to "discover" the optimal operating point. The nonlinear nature of the neural network processing elements (PEs) provides the system with great flexibility to achieve practically any desired input/output map, i.e., some artificial neural networks are universal mappers.

There is a style in neural computation that is worth describing. An input is presented to the
neural network and a corresponding desired or target response is set at the output (when this is the case the training is called supervised). An error is composed from the difference between the desired response and the system output. This error information is fed back to the system and adjusts the system parameters in a systematic
fashion (the learning rule). The process is repeated until the performance is acceptable. It is clear from this
description that the performance hinges heavily on the data. If one does not have data that cover a significant
portion of the operating conditions or if they are noisy, then neural network technology is probably not the
right solution. On the other hand, if there is plenty of data but the problem is too poorly understood to derive an approximate model, then neural network technology is a good choice.
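To make this style of computation concrete, here is a minimal sketch in plain Python of a supervised training loop for a single linear unit; the data set, learning rate, and the LMS-style correction rule are illustrative assumptions, not part of any particular system described above.

```python
# Illustrative supervised training loop: present an input, form the error
# between desired response and system output, and feed it back to adjust
# the parameters (an LMS-style learning rule). All names/data are made up.

inputs  = [0.0, 0.5, 1.0, 1.5]      # sampled operating conditions
targets = [1.0, 2.0, 3.0, 4.0]      # desired responses (here y = 2x + 1)

w, b = 0.0, 0.0                     # adaptive system parameters
learning_rate = 0.1

for epoch in range(500):            # repeat until performance is acceptable
    for x, d in zip(inputs, targets):
        y = w * x + b               # system output for this input
        error = d - y               # desired response minus system output
        w += learning_rate * error * x
        b += learning_rate * error

print(f"learned parameters: w={w:.2f}, b={b:.2f}")  # approaches w=2, b=1
```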
This operating procedure should be contrasted with the traditional engineering design, made of exhaustive
subsystem specifications and intercommunication protocols. In artificial neural networks, the designer chooses the network topology,
the performance function, the learning rule, and the criterion to stop the training phase, but the system
automatically adjusts the parameters. So, it is difficult to bring a priori information into the design, and when
the system does not work properly it is also hard to incrementally refine the solution. But ANN-based solutions
are extremely efficient in terms of development time and resources, and in many difficult problems artificial neural networks
provide performance that is difficult to match with other technologies. Some ten years ago, Denker said that "artificial neural networks are the second best way to implement a solution," motivated by the simplicity of their design and by their universality, shadowed only by the traditional design obtained by studying the physics of the problem. At
present, artificial neural networks are emerging as the technology of choice for many applications, such as pattern recognition,
prediction, system identification, and control.

Artificial neural networks emerged after the introduction of simplified neurons by McCulloch and Pitts in 1943 (McCulloch & Pitts, 1943). These neurons were presented as models of biological neurons and as conceptual components for circuits that could perform computational tasks. The basic model of the neuron is founded upon the functionality of a biological neuron. "Neurons are the basic signaling units of the nervous system" and "each neuron is a discrete cell whose several processes arise from its cell body".

The neuron has four main regions to its structure. The cell body, or soma, has two offshoots from it: the dendrites and the axon, which ends in presynaptic terminals. The cell body is the heart of the cell, containing the nucleus and maintaining protein synthesis. A neuron may have many dendrites, which branch out in a treelike structure and receive signals from other neurons. A neuron usually has only one axon, which grows out from a part of the cell body called the axon hillock. The axon conducts electric signals generated at the axon hillock down its length. These electric signals are called action potentials. The other end of the axon may split into several branches, which end in presynaptic terminals. Action potentials are the electric signals that neurons use to convey information to the brain. All these signals are identical; therefore, the brain determines what type of information is being received based on the path that the signal took. The brain analyzes the patterns of signals being sent, and from that information it can interpret the type of information being received.

Myelin is the fatty tissue that surrounds and insulates the axon. Often short axons do not need this insulation. There are uninsulated parts of the axon, called the Nodes of Ranvier. At these nodes, the signal traveling down the axon is regenerated. This ensures that the signal travels fast and remains constant (i.e., very short propagation delay and no weakening of the signal). The synapse is the area of contact between two neurons. The neurons do not actually physically touch; they are separated by the synaptic cleft, and signals are sent through chemical interaction. The neuron sending the signal is called the presynaptic cell and the neuron receiving the signal is called the postsynaptic cell. The signals are generated by the membrane potential, which is based on the differences in concentration of sodium and potassium ions inside and outside the cell membrane.

Neurons can be classified by their number of processes (or appendages), or by their function. If they are classified by the number of processes, they fall into three categories. Unipolar neurons have a single process (dendrites and axon are located on the same stem) and are most common in invertebrates. In bipolar neurons, the dendrite and axon are the neuron's two separate processes. Bipolar neurons have a subclass called pseudo-bipolar neurons, which are used to send sensory information to the spinal cord. Finally, multipolar neurons are most common in mammals. Examples of these neurons are spinal motor neurons, pyramidal cells, and Purkinje cells (in the cerebellum). If classified by function, neurons again fall into three separate categories. The first group is sensory, or afferent, neurons, which provide information for perception and motor coordination.
The second group provides information (or instructions) to muscles and glands, and its members are therefore called motor neurons. The last group, the interneurons, contains all other neurons and has two subclasses: relay or projection interneurons, which have long axons and connect different parts of the brain, and local interneurons, which are only used in local circuits.

The Mathematical Model

When creating a functional model of the biological neuron, there are three basic components of importance. First, the synapses of the neuron are modeled as weights. The strength of the connection between an input and a neuron is noted by the value of the weight. Negative weight values reflect inhibitory connections, while positive values designate excitatory connections [Haykin]. The next two components model the actual activity within the neuron cell. An adder sums up all the inputs modified by their respective weights; this activity is referred to as linear combination. Finally, an activation function controls the amplitude of the output of the neuron. An acceptable range of output is usually between 0 and 1, or -1 and 1. Mathematically, the internal activity of the neuron can be written as vk = Σj wkj xj, and the output of the neuron, yk, is the outcome of some activation function applied to the value of vk: yk = φ(vk).
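The three components above translate directly into a few lines of code. The sketch below is illustrative only: it assumes a logistic sigmoid for the activation function φ, and the weights and inputs are made up.

```python
import math

def neuron_output(inputs, weights, bias=0.0):
    """Model neuron: synapses as weights, an adder, and an activation
    function that limits the output amplitude (here a logistic sigmoid)."""
    # Adder / linear combination: vk = sum_j wkj * xj (+ bias)
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function: yk = phi(vk), squashed into (0, 1)
    return 1.0 / (1.0 + math.exp(-v))

# Negative weights model inhibitory connections, positive ones excitatory.
print(neuron_output(inputs=[0.5, 1.0], weights=[0.8, -0.3]))
```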
Activation functions

As mentioned previously, the activation function acts as a squashing function, such that the output of a neuron in a neural network is between certain values (usually 0 and 1, or -1 and 1). In general, there are three types of activation functions, denoted by φ(·).
First, there is the threshold function, which takes on a value of 0 if the summed input is less than a certain threshold value v, and a value of 1 if the summed input is greater than or equal to the threshold value. The other two common types are the piecewise-linear function, which is linear over a middle range and saturates at its ends, and the sigmoid function, which squashes its input smoothly into the output range; all three are sketched in code below.
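As a rough illustration, the three types might be written as follows; the particular ranges, slopes, and thresholds are assumptions made for the sketch.

```python
import math

def threshold(v, theta=0.0):
    """Threshold function: 0 below the threshold theta, 1 at or above it."""
    return 1.0 if v >= theta else 0.0

def piecewise_linear(v):
    """Linear over a middle range, saturating at 0 and 1 outside it."""
    return max(0.0, min(1.0, v + 0.5))

def sigmoid(v):
    """Logistic sigmoid: squashes any input smoothly into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))
```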
The artificial neural networks which we describe are all variations on the parallel distributed processing (PDP) idea. The architecture of each neural network is based on very similar building blocks which perform the processing. In this chapter we first discuss these processing units and the different neural network topologies; learning strategies, as a basis for an adaptive system, are treated afterwards.

A framework for distributed representation

An artificial neural network consists of a pool of simple processing units which communicate by sending signals to each other over a large number of weighted connections. A set of major aspects of a parallel distributed model can be distinguished, beginning with the processing units themselves.
Processing units

Each unit performs a relatively simple job: receive input from neighbours or external sources and use this to compute an output signal which is propagated to other units. Apart from this processing, a second task is the adjustment of the weights. The system is inherently parallel in the sense that many units can carry out their computations at the same time. Within neural systems it is useful to distinguish three types of units: input units (indicated by an index i), which receive data from outside the neural network; output units (indicated by an index o), which send data out of the neural network; and hidden units (indicated by an index h), whose input and output signals remain within the neural network. During operation, units can be updated either synchronously or asynchronously. With synchronous updating, all units update their activation simultaneously; with asynchronous updating, each unit has a (usually fixed) probability of updating its activation at a time t, and usually only one unit will be able to do this at a time. In some cases the latter model has some advantages.

Neural network topologies

In the previous section we discussed the properties of the basic processing unit in an artificial neural network. This section focuses on the pattern of connections between the units and the propagation of data. As for this pattern of connections, the main distinction we can make is between feed-forward networks, where the data flow from input to output units strictly in one direction, and recurrent networks, which do contain feedback connections, so that the dynamical properties of the network become important; the sketch below contrasts the two.
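A minimal sketch of the distinction, using NumPy; the function names, layer sizes, and random weight matrices are illustrative assumptions only.

```python
import numpy as np

def feedforward_step(x, W_ih, W_ho):
    """Feed-forward: data flow in one direction, input -> hidden -> output."""
    h = np.tanh(W_ih @ x)            # hidden units (index h)
    return np.tanh(W_ho @ h)         # output units (index o)

def recurrent_step(x, h_prev, W_ih, W_hh):
    """Recurrent: feedback connections make the previous state matter."""
    return np.tanh(W_ih @ x + W_hh @ h_prev)

rng = np.random.default_rng(0)
x = rng.normal(size=3)               # 3 input units
W_ih = rng.normal(size=(4, 3))       # input -> hidden weights
W_hh = rng.normal(size=(4, 4))       # hidden -> hidden feedback weights
W_ho = rng.normal(size=(2, 4))       # hidden -> output weights

y = feedforward_step(x, W_ih, W_ho)             # depends only on x
h = recurrent_step(x, np.zeros(4), W_ih, W_hh)  # depends on x and past state
```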
Classical examples of feed-forward neural networks are the Perceptron and the Adaline; examples of recurrent networks have been presented by Anderson.

Training of artificial neural networks

A neural network has to be configured such that the application of a set of inputs produces (either 'directly' or via a relaxation process) the desired set of outputs. Various methods to set the strengths of the connections exist. One way is to set the weights explicitly, using a priori knowledge. Another way is to 'train' the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule. We can categorise the learning situations into two distinct sorts: supervised learning, in which the network is trained by providing it with input patterns together with the desired output patterns, and unsupervised learning, in which the network is expected to discover statistically salient features of the input population on its own, with no target outputs given.
Modifying patterns of connectivity of Neural Networks
Both learning paradigms, supervised learning and unsupervised learning, result in an adjustment of the weights of the connections
between units, according to some modification rule. Virtually all learning rules for models
of this type can be considered as a variant of the Hebbian learning rule suggested by Hebb in his classic book Organization of Behaviour (Hebb, 1949). The basic idea is that if two
units j and k are active simultaneously, their interconnection must be strengthened. If j receives
input from k, the simplest version of Hebbian learning prescribes modifying the weight wjk by Δwjk = γ · yj · yk, where γ is a positive constant of proportionality representing the learning rate.
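A minimal sketch of this rule, with an assumed learning rate γ and illustrative binary activation values:

```python
def hebbian_update(w_jk, y_j, y_k, gamma=0.1):
    """Hebb's rule: strengthen wjk when units j and k are active together."""
    return w_jk + gamma * y_j * y_k

w = 0.0
for y_j, y_k in [(1, 1), (1, 0), (1, 1)]:   # illustrative activations
    w = hebbian_update(w, y_j, y_k)
print(w)  # ~0.2: strengthened only on steps where both units were active
```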