The following report details stage 1 of the research study “Artificial Intelligence in a Simplified Driving World Environment”. Artificial intelligence (AI) focuses on the development of machines able to perform intelligent tasks independently, replicating the actions that a human expert would take in a given situation. The AI market is growing annually, and the AI robotics market alone is predicted to exceed $12 billion by 2024, at a CAGR of 29%. The focus of this report is on how AI can be used to control autonomous vehicles, more specifically using machine learning and artificial neural networks (ANNs), in a simulated environment. Currently, 94% of road accidents are caused by human error, and the development of autonomous vehicles has the potential to drastically reduce this figure. ANNs are one of the most actively researched subfields of AI being applied to this problem.
During this stage 1 report, the literature review, preliminary studies, project objectives and project plan are defined to prepare for stage 2. Background research on AI and ANN design is conducted throughout the literature review. The many subsections of AI are first categorised, before reinforcement, unsupervised and supervised learning are described in the discussion of machine learning techniques. This project involves supervised learning, more specifically regression, as the operator will be “supervising” the learning process and the output variables being analysed are continuous. ANNs are then covered in terms of the network structure and some developments that improve the network’s learning process, including the artificial neurons used, backpropagation, cost functions and gradient descent. Finally, the state of the art is discussed – namely recurrent neural networks and convolutional neural networks (RNNs and CNNs). However, given the short time frame of the project, the ANN created in this project will be neither of these, due to the complexity of such networks and the computational power required to run them.
Preliminary studies have been conducted ahead of the project objectives, as further information is required to put the objectives into context. The Open Racing Car Simulator (TORCS) and Python are the simulator and programming language used in this project. Analysis of previously created TORCS robots (“SnakeOil” and “ClientEdwards”) has then been performed. The autopilot code used in these robots will help with the development of a more sophisticated autopilot using ANNs. Some basic scripts have been created to gain familiarity with controlling a TORCS robot using Python, such as making the robot do donuts (full lock steering, full throttle acceleration). The use of Xbox controller joysticks has also been explored to improve the vehicle’s steering performance over basic keyboard controls. Finally, a machine learning framework, TensorFlow, has been researched, which could allow better visualisation of the ANN’s behaviour and easier adaptation of learning parameters.
In stage 2, a sophisticated, robust ANN will be created to control a robot vehicle in TORCS and emulate the lap of a human operator, learning from each lap it performs as more data is collected. These aims are defined in the project objectives, with a Gantt chart used to represent the project plan. The report, presentation and project poster will then be completed.
Artificial intelligence has been a growing research sector attracting great interest year on year. The AI robotics market alone has been predicted to exceed US $12 billion by 2024, at a CAGR of 29% [1]. The broadness of AI means that only the aspects most relevant to “Artificial Intelligence in a Driving World Environment” are discussed in this report. The focus is centred around its potential uses in autonomous vehicles in the future – in particular the extent to which artificial neural networks (ANNs) could be safely and successfully used.
Autonomous vehicles are desirable due to their potential to reduce annual road accidents. Neural networks are one of the most widely-researched subfields of AI and machine learning being applied to autonomous vehicles, and so they are discussed in detail in this report. An estimated 94% of road accidents are caused by human error [2]. Successful ANN-operated vehicles have massive potential to reduce this figure, but the challenge is ensuring the networks are robust, safe, and capable of learning from new scenarios encountered.
Although neural networks have been in development since 1957, when perceptrons were first proposed by Frank Rosenblatt [3][4], the concept has only taken off in recent years following the emergence of deep neural networks, or “deep learning”, in 2006 [5]. Now, evidence of ANNs can be seen on a daily basis, with their uses essential to the progression of companies such as Google and Facebook. Much like the classifications of AI, many different types of ANN exist too. Therefore, only those with the greatest relevance to the research topic have been discussed, namely variations of traditional ANNs and convolutional neural networks (CNNs), in particular deep learning CNNs, which are currently used in the most sophisticated, state of the art autonomous vehicles to date, such as those developed by Nvidia [6]. Currently, the complexity of CNNs and the computing power required to run them makes them unrealistic to develop given the time restrictions of the project, but they may be covered should any further research be conducted following this project.
The aims of this project are to develop a sophisticated understanding of Python and ANNs, and to use this understanding to control a vehicle using ANNs to race around a track on a racing car simulator. The aim is to perform the fastest, controlled “flying” lap time possible, with no collisions. The neural network will learn from training data, collected from a lap performed manually by the operator, with the ANN aiming to replicate this lap to its best ability. It shall then learn to improve its performance by using data collected from laps it performs itself. This project could therefore be seen as the early stages of designing an autonomous racing vehicle, developing and refining the code of a neural network design and evaluating its performance.
Artificial intelligence, machine learning techniques and neural networks are covered extensively throughout the literature review, along with their current applications and the state of the art developments to date. The preliminary studies conducted are then covered before discussion of the project objectives, as knowledge of the programmes and methodology is first required to give context to some objectives. The project plan can then be found in section 6.
Literature Review
A Background of Artificial Intelligence
AI has a vast range of applications and the potential to be utilised in multiple sectors because of its adaptability. Due to the large number of fields within AI, only those subsections with particular relevance to its use in autonomous vehicles have been considered in this report. A breakdown of the key fields of AI is shown in Figure 9, Appendix 7.1.
Expert Systems
Expert systems are among the earliest developed AI systems. They are knowledge-based systems that attempt to replicate the decision-making process of an expert operator in the field. Using a method similar to human reasoning, built from if-then rules [7], the system can solve complex problems in specialised areas under the supervision of the expert. The issue with expert systems is that they require considerable human expertise to ensure high performance and reliability; should the system develop an error, this carries extreme risks in fields such as medicine and finance [8].
Vision Systems
AI in computer vision is used to construct explicit, meaningful descriptions of objects from images that the system is presented with [9]. However, the system must be able to do more than just image processing, as it must also be able to understand the image in order to recognise similar images. Applications can include the system being able to recognise many different images, from numbers and letters to faces and natural land formations.
Knowledge Representation
AI is able to represent knowledge symbolically and can manipulate data using reasoning programmes. Techniques such as logic-based AI or a top-down approach can be used to build the behaviour of the system intelligently, finding out what is needed by an agent before applying the most effective computational solution [10].
Deductive Reasoning and Problem Solving
AI can be used to develop logic and reasoning in order to be able to answer problems, based on statements or conditions that can be formed by the system. This is similar to knowledge representation, in which a top-down, logical approach can be utilised to most effectively draw conclusions from the statements, allowing possible solutions to the problem to be narrowed down until a solution is found [11].
Machine Learning Techniques
Machine learning is the field of AI with the greatest relevance to this study and is therefore covered in more detail throughout section 2.2. As shown in Figure 9, Appendix 7.1, its techniques can also be broken down into deeper subsections of AI. This is important in order to understand exactly what type of AI is being developed in this project, and to justify the choice of techniques that allow the machine to learn. Machine learning can be classified into three main subsections, based on the three types of feedback an agent can learn from: reinforcement, unsupervised and supervised learning [12]. The interaction between agents and the environment is common to all machine learning cases, and multiple agents can exist. Agents either use sensors to perceive what is happening in the environment or perform actions within the environment using actuators to modify what is happening.
Reinforcement Learning
Reinforcement machine learning lies in between supervised and unsupervised machine learning. The machine learns from either success or failure, receiving a reward or punishment based on its actions to determine whether they were “good” or “bad” [12]. Each of these signals is known as a “reinforcement”, and the machine learns from a series of reinforcements to decide on the next action to be taken.
A Markov Decision Process (MDP) is also typically used to convert the environment into a set of states that the agents can understand [13]. The machine decides on the action it thinks will provide the maximum potential reward signal, or, if it is yet to receive any reward in the learning process, tries to avoid the actions most likely to result in punishment [13]. This method is therefore similar to trial and error, and allows the machine to learn what to do without direct labelled examples of what it should do in different situations [12]. Algorithms used in machine learning include Bayesian learning, neural networks, deep learning and manifold learning techniques [14]. Reinforcement learning works well for many scenarios in which “correct” decisions can be easily classified; however, in some scenarios it is not so easy to define what is a good move by the algorithm and what is a bad one. This is the case with driving, where, although it may be possible for the machine to learn through trial and error, supervised and unsupervised learning are simpler and more efficient with regards to deciding autonomous manoeuvres.
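To make the reward-and-punishment idea concrete, the following minimal sketch (illustrative only, not part of the project code) shows an agent learning action values purely from numerical rewards. The action names and reward values are hypothetical:

```python
import random

# Minimal sketch of reward-driven learning: the agent estimates the value of
# two hypothetical actions ("accelerate", "brake") from rewards alone.
values = {"accelerate": 0.0, "brake": 0.0}
alpha = 0.1  # learning rate

def reward(action):
    # Hypothetical environment: accelerating is usually "good" (+1),
    # braking is usually "bad" (-1).
    return 1.0 if action == "accelerate" else -1.0

random.seed(0)
for _ in range(100):
    action = random.choice(list(values))
    # Move the value estimate towards the observed reward
    values[action] += alpha * (reward(action) - values[action])

print(values["accelerate"] > values["brake"])  # True: it prefers accelerating
```

No labelled examples of "correct" driving are supplied; the preference emerges from the reinforcement signal alone.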
Unsupervised Learning
Unsupervised learning is similar to reinforcement learning in that it attempts to process datasets that have not been classified or labelled, so no feedback is given [12]. However, this method is more similar to the way human learning works: the algorithm must instead infer patterns from input data and respond to the information provided without guidance. This is done by finding relationships or structure within a given dataset, where the machine can observe and learn different relationships between elements. Using this knowledge, the machine then attempts to categorise data without a given set of instructions or rules, unlike in supervised learning [15], which also allows unsupervised learning systems to perform more complex tasks [16]. Unsupervised learning most commonly uses cluster analysis to perform tasks, such as hierarchical and k-means clustering or clustering through the use of neural networks, to group data whose structure first appeared hidden into classes [17][13]. Decisions can then be made based on patterns in the input data to find an output solution.
It is believed that unsupervised learning has the potential to improve upon supervised learning techniques, particularly in classification methods. However, at this current moment in time, well-performing unsupervised machines have been difficult to create, despite rapid progress being made over the past few decades [16]. As a result, supervised learning machines still seem to be the preferred choice for most researchers [18][19].
Supervised Learning
Supervised learning allows learning from datasets using a given set of rules or actions and specific outputs that are pre-defined by a human operator, who “supervises” the learning process and has a clear idea of which actions or deductions of the machine are “correct” for a given scenario [12]. The system computes the process of getting from the inputs to the desired outputs, inductively learning from a given set of training data to find what the operator believes to be the best action for the given scenario [12][13]. As the system knows all of the inputs and possible outputs, it has to find the mapping function that gets it from the inputs as close to the correct output as possible. It can therefore effectively be thought of as learning from instruction or examples [15].
The choice between supervised, unsupervised and reinforcement learning depends mainly on the available computing power and other factors relating to the volume and structure of the data [18]. Supervised learning is deemed the most suitable choice for this project because the desired outputs are known, so by learning from a given set of labelled training datasets from driving examples, it should be possible for the vehicle to learn to perform faster lap times per run. The simplicity of supervised learning also suits the short duration of the project, and it remains the most widely favoured machine learning technique in current research [18][19].
Supervised Learning: Classification vs. Regression
Supervised learning can be separated into two sub-sections: classification and regression. Both attempt to approximate a mapping function to get from a set of input variables to an output; the difference between the two is that regression produces continuous output variables, whereas classification produces discrete output variables [20]. For example, classification can involve a machine being given a number of different images, such as cats and dogs, and being told what they are. If the machine is then shown an image of a cat or dog it has not yet seen, it is expected that, based on the data it has previously collected, it will determine which animal it is being shown and state this as its output. Regression, by comparison, is what is being considered in this project – driving a vehicle around a race track. The required output variables in the simulation, such as the throttle and steering angle, will be continuously changing as the vehicle moves between different areas of the track. The regression algorithm must find the best continuous prediction of these control outputs from the input values it receives, such as the vehicle speed and position, to allow the vehicle to achieve the desired behaviour.
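As an illustration of regression producing a continuous output, the short sketch below fits a straight line by least squares to synthetic, hypothetical data mapping a track-position reading to a steering value. It is not the project's algorithm, merely a demonstration of a continuous mapping function learned from labelled examples:

```python
# Illustrative regression sketch: fit a straight line mapping a hypothetical
# track-position reading to a continuous steering output by least squares.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic training data: steer back towards the centre line
track_pos = [-1.0, -0.5, 0.0, 0.5, 1.0]
steering  = [ 0.8,  0.4, 0.0, -0.4, -0.8]

slope, intercept = fit_line(track_pos, steering)
predicted = slope * 0.25 + intercept   # continuous output for an unseen input
```

Unlike a classifier, which would return one of a fixed set of labels, the fitted model returns a real number for any input it is given.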
Both methods can utilise the same algorithms in some cases, such as neural networks and decision trees. However, some algorithms are also only applicable to one method or the other. For example, linear regression can only be used in regression, whereas logistic regression can only be used in classification [18]. In this project, neural networks will be used because of their adaptability and their future potential in the field of AI over other methods, as they more closely replicate human thinking than any other method of AI created to date [21].
Artificial Neural Networks (ANNs)
Neural networks, or ANNs, are not so easily categorised or defined, as shown by their multiple appearances in Figure 9, because they are a key component of multiple subsections of AI and because of their adaptability in the field. Instead, the different network types and models are defined. In combination with deep learning, many subfields of AI can presently best be advanced using ANNs, and their future potential is of great interest to researchers in the field of AI [5].
Other machine learning techniques do exist, such as Support Vector Machines (SVMs), which can also produce highly accurate results [22][3], similar to those of neural networks. There are many varying opinions on which learning technique is currently the best, which is why research is split into all these different fields. However, given the potential and continued appearance of certain neural networks currently in development, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) [5][19][23], neural networks have been focused on throughout this report.
Neural networks have existed since 1957, when perceptrons were first proposed by Frank Rosenblatt [3][4]. Since then, they have come and gone as researchers struggled to use different neural network concepts in practice [24]. However, in 1986 the return of successful backpropagation methods saw excitement around neural networks resurface [25], with the slow progression of neural nets picking up pace in the 1990s to reach the point where they are now, with mass interest in the development of deep learning systems and attempts to create unsupervised learning systems [24]. Given the huge number of networks created and experimented with during this time period, only those with particular relevance to this project have been covered in this report. A background on the key concepts in neural networks is first discussed, followed by an explanation of more complex networks that could be researched should further work be carried out on this project.
Network Structure
The basic structure of a neural network can be described as follows, shown in Figure 2. ANNs have a number of layers comprised of artificial neurons, starting with the input layer, made up of known input variables, and ending with the output layer, defining the output variable or variables [26]. As only supervised machine learning is considered in this project, these output variables will be pre-known to the network. In between the inputs and outputs are layers known as hidden layers, a term that simply describes layers that are neither the input nor the output layer [27]. A neural network can have any number of these, and a larger number of layers usually gives an indication of the network’s complexity and sophistication. Networks with more than one layer of neurons are known as multiple layer or multi-layer networks, and most networks used today are typically multi-layered in order to solve problems of increasing complexity that are difficult to solve through other means [26].
Figure 1 – Perceptron Process [5] Figure 2 – Multi-Layer Neural Network [27]
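The layered structure described above can be sketched in Python as a simple forward pass through one hidden layer. The weights and biases here are illustrative placeholders rather than trained values:

```python
import math

# Minimal sketch of a multi-layer network: a forward pass through a small
# network with one hidden layer of sigmoid neurons.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # Each neuron sums its weighted inputs plus a bias, then activates
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

inputs = [0.5, -0.2]                                           # input layer
hidden = layer(inputs, [[0.1, 0.4], [-0.3, 0.2]], [0.0, 0.1])  # hidden layer
output = layer(hidden, [[0.7, -0.5]], [0.05])                  # output layer
```

Each layer's outputs become the next layer's inputs, which is all the "depth" of a network amounts to structurally.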
Artificial Neurons
Artificial neurons are fundamental to all neural networks. Their job is to take all of the inputs being fed into them and combine them with a set of given weights and a bias to produce a single output [28]. Two main types of artificial neurons are considered in this report: the perceptron and the sigmoid neuron.
The first artificial neuron created was the perceptron, which on its own was the first known neural network, created by Frank Rosenblatt [4]. Weights, w, apply a level of importance to each of the inputs, x, while the bias, b, represents a threshold value, specified by the operator, allowing classification of the output as either a 1 or a 0 using Eq. (1) [5]. If the weighted sum plus the bias is less than or equal to 0, the output of the perceptron will equal 0; if it is greater than 0, the output will equal 1:
$$\text{output} = \begin{cases} 0 & \text{if } w \cdot x + b \leq 0 \\ 1 & \text{if } w \cdot x + b > 0 \end{cases} \quad (1)$$
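Eq. (1) translates directly into code. The sketch below, with illustrative weights and bias, shows a single perceptron computing a logical AND of two binary inputs:

```python
# Direct implementation of the perceptron rule in Eq. (1): output 1 if the
# weighted sum plus bias exceeds zero, and 0 otherwise.
def perceptron(inputs, weights, bias):
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if weighted_sum > 0 else 0

# Example: a perceptron computing a logical AND of two binary inputs
weights, bias = [1.0, 1.0], -1.5
print(perceptron([1, 1], weights, bias))  # 1
print(perceptron([1, 0], weights, bias))  # 0
```

The bias of -1.5 acts as the threshold: only when both inputs fire does the weighted sum exceed it.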
It is important to understand that each perceptron only ever has one output value, as demonstrated in Figure 1, even if Figure 2 seems to suggest otherwise. The multiple outputs drawn from each perceptron simply represent the single output being fed into the inputs of the next layer of perceptrons. The problem with perceptrons is that their outputs can vary depending on the previous layer’s results. If a small change is made to the weight of a perceptron in an attempt to re-classify a parameter, for example when the output is not what the operator expected or desired, it can alter the behaviour of the entire network [5]. The output of certain perceptrons could completely flip from 0 to 1 or from 1 to 0, making the change hard to control. Sigmoid neurons were proposed to counter this issue.
Sigmoid neurons are artificial neurons that contain a sigmoid activation function to compute the inputs to the desired outputs, and are an improved version of the original perceptron idea. In contrast to perceptrons, they allow the output of the artificial neuron to be any number between (and including) 0 and 1. Therefore, making small changes to the weights or bias does not have a major effect on other neurons across the system, allowing re-classification of mis-judged parameters far more easily [29]. The output of the sigmoid neuron can be represented by Eq. (2) [5]:
$$\text{output} = \frac{1}{1 + \exp\left(-\sum_j w_j x_j - b\right)} \quad (2)$$
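Eq. (2) can be implemented as follows. The example values are illustrative, and demonstrate that a small change to a weight produces only a small change in the output, unlike the all-or-nothing flip of a perceptron:

```python
import math

# Implementation of Eq. (2): the sigmoid neuron's output varies smoothly
# between 0 and 1, so a small weight change gives a small output change.
def sigmoid_neuron(inputs, weights, bias):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

out_before = sigmoid_neuron([1.0, 0.5], [0.80, -0.4], 0.1)
out_after  = sigmoid_neuron([1.0, 0.5], [0.81, -0.4], 0.1)  # tiny weight change
print(abs(out_after - out_before) < 0.01)  # True: output shifts only slightly
```

This smoothness is what makes gradient-based learning on the weights and biases possible in the first place.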
Cost Functions and Gradient Descent
Now that the network structure is better understood, improvements that have been made to the calculations in neural networks can be explored. Inputs are obtained from a training set of data for the algorithm to learn from and pass through the network. The next requirement is for the network to iterate quickly to find the weights and biases that allow the output of the network to be correctly approximated. To measure how well the output is being calculated, a cost function, C, is used. The aim of the algorithm is to make this cost as small as possible through its choice of weights and biases. This is most easily achieved through a concept called gradient descent [30]. Here, the task of the algorithm is to find weights and biases that allow the neural network to iteratively converge to a solution as fast as possible as it learns, but at a controlled learning rate, η, so that the machine does not learn too quickly and overshoot the desired output value. The choice of this variable is therefore very important to the algorithm and must be adjusted accordingly [31].
Stochastic gradient descent is an improvement upon the simpler gradient descent method, as it is able to process much larger numbers of training inputs at a quick rate [5][31]. The gradient of the cost function, ∇C, is estimated for just a small sample of training inputs, or a “mini-batch”, as opposed to the entire set of training data used in the basic gradient descent method [31]. This allows a quick, accurate estimate of the gradient to be obtained to improve the learning process, using Eq. (3) and Eq. (4) [5]:
$$w_k \rightarrow w_k' = w_k - \frac{\eta}{m} \sum_j \frac{\partial C_{X_j}}{\partial w_k} \quad (3)$$

$$b_l \rightarrow b_l' = b_l - \frac{\eta}{m} \sum_j \frac{\partial C_{X_j}}{\partial b_l} \quad (4)$$
Here, w_k and b_l are the weights and biases, η is the learning rate, m is the small number of randomly-chosen training inputs in the mini-batch, C is the cost function and X_j is the randomly-chosen training input used for that calculation. Another mini-batch of training inputs is then chosen and the same calculations performed, and again for the next batch, until all of the training inputs have been used. Once this has been completed, one “epoch” is said to have been completed, and the process can be restarted for the next epoch.
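The update rules in Eqs. (3) and (4) can be sketched for the simplest possible case: a one-weight, one-bias linear model trained on synthetic data with a quadratic cost. The data and hyperparameters are illustrative:

```python
import random

# Sketch of the mini-batch updates in Eqs. (3) and (4) for a linear model
# y = w*x + b with a quadratic cost. Training data follows y = 2x + 1.
random.seed(1)
data = [(x / 100.0, 2.0 * (x / 100.0) + 1.0) for x in range(100)]
w, b = 0.0, 0.0
eta, m = 0.5, 10   # learning rate and mini-batch size

for epoch in range(200):
    random.shuffle(data)                   # new random mini-batches each epoch
    for i in range(0, len(data), m):
        batch = data[i:i + m]
        # Gradients of the quadratic cost, averaged over the mini-batch
        grad_w = sum((w * x + b - y) * x for x, y in batch) / m
        grad_b = sum((w * x + b - y) for x, y in batch) / m
        w -= eta * grad_w                  # Eq. (3)
        b -= eta * grad_b                  # Eq. (4)

# After training, w and b converge close to the true values 2 and 1
```

One pass over all ten mini-batches is one epoch; the shuffle at the start of each epoch is what makes the gradient estimates "stochastic".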
The cost function can then be used to quantify how well this output is being calculated. It is possible to use the quadratic cost function to do this [32]. However, alongside the use of the sigmoid function, the cross-entropy cost function is a far better choice. This is because the quadratic cost function can experience a phenomenon known as “learning slowdown” when the starting weights and biases are poorly chosen, where the cost decreases very slowly before the learning process speeds up to find new weights and biases that achieve an output close to the desired output [5].
Although this is only an issue if the chosen starting weights and biases are poor, the cross-entropy cost function is a safer choice as it does not suffer this delay even with poor choices. Its ability to improve the speed of learning, after any necessary adjustments to the learning rate, η, means the cross-entropy cost function is almost always the better choice, provided sigmoid neurons are used to allow its implementation in the network [5]. For multiple neurons, the cross-entropy cost function can be stated using Eq. (5) [5], where n is the number of items of training data, x the training inputs, y the corresponding desired output, a the neuron’s activation output and L the layer, taking the sum over every output neuron, j:
$$C = -\frac{1}{n} \sum_x \sum_j \left[ y_j \ln(a_j^L) + (1 - y_j) \ln(1 - a_j^L) \right] \quad (5)$$
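Eq. (5) can be implemented directly. The desired outputs and activations below are illustrative values, and show that confident wrong predictions are penalised far more heavily than confident correct ones:

```python
import math

# Direct implementation of the cross-entropy cost in Eq. (5), for a list of
# training examples with desired outputs y and actual activations a in (0, 1).
def cross_entropy_cost(desired, activations):
    n = len(desired)
    total = 0.0
    for y_vec, a_vec in zip(desired, activations):
        for y, a in zip(y_vec, a_vec):
            total += y * math.log(a) + (1 - y) * math.log(1 - a)
    return -total / n

# Confident, correct predictions give a low cost...
low = cross_entropy_cost([[1.0], [0.0]], [[0.99], [0.01]])
# ...while confident, wrong predictions give a high cost
high = cross_entropy_cost([[1.0], [0.0]], [[0.01], [0.99]])
print(low < high)  # True
```

The steep cost for confident mistakes is exactly what counteracts learning slowdown: the worse the error, the larger the gradient driving the correction.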
Backpropagation
Backpropagation is what has allowed gradient descent algorithms to be computed in practice, by showing how the weights and biases in the neural network affect its overall behaviour [33][34]. It makes two assumptions of the cost function: that it can be written as an average over training examples, and that it can be written as a function of the outputs of the neural network. It then applies four fundamental equations to compute the error and the gradient of the cost function. Using a set of input training examples, it computes the outputs, calculates the output error with respect to the weights and biases used, then backpropagates the error before updating the weights and biases to more appropriate values, ready for the gradient descent calculation [19]. The process of “backpropagating” through the network, starting from the output layer, is why this method is named backpropagation.
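The backward pass can be sketched for a minimal two-neuron chain with a quadratic cost, with the chain-rule gradient checked against a finite-difference estimate. All values are illustrative:

```python
import math

# Sketch of backpropagation through a minimal chain
# (input -> hidden sigmoid -> output sigmoid) with a quadratic cost,
# verified against a numerical gradient.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, b1, w2, b2):
    h = sigmoid(w1 * x + b1)
    return h, sigmoid(w2 * h + b2)

x, y = 0.5, 1.0                    # training input and desired output
w1, b1, w2, b2 = 0.4, 0.1, -0.3, 0.2

# Forward pass
h, out = forward(x, w1, b1, w2, b2)

# Backward pass: propagate the output error back through each layer
delta_out = (out - y) * out * (1 - out)    # error at the output neuron
grad_w2 = delta_out * h
delta_h = delta_out * w2 * h * (1 - h)     # error backpropagated to hidden
grad_w1 = delta_h * x

# Numerical check of grad_w1 by finite differences on the cost 0.5*(out-y)^2
eps = 1e-6
_, out_plus = forward(x, w1 + eps, b1, w2, b2)
_, out_minus = forward(x, w1 - eps, b1, w2, b2)
numeric = (0.5 * (out_plus - y) ** 2 - 0.5 * (out_minus - y) ** 2) / (2 * eps)
print(abs(grad_w1 - numeric) < 1e-6)  # True: analytic and numeric agree
```

The key point is that the errors (the delta terms) flow from the output layer backwards, reusing the quantities already computed in the forward pass.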
Deep Learning and Convolutional Neural Networks (CNNs)
Effectively, the “depth” of the network depends on how many layers are included in it, and a network with two or more hidden layers can be classified as a deep neural network [35]. Early layers typically answer more simplistic questions before later layers perform more complex decisions or tasks. Deep learning uses backpropagation to automatically calculate and tune the weights and biases in the network based on results from the previous layer [19]. It is widely regarded as a powerful learning tool with great potential, particularly in robotics, machine learning and computer vision, with rapid advancements in the field over the last five years [36]. However, although it works well for many types of neural networks, deep learning can become extremely complex in CNNs.
The architecture of CNNs is completely different to the standard architecture used in most neural networks, as demonstrated by Figure 3, where neurons are instead laid out in a square format; this also makes the network very fast to train. Three basic ideas are used in CNNs, named local receptive fields, shared weights and pooling, which together create an extremely complex network of neurons [19][5]. Deep CNNs are some of the most powerful and fastest-to-train neural networks used to date and, implemented successfully, could speed up the learning process for this study even further. However, there have been many unsuccessful attempts at creating these for tasks outside the field of figure recognition, and using CNNs alongside stochastic gradient descent and backpropagation together has been difficult to accomplish [5]. Therefore, due to this complexity and given the time limitation of the project, it would be unrealistic to create a deep learning CNN for the given scenario, and it would be far more beneficial to first explore the concept and abilities of more standard feedforward artificial neural networks. The depth of the network will also be kept shallow due to the time-frame of the project, with deep neural networks being notoriously hard to train.
Figure 3 – Local receptive field scanning in CNNs
Recurrent Neural Networks (RNNs)
It should also be noted that only feedforward neural networks are described through Section 2. These are more simplistic networks, where information is only fed forwards through the layers that comprise the neural networks, and never fed back [37]. If loops were present in the system then artificial neurons would become dependent on themselves, which is illogical. RNNs are a solution to this issue, where the artificial neurons are fired for just a limited time duration before they become dormant, and a gradient-based method utilising Long Short-Term Memory (LSTM) units is used [38].
This method of thinking more closely represents how the human brain works, which is why it is of interest to many researchers. Although RNNs have the potential to solve complex problems far more easily than feedforward networks [23], they are currently less powerful in their present state of research and difficult to train [39]. For the purpose of this report and the research that will be conducted, only feedforward networks are therefore considered; these are still very sophisticated algorithms that are more than capable of producing accurate results during the study.
Applications of AI
Global Applications
The current applications of AI reach into a huge variety of different sectors, as mentioned in section 2.1. Speech recognition has allowed developments in home devices such as smart home hubs, including Amazon’s “Alexa”, which uses long short-term memory neural networks for speech recognition [40]. Apple’s “Siri” also switched from more traditional AI methods to neural networks in 2014 [41]. Other applications include Netflix using deep learning to recommend new TV shows for viewers based on their previous viewing preferences [42]; image recognition, such as facial recognition in Facebook’s “DeepFace” and Google’s “FaceNet”, using machine and deep learning as a form of security [43]; and even investment banking, such as ING’s “Katana” using predictive analytics to aid traders in making better statistical predictions [44]. The number of AI start-ups globally is increasing annually, having grown fourteen-fold since the year 2000 [45]; together with the interest of major businesses, this further increases demand in the sector.
Automotive Applications
In terms of the specific automotive applications of AI, autonomous vehicles are the key focus. However, AI has already been successfully integrated into vehicles in many different forms. Speech recognition systems have been used to combat the dangers of drivers using their phones while driving, as well as to improve entertainment systems and outdated driver interfaces [46]. Eye tracking, driver monitoring and ADAS systems, such as “Affectiva”, “EyeSight” and “Nvidia Drive”, can also more directly improve driver safety by monitoring driver fatigue on long journeys and driver distraction, which can be particularly useful for companies controlling fleets of vehicles [47][48][49][50].
Perhaps the most important forms of AI that are relevant to the development of autonomous vehicles are camera-based machine vision systems, radar-based detection units, and sensor fusion ECUs – some of which have been around since the 1970s [51]. If the intelligent technology is able to process the data being received from these systems, in theory it should be possible to teach or allow the vehicle to learn how to drive in a specified way.
Current Developments
Some of the most famous examples of AI and neural networks being developed in vehicles today are Nvidia’s self-driving car, the Google Car and multiple Tesla vehicles, all of which are capable of driving fully autonomously. As an example, Nvidia used end-to-end deep learning with CNNs to develop and control the vehicle [52]. CNNs can now be used because far more sophisticated processing units have been developed, greatly accelerating the learning process. Nvidia’s “DAVE” and “DAVE-2” systems, based on the “ALVINN” system developed by Pomerleau [53], control the vehicle using a 9-layer CNN comprising 27 million connections and 250,000 parameters [52]. Nvidia have been an important company in the development of self-driving vehicles, with Tesla only recently developing their own AI computer to replace the Nvidia machine used in their vehicles up until August 2018 [54]. The success of this has been highlighted by their continuing growth in the automotive sector [55]. However, due to the high complexity and computing power required to use CNNs, a more traditional ANN will be developed in this study, as discussed. This project is effectively sensor-based, so standard ANNs are able to sufficiently process the 2D array of inputs that will be supplied to the network, such as vehicle speed and steering. The advantage of CNNs is their far quicker learning rate; however, a more standard ANN is capable of performing the required calculations for this project, and a CNN would make the network unnecessarily complex for the scenario considered.