With at least three major accidents in 2018, autonomous driving has witnessed some recent setbacks. Infrared imaging, also known as thermal imaging, can reduce the likelihood of some kinds of accidents.
Autonomous driving is no longer a concept confined to science fiction. It has become a fast-developing industry, growing rapidly thanks to improved computing capabilities and the availability of cost-effective sensors.
By the 2030s, fully autonomous vehicles are expected to be commonplace on the road. This disruptive technology will have long-lasting impacts on today’s lifestyle and economy.
Dangers on the Road to Autonomy
Several technical challenges must be overcome before fully autonomous driving reaches the levels of adoption at which it can become truly disruptive. Sensor fusion – the automatic combining of multiple imaging modalities – is required for full detection of a vehicle’s surroundings.
The sensors generally used to achieve this in current autonomous vehicle designs are LiDAR, radar, ultrasonic sensors, and charge-coupled device (CCD) cameras. In theory, full environmental detection should be possible using these sensors. But as is often the case, theory and practice diverge: current detection system architectures suffered a number of major failures in 2018 alone.
Tesla vehicles operating on Autopilot struck a stationary fire truck and a concrete lane divider on January 22nd and March 23rd respectively, with the latter resulting in the death of the driver. In Arizona, on March 18th, an Uber self-driving Volvo – equipped with LiDAR, radar, and CCD cameras – struck a pedestrian crossing a road.
The deadly collision occurred because of limited nighttime visibility – the LiDAR and radar systems failed to detect the crossing pedestrian properly. Had the vehicle also been equipped with an infrared thermal imaging camera, the system’s detection rate would likely have been higher, and the pedestrian’s life might have been saved.
Why Infrared/Thermal Imaging?
Infrared (IR) is one of the seven conventional regions of the electromagnetic spectrum. All objects with temperatures above absolute zero emit infrared light, and the intensity of this emission correlates with the object’s temperature: the higher the temperature, the more IR the object emits.
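This temperature dependence can be illustrated with the Stefan–Boltzmann law, under which the total power an ideal blackbody radiates per unit area scales with the fourth power of its absolute temperature. The sketch below uses illustrative temperatures (roughly human skin versus a cold winter road); real objects are not ideal blackbodies, so the numbers are indicative only:

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W / (m^2 K^4)

def radiant_exitance(temp_kelvin: float) -> float:
    """Total power radiated per unit area by an ideal blackbody (W/m^2)."""
    return SIGMA * temp_kelvin ** 4

# A warm pedestrian (~310 K) radiates noticeably more than a cold road (~270 K),
# which is what makes warm bodies stand out in a thermal image.
pedestrian = radiant_exitance(310.0)
road = radiant_exitance(270.0)
print(f"pedestrian: {pedestrian:.0f} W/m^2, road: {road:.0f} W/m^2")
```

The roughly 70% difference in radiated power between the two temperatures is what produces the strong thermal contrast exploited by IR cameras.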
Thermal imaging exploits IR radiation by converting this “invisible” light into a visible image perceivable by the human eye. The image can then be used to detect and classify objects in a manner similar to the detection and classification methods used with CCD cameras.
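A minimal sketch of this conversion step, assuming the sensor delivers raw intensity counts per pixel (the function name and the linear 0–255 mapping are illustrative; real cameras apply calibrated, often nonlinear, tone mapping):

```python
def to_grayscale(raw, lo=None, hi=None):
    """Linearly map raw thermal sensor counts to 0-255 display values.

    `lo` and `hi` bound the mapped range; values outside it are clamped.
    If omitted, they default to the min/max of the input.
    """
    lo = min(raw) if lo is None else lo
    hi = max(raw) if hi is None else hi
    span = max(hi - lo, 1)  # avoid division by zero on a flat scene
    return [round(255 * (max(min(v, hi), lo) - lo) / span) for v in raw]

# Hot spots map toward white (255), cold regions toward black (0):
print(to_grayscale([100, 150, 200]))
```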
Infrared can be leveraged in a number of ways to enhance existing autonomous sensory systems:
- Infrared gives more detail on objects than LiDAR and radar
- Shadows have minimal effect on the image
- It does not depend on visible light, giving better performance at night
- Warm-bodied objects exhibit high contrast in cold environments, making them easy to identify
- There are no blooming effects from lights in the camera’s field of view (FOV)
Thermal imaging falls short in several areas: image information content is low when no warm bodies are present; cost is high, although economies of scale could lower it; image detail is limited; and image appearance, along with the algorithms built on it, depends on weather and time-of-day conditions.
A direct comparison of a CCD camera image and IR image taken from a vehicle is illustrated below in Fig. 1.
Figure 1. Top image: winter scene taken by a CCD camera; bottom image: the same scene shown in IR spectrum. An example of how IR adds further information to a CCD camera.
All of the relatively warm areas in front of the vehicle can be seen in the lower image. By comparing the two images and inspecting the warm spots, four pedestrians can be located easily.
A vague warm profile in the IR image can be confirmed as a pedestrian by comparing it directly with the CCD image, and conversely, any pedestrian-like shape in the CCD image can be confirmed by matching it with a warm spot in the IR image. In this instance, the IR and CCD images are redundant, as both can be used to locate the pedestrians.
This imaging redundancy is beneficial, as the information content from both images is different and can be employed to detect and classify targets with a higher probability. This is the concept of sensor fusion in practice.
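One simple way to sketch this kind of fusion is to combine per-sensor confidence scores. The rule below assumes the two detectors make independent errors (an idealization for illustration, not the specific method described in the article):

```python
def fused_confidence(p_ccd: float, p_ir: float) -> float:
    """Probability that at least one of two independent detectors
    correctly flags a target, given each detector's own confidence."""
    return 1.0 - (1.0 - p_ccd) * (1.0 - p_ir)

# Two moderately confident detections fuse into a much stronger one:
print(fused_confidence(0.7, 0.8))
```

Even with modest individual detection probabilities, the fused probability is higher than either sensor alone, which is the practical payoff of redundant modalities.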
Getting from Sensing to Detecting
Detecting objects in an image is a separate challenge from classifying them. Detection means deciding whether an object of interest is present in the frame and enclosing that object inside a shape (normally a rectangular box).
An example is exhibited in Fig. 2. This detection was performed on a CCD image using a frame-difference technique called background subtraction. The detections shown are imperfect: the two pedestrians in the bottom right have been merged into a single object.
Figure 2. An example of detection of pedestrians by a vision system.
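Frame-difference background subtraction can be sketched in a few lines. The version below, which represents a grayscale frame as nested lists and uses an illustrative threshold, draws one box around all changed pixels; as in Fig. 2, nearby objects can therefore end up merged into a single detection:

```python
def detect_motion(background, frame, threshold=30):
    """Frame-difference detection: return the bounding box (x0, y0, x1, y1)
    of pixels differing from the background by more than `threshold`,
    or None if nothing in the frame has changed."""
    xs, ys = [], []
    for y, (bg_row, fr_row) in enumerate(zip(background, frame)):
        for x, (bg, fr) in enumerate(zip(bg_row, fr_row)):
            if abs(fr - bg) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))
```

Production systems refine this idea with adaptive background models and connected-component labeling so that separate objects receive separate boxes.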
The object classification observed in Fig. 1 is a trivial process for a human examining the images. For a computer tasked with identifying the objects, however, the problem is far from trivial.
From birth, people absorb characteristics of their environment and can classify objects quickly based on years of experience and exposure to them. This classification cannot be done in machine vision unless the algorithm has been trained to recognize distinct images or image features.
In the case of a pedestrian for example, there are many ways they can appear in an image. A pedestrian could be partially occluded, walking behind another pedestrian, wearing different clothes, holding bags, etc.
Expanding this to all living things makes the problem enormous, and classification algorithms must be trained to address it. After an object has been detected, it must be assigned a classification label. Using a neural network (NN) is one way to automate this.
In the simplest sense, NNs perform pattern recognition on an input and output the probability of that input matching something from the training dataset. To train an NN, a labeled dataset (labeled detections, in this case) is fed through the network, which learns to associate certain images with certain labels. The resulting classifications can then be compared against results from other sensors to improve detection rates.
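The train-then-label loop can be illustrated with the simplest possible "network": a single perceptron fitted to hypothetical two-number feature vectors (for example, warmth and size scores). This toy stands in for the much larger networks used in practice:

```python
def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Fit weights w and bias b so that sign(w.x + b) matches the labels
    (+1 = pedestrian, -1 = other). Classic perceptron update rule."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:  # only misclassified samples trigger an update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Label a new feature vector with the trained weights."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

After training on labeled examples, `predict` assigns a label to each new detection, exactly the "attach images to labels" step described above, in miniature.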
The detect-then-classify technique described here is one of the simpler approaches. A number of algorithms compute the detection and classification simultaneously, employing a type of neural network called a convolutional neural network (CNN).
These networks take advantage of mid-level features instead of the low-level features that a detect-then-classify algorithm may use.
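The feature extraction at the heart of a CNN is the convolution operation. A minimal sketch (valid mode, no padding or stride, and technically cross-correlation, as in most deep-learning libraries), with a hypothetical vertical-edge kernel:

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (valid mode) and sum the
    elementwise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(len(image) - kh + 1):
        row = []
        for x in range(len(image[0]) - kw + 1):
            row.append(sum(image[y + i][x + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

# A [-1, 1] kernel responds strongly where intensity jumps left-to-right,
# i.e. at a vertical edge:
print(convolve2d([[0, 0, 9, 9],
                  [0, 0, 9, 9]], [[-1, 1]]))  # → [[0, 9, 0], [0, 9, 0]]
```

In a real CNN, many such kernels are learned from data and stacked in layers, which is how mid-level features emerge from raw pixels.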
Infrared imaging provides additional information to the sensory systems already used in autonomous driving. This thermal information can be used to detect and classify a target, or to confirm whether a classification assigned by another sensor is probable.
By implementing IR imaging in autonomous driving, environmental information content can be increased, and accidents like the death of the pedestrian in Arizona could be avoided.
This information has been sourced, reviewed and adapted from materials provided by Teledyne DALSA.