What is a neuromorphic vision system?
As the name implies, neuromorphic vision systems are inspired by aspects of the design of your retina and brain. In particular, your retina does not send pictures – frames – to the brain. Rather, it pre-processes the light, transmitting only changes in light intensity (to a highly simplified first approximation). We have replicated this and other aspects of the human visual system in a technology we call the Dynamic Vision Sensor (DVS). In the DVS, every pixel works independently of the other pixels in analog mode, providing an event-based stream of changes as the raw sensor output.
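To make the idea concrete, here is a minimal sketch of this pixel model. The event fields (x, y, microsecond timestamp, polarity) match how event cameras are commonly described; the contrast threshold, function names, and the log-intensity sampling model are illustrative assumptions, not iniVation's actual circuit design.

```python
import math
from dataclasses import dataclass

@dataclass
class Event:
    x: int             # pixel column
    y: int             # pixel row
    timestamp_us: int  # microsecond timestamp
    polarity: int      # +1 = brightness increase, -1 = decrease

def pixel_events(x, y, intensities, times_us, threshold=0.15):
    """Illustrative model of one independent DVS pixel.

    The pixel tracks log intensity and emits an event each time the
    level moves by more than `threshold` from its last reference point.
    Constant illumination produces no events at all.
    """
    events = []
    ref = math.log(intensities[0])
    for i_val, t in zip(intensities[1:], times_us[1:]):
        level = math.log(i_val)
        # Emit one event per threshold crossing since the last reference.
        while abs(level - ref) >= threshold:
            pol = 1 if level > ref else -1
            events.append(Event(x, y, t, pol))
            ref += pol * threshold
    return events
```

A pixel held at constant brightness yields an empty event list, which is the source of the sparsity described below: only changing pixels produce data.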
Processing visual information in this way has three distinct advantages. The first is that the output data stream is very sparse, as only local intensity changes are being encoded. This saves on compute and energy. The second advantage is that the system can respond very fast, typically within tens of microseconds, because the pixels are operating continuously in analog mode instead of waiting for successive frames to be captured. The third advantage is the very high dynamic range of the system – around 120 dB – due to the fact that the changes being encoded are always relative intensities.
DVS technology came out of over 20 years of research at the Institute of Neuroinformatics at the University of Zurich and ETH Zurich in Switzerland, from where we founded iniVation in 2015. The DVS is an example of a visual system that works more like our own eyes and brains do, and it shows how building systems this way pays off in real-world advantages.
How is neuromorphic vision different from conventional machine vision systems?
Standard machine vision systems solve problems by brute force. Frames are collected and processed, without regard for the highly redundant information within and between frames. This approach is very inefficient in terms of compute, and also suffers from the speed and dynamic range issues inherent to standard machine vision.
What industries do you envision this technology being used in?
We are finding applications almost everywhere in machine vision. Some key areas of interest include factory automation and robotics, where the emphasis is on fast response times. In other areas such as IoT, the benefits of DVS include low power consumption and high dynamic range. In the automotive industry there is high interest in vision both within and outside the car.
How does this technology enable you to use smaller and cheaper computers?
The key is the combination of the sensor and a new class of event-based algorithms. Conventional vision processes every frame individually, and then tries to match corresponding features between frames – a highly inefficient, slow process. Instead, using DVS, every pixel coming in provides meaningful information. This allows certain algorithms, like tracking for example, to be implemented very easily using local computation because you are only seeing what is moving in the scene, and those movements are always near to where the last movement was. In this way you can end up with much more compact, efficient kinds of algorithms.
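The locality argument above can be sketched in a few lines. This is a hypothetical, simplified event-based tracker of my own construction, not iniVation's algorithm: each incoming event either falls near the current position estimate and nudges it, or is ignored, so every update is constant-time and no frame-wide feature matching is needed.

```python
def track(events, start, radius=5.0, alpha=0.1):
    """Follow a moving object using only nearby events.

    events: iterable of (x, y) event coordinates in arrival order.
    start:  initial (x, y) position guess.
    Events within `radius` of the current estimate pull it toward them
    (exponential moving average with gain `alpha`); all others are
    ignored. Because motion between consecutive events is tiny, this
    purely local rule is enough to keep up with the target.
    """
    cx, cy = start
    for x, y in events:
        if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
            cx += alpha * (x - cx)
            cy += alpha * (y - cy)
    return cx, cy
```

The design choice to discard far-away events is what makes the computation compact: there is no global search step, only a distance check and two multiply-adds per event.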
Is this therefore an AI-based learning system or is this separate from that?
There are at least two approaches for incorporating AI into DVS. The first approach is to use the events to create synthetic frames and feed them into existing deep networks. This method works, and has the advantage that the frames are only generated on demand (no events = no frames). In addition, those frames also have high dynamic range and are very sparse. The second approach is to develop a class of event-based AI algorithms, which are far more efficient than existing deep learning systems. This work is in the early stages, but we are expecting to be able to reduce computing requirements by two orders of magnitude.
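The first approach (events to synthetic frames) can be illustrated with a short sketch. This accumulation scheme is one common, generic way to build event frames, offered here as an assumption about the method rather than a description of iniVation's pipeline; the function name is my own.

```python
import numpy as np

def events_to_frame(events, width, height):
    """Accumulate polarity events into a signed 2-D histogram.

    events: iterable of (x, y, polarity) tuples with polarity in {+1, -1}.
    The "frame" is built only when requested (no events = an all-zero
    frame), and it stays zero wherever nothing changed, so it is sparse
    and encodes relative change rather than absolute brightness.
    """
    frame = np.zeros((height, width), dtype=np.int32)
    for x, y, pol in events:
        frame[y, x] += pol
    return frame
```

A frame like this can then be fed to a conventional deep network, as described above, while inheriting the sensor's sparsity and dynamic-range properties.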
Are you working with any customers currently that you can talk about?
Currently everything is under NDA, but we can make a few general statements. Our technology is in use at over 200 organizations worldwide. Half of the top 10 car companies are evaluating the sensor, both for their concept cars and their internal production needs. Half of the top 10 consumer electronics companies are working on projects related to mobile phones and other types of devices. Another big area is automation, as well as aerospace for space and satellite observation.
Will this make traditional machine vision obsolete as other industries find out about it and move toward this more specialized approach rather than the brute-force method?
I believe our DVS technology will become a standard in machine vision, as the need to get more compute for your dollar (or Watt) becomes more and more important. It's natural that the advantages of our technology will become more and more widespread.
I believe that many future image sensors will have a combination of existing pixels – which are great at taking static pictures – and our pixels, which excel at dealing with motion. A similar idea is already starting to appear in some image sensors, where sensor manufacturers put auto-focus pixels directly into the pixel array. In my opinion this trend will continue, incorporating more computation directly into the image sensor using technologies such as 3D stacking and so on. Computational photography – currently a hot topic for capturing high-quality static images – will start to merge with the DVS methods we are pioneering. Computer vision will become more and more a mix of software and highly specialized hardware operating closer and closer to the image plane. This trend will make image sensors more and more like our own retinas.