Current methodologies to reduce the rate of false alarms have been moderately successful, but without ways to radically and accurately filter alarms, the absolute numbers will continue to rise. Recent advances in AI-enabled video surveillance, particularly deep learning algorithms, show potential in curbing false alarms; in fact, efforts to hone and commercialize this technology have produced promising initial results, and large-scale adoption across the industry is projected for the next two to four years.
An Introduction to Computer Vision
Computer vision is a potential solution to the false alarm problem; however, it must meet certain accuracy thresholds to be viable. In other words, computer vision should be solving the false alarm problem instead of contributing to it.
Virtually any attempt by computers to replicate an action of human intelligence falls into the broader field of artificial intelligence (AI), which covers applications as diverse as video game behavior and modeling human vision. Until very recently, human-level computer vision was not considered possible, largely because of the approach taken.
Previous approaches have been based on the concept of “symbolic AI” – where programmers create symbols and then add rules to manipulate those symbols until the computer is capable of making “intelligent” decisions; for example, programming chess pieces into a chess AI.
Real-world scenes have proven too challenging to replicate this way, as programmers would need to code rules for every item in the scene; in fact, figuring out what would need to be coded is a problem in itself. How can a programmer write the rules for what would be considered “a change in angle perspective” or “a lengthening shadow from the setting sun”?
New deep learning and machine learning algorithms can address these problems, thanks to the development of powerful computer hardware capable of running complex calculations. Programmable hardware from GPU makers such as Nvidia and Qualcomm, and flexible cloud providers, such as Amazon and Google, have enabled these approaches to become practical for commercial and personal use.
Artificial Neural Networks are structured like cakes, with deeply stacked layers (10-30 layers, thus the ‘deep’ in ‘deep learning’) of artificial neurons. They are called ‘neurons’ because they have been programmed to mimic the behavior of biological neurons, the type that can be found in human brains. When deployed in sufficiently large numbers, these neural networks can effectively analyze large video scenes for human figures within them.
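The layered structure described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the layer sizes, the random untrained weights, and the function names are hypothetical, and a network that actually recognizes people is trained on labeled images and is far larger. The sketch only shows how each artificial neuron combines its inputs and how layers stack, with the output of one layer feeding the next.

```python
import math
import random

def make_layer(n_inputs, n_neurons, rng):
    """Create one layer: each neuron has one weight per input plus a bias."""
    return [
        {
            "weights": [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)],
            "bias": rng.uniform(-1.0, 1.0),
        }
        for _ in range(n_neurons)
    ]

def neuron_output(neuron, inputs):
    """Weighted sum of the inputs plus a bias, squashed into (0, 1) by a sigmoid."""
    total = sum(w * x for w, x in zip(neuron["weights"], inputs)) + neuron["bias"]
    return 1.0 / (1.0 + math.exp(-total))

def build_network(layer_sizes, seed=0):
    """Stack layers so that each layer's output size is the next layer's input size."""
    rng = random.Random(seed)
    return [
        make_layer(n_in, n_out, rng)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]

def forward(layers, inputs):
    """Feed the inputs through every layer in order and return the final outputs."""
    activations = inputs
    for layer in layers:
        activations = [neuron_output(neuron, activations) for neuron in layer]
    return activations

# Hypothetical shape: 4 input values, ten hidden layers of 8 neurons, 1 output.
network = build_network([4] + [8] * 10 + [1])
score = forward(network, [0.2, 0.7, 0.1, 0.9])[0]
print(f"{len(network)} stacked layers, output score {score:.3f}")
```

Because the weights here are random rather than learned, the score carries no meaning; training is what turns a stack like this into a detector.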
These visual technologies enable monitoring software to ignore non-human triggers such as vegetation and animals – even if they are actively moving – and focus only on the actions of the humans within a surveillance video.
Demonstrating the efficacy of computer vision technologies requires extended real-world trials. A large alarm monitoring center with hundreds of clients and customers recently ran one such trial for a subsection of its locations. For this trial, an AI-enabled CCTV device was trained and tuned to detect human intrusions into an area the customer wanted restricted. When the neural network scanning the video frames had a high enough certainty that a human had indeed crossed into the restricted area, it sent an alert, highlighting the intruder in the relevant frames.
To create a control group for comparison, the center installed a Passive Infrared (PIR) motion sensor next to each AI-enabled video surveillance device. When either the surveillance device or the PIR sensor was triggered, the systems logged the circumstances and paired a relevant video clip. Analysts then reviewed the clips to determine whether or not the alert had legitimately found a person.
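The alerting rule described above (a person, detected with enough certainty, inside the restricted area) can be sketched as a simple filter. This is a minimal illustration under assumptions, not the trial's actual software: the `Detection` record, the rectangular zone, and the 0.8 confidence threshold are all hypothetical, and the detections themselves are presumed to come from a neural network that is not shown.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str          # what the model thinks it sees, e.g. "person" or "dog"
    confidence: float   # model certainty, from 0.0 to 1.0
    box: tuple          # (x_min, y_min, x_max, y_max) in pixels

def boxes_overlap(a, b):
    """True when two axis-aligned boxes share any area."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def should_alert(detections, restricted_zone, min_confidence=0.8):
    """Return the detections that justify an alert; an empty list means no alert."""
    return [
        d for d in detections
        if d.label == "person"
        and d.confidence >= min_confidence
        and boxes_overlap(d.box, restricted_zone)
    ]

# Hypothetical frame: one dog, one uncertain person, one confident person
# inside the zone, and one confident person outside it.
zone = (100, 100, 400, 300)
frame_detections = [
    Detection("dog", 0.95, (150, 150, 220, 210)),
    Detection("person", 0.55, (120, 110, 180, 280)),
    Detection("person", 0.91, (300, 120, 360, 290)),
    Detection("person", 0.97, (500, 120, 560, 290)),
]
intruders = should_alert(frame_detections, zone)
print(len(intruders), "detection(s) would be highlighted in the alert")
```

Raising the threshold trades missed intrusions for fewer false alarms, which is why a system like this has to be tuned per site.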
Test cases were deployed across a variety of indoor and outdoor commercial areas, including the perimeter of a car sales lot, an indoor storage room and the front gate of a warehouse – to name a few. Customers were not informed that their location had been selected as a test ground.
The video clips were recorded in either infrared or traditional video. The majority of the events (more than 70 percent) occurred at night. When a PIR sensor or light detected a potential intrusion, the cameras captured clips from both sensors at the instant the alert was sent. Afterwards, more than 110 alerts delivered by either the PIR control system or the AI-enabled surveillance device were manually examined to determine which device had correctly identified the presence of a person.
Out of 110 alerts, the test found that the AI-enabled systems were able to filter out 86 percent of false alarms – dropping the number of false alarms from 88 to 13 vs. running only PIR. The most common false alarms were tied to the presence of a dog (28 percent of all filtered out incidents), light or shadows (15 percent), or insects (14 percent).
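As a quick check on the arithmetic, the reduction can be recomputed from the two counts quoted above. The sketch uses only those figures (88 false alarms with PIR alone, 13 with the AI-enabled system); the function name is hypothetical. On these counts alone the reduction comes to roughly 85 percent, close to the reported 86 percent; the small gap may come from rounding or from details of the trial not given here.

```python
def reduction_percent(before, after):
    """Percentage drop from `before` to `after`."""
    return 100.0 * (before - after) / before

pir_false_alarms = 88   # false alarms when running only PIR (from the text)
ai_false_alarms = 13    # false alarms remaining with the AI-enabled system

filtered_out = pir_false_alarms - ai_false_alarms
percent = reduction_percent(pir_false_alarms, ai_false_alarms)
print(f"{filtered_out} false alarms filtered out, a {percent:.1f} percent reduction")
```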
While neural networks are promising, their use presents several drawbacks. Bandwidth is one of the primary issues, as most cameras need to be connected to the internet and continually stream video data to remote servers, where the actual neural network performs its human segmentation activities. These cameras thus operate in the same family as other “cloud-based” systems that require the presence of an always-on connection.
Running these products and neural networks is difficult without an always-on connection. A single networked camera recording high-quality video might upload 50-100 GB a month, depending on the resolution.
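To put the monthly figure in perspective, it can be converted to and from an average streaming bitrate. The sketch below assumes continuous streaming, a 30-day month, and decimal gigabytes (1 GB = 10^9 bytes); the function names are hypothetical. Under those assumptions, 50 to 100 GB a month corresponds to an average of roughly 150 to 310 kilobits per second.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def monthly_upload_gb(bitrate_kbps, days=30):
    """Data uploaded in `days` of continuous streaming, in decimal GB."""
    bits = bitrate_kbps * 1000 * SECONDS_PER_DAY * days
    return bits / 8 / 1e9

def average_bitrate_kbps(gb_per_month, days=30):
    """Average bitrate implied by a monthly upload volume in decimal GB."""
    bits = gb_per_month * 1e9 * 8
    return bits / (SECONDS_PER_DAY * days) / 1000

for gb in (50, 100):
    print(f"{gb} GB a month is about {average_bitrate_kbps(gb):.0f} kbps on average")
```

The same function run the other way shows how quickly volume grows with quality: a hypothetical 2,000 kbps stream kept up all month would come to several hundred gigabytes.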
Installations with data-capped internet service may find this to be an issue. If the connection is not fast enough and the stream is too large, the camera will start to “fall behind” as the incoming data piles up.
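The "falling behind" effect is simple arithmetic: whenever the camera produces data faster than the connection can carry it, the surplus accumulates. The sketch below assumes constant stream and uplink rates, which real networks rarely have; the rates and function names are hypothetical.

```python
def backlog_kilobits(seconds, stream_kbps, uplink_kbps):
    """Kilobits still waiting to upload after `seconds` of continuous recording."""
    surplus_kbps = max(0, stream_kbps - uplink_kbps)
    return surplus_kbps * seconds

def lag_seconds(seconds, stream_kbps, uplink_kbps):
    """How many seconds of video the upload has fallen behind live."""
    return backlog_kilobits(seconds, stream_kbps, uplink_kbps) / stream_kbps

# Hypothetical case: a 2,000 kbps stream over a 1,500 kbps uplink for one hour.
lag = lag_seconds(3600, stream_kbps=2000, uplink_kbps=1500)
print(f"After one hour the upload is {lag / 60:.0f} minutes behind live")
```

In this example the delay grows by 15 minutes for every hour of recording, so an alert based on the uploaded video would arrive later and later.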
Bypassing the cloud is impractical for now, as the chips available are not powerful enough to run these high-performing models at the edge, on board the device. As hardware manufacturers produce more capable silicon for these AI models, it becomes more feasible to run the neural networks directly on the device. Being able to run without an internet connection offers substantial benefits to users and service providers alike.
Shawn Guan is the CEO and co-founder of Umbo Computer Vision. Request more info about the company at www.securityinfowatch.com/12400317.