Security has become more complex, with volatile scenarios and crime that can intensify rapidly. While surveillance cameras can monitor activities, sound can often be the first sign of an escalating conflict or a hostile act. Audio analytics adds “ears” to the “eyes” of an existing network video system.

The strongest security and surveillance solutions take advantage of a strategic mix of technologies to achieve the best results. Cameras are the building blocks of any surveillance or security system – the best systems will provide 360-degree, high-definition images regardless of the lighting conditions in the monitored areas.

The latest generation of cameras has processing power that allows much of the work to be done “on the edge,” in the cameras themselves, rather than in a central server. The more sophisticated in-camera analytics can provide valuable data such as people counting, queue monitoring, or cross-line detection. Network cameras in prioritized locations serve as extra eyes for any security operations centre. Threats and incidents can be detected automatically, and live video feeds from the scene can assist security personnel and authorities in verifying threats and prioritizing their efforts.

Audio as a deterrent

Integrating audio functionality brings an additional advantage to a video security system: not only can it monitor events, it may also help stop them in progress.

Imagine two people having an animated discussion late at night, in an area known to have some criminal activity. It could be a case of escalating aggression, or it might be a lively exchange between friends. It’s difficult to understand the full picture of the event without being able to grasp the nature of the conversation – not so much the actual words as the tone and pitch. Cameras equipped with advanced audio analytics can interpret the level of aggression in spoken language – even if it’s a foreign language – allowing security staff to make a better-informed decision on how to respond.

Everything from aggressive voices, gunshots, car alarms and breaking glass to the sound of graffiti spray cans in use can be detected and recognized, while other ambient noises can be filtered out to limit the risk of false alarms. The precise detection of these sounds allows for early intervention and helps authorities take the correct action. When a sound is identified, a network camera can send video and audio files, emails or other alerts, and activate external devices such as alarms or audio warnings via internal network speakers or external horn speakers. In practice, this can also trigger a PTZ (pan, tilt, zoom) camera to automatically turn to a pre-set location, such as a door or entryway, to visually verify the audio alarm.
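The detect-then-respond flow described above can be pictured as a small rule table. The following is only a minimal sketch, not any vendor's API: the sound class labels, the action names, the ignored ambient classes and the 0.8 confidence threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical sound classes mapped to hypothetical action names.
RULES = {
    "gunshot": ["send_clip", "notify_operator", "sound_alarm", "ptz_preset:entrance"],
    "breaking_glass": ["send_clip", "notify_operator", "ptz_preset:entrance"],
    "aggression": ["notify_operator", "play_warning"],
}

# Ambient classes that are filtered out to limit false alarms (assumed list).
IGNORED = {"traffic", "wind", "rain"}


@dataclass
class SoundEvent:
    label: str         # class reported by the (assumed) in-camera analytic
    confidence: float  # 0.0 to 1.0


def actions_for(event: SoundEvent, threshold: float = 0.8) -> list:
    """Return the actions to run for one detected sound, or an empty list."""
    if event.label in IGNORED or event.confidence < threshold:
        return []
    return list(RULES.get(event.label, []))
```

With these assumed rules, a confident "gunshot" event returns four actions, including the PTZ preset, while a "wind" event or a low-confidence detection returns none.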

Adding audio saves costs

The inclusion of audio analytics within a camera’s housing has transformed the modern surveillance camera into an intelligent sound recognition device. This has been made possible by the processing power in today’s smart cameras and by advanced camera application platforms that can be programmed to suit the needs of owners and any defence or law enforcement body. Putting the technology into the cameras saves cost and increases flexibility and scalability, and the analytics can also be deployed to compatible legacy camera systems. Additionally, bandwidth can be saved because there is no need to stream in full HD all the time; HD is streamed only when triggered by an audio event.
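The bandwidth point can be illustrated with a simple selection rule: stream at low resolution by default and switch to HD for a while after an audio event. This is a sketch under stated assumptions; the profile names and the 30-second hold time are hypothetical and do not come from any particular camera platform.

```python
from typing import Optional

# Assumed: keep streaming HD for this many seconds after the last audio event.
HOLD_SECONDS = 30.0


def stream_profile(now: float, last_event_time: Optional[float]) -> str:
    """Pick a stream profile: "hd" shortly after an audio event, otherwise "low"."""
    if last_event_time is None:
        return "low"  # no audio event has been seen yet
    elapsed = now - last_event_time
    return "hd" if 0 <= elapsed <= HOLD_SECONDS else "low"
```

The hold time is a trade-off: a longer window gives operators more HD footage around an incident, while a shorter one saves more bandwidth.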

More importantly, only sound characteristics are examined, not speech content. The system is therefore language independent, and it improves privacy protection because continuous recording of audio or video is no longer needed.

One example of audio being used to enhance security is in the holding cells of the police department in Billerica, Massachusetts, a suburb of Boston. Embedded in a smart camera mounted to the ceiling of every cellblock, a sound analytic detects hostile sounds, such as a person yelling. Once a sound is detected, an alert is sent to the central camera station as well as to the mobile phone of the officer in charge. The department plans to add another level of security with a network speaker that can play an audio alert like this one: “Disturbance in cell block #2. Please check.” This lets the cell occupant know that someone is coming and could limit the risk of further aggression or threats.

Another example is the Rock Hill school district in South Carolina, where audio analytics were embedded in cameras to detect hostility and keep quarrels from getting physical. In the past, someone had to push a button to call security, which could delay response time by minutes. Now, the audio analytics notify an administrator, who immediately dispatches a security officer. “That gets our response time down to seconds instead of minutes,” says Kevin Wren, Director of Risk Security Emergency Management for Rock Hill Schools. Since the microphones are mapped to the cameras, when the administrator gets an alert and the audio clip, the live video feed also appears so that the administrator can determine whether the alert is real.

Voice as a deterrent

Perimeter protection is another great example of audio’s ability to halt a criminal act in progress. Imagine a potential intruder climbing a fence. The camera triggers an audio warning to the intruder via an external network horn speaker: “We can see you, you’re trespassing. Security has been notified.” More often than not, this type of warning is sufficient, preventing the need for additional security measures. Adding speakers allows security staff to deliver live messages to perpetrators, give instructions to Good Samaritans, or reassure bystanders that help is on the way. If live interaction isn’t possible, the camera can trigger a speaker to play a pre-recorded message. The messages and their sequence can be varied to give the appearance of live monitoring. Both methods have proven effective in deterring crime, as intruders tend to exit quickly once they hear a voice.
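Varying the pre-recorded messages can be as simple as never playing the same clip twice in a row. The sketch below assumes a small pool of messages; the first is the warning quoted above, and the other two are hypothetical examples added for illustration.

```python
import random

# The first message is quoted in the text; the others are hypothetical.
MESSAGES = [
    "We can see you, you're trespassing. Security has been notified.",
    "This area is monitored. Please leave the premises now.",
    "You have entered a restricted area. Security is on the way.",
]


def next_message(previous, rng=random):
    """Choose a pre-recorded message at random, never repeating the previous one."""
    choices = [m for m in MESSAGES if m != previous]
    return rng.choice(choices)
```

Passing a seeded `random.Random` instance as `rng` makes the sequence repeatable for testing, while the default gives an unpredictable order in use.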

The inclusion of audio in a security system changes it from a reactive system, recording activities that can be viewed after the fact, to a proactive deterrence solution that adds new dimensions of insight and opportunity before or at the onset of an activity. Yes, audio might stop someone from making a bad decision or committing a crime, but it also reduces the costs associated with such an action: first response or law enforcement activity, administration costs and time, pressure on the judicial system, and insurance premiums. It can also spare all parties involved from injury.

Verification and situational awareness

For many years, false or nuisance alarms have been a major challenge for security professionals, authorities and end-users alike. The ability to verify alarms visually prior to dispatching police did much to alleviate this problem to the point where, in many municipalities, visual verification became a requirement. Adding audio analytics provides a secondary means of verification by allowing operators to both see and hear what is occurring, providing situational intelligence to first responders.

Seeing is believing, but using audio for proactive deterrence may reduce costs and save lives. Adding audio functionality makes for a more robust security system and should be considered an extra layer of protection, providing more detailed data to better tackle today’s safe-city security challenges.

– Chris Wildfoerster is the Business Development Manager for Audio Solutions at Axis Communications.
(Photos courtesy of Axis Communications)