Remote Hearing: Multimodal Human Signature Detection via PTZ, IR and LDV Sensors

Project Status: Active (Started: 2003)

Research Description

A laser Doppler vibrometer (LDV) is a non-contact, remote, high-resolution vibration sensor that can serve as a voice detector. Objects near a speaker vibrate in response to the voice, so measuring their vibration recovers the voice itself. After enhancement with Gaussian bandpass filtering and adaptive volume scaling, the LDV voice signals were mostly intelligible from targets without retro-reflective finishes at short or medium distances (< 100 m). With retro-reflective tape, the distance could be as far as 300 m. Infrared (IR) imaging for target selection and localization was also explored for LDV listening. A system has been set up with three types of sensors (IR cameras, PTZ color cameras and LDVs) to integrate multimedia sensors for human signature detection. The basic idea is to provide an advanced augmented interface that gives users the best cognitive understanding of the environment, the sensors and the events.
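The two enhancement steps named above can be illustrated with a minimal sketch. It is not the project's implementation: the center frequency, bandwidth, frame length, target peak and gain cap are all assumed values, and the function names are hypothetical. The bandpass is built as a Gaussian-windowed cosine, whose frequency response is a Gaussian centered on the chosen frequency; the volume scaling simply normalizes each frame to a target peak with a capped gain.

```python
import math

def gaussian_bandpass_kernel(center_hz, sigma_hz, sample_rate):
    """FIR kernel whose frequency response is a Gaussian centred on center_hz."""
    # A Gaussian of width sigma_hz in frequency corresponds to a Gaussian
    # envelope of width 1 / (2*pi*sigma_hz) seconds in time.
    sigma_t = 1.0 / (2.0 * math.pi * sigma_hz)
    half = int(math.ceil(4.0 * sigma_t * sample_rate))  # truncate at 4 sigma
    taps = range(-half, half + 1)
    kernel = []
    for n in taps:
        t = n / sample_rate
        envelope = math.exp(-0.5 * (t / sigma_t) ** 2)
        kernel.append(envelope * math.cos(2.0 * math.pi * center_hz * t))
    # Normalise so that a sinusoid at center_hz passes with unit gain.
    gain = sum(k * math.cos(2.0 * math.pi * center_hz * n / sample_rate)
               for n, k in zip(taps, kernel))
    return [k / gain for k in kernel]

def apply_fir(signal, kernel):
    """Zero-padded convolution that returns len(signal) samples."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, k in enumerate(kernel):
            idx = i + half - j
            if 0 <= idx < len(signal):
                acc += k * signal[idx]
        out.append(acc)
    return out

def adaptive_volume_scale(signal, frame_len, target_peak=0.9, max_gain=50.0):
    """Scale each frame towards target_peak; max_gain limits noise blow-up."""
    out = []
    for start in range(0, len(signal), frame_len):
        frame = signal[start:start + frame_len]
        peak = max(abs(x) for x in frame)
        gain = min(target_peak / peak, max_gain) if peak > 0 else 1.0
        out.extend(gain * x for x in frame)
    return out

# Usage with assumed parameters: 8 kHz sampling, passband centred at 1 kHz.
kernel = gaussian_bandpass_kernel(center_hz=1000.0, sigma_hz=300.0, sample_rate=8000.0)
noisy = [0.01 * math.sin(2 * math.pi * 1000 * n / 8000)
         + 0.05 * math.sin(2 * math.pi * 50 * n / 8000) for n in range(400)]
enhanced = adaptive_volume_scale(apply_fir(noisy, kernel), frame_len=100)
```

Per-frame gains change abruptly at frame boundaries, so a practical version would likely smooth the gain over time; the sketch omits that for brevity.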

The research challenge is that, without retro-reflective tape treatment, the LDV voice signals are still very noisy from targets at medium and large distances. Therefore, even with state-of-the-art sensor technology, more advanced signal enhancement techniques are needed, and further sensor improvement is also necessary. In addition, automatic targeting and intelligent refocusing are technical issues that deserve research attention for long-range LDV listening. To improve the performance and efficiency of LDVs for long-range hearing, we designed an active multimodal sensing platform that integrates a Pan-Tilt-Zoom (PTZ) camera, a mirror and a Pan-Tilt-Unit (PTU) with the LDV. With the assistance of the vision and active control components, the LDV can automatically select the best reflective surfaces, point the laser beam at the selected surfaces, and quickly focus the laser beam. To accomplish these functions, distance measurement and sensor calibration methods are proposed that use triangulation between the PTZ camera and the mirrored LDV laser beam. Based on both the measured distances and the return signal levels of the LDV, a fast and automatic LDV focusing algorithm is designed. Furthermore, strategies for surface selection and laser pointing are designed so that the platform can automatically point the laser beam at the designated surfaces.
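As a rough illustration of how triangulation and the return signal level could work together for focusing, the sketch below assumes a simplified planar geometry and a hypothetical hardware interface; it is not the algorithm from the publications. The baseline between the camera and the mirror, the two angles, the calibration table mapping distance to focus step, and the `read_signal_level` callback are all assumptions made for the example. The distance from the mirror to the laser spot follows from the law of sines, the distance seeds a focus position, and a short local search keeps the step with the strongest return signal.

```python
import math

def triangulate_range(baseline_m, cam_angle_rad, laser_angle_rad):
    """Distance from the laser mirror to the spot, by the law of sines.

    Both angles are measured from the camera-to-mirror baseline (planar case).
    """
    apex = math.pi - cam_angle_rad - laser_angle_rad  # angle at the laser spot
    if apex <= 0:
        raise ValueError("rays do not intersect in front of the sensors")
    return baseline_m * math.sin(cam_angle_rad) / math.sin(apex)

def predict_focus_step(distance_m, calibration):
    """Linearly interpolate a focus step from (distance_m, step) pairs.

    The calibration table is hypothetical and must have distinct distances.
    """
    pts = sorted(calibration)
    if distance_m <= pts[0][0]:
        return pts[0][1]
    if distance_m >= pts[-1][0]:
        return pts[-1][1]
    for (d0, s0), (d1, s1) in zip(pts, pts[1:]):
        if d0 <= distance_m <= d1:
            frac = (distance_m - d0) / (d1 - d0)
            return round(s0 + frac * (s1 - s0))

def refine_focus(read_signal_level, start_step, search_radius=20):
    """Search near start_step for the step with the highest return signal."""
    best_step = start_step
    best_level = read_signal_level(start_step)
    for step in range(start_step - search_radius, start_step + search_radius + 1):
        level = read_signal_level(step)
        if level > best_level:
            best_step, best_level = step, level
    return best_step, best_level

# Usage with made-up numbers; fake_level stands in for the LDV signal meter.
def fake_level(step):
    return 100 - abs(step - 1234)

distance = triangulate_range(0.5, math.radians(89.0), math.radians(90.0))
seed = predict_focus_step(33.0, [(10.0, 1000), (50.0, 1400)])
step, level = refine_focus(fake_level, seed)
```

Seeding the search from the measured distance is what keeps it short: only a small window of focus steps around the prediction has to be probed instead of the full focus range.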

Hear some voice clips captured by the LDV sensor

Related Publications and Links

  1. Y. Qu, T. Wang and Z. Zhu, An Active Multimodal Sensing Platform for Remote Voice Detection, IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM 2010), Montreal, July 6-9, 2010
  2. T. Wang, Z. Zhu, and A. Divakaran, Long-Range Audio and Audio-Visual Event Detection Using a Laser Doppler Vibrometer, SPIE Defense, Security and Sensing: Evolutionary and Bio-Inspired Computation: Theory and Applications IV, April 2010
  3. Y. Qu, T. Wang and Z. Zhu, Remote Audio/Video Acquisition for Human Signature Detection, The 3rd IEEE CVPR Biometrics Workshop, June 25, 2009
  4. Z. Zhu, W. Li, E. Molina and G. Wolberg, LDV Sensing and Processing for Remote Hearing in a Multimodal Surveillance System, Chapter 4 in Multimodal Surveillance: Sensors, Algorithms and Systems, Z. Zhu and T. S. Huang (eds), ISBN-10: 1596931841, Artech House Publisher, July 2007, pp 59-90.
  5. W. Li, M. Liu, Z. Zhu and T. S. Huang, LDV Remote Voice Acquisition and Enhancement, International Conference on Pattern Recognition (ICPR’06), Hong Kong, China, August 2006
  6. W. Li, Z. Zhu and G. Wolberg, Remote Voice Acquisition in Multimodal Surveillance, IEEE International Conference on Multimedia & Expo (ICME), Toronto, Canada, July 9-12, 2006 (oral presentation, acceptance rate 22%)
  7. Z. Zhu, W. Li and G. Wolberg, Integrating LDV audio and IR video for remote multimodal surveillance, The 2nd Joint IEEE International Workshop on Object Tracking and Classification in and Beyond the Visible Spectrum (OTCBVS’05), San Diego, CA, US, Monday June 20, 2005
  8. Z. Zhu and W. Li, Integration of laser vibrometer and infrared video for multimedia surveillance display, TR-2005006, Computer Science Department, Graduate Center, City University of New York, April 2005. (More voice clips are available by clicking the above link.)


  • Dr. Ajay Divakaran, Sarnoff Corporation
  • Professor Thomas Huang, University of Illinois at Urbana-Champaign (UIUC)
  • Professor Ning Xiang, Rensselaer Polytechnic Institute (RPI)
  • Professor George Wolberg, Department of Computer Science, The City College of New York

Research Associate and Assistants at CUNY

Yufu Qu, Tao Wang, Edgardo Molina, Rui Li, Wai L. Khoo, Weihong Li


NCIIA E-Team Award, “Automating Long-range Vibrometry through Vision and Web Technologies” (#6629-09), PI: Z. Zhu, 09/01/2009-01/31/2011

AFRL/HECB, Award No. F33615-03-1-63-83, Integration of Laser Vibrometry with Infrared Video for Multimedia Surveillance Displays, 08/24/03-10/24/04, PI: Z. Zhu

CUNY Research Equipment Grant Award, Integration of Laser Vibrometry, Infrared and Video for Multimodal Human Detection, 02/19/04-02/18/05, Co-PIs – Z. Zhu and G. Wolberg