Applied Imagery Pattern Recognition Workshop
Dedicated to facilitating the interchange of ideas between government, industry and academia in an elegant setting conducive to technical interchange across a broad range of disciplines

2014 IEEE Applied Imagery Pattern Recognition Workshop

AIPR 2014: Democratization of Imagery
Cosmos Club, Washington DC

October 14-16, 2014

AIPR 2014 is financially sponsored by the IEEE Computer Society. Technical sponsorship and support is provided by IEEE and the IEEE Computer Society Technical Committee on Pattern Analysis and Machine Intelligence. The AIPR Committee thanks Cybernet Systems Corporation, Integrity Applications Incorporated and ObjectVideo Incorporated for their generous support of this year's workshop.


The AIPR Officers:
Chairman: Neelam Gupta, ARL
Deputy Chair: John Irvine, Draper Laboratory
Program Chairs: Guna Seetharaman, AFRL/RI, Rome, NY; Karl Walli, Col USAF
Secretary: James Aanstoos, Mississippi State University
Treasurer: Al Williams, Self-Employed Engineering Consultant
Associate Treasurer: Carlos Maraviglia, NRL
Local Arrangements: Donald J. Gerson, Gerson Photography
Publicity: Peter Costianes, AFRL, Emeritus
Web Master: Charles J. Cohen, Cybernet
External Support: Pete Doucette, Integrity Applications Incorporated
Registration Chair: Jeff Kretsch, Raytheon BBN Technologies
Proceedings Chair: Franklin Tanner, Hadron Industries
Student Paper Award Chairs: Murray Loew, GWU; Paul Cerkez, DCS Corp.

AIPR Committee Members: Jim Aanstoos, Mississippi State University; Eamon Barrett, Retired; Bob Bonneau, AFOSR; Filiz Bunyak, University of Missouri-Columbia; John Caulfield, Cyan Systems; Paul Cerkez, DCS Corp.; Charles Cohen, Cybernet; Peter Costianes, AFRL, Emeritus; Peter Doucette, Integrity Applications Incorporated; Roger Gaborski, RIT; Donald J. Gerson, Gerson Photography; Neelam Gupta, ARL; Mark Happel, Johns Hopkins University; John Irvine, Draper Laboratory; Steve Israel, Integrity Applications/ONR; Michael D. Kelly, IKCS; Jeff Kretsch, Raytheon BBN Technologies; Murray H. Loew, GWU; Carlos Maraviglia, NRL; Paul McCarley, AFRL Eglin AFB; Robert Mericsko, Booz Allen Hamilton; Keith Monson, FBI; Carsten Oertel, MITRE; William Oliver, Brody School of Medicine; K. Palaniappan, University of Missouri-Columbia; James Pearson, Remote Sensing Consultant; Surya Prasath, University of Missouri-Columbia; Amit Puntambekar, Intel Corporation; Mike Pusateri, LSI Corporation; Harvey Rhody, RIT; David Schaefer, GMU; Alan Schaum, NRL; Guna Seetharaman, AFRL/RIEA; Franklin Tanner, Hadron Industries; Karl Walli, AFIT; Elmer "Al" Williams, Self-Employed Engineer

Emeritus: Larry Clarke, NCI; Robert Haralick, City University of New York; Heidi Jacobus, Cybernet; Joan Lurie, GCC Inc.; J. Michael Selander, Mitre


2014 AIPR Recognizer Sponsors

11600 Sunrise Valley Drive, Suite 210, Reston, VA 20191 USA 1(703) 654 9300

Proud Sponsor of the Applied Imagery Pattern Recognition Workshop

www.objectvideo.com Twitter @OVLabs


IEEE AIPR-2014 – Democratization of Imagery. Cosmos Club, Washington DC – Oct 14-16, 2014

Day 1: October 14 – Tuesday

8:00 AM Check-in
8:30 Opening remarks and welcome notes. Neelam Gupta, Don Gerson & Karl Walli (10 min)
8:40 Keynote: Dr. Peter Highnam, Intelligence ARPA: An Overview (30 min)
9:10 Q&A (5 min)
9:15 AM - 10:15 AM. Session 1: Image and Video Analysis. Session Chair: Col Karl Walli, USAF. 4 talks, 15 min each including Q&A.
P1 9:15 Lin, Chung-Ching; Pankanti, Sharath. Accurate Coverage Summarization of UAVs
P2 9:30 Madden, Don. Mobile ISR: Intelligent ISR Management and Exploitation for the Expeditionary Warfighter
P3 9:45 Pritt, Mark. Fast Orthorectified Mosaicking of Thousands of Aerial Photographs from Small UAVs
P4 10:00 Irvine, John. Imagery-based Modeling of Social, Economic, and Governance Indicators in Sub-Saharan Africa
Coffee Break: 10:15 AM - 10:30 AM

10:30 AM - 11:15 AM. Session 2: Analytics of Democratized Imagery. Session Chair: K. Palaniappan
10:30-10:50 Invited Talk: Dr. Lee Schwartz, U.S. Department of State
10:50-11:10 Invited Talk: Robbie Schingler, COO, Planet Labs Inc.
11:10-11:15 Q&A (5 min)
11:15 AM - 12:00 Noon. Session 3: Contextual Models for Persistent ISR. Session Chair: Dr. Raghuveer Rao. 3 talks, 15 min each including Q&A.
P5 11:15 Rosario, Dalton. Against Conventional Wisdom: Longitudinal Inference for Pattern Recognition in Remote Sensing
P6 11:30 Tucker, Jonathan; Stanfill, Robert; Subramanian, Suresh. Role of Context in ISR Tracking
P7 11:45 Stylianou, Abby. Images Don't Forget: Online Photogrammetry to Find Lost Graves
12:00 Noon - 1:30 PM. Lunch Break
1:30 PM - 3:00 PM. Session 4: Persistent Surveillance Analytics. Chairman: Dr. Suresh Subramanian
1:30 Keynote: Dr. Philip Perconti. Vision to Perception - Imagery Analytics for Land Forces
2:00 Q&A (5 min)
2:05 Invited Talk: Dr. Steve Suddarth. WAMI and grand scale ISR systems in the era of ISIS: It takes a hobbyist to take a village
2:25 Poster Preview - 12 x 2 min
2:50 PM - 3:10 PM Coffee Break


3:10 PM - 4:40 PM. Session 5: Hyperspectral / Polarimetric Imagery and Exploitation. Session Chair: Dalton Rosario
P8 3:10 Romano, Joao. A Day/Night Range Invariant Anomaly Detection for LWIR Polarimetric Imagery
P9 3:25 Holder, Joel. Calibration Methodology and Performance Characterization of a Polarimetric Hyperspectral Imager
P10 3:40 Furey, John. Visible and SWIR Hyperspectral Polarimetric Imaging of Desert Soils
P11 3:55 Gross, Kevin (Martin, Jake). Exploring Polarimetric Hyperspectral Imaging as a Tool for Improved Material Identification
P12 4:10 Ben-David, Avishai. Geodesic Paths for Time-Dependent Covariance Matrices in a Riemannian Manifold
P13 4:25 Chenault, David. Polarized Microbolometer for Target Detection in Video
4:40 PM - 4:45 PM Day 1 Closing remarks: Karl Walli, Don Gerson and Neelam Gupta

Poster Session and Reception
4:45 PM. Session Chairman: Carlos Maraviglia. Venue: Poster Hall.

1. Lee, Lily. Automated Geo-referenced 3D Reconstruction from Multiview Satellite Images
2. Kumar, Brajesh. Integrating Spectral and Textural Features Using Wavelet Analysis for Hyperspectral Image Classification
3. Ferris, Michael. Image Structural Similarity Based Metrics: A Performance Study for No-reference Blurred Image Fusion
4. Skurikhin, Alexei. Learning Tree-structured Approximations for Conditional Random Fields
5. Mahajan, Manish. Secret Communication in Colored Images Using Saliency Map as Model
6. Rahnemoonfar, Maryam. Automatic Detection of Ice Layers in Synthetic Aperture Radar Images
7. Ngan, Henry. Robust Vehicle Edge Detection by Cross Filter Method
8. Baran, Matthew. Time-adapted CLAHE for HDR Tonemapping of Full Motion Video
9. Schmitt, Daniel. Timing Mark Detection on Nuclear Detonation Video
10. Chaubey, Himanshu. Enhanced View Invariant Gait Recognition Using Feature Level Fusion
11. Chen, Yu; Wu, Ryan; Blasch, Erik. A Container-based Elastic Cloud Architecture for Real-Time Full-Motion Video Target Tracking

The Cosmos Club has a unique history. Mr. Donald Gerson will give a tour of the Club on Oct 14 and Oct 15 to those who are interested, in three small groups each day. We encourage you to check the Club's webpage for the dress code as well.


Day 2: October 15 – Wednesday

8:00 AM Check-in
8:20 AM Opening remarks by Neelam Gupta, Don Gerson & Karl Walli
8:30 AM Keynote: Prof. John Schott (30 min + 10 min Q&A)
9:10 AM - 10:15 AM Session 6: Three Dimensional Models from Motion Imagery. Session Chair: John Irvine
P1 9:10 Aliakbarpour, Hadi, et al. Analyzing Effects of On-board Sensor Imprecision in WAMI Bundle Adjustment
P2 9:25 Swett, Bruce. Advanced Video Activity Analytics: Integrating and Maturing Computer Vision and Pattern Recognition Algorithms
P3 9:40 Recker, Shawn. Depth Data Assisted Structure-from-Motion
P4 9:55 Givens, Ryan. The HSI/Lidar Direct Method for Physics-based Scene Modeling
Coffee break 10:10 - 10:25 AM

10:25 AM - 11:15 AM. Advanced Systems and Technologies: Current and Future Challenges. Session Chair: Richard Thissell
10:25 AM Invited Talk: Scott Fouse, Lockheed Martin Co, California
10:55 AM Invited Talk: Joshua C. Klontz. A Case Study of Automated Face Recognition: The Boston Marathon Bombings Suspects
11:15 AM - 12:00 Noon. Session 7: Tracking in Persistent ISR. Session Chair: Jim Aanstoos
P5 11:15 Viguier, Raphael. Particle Filter-based Vehicle Tracking Using Fused Spatial Features and a Non-linear Motion Model
P6 11:30 Basharat, Arslan. Multi-Target Tracking in Video with Adaptive Integration of Appearance and Motion Models
P7 11:45 Pritt, Carrie. Road Sign Detection on a Smartphone for Traffic Safety

12:00 Noon - 1:30 PM. Lunch Break
12:00 - 1:30 PM. Executive Committee Meeting
1:30 PM - 1:40 PM: Logistics of AIPR 2015 announcements – Neelam Gupta
1:40 PM - 3:00 PM. Session 8: Novel Sensors and Applications for Persistent Imagery. Session Chair: Al Williams
1:40 Invited Talk – Prof. Rama Chellappa. Persistent and Pervasive Imaging Analytics for Environmental Assessment of Oceanic Resources and Fish Stocks
2:00 Q&A (5 min)


2:10 Keynote Talk – Dr. Nibir Dhar. Enabling Technologies for Advanced Imaging
2:40 Q&A (5 min)
2:45 Poster Reviews. 12 Posters, 2 min each.

Coffee Break 3:10 PM - 3:25 PM

3:25 - 4:40 PM. Session 9: Multispectral Imagery and Exploitation. Chair: Pete Doucette
P9 3:25 Khuon, Timothy S. Adaptive Automatic Target Recognition in Single and Multi-Modal Sensor Data
P10 3:40 Moody, Daniela. Change Detection and Classification of Land Cover in Multispectral Satellite Imagery Using Clustering of Sparse Approximations (CoSA) over Learned Feature Dictionaries
P11 3:55 Caulfield, John. Small Pixel Focal Plane Array
P12 4:10 Gurram, Prudhvi; Rao, Raghuveer. Entropy Metric Regularization for Computational Imaging with Sensor Arrays
P13 4:25 Johnson, Bruce. Computing a Heuristic Solution to the Watchman Route Problem by Means of Photon Mapping

4:40PM. Closing remarks by Karl Walli, Don Gerson and Neelam Gupta

4:45 PM. Poster session, 12 posters. Session Chairman: Neelam Gupta

1. Krucki, Kevin. Human Re-Identification in Multi-Camera Systems
2. Khryashchev, Vladimir. Gender and Age Recognition for Video Analytics Solution
3. Harrity, Kyle. Medical Image Segmentation Using Multiscale and Super-resolution Methods
4. Varney, Nina. Volumetric Features for Object Region Classification in 3D LiDAR Point Clouds
5. Borel-Donohue, Christoph. Analysis of Diurnal, Long-wave Hyperspectral Measurements of Natural Background and Manmade Targets under Different Weather Conditions
6. Bhowmik, Mrinal. Background Subtraction Algorithm for Moving Object Detection and Evaluation of Performance Using Different Datasets
7. Lu, Min. A Fast Coherent Point Drift Algorithm for 3D Point-Cloud Registration
8. Pless, Robert. Democratizing the Visualization of 500 Million Webcam Images
9. Brown, Andrew. 3D Geolocation Methods
10. Sasi, Sreela. Human Activity Detection Using Sparse Representation
11. Green, Lt Ashley; Slaughter, Capt Robert; McClory, Dr. John. Modeling of Nuclear Detonation Optical Output in DIRSIG

Banquet at 7PM.

Banquet Speaker: Richard Koszarski, Professor of English and Film at Rutgers University, "The Coming of Sound to Motion Pictures."


Day 3: October 16 – Thursday

8:00 AM – Check-in
8:15 AM – Opening remarks by Karl Walli

8:20AM – Keynote Talk – Richard Granger. Beyond neural networks to brain engineering: How brains recognize, localize, search, and track

9:00 AM - 10:00 AM Session 9: Medical Image Processing. Chairman: Murray H. Loew
P1 9:00 Albalooshi, Fatema. Automatic Detection and Segmentation of Carcinoma in Radiographs
P2 9:15 Lam, Walter. Novel Geometric Coordination Registration in Cone-beam Computed Tomogram
P3 9:30 Schaum, Alan. Bayesian Solutions to Non-Bayesian Detection Problems: Unification through Fusion
P4 9:45 Borel-Donohue, Christoph. Rapid Location of Radiation Sources in Complex Environments Using Optical and Radiation Sensors
10:00 - 10:15 Coffee Break
10:15 AM - 11:00 AM Session 10: Radiation Monitoring Applications. Chairman: Alan Schaum
P5 10:15 Slaughter, Robert. 3D Reconstructions of Atmospheric Nuclear Detonations

P6 10:30 Wagner, Stefan. Smooth Chemical Vapor Detection

P7 10:45 Schmitt, Daniel. Machine Learning of Nuclear Detonation Features

11:00 AM - 12:00 Noon. Session 11: Seeing through Sound. Session Chairman: David Schaefer
P8 11:00 Malcom, Andy. Foley: The Art of the Transparent Soundscape
11:20 Invited Talk: Ella Striem-Amit. Visual-to-Auditory Sensory Substitution and the "Visual" Functional Specialization in the Blind Brain
P9 11:40 Schaefer, Edward. Representing Pictures with Sound

Lunch 12:00 Noon - 1:00 PM
1:00 PM: Announcement of Student Paper Awards
1:05 - 2:20 PM Session 12: Learning and Exploitation Frameworks I. Chairman: K. Palaniappan
P10 1:05 Comaniciu, Dorin. Marginal Space Deep Learning for Efficient Anatomy Detection in Volumetric Image Data
P11 1:20 Puntambekar, Amit (Intel). Cloud-based Power Efficient High Performance Video CODEC


P12 1:35 Collins, Roddy. KWiver: An Open-Source Cross-Platform Video Exploitation Framework
P13 1:50 Walvoord, Derek. An Automated Workflow for Observing Track Data in 3-Dimensional Geo-accurate Environments
P14 2:05 Verma, Nishchal. Large Displacement Optical Flow Based Image Predictor Model
2:20 - 2:35 PM Coffee Break

2:35 - 3:35 PM Session 13: Learning and Exploitation Frameworks II. Chairman: Frank Tanner
P15 2:35 Chin, Tommy. Indoor Non-Linear Target Tracking Using Wi-Fi and Video Fusion
P16 2:50 Harrity, Kyle. Modified Deconvolution and Wavelets Based Fusion
P17 3:05 Prater, Ashley. Sparse Approximations to Generalized Fourier Series through Linear Programming
P18 3:20 Peng, Jing. Approximate Regularized Least Squares and Parzen Windows

3:35 - 3:45 PM Coffee Break

3:45 - 4:30 PM Session 14: Fusion and Mining for Video Analysis. Chairman: Ashley Prater
P19 3:45 Amankwah, Anthony. Motion Estimation of Flotation Froth Using Mutual Information and Kalman Filter
P20 4:00 Harrity, Kyle. Multiresolution Deblurring
P21 4:15 Oh, Sangmin. Towards Visual Analysis of Unconstrained Images in Social Forums: Studies on Concept Detection and Personalized Economy of Images in Social Networks

4:30 PM: Closing Remarks by Karl Walli and Guna Seetharaman

4:35 PM: Adjourn Notice: Neelam Gupta and Don Gerson


IEEE AIPR 2014 KEYNOTE TALKS: ABSTRACTS AND SPEAKER BIOGRAPHIES

Keynote 1: "Intelligence ARPA: An Overview" by Dr. Peter Highnam, Director, IARPA.

Dr. Peter Highnam was named IARPA Director on 30 August 2012. Dr. Highnam joined IARPA in February 2009 as the Office Director for Incisive Analysis. Prior to IARPA, he was a senior advisor in the National Institutes of Health (NIH) and then in the Biomedical Advanced Research and Development Authority (BARDA). From 1999 to 2003, Dr. Highnam was a DARPA program manager with programs in electronic warfare and airborne communications. Before joining DARPA, he worked for more than a decade in applied research in industry. Dr. Highnam holds a Department of Health and Human Services Secretary's Distinguished Service Award and a Department of Defense Civilian Exceptional Service Award. He is a co-inventor on three patents in commercial seismic exploration and holds a doctorate in computer science from Carnegie Mellon University.

Keynote 2: "The MapGive Project" by Dr. Lee R. Schwartz, U.S. Department of State.

As Geographer of the United States, Lee Schwartz holds the position of Director of the Office of The Geographer and Global Issues in the State Department's Bureau of Intelligence and Research. Schwartz is the State Department's 9th Geographer, a position that was established in 1921 and bears the statutory responsibility for providing guidance to all federal agencies on questions of international boundaries and sovereignty claims. He also oversees the Humanitarian Information Unit – a U.S. government interagency organization focused on unclassified data coordination for emergency preparedness, response, and mitigation. Dr. Schwartz earned his Ph.D. in geography from Columbia University, with a focus on political and population geography. Prior to joining the Office of The Geographer, Schwartz was a member of the faculty of The American University's School of International Service. At the Department of State, he has directed research and analysis on global issues primarily related to complex humanitarian emergencies and has coordinated related fieldwork and applied geography projects overseas, in particular in the Balkans, Central Asia, Russia, Afghanistan, Iraq, Sudan, the Horn of Africa, Haiti, and Syria. His work has focused on ethnic conflict, refugee flows, peacekeeping operations, strategic warning, and conflict mitigation and response – with an emphasis on Geographic Information Systems (GIS) and Remote Sensing information coordination as well as Participatory Mapping and Volunteered Geographic Information applications. Lee was the State Department's 2005 winner of the Warren Christopher Award for Outstanding Achievement in Global Affairs and the 2012 recipient of the Association of American Geographers' James R. Anderson Medal of Honor in Applied Geography in recognition of his distinguished service to the profession of geography.

Abstract: The United States Department of State's Humanitarian Information Unit (HIU) is a unique interagency entity designed to break down stovepipes between federal agencies and coordinate geospatial information critical for humanitarian mitigation and response. Housed in the Office of The Geographer and partnering with agencies that include the National Geospatial-Intelligence Agency (NGA) and the US Department of Defense (DOD), the HIU has developed methodologies and toolkits for geospatial data coordination and collaboration – including disseminating and sharing with NGO and intergovernmental organizations that do the bulk of first-time response to both rapid onset and complex emergencies. Examples of The Geographer's role in disaster resilience efforts will be drawn from emergencies that include the Pacific Tsunami and Typhoon Haiyan, and initiatives such as "Imagery-to-the-Crowd/MapGive" and "ROGUE/GeoSHAPE" that are helping to transform the way governments and nongovernmental organizations collaborate on collecting, curating and sharing geospatial data critical for disaster response. With test trials currently underway, what has been learned so far, and how will these technologies enhance operational effectiveness? Topics include: breaking down the silo systems that inhibit data sharing and coordination; leveraging open source tools and platforms to facilitate mobile and "disconnected" data collection and editing; and allowing multiple organizations to contribute to the collection and maintenance of GIS information to improve overall situational awareness.

Keynote 3: "Vision to Perception - Imagery Analytics for Land Forces" by Dr. Philip Perconti, Director, US Army Research Laboratory – Sensors and Electron Devices Directorate, Adelphi, MD.

Dr. Perconti currently serves as the Director of the Sensors & Electron Devices Directorate of the Army Research Laboratory. He has responsibility for leading and transitioning the Army's primary basic and applied research programs in sensors, electronics, signal processing, and power and energy component technologies. His duties include operation of unique electronics and photonics materials fabrication and characterization facilities that enable world-class, Army-relevant, component research and development. He is responsible for planning, executing and balancing mission and customer program needs to ensure science and technology dominance for the Army. He served as the Director, Science and Technology Division, US Army CERDEC Night Vision and Electronic Sensors Directorate (NVESD), from 2000-2013. He was also the Director, Electronics & Photonics Technology Office, National Institute of Standards & Technology (NIST) during 1996-2000. He led the Imaging Technology Branch, US Army CERDEC Night Vision and Electronic Sensors Directorate during 1990-1996.

Abstract: It is now widely recognized that, owing to the rapid growth in sensor technology and its deployment, massive amounts of data are ingested in many operational and reconnaissance missions, but progress in the ability to rapidly exploit the data has lagged behind. Imagery in its various forms is typically the principal contributor to the data glut. While there has long been ongoing research in developing techniques for automated analysis that in turn helps human analysts, the challenges to achieving practical solutions are many. Emphasis has progressed from the relatively simple notion of automatic target recognition to one of recognizing and predicting activities. This talk will outline the various challenges and provide an overview of the Army Research Laboratory's research and collaborative efforts in the field, ranging from much-needed ground truth imagery collection to development of innovative solutions.

Keynote 4: "DIRSIG" by Dr. John Schott, Research Professor, RIT.

Dr. John Schott is the Frederick and Anna B. Wiedman Professor in RIT's Chester F. Carlson Center for Imaging Science. He has been a respected member of RIT's faculty since 1981. His early impact at RIT laid the cornerstone for the university's imaging science program, where he has been a leading researcher, educator, and mentor of students for decades. From this post, John has also been a part of NASA's Landsat Science Team, and was the founding director of the Digital Imaging and Remote Sensing (DIRS) Laboratory at RIT. http://www.rit.edu/alumni/ihf/inductee.php?inductee=21

Keynote 5: "Enabling Technologies for Advanced Imaging" by Dr. Nibir K. Dhar, Deputy Director, Night Vision Electronic Sensors Directorate.

Abstract: Advances in imaging technology have a huge impact on our daily lives. Innovations in optics, focal plane arrays (FPA), microelectronics and computation have revolutionized camera design.

As a result, new approaches to camera design and low-cost manufacturing are now possible. These advances are clearly evident in the visible wavelength band due to pixel scaling and improvements in silicon material and CMOS technology. CMOS cameras are available in cell phones and many other consumer products. Advances in infrared imaging technology have been slower due to market volume and many technological barriers in detector materials, optics, and fundamental limits imposed by pixel scaling and optics. There is, of course, much room for improvement in both visible and infrared imaging technology. This presentation will describe the imaging technology challenges and related projects currently fielded through the DARPA/MTO office. In particular, new technology development under the portfolio program "Advanced Wide Field of View Architectures for Image Reconstruction and Exploitation (AWARE)" will be described, highlighting the following: modular and scalable camera architecture to overcome scaling limitations of conventional imaging system design, and to demonstrate the feasibility of near-linear growth of optical information throughput with increasing imaging system scale; advancement in infrared pixel scaling and high-density FPA technology; development of focal plane arrays with broadband and multi-band day/night FPA technology; and low-cost manufacturing and applications of microbolometer thermal technology.

Dr. Nibir Dhar received a master's degree (1993) and Ph.D. (1996) in Electrical Engineering from the University of Maryland at College Park in the area of Microelectronics and Electro-physics. He received a bachelor's degree in Electrical and Computer Engineering from George Mason University. Dr. Dhar joined NVESD in April of 2014 as the Deputy Director for science and technology to advance the S&T division R&D activities. Prior to joining NVESD, Dr. Dhar served as a program manager in the Microsystems Technology Office at DARPA from March 2008. At DARPA he developed various innovative technologies in EO/IR imaging and transitioned several products. His efforts added significant value to the warfighter's objectives and to the imaging community at large. Dr. Dhar's work focused on novel architectures in infrared detectors and imaging systems, nanoelectronics including NEMS/MEMS components, novel materials synthesis techniques, bio-inspired concepts, and approaches to low SWaP-C technologies. Prior to joining DARPA, Dr. Dhar worked as Branch Chief/Team Leader at the Army Research Laboratory (ARL), where he led the Integrated Power and Sensing Technology group to develop infrared sensors, soldier-portable power sources, thin films, and nanomaterials, and to integrate sensor/power technologies. Dr. Dhar was responsible for a wide variety of infrared focal plane array technology including mercury cadmium telluride based focal plane arrays (FPAs), quantum well infrared photodetectors, Type-II strained layer superlattices, quantum dot infrared detectors and inter-band cascade lasers. Dr. Dhar has received numerous awards and recognitions, including the Office of the Secretary of Defense Medal for Exceptional Public Service in 2014. He is a fellow of SPIE.

Keynote 6: "Beyond neural networks to brain engineering: How brains recognize, localize, search, and track" by Professor Richard Granger, Dartmouth College, NH.

Abstract: Great progress has been made in object recognition, but many of the characteristics of real-world vision tasks remain unsolved. In particular, i) recognition predominantly relies on labeled datasets, yet these are hugely outnumbered by unlabeled image collections; ii) despite the abundance of video data (and the fact that real-world vision occurs in a moving world), most work in computer vision focuses on still images; iii) the tasks of segmentation, recognition, localization, and tracking are typically treated as distinct, yet evidence suggests that information from one can guide the others. We will describe systems being developed to recognize and track objects in real-world environments based on algorithms derived from visual processing mechanisms in brain circuitry.


Bio: Richard Granger is a professor at Dartmouth with positions in the Psychological and Brain Sciences Department and the Thayer School of Engineering, and the director of Dartmouth's Brain Engineering Laboratory. He received his Bachelor's and Ph.D. from MIT and Yale. His research ranges from computation and robotics to cognitive neuroscience, and he is the author of more than 100 scientific papers and numerous patents. He is a consultant to, and on the boards of, a number of technology corporations and government research agencies, and he has been the principal architect of advanced computational systems for military, commercial, and medical applications.

INVITED TALKS: ABSTRACTS AND BIOGRAPHIES

Robbie Schingler, Planet Labs Inc. Democratizing Satellite Image Galleries for Scientific Studies

Planet Labs has launched, and is steadily expanding, a network of Earth-imaging satellites called "Doves." The network is aimed at creating open data access to fuel a scientific revolution in applications where persistent imagery of large areas is critical, at a scale that has previously been possible only for NASA and comparable agencies. In January 2014, the company delivered Flock 1, the world's largest constellation of Earth-imaging satellites, made up of 28 Doves. Together with subsequent launches, the company has launched 71 Doves toward its goal of imaging the entire Earth every day. Planet creates commercial and humanitarian value with the market's most capable global imaging network. Fresh data from any place on Earth is foundational to solving commercial, environmental, and humanitarian challenges. Our global sensing and analytics platform unlocks the ability to understand and respond to change at a local and global scale.

Robbie is the Chief Operating Officer of Planet Labs and is responsible for the company's business operations and product development. Previously, Robbie worked at NASA, serving as Chief of Staff for the Office of the Chief Technologist and incubating the Space Technology Program. He managed the exoplanet-finding mission TESS and served as lead for NASA's Open Government activities. Robbie received a BS in Engineering Physics from Santa Clara University, an MBA from Georgetown, and a master's degree from the International Space University.

Steve Suddarth, Transparent Sky LLC, NM. WAMI and grand scale ISR systems in the era of ISIS: It takes a hobbyist to take a village

Abstract: Recent events have shown that the world is currently at risk of major regional or even larger conflicts if current religious, sectarian, and national trends continue unabated. A significant goal in addressing these threats is to stop them while they are as small as possible. A significant additional constraint is that this task must be accomplished using few, if any, "boots on the ground." Over the past decade, the U.S. military created and deployed several massive ISR systems to great effect. The author's view is that we could have even greater impact with some very small, simple, lightweight ISR technologies that fit into a cloud architecture. Although imperfect when compared to forces on the ground, such systems could provide the numbers, reach, and detailed up-to-date information to counter advances in sectarian violence at the village and town level. The talk will present a technical approach to engineering systems that may be


particularly apt for addressing the kinds of rapidly evolving, widespread threats currently seen in the Middle East. The solutions go beyond WAMI and involve large-scale data handling and functions that could be useful to a broad constituency in practical, affordable and rapidly producible forms.

Dr. Steve Suddarth, a retired U.S. Air Force Colonel, is the founder and president of Transparent Sky LLC, which is focused on building affordable and versatile wide-area sensing and exploitation systems. He played several key roles in the creation of modern WAMI systems and could reasonably be considered the architect of many of their key advancements. Dr. Suddarth's work in related technologies spans nearly three decades, including contributions in machine learning, real-time embedded image processing, autonomous drone flight control, and complex systems. He conceived of and led the development of the concept known as Angel Fire and (later) Blue Devil. This included: developing the concept together with Livermore Lab, performing initial feasibility analysis, naming the project, and marketing it to USSTRATCOM, CENTCOM, and AF Materiel Command (AFMC); forming a team that would ultimately involve Los Alamos Lab (LANL), Kitware, Inc., the AF Inst. of Technology (AFIT), the AF Research Lab (AFRL), and the Jet Propulsion Lab (JPL); arranging for and conducting initial data collections at LANL with the assistance of the AFIT OpTech team; finding continuing funding through a Congressional appropriation; providing the initial code that was used to start development; developing the first version of the Angel Fire software in two months' time, leading a team of three key engineers; leading collaborations with the Constant Hawk program throughout the summer of 2005; leading Angel Fire's hardware design at LANL and AFIT from fall 2005 through 2006; leading the growing team for two years through numerous field demonstrations and development of the system to near deployment-ready status; leading field demonstrations that led to the decision to deploy with the U.S. Marines; assisting deployment planning for the USMC; and leading the transition of the effort to AFRL for deployment the following year. Dr. Suddarth has proven the ability to assemble the small, efficient, affordable team required, and a concept like Village Peacekeeper could be realized in months with proper support.

Prof. Rama Chellappa. Summary of the Workshop on Robust Methods for the Analysis of Images and Videos for Fisheries Stock Assessment

At a recent workshop organized by the National Research Council in May 2014, researchers working in computer vision and fish stock assessment came together to explore potential areas of collaboration. In this talk, the speaker will present a brief summary of the discussions that took place at this workshop. While there are some commonalities between these two areas, such as detection and tracking, shape analysis, metrology and fine-grained classification, the differences in image acquisition conditions and the environments in which data are collected call for more robust extensions of existing computer vision methods for fish stock assessment.

Prof. Rama Chellappa is a Minta Martin Professor of Engineering and the Chair of the ECE department at the University of Maryland. He is a recipient of the K.S. Fu Prize from IAPR; the Society, Technical Achievement, and Meritorious Service Awards from the IEEE Signal Processing Society (SPS); and the Technical Achievement and Meritorious Service Awards from the IEEE Computer Society. At UMD, he has received college- and university-level recognitions for research, teaching, innovation and mentoring of undergraduate students. He is a Fellow of IEEE, IAPR, OSA, AAAS and ACM and holds four patents.


Joshua Klontz, Noblis, West Falls Church, VA. The Investigation Surrounding the Boston Marathon Bombings: A Case Study of Missed Opportunity

The investigation surrounding the Boston Marathon bombings was a missed opportunity for automated facial recognition to assist law enforcement in identifying suspects. We simulate the identification scenario presented by the investigation using two state-of-the-art commercial face recognition systems, and gauge the maturity of face recognition technology in matching low quality face images of uncooperative subjects. Our experimental results show one instance where a commercial face matcher returns a rank-one hit for suspect Dzhokhar Tsarnaev against a one million mugshot background database. Though issues surrounding pose, occlusion, and resolution continue to confound matchers, there have been significant advances made in face recognition technology to assist law enforcement agencies in their investigations.

Josh is a software engineer at Noblis in Falls Church, VA. He received his B.S. in Computer Science from Harvey Mudd College. Prior to Noblis, Josh worked at MITRE implementing face recognition algorithms and applications. Afterwards, he studied under Dr. Anil Jain at Michigan State University, focusing on unconstrained and cross-modality face recognition algorithms. Josh is the primary author and maintainer of the Open Source Biometric Recognition project (www.openbiometrics.org), which supports training, evaluating, and deploying biometric algorithms. Josh is also the inventor of a domain-specific programming language for image processing and statistical learning called Likely (www.liblikely.org).

Visual-to-Auditory Sensory Substitution and the "Visual" Functional Specialization in the Blind Brain

Ella Striem-Amit

Blindness is a highly limiting condition, affecting millions of people worldwide, and despite many scientific advances there is currently still no widely applicable way to remedy many eye diseases and conditions and thus restore sight. An alternative approach suggests bypassing the eyes altogether and delivering visual information directly to the blind brain through the intact senses. Such sensory substitution devices (SSDs) rely on a real-time translation of visual images to either sounds or touch, and on teaching the blind how to process these sensorily-transformed inputs. But what can the blind perceive using such artificial visual information? Can people who have been born blind learn to "see," and to identify objects and people in their surroundings? And how does the blind brain, which was deprived of its natural input from birth, even process this information? Does it develop the visual specializations that characterize the normally developing brain? We tested these questions by teaching a group of fully congenitally blind participants how to use a visual-to-auditory sensory substitution device (the vOICe), examining their visual achievements and scanning their brains using functional magnetic resonance imaging (fMRI). We found that following a relatively limited training paradigm of only tens of hours (on average 73 hours), the blind could learn to identify that images contain objects and people's body-shapes, as well as read letters and words. Using the Snellen acuity test, we assessed their visual acuity, which measured beyond the World Health Organization (WHO) blindness acuity threshold. This demonstrates the potential capacity of SSDs as inexpensive, non-invasive visual rehabilitation aids, alone or when supplementing visual prostheses. Using fMRI, we showed that several principles of organization of the normally developing visual cortex are retained in the blind, such as the general division into the two processing streams, and the category selectivity in areas normally preferring visual written script (the visual word-form area) and body-shapes (the
extrastriate body area). Thus, the visual cortex showed retention of functional selectivity despite the atypical auditory sensory input, the lack of visual experience, the limited training duration, and the fact that such training was applied only in adulthood. These findings have practical bearings on sight restoration and the development of sensory aids for the blind, as well as theoretical bearings on our understanding of brain organization.
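For readers unfamiliar with how a visual-to-auditory SSD encodes an image, the general principle popularized by the vOICe can be sketched as follows: columns are scanned left to right over time, row height maps to pitch, and pixel brightness maps to loudness. The sketch below is a schematic illustration of that principle only, not the actual vOICe implementation.

    import numpy as np

    # Schematic image-to-sound mapping: scan columns left to right over time,
    # map row height to frequency and pixel brightness to amplitude.
    # Illustrative sketch only; not the actual vOICe implementation.
    def image_to_sound(img, duration=1.0, fs=22050, f_lo=500.0, f_hi=5000.0):
        n_rows, n_cols = img.shape
        col_len = int(duration * fs / n_cols)     # samples per column
        t = np.arange(col_len) / fs
        freqs = np.geomspace(f_hi, f_lo, n_rows)  # top row -> highest pitch
        audio = []
        for c in range(n_cols):                   # one time slice per column
            tones = img[:, c, None] * np.sin(2 * np.pi * freqs[:, None] * t)
            audio.append(tones.sum(axis=0))
        out = np.concatenate(audio)
        return out / np.max(np.abs(out))          # normalize to [-1, 1]

    # A bright diagonal line produces a descending frequency sweep.
    waveform = image_to_sound(np.eye(32))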

Dr. Striem-Amit earned her doctoral degree from the Hebrew University, Israel. The clinical aspect of her research entailed developing and applying dedicated training methods for visual rehabilitation, teaching blind individuals how to use sensory-substitution devices. The training program she led has so far enabled fully and congenitally blind individuals to carry out extraordinary tasks, such as walking in busy corridors while avoiding obstacles, identifying and locating everyday objects, and noticing people in their surroundings and identifying their facial expressions. An example of these achievements can be seen in a movie appended to one of her recent publications: http://www.plosone.org/article/fetchSingleRepresentation.action?uri=info:doi/10.1371/journal.pone.0033136.s001 (for a streaming version see: http://tinyurl.com/bqe6oz3). Additionally, she assisted in the development and testing of new devices and sensory aids.


Abstracts


Accurate Coverage Summarization of UAVs
Chung-Ching Lin, Sharath Pankanti and John Smith. Exploratory Computer Vision, IBM T. J. Watson Research Center, Yorktown Heights, NY 10598

Automatic video summarization is an important representative workload for UAV operations. A predominant fraction of UAV videos are never watched or analyzed, and there is growing interest in having a summary view of UAV videos for human consumption, to obtain a better overall perspective of the visual content. Real-time summarization of UAV video events is also important from a tactical perspective. Our research focuses on developing resilient algorithms for summarizing videos that can be efficiently processed either onboard or offline. Our previous work [1] on video summarization has focused on event summarization. More recently, we have investigated the challenges in providing coverage summarization of the video content from UAV videos. Unlike traditional coverage summarization taking an SfM approach (e.g., [3]) on SIFT-based [2] feature points, UAV videos pose many additional challenges including jitter, low resolution, poor contrast, and lack of salient features. We have attempted to use the conventional approach to summarize UAV videos and have found that the feature correspondence algorithms do not perform well; consequently, the coverage summarization results display many artifacts. To overcome these challenges, we propose a novel correspondence algorithm that exploits 3D context to potentially alleviate the correspondence ambiguity. Our preliminary results on the VIRAT dataset show that our algorithm can find many correct correspondences in low resolution imagery while avoiding many false positives from the traditional algorithms. Acknowledgments: This work is sponsored in part by the Defense Advanced Research Projects Agency, Microsystems Technology Office (MTO), under contract no. HR0011-13-C-0022. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government. This document is: Approved for Public Release, Distribution Unlimited.

Mobile ISR: Intelligent ISR Management and Exploitation for the Expeditionary Warfighter
Donald Madden, Tae Eun Choe, Hongli Deng, Kiran Gunda, Himaanshu Gupta, Asaad Hakeem (DAC), Narayanan Ramanathan, Zeeshan Rasheed and Ethan Shayne. ObjectVideo, Inc., and Decisive Analytics Corp.

Modern warfighters are informed by an expanding variety of ISR sources, but the timely exploitation of this data poses a significant challenge. ObjectVideo presents a system, Mobile ISR, to facilitate ISR knowledge discovery for expeditionary warfighters. The aim is to collect, manage, and deliver time-critical information when and where it is needed most. The Mobile ISR system consumes video, still imagery, and target metadata from airborne, ground-based, and hand-held sensors, and indexes that data based on content using state-of-the-art video analytics and user tagging. The data is stored in a geospatial database and disseminated to warfighters according to their mission context and current activity. The warfighters use an Android mobile application to view this data in the context of an interactive map or augmented reality display, and to capture their own imagery and video. A complex event processing engine and image-based search enable powerful queries to the knowledge base. The system leverages the extended DoD Discovery Metadata Specification (DDMS) card format, with extensions to include representation of entities, activities, and relationships.


Fast Orthorectified Mosaicking of Thousands of Aerial Photographs from Small UAVs

Mark Pritt, Lockheed Martin Corporation

Small unmanned air vehicles (UAVs) provide an economical means of imaging large areas of terrain at far lower cost than satellites. Applications of these systems range from precision agriculture to law enforcement to power line maintenance. Because small UAVs fly at low altitudes of approximately 100 meters, their cameras have only a limited field of view and must take thousands of photographs to cover a reasonably sized area. Furthermore, to provide a unified view of the area, these photographs must be combined into a seamless photo mosaic. The conventional approach for accomplishing this mosaicking process is called block bundle adjustment, and it works well if there are only a few hundred photographs. When there are thousands of photographs, however, this method fails because its memory and computational time requirements become prohibitively excessive. We have developed a new technique that replaces block bundle adjustment with an iterative algorithm that is very fast and requires little memory. After pairwise image registration, the algorithm projects the resulting tie points to the ground and moves them closer to each other to produce a new set of control points. It fits the image parameters to these control points and repeats the process iteratively to convergence. Results from UAVs for precision agriculture will be presented. The resulting mosaics cover hundreds of acres and have a GSD (ground sample distance) of less than one inch.

Imagery-based Modeling of Social, Economic and Governance Indicators in Sub-Saharan Africa
John Irvine. Chief Scientist for Data Analytics, Draper Laboratory

Many policy and national security challenges require understanding the social, cultural, and economic characteristics of a country or region. Addressing failing states, insurgencies, terrorist threats, societal change, and support for military operations requires a detailed understanding of the local population. Information about the state of the economy, levels of community support and involvement, and attitudes toward government authorities can guide decision makers in developing and implementing policies or operations. However, such information is difficult to gather in remote, inaccessible, or denied areas. Draper's previous work demonstrating the application of remote sensing to specific issues, such as population estimation, agricultural analysis, and environmental monitoring, has been very promising. In recent papers, we extended these concepts to imagery-based prediction models for governance, well-being, and social capital. Social science theory indicates the relationships among physical structures, institutional features, and social structures. Based on these relationships, we developed models for rural Afghanistan and validated the relationships using survey data. In this paper we explore the adaptation of those models to sub-Saharan Africa. Our analysis indicates that, as in Afghanistan, certain attributes of the society are predictable from imagery-derived features. The automated extraction of relevant indicators, however, depends on both spatial and spectral information. Deriving useful measures from only panchromatic imagery poses some methodological challenges, and additional research is needed.
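The iterative refinement loop in the Pritt mosaicking abstract above (project tie points to the ground, pull corresponding projections together into control points, refit the image parameters, repeat) can be sketched in toy form. The sketch below is an illustrative reconstruction under simplifying assumptions, not the authors' implementation: each image's parameters are reduced to a 2-D ground offset, and the ties structure holding pairwise registration results is hypothetical.

    import numpy as np

    # Toy stand-in for the iterative alternative to block bundle adjustment:
    # each image's "parameters" shrink to a 2-D ground offset, and each tie
    # is (image_i, pixel_xy_i, image_j, pixel_xy_j) from pairwise registration.
    def mosaic_iterative(ties, n_images, max_iters=100, tol=1e-6):
        params = np.zeros((n_images, 2))          # per-image ground offset
        for _ in range(max_iters):
            corrections = np.zeros_like(params)
            counts = np.zeros(n_images)
            for i, xy_i, j, xy_j in ties:
                g_i = xy_i + params[i]            # project tie points to ground
                g_j = xy_j + params[j]
                mid = (g_i + g_j) / 2.0           # moved-together control point
                corrections[i] += mid - g_i
                corrections[j] += mid - g_j
                counts[i] += 1
                counts[j] += 1
            # Refit each image's offset to its control points (least squares
            # reduces to averaging the corrections in this toy model).
            step = corrections / np.maximum(counts, 1)[:, None]
            params += step
            if np.linalg.norm(step) < tol:        # iterate to convergence
                break
        return params

    # Two images whose shared tie point disagrees by 4 ground units:
    ties = [(0, np.array([10.0, 5.0]), 1, np.array([6.0, 5.0]))]
    print(mosaic_iterative(ties, n_images=2))     # offsets split the disagreement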

Against Conventional Wisdom: Longitudinal Inference for Pattern Recognition in Remote Sensing
Dalton Rosario (a), Christoph Borel (b), Joao Romano (c)
(a) Army Research Laboratory, 2800 Powder Mill Rd., Adelphi, MD 20783 USA; (b) Air Force Institute of Technology, WPAFB, OH 45433-7765, USA; (c) U.S. Army Armament RDEC, Picatinny Arsenal, NJ 07806, USA; phone +1 301-394-4235

In response to the 2014 IEEE AIPR theme (Democratization of Imagery), we discuss a persistent imaging experiment dataset, which is being considered for public release in the foreseeable future, and present our observations from analyzing a subset of the dataset. The experiment is a long-term collaborative effort among the Army Research Laboratory, Army Armament RDEC, and the Air Force Institute of Technology that focuses on the collection and exploitation of longwave infrared (LWIR) hyperspectral and polarimetric imagery. In this paper, we emphasize the inherent challenges associated with using remotely sensed LWIR hyperspectral imagery for material recognition, and argue that the idealized data assumptions often made by state-of-the-art methods are too restrictive for real operational scenarios. We treat LWIR hyperspectral imagery for the first time as longitudinal data and aim at proposing a more realistic framework for material recognition as a function of spectral evolution over time, including its limitations. The defining characteristic of a longitudinal study is that objects are measured repeatedly through time and, as a result, data are dependent. This is in contrast to cross-sectional studies, in which the outcomes of a specific event are observed by randomly sampling from a large population of relevant objects, where data are assumed independent. The scientific community generally assumes the problem of object recognition to be cross-sectional. We argue that, as data evolve over a full diurnal cycle, pattern recognition problems are longitudinal in nature, and that applying this knowledge may lead to better algorithms.

Role of Context in ISR
Jonathan Tucker, Robert Stanfill and Suresh Subramanian, Lockheed Martin Corporation

Detection and tracking of vehicles within Wide Area Motion Imagery can be a very difficult problem. Merging static detectors with motion-based detectors can improve ROC curves. Traffic patterns, situational awareness, and local knowledge can be exploited to guide algorithm responses to better synthesize temporal information. Utilizing available shape files and context labels has allowed for more complete information exploitation in detection and tracking.


Images Don't Forget: Online Photogrammetry to Find Lost Graves
Abby Stylianou, Joseph D. O'Sullivan, Austin Abrams & Robert Pless. Department of Computer Science & Engineering, Washington University in St. Louis, St. Louis, Missouri

The vast amount of photographic data posted and shared on Facebook, Instagram and other social media platforms offers an unprecedented visual archive of the world. This archive captures events ranging from birthdays, graduations and family trips to lethal conflicts and human rights violations. The public availability of much of this archive plays an important role in a new genre of journalism, one led by citizens finding, analyzing, and synthesizing data into stories that describe important events. To support this, we have built a set of browser-based tools for the calibration and validation of online images. This paper presents these tools in the context of their use in finding two separate lost burial locations. Often, these locations would have been marked with a headstone or tomb, but for the very poor, the forgotten, or the victims of extremist violence buried in unmarked graves, the geometric cues present in a photograph may contain the only remaining reliable information about the burial location. The tools described in this paper allow individuals without any significant geometry background to utilize those cues to locate lost graves, or to localize any other outdoor image with sufficient correspondences to the physical world. This paper will also explain the difficulties that arise due to geometric inconsistencies between corresponding points, especially when significant changes have occurred in the physical world since the photo was taken, and highlight visualization features of our browser-based tools that help users to address this. Student Author: Joseph D. O'Sullivan
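The geometric cues the abstract above relies on come down to standard camera resectioning: given a few correspondences between surveyed world landmarks and their pixel positions in an online photograph, the camera pose can be recovered. Below is a generic OpenCV sketch of that underlying step; the point values and focal length are invented for illustration, and this is not the authors' browser-based tool.

    import numpy as np
    import cv2

    # Four (or more) correspondences between ground-plane landmarks (meters)
    # and their pixel positions in the photograph. Values are invented.
    world_pts = np.array([[0.0, 0.0, 0.0],
                          [5.0, 0.0, 0.0],
                          [5.0, 3.0, 0.0],
                          [0.0, 3.0, 0.0]])
    image_pts = np.array([[420.0, 612.0],
                          [880.0, 600.0],
                          [860.0, 350.0],
                          [400.0, 340.0]])

    f = 1000.0                               # assumed focal length in pixels
    K = np.array([[f, 0.0, 640.0],
                  [0.0, f, 480.0],
                  [0.0, 0.0, 1.0]])          # principal point at image center

    ok, rvec, tvec = cv2.solvePnP(world_pts, image_pts, K, None)
    R, _ = cv2.Rodrigues(rvec)
    camera_position = (-R.T @ tvec).ravel()  # camera center in the world frame
    print(ok, camera_position)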

The intention is to show the following: 1) present the features that separate 3-dimensional manmade objects from natural clutter; 2) propose a covariance-difference discriminant function; 3) demonstrate how integrating a background random-sampling approach with the proposed test makes the test range invariant; and 4) demonstrate the impact that increasing N, the number of random blocks collected from the image, has on the probability of detection for a variety of targets.

Calibration Methodology and Performance Characterization of a Polarimetric Hyperspectral Imager
Joel Holder [email protected]
Polarimetric hyperspectral imaging (P-HSI) has the potential to improve target detection, material identification, and background characterization over conventional hyperspectral imaging and polarimetric imaging. To fully exploit the spectro-polarimetric signatures captured by such an instrument, a careful calibration process is required to remove the spectrally and polarimetrically dependent system response (gain). Calibration of instruments operating in the long-wave infrared (LWIR, 8 μm to 12 μm) is further complicated by the polarized spectral radiation generated within the instrument (offset). This paper presents a calibration methodology developed for a LWIR Telops Hyper-Cam modified for polarimetry by replacing the entrance window with a rotatable holographic wire-grid polarizer (4000 lines/mm, ZnSe substrate, 350:1 extinction ratio). A standard Fourier-transform spectrometer (FTS) spectro-radiometric calibration is modified to include a Mueller-matrix approach to account for polarized transmission through, and polarized self-emission from, each optical interface. It is demonstrated that under the ideal polarizer assumption, two distinct blackbody measurements at polarizer angles of 0°, 45°, 90°, and 135° are sufficient to calibrate the system for apparent degree-of-linear-polarization (DoLP) measurements. Noise-equivalent s1, s2, and DoLP are quantified using a wide-area blackbody. A polarization-state generator is used to determine the Mueller deviation matrix. Finally, a realistic scene involving buildings, cars, sky radiance, and natural vegetation is presented.

Visible and SWIR Hyperspectral Polarimetric Imaging of Desert Soils
John S. Furey, Neelam Gupta [email protected]

Desert soils exhibit a complex mix and fascinating interplay of mineral content, grain sizes and orientations, and other complicating features that make them of interest in military and civilian applications of imaging technologies. We discuss the implementation of acousto-optic tunable filter (AOTF) technology in the design of two novel hyperspectral polarimetric imagers. The construction of the separate imagers in the visible and shortwave infrared (SWIR) wavelength bands is described, with particular attention to the details enabling field deployment in the summer of 2014 and the engineering challenges and obstacles that were overcome. The narrative of the rigors of the deployment in the hot American Southwest desert illustrates many of the technology issues in getting these imaging technologies into use. The Big Data aspects of the proper analysis and handling of hyperspectral polarimetric images are outlined.
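
Several of the polarimetric abstracts above and below form Stokes components and the degree of linear polarization (DoLP) from intensity measurements behind a linear polarizer at 0°, 45°, 90°, and 135°. A minimal sketch of that standard computation under the ideal-polarizer assumption (array names and values are illustrative, not drawn from any of the papers):

```python
import numpy as np

def stokes_and_dolp(i0, i45, i90, i135, eps=1e-12):
    """Linear Stokes images and DoLP from four polarizer-angle images."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity (both pairs averaged)
    s1 = i0 - i90                        # 0 deg vs. 90 deg preference
    s2 = i45 - i135                      # +45 deg vs. -45 deg preference
    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, eps)
    return s0, s1, s2, dolp

# Synthetic example: four 64x64 radiance images.
rng = np.random.default_rng(0)
i0, i45, i90, i135 = (rng.uniform(0.5, 1.0, (64, 64)) for _ in range(4))
s0, s1, s2, dolp = stokes_and_dolp(i0, i45, i90, i135)
```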

Exploring Polarimetric Hyperspectral Imaging as a Tool for Improved Material Identification
Kevin Gross and Jake Martin, AFIT/ENP [email protected]
A new research effort is underway to investigate the degree to which polarimetric hyperspectral imaging (P-HSI) improves material identification over conventional hyperspectral imaging. To that end, the entrance window of a Telops LWIR (8-12 µm) hyperspectral camera was modified to incorporate a holographic wire-grid polarizer (4000 lines/mm, ZnSe substrate, 350:1 extinction ratio). To assess instrument performance and data reduction methods, preliminary measurements of an uncoated glass (BK7) cylindrical lens and an uncoated quartz window were made. In the LWIR, polarimetric measurements require careful two-point radiometric calibration to remove the effects of polarized system response (gain) and polarized instrument self-emission (offset). This was accomplished using on-board wide-area blackbodies which precede and overfill the polarizing element. Treating the polarizer as ideal, degree-of-linear-polarization (DOLP) spectra are formed from the appropriate apparent spectral radiances measured at polarizer angles of 0°, 45°, 90°, and 135°. Both unpolarized (S0) and DOLP spectra are compared to theoretical predictions based on known surface-normal angles and spectrally resolved complex indices of refraction. Implications for material identification are discussed. The possibility of surface normal estimation is also discussed.

Geodesic Paths For Time Dependent Covariance Matrices In A Riemannian Manifold
Avishai Ben-David, Research, Development and Engineering Command (RDECOM), Edgewood Chemical Biological Center, Aberdeen Proving Ground, MD 21010 [email protected]; Justin Marks, Bowdoin College, Brunswick ME 04011 [email protected]

Time dependent covariance matrices are important in remote sensing and hyperspectral detection theory. The difficulty is that C(t) is usually available only at the two endpoints C(t0)=A and C(t1)=B, while C(t0<t<t1) is needed. We present the Riemannian manifold of positive definite symmetric matrices as a framework for predicting a geodesic time dependent covariance matrix. The geodesic path A→B is the shortest and most efficient path (minimum energy). Although there is no guarantee that data will necessarily follow a geodesic path, the predicted geodesic C(t) is of value as a concept. The path for the inverse covariance is also geodesic and is easily computed. We present an interpretation of C(t) with coloring and whitening operators as a sum of scaled, stretched, contracted, and rotated ellipses. We show that the volume of the geodesic covariance is smaller than that of the linearly interpolated (between A and B) covariance matrix; thus using a time dependent geodesic covariance in detection algorithms will increase the separation between the H0 (target absent) and H1 (target present) detection scores, and hence detection performance will improve (false alarm and detection probabilities depend on the detection algorithm and the location of targets).
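
A worked sketch of this construction, assuming the standard affine-invariant metric on symmetric positive definite (SPD) matrices (the abstract does not state its exact metric, so this is an assumption): the geodesic with C(0)=A and C(1)=B is C(t) = A^(1/2) (A^(-1/2) B A^(-1/2))^t A^(1/2).

```python
import numpy as np
from scipy.linalg import sqrtm, inv, fractional_matrix_power

def geodesic_covariance(A, B, t):
    """Affine-invariant geodesic C(t) between SPD matrices, C(0)=A, C(1)=B."""
    A_half = sqrtm(A)
    A_half_inv = inv(A_half)
    M = A_half_inv @ B @ A_half_inv            # B whitened by A
    Ct = A_half @ fractional_matrix_power(M, t) @ A_half
    Ct = np.real(Ct)                           # discard round-off imaginaries
    return (Ct + Ct.T) / 2                     # symmetrize

# Predict a midpoint covariance from two endpoint estimates.
rng = np.random.default_rng(1)
A = np.cov(rng.normal(size=(200, 5)).T)
B = np.cov((2.0 * rng.normal(size=(200, 5))).T)
C_mid = geodesic_covariance(A, B, 0.5)
```

Along this path, det C(t) = det(A)^(1-t) det(B)^t, which is consistent with the volume argument in the abstract.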

Polarized Microbolometers for Target Detection in Video
David B. Chenault, John S. Harchanko, J. Larry Pezzaniti, Justin Vaden, Brian Hyatt. Polaris Sensor Technologies, 200 Westside Square, Suite 320, Huntsville, AL 35801 {david.chenault, John.Harchanko, Larry.Pezzaniti}@PolarisSensor.com
Infrared polarization signatures depend on surface temperature, roughness, material properties, aspect angle to the sensor, and sky down-welling and background radiance reflecting from the object. Oftentimes, the polarization signature of a manmade object is different from that of the surrounding background. Furthermore, that difference is often present even when the thermal signature of the same object blends into the background. A novel sensing approach to detecting these signatures takes the form of an infrared polarization imager, or imaging polarimeter. In this paper, we describe several approaches to making IR polarization measurements, and specifically we describe the Polaris IR Polarimetric Camcorder, a handheld infrared imaging polarimeter that produces live polarimetric video for several polarization products. The system is immune to motion artifacts of either the sensor or the scene. The system is battery operated, rugged, weighs about one pound, and can be helmet mounted or handheld. The operator views the polarization products in real time on either a helmet-mounted display or a small external display integrated with a digital video recorder. Polarization-sensitive microbolometers are used to produce, at a minimum, S0 and S1 polarization products. A top-level description of the camcorder is given, followed by performance characteristics and representative data, including scenarios in which polarization contrast remains high during periods of zero thermal contrast. Also, data will be presented that show the detection of objects in natural clutter long after thermal equilibrium of the target with the background has been established.

Automated Geo-referenced 3D Reconstruction from Multiview Satellite Images
L. Lee*, K. Wang+, J. Frahm+; *MIT Lincoln Laboratory, +University of North Carolina at Chapel Hill. [email protected], [email protected], [email protected]
There is a growing need for high-resolution three-dimensional representations of the earth's terrain and man-made features in applications such as urban planning, flood risk management, coastal erosion, and disaster relief. Traditionally, such high-resolution 3D models could only be achieved through the use of LiDAR or IfSAR technologies. With the availability of high-resolution imaging satellites capable of collecting multi-look data, digital surface models (DSMs) can be acquired by applying 3D reconstruction techniques adapted from the computer vision field. Existing satellite sensors are also able to collect images of much larger areas in a shorter amount of time than LiDAR sensors and hence provide the opportunity for country-scale coverage. Satellite images have sensor model representations (the rational function model, or RFM) that are significantly different from the sensor representations used in traditional 3D reconstruction techniques, which presents a major challenge to the adaptation of existing algorithms. While DSMs have been produced from satellite images before, the process involves manual initiation of feature registration, and the result is most often not geo-referenced, or is geo-referenced manually. We propose an entirely automated DSM generation system that addresses the unique sensor model representation of satellite images by using the RFMs directly, without approximating them by the traditional linear projection matrices familiar in 3D reconstruction in computer vision. In addition, we provide a solution to the lack of automated geo-referencing capability by automatically registering the satellite images to a common reference map data set.

The end result is a high-resolution, geo-referenced 3D surface model produced without any human intervention. This work is sponsored by the Department of the Air Force under Air Force Contract #FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the United States Government.
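
For readers unfamiliar with the RFM, a schematic of a rational-function ground-to-image projection is sketched below. Real satellite RPC metadata uses 20-coefficient cubic polynomials with published offsets and scales; the truncated basis, names, and demo coefficients here are illustrative assumptions only.

```python
import numpy as np

def rfm_project(num_r, den_r, num_c, den_c, lat, lon, h, norm):
    """Schematic rational function model: image row/col as ratios of
    polynomials in normalized ground coordinates."""
    # Normalize ground coordinates to roughly [-1, 1].
    P = (lat - norm["lat_off"]) / norm["lat_scale"]
    L = (lon - norm["lon_off"]) / norm["lon_scale"]
    H = (h - norm["h_off"]) / norm["h_scale"]

    def poly(c):
        # Truncated monomial basis; full RPCs continue up to cubic terms.
        t = np.array([1.0, L, P, H, L*P, L*H, P*H, L*L, P*P, H*H])
        return c @ t

    row_n = poly(num_r) / poly(den_r)
    col_n = poly(num_c) / poly(den_c)
    # De-normalize to pixel coordinates.
    return (row_n * norm["row_scale"] + norm["row_off"],
            col_n * norm["col_scale"] + norm["col_off"])

norm = dict(lat_off=0, lat_scale=1, lon_off=0, lon_scale=1,
            h_off=0, h_scale=1, row_off=512, row_scale=512,
            col_off=512, col_scale=512)
num_r, num_c, den = np.eye(10)[2], np.eye(10)[1], np.eye(10)[0]
row, col = rfm_project(num_r, den, num_c, den,
                       lat=0.3, lon=-0.2, h=0.1, norm=norm)
```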

Integrating Spectral and Textural Features Using Wavelet Analysis for Hyperspectral Image Classification
Brajesh Kumar, Onkar Dikshit. Geoinformatics Group, Department of Civil Engineering, Indian Institute of Technology Kanpur, India {brajeshk, onkar}@iitk.ac.in
This paper presents a supervised classification framework that integrates wavelet-transform-based spectral and textural features for hyperspectral image classification. The experiments are performed on DAIS 7915 data (512 x 512 pixels with 5 m pixel size) acquired over an area known as 'La Mancha Alta' to the south of Madrid, Spain, which is divided into eight land cover classes. The data were acquired over 79 bands ranging from 0.4 to 12.5 μm; only 65 bands were retained after preprocessing to remove the noisy bands. The investigation applies a 1-D discrete wavelet transform (DWT) along the wavelength dimension of the hyperspectral data for dimensionality reduction, followed by a 2-D DWT for texture feature extraction. The combined spectral-textural feature set is used for classification. Pixel-wise classification is performed using a multi-class, one-vs-one support vector machine (SVM) classifier. The SVM is trained with a Gaussian radial basis function (RBF) kernel; the parameters C and γ for the RBF kernel SVM are determined optimally using 5-fold cross-validation. Detailed accuracy analysis using measures such as overall accuracy and the kappa and tau statistics reveals that integration of spectral and spatial information significantly improves the classification accuracy.

Extension of No-Reference Deblurring Methods Through Image Fusion
Michael H. Ferris, Dr. Erik Blasch, Dr. Soundararajan Ezekiel, and Michael McLaughlin; University of Binghamton, AFRL/RI, Indiana University of PA. [email protected], [email protected], [email protected], [email protected]
An important and pressing issue in image quality enhancement is extracting an optimal amount of information from a blurred image without a reference image for comparison. Most studies have approached this issue by using iterative algorithms in an attempt to deconvolve the blurred image into the ideal image. This process is very difficult due to the need to estimate a point spread function for the blur at each iteration, which can be computationally expensive over many iterations. In fact, this process often causes some amount of distortion, or "ringing," in the deblurred image. However, image fusion may provide a solution. By deblurring a no-reference image and then fusing it with the blurred image, we were able to extract additional information from the fused image. As stated above, the deblurring process causes some degree of information loss: the act of fixing one section of the image causes distortion in another section. Hence, by fusing the blurred and deblurred images together we can retain salient information from the blurred image and gain important information from the deblurred image. We have found that this process significantly reduces the "ringing" in the deblurred image. The fusion process is then validated by three different evaluation metrics: Mutual Information (MI), Mean Square Error (MSE), and Peak Signal to Noise Ratio (PSNR).

This paper details an extension of the no-reference image deblurring process, and the initial results indicate that image fusion has the potential to be a highly useful tool in the image deblurring field.
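
The three evaluation metrics named in the abstract have standard forms; a minimal sketch of each (the bin count and peak value are arbitrary choices, not the paper's):

```python
import numpy as np

def mse(a, b):
    return np.mean((a.astype(float) - b.astype(float)) ** 2)

def psnr(a, b, peak=255.0):
    m = mse(a, b)
    return np.inf if m == 0 else 10.0 * np.log10(peak**2 / m)

def mutual_information(a, b, bins=64):
    """Histogram-based MI estimate between two images."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float(np.sum(pxy[nz] *
                        np.log(pxy[nz] / (px[:, None] * py[None, :])[nz])))
```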

Learning Tree-structured Approximations for Conditional Random Fields

Alexei Skurikhin, MS D440, Space Data Systems Group, Los Alamos National Laboratory, Los Alamos, NM, 87545, USA. [email protected]
Exact probabilistic inference is computationally intractable in general probabilistic graph-based models, such as Markov Random Fields and Conditional Random Fields (CRFs). We investigate spanning tree approximations for the discriminative CRF model. We decompose the original, computationally intractable, grid-structured CRF model containing many cycles into a set of tractable sub-models using a set of spanning trees. The structure of the spanning trees is generated uniformly at random among all spanning trees of the original graph. These trees are learned independently to address the classification problem, and Maximum Posterior Marginal estimation is performed on each individual tree. Classification labels are produced via a voting strategy over the marginals obtained on the sampled spanning trees. The learning is computationally efficient because inference on trees is exact and efficient. Our objective is to investigate how well the original loopy graph model with loopy belief propagation inference can be approximated by learning a pool of randomly sampled acyclic graphs. We focus on the impact of memorizing the structure of the sampled trees. We compare two approaches to creating an ensemble of spanning trees whose parameters are optimized during learning: (1) memorizing the structure of the sampled spanning trees used during learning, and (2) not storing the structure of the sampled spanning trees after learning and regenerating trees anew. Experiments are done on two image datasets consisting of synthetic and real-world images. These datasets were designed for the tasks of binary image denoising and man-made structure recognition.
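
A minimal sketch of the tree-ensemble idea (requires NetworkX >= 2.6 for random_spanning_tree; the per-tree inference is a placeholder where the paper performs exact Maximum Posterior Marginal estimation):

```python
import networkx as nx
import numpy as np

H, W = 8, 8
G = nx.grid_2d_graph(H, W)          # 4-connected grid underlying the CRF

# Sample spanning trees uniformly at random; each is a cycle-free
# sub-model on which exact inference is tractable.
trees = [nx.random_spanning_tree(G, weight=None, seed=k) for k in range(10)]

rng = np.random.default_rng(0)

def infer_labels_on_tree(tree):
    # Placeholder: random binary labels instead of exact MPM inference.
    return {node: int(rng.integers(0, 2)) for node in tree.nodes}

votes = np.zeros((H, W))
for tree in trees:
    for (i, j), lab in infer_labels_on_tree(tree).items():
        votes[i, j] += lab

final_labels = (votes / len(trees) > 0.5).astype(int)   # majority vote
```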

Secret Communication in Colored Images Using Saliency Map as Model
Manish Mahajan and Navdeep Kaur. [email protected] and [email protected]
Steganography is a process that involves hiding a message in an appropriate carrier, for example an image or an audio file. Many algorithms have been proposed for this purpose in the spatial and frequency domains. In almost all of these algorithms, however, embedding the secret data disturbs certain characteristics or statistics of the cover image. To deal with this problem, another paradigm, known as adaptive steganography, bases the embedding on a mathematical model. The human visual system does not process the complete area of an image uniformly; rather, it focuses on limited regions, and where visual attention is focused is an active research topic. Research on psychological phenomena indicates that attention is attracted to features that differ from their surroundings, or that are unusual or unfamiliar to the human visual system. Object- or region-based image processing can be performed more efficiently with information about locations that are visually salient to human perception, with the aid of a saliency map. A saliency map may therefore act as the model for adaptive steganography in images. Keeping this in view, a novel steganography technique based upon a saliency map is proposed in this work.
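
As a toy illustration of saliency-guided embedding (the paper's actual model and embedding rule are not given here; the gradient-magnitude "saliency" and the choice to favor the least-salient pixels are assumptions):

```python
import numpy as np

def embed_bits_by_saliency(gray, bits):
    """Hide bits in the LSBs of pixels ranked by a toy saliency map."""
    gy, gx = np.gradient(gray.astype(float))
    saliency = np.hypot(gx, gy)                # stand-in saliency map
    order = np.argsort(saliency, axis=None)    # least salient first
    flat = gray.flatten()
    for k, bit in enumerate(bits):
        idx = order[k]
        flat[idx] = (flat[idx] & 0xFE) | bit   # overwrite the LSB
    return flat.reshape(gray.shape)

rng = np.random.default_rng(0)
cover = rng.integers(0, 256, (64, 64), dtype=np.uint8)
stego = embed_bits_by_saliency(cover, [1, 0, 1, 1, 0, 0, 1, 0])
```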

Automatic detection of ice layers in synthetic aperture radar images
Maryam Rahnemoonfar, School of Engineering and Computing Sciences, Texas A&M University-Corpus Christi, USA [email protected]
Global warming has caused serious damage to our environment in recent years. Accelerated loss of ice from Greenland and Antarctica has been observed in recent decades. The melting of polar ice sheets and mountain glaciers has a considerable influence on sea level rise and on altering ocean currents, potentially leading to the flooding of coastal regions and putting millions of people around the world at risk. Synthetic aperture radar (SAR) systems are able to provide relevant information about the subsurface structure of polar ice sheets. Manual layer identification is prohibitively tedious and expensive and is not practical for regular, long-term ice-sheet monitoring. Automatic layer finding in noisy echogram images is quite challenging due to the large amount of noise, limited resolution, and variations in ice layers and bedrock. This study presents an efficient automatic algorithm to detect several layers of ice sheets using mathematical morphology operations. Our approach involves the identification and selection of internal layers. Experimental results on publicly available SAR data of Greenland and Antarctica show promising capabilities for automatically detecting ice layers.
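
A toy sketch of morphology-based layer picking on an echogram (rows = depth bins, columns = along-track positions); the structuring-element sizes and the per-column peak rule are illustrative assumptions, not the paper's method:

```python
import numpy as np
from scipy import ndimage

def detect_layers(echogram, n_layers=2, smooth=5):
    img = echogram.astype(float)
    img = ndimage.grey_opening(img, size=(3, 3))    # suppress bright speckle
    img = ndimage.grey_closing(img, size=(3, 3))    # fill dark pinholes
    img = ndimage.uniform_filter(img, size=smooth)  # along-track smoothing
    # Keep the n_layers strongest depth bins in each column as layer picks.
    picks = np.argsort(img, axis=0)[-n_layers:, :]
    return np.sort(picks, axis=0)

rng = np.random.default_rng(0)
echo = rng.random((128, 256))
echo[40, :] += 3.0                                   # synthetic layer 1
echo[90, :] += 2.0                                   # synthetic layer 2
layers = detect_layers(echo)                         # shape (2, 256)
```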

Robust Vehicle Edge Detection by Cross Filter Method
Katy Po Ki Tang, Henry Y.T. Ngan, Senior Member, IEEE; Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong [email protected]
In visual surveillance, vehicle tracking and identification are widely applied in applications such as traffic incident detection and traffic control and management. Edge detection is the key to the success of vehicle tracking and identification; it identifies edge locations or geometrical shape changes, in terms of pixel values, along the boundary between two regions in an image. This paper investigates different edge detection methods and introduces a Cross Filter (CF) method, with a two-phase filtering approach, for vehicle images in a given database. First, four classical edge detectors, namely the Canny, Prewitt, Roberts, and Sobel detectors, are tested on the vehicle images. The Canny-detected image is found to offer the best performance in Phase 1. In Phase 2, the robust CF, based on a spatial relationship of intensity change along edges, is applied to the Canny-detected image as a second filtering process. Visual and numerical comparisons among the classical edge detectors and the CF detector are also given. The average DSR of the proposed CF method on 10 vehicle images is 95.57%. Student author: Katy Po Ki Tang.

Time-adapted CLAHE for HDR Tonemapping of Full Motion Video
Matthew Baran, Penn State University, [email protected]

High bit-depth data is becoming ubiquitous in imaging and remote sensing. Single-frame images are often stored and processed at higher precision than can be visualized on standard display technology. This problem is addressed with High Dynamic Range (HDR) tonemapping, which nonlinearly maps brightness levels from a high bit-depth image into a low bit-depth format. High bit-depth video is becoming increasingly available, and the latest video encoding standards are being developed with high bit-depth support. We have developed an approach to HDR tonemapping of high bit-depth video that maps HDR data into older formats and standard displays. We have updated the well-known Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm to perform HDR video tonemapping with a time-adaptive histogram transformation. In addition to brightness contrast, we use the L*a*b* colorspace to amplify color contrast in the video stream. The transformed HDR video data maintains important details in local contrast while maintaining relative brightness levels globally. Our results show that time-adapted HDR tonemapping methods can be used in real-time video processing to store and display HDR data in low bit-depth formats with less loss of detail than simple truncation.

Timing Mark Detection on Nuclear Detonation Video
Lt Col Dan Schmitt, Air Force Institute of Technology, Dayton, OH. [email protected]
During the 1950s and 1960s the United States conducted and filmed over 200 atmospheric nuclear tests, establishing the foundations of atmospheric nuclear detonation behavior. Each explosion was documented with about 20 videos from three or four points of view. Synthesizing the videos into a 3D video will improve yield estimates and reduce error factors. The videos were captured at a nominal 2500 frames per second, but range from 2300-3100 frames per second during operation. In order to combine them into one 3D video, individual video frames need to be correlated in time with each other. When the videos were captured, a timing system shone light into the frame every 5 milliseconds, creating a small exposed circle. This paper investigates several methods of extracting the timing from images in cases where the timing marks are occluded or washed out, as well as when the films are exposed as expected. Results show an improvement over past techniques. For normal, occluded, and washed-out videos, timing is detected with 99.3%, 77.3%, and 88.6% probability with 2.6%, 11.3%, and 5.9% false alarm rates, respectively.

Enhanced View Invariant Gait Recognition Using Feature Level Fusion
Himanshu Chaubey*, Madasu Hanmandlu*, and Shantaram Vasikarla#; *Bharti School of Telecommunication Technology & Management and Department of Electrical Engineering, IIT Delhi, New Delhi, India; #Dept. of Computer Science, California State University, Northridge, CA 91330. [email protected], [email protected], [email protected]
In this paper, following the model-free approach to gait image representation, an individual recognition system is developed using Gait Energy Image (GEI) templates. GEI templates can easily be obtained from an image sequence of a walking person. Low-dimensional feature vectors are extracted from the GEI templates using Principal Component Analysis (PCA) and Multiple Discriminant Analysis (MDA), followed by nearest neighbor classification for recognition.

Genuine and imposter scores are computed to draw Receiver Operating Characteristic (ROC) curves. In practical scenarios, the viewing angles of the gallery data and probe data may not be the same. To tackle such difficulties, a View Transformation Model (VTM) is developed using Singular Value Decomposition (SVD). The gallery data at a different viewing angle are transformed to the viewing angle of the probe data using the View Transformation Model. This paper attempts to enhance the overall recognition rate by an efficient fusion of the features transformed from other viewing angles to that of the probe data. Experimental results show that fusion of view-transformed features enhances the overall performance of the recognition system. Student author: Himanshu Chaubey

A Container-based Elastic Cloud Architecture for Real-Time Full-Motion Video (FMV) Target Tracking

Ryan Wu, Yu Chen, Erik Blasch, Bingwei Liu, Genshe Chen, Dan Shen; Dept. of Electrical and Computer Engineering, Binghamton University, Binghamton, NY 13902, USA; Air Force Research Laboratory, Rome, NY 13440, USA; and Intelligent Fusion Technology, Inc., Germantown, MD 20876, USA. {rwu10, ychen, [email protected], [email protected], {gchen, [email protected]

Full-motion video (FMV) target tracking requires that the objects of interest be detected in a continuous video stream. Maintaining a stable track can be challenging as target attributes change over time, frame rates vary, and image alignment errors drift. As such, optimizing FMV target tracking performance to address dynamic scenarios is critical. Many target tracking algorithms do not take advantage of parallelism due to dependencies on previous estimates, which results in idle computational resources while waiting for those dependencies to resolve. To address this problem, a container-based virtualization technology is adopted to make more efficient use of computing resources and achieve an elastic information fusion cloud. In this paper, we leverage the benefits provided by container-based virtualization to optimize an FMV target tracking application. Using OpenVZ as the virtualization platform, we parallelize video processing by distributing incoming frames across multiple Virtual Environments (VEs). A concurrent VE reassembles the processed frames into the video output. We implement a system that dynamically allocates VE computing resources to match frame production and consumption between VEs. The experimental results verify the viability of container-based virtualization for improving FMV target tracking performance and provide a solution for mission-critical information fusion tasks. Student Author: Ryan Wu
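
The parallelization pattern described (distribute independent per-frame work, then reassemble frames in order) can be sketched with an ordinary process pool standing in for the OpenVZ Virtual Environments; the per-frame computation is a placeholder:

```python
import multiprocessing as mp
import numpy as np

def process_frame(args):
    """Placeholder per-frame work (e.g., detection/feature extraction)."""
    idx, frame = args
    return idx, float(frame.mean())

def parallel_video_pipeline(frames, workers=4):
    with mp.Pool(workers) as pool:
        results = pool.map(process_frame, enumerate(frames))
    # Reassemble in frame order, as the concurrent reassembly VE does.
    return [value for _, value in sorted(results)]

if __name__ == "__main__":
    frames = [np.random.rand(48, 64) for _ in range(16)]
    out = parallel_video_pipeline(frames)
```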

Analyzing the Effects of On-board Sensor Imprecision in WAMI Bundle Adjustment
Hadi Aliakbarpour (1), V. B. Surya Prasath (1), Raphael Viguier (1), Rengarajan Pelapur (1), Mahdieh Poostchi (1), Guna Seetharaman (2), Kannappan Palaniappan (1); (1) Department of Computer Science, University of Missouri, Columbia, MO 65211; (2) Information Directorate, Air Force Research Laboratory, Rome, NY 13441. [email protected]
Camera pose estimation has been explored for the past few decades but remains an active topic with the prevalence of new sensors and platforms. Among existing pose estimation methods, Bundle Adjustment (BA) based approaches are robust, providing reasonable results even when only partial information is available. BA refers to simultaneously refining the poses of N cameras and the 3D structure of scene points, subject to a set of projective constraints, such that an appropriate error measure is minimized. Normally in BA, after extracting salient features and establishing correspondences, an estimate of the camera rotation and translation, together known as the camera pose, is obtained using either fundamental-matrix or homography estimation. These initial estimates are then used in a triangulation step where corresponding features are geometrically fused to obtain an initial estimate of the 3D point cloud reconstruction of the scene structure. The crucial part of BA is the optimization and refinement steps, given the initial estimates. Unlike general BA as used in other computer vision tasks, where there is often no sensor metadata, in BA for Wide Area Motion Imagery (WAMI) noisy camera pose measurements from on-board sensors are available. This enables us to develop efficient, streamlined BA algorithms exploiting sensor and platform geometries and flight paths. We show that the fundamental matrix or homography estimation step can be bypassed, but errors in the metadata due to noisy sensor measurements and adverse operating environments must be taken into account. In this paper, we analyze the effects of measurement noise in position (from GPS) and rotation (from IMU) sensors on BA results, in terms of accuracy and robustness of the recovered camera parameters, using a simulation testbed. We also investigate how matching errors in a sequence of corresponding features used to perform 3D triangulation can affect overall precision. The impact on the robustness of camera pose estimation for N-view BA in the context of large-scale WAMI-based 3D reconstruction is discussed. Student Authors: Raphael Viguier, Rengarajan Pelapur, Mahdieh Poostchi
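
A minimal sketch of the kind of metadata-aware refinement the abstract studies: refine one camera's position against 2D observations of known 3D points, using the noisy GPS position both as the initial estimate and as a prior. Rotation and intrinsics are held fixed and all values are synthetic; a real WAMI BA jointly refines many cameras and the 3D structure.

```python
import numpy as np
from scipy.optimize import least_squares

K = np.array([[1000.0, 0, 320], [0, 1000.0, 240], [0, 0, 1]])  # intrinsics
R = np.eye(3)                                                  # fixed attitude

def project(points3d, cam_pos):
    pc = (points3d - cam_pos) @ R.T          # world -> camera frame
    uv = (K @ pc.T).T
    return uv[:, :2] / uv[:, 2:3]            # perspective divide

def residuals(cam_pos, points3d, obs2d, gps_pos, gps_weight):
    reproj = (project(points3d, cam_pos) - obs2d).ravel()
    prior = gps_weight * (cam_pos - gps_pos)  # penalize drift from metadata
    return np.concatenate([reproj, prior])

rng = np.random.default_rng(0)
pts = rng.uniform([-50, -50, 80], [50, 50, 120], (30, 3))
true_pos = np.zeros(3)
obs = project(pts, true_pos) + rng.normal(0, 0.5, (30, 2))  # pixel noise
gps = true_pos + rng.normal(0, 2.0, 3)                      # noisy GPS

sol = least_squares(residuals, x0=gps, args=(pts, obs, gps, 1.0))
refined_pos = sol.x
```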

Advanced Video Activity Analytics: Integrating and Maturing Computer Vision and Pattern Recognition Algorithms
Bruce Swett, Chief Scientist and Vice President, EOIR Technologies, Inc. [email protected]
While significant strides continue to be made in computer vision and pattern recognition, the software solutions are often fragile, incompatible with other algorithms, and unable to operate in real-time or on massive amounts of data (at scale). The AVAA project has developed a framework for integrating, standardizing, and testing computer vision and pattern recognition algorithms called VPEF (Video Processing and Exploitation Framework). VPEF allows algorithms from commercial, government, open-source, and academic organizations to operate together in a plug-in pipeline architecture. AVAA parallelizes VPEF instances, allowing the algorithms to operate on a high volume of video data without re-engineering the algorithm software.

The Hadoop-based AVAA cloud architecture also provides distributed ingestion, indexing, analysis, data storage, data search/retrieval, and data visualization capabilities. By providing an end-to-end cloud architecture, new computer vision and pattern recognition algorithms can easily be added, parallelized, and improved by combining them with the suite of existing capabilities. The AVAA project provides both a transition path for new algorithms for use by the Department of Defense and a model for public-private partnership in developing and fielding new technologies.

Depth Data Assisted Structure-from-Motion Parameter Optimization and Feature Track Correction

Shawn Recker(1,2), Christiaan Gribble(2), Mikhail M. Shashkov(1), Mario Yepez(1), Mauricio Hess-Flores (1), and Kenneth I. Joy(1) 1 Institute of Data Analysis and Visualization, Univ. of California Davis, Davis, CA 2 Applied Technology Operations, SURVICE Engineering, Belcamp, MD 21234

Structure-from-Motion (SfM) applications attempt to reconstruct the three-dimensional (3D) geometry of an underlying scene from a collection of images taken from various camera viewpoints. Traditional optimization techniques in SfM, which compute and refine the camera poses and 3D structure, rely only on feature tracks, or sets of corresponding pixels, generated from color (RGB) images. With the advent of reliable depth sensor information, these optimization procedures can be augmented to increase the accuracy of the reconstruction. This paper presents a general cost function which evaluates the quality of a reconstruction based upon a previously established angular cost function and depth data estimates. The cost function takes into account two error measures: first, the angular error between each computed 3D scene point and its corresponding feature track location, and second, the difference between the sensor depth value and its computed estimate. A bundle adjustment parameter optimization is implemented using the proposed cost function and evaluated for accuracy and performance. In addition, unlike traditional bundle adjustment, a corrective routine is present to detect and correct inaccurate feature tracks in the event of feature tracking errors. The algorithm involves clustering depth estimates of the same scene point and observing the difference between the depth point estimates and the triangulated 3D point. Results on both real and synthetic data are presented and show that reconstruction accuracy is improved. Student Authors: Shawn Recker, Mikhail M. Shashkov, and Mario Yepez

Evaluating the HSI/Lidar Direct Method for physics-based scene modeling
Ryan N. Givens, Karl C. Walli, Michael T. Eismann. Air Force Institute of Technology, [email protected]
Recent work has been able to automate the process of generating three-dimensional, spectrally attributed scenes for use in physics-based modeling software using the Hyperspectral/Lidar Direct method. The Hyperspectral/Lidar Direct method autonomously generates three-dimensional Digital Imaging and Remote Sensing Image Generation (DIRSIG) scenes from input high-resolution imagery, lidar data, and hyperspectral imagery, and has been shown to do this successfully using both modeled and real datasets.

While the output scenes look realistic and appear to match the input scenes under qualitative comparison, a more quantitative approach is needed to evaluate the full utility of these autonomously generated scenes. This paper seeks to improve the evaluation of the spatial and spectral accuracy of autonomously generated three-dimensional scenes using the DIRSIG model. Two scenes are presented for this evaluation. The first is generated from a modeled dataset created using the DIRSIG model, and the second is generated using data collected over a real-world site. Synthetic imagery over the recreated scenes is then compared to the original input imagery to evaluate how well the recreated scenes match the original scenes in spatial and spectral accuracy, and to determine the ability of the recreated scenes to produce useful outputs for algorithm development.

Particle Filter-based Vehicle Tracking Using Fused Spatial Features and a Non-Linear Motion Model
Raphael Viguier (1), Guna Seetharaman (2), Kannappan Palaniappan (1); (1) Department of Computer Science, University of Missouri, Columbia, MO 65211; (2) Information Directorate, Air Force Research Laboratory, Rome, NY 13441. [email protected]
Tracking in full motion and wide area motion imagery poses many challenges for feature-based techniques, since appearance changes can easily distract the tracker away from the true target. Motion prediction is often used to improve the robustness of target tracking by constraining appearance matching to lie within an expected region of interest based on the target's motion behavior. We improve upon typical motion models by incorporating a more realistic, non-isotropic prediction-error noise model and an orientation- and magnitude-based representation of vehicle dynamics. A particle filter-based approach is used to handle both model complexity and fusion with object spatial appearance feature likelihood maps. Experiments using several video datasets show a significant increase in the average track length, especially during turns. The particle filter combined with a non-linear motion model outperforms our previous Kalman filter-based prediction using a linear motion model in the context of the Likelihood of Features Tracking (LoFT) system. Student Author: Raphael Viguier

Multi-Target Tracking in Video with Adaptive Integration of Appearance and Motion Models
Arslan Basharat (1), Ilker Ersoy (2), Kannappan Palaniappan (2), and Anthony Hoogs (1); (1) Kitware, Inc., 28 Corporate Drive, Clifton Park, NY 12065; (2) Dept. of Computer Science, University of Missouri, Columbia, MO 65211
In recent years, various appearance-based single-target trackers have been proposed with high accuracy in FMV and WAMI. The CSURF and LOFT trackers are two such examples; they are able to continue tracking targets under difficult conditions but require manual initialization and additional computational cost. Tracking at urban scale is challenging when the goal is to automatically track hundreds to thousands of targets in real-time in WAMI, or dozens of high-resolution targets in FMV.

Here we propose a hybrid tracking architecture that utilizes motion detections to robustly initialize multiple tracks, uses a blended approach to integrate appearance-based trackers, provides a generalized API for interfacing such trackers, and adaptively uses motion detection or appearance matching to update a track. High-quality motion detections are evaluated for track updates prior to appearance-based updates due to their lower computational complexity. On the other hand, appearance-based tracker updates are preferred under difficult conditions like temporary stops, low contrast, partial occlusion, complex backgrounds, and clutter. Independent of the approach used to update the track, the system allows the appearance-based trackers to update their models after each track update. Moreover, the architecture also includes time-reversed backward tracking over a limited period of time to exploit asymmetric temporal information for increased target coverage and tracking success. We have successfully interfaced the CSURF and LOFT appearance-based trackers into the proposed architecture. This was achieved by implementing the interface API from the Matlab library implementation of these trackers into the overall C++ system. We present a quantitative evaluation of the proposed system with four different approaches to appearance modeling: CSURF and LOFT are the two recently demonstrated trackers, and for baseline comparison we use template matching with sum-of-squared differences (SSD) and normalized cross-correlation. The results show that CSURF appearance-based tracking produces the best track quality when integrated in the proposed motion-based framework.

Road Sign Detection on a Smartphone for Traffic Safety
Carrie Pritt, IEEE Student Member [email protected]
According to the World Health Organization, 1.24 million deaths are attributed to traffic accidents each year. One approach to reducing traffic fatalities is the use of expensive Advanced Driver Assistance Systems, which are still under development. The goal of this work is the development of a low-cost driver assistance system that runs on an ordinary smartphone. It uses computer vision techniques and multiple-resolution template matching to detect speed limit signs and alert the driver if the speed limit is exceeded. It inputs an image of the sign to be detected and creates a set of multiple-resolution templates. It also inputs photographs of the road from the smartphone camera at regular intervals and generates multiple-resolution images from the photographs. In the first stage of processing, fast filters restrict the focus of attention to smaller areas of the photographs where signs are likely to be present. In the second stage, the system matches the templates against the photographs using fast normalized cross-correlation (NCC) to detect speed limit signs. The multiple resolutions enable the NCC approach to detect signs at different scales. In the third stage, the system recognizes the sign by matching a series of annotated speed templates to the image at the position and scale determined by the detection stage. It compares the speed limit with the actual vehicle speed as computed from the smartphone GPS sensor and issues warnings to the driver as necessary. Student Author: Carrie Pritt

Adaptive Automatic Target Recognition in Single and Multi-Modal Sensor Data

Timothy S. Khuon (1), Robert S. Rand (1), and Eric Truslow (2); (1) National Geospatial-Intelligence Agency, (2) Northeastern University. [email protected], [email protected]
For single-modal data, target recognition and classification in a 3D point cloud is a non-trivial process due to the nature of the data collected from a sensor system, where the signal can be corrupted by noise from the environment, the electronic system, the A/D converter, etc. Therefore, an adaptive system with a specific desired tolerance is required to perform classification and recognition optimally. The feature-based pattern recognition algorithm described below is generalized for solving a particular global problem with minimal change, since for a given class set a feature set must be extracted accordingly. For instance, man-made urban target classification, rural and natural objects, and human organ classification would require different and distinct feature sets. This study compares adaptive automatic target recognition with a single sensor against distributed adaptive pattern recognition with multi-sensor fusion. The similarity in automatic target recognition between sensor fusion and a single sensor is the ability to learn from experience and decide on a given pattern. Their main difference is that sensor fusion makes a decision from the decisions of all sensors, whereas the single sensor requires a feature extraction for a decision.

Change Detection and Classification of Land Cover in Multispectral Satellite Imagery using Clustering of Sparse Approximations (CoSA) over Learned Feature Dictionaries
Daniela I. Moody, Steven P. Brumby; Los Alamos National Laboratory, MS D436, PO Box 1663, Los Alamos, NM 87545 [email protected]
Neuromimetic machine vision and pattern recognition algorithms are of great interest for landscape characterization and change detection in satellite imagery in support of global climate change science and modeling. We present results from an ongoing effort to extend machine vision methods to the environmental sciences, using adaptive sparse signal processing combined with machine learning. A Hebbian learning rule is used to build multispectral, multiresolution dictionaries from regional satellite normalized band difference index data. Land cover labels are automatically generated via our CoSA algorithm (Clustering of Sparse Approximations), using a clustering distance metric that combines spectral and spatial textural characteristics to help separate geologic, vegetative, and hydrologic features. Land cover labels are estimated in example WorldView-2 satellite images of a particular region taken at different times, and are used to detect seasonal and yearly surface changes. Our results suggest that neuroscience-based models are a promising approach to practical pattern recognition problems in remote sensing.

Small Pixel Focal Plane Array Technology
John Caulfield (a), Jerry Wilson (a), Nibir Dhar (b); (a) Cyan Systems, 5385 Hollister Ave, Suite 105, Santa Barbara, CA 93111, email: [email protected]; (b) US Army NVESD

Improved imaging systems using smaller, sub-diffraction-sized pixels have shown good imaging results. There are known limits in undersampled and critically sampled sensors regarding resolution and aliasing. Oversampling the image using sub-diffraction-size pixels offers much more than improved resolution and smaller FPAs, optics, and dewar systems. Oversampled pixels enable processing techniques that yield a number of related system benefits, such as improved Instantaneous Field of View (IFOV), Noise Equivalent Power (NEP), false alarm rate, and detection range, as well as other system-level benefits. We will show data from the first 2.4-megapixel, 5-micron-pitch ROIC and demonstrate that spatial oversampling can reduce aliasing, improve sensitivity, and drive reductions in false alarms through oversampled correlated processing. Oversampled pixels allow larger-format FPAs and smaller optics, resulting in reductions in size, power, and weight. Oversampled IR sensors will also improve detection and acuity in turbulent and hazy conditions over larger-pixel IR focal plane array sensors. We will review the phenomenon that smaller pixels have lower SNR, and show how temporal and spatial oversampling can compensate for and effectively recover the SNR lost with smaller pixels. We will quantify the limits of oversampling performance based on theory, and also with Monte Carlo-type analysis using realistic parameters such as shot noise and thermal noise. We will show quantitative data to illustrate the improvements in resolution, NEP, detection range, and false alarm suppression of the oversampled IR sensor as the temporal and spatial oversampling are increased.

Entropy Metric Regularization for Computational Imaging with Sensor Arrays
Prudhvi Gurram (1) and Raguveer Rao (2); (1) MBO Partners Inc. and (2) Army Research Lab, Adelphi, MD. [email protected] and [email protected]

Correlative interferometric image reconstruction is a computational imaging method for synthesizing images from sensor arrays; it relies on estimating source intensity by using the cross-correlation across near-field or far-field measurements from multiple sensors of the array. Key to the approach is the exploitation of a relationship between the correlation across the sensor measurements and the source intensity. This relationship is of a Fourier-transform type when the sensors are in the far field of the source and the velocity of wave propagation in the intervening medium is constant. Often the estimation problem is ill-posed, resulting in unrealistic image reconstructions. Positivity constraints, boundary restrictions, l1 regularization, and sparsity-constrained optimization have been applied to the recovered source intensity in previous work. In recent work, the sensor measurements were assumed to be noiseless for entropy metric optimization, which is untenable in reality. This paper considers the noisy case and formulates the estimation problem as least-squares minimization with entropy metrics, either minimum or maximum, as regularization terms. Situations involving far-field interferometric imaging of extended sources are considered, and results illustrating the advantages of these entropy metrics and their applicability are provided.

A Comparative Study Of Methods To Solve The Watchman Route Problem In A Photon Mapping-Illuminated 3D Virtual Environment
Bruce A. Johnson (a), Hairong Qi (b), and Jason C. Isaacs (a); (a) Naval Surface Warfare Center, Panama City Division, 110 Vernon Ave., Panama City, FL 32407; (b) Min Kao Electrical Engineering and Computer Science Building, 1520 Middle Dr., Knoxville, TN 37996 USA

Understanding where to place static sensors such that the amount of information gained is maximized while the number of sensors used to obtain that information is minimized is an instance of the NP-hard art gallery problem (AGP). A closely related problem is the watchman route problem (WRP), which seeks to plan an optimal route for one or more unmanned vehicles (UVs) such that the amount of information gained is maximized while the distance traveled to gain that information is minimized. In order to solve the WRP, we present the Photon-mapping-Informed active-Contour Route Designator (PICRD) algorithm. PICRD heuristically solves the WRP by selecting AGP-solving vertices and connecting them, using a shortest-path algorithm, with vertices provided by a 3D mesh generated by a photon-mapping-informed segmentation algorithm. Since we are using photon mapping as our foundation for determining UV-sensor coverage in the PICRD algorithm, we can take into account the behavior of photons as they propagate through the various environmental conditions that might be encountered by one or more UVs. Furthermore, since we are agnostic with regard to the segmentation algorithm used to create our WRP-solving mesh, we can adjust the segmentation algorithm to accommodate different environmental and computational circumstances. In this paper, we demonstrate how to adapt our methods to solve the WRP for single and multiple UVs using PICRD with two different segmentation algorithms under varying virtual environmental conditions.

Human Re-Identification in Multi-Camera Systems
Kevin Krucki, University of Dayton; Dr. Vijay Asari, University of Dayton; Dr. Chris Borel-Donohue, Air Force Institute of Technology; Dr. David Bunker, Air Force Institute of Technology. [email protected]; [email protected]; [email protected]; [email protected]
We propose a human re-identification algorithm for a multi-camera surveillance environment where a unique signature of an individual is learned and tracked in a scene. The feed from each camera is processed using a motion detector to get the locations of all individuals. To compute the signature, we propose a combination of different descriptors on the detected body, such as the Local Binary Pattern histogram (LBPH) for local texture and an HSV color-space-based descriptor for the color representation. For each camera, a signature computed by these descriptors is assigned to the corresponding individual along with their direction in the scene. Knowledge of the person's direction allows us to make separate identifiers for the front, back, and sides. These signatures are then used to identify individuals as they walk across different areas monitored by different cameras. The challenges involved are the variation of illumination conditions and scale across the cameras. We test our algorithm on a dataset captured with 3 Axis cameras arranged in the UD Vision Lab, as well as a subset of the SAIVT dataset, and provide results which illustrate the consistency of the labels as well as precision/accuracy scores. Student Author: Kevin Krucki
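
A toy version of the signature described above (an LBP texture histogram plus an HSV color histogram, matched by nearest neighbor); the descriptor parameters and the cosine-similarity matcher are assumptions:

```python
import numpy as np
from skimage.color import rgb2hsv
from skimage.feature import local_binary_pattern

def person_signature(rgb_crop, hsv_bins=8):
    """Concatenate a uniform-LBP histogram with an HSV color histogram."""
    gray = (rgb_crop.mean(axis=2) * 255).astype(np.uint8)
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    h_lbp, _ = np.histogram(lbp, bins=10, range=(0, 10))
    hsv = rgb2hsv(rgb_crop)
    h_hsv, _ = np.histogramdd(hsv.reshape(-1, 3),
                              bins=(hsv_bins,) * 3, range=[(0, 1)] * 3)
    sig = np.concatenate([h_lbp, h_hsv.ravel()]).astype(float)
    return sig / (np.linalg.norm(sig) + 1e-12)

def reidentify(query_sig, gallery_sigs):
    """Nearest neighbor by cosine similarity over unit-norm signatures."""
    return int(np.argmax(gallery_sigs @ query_sig))

rng = np.random.default_rng(0)
gallery = np.stack([person_signature(rng.random((128, 48, 3)))
                    for _ in range(5)])
probe = person_signature(rng.random((128, 48, 3)))
best_match = reidentify(probe, gallery)
```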

Gender and Age Recognition for a Video Analytics Solution

Vladimir Khryashchev, Andrey Priorov and Alexander Ganin, 150000, Sovetskaya 14-309, P.G. Demidov Yaroslavl State University, Yaroslavl, Russia e-mail: [email protected]

An application for video data analysis based on computer vision and machine learning methods is presented. Novel gender and age classifiers based on adaptive features, local binary patterns, and support vector machines are proposed. Gender recognition, for example, can be used to collect and estimate demographic indicators. It can also be an important preprocessing step for person identification: gender recognition can halve the number of candidates to analyze (given equal numbers of men and women in a database) and thus roughly double the speed of the identification process. More than 94% accuracy in viewer gender recognition is achieved. Human age estimation is another face-analysis problem in the field of computer vision. Among its possible applications one should note electronic customer relationship management, security control and surveillance monitoring, and biometrics. All the stages are united into a real-time audience analysis system. The system extracts all the available information about people from the input video stream, then aggregates and analyzes this information to measure various statistical parameters. These features allow the proposed system to be applied in various spheres of life: crowded public places (stadiums, theaters, and shopping centers), transport hubs (airports, railway and bus stations), digital signage network optimization, etc.
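
A compact sketch of the LBP-plus-SVM pipeline the abstract describes, trained here on synthetic stand-in data; the grid size, LBP parameters, and SVM settings are assumptions:

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def face_lbp_features(gray_face, grid=4):
    """Uniform-LBP histograms over a grid of face patches, concatenated."""
    lbp = local_binary_pattern((gray_face * 255).astype(np.uint8),
                               P=8, R=1, method="uniform")
    h, w = lbp.shape
    feats = []
    for i in range(grid):
        for j in range(grid):
            patch = lbp[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid]
            hist, _ = np.histogram(patch, bins=10, range=(0, 10),
                                   density=True)
            feats.append(hist)
    return np.concatenate(feats)

rng = np.random.default_rng(0)
faces = rng.random((40, 64, 64))            # stand-ins for aligned face crops
labels = rng.integers(0, 2, 40)             # 0 = female, 1 = male (synthetic)
X = np.array([face_lbp_features(f) for f in faces])
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, labels)
pred = clf.predict(X[:5])
```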

Medical Image Segmentation using Multiscale and Super-Resolution methods En-Ui Lin, Soundararajan Ezekiel, Waleed Farag, Indiana University of Pennsylvania. [email protected], [email protected], [email protected]

In many medical imaging applications, a clear delineation and segmentation of areas of interest from low-resolution images is crucial. It is one of the most difficult and challenging tasks in image processing and directly determines the quality of the final image analysis. In preparation for segmentation, we first use preprocessing methods to remove noise and blur, and then use super-resolution to produce a high-resolution image. Next, we use wavelets to decompose the image into different sub-band images. In particular, we use the discrete wavelet transform (DWT) and its enhanced version, the double-density dual-tree discrete wavelet transform (D3-DWT), as they provide better spatial and spectral localization of the image representation, which is of special importance for image processing applications, especially medical imaging. The multi-scale edge information from the sub-bands is then filtered through an iterative process to produce a map of extracted features and edges, which is then used to segment homogeneous regions. We have applied our algorithm to challenging applications such as gray matter and white matter segmentation in Magnetic Resonance Imaging (MRI) images. Finally, we apply performance metrics which demonstrate the strength of our proposed method for medical image segmentation.

Student Author: En-Ui Lin.
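
A minimal sketch of the multiscale step: a 2-level 2D DWT and a crude edge map accumulated from the detail sub-bands. The Haar wavelet, level count, and 90th-percentile threshold are illustrative choices, not the paper's:

```python
import numpy as np
import pywt

rng = np.random.default_rng(0)
img = rng.random((128, 128))                 # stand-in for an MRI slice

coeffs = pywt.wavedec2(img, wavelet="haar", level=2)
# coeffs = [cA2, (cH2, cV2, cD2), (cH1, cV1, cD1)]

edge_map = np.zeros_like(img)
for detail_level in coeffs[1:]:
    for band in detail_level:                # horizontal/vertical/diagonal
        mag = np.abs(band)
        # Upsample the sub-band magnitude back to the image grid.
        fy = img.shape[0] // mag.shape[0]
        fx = img.shape[1] // mag.shape[1]
        edge_map += np.kron(mag, np.ones((fy, fx)))

mask = edge_map > np.percentile(edge_map, 90)   # crude edge/region split
```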

Volumetric Features for Object Region Classification in 3D LiDAR Point Clouds

Nina Varney and Vijayan Asari; University of Dayton [email protected]
LiDAR data is a set of geo-spatially located points which contain (X, Y, Z) location and intensity data. This paper presents the extraction of a novel set of volume- and texture-based features from segmented point clouds. The data is first segmented into individual object regions using an automatic seeded region-growing technique. These object regions are normalized to an N x N x N voxel space, where each voxel contains information about the location and density of points within that voxel. A set of volumetric features, including 3D form factor, fill, stretch, rotation-invariant local binary pattern (RILBP), corrugation, contour, plainness, and relative variance, is extracted to represent each object region. The form factor, fill, and stretch provide a series of meaningful relationships between the volume, surface area, and shape of the object. RILBP provides a textural descriptor from the intensity of the LiDAR data. The corrugation, contour, and plainness are extracted by 3D eigen-analysis of the object volume to describe the details of the object's surface. Relative variance provides a representation of the distribution of points throughout the object. The new feature set is found to be robust, and scale and rotation invariant, for object region classification. The performance of the proposed feature extraction technique has been evaluated on a set of segmented and voxelized point cloud objects in the aerial LiDAR data from Surrey, British Columbia, available through the Open Data Program. The volumetric features, when used as input to a two-class SVM classifier, correctly classified the object regions with an accuracy of 97.5%, with a focus on separating man-made versus vegetation objects. Future research will aim to extend these features as inputs to a multi-class classifier, to identify man-made objects such as fences, vehicles, and buildings.
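
A toy sketch of the voxelization step and two of the named features; the "fill" and "stretch" implementations here are illustrative guesses at the abstract's definitions:

```python
import numpy as np

def voxelize(points, n=16):
    """Map an object's points into an n x n x n density grid,
    preserving aspect ratio."""
    mins = points.min(axis=0)
    scale = (points.max(axis=0) - mins).max() + 1e-9
    idx = ((points - mins) / scale * (n - 1)).astype(int)
    grid = np.zeros((n, n, n))
    np.add.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return grid

def fill_and_stretch(grid):
    occupied = grid > 0
    fill = occupied.mean()                   # fraction of voxels occupied
    extents = [np.ptp(np.nonzero(occupied)[d]) + 1 for d in range(3)]
    stretch = max(extents) / min(extents)    # elongation of the region
    return fill, stretch

pts = np.random.default_rng(0).normal(size=(500, 3)) * [3.0, 1.0, 1.0]
fill, stretch = fill_and_stretch(voxelize(pts))
```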

Analysis of Diurnal, Long-Wave Hyperspectral Measurements of Natural Background and Manmade Targets under Different Weather Conditions

Christoph Borel, Research Associate Professor, Department of Engineering Physics, Air Force Institute of Technology, WPAFB, OH 45433, [email protected]; Dalton Rosario, U.S. Army Research Laboratory, Adelphi, MD 20783, [email protected]; Joao Romano, U.S. Army Armament RDEC, Picatinny Arsenal, NJ 07806, USA, [email protected]

In this paper we describe results of the analysis of diurnal Fourier transform spectrometry data taken at Picatinny Arsenal in New Jersey with the Telops long-wave hyperspectral camera under different weather conditions. In the near future all or part of the data will be made available to the public. The first part of the paper discusses the processing from raw data to calibrated radiance and emissivity data. Data on surrogate tank targets was taken every 6 minutes over several months, under different weather conditions, from a 213 ft tower for a project sponsored by the Army Research Laboratory in Adelphi, MD. An automatic calibration and analysis program was developed which creates calibrated data files and HTML files. The first processing stage is a flat-fielding operation, where the minimum and maximum extent of the interferogram is used to estimate the gain (or vignetting) and the mean value (or baseline) of the interferogram is used as the offset. During this step the mean baseline is used to find dead pixels (baseline low or at the maximum). Noisy pixels are detected by computing the standard deviation over the part of the interferogram that lies outside the center-burst region; if a pre-defined threshold is exceeded, the pixel is flagged as noisy. A bad pixel map for dead and noisy pixels is created, and for each scan line the bad pixels' interferograms are replaced.

Then a flat-fielded and bad-pixel-corrected calibration cube is created, using the gain and offset determined by a single blackbody measurement. In the second stage each flat-fielded cube is Fourier transformed and a real-valued, uncalibrated radiance cube is created for a pre-defined wavenumber range. Next the radiometric calibration is performed using a 2-point calibration computed from the two blackbody measurements, which is then applied to each data cube. Two-point calibrated radiance cubes are then created in ENVI format, and the HTML file contains quicklooks of spectra of selected pixels, the original and flat-fielded cubes as animated GIF images, and links to all the intermediate files that are created. For selected cubes a temperature-emissivity separation algorithm is applied in which the cloudiness and cloud temperature are varied. The resulting retrieved cloudiness fractions will be compared with measured cloud cover fractions for opaque and thin clouds. The second part discusses environmental effects such as diurnal and seasonal atmospheric and temperature changes and the effect of cloud cover on the data. The effect of environmental conditions on the temperature-emissivity separation will be discussed.
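The 2-point radiometric calibration named above is the standard linear gain/offset fit through two blackbody views; a generic sketch (variable names and the small stabilizer are illustrative, not the authors' code):

```python
import numpy as np

def two_point_calibration(raw, raw_cold, raw_hot, L_cold, L_hot):
    """raw*: uncalibrated spectral cubes; L_cold / L_hot: known blackbody
    radiances at each wavenumber (from the Planck function).
    Returns calibrated radiance for the scene cube `raw`."""
    gain = (L_hot - L_cold) / (raw_hot - raw_cold + 1e-12)  # per-pixel, per-band gain
    offset = L_cold - gain * raw_cold                       # instrument self-emission term
    return gain * raw + offset
```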

Background Subtraction Algorithm for Moving Object Detection and Evaluation of Performance Using Different Datasets

Kakali Das and Mrinal Kanti Bhowmik; Department of Computer Science and Engineering, Tripura University, Suryamaninagar 799022, Agartala, India

Moving object detection using video streams has an important role in different computer vision applications such as video surveillance, navigation systems, recognition, classification, and activity analysis. In this paper, a modified object detection technique is proposed, which is an improved version of an existing object detection technique called ViBe (Visual Background Extractor), obtained by adding a post-processing step. The post-processing step applies a simple median filter to obtain a more accurate background subtraction result. This paper also presents the newly created SAMEER-TU (Society for Applied Microwave Electronics Engineering & Research - Tripura University) dataset, containing visual videos for moving object detection. The technique is tested on three video sequences, and ground truths are created for each frame of each video sequence. Some of the input frames of these videos and their corresponding ground truths are reported in this paper. A comparative study is also carried out between existing benchmark datasets and the SAMEER-TU dataset in terms of accuracy. Experimental results are reported over some typical video sequences. The video sequences include some simple videos and some that incorporate dynamic backgrounds, a critical problem for moving object detection. A dynamic background is one in which background objects are moving, such as swaying trees, flowing water, or a fountain. This paper shows that the use of the median filter in the post-processing step of the moving object detection algorithm increases the accuracy by approximately 13%. Experimental results validate the improvements over the existing ViBe algorithm.
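The post-processing step amounts to a median filter over the ViBe foreground mask; a minimal sketch using SciPy (the kernel size is an illustrative choice):

```python
import numpy as np
from scipy.ndimage import median_filter

def clean_foreground(mask, size=5):
    """mask: binary (H, W) foreground mask from a ViBe-style subtractor.
    The median filter removes isolated false detections and fills
    small holes, which is where the reported accuracy gain comes from."""
    return median_filter(mask.astype(np.uint8), size=size) > 0
```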

A 3D Pointcloud Registration Algorithm Based on Fast Coherent Point Drift

Min Lu, Jian Zhao, Yulan Guo, Jianping Ou, Janathan Li; [email protected], [email protected], [email protected], [email protected], [email protected]

Pointcloud registration has a number of applications in various research areas. Computational complexity and accuracy are two major concerns for a pointcloud registration algorithm. This paper proposes a novel Fast Coherent Point Drift (F-CPD) algorithm for 3D pointcloud registration. The original CPD method is very time-consuming, and the situation becomes even worse when the number of points is large. In order to overcome the limitations of the original CPD algorithm, a globally convergent squared iterative expectation maximization (gSQUAREM) scheme is proposed. The gSQUAREM scheme uses an iterative strategy to estimate the transformations and correspondences between two pointclouds. Experimental results on a synthetic dataset show that the proposed algorithm outperforms the original CPD algorithm and the Iterative Closest Point (ICP) algorithm in terms of both registration accuracy and convergence rate.

Student Author: Yulan Guo
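For reference, the squared-extrapolation step at the heart of SQUAREM-type accelerators can be sketched as below; this is the generic scheme for speeding up a fixed-point/EM map, not the authors' exact globally convergent gSQUAREM variant.

```python
import numpy as np

def squarem_step(x, F):
    """One squared-extrapolation step for a fixed-point map F (e.g. an EM update).
    Two plain EM steps are combined into a single accelerated step."""
    x1 = F(x)
    x2 = F(x1)
    r = x1 - x                       # first EM increment
    v = (x2 - x1) - r                # curvature of the EM map
    alpha = -np.linalg.norm(r) / (np.linalg.norm(v) + 1e-12)
    return x - 2 * alpha * r + alpha**2 * v
```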

Democratizing the Visualization of 500 Million Webcam Images

Joseph D. O'Sullivan, Abby Stylianou, Austin Abrams & Robert Pless, Department of Computer Science & Engineering, Washington University in St. Louis, St. Louis, Missouri; [email protected], [email protected], [email protected], [email protected]

Five years ago we reported at AIPR on a nascent project to archive images from every webcam in the world and to develop algorithms to geo-locate, calibrate, and annotate this data. This Archive of Many Outdoor Scenes (AMOS) has now grown to include 28,000 live outdoor cameras and over 550 million images. It is actively being used in projects ranging from large-scale environmental monitoring to characterizing how built-environment changes (such as adding bike lanes in DC) affect physical activity patterns over time. But the biggest value of a very long term, widely distributed image dataset is the rich set of "before" data that can be used to characterize changes in natural experiments. To exploit this we built and share a collection of web tools to support large-scale, data-driven exploration that allow anyone to compare imagery and find unusual events. One visualization tool is "A Tale of Two Years", an image browser that visualizes each image in comparison to an earlier image. When the earlier image is exactly a year earlier, this tool highlights changes in biological response to climate. When used on urban images, the tool highlights changes in built-environment structures. A second visualization tool uses PCA to find the subspace that characterizes the variations in a scene and highlights imagery that does not fit that subspace. This anomaly detection captures both imaging failures, such as lens flare, and unusual situations, such as street fairs. These tools, while not technically complicated, are the parts of AMOS most widely used by a non-technical audience, and we share case studies where these tools highlight interesting scene features and events.

Student Author: Joseph D. O'Sullivan
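A minimal sketch of the second tool's idea: fit a PCA subspace to one camera's archive and rank frames by how poorly the subspace reconstructs them (component count illustrative).

```python
import numpy as np
from sklearn.decomposition import PCA

def anomaly_scores(images, n_components=10):
    """images: (num_images, H*W) array of vectorized frames from one camera.
    Frames the scene subspace cannot explain (lens flare, street fairs)
    get large residuals."""
    pca = PCA(n_components=n_components).fit(images)
    recon = pca.inverse_transform(pca.transform(images))
    return np.linalg.norm(images - recon, axis=1)   # large = unusual frame
```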

Automated 3D Geo Registration Methods

Andrew Brown, Mike Moore, Tim Fair, and John Berger, Toyon Research Corporation; [email protected], [email protected], [email protected], and [email protected]

Toyon Research Corporation has developed a robust library of image processing algorithms for automated 3D reconstruction and has recently developed a software application called 3D Geo Registration (3DGR) for use in automating track alignment procedures and improving registration accuracy for various Wide Area Motion Imagery (WAMI) systems. Toyon's algorithms include advanced sensor models for focal plane and scanning imagers, in addition to data-driven algorithms for automatically registering and aligning images to produce highly accurate models that represent the world. Toyon's 3D model outputs enable alignment with known 2D and 3D reference sources for use in airborne and satellite-based surveillance applications. Toyon has developed these algorithms in conjunction with various Government sponsors and through various Small Business Innovative Research (SBIR) efforts with the Department of Defense and the Department of Homeland Security.

Human Activity Detection using Sparse Representation

Dipti Killedar and Sreela Sasi, Department of Computer and Information Science, Gannon University, Erie, PA, USA; [email protected], [email protected]

Human activity detection from videos is very challenging. It has numerous applications in sports evaluation, video surveillance, elder/child care, etc. Much research has been done to develop different techniques for human activity detection, such as the Hidden Markov Model (HMM), the Maximum Entropy Markov Model (MEMM), sensor-based methods, etc. In this research, a model using sparse representation is presented for human activity detection from video data. Sparse representation creates a model for the video data using a linear combination of a dictionary and a coefficient matrix. The dictionary is created using spatio-temporal features of the video data. These spatio-temporal features are extracted from the training videos using the Spatio Temporal Interest Points (STIP) algorithm. The K-Singular Value Decomposition (K-SVD) algorithm, a generalization of the K-means clustering process, is used for learning dictionaries from these spatio-temporal points. K-SVD is an iterative method that alternates between sparse coding of the data based on the current dictionary and a process of updating the dictionary. After the dictionary learning stage, L1-norm minimization is used to solve the linear equations for a sparse solution. Finally, a human action is classified using the minimum residual value of the corresponding action class in the testing video dataset. Experiments are conducted using the KTH dataset, which contains a number of action videos recorded in a controlled environment. The current approach performed well, classifying activities with a success rate of 90%.
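A minimal sketch of the classification-by-residual idea used above, with orthogonal matching pursuit standing in for the L1 solver; the dictionary layout and names are illustrative, not the authors' code.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def classify(x, dictionary, atom_labels, n_nonzero=10):
    """x: (d,) test feature; dictionary: (d, K) learned K-SVD atoms;
    atom_labels: (K,) action class of each atom.
    The winning class is the one whose atoms best reconstruct x."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero,
                                    fit_intercept=False).fit(dictionary, x)
    coef = omp.coef_
    residuals = {}
    for c in np.unique(atom_labels):
        part = np.where(atom_labels == c, coef, 0.0)    # keep class-c atoms only
        residuals[c] = np.linalg.norm(x - dictionary @ part)
    return min(residuals, key=residuals.get)
```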

Modeling of Nuclear Detonation Optical Output in DIRSIG

1st Lt Ashley Green, Capt Robert Slaughter, and Dr. John McClory, Air Force Institute of Technology; [email protected]

Previous research has demonstrated the capability to simulate the sensor response to a nuclear fireball within the Digital Imaging and Remote Sensing Image Generation (DIRSIG) model, using both an analytical single-temperature model and historic data. Modern nuclear effects codes have been developed that incorporate multidimensional interactions. This research combines the output of a modern radiation and shock physics multi-dimensional nuclear effects code with DIRSIG, a Monte Carlo multi-bounce photon tracking code. The nuclear effects code was used to determine the temperature and density of the three-dimensional nuclear fireball, and its output was then used as the input to DIRSIG. DIRSIG was used to analyze how environmental interactions change the optical signal received by a realistic sensor. Emphasis was placed on the significant effects of photon interactions in an urban canyon scenario following a nuclear detonation, such as reflections off buildings or other surrounding objects. Sensor models were developed for silicon bhangmeters, terrestrial security cameras, and standard vehicle dash cameras to analyze the performance constraints of these sensors from an optical diagnostic perspective.

Student Author: 1st Lt Ashley Green

Automatic Segmentation of Carcinoma in Radiographs

Fatema Albalooshi1, Sara Smith2, Paheding Sidike1, Yakov Diskin1 and Vijayan Asari1; 1University of Dayton, 2University of Cincinnati College of Medicine; [email protected]

A strong emphasis has been placed on making the healthcare system and the diagnostic procedure more efficient. In this paper, we present an automatic detection technique designed to segment out abnormalities in X-ray imagery. The proposed algorithm allows radiologists and their assistants to sort and analyze large amounts of imagery more effectively. In radiology, X-ray beams are used to detect various densities within a tissue and to display accompanying anatomical and architectural distortion. Lesion localization within fibrous or dense tissue is complicated by a lack of clear visualization compared to tissues with an increased fat distribution. As a result, carcinoma and its associated unique patterns can often be overlooked within dense tissue. We introduce a new segmentation technique that integrates prior knowledge, such as intensity level, color distribution, texture, gradient, and shape of the region of interest taken from prior data, within the segmentation framework to enhance the performance of region and boundary extraction of diseased tissue regions in medical imagery. Prior knowledge of the intensity of the region of interest can greatly help guide the segmentation process, especially when the carcinoma boundaries are not well defined and when the image contains non-homogeneous intensity variations. We evaluate our algorithm by comparing our detection results to manually segmented regions of interest, and through metrics we illustrate the effectiveness and accuracy of the algorithm in improving diagnostic efficiency for medical experts.
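One ingredient of such prior-guided segmentation can be sketched as intensity-prior region growing; the prior statistics, threshold, and seed are illustrative, and this is not the authors' full framework (which also uses color, texture, gradient, and shape priors).

```python
import numpy as np
from collections import deque

def prior_guided_growing(img, seed, prior_mean, prior_std, k=2.5):
    """img: 2-D X-ray array; seed: (row, col) inside a suspected lesion.
    Pixels are accepted only if they fit the intensity prior learned
    from previously segmented lesions."""
    lo, hi = prior_mean - k * prior_std, prior_mean + k * prior_std
    mask = np.zeros(img.shape, dtype=bool)
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        if not (0 <= r < img.shape[0] and 0 <= c < img.shape[1]):
            continue
        if mask[r, c] or not (lo <= img[r, c] <= hi):
            continue
        mask[r, c] = True
        queue.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return mask
```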

Novel Geometric Coordination Registration in Cone-beam Computed Tomogram

Walter Lam, [email protected]

The use of cone-beam computed tomography (CBCT) in the medical field enables clinicians to visualise the hard tissue of the head and neck region in a cylindrical field of view (FOV). The images are usually presented as a reconstructed three-dimensional (3D) image together with its orthogonal (x-, y- and z-plane) images. The spatial relationship of the structures in these orthogonal views is important for the diagnosis of disease as well as for treatment planning. However, the non-standardized positioning of the object during CBCT data acquisition often induces errors in measurement, since orthogonal images cut at different planes might look similar. In order to solve this problem, this study proposes an effective mapping from the Cartesian coordinates of a physical cube to its respective coordinates in the 3D imaging. The object (real physical domain) and the imaging (computerized virtual domain) can thereby be linked up and registered. In this way, the geometric coordination of the object/imaging is defined and its orthogonal images are fixed on defined planes. The images can then be measured with vector information, and serial images can be directly compared.
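A minimal sketch of the proposed mapping, assuming matched fiducial landmarks on the cube are available: fit an affine transform from physical coordinates to image coordinates by least squares, then apply it to any physical point.

```python
import numpy as np

def fit_affine(phys, img):
    """phys, img: (M, 3) matched landmark coordinates, M >= 4.
    Returns a (4, 3) matrix mapping homogeneous physical points to image space."""
    A = np.hstack([phys, np.ones((phys.shape[0], 1))])   # homogeneous coordinates
    T, *_ = np.linalg.lstsq(A, img, rcond=None)
    return T

def to_image(T, p):
    """Map one physical point p = (x, y, z) into image coordinates."""
    return np.append(p, 1.0) @ T
```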

Bayesian Solutions to Non-Bayesian Detection Problems: Unification Through Fusion

Alan Schaum, Naval Research Laboratory, [email protected]

In 1950 Abraham Wald proved that every admissible statistical decision rule is either a Bayesian procedure or the limit of a sequence of such procedures. He thus provided a decision-theoretic justification for the use of Bayesian inference, even for non-Bayesian problems. It is often assumed that his result also justified the use of Bayesian priors to solve such problems. However, the principles one should use for defining the values of prior probabilities have been controversial for decades, especially when applied to epistemic unknowns. A new approach indirectly assigns values to the quantities usually interpreted as priors by imposing specific constraints on a detection algorithm. No assumptions about prior "states of belief" are necessary. The result shows how Wald's theorem can accommodate both Bayesian and non-Bayesian problems. The unification is mediated by the fusion of clairvoyant detectors.

Rapid Location of Radiation Sources in Complex Environments using Optical and Radiation Sensors

Christoph Borel and David Bunker, Center for Technical Intelligence Studies and Research, Department of Engineering Physics, Air Force Institute of Technology, WPAFB, OH 45433; Graham W. Alford, University of Tennessee, Knoxville, TN; [email protected], [email protected], [email protected]

Baseline radiation background is almost never known and constantly changes, particularly in urban areas. It is difficult to know what the expected background radiation should be and how a radiological incident may elevate it. Naturally occurring radiation from rocks and building materials often contributes significantly to measured radiation. Buildings and other tall structures also shield radiation and thus need to be taken into account. Models of naturally occurring background radiation can be derived from knowledge of geology, building material origins, vegetation, and weather conditions. After a radiological incident, the radiation will be elevated near the event, and some material may be transported by mechanisms such as airborne transport and/or run-off. Locating and characterizing the sources of radiation quickly and efficiently is crucial in the immediate aftermath of a nuclear incident. The distribution of radiation sources will change both naturally and due to clean-up efforts. Finding source strengths and locations during both the initial and clean-up stages is necessary to manage and reduce contamination. The overall objective of the "Rapid Location Of Radiation Sources In Complex Environments Using Optical And Radiation" research project is to design and validate gamma ray spectrum estimation algorithms that integrate optical and radiation sensor collections into high-resolution, multi-modal site models for use in radiative transport codes. Our initial focus will be on modeling the background radiation using hyperspectral information from visible-through-shortwave-infrared sensors and thermal imagers. The optical data will complement available ancillary data from other sources such as Geographic Information Systems (GIS) layers, e.g. geologic maps, terrain, surface cover type, road network, vegetation (e.g. serpentine vegetation), 3-D building models, known users of radiological sources, etc. In the absence of GIS layers, the data from the hyperspectral imager would be analyzed with special software to automatically create GIS layers, and combined with radiation survey data to predict the background radiation distribution. We believe the estimation and prediction of the natural background will be helpful in finding anomalous point, line, and small-area sources and in minimizing the number of false alarms due to natural and known man-made radiation sources such as medical radiological facilities and industrial users of radiological sources.

Sparse 3D Reconstructions of Atmospheric Nuclear Detonations

Robert Slaughter, Tyler Peery, John McClory and Karl Walli, Air Force Institute of Technology; [email protected], [email protected], [email protected] and [email protected]

Researchers at Lawrence Livermore National Laboratory (LLNL) have started digitizing nearly 10,000 technical films spanning the above-ground atmospheric nuclear testing operations conducted by the United States between 1945 and the 1960s. Researchers at the Air Force Institute of Technology (AFIT) have begun employing modern digital image processing and computer vision techniques to exploit this data set and determine specific invariant features of the early fireball's dynamic growth. The focus of this presentation is to introduce the methodology behind three-dimensional reconstructions of time-varying nuclear fireballs. Multi-view geometry algorithms were used to perform sparse reconstructions of the nuclear events from the multiple cameras observing the detonation at different look angles. From matched image pairs, sparse reconstructions are determined through manual selection of key features.
This presentation will focus on the applied technique, pursued techniques, initial results, and difficulties of performing a 3D reconstruction of a time-varying nuclear fireball from digitized films.

Student Authors: Robert Slaughter and Tyler Peery
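The core geometric step, triangulating one manually matched feature from two views, can be sketched with the standard linear (DLT) method; the projection matrices are assumed known from camera calibration.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """P1, P2: (3, 4) camera projection matrices for two look angles;
    x1, x2: (2,) pixel coordinates of the same fireball feature in each view.
    Solves the homogeneous system A X = 0 via SVD."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]            # inhomogeneous 3D point
```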

Smoothed Chemical Vapor Detection

Stefan Wager and Guenther Walther, Stanford University. [email protected], [email protected]

Manolakis and D’Amico [2005] describe methods for chemical vapor detection by hyperspectral imaging. These methods work well with strong signals; however, with weaker signals, their performance is reduced by the presence of background noise. In this paper, we show how to improve their performance using spatial smoothing with a spline basis tailored to the gas detection problem. We present results on the 2014 DTRA/NSF/NGA chemical detection challenge.

Student Author: Stefan Wager
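As a rough illustration of why spatial smoothing helps with weak signals, the sketch below smooths a per-pixel detector score map; a Gaussian filter stands in for the paper's tailored spline basis, and the score map and bandwidth are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smoothed_scores(detector_scores, sigma=2.0):
    """detector_scores: (H, W) per-pixel chemical-detector output.
    Neighboring pixels share the gas plume, so spatial averaging
    suppresses background noise while keeping coherent signal."""
    return gaussian_filter(detector_scores, sigma=sigma)
```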

Machine Learning Nuclear Detonation Features

Lt Col Dan Schmitt, Air Force Institute of Technology, Dayton, OH; [email protected]

Nuclear explosion yield estimation equations based on a 3D model of the explosion volume will have lower uncertainty than the current radius-based estimation. Accurately collecting data for a volume model of atmospheric explosions requires building a 3D representation from 2D images. The majority of 3D reconstruction algorithms use the SIFT (scale-invariant feature transform) feature detection algorithm, which works best on feature-rich objects with continuous angular collections. These assumptions do not hold for the archive of nuclear explosions, which have only 3 points of view. This paper reduces 300 dimensions derived from an image, based on Fourier analysis and five edge detection algorithms, to a manageable number in order to detect sunspots that may be used to correlate videos of different viewpoints for 3D reconstruction. Furthermore, experiments test whether histogram equalization, Wiener filtering, and median filters improve detection of these features using four kernel sizes passed over these features. Dimension reduction using principal component analysis (PCA), forward subset selection, ReliefF, and FCBF (Fast Correlation-Based Filter) is combined with a Mahalanobis distance classifier to find the best combination of dimensions, kernel size, and filtering to detect the sunspots. Results indicate that sunspots can be detected with hit rates of 90% and false alarms < 1%.
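A minimal sketch of one branch of the study, PCA reduction followed by a Mahalanobis-distance classifier; the component count and feature layout are illustrative, not the paper's configuration.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit(features, labels, n_components=20):
    """features: (n, 300) per-image descriptors; labels: (n,) class ids."""
    pca = PCA(n_components=n_components).fit(features)
    Z = pca.transform(features)
    stats = {}
    for c in np.unique(labels):
        Zc = Z[labels == c]
        stats[c] = (Zc.mean(axis=0), np.linalg.inv(np.cov(Zc, rowvar=False)))
    return pca, stats

def predict(pca, stats, x):
    """Assign x to the class with the smallest Mahalanobis distance."""
    z = pca.transform(x.reshape(1, -1))[0]
    d = {c: (z - m) @ S_inv @ (z - m) for c, (m, S_inv) in stats.items()}
    return min(d, key=d.get)
```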

Foley: The Art of the Transparent Soundscape

Andy Malcolm, [email protected]

In this talk by Emmy Award winning foley artist Andy Malcolm, you will be introduced to the unique art of foley. Foley is the art of performing sounds for film, video and other media to make viewers believe that the sound effects are actually real. Except for situations where the foley becomes animated, the best foley tracks are transparent: the viewers should not be able to tell that the sound was not actually part of the filming process itself. Andy will explain and demonstrate how and why this process is used in producing sound for movies. The art of foley reinforces what the audience is experiencing and brings the action to life. The most effective sound isn't always the actual one. You will see how Andy creates the illusion of the real sound using surprisingly unconventional objects in unexpected ways. The term "Foley" is named after Jack Foley, the first practitioner of the art. Jack started in the motion picture business in the silent picture era and lived through the exciting times when the industry converted to sound.

Sound and image have remained the primary components of cinema for many decades. Each is as sophisticated and as carefully constructed as the other. Foley artists represent an anachronism: the survival of acoustic invention in an era of digitized technology.

Representing Pictures with Sound

Edward Schaefer, System Development Engineer, [email protected]

A coarse representation of pictures can be created with sound, and a series of such sounds can be used to represent an animation or a movie. In this project, pictures are divided into a 4x4 array of "sound pixels". The position of each sound pixel is assigned a musical note, and the contents of each sound pixel are used to set a volume. The resultant sound is the representation of the picture. Algorithms for creating notes and volumes will be described. The behavior of the program will be illustrated with sequences of pictures with sounds. Generating sounds for movies using this technique will also be discussed.
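A toy rendition of the scheme, in which the brightness of each of the 16 cells sets the volume of its note; the chromatic note assignment from A3 is an illustrative choice, not the author's mapping.

```python
import numpy as np

def picture_to_sound(img, sr=44100, dur=0.5):
    """img: grayscale image as a 2-D array; returns `dur` seconds of audio."""
    img = np.asarray(img, dtype=float)
    # 4x4 grid of "sound pixels": mean brightness per cell, row-major order
    cells = np.array([
        block.mean() for row in np.array_split(img, 4, axis=0)
        for block in np.array_split(row, 4, axis=1)
    ])
    vols = cells / (cells.max() + 1e-9)          # brightness -> volume
    t = np.linspace(0, dur, int(sr * dur), endpoint=False)
    freqs = 220.0 * 2 ** (np.arange(16) / 12.0)  # one note per cell
    audio = sum(v * np.sin(2 * np.pi * f * t) for v, f in zip(vols, freqs))
    return audio / np.abs(audio).max()
```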

Marginal Space Deep Learning for Efficient Anatomy Detection in Volumetric Image Data

Bogdan Georgescu, Yefeng Zheng, Hien Nguyen, Vivek Singh, David Liu, Dorin Comaniciu; Imaging and Computer Vision, Siemens Corporate Technology, Princeton, NJ; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]

Fast and robust anatomical object detection is a fundamental task in medical image analysis, supporting the entire clinical workflow from diagnosis through patient stratification, therapy planning, intervention, and follow-up. State-of-the-art methods for automatic image analysis exploit large annotated image databases, being based on machine learning techniques applied to relevant image features associated with the anatomy of interest. Nevertheless, when the object appearance is complex or the dimensionality of the object parameter space is large, there are still challenges in providing effective anatomy detection solutions. With this work we introduce a class of marginal space deep learning techniques that combine the strengths of deep learning artificial neural networks with learning on projected parameter subspaces of increasing dimensionality. One of the characteristics of the deep learning architecture is the ability to encode complex image patterns in hierarchical features on each neural network layer. In addition, by using approximately decomposable weights, we can preserve the classification performance while significantly improving the speed of applying such a classifier to high-dimensional data. Furthermore, the mechanism of marginal space learning allows us to learn classifiers in marginal spaces of gradually increasing dimensionality. For example, to detect a 3D object described by 9 pose parameters (three parameters each for position, orientation, and scale), we learn marginal classifiers in the position space, the position-orientation space, and the position-orientation-scale space. As a result, the overall learning process is efficient, focusing the search on high-probability regions of the parameter space and thus providing excellent run-time performance. We demonstrate the proposed marginal space deep learning technique for landmark detection in volumetric computed-tomography data and cardiac magnetic resonance images. Cross-validated experiments show a significant error reduction and speed-up in comparison with previously reported results.
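The coarse-to-fine search over marginal spaces can be sketched as a cascade that keeps only the top hypotheses at each stage; the three scoring functions below are placeholders for the trained marginal classifiers, and the candidate sets and top-k are illustrative.

```python
def marginal_space_detect(positions, orientations, scales,
                          score_pos, score_pos_orient, score_full, top_k=100):
    """Search position, then position-orientation, then full pose,
    pruning to the top_k hypotheses at each stage."""
    # Stage 1: position space only
    best_p = sorted(positions, key=score_pos, reverse=True)[:top_k]
    # Stage 2: position-orientation space
    po = [(p, o) for p in best_p for o in orientations]
    best_po = sorted(po, key=lambda h: score_pos_orient(*h), reverse=True)[:top_k]
    # Stage 3: position-orientation-scale space
    pos_full = [(p, o, s) for (p, o) in best_po for s in scales]
    return max(pos_full, key=lambda h: score_full(*h))
```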

Cloud Based High Performance Video Transcoding Platform

Amit Puntambekar, Mike Coward, Craig Lee and Garrett Choi, QuickFire Networks Corporation, San Diego, CA; [email protected], [email protected], [email protected], [email protected]

The rapid proliferation of video capture devices (e.g. smartphones), coupled with the desire to view content on TV Everywhere devices (tablets, smartphones, etc.), has exposed a new bottleneck: the need to quickly transcode (translate) that massive amount of video into formats that TV Everywhere devices can understand. We showcase one of the fastest video transcoding services currently known in the state of the art. Available via the public and private cloud, QuickFire.TV is based on highly parallel, distributed processing software, enabling video processing at 10-100x real time versus existing solutions that transcode at 1-2x real time.

KWiver: An Open-Source Cross-Platform Video Exploitation Framework

Keith Fieldhouse, Matthew J. Leotta, Arslan Basharat, Russell Blue, David Stoup, Charles Atkins, Linus Sherrill, Benjamin Boeckel, Paul Tunison, Jacob Becker, Matthew Dawkins, Matthew Woehlke, Roderic Collins, Matt Turek, Anthony Hoogs [all with Kitware, Inc.]; Corresponding Author: [email protected]

We introduce KWiver, a cross-platform video exploitation framework that Kitware has begun releasing as open source. Kitware is utilizing a multi-tiered open-source approach to reach as wide an audience as possible. Kitware's government-funded efforts to develop critical defense technology will be released back to the defense community via Forge.mil, a government open source repository. Infrastructure, algorithms, and systems without release restrictions will be provided to the larger video analytics community via KWiver.org and GitHub. Our goal is to provide a video analytics technology baseline for repeatable and reproducible experiments, and to provide a focal point for collaboration and contributions from groups across the community. KWiver plans to provide several foundational capabilities. A multi-processing framework allows algorithmic worker code to execute and communicate in a multiprocessing environment. A companion data abstraction layer allows code to scale from small-scale desktop environments based on file I/O to large multi-core systems communicating via databases. Visualization tools provide cross-platform GUIs for viewing algorithmic results overlaid on source video data. Finally, an integrated evaluation framework enables not only quantitative evaluation via common detection and tracking metrics, but also qualitative feedback by feeding annotation and scoring states to the visualization tools. KWiver is the technology behind a full-frame, frame-rate WAMI tracker which has been deployed OCONUS and has been released as government open source on Forge.mil. Upcoming releases will include FMV source data, ground truth, baseline tracking capability, computed tracking results, and evaluation products.

An Automated Workflow for Observing Track Data in 3-Dimensional Geo-Accurate Environments

Derek J. Walvoord and Bernard V. Brower, Exelis, 400 Initiative Drive, Rochester, New York 14606; [email protected] and [email protected]

Recent developments in computing capabilities and persistent surveillance systems have enabled advanced analytics and visualization of image data. Using our existing capabilities, this work focuses on developing a unified approach to the task of visualizing track data in 3-dimensional environments. Our current structure from motion (SfM) workflow is reviewed to highlight our point cloud generation methodology, which offers the option of using available sensor telemetry to improve performance. To this point, an algorithm outline for navigation-guided feature matching and geo-rectification in the absence of ground control points (GCPs) is included in our discussion. We then provide a brief overview of our on-board processing suite, which includes real-time mosaic generation, image stabilization, and feature tracking. Exploitation of geometry refinements, inherent to the SfM workflow, is then discussed in the context of projecting track data into the point cloud environment for advanced visualization. Results using the new Exelis airborne collection system, Corvus Eye, are provided to discuss conclusions and areas for future work.

Large Displacement Optical Flow Based Image Predictor Model

Nishchal K. Verma and Aakansha Mishra, Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur, India; [email protected] and [email protected]

This paper proposes a Large Displacement Optical Flow based Image Predictor Model for generating future image frames from past and present image frames. The predictor model is an Artificial Neural Network (ANN) / Radial Basis Function Neural Network (RBFNN) model whose input is the horizontal and vertical velocity components, estimated using Large Displacement Optical Flow, for every pixel in a given image sequence. There has been a significant amount of past research on generating future image frames from a given set of image frames. The quality of the generated images is evaluated by the Canny edge detection Index Metric (CIM) and the Mean Structure Similarity Index Metric (MSSIM). For our proposed algorithm, the CIM and MSSIM indices for all generated future images are better than those of the most recent existing algorithms for future image frame generation. The objective of this study is to develop a generalized framework that can predict future image frames for any given image sequence with large displacements of objects. In this paper, we validate our Image Predictor Model on an image sequence of a landing jet fighter, and the obtained performance indices are better than those of the most recent existing image predictor models.
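A simple flow-based baseline for frame prediction (not the ANN/RBFNN model above): estimate dense flow between the two most recent frames and, assuming constant velocity, warp the present frame forward one step. The Farneback parameters are illustrative defaults.

```python
import cv2
import numpy as np

def predict_next(prev_gray, curr_gray):
    """Predict the next frame by pushing the current frame along the flow."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = curr_gray.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    # constant-velocity assumption: sample the current frame one flow step back
    map_x = (xs - flow[..., 0]).astype(np.float32)
    map_y = (ys - flow[..., 1]).astype(np.float32)
    return cv2.remap(curr_gray, map_x, map_y, cv2.INTER_LINEAR)
```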

On Parzen Windows Classifiers

Jing Peng1 and Guna Seetharaman2 1 Computer Science Department, Montclair State University, Montclair, NJ. 2 Information Directorate, Air Force Research Laboratory, Rome, NY

[email protected] and [email protected]

Parzen Windows classifiers have been applied to a variety of density estimation and classification tasks with considerable success. Parzen Windows are known to converge in the asymptotic limit. However, there is a lack of theoretical analysis of their performance with finite samples. In this paper we show a connection between Parzen Windows and the regularized least squares (RLS) algorithm, which has a well-established foundation in computational learning theory. This connection allows us to provide interesting insight into Parzen Windows classifiers and their performance in finite sample settings. Finally, we show empirical results on the performance of Parzen Windows classifiers using a number of real data sets. These results corroborate our analysis well.
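For reference, a minimal two-class Parzen Windows classifier is just a kernel-weighted vote; one way to see the kind of link the paper develops is that this rule coincides with the kernel RLS solution in the heavily regularized limit, where the coefficient vector becomes proportional to the label vector.

```python
import numpy as np

def parzen_classify(x, X_train, y_train, h=1.0):
    """x: (d,) query; X_train: (n, d); y_train: (n,) labels in {-1, +1}.
    The decision is the sign of the difference of the two class-conditional
    Gaussian-window density estimates."""
    sq = np.sum((X_train - x) ** 2, axis=1)
    k = np.exp(-sq / (2 * h ** 2))            # Gaussian window of width h
    return np.sign(np.sum(k * y_train))       # kernel-weighted class vote
```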

Modified Deconvolution using Wavelet Image Fusion

Michael McLaughlin1, Erik Blasch2, Soundararajan Ezekiel1, Mark Alford2, Maria Cornacchia2, Adnan Bubalo2, Millicent Thomas3; 1Indiana University of Pennsylvania, Indiana, PA, 2Air Force Research Lab, Rome, NY, 3Northwest University, Kirkland, WA.

[email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Image quality can be affected by a number of factors; the two predominant ones are noise and blur. Blur typically manifests itself as a smoothing of edges and can be described as the convolution of an image with an unknown blur kernel or function. The inverse of this process is known as deconvolution, which is very difficult even in the absence of noise. Removing blur from an image has multiple stages: first, we must identify or approximate the blur kernel, and then perform a deconvolution of the estimated kernel and blurred image. This is often an iterative process, with successive approximations of the kernel leading to optimal results. However, a given image is unlikely to be blurred uniformly; in real-world situations blur can be a product of object motion or camera motion/defocus, which lead to separate blur kernels. Deconvolution will sharpen blurred regions but also degrade the regions previously unaffected by blur, and the process is complex and computationally expensive. To remedy this, we propose a novel modified deconvolution approach to removing blur from a no-reference image. First, we estimate the blur kernel; then we perform a deconvolution on the blurred image. Finally, wavelet techniques are used to fuse the blurred and de-blurred images. In this way we recover the details in the blurred image that are lost by deconvolution, but retain the sharpened features in the de-blurred image. We evaluate the effectiveness of our proposed approach using several metrics and compare it to standard approaches. Our results show that this approach has potential applications in many fields including medical imaging, topography, and computer vision.
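A sketch of the pipeline under simple assumptions: FFT-domain Wiener deconvolution with the estimated kernel, then a wavelet fusion that keeps the larger detail coefficient at every position, with the approximation band taken from the de-blurred image. The regularization constant and wavelet choice are illustrative, and the kernel is assumed registered with the image grid.

```python
import numpy as np
import pywt

def wiener_deconv(img, kernel, k=0.01):
    """Regularized inverse filtering with the estimated blur kernel."""
    K = np.fft.fft2(kernel, s=img.shape)
    H = np.conj(K) / (np.abs(K) ** 2 + k)
    return np.real(np.fft.ifft2(np.fft.fft2(img) * H))

def fuse(blurred, deblurred, wavelet="db2", level=2):
    """Per-coefficient max-magnitude fusion of the two images' detail bands."""
    cb = pywt.wavedec2(blurred, wavelet, level=level)
    cd = pywt.wavedec2(deblurred, wavelet, level=level)
    fused = [cd[0]]                            # approximation from de-blurred image
    for db, dd in zip(cb[1:], cd[1:]):
        fused.append(tuple(np.where(np.abs(a) > np.abs(b), a, b)
                           for a, b in zip(db, dd)))
    return pywt.waverec2(fused, wavelet)
```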

Sparse Generalized Fourier Series via Collocation-Based Optimization

Ashley Prater, Air Force Research Laboratory, Information Directorate, Rome, NY; [email protected]

Generalized Fourier series with orthogonal polynomial bases have useful applications in several fields, including differential equations, pattern recognition, and image and signal processing. However, computing the generalized Fourier series can be a challenging problem, even for relatively well-behaved functions. In this paper, a method for approximating a sparse collection of Fourier-like coefficients is presented that uses a collocation technique combined with an optimization problem inspired by recent results in compressed sensing research. The discussion includes approximation error rates and numerical examples to illustrate the effectiveness of the method. One example displays the accuracy of the generalized Fourier series approximation for several test functions, while the other is an application of the generalized Fourier series approximation to rotation-invariant pattern recognition in images.
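A minimal sketch of the collocation idea with a Legendre basis, using Lasso as a stand-in for the paper's compressed-sensing-style optimization; the node choice, degree, and penalty are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_legendre_coeffs(f, degree=50, n_points=200, alpha=1e-4):
    """Recover a sparse generalized Fourier coefficient vector for f on [-1, 1]."""
    # Chebyshev collocation nodes
    x = np.cos(np.pi * (2 * np.arange(n_points) + 1) / (2 * n_points))
    A = np.polynomial.legendre.legvander(x, degree)   # (n_points, degree + 1)
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=50000)
    lasso.fit(A, f(x))                                # L1-penalized least squares
    return lasso.coef_                                # mostly zeros if f is sparse
```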

Indoor Non-Linear Target Tracking Using Wi-Fi and Video

Tommy Chin, Rochester Institute of Technology, [email protected]

Target tracking through object recognition software and wireless signal measurements on Wi-Fi enabled devices has been used in the past decade to enhance the security and safety of an area of interest. Many organizations, such as a municipal airport or a grocery store, use an array of distributed cameras to monitor the wellbeing of their premises. Object recognition software exhibits many pitfalls when a target of interest is out of view behind occluding items, such as a shelf, that hide the directionality of the individual. Additionally, visual tracking is lost when the target moves into a crowded region that creates confusion for the recognition system. A common solution for estimating the direction and track of the individual is the use of Kalman and Gaussian filters, with the aim of continuously tracking the target in obscured environments. This approach has a weakness of its own: the prediction becomes invalid when the motion is non-linear. To address this problem, the main intent of this research is to apply information fusion to Received-Signal-Strength-Indication (RSSI) measurements collected through Wi-Fi enabled mobile devices and to object recognition data, in order to track a target through a distributed camera system within an indoor environment. The primary filtering mechanisms are Kalman and Gaussian methods. As a result of the research, the newly formed tracking solution is compared against the actual path for correlation. This is a novel and intuitive approach to target tracking as it can be utilized in indoor environments.
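A sketch of the fusion filter: a constant-velocity Kalman filter that ingests both vision-derived and RSSI-derived position fixes, weighting each by its own measurement noise. All noise values are illustrative, not the paper's tuning.

```python
import numpy as np

# state = (x, y, vx, vy); unit time step, constant-velocity motion model
F = np.block([[np.eye(2), np.eye(2)], [np.zeros((2, 2)), np.eye(2)]])
H = np.hstack([np.eye(2), np.zeros((2, 2))])       # both sensors observe position
Q = 0.01 * np.eye(4)                               # process noise
R_CAM, R_RSSI = 0.1 * np.eye(2), 4.0 * np.eye(2)   # RSSI fixes are far noisier

def step(x, P, z, R):
    """One predict/update cycle given a position fix z with noise R."""
    x, P = F @ x, F @ P @ F.T + Q                  # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                 # Kalman gain
    return x + K @ (z - H @ x), (np.eye(4) - K @ H) @ P
```

In use, one would call `step` with `R_CAM` whenever the cameras see the target and fall back to `R_RSSI` fixes when it is occluded, so the track survives the visual gaps described above.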

Motion Estimation of Flotation Froth Using Mutual Information and Kalman Filter

Anthony Amankwah1 and Chris Aldrich2; 1Computer Science Department, University of Ghana, P O Box LG Accra, Ghana (E-mail: [email protected]); 2Department of Mining Engineering and Metallurgical Engineering, Western Australian School of Mines, Curtin University, GPO Box U1987, Perth, WA, 6824, Australia

The estimation of froth motion using image processing techniques is difficult since bubbles collapse and merge, leading to bubble deformations. The most popular motion estimation technique is block matching. In the block matching algorithm the image frame is typically divided into non-overlapping rectangular blocks, and the best match to the current block of pixels is searched for in the previous frame of the video sequence within a search area about the location of the current block. The optimal solution is the full search algorithm, which exhaustively searches for the best-matched block over all locations of the search window. Due to the high computational cost of full search, several fast search techniques have been introduced. These reduce the number of matching calculations per block by selecting a subgroup of possible candidate locations; they include three-step search, four-step search, diamond search, and 2D logarithmic search. The mean square error (MSE) and mean absolute difference (MAD) are considered the best similarity metrics for motion estimation. In this work we use mutual information with a bin size of two as the similarity metric; its computational cost is similar to that of MSE and MAD. To further improve the accuracy of our algorithm's estimates we use the Kalman filter. Experimental results show that the proposed motion estimation technique improves motion estimation accuracy in terms of the peak signal-to-noise ratio of the reconstructed frame.
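Mutual information with a bin size of two reduces to a 2x2 joint histogram over binarized blocks; a minimal sketch of the similarity metric (the median threshold is an illustrative quantizer):

```python
import numpy as np

def mutual_information_2bin(block_a, block_b):
    """Mutual information between two image blocks after 2-level quantization."""
    a = (block_a > np.median(block_a)).ravel().astype(int)
    b = (block_b > np.median(block_b)).ravel().astype(int)
    joint = np.histogram2d(a, b, bins=2)[0] / a.size   # 2x2 joint distribution
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)      # marginals
    nz = joint > 0
    return np.sum(joint[nz] * np.log(joint[nz] / np.outer(pa, pb)[nz]))
```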

Multi-resolution Deblurring Michael McLaughlin1, Erik Blasch2, Soundararajan Ezekiel1, Mark Alford2, Maria Cornacchia2, Adnan Bubalo2, Millicent Thomas3 1Indiana University of Pennsylvania, Indiana, PA, 2Air Force Research Lab, Rome, NY, 3Northwest University, Kirkland, WA.

[email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

As technology advances, blur in an image remains an ever-present issue in the image processing field. A blurred image is mathematically expressed as the convolution of a blur function with a sharp image, plus noise. Removing blur from an image has been widely researched and is still an active field today. Without a reference image, identifying, measuring, and removing blur from a given image is very challenging. The process involves estimating a blur kernel to match the various types of blur, including camera motion/defocus and object motion. Various blur kernels have been studied over many years, but the most common function is the Gaussian. Once the blur kernel (function) is estimated, a deconvolution is performed with the kernel and the blurred image. Many existing methods operate in this manner; however, while they remove blur from the blurred region, they damage the un-blurred regions of the image. This is because the actual intensity values of the pixels are used in the deblurring process and are easily distorted. The method proposed in this paper uses multi-resolution analysis techniques to separate blur, edge, and noise coefficients. Deconvolution with the estimated blur kernel is then performed on these coefficients, instead of the actual pixel intensity values, before reconstructing the image. Additional steps are taken to retain the quality of un-blurred regions of the blurred image. The result is then compared against standard deblurring techniques using several metrics, including mutual information and structural-similarity-based metrics.

Experimental results on simulated and real data show that our approach achieves higher-quality results than previous approaches on various blurry and noisy images. Further, our approach has military, medical, and topographic applications.

Towards Visual Analysis of Unconstrained Images in Social Forums: Studies On Concept Detection and Personalized Economy Of Images In Social Networks

Sangmin Oh, Eric Smith, Yiliang Xu, Anthony Hoogs; Kitware Inc., NY; [email protected]

In this work, we present our recent work on visual analysis of unconstrained images in social forums. Recently, enormous amounts of images are being shared via social networks, exhibiting extreme diversity in semantic content, visual quality, and style. While such a large quantity of shared content can be used as a resource to extract valuable information, it has become crucial to develop algorithmic solutions that automate visual understanding and enable knowledge discovery from such challenging resources. For visual content retrieval, we show that concept detectors trained from example images based on extracted features can be effectively used to identify a variety of concepts in a large archive of data shared on social forums. Such categories include objects, scenes, and events, among others, and an advanced algorithmic solution is used to accelerate both the learning and detection processes. In addition, we present a novel study analyzing individual users' behavioral patterns regarding images shared on social forums. In particular, we view diverse user activities on social multimedia services as an economy, where the first activity mode of sharing or posting is interpreted as supply, and another mode of activity, such as commenting on images, is interpreted as consumption. To characterize user profiles in these two behavioral modes, we propose an approach that characterizes users' supply and consumption profiles based on the image content types with which they engage. We then present various statistical analyses, which confirm that there is an unexpected, significant difference between these two behavioral modes. Furthermore, we introduce a statistical approach to identify users with salient profiles, which can be useful for social multimedia services to block users with undesirable behavior or to promote viral content.

Polarimetric Calibration and Characterization of the Telops Field Portable Polarimetric-Hyperspectral Imager in the LWIR

Joel Holder, Air Force Institute of Technology, Dayton, OH; [email protected]

Polarimetric-hyperspectral imaging brings two traditionally independent modalities together to potentially enhance scene characterization capabilities. This could increase confidence in target detection, material identification, and background characterization over traditional hyperspectral imaging. In order to fully exploit the spectro-polarimetric signal, a careful calibration process is required to remove both the radiometric and polarimetric response of the system (gain). In the long-wave infrared (LWIR, 8 μm to 12 μm), calibration is further complicated by the polarized self-emission of the instrument itself (offset). This paper presents a calibration methodology developed for a LWIR Telops Hyper-Cam that has been modified with a rotatable linear wire-grid polarizer (4000 line/mm, ZnSe, 350:1 extinction ratio). A standard spectro-radiometric calibration method for Fourier-transform spectrometers (FTS) is modified with a Mueller matrix approach to account for polarized transmission through, and polarized self-emission from, each optical component. This is done for two cases: one assuming that the instrument polarizer is ideal, and a second method which accounts for a non-ideal instrument polarizer. It is shown that a standard two-point radiometric calibration at each instrument polarizer angle is sufficient to remove the polarimetric bias of the instrument if the instrument polarizer can be assumed to be ideal. For the non-ideal polarizer case, the Mueller deviation matrix is determined for the system and used to quantify how non-ideal the system is. The noise-equivalent s1, s2, and DoLP are also quantified using a wide-area blackbody. Finally, a scene with a variety of features is imaged and analyzed.
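For reference, the noise-equivalent s1, s2, and DoLP rest on the textbook linear-Stokes reduction from measurements at four polarizer angles; a sketch of that standard relation (not the paper's full Mueller-matrix calibration):

```python
import numpy as np

def linear_stokes(I0, I45, I90, I135):
    """I*: images (or radiance cubes) measured with the wire-grid polarizer
    at 0, 45, 90, and 135 degrees. Returns the linear Stokes parameters
    and the degree of linear polarization."""
    s0 = 0.5 * (I0 + I45 + I90 + I135)     # total intensity
    s1 = I0 - I90                          # horizontal vs. vertical preference
    s2 = I45 - I135                        # diagonal preference
    dolp = np.sqrt(s1**2 + s2**2) / (s0 + 1e-12)
    return s0, s1, s2, dolp
```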