defense applications white paper
Jul-2012
4984 El Camino Real Suite 205
Los Altos, CA 94022 T. 650-967-4067
[email protected] www.piXlogic.com
Intelligent Image and Video Search for Defense Applications
Government Reseller for piXlogic 10314 Thornbush Lane Bethesda, MD 20814 (301) 787-2989
A piXlogic White Paper
Sponsored by Flex Analytics
July 2012 pg. 3
Contents
Introduction
Problem Statement
Previous Options
The piXserve Solution
Key Features of piXserve
Security Applications
Implementation
Summary
About piXlogic
Introduction
Images and videos have always been key
elements of intelligence and defense
operations. In recent years, the scope and
diversity of digital imagery has greatly
increased in every area: ground, satellite,
UAV, surveillance, broadcast, etc. The
volume of material being acquired and
stored is staggering, with no visible plateau
in sight. Traditional methods of organizing,
cataloguing, and distributing this material to
analysts and the war-fighter are becoming
impractical due to the scale involved. On
the other hand, timely access to nuggets of
vital information contained in images/videos
is key to operational success. The ability to
cross-correlate the information, whether it’s
being obtained from live sources or from
archived repositories, is more important than
ever.
In this environment, image/video search and
retrieval has become the new “must have”
element of any comprehensive solution.
Unfortunately, today’s image/video
management systems are not well suited to
help make sense of the data collected, and
can only provide, at best, very limited search
and retrieval capabilities.
Problem Statement
Most video management systems offer
limited options for automating processes
such as searching archived footage, or
generating alerts from live video. For the
most part, these features are either not
available, or only available in a very limited
sense. Often, a significant amount of
manpower is required to carry out even
simple search tasks. This is well known in
the field. Correlating visual data from
different sources is another very
challenging task, mostly done
manually today. Automated change
detection is yet another largely
elusive goal.
Industry/government efforts during
the last few years have focused on
building infrastructure and have
resulted in great improvements in the
ability to acquire higher resolution
imagery/full motion video, moving
this material around the network
efficiently, and storing it. These are
great accomplishments, but by
themselves they are not enough.
Now is the time to leverage previous
investments and provide a much
needed level of automation so that
analysts can deal with the size and
scale of the problems they face.
However, for most solution
providers, this remains a significant
technical challenge.
Previous Options
When automated video analysis tools
are available, they tend to be single-
purpose with limited scope of
applicability and stringent operating
requirements. Consider the
following three examples:
Automated License Plate
Recognition: For most systems, the
hurdle is to know where the license
plate is in the image being analyzed.
To circumvent this problem, solution
providers either require the use of
specialized cameras (infrared) or that
the cameras be placed such that the
license plate to be recognized is
generally in the same location on the image.
Both of these requirements limit the scope
of applications possible with such systems.
Face Recognition: Much as in the ALPR
case, a big hurdle is to know where the face
to be measured is on the image. To solve
this, typical systems require that the distance
between the camera and the subject be
within a predefined range. Lighting
variations are also critical, which is why the
more successful implementations are limited
to indoor, entryway-type setups.
Outdoor video in unconstrained
environments presents a challenge that is
outside the realm of most commercial
solutions available today.
Object Detection: The ability to
detect/recognize/search for specific objects
in a video or an image is not usually
available. Some attempts have been made
for video, but the methods used are overly
simplistic and unreliable. A typical
technique relies on “frame differencing” to
separate moving things from a stationary
background. The idea is simple but
unfortunately it only works in trivial
situations. If the camera is moving, the
background will move as well and frame
differencing techniques won’t work.
Turning off a light, a cloud passing in the
sky, a moving shadow: these are all things
that can yield undesired results. Even when
the background and the camera are
stationary, the amount of information that is
obtained is limited. If the camera is
calibrated, some guess about the size of the
object can be made and from this an
inference can be derived about what is in the
scene (perhaps an adult, maybe not a dog),
but even this can be quite unreliable (is
it a dog, a tumbleweed, or a faraway
person?). Crowded environments
present a critical challenge to today’s
systems.
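The weakness of frame differencing described above can be seen in a minimal sketch. It assumes tiny grayscale frames stored as lists of rows and a hypothetical brightness threshold of 30. It illustrates the generic technique only and is not drawn from any particular product.

```python
# Toy frame differencing on tiny grayscale frames (lists of rows, values 0-255).
# THRESHOLD is an illustrative assumption, not a value taken from any product.
THRESHOLD = 30


def changed_pixels(prev_frame, next_frame, threshold=THRESHOLD):
    """Return the (row, col) positions whose brightness changed by more than threshold."""
    changed = set()
    for r, (prev_row, next_row) in enumerate(zip(prev_frame, next_frame)):
        for c, (p, n) in enumerate(zip(prev_row, next_row)):
            if abs(n - p) > threshold:
                changed.add((r, c))
    return changed


# A static 3x4 background of uniform brightness.
background = [[100] * 4 for _ in range(3)]

# Case 1: a bright object appears at row 1, column 2 of the static scene.
with_object = [row[:] for row in background]
with_object[1][2] = 200
object_mask = changed_pixels(background, with_object)

# Case 2: nothing moves, but the lighting drops by 50 levels everywhere.
dimmed = [[value - 50 for value in row] for row in background]
lighting_mask = changed_pixels(background, dimmed)

print(sorted(object_mask))   # positions flagged when the object appears
print(len(lighting_mask))    # number of positions flagged by the lighting change
```

In the first case only the object's pixel is flagged. In the second case every pixel is flagged even though nothing in the scene moved, which is the failure mode the text describes.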
The piXserve Solution
piXserve is a general-purpose
image/video search and alerting
solution. Breakthrough technology
developed by piXlogic allows the
software to automatically “see” the
contents of an image/video frame,
create a searchable index, and use
this information so that users can
search and create alerts in a very
natural and logical way.
1. piXserve automatically
“segments” an image in a
way that discerns the
individual objects in the
image. It creates a
mathematical description of
the appearance of these
objects "on the fly", and
stores it as a searchable index
in a database.
2. piXserve reasons about what it "saw"
in the image and develops an initial
level of "understanding" about
content and context. Where it can, it
automatically creates searchable
"tags" for what it saw in the image
(piXlogic calls these tags "Notions").
For example, it can detect the
presence of things such as: sky,
vegetation, flower, face, building,
car, map, airplane, helicopter, etc.
3. piXserve uses all the information
calculated from the image to make
comparisons between a search image
and previously indexed
images/videos so that users can find
results that most closely match what
they are looking for.
4. piXserve can "see" not only visual
objects but also text strings that may
appear anywhere in the field of view
of the image. This text is also
indexed and made searchable.
piXserve works with text from many
languages (alphanumeric/Latin-character-based
languages, Japanese,
Korean, Chinese, etc.).
5. Depending on the quality of the
imagery involved and the type of
search being done, piXserve has
been designed to achieve accuracies
in excess of 85%.
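The general idea of indexing object descriptions and tags, then ranking indexed items against a query, can be sketched as follows. This is a toy illustration only: the three-number descriptors, the cosine-similarity ranking, the class name ToyVisualIndex, and the source identifiers are all assumptions made for the example and do not describe piXserve's actual algorithms or data model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVisualIndex:
    """Hypothetical in-memory index of object descriptors and tags."""

    def __init__(self):
        self.entries = []  # each entry: (source_id, descriptor, tags)

    def add(self, source_id, descriptor, tags=()):
        self.entries.append((source_id, list(descriptor), frozenset(tags)))

    def search_by_example(self, query_descriptor, top_k=3):
        """Return the source IDs whose descriptors are most similar to the query."""
        scored = [
            (cosine_similarity(query_descriptor, descriptor), source_id)
            for source_id, descriptor, _ in self.entries
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [source_id for _, source_id in scored[:top_k]]

    def search_by_tag(self, tag):
        """Return the source IDs that carry the given tag."""
        return [source_id for source_id, _, tags in self.entries if tag in tags]


index = ToyVisualIndex()
index.add("video1.mpg#frame120", [0.9, 0.1, 0.0], tags={"car"})
index.add("photo7.jpg", [0.1, 0.8, 0.1], tags={"face"})
index.add("video2.mpg#frame15", [0.8, 0.2, 0.1], tags={"car", "building"})

best = index.search_by_example([1.0, 0.0, 0.0], top_k=2)
cars = index.search_by_tag("car")
print(best)
print(cars)
```

A production system would use far richer descriptors and a database-backed index; the sketch only shows how search by example and search by tag can share one index.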
Key Features of piXserve
Automatic Indexing
Point piXserve to a repository of
images/video files or to a live
video feed, and automatically
index content. No manual
intervention or data entry
required.
Powerful Search
Through a web browser
interface, users log in to
piXserve, connect to
available databases, and
formulate search queries to
retrieve desired images/video
segments:
1. Use an arbitrary image
to search for
images/video segments
that contain the same or
similar items
2. Use the mouse to point to an area of
the query image to indicate
which specific item(s) should
be searched for.
3. Browse the contents of existing
databases, grab a frame “on the
fly” from a video that is
playing, and use that frame to
formulate a visual search
query.
4. Search images and videos by
object class ("Notion")
5. Type a text string to search for
pictures/videos where that
string appears in the field of
view (a license plate, a street
sign, a name tag, etc.)
6. Search for faces of specific
individuals
7. Perform not only simple but
also complex multi-modal
searches. (Example: find video
sequences where something
like the bag in this picture
AND this face from this other
picture AND this text string I
just typed all appear in the field
of view at the same time.)
Use AND, OR, and NOT
operators to combine up to
6 criteria in a single query.
8. Search by file name
9. Search by keyword or
other external metadata,
if available.
10. Submit sample images
of non-deformable
objects of interest and
automatically tag
images/video frames
when these items are
visible.
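The multi-modal search in item 7 combines criteria with AND, OR, and NOT, up to six per query. A minimal sketch of how such a query could be evaluated against indexed frames is shown here. The class names (Has, Not, And, Or), the label strings, and the frame identifiers are hypothetical and are not part of the piXserve interface; only the six-criteria limit comes from the text.

```python
from dataclasses import dataclass

MAX_CRITERIA = 6  # limit on criteria per query, as stated in the text


@dataclass(frozen=True)
class Has:
    """Leaf criterion: a label (object, face, or text string) is present."""
    label: str


@dataclass(frozen=True)
class Not:
    operand: object


@dataclass(frozen=True)
class And:
    operands: tuple


@dataclass(frozen=True)
class Or:
    operands: tuple


def count_criteria(query):
    """Count the leaf criteria in a query tree."""
    if isinstance(query, Has):
        return 1
    if isinstance(query, Not):
        return count_criteria(query.operand)
    return sum(count_criteria(operand) for operand in query.operands)


def matches(query, frame_labels):
    """Evaluate a query tree against the set of labels indexed for one frame."""
    if isinstance(query, Has):
        return query.label in frame_labels
    if isinstance(query, Not):
        return not matches(query.operand, frame_labels)
    if isinstance(query, And):
        return all(matches(operand, frame_labels) for operand in query.operands)
    return any(matches(operand, frame_labels) for operand in query.operands)


def search(query, indexed_frames):
    """Return the IDs of frames that satisfy the query, enforcing the criteria limit."""
    if count_criteria(query) > MAX_CRITERIA:
        raise ValueError("a query may combine at most %d criteria" % MAX_CRITERIA)
    return [fid for fid, labels in indexed_frames.items() if matches(query, labels)]


# Hypothetical index: frame ID -> labels found in that frame.
frames = {
    "clip_a#0010": {"bag", "face:person_1", "text:EXIT"},
    "clip_a#0011": {"bag", "text:EXIT"},
    "clip_b#0200": {"face:person_1", "car"},
}

# "This bag AND this face AND this text string, with no car in view."
query = And((Has("bag"), Has("face:person_1"), Has("text:EXIT"), Not(Has("car"))))
hits = search(query, frames)
print(hits)
```

Requiring all criteria to hold on the same frame mirrors the "all appear in the field of view at the same time" condition in the example query.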
Powerful Automated Tagging
1. Automatically tag
images/video frames
with the name of
recognized
individuals that
appear therein
(automated face
naming).
2. Suggest keywords to
describe the contents
of a picture/video
frame (automated
keyword
recommendations)
3. Submit sample
images of non-
deformable objects of
interest and
automatically tag
images/video frames
when these items are
visible. (automated
2D-object detection
and naming)
Powerful Alerts
Create alert criteria just as you would
formulate a search query. piXserve-ALERT
keeps track of what the
piXserve machines on the network
are indexing, and when a match
consistent with the user’s criteria is
found, it generates a signal. The
user receives an e-mail with a link to
the alert results. A JMS (Java
Message Service) message is also
generated to pass the alert on to other
systems and applications for further
action.
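The alerting flow described above (stored criteria, checked as content is indexed, with a notification on each match) can be sketched in a few lines. The class names, the label strings, and the notify callback are assumptions for illustration; the callback stands in for the e-mail and JMS delivery mentioned in the text and does not reproduce the piXserve-ALERT interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class AlertCriterion:
    """Hypothetical alert: fires when all required labels appear in one frame."""
    owner: str
    required_labels: Set[str]


@dataclass
class ToyAlertService:
    """Checks every newly indexed frame against all registered criteria."""
    notify: Callable[[str, str], None]  # stand-in for e-mail/JMS delivery
    criteria: List[AlertCriterion] = field(default_factory=list)

    def register(self, criterion):
        self.criteria.append(criterion)

    def on_frame_indexed(self, frame_id, labels):
        """Notify the owner of each matching criterion; return how many fired."""
        fired = 0
        for criterion in self.criteria:
            if criterion.required_labels <= labels:  # subset test
                self.notify(criterion.owner, frame_id)
                fired += 1
        return fired


sent = []  # records (owner, frame_id) pairs instead of sending real messages
service = ToyAlertService(notify=lambda owner, frame_id: sent.append((owner, frame_id)))
service.register(AlertCriterion("analyst_a", {"helicopter"}))
service.register(AlertCriterion("analyst_b", {"car", "text:ABC123"}))

service.on_frame_indexed("cam3#t=00:01:05", {"car", "sky"})
service.on_frame_indexed("cam3#t=00:01:06", {"car", "text:ABC123"})
print(sent)
```

Keeping delivery behind a callback means the matching logic stays the same whether the alert goes out by e-mail, a message queue, or both.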
Powerful Metadata
The rich metadata calculated
by piXserve for each image/video
frame processed (objects and tags)
can be exploited to enable
customized applications that are of
high value in a variety of settings,
such as:
1. Automatic change
detection when
videos taken at different
times from different angles
are compared.
2. Determining which portions
of a video archive contain
useful information, and
which could be safely deleted
to minimize storage
requirements.
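Both applications listed above reduce to simple operations once per-frame objects and tags are available. The following sketch assumes the metadata has already been reduced to sets of labels per scene or per video segment; the function names, labels, and segment identifiers are hypothetical. Real change detection across different viewing angles also requires viewpoint-tolerant object matching, which is not shown.

```python
def object_level_changes(labels_before, labels_after):
    """Compare the object labels indexed for the same scene at two times."""
    return {
        "appeared": sorted(labels_after - labels_before),
        "disappeared": sorted(labels_before - labels_after),
    }


def deletion_candidates(segment_labels):
    """Return IDs of segments for which the index recorded no objects or tags."""
    return [seg_id for seg_id, labels in segment_labels.items() if not labels]


# Hypothetical labels indexed for one scene on two different days.
first_pass = {"building", "car", "fence"}
second_pass = {"building", "fence", "truck"}
changes = object_level_changes(first_pass, second_pass)

# Hypothetical archive: segment ID -> labels found anywhere in that segment.
archive = {
    "cam1_0000-0100": {"car", "face"},
    "cam1_0100-0200": set(),
    "cam1_0200-0300": {"sky"},
}
candidates = deletion_candidates(archive)

print(changes)
print(candidates)
```

Whether an empty segment can actually be deleted is a policy decision; the sketch only identifies candidates for review.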
Scalable Architecture
piXserve is a multi-threaded,
scalable J2EE application that is
suitable for the most demanding
implementations.
Web Services API
A REST-based API package
is available to support
integrations with third party
applications and workflow
environments.
Security Applications
If you are concerned with the cost,
speed, and accuracy of your video
investigative work, whether it be
forensic in nature or dealing with
live situations, then you should
consider piXserve as a “must-have”
add-on to your current system.
Conventional systems focus on
managing and manipulating cameras
and storage devices. Unfortunately,
they only provide limited capabilities
for searching the captured video:
time, date, motion, and transaction
trigger are among the more
common options available.
While useful, these features alone are
inadequate to support a productive
workflow, and significant manpower
is required even for the simpler
tasks. Common situations involve
several operators having to stare at a
bank of monitors for hours on end in
order to catch an event of interest, or
having to wade through hundreds of
hours of video from many cameras
looking for a specific event or trying
to correlate separate ones. These
situations are labor intensive, error
prone, and do not scale well.
piXserve extends the capabilities of today’s
systems by adding the ability to
automatically analyze the video that is being
collected and stored. These video streams
can be intercepted by piXserve and analyzed
for alerting purposes. Similarly, recorded
video can be analyzed, searched and
correlated using piXserve. The analytical
capabilities in piXserve support: facial
recognition, general purpose object
detection and recognition, text recognition,
license plate recognition, automatic tagging,
and more. All the indexing work is done
automatically, server side, in the
background. Users are then free to create
visually-based search criteria and navigate
the body of accumulated material. They can
do all of this “on the fly”, as they see fit at
the moment, based on whatever problem or
situation they are dealing with.
The piXserve search environment is
intuitive and productive, and the user
interface is through a web browser (Internet
Explorer, Mozilla Firefox, Safari, Google
Chrome, or equivalent). Users can
drag and drop a picture from anywhere to
formulate a similarity search query, or pause
a video while it’s playing and use that
frame to create a new search criterion or
refine an existing one. This latter capability
greatly simplifies the discovery process
precisely in those situations when the user
isn’t quite sure what they are looking for and
is working in an investigative/exploratory
mode.
Implementation
piXserve can process videos in a
variety of formats (MPEG-1, MPEG-2,
MPEG-4, H.263, H.264, etc.).
piXserve can also process still
images in over 90 different formats
(JPEG, TIFF, PNG, BMP, PSD, etc.).
piXserve can index both archived
video and live video broadcast
from multicast IP cameras. piXserve
and piXserve-ALERT run on
standard 2-CPU rack servers (multi-core
Intel Xeon processors or
equivalent), in a Windows Server
2003 or 2008 environment.
Customers typically choose Dell or
HP hardware for implementation.
piXserve is available in both 32-bit and
64-bit versions.
In order to index archived video,
piXserve requires that the storage
device be accessible via a network
share (Linux/Unix/Windows).
Further, the stored video should not
be in a proprietary, non-standard
format.
A single server can process large
amounts of archived material, or live
video from multiple feeds/sources.
The higher the number of cores on
the server, the higher the number of
hours of video per day that can be
processed by a single machine.
piXserve implementations can range
in size, from as little as a single
server to scalable multi-server and
distributed configurations. The
architecture of the product is such
that as the needs of the customer
grow, hardware can be added to
parallelize throughput and serve growing
needs.
The metadata created by piXserve is stored
in an RDBMS (Oracle and MS-SQL are
supported; PostgreSQL is bundled with
piXserve). The data and the piXserve output
can be integrated or correlated with that from
other systems that the customer may be
using. The alerting functionality is provided
by piXserve-ALERT. A single instance of
piXserve-ALERT can serve many users and
monitor potentially thousands of alert
criteria. Here too, scaling is achieved by
adding ALERT servers. In
configurations where several hundred or
thousands of individuals will be searching
piXserve-generated data, the use of
piXserve-Enterprise Edition is
recommended.
Summary
Images and videos are a critical element of
defense and intelligence operations. It is
very difficult to deal with an ever-growing
amount of captured video without
automation. The alternatives to
automation are expensive, time
consuming, and prone to errors. At
the same time, there is a lack of
suitable tools to provide a
meaningful level of real-world
automation.
piXserve provides an unparalleled
level of image analysis and
understanding. In a single tool it
provides capabilities that span:
object detection and recognition,
face recognition, license plate
recognition, text recognition,
automatic tagging, and more. In
each of these areas, piXserve
redefines the state of the art and can
help you meet the efficiency and
effectiveness goals that you have set
for yourself.
About piXlogic
piXlogic is a privately held company
located in Los Altos, CA, USA, the
heart of Silicon Valley. piXlogic is
an In-Q-Tel portfolio company (a
venture capital organization that
serves the needs of the US
Intelligence Community). The
company’s flagship products are
piXserve and piXserve-ALERT.
The software enables:
Content Discovery (find
pictures/videos that contain
specific objects, scenes, text, or
people of interest)
Content Auto-tagging (automatically
label an image/video)
Content Alerting (automatically inform
users when items of interest appear in a
live video stream or web crawl)
Content Change Detection
(automatically compare images and
video segments to detect changes at
the object level)
piXlogic serves the needs of government
and industrial customers. piXlogic sells its
products directly and through a network of
resellers in the US, the UK, Japan, Australia,
Argentina, Israel, and Italy.
Corporate piXlogic, Inc. 4984 El Camino Real Suite 205 Los Altos, CA 94022
T. +1-650-967-4067 E. [email protected] W. www.piXlogic.com
Flex Analytics is a systems integrator and
software reseller in the U.S. Intelligence
Community. It supports the sale,
implementation and customization of
piXserve in government installations.
Government Sales Flex Analytics LLC 10314 Thornbush Ln Bethesda, MD 20814
+1-301-787-2989 [email protected] www.flexanalytics.com