www.chipwrights.com
VideoKit
Application Note
July 2010 DOC-30008-1.0
This application note describes the key features of ChipWrights’ VideoKit source code. This note
provides usage examples for many of the video processing functions available on the ChipWrights
DSP, as well as an example of using open-source projects FFMPEG and Live555 to stream live video
from a CW5631-based device.
Contents

INTRODUCTION
FEATURES
SOURCE CODE AND LIBRARY DEPENDENCIES
CW5631 SOC LIBRARIES
STARTUP, SETTINGS AND HOST INTERFACE
    STARTUP
    SETTINGS FUNCTIONS
    HOST INTERFACE
        Remote Procedure Calls
VIDEO PIPELINE
    VIDEO THREAD INITIALIZATION
    VIDEO FRAME PROCESSING
    V4L2
VIDEO ENCODING AND STREAMING
    SELECTING THE CODEC TO ENCODE THE STREAM
    VIDEO ENCODING
    VIDEO STREAMING
AUDIO FRAMEWORK
HOST APPLICATION
    To Access the RPC Server from Python
CONCLUSION
REFERENCES
Introduction
ChipWrights’ VideoKit is an image processing and encoding application provided with full source
code as part of the ChipWrights Linux® Application Development Kit. Although primarily designed
as a showcase for the CW5631 DSP's image processing functionality, it serves as a useful starting
point for developers wishing to implement IP camera solutions.
Features
• Video capture through Video4Linux2 from PAL/NTSC video or a CMOS image sensor
(depending on hardware availability).
• Runtime-configurable image processing pipeline with live display at resolutions up to 720p.
• Image filters include: de-interlacer; steerable linear and fisheye camera de-warping; blur; erode;
dilate; threshold; unsharp mask; negative image; brightness and contrast control.
• Real-time chroma-key; configurable background image and key color for green screen effects
(e.g. weather map).
• Live RTP streaming of processed video using either MPEG4 or H.264.
• Real-time control from a PC-based graphical user interface.
Source Code and Library Dependencies
The following libraries are required when building VideoKit:
• libavcodec (FFMPEG)
• live555
• libSDL
• libSDL_image
• libSDL_ttf
• libXMLRPC
• libcwdsp
CW5631 SoC Libraries
CW5631 versions of the libraries are built as part of the ChipWrights demo image; they are
available in the staging area when building using the OpenEmbedded tool chain.
Module          Function
main.c          Entry point. Thread startup. Main loop.
display.c       Display initialization using SDL. Display update and scaling using the DSP.
audio.c         Framework for audio capture. Not currently enabled.
alsa.c          Low-level audio interface using ALSA.
video.c         Main video pipeline.
video_filter.c  Video filter functions.
video_warp.c    Video de-warping functions.
v4l2.c          Low-level video capture interface using V4L2.
encoder.c       Interface to libavcodec (FFMPEG) for compressing video using the DSP.
stream.cpp      Interface to live555 (C++) and additional classes required for live streaming of MPEG4 and H.264.
rpcserver.c     XMLRPC server thread for remote control from a host PC.
settings.c      Provides a mechanism for storing configuration settings using key/value pairs.
Startup, Settings and Host Interface
Startup
The program begins by calling settings_open to read global configuration data from a save file
held at /etc/videokit/videokit.conf into a dictionary. Some saved settings can be
overridden from the command line (see cmdopts in settings.c). The configuration module
implements a database of key=value pairs and handles type conversion to string, integer or
floating point (double) types. The key names are unrestricted and are defined by the calling
modules as needed.
Settings Functions
Once the configuration dictionary is loaded, a particular item can be queried or updated by any part
of the program using the settings_get_ and settings_set_ functions. Configuration
changes are saved during program termination by a call to settings_close.
Host Interface
The host interface uses the XMLRPC [1] protocol to communicate with the client GUI. This protocol is
lightweight and easy to use, particularly from high-level languages such as Python. The protocol
operates over HTTP; the server, which is based on libxmlrpc, incorporates its own web
server based on Abyss [2].
Remote Procedure Calls
Remote Procedure Call (RPC) methods are added to the server at runtime using the macros
provided in rpcserver.h, after which the server thread is started by calling rpcserver_start.
Most modules use the configuration and RPC subsystems together to implement a local
configuration space that can be manipulated at runtime via RPC. By updating the settings
dictionary at the same time, state is retained in the save file between sessions.
Each subsystem uses a set of macros to simplify access to its configuration space (see examples in
video.c).
The configuration variables are held in a static structure (video_defaults) and most are write-only
via RPC. For these variables, the RPC_FUNC macro generates the function needed to set the
variable. Its usage is straightforward:
RPC_FUNC(<subsystem name>,<property name>,<property type>)
Where the resulting RPC call takes the form:
<subsystem name>.<property name>
The macro assumes that the corresponding variable in the configuration structure is called
<subsystem name>_<property name>, and that its type matches the <property type>
(defined in the libxmlrpc documentation).
Note: Generally, the type is either integer (i) or Boolean (b), which also takes an integer variable.
RPC methods that must do more than set a variable must be implemented directly using the
RPC_METHOD macros defined in rpcserver.h. See examples in video.c.
The INIT_VAR macro reads a variable's initial value from the save file and adds the
corresponding RPC method to the server.
These macros are invoked during module startup, e.g. in video_open, and must be executed before the
RPC server is started. The syntax is the same as for RPC_FUNC, except that the type argument
determines the settings_get_ call used, and must be unquoted and either int, string, or double.
RPC functions that were not generated using the RPC_FUNC macro are added to the server
directly by calling rpcserver_add_method.
Configuration state is pushed back to the settings dictionary during shutdown (video_close) by
calling the SAVE_VAR macro. This takes the same arguments as INIT_VAR.
Next, the video pipeline and streaming servers are started.
Video Pipeline
The video pipeline (video.c) runs continuously in its own thread and generally calls the DSP
directly using the API provided by ChipWrights in libcwdsp. The exception is video compression,
which uses libavcodec and is discussed later.
Figure 1: Video Pipeline Arrangement
Image and context data to be exchanged with the DSP must be held in physically contiguous pages
of RAM. Since this is generally not the case with buffers allocated on the Linux heap using malloc,
libcwdsp provides a pair of equivalent functions called cwdsp_malloc and cwdsp_free. The
syntax of these two functions is the same as their standard library counterparts, but the memory
returned is guaranteed to be in a single physical block. Anything to be passed to the DSP by
reference must be held within one of these areas.
Video Thread Initialization
During startup (in video_thread), buffers are allocated: a dspdata_t structure to
hold the various context structures needed by the DSP functions, plus two image buffers
that alternately function as input and output when chaining the DSP functions.
A third image buffer is allocated to hold the background image used by the chroma-key feature.
Video Frame Processing
Each cycle of the pipeline starts by requesting a frame from V4L2 using the wrappers provided in
v4l2.c. This call blocks until a new video frame is available, and the returned pointer is a
reference to the actual DMA buffer used by the hardware. Although not allocated using
cwdsp_malloc, these buffers are guaranteed to be physically contiguous so the images can be
used directly by the DSP without copying.
Each DSP operation in the pipeline requires an imageInfo structure for each image buffer
involved. In most cases there are two: one for input and another for output. The structures contain
information about the image including width, height, line stride and color space. Each structure also
contains a pointer to each plane in the image; in this case all processing is done in the same color
space as the captured image, YCC422I, so there is only one plane pointer. The same pair of imageInfo
structures is re-used each time the DSP is called, with the plane pointers updated to reflect the
current source and destination buffers. By processing the image between the input buffer and the
pair of buffers allocated during startup the pipeline can be executed without unnecessary copying.
Example of a typical DSP call:

    dspdata->outImg.width  = ctx->outwidth;
    dspdata->outImg.height = ctx->outheight;
    dspdata->outImg.stride = ctx->outwidth;
    dspdata->inImg.components[0]  = (void *)cwdsp_v2p(srcbuf);
    dspdata->outImg.components[0] = (void *)cwdsp_v2p(dstbuf);
    scaleImage((void *)cwdsp_v2p(&dspdata->outImg),
               (void *)cwdsp_v2p(&dspdata->inImg), pool, 0);
The rest of the imageInfo structures have been filled in previously.
In this example, the output imageInfo is updated to reflect the desired dimensions of the scaled
output image (the input image dimensions were previously configured). The component pointers
are filled in to refer to the start of their respective buffer, after which the actual DSP call is made,
passing pointers to the input and output imageInfo data.
Note the calls to cwdsp_v2p, which are required whenever a pointer is passed to the DSP either
directly as an argument to a function, or within a structure such as imageInfo. This function
converts the virtual addresses used by Linux into a physical address that can be used by the DSP.
It will return NULL if the pointer is not suitable for passing to the DSP (see above).
Each operation in the pipeline can be bypassed at runtime if desired by clearing the corresponding
enable flag in the configuration structure. This can be done using a remote procedure call. At the
end of the processing stage the image is compressed and sent to the streaming thread. It is then
copied to an off-screen display surface by calling display_update. Here the image undergoes a
final scale step and a color conversion to RGBA as required by the primary frame buffer. Access to
the display memory is handled by SDL so SDL_Flip is called to make the off-screen surface
visible.
Once all processing has been completed the original image buffer can be passed back to V4L2 for
re-use using v4l2_release_frame. The process then repeats for the next frame.
V4L2
Low-level video capture is handled by the functions provided in v4l2.c. Generally, there is
nothing unique about using Video4Linux (V4L2) [3] with the CW5631, so example code found on the
Internet should be applicable. However, to use the captured images with the DSP it is important to
use mmap mode.
In VideoKit, the capture device (usually /dev/video0) is opened by calling v4l2_open.
Some ioctl calls are made to set the capture hardware to the desired mode (width, height, video
standard where applicable). Next, pointers into a pool of image capture buffers are obtained (see
init_mmap). This begins by asking the driver for the desired number of buffers using the
VIDIOC_REQBUFS ioctl. The ioctl responds with the actual number of buffers that can be
accommodated by the driver, and that number of buffer descriptors is allocated. For each buffer
descriptor the start address of the DMA buffer must be determined; because this memory is physically
contiguous it can be used directly in DSP operations, which is a key
advantage over using the simpler "read" API to V4L2. The
VIDIOC_QUERYBUF ioctl is used to obtain the parameters of the buffer including its byte length
and an offset value that can be passed to the mmap system call. The actual base address is then
obtained using mmap. Finally, the buffer is readied for use by passing it to the VIDIOC_QBUF
ioctl.
Calls to v4l2_grab_frame obtain a new frame of video from the capture driver. In the preferred
mmap mode this is a case of calling the VIDIOC_DQBUF ioctl to obtain an index for the next
valid buffer. The corresponding base address can then be looked up. The ioctl blocks if no new
frame is available.
Once the application finishes with the buffer it must be returned to V4L2 for re-use. This is carried
out by calling v4l2_release_frame, which uses the VIDIOC_QBUF ioctl to return the buffer
to the driver.
Video Encoding and Streaming
The output of the video pipeline can be viewed in real time over the network from a PC running the
VideoKit GUI or a media player capable of showing an MPEG4 or H.264 RTP stream (such as VLC [4]).
The URL to open the stream is rtsp://<ip address>:8554/stream2/.
Selecting the codec to encode the stream
Select the codec from the command line using the -V option. The codec remains the same for the
duration of the session.
• -V 0 selects MPEG4
• -V 1 selects H.264
Note: You can set the stream’s bit rate in kbps using the -b option.
Video Encoding
Each frame processed by the video pipeline is passed to FFMPEG for encoding on the ChipWrights
DSP (encoder.c). The encoder is opened by calling encoder_open when the video pipeline first
starts. This opens the selected codec and sets encoding parameters such as image dimensions and
frame rate. During execution the frames are passed individually to encoder_encode_video,
which returns a buffer containing the encoded frame data. The buffer contents are then passed to
the streaming server (stream.cpp) through a call to stream_push_frame, which pushes the
data into a pipe before returning immediately.
Video Streaming
The streaming server itself is an RTSP/RTP server based on Live555 [5]. It can support multiple RTSP
sources, each consisting of multiple elementary RTP streams, although here only a single stream is
used. The server is started by calling stream_open, followed by stream_add and
stream_add_substream. The call to stream_add allows various metadata common to one
RTSP URL to be defined. The sub-stream contains information about the individual RTP streams
that will be generated, and it needs to be told the estimated bandwidth for the stream as well as
the codec that will be used. Both of these calls are made during startup from video_open.
Once all streams are defined, the server is started in its own thread by a call to stream_start. The
thread itself acts as a bridge to Live555, which in contrast to the rest of VideoKit is written in C++.
To keep the rest of the streaming module as pure C as possible all classes are instantiated from
within the thread function. After instantiation the thread simply executes the Live555 event loop to
run the server.
The streaming server contains a number of sub-classes derived from Live555. These are interface
classes used to supply the live video stream (most of the streaming examples supplied with
Live555 are for streaming from files).
For codec types that are not fully supported by Live555, a framing class is provided.
For example, CWH264VideoStreamFramer splits an incoming H.264 stream into individual
Network Abstraction Layer (NAL) units as required by the underlying RTP protocol.
A similar framer exists for MPEG4 (MPEG4VideoStreamDiscreteFramer), but this is part of the
standard Live555 distribution.
For all codecs, the sub-streams are encapsulated in a GenericVideoLiveMediaSubsession
class, which selects the appropriate framer and RTP payloader during instantiation as well as
creating a StreamSource object, which is the source of the raw data and is also defined in
stream.cpp. This class takes advantage of the fact that Live555 can be throttled by blocking on
an operating system file handle (Background Read Handling). In this case the file handle is one end
of a pipe created using the pipe system call. The other end of the pipe is written to by
stream_push_frame. Whenever the pipe is readable StreamSource::deliverFrame() is
invoked and the next block of compressed video is read from the pipe and passed into Live555.
From there it is streamed to one or more clients.
Audio Framework
A framework for audio processing is included in alsa.c and audio.c, although this is simply a
pass-through in the current implementation and is disabled at compile-time.
Host Application
An example host application is included, written in Python using the GTK+ toolkit [6]. Python is
capable of making XMLRPC calls directly [7], so VideoKit's operation can also be influenced at runtime
from the Python command line or from simple scripts.
To Access the RPC server from Python
>>> from xmlrpclib import ServerProxy
>>> s = ServerProxy("http://<ip address of target>:8080/RPC2")
>>> s.system.listMethods()

If the connection is successful then a list of the supported RPC methods is returned. Any of these
methods can be accessed through the s object in the example above. For instance, to return the
current capture resolution:

>>> s.capture.xres()
720
>>> s.capture.yres()
576

To write to a variable, the desired value is passed as an argument:

>>> s.pipeline.enable_deinterlace(True)
0

The host application incorporates an embedded media player for viewing the live stream. On Linux
this is implemented using Python GStreamer [8], or on Windows using the Python bindings for
VLC. In both cases the video is streamed by accessing the RTSP URL described above, and the
same stream can also be viewed in a compatible media player application of the user's choice.
Conclusion
The VideoKit source code gives usage examples for many of the video processing functions
available on the ChipWrights DSP, as well as an example of using open-source projects FFMPEG
and Live555 to stream live video from a CW5631-based device. The software therefore serves as
an ideal basis for customers wishing to develop advanced camera or video server applications.
References
[1] libXMLRPC, http://xmlrpc-c.sourceforge.net/doc/libxmlrpc.html
[2] Abyss Web Server, http://abyss.sourceforge.net/
[3] Video4Linux2, http://v4l2spec.bytesex.org/spec/
[4] VLC, http://www.videolan.org/vlc/
[5] Live555, http://www.live555.com/liveMedia/
[6] Python GTK+, http://www.pygtk.org/
[7] Python xmlrpclib, http://docs.python.org/library/xmlrpclib.html
[8] GStreamer, http://www.gstreamer.net/
Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries. Micron, the M logo, and the Micron logo are trademarks of Micron Technology, Inc. Windows is a registered trademark of Microsoft Corporation in the United States and other countries. All other trademarks and trade names are the property of their respective companies.