www.chipwrights.com
VideoKit
Application Note
July 2010 DOC-30008-1.0
This application note describes the key features of ChipWrights’ VideoKit source code. This note
provides usage examples for many of the video processing functions available on the ChipWrights
DSP, as well as an example of using open-source projects FFMPEG and Live555 to stream live video
from a CW5631-based device.
Contents

INTRODUCTION
FEATURES
SOURCE CODE AND LIBRARY DEPENDENCIES
CW5631 SOC LIBRARIES
STARTUP, SETTINGS AND HOST INTERFACE
    STARTUP
    SETTINGS FUNCTIONS
    HOST INTERFACE
        Remote Procedure Calls
VIDEO PIPELINE
    VIDEO THREAD INITIALIZATION
    VIDEO FRAME PROCESSING
    V4L2
VIDEO ENCODING AND STREAMING
    SELECTING THE CODEC TO ENCODE THE STREAM
    VIDEO ENCODING
    VIDEO STREAMING
AUDIO FRAMEWORK
HOST APPLICATION
    To Access the RPC Server from Python
CONCLUSION
REFERENCES
Introduction
ChipWrights’ VideoKit is an image processing and encoding application provided with full source
code as part of the ChipWrights Linux® Application Development Kit. Although primarily designed
as a showcase for the CW5631 DSP's image processing functionality, it serves as a useful starting
point for developers wishing to implement IP camera solutions.
Features
• Video capture through Video4Linux2 from PAL/NTSC video or a CMOS image sensor
(depending on hardware availability).
• Runtime-configurable image processing pipeline with live display at resolutions up to 720p.
• Image filters include: de-interlacer; steerable linear and fisheye camera de-warping; blur; erode;
dilate; threshold; unsharp mask; negative image; brightness and contrast control.
• Real-time chroma-key; configurable background image and key color for green screen effects
(e.g. weather map).
• Live RTP streaming of processed video using either MPEG4 or H.264.
• Real-time control from a PC-based graphical user interface.
Source Code and Library Dependencies
The following libraries are required when building VideoKit:
• libavcodec (FFMPEG)
• live555
• libSDL
• libSDL_image
• libSDL_ttf
• libXMLRPC
• libcwdsp
CW5631 SoC Libraries
CW5631 versions of the libraries are built as part of the ChipWrights demo image; they are
available in the staging area when building using the OpenEmbedded tool chain.
Module          Function
main.c          Entry point. Thread startup. Main loop.
display.c       Display initialization using SDL. Display update and scaling using the DSP.
audio.c         Framework for audio capture. Not currently enabled.
alsa.c          Low-level audio interface using ALSA.
video.c         Main video pipeline.
video_filter.c  Video filter functions.
video_warp.c    Video de-warping functions.
v4l2.c          Low-level video capture interface using V4L2.
encoder.c       Interface to libavcodec (FFMPEG) for compressing video using the DSP.
stream.cpp      Interface to live555 (C++) and additional classes required for live streaming of MPEG4 and H.264.
rpcserver.c     XMLRPC server thread for remote control from a host PC.
settings.c      Provides a mechanism for storing configuration settings using key/value pairs.
Startup, Settings and Host Interface
Startup
The program begins by calling settings_open to read global configuration data from a save file
held at /etc/videokit/videokit.conf into a dictionary. Some saved settings can be
overridden from the command line (see cmdopts in settings.c). The configuration module
implements a database of key=value pairs and handles type conversion to string, integer or
floating point (double) types. The key names are unrestricted and are defined by the calling
modules as needed.
Settings Functions
Once the configuration dictionary is loaded, a particular item can be queried or updated by any part
of the program using the settings_get_ and settings_set_ functions. Configuration
changes are saved during program termination by a call to settings_close.
Host Interface
The host interface uses the XMLRPC [1] protocol to communicate with the client GUI. This protocol is
lightweight and easy to use, particularly from high-level languages such as Python. The protocol
operates over HTTP; the server, which is based on libxmlrpc, incorporates its own web
server based on Abyss [2].
Remote Procedure Calls
Remote Procedure Call (RPC) methods are added to the server at runtime using the macros
provided in rpcserver.h, after which the server thread is started by calling rpcserver_start.
Most modules use the configuration and RPC subsystems together to implement a local
configuration space that can be manipulated at runtime via RPC. By updating the settings
dictionary at the same time, state is retained in the save file between sessions.
Each subsystem uses a set of macros to simplify access to its configuration space (see examples in
video.c).
The configuration variables are held in a static structure (video_defaults) and most are write-only
via RPC. For these variables, the RPC_FUNC macro generates the function needed to set the
variable. Its usage is straightforward:
RPC_FUNC(<subsystem name>,<property name>,<property type>)
Where the resulting RPC call takes the form:
<subsystem name>.<property name>
The macro assumes that the corresponding variable in the configuration structure is called
<subsystem name>_<property name>, and that its type matches the <property type>
(defined in the libxmlrpc documentation).
Note: Generally, the type is either integer (i) or Boolean (b), which also takes an integer variable.
RPC methods that must do more than set a variable must be implemented directly using the
RPC_METHOD macros defined in rpcserver.h. See examples in video.c.
The INIT_VAR macro reads a variable's initial value from the save file and adds the
corresponding RPC method to the server.
These macros are invoked during module startup, e.g. in video_open, and must be executed before the
RPC server is started. The syntax is the same as for RPC_FUNC, except that the type argument
determines the settings_get_ call used, and must be unquoted and either int, string, or double.
RPC functions that were not generated using the RPC_FUNC macro are added to the server
directly by calling rpcserver_add_method.
Configuration state is pushed back to the settings dictionary during shutdown (video_close) by
calling the SAVE_VAR macro. This takes the same arguments as INIT_VAR.
Next, the video pipeline and streaming servers are started.
Video Pipeline
The video pipeline (video.c) runs continuously in its own thread and generally calls the DSP
directly using the API provided by ChipWrights in libcwdsp. The exception is video compression,
which uses libavcodec and is discussed later.
Figure 1: Video Pipeline Arrangement
Image and context data to be exchanged with the DSP must be held in physically contiguous pages
of RAM. Since this is generally not the case with buffers allocated on the Linux heap using malloc,
libcwdsp provides a pair of equivalent functions called cwdsp_malloc and cwdsp_free. The
syntax of these two functions is the same as their standard library counterparts, but the memory
returned is guaranteed to be in a single physical block. Anything to be passed to the DSP by
reference must be held within one of these areas.
Video Thread Initialization
During startup (in video_thread), buffers are allocated: a dspdata_t structure to
hold the various context structures needed by the DSP functions, plus two image buffers
that alternately function as input and output when chaining the DSP functions.
A third image buffer is allocated to hold the background image used by the chroma-key feature.
Video Frame Processing
Each cycle of the pipeline starts by requesting a frame from V4L2 using the wrappers provided in
v4l2.c. This call blocks until a new video frame is available, and the returned pointer is a
reference to the actual DMA buffer used by the hardware. Although not allocated using
cwdsp_malloc, these buffers are guaranteed to be physically contiguous so the images can be
used directly by the DSP without copying.
Each DSP operation in the pipeline requires an imageInfo structure for each image buffer
involved. In most cases there are two: one for input and another for output. The structures contain
information about the image including width, height, line stride and color space. Each structure also
contains a pointer to each plane in the image; in this case all processing is done in the same color
space as the captured image, YCC422I, so there is only one plane pointer. The same pair of imageInfo
structures is re-used each time the DSP is called, with the plane pointers updated to reflect the
current source and destination buffers. By processing the image between the input buffer and the
pair of buffers allocated during startup the pipeline can be executed without unnecessary copying.
Example of a typical DSP call:

    dspdata->outImg.width  = ctx->outwidth;
    dspdata->outImg.height = ctx->outheight;
    dspdata->outImg.stride = ctx->outwidth;
    dspdata->inImg.components[0]  = (void *)cwdsp_v2p(srcbuf);
    dspdata->outImg.components[0] = (void *)cwdsp_v2p(dstbuf);
    scaleImage((void *)cwdsp_v2p(&dspdata->outImg),
               (void *)cwdsp_v2p(&dspdata->inImg), pool, 0);
The rest of the imageInfo structures have been filled in previously.
In this example, the output imageInfo is updated to reflect the desired dimensions of the scaled
output image (the input image dimensions were previously configured). The component pointers
are filled in to refer to the start of their respective buffer, after which the actual DSP call is made,
passing pointers to the input and output imageInfo data.
Note the calls to cwdsp_v2p, which are required whenever a pointer is passed to the DSP either
directly as an argument to a function, or within a structure such as imageInfo. This function
converts the virtual addresses used by Linux into a physical address that can be used by the DSP.
It will return NULL if the pointer is not suitable for passing to the DSP (see above).
Each operation in the pipeline can be bypassed at runtime if desired by clearing the corresponding
enable flag in the configuration structure. This can be done using a remote procedure call. At the
end of the processing stage the image is compressed and sent to the streaming thread. It is then
copied to an off-screen display surface by calling display_update. Here the image undergoes a
final scale step and a color conversion to RGBA as required by the primary frame buffer. Access to
the display memory is handled by SDL so SDL_Flip is called to make the off-screen surface
visible.
Once all processing has been completed the original image buffer can be passed back to V4L2 for
re-use using v4l2_release_frame. The process then repeats for the next frame.
V4L2
Low-level video capture is handled by the functions provided in v4l2.c. Generally, there is
nothing unique about using Video4Linux (V4L2) [3] with the CW5631, so example code found on the
Internet should be applicable. However, to use the captured images with the DSP it is important to
use mmap mode.
In VideoKit, the capture device (usually /dev/video0) is opened by calling v4l2_open.
Some ioctl calls are made to set the capture hardware to the desired mode (width, height, video
standard where applicable). Next, pointers into a pool of image capture buffers are obtained (see
init_mmap). This begins by asking the driver for the desired number of buffers using the
VIDIOC_REQBUFS ioctl. The ioctl responds with the actual number of buffers that can be
accommodated by the driver, and that number of buffer descriptors is allocated. For each buffer
descriptor the start address of the DMA buffer must be determined; because this memory is physically
contiguous it can be used directly in DSP operations, which is a key
advantage over using the simpler "read" API to V4L2. The
VIDIOC_QUERYBUF ioctl is used to obtain the parameters of the buffer including its byte length
and an offset value that can be passed to the mmap system call. The actual base address is then
obtained using mmap. Finally, the buffer is readied for use by passing it to the VIDIOC_QBUF
ioctl.
Calls to v4l2_grab_frame obtain a new frame of video from the capture driver. In the preferred
mmap mode this is a case of calling the VIDIOC_DQBUF ioctl to obtain an index for the next
valid buffer. The corresponding base address can then be looked up. The ioctl blocks if no new
frame is available.
Once the application finishes with the buffer it must be returned to V4L2 for re-use. This is carried
out by calling v4l2_release_frame, which uses the VIDIOC_QBUF ioctl to return the buffer
to the driver.
Video Encoding and Streaming
The output of the video pipeline can be viewed in real time over the network from a PC running the
VideoKit GUI or a media player capable of showing an MPEG4 or H.264 RTP stream (such as VLC [4]).
The URL to open the stream is rtsp://<ip address>:8554/stream2/.
Selecting the codec to encode the stream
Select the codec from the command line using the -V option. The codec remains the same for the
duration of the session.
• -V 0 selects MPEG4
• -V 1 selects H.264
Note: You can set the stream’s bit rate in kbps using the -b option.
Video Encoding
Each frame processed by the video pipeline is passed to FFMPEG for encoding on the ChipWrights
DSP (encoder.c). The encoder is opened by calling encoder_open when the video pipeline first
starts. This opens the selected codec and sets encoding parameters such as image dimensions and
frame rate. During execution the frames are passed individually to encoder_encode_video,
which returns a buffer containing the encoded frame data. The buffer contents are then passed to
the streaming server (stream.cpp) through a call to stream_push_frame, which pushes the
data into a pipe before returning immediately.
Video Streaming
The streaming server itself is an RTSP/RTP server based on Live555 [5]. It can support multiple RTSP
sources, each consisting of multiple elementary RTP streams, although here only a single stream is
used. The server is started by calling stream_open, followed by stream_add and
stream_add_substream. The call to stream_add allows various metadata common to one
RTSP URL to be defined. The sub-stream contains information about the individual RTP streams
that will be generated, and it needs to be told the estimated bandwidth for the stream as well as
the codec that will be used. Both of these calls are made during startup from video_open.
Once all streams are defined, the server is started in its own thread by a call to stream_start. The
thread itself acts as a bridge to Live555, which in contrast to the rest of VideoKit is written in C++.
To keep the rest of the streaming module as pure C as possible all classes are instantiated from
within the thread function. After instantiation the thread simply executes the Live555 event loop to
run the server.
The streaming server contains a number of sub-classes derived from Live555. These are interface
classes used to supply the live video stream (most of the streaming examples supplied with
Live555 are for streaming from files).
For codec types that are not fully supported by Live555, a framing class is provided.
For example, CWH264VideoStreamFramer splits an incoming H.264 stream into individual
Network Abstraction Layer (NAL) units as required by the underlying RTP protocol.
A similar framer exists for MPEG4 (MPEG4VideoStreamDiscreteFramer), but this is part of the
standard Live555 distribution.
For all codecs, the sub-streams are encapsulated in a GenericVideoLiveMediaSubsession
class, which selects the appropriate framer and RTP payloader during instantiation as well as
creating a StreamSource object, which is the source of the raw data and is also defined in
stream.cpp. This class takes advantage of the fact that Live555 can be throttled by blocking on
an operating system file handle (Background Read Handling). In this case the file handle is one end
of a pipe created using the pipe system call. The other end of the pipe is written to by
stream_push_frame. Whenever the pipe is readable StreamSource::deliverFrame() is
invoked and the next block of compressed video is read from the pipe and passed into Live555.
From there it is streamed to one or more clients.
Audio Framework
A framework for audio processing is included in alsa.c and audio.c, although this is simply a
pass-through in the current implementation and is disabled at compile-time.
Host Application
An example host application is included, written in Python using the GTK+ toolkit [6]. Python is
capable of making XMLRPC calls directly [7], so VideoKit's operation can also be influenced at runtime
from the Python command line or from simple scripts.
To Access the RPC server from Python
>>> from xmlrpclib import ServerProxy
>>> s = ServerProxy("http://<ip address of target>:8080/RPC2")
>>> s.system.listMethods()

If the connection is successful then a list of the supported RPC methods is returned. Any of these
methods can be accessed through the s object in the example above. For instance, to return the
current capture resolution:

>>> s.capture.xres()
720
>>> s.capture.yres()
576

To write to a variable, the desired value is passed as an argument:

>>> s.pipeline.enable_deinterlace(True)
0

The host application incorporates an embedded media player for viewing the live stream. On Linux
this is implemented using Python GStreamer [8], or on Windows using the Python bindings for
VLC. In both cases the video is streamed by accessing the RTSP URL described above, and the
same stream can also be viewed in a compatible media player application of the user's choice.
Conclusion
The VideoKit source code gives usage examples for many of the video processing functions
available on the ChipWrights DSP, as well as an example of using open-source projects FFMPEG
and Live555 to stream live video from a CW5631-based device. The software therefore serves as
an ideal basis for customers wishing to develop advanced camera or video server applications.
References
[1] libXMLRPC, http://xmlrpc-c.sourceforge.net/doc/libxmlrpc.html
[2] Abyss Web Server, http://abyss.sourceforge.net/
[3] Video4Linux2, http://v4l2spec.bytesex.org/spec/
[4] VLC, http://www.videolan.org/vlc/
[5] Live555, http://www.live555.com/liveMedia/
[6] Python GTK+, http://www.pygtk.org/
[7] Python xmlrpclib, http://docs.python.org/library/xmlrpclib.html
[8] GStreamer, http://www.gstreamer.net/
Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries. Micron, the M logo, and the Micron logo are trademarks of Micron Technology, Inc. Windows is a registered trademark of Microsoft Corporation in the United States and other countries. All other trademarks and trade names are the property of their respective companies.