
Jayaram college of Engineering and Technology, Pagalavdi Department of CSE

CS1354 - Graphics and multimedia Unit I

Introduction to Aliasing

Aliasing is a potential problem whenever an analog signal is point sampled to convert it into a digital signal. It can occur in audio sampling, for example, in converting music to digital form to be stored on a CD-ROM or other digital device. Aliasing happens whenever an analog signal is not sampled at a high enough frequency. In audio, aliasing manifests itself in the form of spurious low frequencies. An example is shown below of two sine waves.

[Figure: two sine waves point sampled at the same rate]

In the top sine wave, the sampling is fast enough that a reconstructed signal (the small circles) would have the same frequency as the original sine wave. In the bottom wave, with a higher frequency but the same point sampling rate, a reconstructed signal (the small circles) would appear to be a sine wave of a lower frequency, i.e., an aliased signal.

From sampling theory it turns out that to accurately reconstruct a signal, the signal must be sampled at a rate greater than or equal to two times the highest frequency contained in the signal. This is called the Nyquist theorem, and the highest frequency that can be accurately represented with a given sampling rate is called the Nyquist limit. For example, to produce a music CD-ROM the analog signal is sampled at a rate of 44.1 kHz, therefore the highest representable audio frequency is 22.05 kHz. Any audio frequencies greater than 22.05 kHz must be removed from the input signal or they will be aliased, i.e., appear as low frequency sounds.

Aliasing occurs in computer graphics since we are point sampling an analog signal. The mathematical model of an image is a continuous analog signal which is sampled at discrete points (the pixel positions). When the sampling rate is less than the Nyquist limit, there are aliasing artifacts, called "jaggies" in computer graphics. In general, aliasing is when high frequencies appear as low frequencies, producing regular patterns that are easy to see.
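
The effect can be sketched numerically. In this minimal illustration (the sampling rate and tone frequencies are chosen for convenience, not taken from the text), a tone above the Nyquist limit produces exactly the same point samples as its low-frequency alias:

```python
import math

fs = 100.0                     # sampling rate, Hz (Nyquist limit is 50 Hz)
f_low, f_high = 10.0, 110.0    # f_high is far above the Nyquist limit

def sample(freq, n):
    """Point-sample sin(2*pi*freq*t) at the n-th sample time t = n/fs."""
    return math.sin(2.0 * math.pi * freq * n / fs)

# The frequency that f_high folds down to after sampling:
alias = abs(f_high - round(f_high / fs) * fs)
print(alias)  # 10.0 -- the 110 Hz tone masquerades as 10 Hz

# The two tones are indistinguishable from their samples alone.
for n in range(100):
    assert abs(sample(f_low, n) - sample(f_high, n)) < 1e-9
```

Removing frequencies above fs/2 before sampling is what prevents this, exactly as the audio example above describes.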

Super-Sampling

In supersampling we take more than one sample per pixel. These samples can be at regularly spaced intervals. For example, we might compute an image at 2K by 2K or 4K by 4K points and display it at 1K by 1K pixel resolution. Sampling at 2K x 2K for a 1K x 1K image increases the number of samples and graphics computations (and Z-buffer requirements for scan-line graphics) by a factor of 4. An alternative method is to sample at the corners and center of each pixel. This only increases the number of computations by a factor of two but requires increased overhead for bookkeeping (since the samples from the previous row must be stored). Supersampling has the effect of moving the Nyquist limit to higher frequencies.
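
A minimal sketch of regular supersampling with an unweighted (box) average, assuming the samples are stored as a list of rows; the tiny image below is made up for illustration:

```python
def downsample_2x(img):
    """Box-filter downsample: average each 2x2 block of samples
    into one output pixel, halving the resolution."""
    n = len(img) // 2
    return [[(img[2*r][2*c] + img[2*r][2*c + 1] +
              img[2*r + 1][2*c] + img[2*r + 1][2*c + 1]) / 4.0
             for c in range(n)]
            for r in range(n)]

# A hard vertical edge sampled at 4x4, displayed at 2x2: the
# averaged pixel straddling the edge becomes gray (0.5) rather
# than snapping to pure black or white.
samples = [[0, 0, 0, 1],
           [0, 0, 0, 1],
           [0, 0, 0, 1],
           [0, 0, 0, 1]]
print(downsample_2x(samples))  # [[0.0, 0.5], [0.0, 0.5]]
```

The factor-of-4 cost in the text is visible directly: 16 samples were computed to produce 4 pixels.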


Rather than combining pixels with an unweighted average (box filter), we might use a weighted filter. We might even average samples over several pixels, i.e., combine over a wider range with a weighted average. This is an example of digital filtering. Even if we move the Nyquist limit to higher frequencies, we will still have aliasing. Adaptive supersampling takes additional samples in regions of high frequency, e.g., near edges. We continue supersampling as long as |Ic - Ii| > some threshold variance. It does better but requires a large number of rays/samples and is a more complicated algorithm.
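
Adaptive supersampling can be sketched in one dimension: keep subdividing an interval while neighbouring samples differ by more than a threshold (a stand-in for the |Ic - Ii| test above; the test function and threshold are illustrative assumptions):

```python
def adaptive_sample(f, a, b, threshold=0.1, depth=0, max_depth=8):
    """Estimate the average of f over [a, b], subdividing only where
    neighbouring samples differ by more than `threshold`."""
    mid = (a + b) / 2.0
    fa, fm, fb = f(a), f(mid), f(b)
    flat = abs(fa - fm) <= threshold and abs(fb - fm) <= threshold
    if depth >= max_depth or flat:
        return (fa + 2.0 * fm + fb) / 4.0    # weighted average of samples
    return (adaptive_sample(f, a, mid, threshold, depth + 1, max_depth) +
            adaptive_sample(f, mid, b, threshold, depth + 1, max_depth)) / 2.0

# A hard edge at x = 0.3: coverage of the bright side is 0.7.
# Smooth regions stop subdividing immediately; only intervals
# straddling the edge are refined further.
edge = lambda x: 0.0 if x < 0.3 else 1.0
estimate = adaptive_sample(edge, 0.0, 1.0)
print(round(estimate, 2))  # 0.7
```

Extra samples are spent only near the discontinuity, which is the whole point of the adaptive scheme.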

2 Dimensional Viewing Transformation

Introduction

When we define an image in some world coordinate system, to display that image we must somehow map the image to the physical output device. This is a two-stage process. For 3-dimensional images we must first project down to 2 dimensions, since our display device is 2-dimensional. Next, we must map the 2D representation to the physical device. This section concerns the second part of this process, i.e., the mapping from 2D world coordinates (WDC) to 2D physical device coordinates (PDC). We will first discuss the concept of a window on the world (in WDC), then a viewport (in NDC), and finally the mapping WDC to NDC to PDC.

Window

When we model an image in world coordinates (WDC) we are not interested in the entire world but only a portion of it. Therefore we define the portion of interest, a polygonal area specified in world coordinates, called the "window".

We can use the window to change the apparent size and/or location of objects in the image. Changing the window affects all of the objects in the image. These effects are called "Zooming" and "Panning".

Panning

Moving all objects in the scene by changing the window is called "panning".

2D Viewing Transformation: Viewport

The user may want to create images on different parts of the screen so we define a viewport in Normalized Device Coordinates (NDC). Using NDC also allows for output device independence. Later we will map from NDC to Physical Device Coordinates (PDC).
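
The window-to-viewport step is a pair of linear maps, one per axis: translate to the window origin, scale by the ratio of viewport extent to window extent, translate to the viewport origin. A minimal sketch (the coordinate values are illustrative):

```python
def window_to_viewport(x, y, win, view):
    """Map point (x, y) from a window (wxmin, wymin, wxmax, wymax)
    in world coordinates to a viewport (vxmin, vymin, vxmax, vymax),
    e.g. in normalized device coordinates."""
    wxmin, wymin, wxmax, wymax = win
    vxmin, vymin, vxmax, vymax = view
    sx = (vxmax - vxmin) / (wxmax - wxmin)   # x scale factor
    sy = (vymax - vymin) / (wymax - wymin)   # y scale factor
    return (vxmin + (x - wxmin) * sx, vymin + (y - wymin) * sy)

# The window center lands at the viewport center:
print(window_to_viewport(50, 50, (0, 0, 100, 100), (0.0, 0.0, 0.5, 0.5)))
# (0.25, 0.25)
```

The same function applied a second time, with PDC bounds as the target, completes the NDC-to-PDC step.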

Clipping refers to the removal of part of a scene. Internal clipping removes parts of a picture outside a given region; external clipping removes parts inside a region. We'll explore internal clipping, but external clipping can almost always be accomplished as a by-product.

There is also the question of which primitive types we can clip. We will consider line clipping and polygon clipping. A line clipping algorithm takes as input the two endpoints of a line segment and returns one (or more) line segments. A polygon clipper takes as input the vertices of a polygon and returns one (or more) polygons. There are several clipping algorithms. We'll study the Cohen-Sutherland line clipping algorithm to learn some basic concepts, then develop the more efficient Liang-Barsky algorithm and use its insights to culminate with Blinn's line clipping algorithm. The Sutherland-Hodgman polygon clipping algorithm will then be covered, and the Weiler-Atherton algorithm, time permitting.

There are other issues in clipping that we will not have time to cover. Some of these are:

Text character clipping
Scissoring -- clips the primitive during scan conversion to pixels
Bit (pixel) block transfers (bitblts/pixblts)

o Copy a 2D array of pixels from a large canvas to a destination window
o Useful for text characters, pulldown menus, etc.


Cohen-Sutherland Line Clipping

The Cohen-Sutherland algorithm clips a line to an upright rectangular window. It is an application of triage: make the simple cases fast. The algorithm extends the window boundaries to define 9 regions:

top-left, top-center, top-right, center-left, center, center-right, bottom-left, bottom-center, and bottom-right.

See figure 1 below. These 9 regions can be uniquely identified using a 4 bit code, often called an outcode. We'll use the order: left, right, bottom, top (LRBT) for these four bits. In particular, for each point

Left (first) bit is set to 1 when p lies to the left of the window
Right (second) bit is set to 1 when p lies to the right of the window
Bottom (third) bit is set to 1 when p lies below the window
Top (fourth) bit is set to 1 when p lies above the window

The LRBT (Left, Right, Bottom, Top) order is somewhat arbitrary, but once an order is chosen we must stick with it. Note that points on the clipping window edge are considered inside (the bits are left at 0).

Given a line segment with end points p0 and p1, here's the basic flow of the Cohen-Sutherland algorithm:

1. Compute 4-bit outcodes LRBT0 and LRBT1 for each end-point

2. If both outcodes are 0000, the trivially visible case, pass the end-points to the draw routine. This occurs when the bitwise OR of the outcodes yields 0000.

3. If both outcodes have 1's in the same bit position, the trivially invisible case, clip the entire line (pass nothing to the draw routine). This occurs when the bitwise AND of outcodes is not 0000.

4. Otherwise, the indeterminate case: the line may be partially visible or entirely invisible. Analytically compute the intersection of the line with the appropriate window edges.

Let's explore the indeterminate case more closely. First, at least one of the two end-points must be outside the window; assume it is p1.

1. Read p1's 4-bit code in order, say left-to-right.

2. When a set bit (1) is found, compute the intersection point I of the corresponding window edge with the line from p0 to p1.

Now this may not complete the clipping of the line, so we replace p1 by the intersection point I and repeat the Cohen-Sutherland algorithm. (Clearly we can save some state to avoid repeating some computations.)
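
The steps above can be sketched as a complete routine (the clip window and test line in the example are made up for illustration):

```python
# Outcode bits in LRBT order: left, right, bottom, top.
LEFT, RIGHT, BOTTOM, TOP = 1, 2, 4, 8

def outcode(x, y, xmin, ymin, xmax, ymax):
    """4-bit region code for point (x, y); 0 means inside the window."""
    code = 0
    if x < xmin:
        code |= LEFT
    elif x > xmax:
        code |= RIGHT
    if y < ymin:
        code |= BOTTOM
    elif y > ymax:
        code |= TOP
    return code

def cohen_sutherland(x0, y0, x1, y1, xmin, ymin, xmax, ymax):
    """Clip segment (x0,y0)-(x1,y1) to the window; None if invisible."""
    c0 = outcode(x0, y0, xmin, ymin, xmax, ymax)
    c1 = outcode(x1, y1, xmin, ymin, xmax, ymax)
    while True:
        if not (c0 | c1):            # trivially visible: OR is 0000
            return (x0, y0, x1, y1)
        if c0 & c1:                  # trivially invisible: AND is not 0000
            return None
        c = c0 or c1                 # pick an end-point outside the window
        if c & LEFT:
            y = y0 + (y1 - y0) * (xmin - x0) / (x1 - x0); x = xmin
        elif c & RIGHT:
            y = y0 + (y1 - y0) * (xmax - x0) / (x1 - x0); x = xmax
        elif c & BOTTOM:
            x = x0 + (x1 - x0) * (ymin - y0) / (y1 - y0); y = ymin
        else:                        # TOP
            x = x0 + (x1 - x0) * (ymax - y0) / (y1 - y0); y = ymax
        if c == c0:                  # replace the outside end-point, retry
            x0, y0, c0 = x, y, outcode(x, y, xmin, ymin, xmax, ymax)
        else:
            x1, y1, c1 = x, y, outcode(x, y, xmin, ymin, xmax, ymax)

print(cohen_sutherland(-5, 5, 15, 5, 0, 0, 10, 10))  # (0, 5.0, 10, 5.0)
```

Each loop iteration clips against one edge and recomputes the outcode, so the trivial accept/reject tests get another chance, exactly as described in the text.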

The Cohen-Sutherland algorithm was one of the first, if not the first, clipping algorithms to be implemented in hardware. There, the intersection was not computed analytically, as we have done, but by bisection (binary search) of the line segment.


Cohen-Sutherland in 3D

The Cohen-Sutherland algorithm extends easily to 3D. Extend the 3D clipping window boundaries to define 27 regions. Assign a 6-bit code to each region; that is, for each point

Left (first) bit set (1) when the point lies to the left of the window
Right (second) bit set (1) when the point lies to the right of the window
Bottom (third) bit set (1) when the point lies below the window
Top (fourth) bit set (1) when the point lies above the window
Near (fifth) bit set (1) when the point lies in front of the window (near)
Far (sixth) bit set (1) when the point lies behind the window (far)

Liang-Barsky Line Clipping

The Liang-Barsky algorithm is optimized for clipping to an upright rectangular clip window (the Cyrus-Beck algorithm is similar but clips to a more general convex polygon). Liang-Barsky uses parametric equations, clip window edge normals, and inner products to improve the efficiency of line clipping over Cohen-Sutherland. Let

L(t) = p0 + t (p1 - p0),  0 <= t <= 1,

denote the parametric equation of the line segment from p0 to p1. Let Ne denote the outward-pointing normal of the clip window edge e, and let pe be an arbitrary point on edge e.

Using the 4 edge normals for an upright rectangular clip window and 4 points, one on each edge, we can calculate the 4 parameter values where L(t) intersects each edge by solving Ne . (L(t) - pe) = 0 for t. Let's call these parameter values tL, tR, tB, tT.

Note that any of the t's outside the interval [0, 1] can be discarded, since they correspond to points before p0 (when t < 0) or after p1 (when t > 1). The remaining t values are characterized as ``potentially entering'' (PE) or ``potentially leaving'' (PL).

The parameter ti is PE if when traveling along the (extended) line from p0 to p1 we move from the outside to the inside of the window with respect to the edge i.

The parameter ti is PL if when traveling along the (extended) line from p0 to p1 we move from the inside to the outside of the window with respect to the edge i.

The inner product of the outward-pointing edge normal Ni with p1 - p0 can be used to classify the parameter ti as either PE or PL.

1. If Ni . (p1 - p0) < 0, the parameter ti is potentially entering (PE). The vectors Ni and p1 - p0 point in opposite directions; since Ni points outward, the vector p1 - p0 from p0 to p1 points inward with respect to edge i.


2. If Ni . (p1 - p0) > 0, the parameter ti is potentially leaving (PL). The vectors Ni and p1 - p0 point in similar directions; since Ni points outward, the vector p1 - p0 from p0 to p1 points outward too.

3. Let tpe be the largest PE parameter value (but at least 0) and tpl the smallest PL parameter value (but at most 1).

4. The clipped line extends from L(tpe) to L(tpl). If tpe > tpl, the line lies entirely outside the window and nothing is drawn.
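
Steps 1-4 can be sketched compactly. Each (p, q) pair below encodes one window edge: p is the component of p1 - p0 against that edge's outward normal, and q is the signed distance from p0 to the edge (the test window and line are illustrative):

```python
def liang_barsky(x0, y0, x1, y1, xmin, ymin, xmax, ymax):
    """Clip segment to an upright rectangle; None if nothing is visible."""
    dx, dy = x1 - x0, y1 - y0
    t_pe, t_pl = 0.0, 1.0    # running max of PE values, min of PL values
    for p, q in ((-dx, x0 - xmin),    # left edge
                 ( dx, xmax - x0),    # right edge
                 (-dy, y0 - ymin),    # bottom edge
                 ( dy, ymax - y0)):   # top edge
        if p == 0:
            if q < 0:
                return None           # parallel to this edge and outside it
        else:
            t = q / p                 # where L(t) crosses this edge
            if p < 0:
                t_pe = max(t_pe, t)   # potentially entering
            else:
                t_pl = min(t_pl, t)   # potentially leaving
    if t_pe > t_pl:
        return None                   # enters after it leaves: invisible
    return (x0 + t_pe * dx, y0 + t_pe * dy,
            x0 + t_pl * dx, y0 + t_pl * dy)

print(liang_barsky(-5, 5, 15, 5, 0, 0, 10, 10))  # (0.0, 5.0, 10.0, 5.0)
```

Note that p < 0 is exactly the inner-product test of step 1 (Ni . (p1 - p0) < 0), reduced to a single component per edge for an upright window.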

Sutherland-Hodgman Polygon Clipping

Since polygons are basic primitives, algorithms have been developed for clipping them directly. The Sutherland-Hodgman algorithm is a polygon clipper. It was a basic component in James Clark's ``Geometry Engine,'' which was the precursor to the first Silicon Graphics machines. This algorithm clips any subject polygon (convex or concave) against any convex clipping window, but we will usually pretend the clipping window is an upright rectangle.

Given a subject polygon with an ordered sequence of vertices v1, v2, ..., vn,

Sutherland-Hodgman compares each subject polygon edge against a single clip window edge, saving the vertices on the in-side of the edge and the intersection points when edges are crossed. The clipper is then re-entered with this intermediate polygon and another clip window edge.

Given a clip window edge and a subject polygon edge, there are four cases to consider:

1. The subject polygon edge goes from outside the clip window edge to outside. In this case we output nothing.

2. The subject polygon edge goes from outside the clip window edge to inside. In this case we save the intersection point and the inside vertex.

3. The subject polygon edge goes from inside the clip window edge to outside. In this case we save only the intersection point.

4. The subject polygon edge goes from inside the clip window edge to inside. In this case we save the second inside vertex (the first was saved previously).

To complete the description, we need to consider the first vertex of the subject polygon and its last edge. If the first vertex is inside the current clip edge we save it to the list of vertices of the intermediate polygon; otherwise we drop it. For the last edge, note that if nothing has yet been saved in the intermediate polygon, the entire subject polygon must not be visible in the clip window, so we can quit. Otherwise, if the last subject edge crosses the clip window edge, the intersection point must be appended to the intermediate polygon.
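
The four cases, including the wrap-around last edge, can be sketched by clipping against one half-plane at a time; clipping to an upright rectangle is then four such passes (the polygon coordinates below are illustrative):

```python
def clip_halfplane(poly, inside, intersect):
    """One Sutherland-Hodgman pass against a single clip edge.
    Walks each subject edge cur -> nxt and applies the four cases."""
    out = []
    for i in range(len(poly)):
        cur, nxt = poly[i], poly[(i + 1) % len(poly)]  # wraps the last edge
        if inside(cur):
            out.append(cur)                      # inside vertex is kept
            if not inside(nxt):
                out.append(intersect(cur, nxt))  # inside -> outside: exit point
        elif inside(nxt):
            out.append(intersect(cur, nxt))      # outside -> inside: entry point
        # outside -> outside: nothing is output
    return out

def clip_rect(poly, xmin, ymin, xmax, ymax):
    """Clip a polygon to an upright rectangle: four half-plane passes."""
    def vert(x):    # intersection with a vertical clip line at x
        def f(p, q):
            t = (x - p[0]) / (q[0] - p[0])
            return (x, p[1] + t * (q[1] - p[1]))
        return f
    def horiz(y):   # intersection with a horizontal clip line at y
        def f(p, q):
            t = (y - p[1]) / (q[1] - p[1])
            return (p[0] + t * (q[0] - p[0]), y)
        return f
    passes = [(lambda p: p[0] >= xmin, vert(xmin)),
              (lambda p: p[0] <= xmax, vert(xmax)),
              (lambda p: p[1] >= ymin, horiz(ymin)),
              (lambda p: p[1] <= ymax, horiz(ymax))]
    for inside, intersect in passes:
        if not poly:
            break           # nothing left: subject polygon is invisible
        poly = clip_halfplane(poly, inside, intersect)
    return poly

# A square larger than the window clips down to the window corners:
square = [(-5, -5), (15, -5), (15, 15), (-5, 15)]
print(sorted(clip_rect(square, 0, 0, 10, 10)))
```

Each pass re-enters the clipper with the intermediate polygon, exactly as the text describes.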


Unit II

Introduction

When you represent a 3-D object on the computer or on paper, you should be aware of what type of projection method you are using in representing the object. Your goal should always be to represent the object with:

the least amount of distortion
the critical features most visible
the highest degree of realism

You will find that these requirements are in most cases mutually exclusive. Instead you will have to decide which are the most important and which projection method will best achieve those goals.

Projection

Projection, as we refer to it here, is when you reduce the three dimensions of a real or virtual (computer) object to two dimensions. This is necessary to represent the object on a computer screen or a flat piece of paper. Different projection methods will collapse these three dimensions to two in different ways, creating differing representations of the object. One way of thinking of this process of projecting is to use projection lines to 'map' the three dimensions onto a sheet of paper. You would 'see' this projection if you oriented your line of sight parallel to the projection lines. Projection lines are a way of mimicking how you might see the object with your eyes.

The orientation of these projection lines relative to the object, to the paper, and to each other determines the type of projection.

Parallel Projection

With parallel projection, all of the projection lines are parallel to each other. Parallel projection lines mean that edges that are parallel on the real object are also parallel in the projection. This allows for the least amount of distortion of features within the object. It also allows, under some circumstances, measurements of the object to be taken off the projection. With most types of parallel projection, the projection lines are perpendicular to the projection surface. Different types of parallel projections are created by orienting the object differently relative to the projection surface and, thus, the lines of projection. Parallel projections include multiview and pictorial projections.

Multiview Projection

With a multiview projection, the projection lines are oriented parallel to one of the principal axes of the object. Notice that the projection lines follow a large number of edges on the object. These edges define one of the principal axes which, in turn, defines one of the primary dimensions of the object. By projecting along one of the primary dimensions, this dimension is collapsed completely in the projection. Another way of stating this is that when you orient a principal axis perpendicular to a projection surface, it is not seen at all. Notice in the projection that all of the features represented along that primary dimension are completely missing. How do you determine what the principal axes are? Objects don't come predefined this way, so you end up having to decide what orientation of three mutually perpendicular (orthogonal) axes follows a majority of the key edges of the object. These axes can also be thought of as the Cartesian coordinate axes, X, Y, and Z. One way of thinking of this is to imagine putting the object in the smallest box possible. How is the object oriented in the box? The edges of the box now represent the principal axes relative to the object.


Multiview projection gets its name because only two dimensions of the object are shown in each projection. The two principal dimensions displayed are shown in true size; there is no distortion. This is the case because these principal axes are both parallel to the surface being projected onto. If you are going to describe all three dimensions of the object, you must have two or more (multi) views. Multiple multiview projections brought together into a single drawing is the standard format for technical drawings used in engineering and architecture.

front, side, top

The 3-D arrow shows the direction you would view the object to see this particular multiview projection. Note that each of these views is parallel to one of the principal axes. In addition to the three multiview projections, there is also another parallel projection (the isometric pictorial) that will be discussed later. The next set of figures shows what these different multiviews would look like if viewed along the 3-D arrows:

Pictorial Projection

Pictorial projection, unlike multiview projection, is designed to allow the viewer to see all three primary dimensions of the object in the projection. The degree to which a dimension gets 'collapsed' in the projection depends on the orientation of the line of sight relative to the object. Whereas a multiview is designed to focus in on only two of the three dimensions of the object, a pictorial provides a holistic view of the object. The tradeoff is that a multiview allows, in general, a more undistorted view of the features in the two dimensions displayed while lacking a holistic view of the object (thus needing multiple views to fully describe the object).

Axonometric Pictorial Projections

When parallel projection is used to create a view showing all three dimensions of an object, this is called an axonometric pictorial projection. Axonometric projections are classified according to the orientation of the principal axes relative to the projection surface. This orientation of axes determines how much each principal dimension is distorted. When a principal axis is oriented at something other than parallel to the projection surface, the lengths of features in that dimension are shown shorter than their true length. This is called foreshortening. The closer the axis comes to being perpendicular to the projection surface, the more foreshortened that dimension becomes, until it finally collapses to zero length.

The most common type of axonometric projection is the isometric pictorial projection. With this pictorial, all three principal axes are oriented at the same angle to the projection plane, creating an equal amount of foreshortening in all three dimensions. Figure 6 shows an example of an isometric projection. Notice that the three principal axes overdrawn on the object make an angle of 120 degrees to each other on the projected surface. On the real object, these axes would actually be at 90 degrees to each other.

Perspective Projection

If parallel projection provides the least amount of distortion, why would you want to do anything else? In fact, parallel projection does not do a very good job of mimicking how we see the real world around us. When looking around us, objects that are farther away look smaller. Similarly, single objects that span a great distance, such as roads or railroad tracks, look as though their parallel edges are getting closer together as they recede into the distance. Finally, when an object reaches a theoretical 'far point', it disappears altogether. This happens at what we call the horizon line. We mimic this effect by allowing edges that are parallel on the object to converge as they move towards the theoretical horizon line on the projection surface. This technique uses perspective projection, which has lines of sight that are neither parallel to each other nor perpendicular to the projection surface. The rate at which parallel edges converge is called the perspective angle. This angle is determined by the distance an imaginary viewer is from the object being represented.


Parallel projection mimics the case where the 'viewer' is infinitely far away from the object. In this case, the perspective angle is zero and the lines of projection are parallel. As the viewer gets nearer to the object, the angle increases and the rate of convergence of the edges grows.
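
The difference can be sketched numerically. Under parallel (orthographic) projection depth is simply dropped; under perspective projection each point is scaled by d/z, so distant points shrink. The setup below (viewer at the origin looking along +z, projection plane at z = d) is an assumed convention for illustration:

```python
def project_parallel(p):
    """Orthographic projection: drop the z coordinate."""
    x, y, z = p
    return (x, y)

def project_perspective(p, d=1.0):
    """Perspective projection onto the plane z = d, viewer at the
    origin looking along +z. Assumes z > 0 (point in front of viewer)."""
    x, y, z = p
    return (x * d / z, y * d / z)

# Two points with the same (x, y) but different depths:
near, far = (1.0, 1.0, 2.0), (1.0, 1.0, 10.0)
print(project_parallel(near), project_parallel(far))
# (1.0, 1.0) (1.0, 1.0)  -- parallel: both appear the same size
print(project_perspective(near), project_perspective(far))
# (0.5, 0.5) (0.1, 0.1)  -- perspective: the distant point shrinks
```

As d/z approaches a constant (viewer infinitely far away), the perspective result converges to the parallel one, matching the paragraph above.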

1. Introduction to spline curves

Every graphics system has some form of primitive to draw lines. Using these primitives we can draw many complex shapes. However, as these shapes get ever more complex and finely detailed, so does the data needed to describe them accurately.

The worst case scenario is the curve. A curve can be described by a finite number of short straight segments. However, on close inspection this is only an approximation. To get a better approximation we can use more segments per unit length. This increases the amount of data required to store the curve and makes it difficult to manipulate. We clearly need a way of representing these curves in a more mathematical fashion. Ideally, our descriptions will be:

Reproducible - the representation should give the same curve every time;
Computationally quick;
Easy to manipulate, especially important for design purposes;
Flexible;
Easy to combine with other segments of curve.

The types of curve I will discuss fall into two broad categories: interpolating or approximating curves. Interpolating curves pass through the points used to describe them, whereas an approximating curve only gets near to the points (exactly what is meant by near will be discussed later). The points through which the curve passes are known as knots; the curve described by the equation is often referred to as a spline. This term originated in manual design, where a spline is a thin strip. This strip was held in place by weights to create a curve which could then be traced. In the same way we now use knots to describe a curve.

2. Explicit spline curves

The most basic definition of a curve in two dimensions is y = f(x). This definition can be used to draw some simple curves. Although virtually any function can be used, the speed consideration for our application usually restricts us to polynomial functions, that is, those where f(x) consists of powers of x only.

Cubic curves

As an example, consider the cubic polynomial

y = a x^3 + b x^2 + c x + d

where a, b, c, d are constants. Together with a range for x, this defines a segment of curve. The shape of the curve is fairly flexible, but it can't turn back on itself or be vertical. We could represent vertical curves using x=... instead, but the extra computational time involved in switching between the two methods inhibits the method's usefulness. The diagram below shows a cubic polynomial segment; its shape changes as the four coefficients and the range of x are varied.

Quadratic curves derived from control points

It is clear the above curve is not very easy to manipulate. It would be much easier if we could move points on the curve and see the curve change shape. Fortunately that's not too hard for a computer. I will only discuss the case for a quadratic curve here; the idea generalises to higher degrees, but each extra degree adds another parameter and more complexity in the calculations. Given three points that lie on a quadratic curve, there is only one solution, found by solving the three simultaneous equations


yi = a xi^2 + b xi + c  for i = 1, 2, 3,

where (x1, y1), (x2, y2), (x3, y3) are the three points on the curve. Solving these equations (by elimination, or directly via Lagrange interpolation) gives the coefficients a, b and c. Therefore by using these equations we can draw a quadratic through any three points, so long as no two x values are identical. The diagram below illustrates this idea; it assumes the range of x is as wide as needed to accommodate the given points.

3. Parametric spline curves

Parametric equations can be used to generate curves that are more general than explicit equations of the form y=f(x). A quadratic parametric spline may be written as

P(t) = a t^2 + b t + c

where P is the point we are trying to find, a, b and c are three vectors defining the curve, and t is the parameter. In order to solve this equation we can specify three points on the curve, labelled P0, P1 and P2; these are at positions along the curve given by the relevant parameter t. The curve will by convention run from t = 0 to t = 1, and two of the points we specify are the end points of the curve: P0 at t = 0 and P2 at t = 1, with the middle point P1 conventionally taken at t = 1/2. By substituting these values into the equation we can form three simultaneous equations, thus

P0 = c
P1 = a/4 + b/2 + c
P2 = a + b + c

and by solving these we can find a, b and c in terms of P0, P1 and P2:

c = P0
b = 4 P1 - 3 P0 - P2
a = 2 P0 - 4 P1 + 2 P2

We can now apply this to any set of three points, as shown in the diagram below. It is easy to see the much higher degree of flexibility achieved through the use of parametric equations, and we will see this exploited with more advanced methods later on.
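
A sketch of the solution above; the placement of the middle point at t = 1/2 is carried over from the derivation, and the sample points are illustrative:

```python
def parametric_quadratic(P0, P1, P2):
    """Vectors (a, b, c) of P(t) = a t^2 + b t + c, assuming
    P(0) = P0, P(1/2) = P1, P(1) = P2. Points are tuples."""
    c = P0
    a = tuple(2 * p0 - 4 * p1 + 2 * p2 for p0, p1, p2 in zip(P0, P1, P2))
    b = tuple(-3 * p0 + 4 * p1 - p2 for p0, p1, p2 in zip(P0, P1, P2))
    return (a, b, c)

def eval_quadratic(a, b, c, t):
    """Evaluate P(t) componentwise."""
    return tuple(ai * t * t + bi * t + ci for ai, bi, ci in zip(a, b, c))

a, b, c = parametric_quadratic((0, 0), (1, 1), (2, 0))
print(eval_quadratic(a, b, c, 0.5))  # (1.0, 1.0) -- the middle point
```

Unlike the explicit form, nothing stops this curve from doubling back or being vertical, which is the flexibility the text points out.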


4. Bezier curves

So far we have only considered defining curves wholly in terms of the points through which they pass. This is a logical way of thinking, though it does suffer from drawbacks. We wish to make arbitrarily complex curves. Using just one equation to get more and more complex curves leads to higher degrees of polynomial and becomes mathematically awkward. One solution is to create complex curves out of many simpler curves. We call these patches. The key to creating curves in this way is how we match the end of one curve to the start of the next. It is not acceptable to match just the end points; we must match gradients as well. Defining curves by the points through which they pass does not lend itself very well to patching.

Bezier curves are defined using four control points, known as knots. Two of these are the end points of the curve, while the other two effectively define the gradient at the end points. These two points control the shape of the curve. The curve is actually a blend of the knots. This is a recurring theme of approximation curves: defining a curve as a blend of the values of several control points. The diagram below shows a Bezier curve; you can see how the shape of the curve is affected by changing the knots.
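
Evaluating the blend can be sketched with de Casteljau's algorithm, which builds the cubic Bezier point by repeated linear interpolation between the four knots (the knot coordinates below are illustrative):

```python
def lerp(p, q, t):
    """Linear interpolation between two points (tuples)."""
    return tuple(pi + t * (qi - pi) for pi, qi in zip(p, q))

def bezier(p0, p1, p2, p3, t):
    """Cubic Bezier at parameter t via de Casteljau's algorithm:
    three rounds of linear interpolation between the knots."""
    a, b, c = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    d, e = lerp(a, b, t), lerp(b, c, t)
    return lerp(d, e, t)

# End knots are interpolated; the two interior knots only shape
# the curve (they pull it upward here without lying on it).
p0, p1, p2, p3 = (0, 0), (0, 1), (1, 1), (1, 0)
print(bezier(p0, p1, p2, p3, 0.5))  # (0.5, 0.75)
```

Note the curve passes through p0 and p3 but not through p1 or p2, which is exactly the "blend of the knots" behaviour described above.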

Bezier curves are more useful than any other type we have mentioned so far; however, they still do not achieve much local control. Increasing the number of control points does lead to slightly more complex curves, but as you can see from the following diagram, the detail suffers due to the nature of blending all the curve points together.

Up to now the diagrams have shown just one curve. I mentioned earlier that we can join up many simple curves to form a more complex one. The following diagram demonstrates this by showing two Bezier curves being changed as one. No matter how you change the control points, the join always remains smooth.

5. B-Spline curves

As shown in the last example, the main problem with Bézier curves is their lack of local control. Simply increasing the number of control points adds little local control to the curve. This is due to the nature of the blending used for Bézier curves: they combine all the points to create the curve. The obvious solution is to combine only those points nearest to the current parameter. For this we define our points to lie in parametric space at equal intervals:

These points are labelled internally from 0 to (number of points)-1. To calculate the curve at any parameter t we place a Gaussian-like curve over the parameter space. This curve is actually an approximation of a Gaussian; it does not extend to infinity at each end, but only to +/- 2, using the following equations:

Figure 5.2 - The Approximate Gaussian Curve

This curve peaks at a value of 2/3, and at +/- 1 its value is 1/6. When this curve is placed over the array of control points, it gives the weighting of each point. As the curve is drawn, each point in turn becomes the most heavily weighted, so we gain more local control. The diagram below shows this curve in action. During the animation, the weighting is shown by both the size of the marker and the darkness of the line joining the marker to the curve.
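The weighting curve can be written out directly. The piecewise polynomial below is the standard uniform cubic B-spline basis function, reconstructed here from the values quoted in the text (peak 2/3 at 0, 1/6 at +/-1, zero beyond +/-2); it is a sketch, not code from the tutorial.

```python
# The "approximate Gaussian" weighting curve: the uniform cubic
# B-spline basis function, non-zero only on (-2, 2).

def bspline_weight(t):
    a = abs(t)
    if a < 1:
        return 2/3 - a*a + a**3 / 2      # central hump, peak 2/3 at t=0
    if a < 2:
        return (2 - a)**3 / 6            # tails, reaching 0 at t=+/-2
    return 0.0
```

At the integer offsets the weights are 1/6, 2/3, 1/6, which sum to one, so a point on the curve is always a convex blend of its three nearest control points.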

Notice how the curve seems to go haywire at either end. At P0, the Gaussian curve covers points from -1 to 1 (at points -2 and 2 the Gaussian weight is zero). The point at -1 is not defined, so the curve has an undefined value; in this example it is being pulled towards the origin. This behaviour is not really acceptable. One way to correct it is to duplicate the end points; if you place two points in the above diagram on top of each other, you can see how the curve comes that much closer. This is still not perfect.

What we want is for the curve to end at the end knots. If we invent a knot at -1, and another just past the other end of the curve, we can achieve this. These made-up knots are called phantom knots. By looking at the Gaussian shape, it is easy to see that when the curve is at knot 0, it takes 2/3 of its weight from point 0 and 1/6 from each of points 1 and -1. For the curve to pass through knot 0, the two points at 1 and -1 must cancel each other out; that is, they must lie opposite each other on the graph. You can see this in the next diagram; as you move knot 0, the phantom knot always stays opposite knot 1.
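The phantom-knot construction can be checked numerically. This sketch (with made-up coordinates) reflects knot 1 through knot 0 and verifies that the two 1/6 weights cancel, leaving the curve exactly at knot 0.

```python
# Phantom-knot placement: reflect knot 1 through knot 0 so the
# 1/6 weights at +/-1 cancel and the curve starts at knot 0.

def phantom_knot(k0, k1):
    """Phantom knot opposite k1 with respect to k0."""
    return (2 * k0[0] - k1[0], 2 * k0[1] - k1[1])

k0, k1 = (1.0, 2.0), (3.0, 5.0)      # example coordinates (made up)
pm1 = phantom_knot(k0, k1)

# Curve value at knot 0: 1/6 * phantom + 2/3 * k0 + 1/6 * k1
x = pm1[0]/6 + 2*k0[0]/3 + k1[0]/6
y = pm1[1]/6 + 2*k0[1]/3 + k1[1]/6
```

The blended point (x, y) comes out equal to k0, which is exactly the clamping behaviour the phantom knot is meant to provide.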

There's one other useful thing we can do with B-splines: we can make the spline go through all the knots. To do this, we define a set of parametric knots to be those required to make a B-spline pass through our geometric knots. For N+1 geometric knots (those we define) there need to be N+3 parametric knots to create the curve.

This gives us two degrees of freedom: the gradients at knot 0 and knot N. The two sets of knots are related by equations based on the Gaussian curve; for example

where P are the geometric knots and A are the parametric knots. The whole set of equations can be made into the following matrix equation:

To calculate the parametric knots, this matrix must be inverted, a job computers find much easier than we do. The values may then be calculated directly. You can see this curve in action below.

That completes the two dimensional segment of this tutorial. We now move on to consider how these methods are extended to defining curved surfaces.

6. Introduction to Surfaces

When comparing mathematics in two and three dimensions, there are many similarities. Very often, the techniques used in the simpler two-dimensional case easily extend to cover three dimensions. Some of the curve representations presented in the previous sections easily extend to three dimensions and can therefore represent surfaces.

When creating a curve, we used a single parametric dimension, defined points within it, and then used these to create our curve. For a surface, we need two orthogonal parametric dimensions of points, which form a rectangular mesh. At any point in parametric space, we use two blending functions, one in each parametric direction. For every knot defined, we calculate the product of the two blending function values, and this is the weight given to that knot. The sum of all the weights is still one, as it was for a curve.

The most commonly used methods of representing curved surfaces in computing are by Bézier surfaces and B-spline surfaces, and this tutorial is limited to these types.

7. Bezier surfaces

To create a Bézier surface, we blend a mesh of Bézier curves using the blending function

where j and k are indices in parametric space and the remaining term represents the location of the knots in real space. The Bézier functions specify the weighting of a particular knot; they are the Bernstein coefficients. The definition of the Bézier functions is

where C(n,k) represents the binomial coefficients. When u=0, the function is one for k=0 and zero for all other points. When we combine two orthogonal parameters, we find a Bézier curve along each edge of the surface, as defined by the points along that edge. Bézier surfaces are useful for interactive design and were first applied to car body design.
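The tensor-product blend can be sketched as follows. The 4x4 grid of scalar knot heights is hypothetical, and C(n,k) is taken from Python's math.comb.

```python
# A bicubic Bezier surface point: the weight of each knot is the
# product of two Bernstein blending functions, one per parametric
# direction. The 4x4 grid of knot heights is made up.

from math import comb

def bernstein(n, k, u):
    return comb(n, k) * u**k * (1 - u)**(n - k)

def bezier_surface(knots, u, v):
    """knots is an (n+1) x (m+1) grid of scalar heights."""
    n = len(knots) - 1
    m = len(knots[0]) - 1
    return sum(bernstein(n, j, u) * bernstein(m, k, v) * knots[j][k]
               for j in range(n + 1) for k in range(m + 1))

grid = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
corner = bezier_surface(grid, 0.0, 0.0)   # the surface meets the corner knots
```

Because each weight is a product of two one-dimensional weights that each sum to one, the surface weights also sum to one, just as the text states for curves.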

8. B-Spline surfaces

We can create a B-Spline surface using a similar method to the Bézier surface. For B-Spline curves, we used two phantom knots to clamp the ends of the curve. For a surface, we will have phantom knots all around the real knots, as shown below for an M+1 by N+1 knot surface.

There are two extra rows and two extra columns of knots in parametric space surrounding the real knots. Where we place these knots determines the shape of the surface at the edges. The method described here gives similar results to the method used for Bézier surfaces; that is, the edges of the surface form a B-Spline curve of the edge knots. This means some of the boundary conditions are

Unit III Multimedia:

Multimedia is an efficient combination of multimedia objects such as text, image, video, and audio. It is a general term used for documents, applications, presentations, and any information dissemination that uses these different media objects.

Multimedia Elements:

Facsimile
Document Images
Photographic Images
Geographic Information System maps
Voice commands
Voice synthesis
Audio Messages
Video Messages
Full Motion stored and Live Video
Holographic Images
Fractals

Applications of Multimedia:

Document Imaging
Image Processing and Image Recognition
Full Motion Digital Video Applications
Electronic Messaging
Entertainment
Corporate Communications

Multimedia Systems Architecture:

High resolution graphics display
1. VGA mixing
2. VGA mixing with scaling
3. Dual buffered VGA mixing/scaling

IMA architectural framework
1. The desktops
2. The servers

Network architecture for multimedia systems
1. Task-based multi-level networking
2. High speed server-to-server links

Duplication
Replication

Networking standards
1. Asynchronous Transfer Mode (ATM)
2. Fiber Distributed Data Interface (FDDI)

Evolving technologies for Multimedia Systems

Hypermedia documents
1. Hypertext
2. Hypermedia
3. Hyperspeech

HDTV and UDTV

3-D technologies and holography

1. Pointing devices
2. Displays

Fuzzy Logic (FL)

Digital Signal Processing
1. DSP architecture and applications

Memory management
Hardware-interrupt handling
Multitasking
Inter-task synchronization and communication
Multiple timer services
Device-independent I/O

Defining Objects for Multimedia Systems

Text

Images
1. Visible
2. Non-visible
3. Abstract

Audio and Video

Full Motion and Live Video

Multimedia Data Interface Standards

File formats for multimedia systems

Video processing standards
1. Intel’s DVI
2. Microsoft’s AVI

Multimedia Database

Multimedia data typically means digital images, audio, video, animation and graphics together with text data. The acquisition, generation, storage and processing of multimedia data in computers and transmission over networks have grown tremendously in the recent past.

This astonishing growth has been made possible by three factors. First, personal computer usage has become widespread and computational power has increased; technological advances have also produced high-resolution devices that can capture and display multimedia data (digital cameras, scanners, monitors, and printers), along with high-density storage devices. Second, high-speed data communication networks are now available; the Web has proliferated widely, and software for manipulating multimedia data is readily available. Third, some existing applications and many future applications need to work with multimedia data. This trend is expected to continue in the days to come.

Multimedia data have a number of exciting features. They can provide more effective dissemination of information in science, engineering, medicine, modern biology, and the social sciences. They also facilitate the development of new paradigms in distance learning and in interactive personal and group entertainment.

The huge amount of data in different multimedia-related applications has made databases necessary, since databases provide consistency, concurrency, integrity, security, and availability of data. From a user perspective, databases provide functionality for the easy manipulation, query, and retrieval of highly relevant information from huge collections of stored data.

Multimedia databases (MMDBs) have to cope with the increased usage of large volumes of multimedia data in various software applications. The applications include digital libraries, manufacturing and retailing, art and entertainment, journalism, and so on. Some inherent qualities of multimedia data have both direct and indirect influence on the design and development of a multimedia database. MMDBs are expected to provide almost all the functionality a traditional database provides. Beyond that, an MMDB has to provide new and enhanced functionality and features: MMDBs are required to provide unified frameworks for storing, processing, retrieving, transmitting, and presenting a variety of media data types in a wide variety of formats. At the same time, they must adhere to constraints that are normally not found in traditional databases.

Contents of MMDB

An MMDB needs to manage several different types of information pertaining to the actual multimedia data. They are:

Media data - This is the actual data representing images, audio, and video that is captured, digitized, processed, compressed, and stored.

Media format data - This contains information pertaining to the format of the media data after it goes through the acquisition, processing, and encoding phases. For instance, it includes the sampling rate, resolution, frame rate, encoding scheme, etc.

Media keyword data - This contains keyword descriptions, usually relating to the generation of the media data. For example, for a video this might include the date, time, and place of recording, the person who recorded it, and the scene that was recorded. This is also called content-descriptive data.

Media feature data - This contains features derived from the media data. A feature characterizes the media contents. For example, it could describe the distribution of colors, the kinds of textures, and the different shapes present in an image. This is also referred to as content-dependent data.

The last three types are called metadata, as they describe several different aspects of the media data. The media keyword data and media feature data are used as indices for searching purposes. The media format data is used to present the retrieved information.
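One way to sketch these four kinds of information in code is a simple record type. The field names below are illustrative, not drawn from any particular MMDB product.

```python
# A record holding the four kinds of MMDB information: the media data
# itself plus the three kinds of metadata. Field names are illustrative.

from dataclasses import dataclass, field

@dataclass
class MediaRecord:
    media_data: bytes                                   # digitized, compressed object
    media_format: dict = field(default_factory=dict)    # e.g. frame rate, encoding
    media_keywords: dict = field(default_factory=dict)  # content-descriptive data
    media_features: dict = field(default_factory=dict)  # content-dependent data

clip = MediaRecord(
    media_data=b"...",
    media_format={"frame_rate": 30, "encoding": "MPEG"},
    media_keywords={"place": "studio", "date": "1999-01-01"},
    media_features={"dominant_color": "blue"},
)
```

In this sketch the keyword and feature dictionaries would back the search indices, while the format dictionary would drive presentation, matching the roles described above.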

Designing MMDBs

Many inherent characteristics of multimedia data have direct and indirect impacts on the design of MMDBs. These include the huge size of MMDBs, their temporal nature, richness of content, complexity of representation, and subjective interpretation. The major challenges in designing multimedia databases arise from several requirements they need to satisfy, such as the following:

1. Manage different types of input, output, and storage devices. Data input can come from a variety of devices such as scanners and digital cameras for images, microphones and MIDI devices for audio, and video cameras. Typical output devices are high-resolution monitors for images and video, and speakers for audio.

2. Handle a variety of data compression and storage formats. Data encoding varies widely even within a single application. For instance, in medical applications, MRI images of the brain require lossless coding or a very stringent lossy coding technique, while X-ray images of bones can tolerate less stringent coding. The radiological image data, the ECG data, other patient data, and so on also have widely varying formats.

3. Support different computing platforms and operating systems. Different users operate computers and devices suited to their needs and tastes, but they all need the same user-level view of the database.

4. Integrate different data models. Some data, such as numeric and textual data, are best handled using a relational database model, while others, such as video documents, are better handled using an object-oriented database model. These two models must therefore coexist in MMDBs.

5. Offer a variety of user-friendly query systems suited to different kinds of media. From a user's point of view, easy-to-use queries and fast, accurate retrieval of information are highly desirable. A query for the same item can take different forms. For example, a portion of interest in a video can be queried using either

1) a few sample video frames as an example,

2) a clip of the corresponding audio track or

3) a textual description using keywords.

6. Handle different kinds of indices. The inexact and subjective nature of multimedia data has rendered the keyword-based indices and exact and range searches used in traditional databases ineffective. For example, the retrieval of records of persons based on social security number is precisely defined, but the retrieval of records of persons having certain facial features from a database of facial images requires content-based queries and similarity-based retrievals. This requires indices that are content dependent, in addition to keyword indices.

7. Develop measures of data similarity that correspond well with perceptual similarity. Measures of similarity for different media types need to be quantified so that they correspond well with the perceptual similarity of objects of those data types, and then incorporated into the search process.

8. Provide a transparent view of geographically distributed data. MMDBs are likely to be distributed in nature. The media data may reside in many different storage units, possibly spread out geographically. This is partly due to the shift of computation and computing resources from centralized to networked and distributed models.

9. Adhere to real-time constraints for the transmission of media data. Video and audio are inherently temporal in nature. For example, the frames of a video need to be presented at a rate of at least 30 frames/sec for the eye to perceive continuity in the video.

10. Synchronize different media types while presenting them to the user. It is likely that the different media types corresponding to a single multimedia object are stored in different formats, on different devices, and have different transfer rates. They therefore need to be periodically synchronized for presentation.

The recent growth in using multimedia data in applications has been phenomenal. Multimedia databases are essential for efficient management and effective use of huge amounts of data. The diversity of applications using multimedia data, the rapidly changing technology, and the inherent complexities in the semantic representation, interpretation and comparison for similarity pose many challenges. MMDBs are still in their infancy. Today's MMDBs are closely bound to narrow application areas. The experiences acquired from developing and using novel multimedia applications will help advance the multimedia database technology.

Unit IV Compression and Decompression

Compression and decompression techniques are utilized in a number of applications, such as facsimile systems, printer systems, and document storage and retrieval systems. When information is compressed, redundancies are removed; sometimes some of the real information is removed as well.

Types of compression

Lossless compression

Lossy compression

In lossless compression, data is not altered or lost in the process of compression or decompression. Decompression generates an exact replica of the original object. Text compression is a good example of lossless compression.

Run-length encoding

CCITT-group 3 1D

CCITT-group 3 2D

CCITT-group 4

Lempel-Ziv and Welch algorithm LZW

Lossy compression

In lossy compression, the results contain some information loss. It is often used for audio, images, and video.

Mechanisms

JPEG

MPEG

Intel DVI

Fractals

Data and File Format Standards

Data and file format standardization is crucial for sharing data among multiple applications and for exchanging information between applications.

Data and File Formats For Multimedia Systems

A large number of different formats, standard as well as proprietary, are in use. Since the personal computer industry and, more specifically, Microsoft Windows-based systems form the largest base for multimedia systems, we discuss the formats used primarily in personal computer environments. The multimedia file formats discussed in this chapter include the following:

Rich-text format (RTF)

Tagged image file format (TIFF)

Resource Interchange File Format (RIFF)

Musical instrument digital interface (MIDI)

Joint Photographic Experts Group (JPEG)

Audio Video Interleaved (AVI)

Indeo file format

TWAIN

The Resource Interchange File Format forms the basis of a number of the above file formats, and Microsoft recommends using the RIFF file format structure for any application requiring new file formats.

TWAIN

With the advent of multimedia systems, the business world has driven the need to use objects like still images, real-time video clips, and audio and voice soundtracks to create dramatically live and exciting presentations, hypermedia reports, and other documents. These objects are captured using sophisticated and complex devices such as scanners and digital still and video cameras. This allows an application to create a document with complex output: text displayed along with graphics and images, and video clips played with sound. More and more sophisticated devices are being developed to cater to the needs of applications wanting to use them.

Benefits:

1. Application developers can code to a single TWAIN specification, which allows an application to interface with all TWAIN-compliant input devices.

2. Device manufacturers can write device drivers for their proprietary devices and, by complying with the TWAIN specification, allow the devices to be used by all TWAIN-compliant applications.

3. It allows users to invoke the “Acquire” and “Select source” menu options to select one of multiple devices that use the same TWAIN driver.

TWAIN Specification Objectives:

1. Support multiple platforms, including Microsoft Windows, Apple Macintosh OS System 6.x or 7.x, UNIX, and IBM OS/2.

2. Support multiple devices such as scanners, digital cameras, frame grabbers, and so on.

3. Widespread acceptance through a standard interface.

4. Standard extendibility and backward compatibility.

5. Multimedia format support.

6. Ease of use.

Resource Interchange File Format (RIFF)

RIFF is not really a new file format. Rather, it provides a framework, or an envelope, for multimedia file formats for Microsoft Windows-based applications. Just as it has been used for some standardized formats, it can be used to convert a custom file format to a RIFF file format by wrapping a RIFF structure around it. For example, a MIDI file is converted to RIFF MIDI by adding RIFF structure, in the form of RIFF “chunks”, to the MIDI file.

Kinds of Chunks:

RIFF chunk - defines the contents of the RIFF file.

List chunk - allows embedding additional file information such as archival location, copyright information, creation date, and so on.

Subchunk - allows adding more information to a primary chunk when the primary chunk is not sufficient.
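The chunk structure itself can be sketched in code. This assumes only the basic RIFF chunk layout (a 4-character ID, a little-endian 32-bit size, then the data, padded to an even byte count); it is an illustration, not a full RIFF reader or writer.

```python
# Building and reading a single RIFF chunk: 4-byte ASCII ID,
# little-endian 32-bit size, data, padded to an even length.

import struct

def make_chunk(chunk_id, data):
    assert len(chunk_id) == 4
    chunk = chunk_id + struct.pack("<I", len(data)) + data
    if len(data) % 2:                  # RIFF pads chunks to even length
        chunk += b"\x00"
    return chunk

def read_chunk(buf):
    chunk_id = buf[:4]
    size, = struct.unpack("<I", buf[4:8])
    return chunk_id, buf[8:8 + size]   # the pad byte is not part of the data

chunk = make_chunk(b"data", b"hello")
```

Because the size field records the unpadded length, a reader can skip any chunk it does not recognize, which is what makes RIFF an extensible envelope rather than a fixed format.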

Multimedia input and output technologies

The traditional I/O devices are not suitable for multimedia applications. The keyboard feeds only numeric and alphanumeric values to the system and does not suit the GUI environment. Multimedia objects require data to be converted from analog to digital form, and each object type has some measure of resolution. Sound quality, for example, is measured in terms of the sampling rate and the number of bits used to represent amplitude.

PEN Input:

A pen provides better interaction for multimedia applications

Natural device for writing

Direct pointing device

To make gesture

Natural drawing tool

Working principle

Electronic pen and digitizer

Pen driver

Recognition context manager

Recognizer

Dictionary

Display driver

Video and image display systems

Display system requirements

Combine graphics and imaging technologies

Display system technologies

VGA mixing

VGA mixing with scaling

Dual buffered VGA mixing/scaling

Display performance issues

Network bandwidth

Decompression

Display technology

Video display technology standards

Monochrome display adapter

Color graphics adapter

Enhanced graphics adapter

Professional graphics adapter

Video graphics adapter

Extended graphics array

Flat panel display system

Passive-matrix monochrome

Active matrix monochrome

Passive matrix color

Active matrix color

Passive LCD matrix display

Active LCD matrix display

Print output technologies

Laser print quality requirements

Printer prices, resolution

Technology

Paper feed mechanism

Paper guide

Laser assembly

Corona assembly

Fuser

Toner cartridge

Image scanners

In a document imaging system, documents are scanned using a scanner.

Scanner types

A and B size scanners

Large form factor scanners

Flatbed scanners

Rotary drum scanners

Handheld scanners

Scanner components

CCD (charge-coupled device)

The output voltage of a CCD device is directly proportional to the amount of charge accumulated.

CCD devices are sensitive to small changes in light intensity, which results in a precise measure of the pixel value, digitized at up to 16 bits.
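Digitizing the CCD output can be sketched as a linear quantization of the voltage. The full-scale voltage and the clamping behaviour below are illustrative assumptions, not scanner specifics.

```python
# Linear quantization of a CCD output voltage: the voltage,
# proportional to accumulated charge, is mapped onto a 16-bit
# integer range. The 5 V full scale is an assumed example value.

def digitize(voltage, v_max, bits=16):
    levels = (1 << bits) - 1                 # 65535 for 16 bits
    v = min(max(voltage, 0.0), v_max)        # clamp to the valid range
    return round(v / v_max * levels)

full = digitize(5.0, 5.0)    # full-scale charge -> maximum code
zero = digitize(0.0, 5.0)    # no charge -> code 0
```

With 16 bits the scanner can distinguish 65536 intensity levels, which is why small changes in light intensity still produce distinct pixel values.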

Image enhancement techniques

A number of software techniques are used to improve the quality of an image. Scanners and printers use a technique called half-tones to address gray-scale issues.

Half-tones

Dithering

Enhancement

Brightness

Deskew

Contrast

Sharpening

Emphasis

Image manipulation

Scaling

Rotation

Cropping

Digital voice and audio

Applications: product presentations, product catalogs, product brochures, product manuals, installation instructions, maintenance manuals, training manuals, and on-line help.

The amplitude of the signal represents the intensity of the sound, and the frequency represents its pitch. Sound is made up of continuous analog sine waves that tend to repeat for seconds at a time, depending on the music or the voice.
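Point-sampling such a sine wave, as discussed under aliasing in Unit I, can be sketched as follows. The 1 kHz tone and 8 kHz sampling rate are arbitrary example values; the rate is above the Nyquist requirement of twice the signal frequency.

```python
# Point-sampling an analog sine wave: a 1 kHz tone sampled at 8 kHz
# yields 8 samples per cycle, well above the Nyquist minimum of 2.

import math

def sample_sine(freq_hz, sample_rate_hz, n_samples, amplitude=1.0):
    return [amplitude * math.sin(2 * math.pi * freq_hz * k / sample_rate_hz)
            for k in range(n_samples)]

samples = sample_sine(1000, 8000, 8)   # one full cycle in 8 samples
```

Were the tone above 4 kHz at this 8 kHz rate, the reconstructed signal would alias to a lower frequency, exactly as described for audio CDs in Unit I.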

Video images and animation:

A video frame grabber is used to capture, manipulate, and enhance video images. A video channel multiplexer has multiple inputs for different video sources. It allows the video channel to be selected under program control and switches in the control circuitry appropriate for the selected channel, as in a TV with multisystem inputs.

The ADC takes input from the video multiplexer and converts the amplitude of a sampled analog signal to either an 8-bit digital value for monochrome or a 24-bit digital value for color.

The input lookup table, along with the ALU, allows image processing functions to be performed on a per-pixel basis and on an image frame basis.

The image frame buffer is organized as a 1024 x 1024 x 24 storage buffer to store images for image processing and display. The frame buffer has dual ports.

Video and still image processing

Video image processing is the process of manipulating a bit-mapped image so that the image can be enhanced, restored, distorted, or analyzed.

Pixel point to point processing

Histogram sliding

Histogram stretching and shrinking

Pixel threshold
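The histogram operations above can be sketched for a grayscale image held as a flat list of 8-bit pixel values; the pixel data is made up. Sliding adds a constant to every pixel, and stretching rescales the values to the full 0-255 range.

```python
# Point-to-point processing on a flat list of 8-bit pixels:
# histogram sliding adds a constant offset (with clamping),
# histogram stretching rescales values to the full 0-255 range.

def slide(pixels, offset):
    return [min(255, max(0, p + offset)) for p in pixels]

def stretch(pixels):
    lo, hi = min(pixels), max(pixels)
    if lo == hi:
        return pixels[:]               # a flat image cannot be stretched
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]

img = [50, 100, 150, 200]
brighter = slide(img, 40)      # every pixel 40 levels brighter
stretched = stretch(img)       # contrast expanded to 0..255
```

Both are per-pixel operations, which is exactly what "pixel point-to-point processing" means: each output pixel depends only on the corresponding input pixel.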

Interframe image processing

Interframe image processing is the same as point-to-point image processing, except that the image processor operates on two images at the same time.

Image averaging

Image subtraction

Logical image operations
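Image averaging and subtraction can be sketched on two equal-sized frames held as flat pixel lists; the frame data is made up.

```python
# Interframe processing: two frames of equal size combined
# pixel by pixel.

def average(frame_a, frame_b):
    """Averaging reduces random noise across two exposures."""
    return [(a + b) // 2 for a, b in zip(frame_a, frame_b)]

def subtract(frame_a, frame_b):
    """Subtraction highlights pixels that changed between frames."""
    return [abs(a - b) for a, b in zip(frame_a, frame_b)]

f1 = [10, 20, 30, 40]
f2 = [10, 24, 30, 60]
diff = subtract(f1, f2)    # non-zero only where the frames differ
```

Non-zero entries in the difference mark motion or change between the two frames, which is the typical use of image subtraction.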

Spatial filter processing

The process of generating images with either low-spatial-frequency components or high-frequency components emphasized is called spatial filter processing.

Low pass filter

High pass filter

Laplacian filter
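A spatial filter can be sketched as a 3x3 convolution over the image. The Laplacian kernel below is the classic high-pass (edge-emphasizing) example; the constant test image is made up, and borders are simply skipped for brevity.

```python
# Spatial filtering as a 3x3 convolution. A box kernel of ninths
# would be a low-pass filter; the Laplacian kernel below is the
# classic high-pass one. Border pixels are skipped for simplicity.

def convolve3x3(img, kernel):
    """img and kernel are lists of rows of numbers."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = sum(kernel[j][i] * img[y + j - 1][x + i - 1]
                      for j in range(3) for i in range(3))
            row.append(acc)
        out.append(row)
    return out

LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]

flat = [[5] * 4 for _ in range(4)]
edges = convolve3x3(flat, LAPLACIAN)   # constant image: no detail to emphasize
```

Because the Laplacian kernel's weights sum to zero, a constant region maps to zero and only intensity changes (edges) survive, which is what makes it a high-pass filter.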

Frame processing

Image scaling

Image rotation

Image translation

Scale to gray

Image transformation

Image compression and decompression

Image animation techniques

An illusion of movement is created by sequentially playing still image frames at a rate of 15-20 frames per second.

Animation methods

Toggling between image frames

Rotating through several image frames

Delta frame animation

Palette animation
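The frame-rate requirement and the toggling method can be sketched together. The frame labels are placeholders; in a real system each would be a buffered image.

```python
# At 15-20 frames per second, each still frame is displayed for
# 50-67 ms. Toggling alternates between two buffered frames.

def frame_delay_ms(fps):
    return 1000 / fps

def toggle_sequence(frame_a, frame_b, n_frames):
    """Order in which two buffered frames would be displayed."""
    return [frame_a if i % 2 == 0 else frame_b for i in range(n_frames)]

delay = frame_delay_ms(20)             # ms to hold each frame at 20 fps
order = toggle_sequence("A", "B", 4)   # alternating display order
```

The other methods in the list generalize this idea: rotating cycles through more than two frames, while delta-frame and palette animation update only what changed between frames.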

Unit V Multimedia authoring and user interface

Multimedia systems differ from other systems in two main respects: the variety of information objects used in applications, and the level of integration achieved in using these objects in complex interconnected applications. In multimedia applications, the user creates and controls the flow of data and determines its expected rendering. For this reason, applications that allow users to create multimedia objects and link or embed them in other compound objects, such as documents or database records, are called authoring systems.

Multimedia authoring systems

Authoring systems for multimedia applications are designed with two primary target users in mind: professionals who prepare documents, audio or soundtracks, and full-motion video clips for wide distribution, and average business users preparing documents, audio recordings, or full-motion video clips for stored messages or presentations.

Design issues of multimedia authoring

Display resolution

Data formats for captured data

Compression algorithms

Network interfaces

Storage formats

A number of design issues must be considered for handling different display outputs such as

Level of standardization on display resolutions

Display protocol standardization

Corporate norms for service degradations

Corporate norms for network traffic degradations as they relate to resolution issues.

File format and data compression issues

The primary concern with very large objects is being able to locate them quickly and play them back efficiently. In almost all cases the objects are compressed in some form. There is, however, another aspect of storage that is equally important from a design perspective: it is useful to have some information about an object available outside the object itself, to allow a user to decide whether they need to access the object's data.

1. Compression type

2. Estimated time to decompress and display or play back the object (for audio and full-motion video)

3. Size of the object (for images, or if the user wants to download the object to a notebook)

4. Object orientation (for images)

5. Annotation markers and history (for images and sound or full-motion video)

6. Index markers (for sound and full-motion video)

7. Date and time of creation

8. Source file

Design approaches to authoring

Designing an authoring system spans a number of critical design issues, including the following:

1. Hypermedia application design specifics
2. User interface aspects
3. Embedding/linking streams of objects to a main document or presentation
4. Storage of and access to multimedia objects
5. Playing back combined streams in a synchronized manner

Hypermedia applications bring together a number of design issues not commonly encountered in other types of applications. As in any other application type, however, a good user interface design is crucial to the success of a hypermedia application. The user interface presents a window to the user for controlling storage and retrieval, inserting objects in the document and specifying the exact point of insertion, and defining index marks for combining different multimedia streams and the rules for playing them back.

Types of multimedia authoring systems

Dedicated authoring systems

Timeline-based authoring

Structured multimedia authoring

Programmable authoring systems

Multisource multi-user authoring system

Telephone authoring systems

Hypermedia application design considerations

Multimedia applications are based on a totally new metaphor that combines the television, the VCR, and a window-based application manager in one screen. The user interface must be highly intuitive so that the user can learn the tools quickly and use them effectively.

A good designer needs to determine the strategic points during the execution of an application

where user feedback is essential or very useful.

The following steps lead to good hypermedia design:

1. Determining the type of hypermedia application

2. Structuring the information

3. Determining the navigation throughout the application

4. Methodologies for accessing the information

5. Designing the user interface

Integration of applications

Depending on the job function of the knowledge worker, the computer may be called upon to run a

diverse set of applications, including some combination of the following.

Electronic mail

Word processing

Graphics and formal presentation preparation software

Spreadsheet

Access to a relational database

Customized applications directly related to job function

Common UI and application integration

Microsoft provides a common user interface across a large number of applications by standardizing at the following levels.

Overall visual look and feel of the application windows

Menus

Dialog boxes

Buttons

Help feature

Scroll bars

Tool bars

File open and save

Data Exchange

The MS clipboard allows exchanging data in any format. The clipboard can be used to exchange

multimedia objects as well, including cutting or copying a multimedia object in one document and pasting it

in another.
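The multi-format exchange described above can be sketched as follows: the copying application offers the same item in several formats at once, and the pasting application takes the richest format it understands. This is a simplified illustrative model, not the actual Windows clipboard API.

```python
class Clipboard:
    """Minimal sketch of a clipboard that, like the Windows clipboard,
    can hold the same item in several formats at once."""
    def __init__(self):
        self._formats = {}

    def copy(self, data_by_format: dict) -> None:
        # Replace current contents with the new item in all offered formats
        self._formats = dict(data_by_format)

    def paste(self, preferred: list):
        # The pasting application asks for the first format it understands
        for fmt in preferred:
            if fmt in self._formats:
                return fmt, self._formats[fmt]
        return None

clip = Clipboard()
clip.copy({"text/plain": "chart", "image/bmp": b"...bitmap bytes..."})
# An image editor prefers the bitmap; a text editor would pick text/plain
fmt, data = clip.paste(["image/bmp", "text/plain"])
```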

Distributed data access

Application integration succeeds only if all applications required for a compound object can access

the sub objects that they manipulate. Fully distributed data access implies that any application at any client

workstation in the enterprise-wide WAN must be able to access any data object as if it were local.

User interface design

User interface design for multimedia applications is more involved than for other applications due to

the number of types of interactions with the user.

Four kinds of user interface design are available:

Media editors

An authoring application

Hypermedia object creation

Multimedia object locator and browser

A media editor is an application responsible for the creation and editing of a specific multimedia

object such as an image, voice or video object.

Designing user interface

The correctness of a user interface is a matter of the user's perception.

Guidelines

Planning the overall structure of the application

Planning the content of the application

Planning the interactive behavior

Planning the look and feel of the application

Special metaphors for multimedia applications

Multimedia applications bring together two key technologies; entertainment and business

computing.

The organizer metaphor

The multimedia aspects of the organizer are not very obvious until one begins to associate the

concept of embedding multimedia objects in the appointment diary or notepad for future filing.

The telephone metaphor

The telephone, until very recently, was considered an independent office appliance. The advent of

voice mail systems was the first step in changing the role of the telephone.

Aural user interface

The common approach for speech-recognition based user interfaces has been to graft the speech

recognition interface into existing graphical user interfaces. This is a mix of conceptually mismatched media

that makes the interface cumbersome and not very efficient.

The real challenge in designing AUI systems is to create an aural desktop that substitutes voice

and ear for the keyboard and display, and be able to mix and match them.

The VCR metaphor

The easiest user interface for functions such as video capture, channel play and stored video

playback is to emulate the camera, television, and VCR on screen.

Audio and video indexing functions

Audio tape indexing has been used by a large number of tape recorders since the early 1950s.

Index marking on tape is a function that has been available in many commercial VCRs. Index marking on

tape left a physical index mark on the tape. These index marks could be used in fast-forward and rewind

searches.
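The index-mark search described above can be sketched as follows: marks are kept sorted, so a fast-forward search jumps straight to the next mark and a rewind search to the previous one. Positions and method names here are illustrative.

```python
import bisect

class IndexedTape:
    """Sketch of VCR-style index marks on a media timeline."""
    def __init__(self):
        self._marks = []  # positions in seconds, kept sorted

    def mark(self, position: float) -> None:
        bisect.insort(self._marks, position)

    def next_mark(self, position: float):
        # Fast-forward search: first mark strictly after `position`
        i = bisect.bisect_right(self._marks, position)
        return self._marks[i] if i < len(self._marks) else None

    def prev_mark(self, position: float):
        # Rewind search: last mark strictly before `position`
        i = bisect.bisect_left(self._marks, position)
        return self._marks[i - 1] if i > 0 else None

tape = IndexedTape()
for t in (30.0, 95.0, 140.0):
    tape.mark(t)
```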

Hypermedia messaging

E-mail-based document interchange is generally known as messaging services. Messaging is one of

the major multimedia applications. Mobile messaging represents a major new dimension in the user’s

interaction with the messaging system. Handheld and desktop devices, an important growth area for

messaging, require complementary back-end services to effectively manage communications for a large

organization. An answering service can take multiple messages simultaneously irrespective of line usage.

The roles of telephone carriers and local cable companies are starting to blur.

Hypermedia message components:

A hypermedia message may be a simple message in the form of text with an embedded graphic,

soundtrack, or video clip. The components of hypermedia messages are handled through the following

three steps.

The user may have watched some video presentation on the material and may want to attach a part

of that clip to the message.

Some pages of the book are scanned as images. The images provide an illustration or a clearer

analysis of the topic.

The user writes the text of the message using a word processor.

When the message is fully composed, the user signs it and mails the message to the addressee. The

messaging system must ensure that the images and video clips referenced in the message are also

transferred to a server local to the recipient.
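The composition and delivery steps above can be sketched as follows. The point of the sketch is the final step: linked (non-embedded) objects referenced by the message must be copied to a server local to the recipient, while embedded objects already travel with the message body. All names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Attachment:
    kind: str        # "image" or "video"
    location: str    # where the object currently resides
    embedded: bool   # embedded objects travel with the message body

@dataclass
class HypermediaMessage:
    body: str
    attachments: list = field(default_factory=list)
    signed_by: str = ""

def deliver(message: HypermediaMessage, recipient_server: set) -> None:
    """Delivery step: copy every linked object the message references
    to a server local to the recipient."""
    for att in message.attachments:
        if not att.embedded:
            recipient_server.add(att.location)

msg = HypermediaMessage(body="See the attached clip and scanned pages.")
msg.attachments.append(Attachment("video", "srv1/clip42.avi", embedded=False))
msg.attachments.append(Attachment("image", "page7.tif", embedded=True))
msg.signed_by = "alice"

local_store = set()   # the recipient's local object server
deliver(msg, local_store)
```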

Message types

Text messages

Rich-text messages

Voice messages

Full-motion video messages

Hypermedia linking and embedding

Linking and embedding are two methods for associating multimedia objects with documents.

Linking works as in hypermedia applications: hypertext systems associate keywords in a document

with other documents.

A linked multimedia object is stored separately from the document, and the link provides a pointer

to its storage. An embedded object is a part of the document and is retrieved when the

document is retrieved.

Linking and embedding are discussed here in a context specific to Microsoft Object Linking and Embedding (OLE).

When a multimedia object is incorporated in a document, its behavior depends on whether it is linked or

embedded. The difference between linking and embedding stems from how and where the actual source

data that comprises the multimedia object resides.

Linking objects:

When an object is linked, the source data object, called the link source, continues to reside

wherever it was at the time the link was created. This may be at the object server where it was created, or

where it may have been copied in a subsequent replication.

Embedding objects:

When the multimedia object is embedded, a copy of the object is physically stored in the

hypermedia document. Graphics and images can be inserted in a rich-text document or embedded using

such techniques as OLE.
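The difference can be illustrated with a minimal sketch: a linked object renders whatever the link source currently holds, while an embedded object renders the copy taken at insertion time. These are illustrative classes, not the OLE API.

```python
import copy

class SourceObject:
    def __init__(self, data):
        self.data = data

class LinkedObject:
    """A link stores only a pointer; revisions at the link source are
    seen the next time the document renders the object."""
    def __init__(self, source: SourceObject):
        self.source = source
    def render(self):
        return self.source.data

class EmbeddedObject:
    """Embedding copies the object into the document; later revisions
    at the original source are not reflected."""
    def __init__(self, source: SourceObject):
        self._copy = copy.deepcopy(source)
    def render(self):
        return self._copy.data

chart = SourceObject("v1")
linked = LinkedObject(chart)
embedded = EmbeddedObject(chart)
chart.data = "v2"   # the source object is revised after insertion
```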

Design issues:

For users who have a requirement for component documents, OLE represents an important

advancement in systems and application software on distributed platforms. OLE will create significant

support headaches for users if there is incomplete link tracking between documents that have been mailed

between PCs and the application which created those objects.

Creating hypermedia messages:

A hypermedia message can be a complex collection of a variety of objects. It is an integrated

message consisting of text, binary files, images, bitmaps, voice and sound.

Procedure:

Planning

Creating each component

Integrating components

Integrated multimedia message standards:

As text-based technologies have progressed and have become increasingly integrated with

messaging systems, new standards are being developed to address interoperability of applications from

different software vendors.

Vendor-independent messaging

The Vendor-Independent Messaging (VIM) interface is designed to facilitate messaging between VIM-enabled

electronic mail systems as well as other applications. A VIM interface makes mail and messaging services

available through a well-defined interface. A messaging service enables its clients to communicate with

each other in a store-and-forward manner. VIM defines messaging as the data exchange mechanism

between VIM aware applications.

A VIM mail message is a message of a well-defined type that must include a message header and may

include note parts, attachments, and other application-defined components.

VIM services

The VIM interface provides a number of services for creating and mailing a message, such as:

Electronic message composition and submission

Electronic message sending and receiving

Message extraction from mail system

Address book services
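A store-and-forward mail service in the spirit of the services listed above (composition, submission, sending and receiving, address book) can be sketched as follows; the class and method names are hypothetical and are not the actual VIM C API.

```python
class MailSession:
    """Illustrative store-and-forward messaging service."""
    def __init__(self):
        self.address_book = {}   # name -> address (address book service)
        self.outbox = []         # submitted but not yet delivered
        self.mailboxes = {}      # address -> received messages

    def compose(self, sender, to_name, subject, note, attachments=()):
        # Composition: a header plus note parts and attachments,
        # resolving the recipient through the address book
        header = {"from": sender, "to": self.address_book[to_name],
                  "subject": subject}
        return {"header": header, "note": note,
                "attachments": list(attachments)}

    def submit(self, message):
        # Submission: the message is queued, not delivered immediately
        self.outbox.append(message)

    def deliver_all(self):
        # Store-and-forward: the transport later moves queued mail
        while self.outbox:
            msg = self.outbox.pop(0)
            self.mailboxes.setdefault(msg["header"]["to"], []).append(msg)

s = MailSession()
s.address_book["bob"] = "bob@example.com"
s.submit(s.compose("alice@example.com", "bob", "Demo", "See clip", ["clip.avi"]))
s.deliver_all()
```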

The developers of VIM targeted four areas in which VIM could fit into the business process: mail

enabling existing applications, creating alert utilities, creating scheduling applications, and helping workflow

applications. The benefits of implementing applications in each of these four areas vary significantly.

MAPI

The goal of MAPI is to provide a messaging architecture rather than just a messaging API in

windows. MAPI provides a layer of functionality between applications and underlying messaging systems.

Goals

Separate client applications from the underlying messaging services

Make basic mail-enabling a standard feature for all applications

Support messaging-reliant workgroup applications
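MAPI's layering can be sketched as an abstraction that routes a single client-side send call to whichever underlying messaging system is registered, which is how client applications are separated from the messaging services beneath them. The names below are illustrative, not the real MAPI interfaces.

```python
class MessagingProvider:
    """Base interface for an underlying messaging system (transport)."""
    def send(self, message):
        raise NotImplementedError

class SmtpLikeProvider(MessagingProvider):
    """One concrete transport; others could be swapped in unchanged."""
    def __init__(self):
        self.sent = []
    def send(self, message):
        self.sent.append(message)

class MessagingLayer:
    """Sketch of a MAPI-style layer between applications and
    messaging systems: clients call one API, the layer routes
    the call to a registered provider."""
    def __init__(self):
        self._providers = {}
    def register(self, name, provider):
        self._providers[name] = provider
    def send(self, provider_name, message):
        self._providers[provider_name].send(message)

layer = MessagingLayer()
backend = SmtpLikeProvider()
layer.register("default", backend)
layer.send("default", {"to": "bob", "body": "status report"})
```

Because the client only ever talks to the layer, replacing the transport does not touch application code, which is the point of a messaging architecture rather than a bare API.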

Telephony API

The TAPI standard has been defined by Microsoft and Intel, and has been upgraded through

successive releases to stay abreast of ongoing technology changes.

X.400 message handling service

The MHS describes a functional model that provides end users the ability to send and receive

electronic messages. A user agent is an entity that provides the end user function for composing and

sending messages as well as for delivering messages. Most user agent implementations also provide local

mail management functions such as storage of mail, sorting mail in folders, purging and forwarding. When a

user composes a message and sends it, the UA communicates the message to an MTA. If there is no local

MTA, the message is forwarded to the MTA in a submission envelope based on one of a set of message

protocol data units (MPDUs) defined in the submission and delivery protocol. The delivery protocol is

designed to use a remote operations service and, optionally, a reliable transfer service to submit messages to

the MTA.

A collection of MTAs and UAs constitutes a management domain. Administrative management

domains are public services such as AT&T, Sprint and so on.
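The UA-to-MTA flow described above can be sketched as follows; the envelope dictionary here is a drastically simplified stand-in for an X.400 MPDU, and all names are illustrative.

```python
class MTA:
    """Message transfer agent: accepts submission envelopes and relays
    them toward the destination user agent's domain."""
    def __init__(self, domain):
        self.domain = domain
        self.accepted = []
    def accept(self, envelope):
        self.accepted.append(envelope)

class UserAgent:
    """User agent: composes a message and wraps it in a submission
    envelope before handing it to an MTA."""
    def __init__(self, address, mta: MTA):
        self.address = address
        self.mta = mta
    def send(self, to, content):
        envelope = {"originator": self.address,
                    "recipient": to,
                    "content": content}   # simplified stand-in for an MPDU
        self.mta.accept(envelope)

mta = MTA("admd.example")
ua = UserAgent("alice@admd.example", mta)
ua.send("bob@prmd.example", "budget figures")
```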

Distributed multimedia systems:

A multimedia system consists of a number of components, distributed across different locations, each dedicated to a specific function.

Components:

Application software

Container object store

Image and still video store

Audio and video component store

Object directory service agent

Component service agent

User interface service agent

Networks

The application software is the multimedia application that creates, edits or renders multimedia

objects.

The container object store is used to store container objects in a network object server.

An image or still video store is a mass storage component for images and still video.

An audio/video component store is the storage resource used for storing audio and video objects.

An object directory service agent is responsible for assigning identification for all multimedia object

types managed by that agent.
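The directory agent's duties, assigning an identifier to every object it manages and recording where copies of each object live, can be sketched as follows; all names are hypothetical.

```python
import itertools

class ObjectDirectoryAgent:
    """Sketch of an object directory service agent: assigns an
    identifier to every multimedia object it manages and records
    which servers hold a copy."""
    def __init__(self):
        self._next = itertools.count(1)
        self._locations = {}   # object id -> set of server names

    def register(self, server: str) -> int:
        # Assign a fresh identifier to a new object stored on `server`
        oid = next(self._next)
        self._locations[oid] = {server}
        return oid

    def replicate(self, oid: int, server: str) -> None:
        # Record that another server now holds a copy of the object
        self._locations[oid].add(server)

    def locate(self, oid: int) -> list:
        # Return every known location of the object
        return sorted(self._locations[oid])

directory = ObjectDirectoryAgent()
oid = directory.register("video-server-1")
directory.replicate(oid, "video-server-2")
```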

A component service agent is responsible for locating each embedded or linked component object

of a multimedia container, and managing proper sequencing for rendering of the multimedia objects.

A user interface service agent is responsible for managing the display windows on a user

workstation, interacting with the user, sizing the display windows, and scaling the decompressed object to

the selected window size.

The network as used in this context refers to the corporate wide network consisting of all LAN and

WAN interfaces required for supporting a particular application for a specific group of users.

Distributed client-server operation:

The client-server architecture has been used for some time for relational databases such as Sybase and Oracle. Most client-server systems were designed to connect a client across a network to a

server that provided database functions. The clients in this case were custom-designed for the server.

Client in distributed workgroup computing

The client systems interact with the data servers in any of the following ways:

1. Request specific textual data

2. Request specific multimedia objects embedded

3. Require activation of rendering server application to display

4. Create and store multimedia objects on servers

5. Request directory information on locations of objects on servers.

Servers in distributed workgroup computing

1. Provide storage for a variety of object classes

2. Transfer objects on demand to clients

3. Provide hierarchical storage for moving unused objects to near-line media

4. System administration functions for backing up stored data

5. Direct high-speed LAN and WAN server-to-server transport for copying multimedia objects.

Database operations

Search

Browse

Retrieve

Create and store

Update

Middleware in distributed workgroup computing

1. Provide the user with a local index, an object directory, for objects with which a client is

concerned

2. Provide automatic object directory services for locating available copies of objects

3. Provide protocol and data format conversions between the client requests and the stored

formats in the server

4. Provide unique identification throughout the enterprise-wide network for every object through time.
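One simple way to meet the uniqueness requirement in item 4 is to combine a per-server prefix with a counter that never repeats: identifiers then stay unique across the enterprise network and through time. This is an illustrative scheme, not one prescribed by the text.

```python
import itertools

class EnterpriseIdService:
    """Sketch of middleware issuing enterprise-wide unique object
    identifiers: each server gets a distinct prefix, and its serial
    counter never reuses a value, even after objects are purged."""
    def __init__(self, server_id: str):
        self.server_id = server_id
        self._counter = itertools.count(1)

    def new_object_id(self) -> str:
        # server prefix + monotonically increasing serial number
        return f"{self.server_id}-{next(self._counter)}"

ids_a = EnterpriseIdService("srvA")
ids_b = EnterpriseIdService("srvB")
issued = {ids_a.new_object_id(), ids_a.new_object_id(), ids_b.new_object_id()}
```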

Multimedia object servers:

The resources where information objects are stored so that they remain sharable across the

network are called servers.

Types of multimedia servers

Data processing servers

Document database servers

Document imaging and still video servers

Audio and voice mail servers

Full-motion video servers.

Network topologies for multimedia object servers

Centralized multimedia server

Dedicated multimedia servers

Distributed multimedia servers

Multimedia network topologies

Traditional LAN

Extended LANs

High-speed LANs

WANs

Distributed multimedia database

A multimedia database consists of a number of different types of multimedia objects.

Database organization for multimedia applications

Data independence

Common distributed database architecture

Multiple data servers

Transaction management for multimedia systems

Managing hypermedia records as objects

Multimedia objects need not always be embedded in the database record or a hypermedia

document; instead, a reference can be embedded, and the multimedia object can reside separately in its

own database, potentially optimized for that type of multimedia object.

Managing distributed objects

The issues are

How objects are located and, once located, how retrieval is managed in a multi-user environment; and how replication, archival, load balancing, and purging are handled.

The above issues are addressed with the following concepts

Interserver communications

Object server architecture

Object identification

Object revision management

Optimizing network location of objects

Object directory services