8/14/2019 Thesis Msc Comp Graphics
Real-time Display of 3D Graphics
for
Handheld Mobile Devices
A Thesis
Submitted to the Office of Graduate Studies,
University of Dublin, Trinity College Dublin,
In Candidacy for the
Degree of Master of Science
By
Alan Cummins
November 2003
Table of Contents

Table of Figures
Declaration
Acknowledgements
Abstract
1. Introduction
   1.1. Motivation
   1.2. Objectives
   1.3. Scope
   1.4. Summary of Chapters
2. Hardware Aspects
   2.1. Mobile Hardware
   2.2. Mobile Software Tools
   2.3. Communication Protocols
   2.4. Choice of Hardware
3. Related Work
   3.1. Level of Detail
   3.2. Culling
   3.3. Impostors
   3.4. Point-Based Rendering
   3.5. Scene Management
   3.6. Common Themes
4. Implementation
   4.1. Level of Detail
   4.2. Occlusion Culling
   4.3. Street Impostor
   4.4. Splatting
   4.5. Overall Framework
5. Experimentation Results
   5.1. Single Model Basic Rendering
   5.2. Multiple Models Basic Rendering
   5.3. Distribution Timings
   5.4. Pre-process Timings
   5.5. Distribution Costs
   5.6. Rendering with Culling
   5.7. Rendering with Impostor Techniques
   5.8. Rendering with Point-Based Techniques
6. Conclusions
   6.1. Future Work
      6.1.1. Non-Photorealistic Rendering
      6.1.2. Perceptual Feedback
      6.1.3. Artistic Principles
      6.1.4. Virtual Humans
      6.1.5. Salient Feature Extraction
Bibliography
Web Bibliography
A. Appendix - Screenshots - Level of Detail
   Bunny
   Cow
   Dragon
   Cup
   Snake
   Lego Piece
   Geometric Level-of-Detail Samples
B. Appendix - Mobile Hardware Specification
C. Appendix - Screenshots - Architectural Application
Table of Figures

Figure 1-1 - 3G Revenue Growth. Source: Telecompetition Inc., February 2001
Figure 1-2 - Virtual Dublin as modeled by [HAMILL03]
Figure 2-1 - Sample Mobile Device Hardware Configuration
Figure 2-2 - Rendering Libraries Sample Screenshots
Figure 3-1 - Edge Contraction and Gap Removal
Figure 3-2 - Culling Schemes
Figure 3-3 - Various Scenes Requiring Different Culling Schemes
Figure 3-4 - Individual Versus Fused Occluders
Figure 3-5 - Warped Impostors for Construction of 2.5D Object
Figure 3-6 - Mixed Geometry and Impostors with Morphing
Figure 3-7 - Point-Based Rendering Sample
Figure 4-1 - Vertex Removal Cost Estimation
Figure 4-2 - Level of Detail Based on Draw Distance
Figure 4-3 - Occlusion Culling Scheme
Figure 4-4 - Occlusion Culling Overlap Testing
Figure 4-5 - Occlusion Scheme Framework
Figure 4-6 - Impostor Placement at Street Nodes
Figure 4-7 - Street Scene at Ground Level
Figure 4-8 - Street Impostor Draw Distance
Figure 4-9 - Three-Way Impostor at Street Joint
Figure 4-10 - Texture Caching Framework
Figure 4-11 - Bounding Sphere Hierarchy
Figure 4-12 - Initial Bounding Sphere Estimation
Figure 4-13 - Framework for Pre-processing, Communication and Rendering
Figure 5-1 - Model Vertex and Face Count
Figure 5-2 - Num. of Materials vs. Frames Per Second
Figure 5-3 - Low Texture vs. Bounding Box Rendering
Figure 5-4 - Num. of Materials vs. Num. of Models
Figure 5-5 - Building Loading Times
Figure 5-6 - FPS with All Models Loaded
Figure 5-7 - Section Loading Times
Figure 5-8 - FPS with Models Loaded Per Section
Figure 5-9 - Octree Creation Time vs. Size
Figure 5-10 - Splat Creation Time vs. Size
Equation 5-11 - Formula for Calculation of Number of Impostors Per Street
Figure 5-12 - OBJ Versus Splat Physical File Size
Figure 5-13 - Splat Size Versus Level
Figure 5-14 - Culling Techniques Versus No. Faces
Figure 5-15 - Culling Techniques Versus No. Textures and FPS
Figure 5-16 - Rendering Time with Impostors
Figure 5-17 - Cow Model FPS vs. Splat Level
Figure 5-18 - Cow Model FPS vs. Octree Level
Figure A-1 - Bunny Model Level of Detail
Figure A-2 - Cow Model Level of Detail
Figure A-3 - Dragon Model Level of Detail
Figure A-4 - Cup Model Level of Detail
Figure A-5 - Snake Model Level of Detail
Figure A-6 - Lego Model Level of Detail
Figure A-7 - Geometric Level of Detail
Figure C-1 - Dublin Buildings Rendered on PC
Figure C-2 - Dublin Buildings Rendered on iPaq
Figure C-3 - Street Impostor Placement
Figure C-4 - Sample Street Impostors
Figure C-5 - City Scene Subdivision
Declaration
This thesis has not been submitted as an exercise for a degree at any other university. Except
where otherwise stated, the work described herein has been carried out by the author alone.
This thesis may be borrowed or copied upon request with the permission of the Librarian,
Trinity College, University of Dublin. The copyright belongs jointly to the University of
Dublin and Alan Cummins.
_______________________
Signature of Author
Acknowledgements
I would like to take this opportunity to thank my supervisor Dr. Carol O'Sullivan for her help
and guidance throughout the project and to Clodagh for enduring me as a partner in crime. I
am deeply indebted to Thanh and Ronan who helped to cajole this thesis into some semblance
of sense and Chris for his insight into research. I must also mention the ISG gang who always
gave helpful if not always constructive criticism and feedback, especially John who allowed
the use of his many Dublin models. Finally, thanks to my family, who, although confused by
my work hours, put up with my hermit-like existence.
Abstract
This thesis is concerned with the investigation and implementation of real-time display of
computer graphics on mobile devices. More specifically, this thesis uses the display of a
virtual city and character models as a test bed application upon which to investigate whether
traditional speedup techniques may be used effectively on mobile platforms. Rationale is
presented for the choice of mobile hardware and software. Culling, level-of-detail, impostor
and point-based techniques are then implemented and applied to the test application on the
chosen hardware. These techniques focus on minimising the amount of rendering time
required per frame. An overall framework is suggested which incorporates these methods
while ensuring that mobile-specific constraints such as distribution and memory are catered
for. Experimentation results are then presented, showing that a system capable of real-time
rendering is feasible on such devices.
1. Introduction
1.1. Motivation
Presently, the mobile communications market is one of the fastest growing markets (see
Figure 1-1), despite the large discrepancies that exist between the services offered to mobile
terminal users and those available to users of fixed terminals. The Universal Mobile
Telecommunications System organisation (UMTS) has expended a lot of effort to provide a
standard approach that will facilitate the communication of rich interactive multimedia
content to terminal users and has concluded that the demand for 3G mobile data services is
real. Consumers and business users have consistently demonstrated strong interest in trying
new services that combine mobility with content and personalisation [UMTS01-1].
Figure 1-1 - 3G Revenue Growth. Source: Telecompetition Inc., February 2001
The success of NTT's DoCoMo in Japan provides clear evidence that end-users want
and need a means of communication, entertainment and access to
information via mobile devices. [VARSH00] offers further proof that wireless technologies
are emerging as an important and prevalent technology. Development of mobile devices such
as personal digital assistants (PDA) allows those on the move to perform a variety of
computing tasks. Customised infotainment is estimated to grow to be the second biggest
revenue producer for service providers [UMTS01-2]. Therefore the development of tools and
techniques to facilitate the production and delivery of such content is pertinent. Giving the
user the ability to interact in a virtual world by means of a mobile platform creates a real
challenge.
More specifically, urban simulation presents unique challenges due to its large geometric
complexity and widely varying visibility conditions (see Figure 1-2 below). Real-time
rendering of cityscapes has been a research goal for a number
of years.
Figure 1-2 - Virtual Dublin as modeled by [HAMILL03]
Many solutions have been proposed for the efficient rendering of large-scale
environments, [HAMILL03, SILLION97, FUNK93] to name but a few. However, until
recently it has been infeasible to consider applying and implementing these techniques in the
domain of mobile devices. It is inevitable that the computational power of today's mobile
devices will increase in order to facilitate the processing of the received content but the
capabilities will always be limited. Bridging the gap between the limitations of the device and
the processing demands of the application creates some interesting challenges. Generation of
content for real-time delivery to mobile devices provides unique opportunities for novel
simplification techniques. Proposed solutions may also be applied to similar domain areas
such as the web and digital television.
1.2. Objectives
This thesis is concerned with the development of a rendering system for mobile devices.
The problem can be more formally stated as the development of an adaptive framework for
supporting creation and interaction within a synthetic environment with specific regard given
to the ability of users to interact remotely by means of mobile devices with real and virtual
characters.
In order to clarify this problem statement several definitions must be given. A mobile
device is described as one that can be carried by a person and has its own battery to allow
operation during transit. Examples of mobile devices include but are not restricted to: Mobile
phones, pocket PCs, laptops, smart phones, tablet PCs and personal communication devices
such as pagers. Real-time display refers to the interactive display of information as soon as
the user requests it. Instructions should be perceived to have an immediate effect on the
environment once issued. A synthetic environment is defined as graphical, contextual or
sensual information that has been built by reconstructing real-world objects into 2.5D or 3D
representations, by interpreting scene information and reclassifying it or by associating visual
cues with actions. Virtual characters that users interact with are then deemed to be graphical,
contextual or sensual representations that give the user a sense of real interaction with others
in the confines of the given synthetic world. Given these requirements a framework
architecture must be developed. We consider an adaptive system to be one that has the ability
to change and re-assess its operation based on the environment it is functioning in. It must be
dynamic, proactive and reactive to the changing requirements placed on it. These
requirements vary from network latency to memory management. The system must be able to
function with hardware of varying capability, restructuring itself to always give the end-user
the highest possible visual and informational fidelity. The framework should list a set of
assumptions, concepts, values, and practices that constitute a way of viewing reality on a
mobile device. It should contain a structure for supporting and encapsulating the rendering
and distribution of a virtual environment from the end-user and from the developer of such a
system. Interactivity in the system is defined by the ability to allow the end-user to interact
with the system by means of simple navigation and control. The system should also allow the
user to interact with other remote users or with virtual avatars present in the scene.
Given this problem statement the main goal of this thesis is to create a viable tool that can
satisfy the requirements of adaptability, inter-connectability and real-time rendering for
mobile devices. However, mobile computing imposes inherent limitations, which means that
novel techniques must be employed to allow real-time rendering. These restrictions include
but are not limited to screen resolution, storage capacity, processing power and distribution
artefacts. Visualisation rather than distribution is emphasised with display of characters and
virtual environments being the end goal. In development of such a system it is desirable to be
able to tailor the application to any device capability. Implementation of a solid framework is
required to ensure that an appropriate rendering flow can be chosen dependent on the
underlying constraints imposed by a given mobile device and communication channel. Based
on these factors, work in the areas of scene and resource management, occlusion culling, and
geometric and impostor-based simplification is investigated. A framework is then proposed
that allows for interactivity in a dynamic and adaptive virtual environment.
1.3. Scope
As previously discussed, the implementation of a framework for the development and
rendering of virtual characters and locations is required. The scope of the project is narrowed
to deal with problems associated with architectural walk-through and high detail object
rendering, examples of which are character models. Emphasis is placed on the use of level-
of-detail and impostors in combination with resource management to form the backbone of
the framework. Regard is given to the distributive nature of the project but the actual
communication methods and their implementation are left as separate work. More
specifically, the system should provide tools for creating content as well as a viewer
for that content on a given mobile device. In order to achieve the objectives, the scope is
limited to development on a single hardware platform with communication issues handled by
a single network protocol.
1.4. Summary of Chapters
The remaining chapters are organised as follows:
Chapter Two: Hardware
Chapter Two presents an overview of the various mobile technologies that are currently
available and the rationale by which the final hardware was chosen. The mobile technologies
discussed are split into three areas, namely: hardware devices, software (operating system and
development tools) and network communication protocols.
Chapter Three: Related Work
Related Work discusses previous work in the areas of level of detail, culling, impostors and
point-based rendering. Each area of research is related to the specific application area of
mobile graphics, and the problems and benefits of each are highlighted. Previous framework
applications and distribution schemes are also detailed with regard to scene management for a
city walkthrough system.
Chapter Four: Implementation
Chapter Four provides an explanation of the techniques chosen for implementation. These
techniques, as previously discussed in Related Work, are applied to the specific domain area
of mobile computing. An overall framework architecture is then presented which incorporates
all the techniques implemented.
Chapter Five: Experimentation Results
Experimentation Results presents measurable results that have been obtained from the
system. Each component of the framework is evaluated as a separate process and then as part
of the larger overall system.
Chapter Six: Conclusions
Chapter Six summarises the topics discussed in the thesis and describes future research
directions.
2. Hardware Aspects
This chapter gives an overview of the mobile devices currently on the market that cater
for real-time graphics development. The defining requirements of the application are the
ability to render 3D graphics in real-time and to allow the end-user to interact with avatars in
a virtual environment. Therefore it is vital to consider all factors that affect these goals
including: Hardware capabilities, availability of software and communication protocols. The
market leaders in mobile devices, HP iPaq, Nintendo GBA, Nokia and Palm [5, 11, 12, 16],
were considered and assessed based on the factors discussed below. The rationale for the final
choice of hardware and software is provided, giving credence to the device's suitability in the
short and long term. Various network technologies are also considered and an appropriate
communication protocol is chosen.
2.1. Mobile Hardware
Accurate performance ratings are difficult to ascertain for mobile devices due to the many
different technologies vying for market dominance. Without a standard set of applications by
which to measure true computing power, fair benchmarking between mobile devices is tricky.
Therefore pure computing metrics cannot be used to determine the most suitable device. Less
definable aspects such as usability must also be examined in order to ascertain which
hardware device will fit the specific application area of virtual environment interaction.
Figure 2-1 - Sample Mobile Device Hardware Configuration
Appendix B provides a tabulated set of hardware specifications for current mobile
devices. It is evident that mobile hardware has not been standardised and many important
features are implemented in vastly differing ways. Several important factors must be
considered when choosing appropriate hardware. Factors considered were:
Portability: The device must fall within the confines of battery-operated devices to
maintain portability. This constraint reduces the feasibility of onboard processors
over and above the main processor board due to power consumption. Traditional
graphics processors continue to drain prohibitive amounts of power, and as such many
truly mobile devices in the current marketplace rely on the main processor to handle
all computation rather than use a wholly separate processing unit. All of the main
mobile hardware vendors provide solutions that are sufficiently portable with battery
length and physical size differing only slightly.
Feasibility and Future Proofing: Choice of hardware should depend on the device's
current success and any future technology that may be incorporated into it. The
device should have sample applications that demonstrate that it is capable of
rendering graphics at a reasonable speed. Devices based on Palm Pilot and PocketPC
solutions, while popular, remain the choice of a select few, whereas possession of
traditional mobile handsets has exploded. However, mobile phones have little room
for future proofing and have already begun to accrete functions from devices similar
to the ones previously mentioned.
Performance: Base-line performance is an important factor when determining which
hardware to choose. The hardware must be capable of basic computation before
considering more complex applications. Processing and memory speed is an
important issue in determining if a device is capable of rendering graphics in real-
time. Devices that rely solely on external memory cards for storage will cause slow
down when transferring information and as such may be unsuitable. Mobile phones
continue to increase in performance but at present are unable to cope with true 3D at
real-time rates. PDA based devices have more powerful processors and come
equipped with standard memory sizes that allow for useful development.
Screen-size: Screen-size varies from device to device, but it is assumed that a usable
display should at least exceed a typical mobile phone screen and as
such is 240 x 320 pixels in size. Colour screens are a necessity, as monochromatic
screens will not differentiate items on screen as clearly and will cause the system to
be unusable. All vendors provide colour screens with mobile phones typically having
4096 colours and larger form factor devices having in the order of sixteen thousand.
User Interface: It is necessary to consider the usability of a given device. A typical
device should ideally have an intuitive interface that allows for effective user
interaction. Most mobile device manufacturers have provided a touch screen and
stylus, miniature keyboard or several buttons as physical input to the system. The
Open Mobile Alliance (OMA) [15] is supporting standardisation of user interfaces.
See Figure 2-1 for sample hardware configurations.
Expandability: As the mobile hardware market is still fluctuating, it is imperative to
choose a device capable of using any new technologies that may appear. As most
mobile hardware devices are limited in physical storage, it is essential to
consider only those devices that allow easy access to external memory modules
while not depending on them. There are many varying communication protocols that
require differing hardware to function. Therefore it is relevant to consider whether a
given device can support multiple types of communication hardware. Mobile phones have
limited or no expansion capabilities while PDAs allow for memory and
communication upgrades.
Considering each of these factors, the chosen device should have a stable hardware base
and a large colour screen, be versatile and more portable than a traditional laptop, offer
computing performance at the top of its category, and provide the means to extend its
functionality.
2.2. Mobile Software Tools
Software tools for mobile devices are in a fledgling state, especially in terms of the
availability of development libraries. The actual hardware chosen depends not only
on raw computing performance but also on the ease of development for that hardware.
Adjunct to this, current software availability gives an insight into the feasibility of
implementation of a virtual environment on a portable device. This section introduces some of
the programming languages, operating systems and third party libraries that are available for
mobile devices.
Programming languages typically used for mobile devices can be split into three general
sections, namely:
Mark-up Language: Mark-up languages used for mobile phones are typically based
on a subset of HTML 4.0 and chosen for development of simple text and graphic
based applications. Use of mark-up language for graphics applications is not feasible
in its current form, but should not be ignored as a means of developing a syntax
language for scene description and associated attributes. See the VRML
specifications [21] for more details. Use of XML (extensible mark-up language) may
prove useful as a basis for a grammar for use in a mobile graphics system.
Java: Java is the standard technology for development on mobile handsets. It is also
the only technology that is compatible across all hardware platforms through use of
third-party Java virtual machines. However, it lacks (in its current form) a solid 3D
graphics application-programming interface. As standards such as the Java Micro
Edition [7] and Binary Runtime Environment for Wireless (BREW) [1] progress,
further functionality will be added.
C++ / C: C and C++, traditionally combined with straight assembler for fast graphics
implementation, are predominantly used on the larger dual-purpose devices (voice and office
utilities) such as Palm Pilot and PocketPC based devices. They offer the best access
to the low-level hardware attributes that are required for the development of fast graphics
algorithms. See Microsoft's Embedded Visual Studio developer resources [8] for
complete details.
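The scene-description use of XML suggested above can be illustrated with a minimal sketch. The element and attribute names below (`scene`, `object`, `lod-levels`, and so on) are hypothetical choices made for illustration, not taken from VRML or any existing standard:

```python
import xml.etree.ElementTree as ET

# A hypothetical XML scene-description grammar; element and attribute
# names are illustrative only.
SCENE_XML = """
<scene name="demo">
  <object id="house" mesh="house.obj" lod-levels="3">
    <position x="0.0" y="0.0" z="-10.0"/>
  </object>
  <object id="tree" mesh="tree.obj" lod-levels="2">
    <position x="4.0" y="0.0" z="-12.0"/>
  </object>
</scene>
"""

def parse_scene(xml_text):
    """Parse a scene description into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        pos = obj.find("position")
        objects.append({
            "id": obj.get("id"),
            "mesh": obj.get("mesh"),
            "lod_levels": int(obj.get("lod-levels")),
            "position": tuple(float(pos.get(axis)) for axis in ("x", "y", "z")),
        })
    return objects
```

A client could lazily instantiate geometry from such a description as it streams in, which is the role a mark-up grammar would play in a mobile graphics system.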
The mobile operating system controls access to the underlying hardware. It is imperative
that the operating system facilitates rather than hinders development of fast graphics routines.
The main operating systems for mobile devices are:
PocketPC 2002: PocketPC 2002 [9] is based on the previous WindowsCE 3.0
operating system, improving on connectivity, security, personal information
management (PIM) features and overall reliability. Although more complicated to
develop for than PalmOS based systems, it offers a more direct interface to the
device's hardware. Importantly, PocketPC 2002 has been slimmed down and is
intended for use in Smart Phones, allowing for some future proofing.
Symbian: SymbianOS [19] was designed and developed for small, mass-market mobile
devices with occasional wireless connectivity, and as an open platform for third-
party developers. Major mobile vendors such as Nokia have chosen Symbian as the
operating system of choice for their new hardware. SymbianOS differs from
PocketPC 2002 mainly in that it implements most of its multi-tasking
through event-driven messaging rather than with multi-threading as on PocketPC
devices. Multi-threading is complex, prone to errors and expensive (there can be
several kilobytes of overhead per thread). SymbianOS is therefore considered more
efficient, if less powerful.
Palm: PalmOS [16] based systems are the most common mobile platform in use
today, and consequently have a large software base. As PalmOS focuses on simplicity,
expandability and mobility, it places emphasis on an operating system that
does not simply mimic a cut-down PC but rather allows the user to quickly gain
access to information and content. For all its simplicity, however, it lags behind
PocketPC based systems in some important areas such as wireless services, secure
transactions and graphics.
Figure 2-2 - Rendering Libraries Sample Screenshots
There are several graphics APIs available. Each offers varying levels of functionality, and a
comparative review of all engines is required to determine whether any are suitable as a starting point
for further development. The development libraries can be broken down into those that act as
adjunct libraries, such as EasyCE and GAPI [3,4], and those that act as standalone libraries,
such as Diesel Engine, MiniGL, PocketGL, TinyGL and X-forge [2,10,18,20,23] (see Figure
2-2). A standard graphics library specification based on the OpenGL standard [13] is being
drafted by the Khronos Group [14], but no standard is adhered to in the current batch of third-
party graphics libraries.
2.3. Communication Protocols
We shall now consider wireless communication protocols that may be suitable for use in
the proposed virtual environment for the communication of events between independent users. It
should be noted that there are two distinct wireless technologies available for use: wireless
LAN versus the traditional cellular network. Cellular networks remain the most popular and cost-
effective way to communicate, but new technologies such as 802.11b and Bluetooth can be
easily configured for use within the scope of this project.
Digital cellular radio networks such as Global Systems for Mobile Communication
(GSM), General Packet Radio Service (GPRS) and Code Division Multiple Access (CDMA)
can be accommodated through the addition of a cellular card to the mobile device. Non-
traditional wireless technologies such as 802.11b and Bluetooth typically come as standard on
most devices and are suitable for small-scale local area communication. Each has benefits in
terms of bandwidth, underlying network infrastructure and quality of service and the reader is
referred to resources on the Wireless Developer Network [22] for further information.
Additionally, in order to enhance the user experience, technologies such as the Global Positioning System (GPS) can also be incorporated.
It should be noted that the configuration of the underlying network infrastructure and the
implementation of the distributed component of the project have been left as separate work,
currently ongoing by [ROSSI03]. However, factors worth considering when choosing
a communication method include: scalability, kinds of media supported, integration of media
elements, delivery channels, quality potential for the target, distribution model, infrastructure
requirements and cost. While not focusing on the distributed aspects of the application, the
constraints of the above communication methods are adhered to in the development of the
system. Details of such are discussed in Chapter Four.
2.4. Choice of Hardware
Considering all the factors discussed above, the only viable choice is between
a high-end Palm Pilot and a device such as the Compaq iPaq. Solutions such as the GBA and
PlayStation Portable (PSP) [11,17] that focus on the graphics capabilities of the device are not
currently applicable due to the lack of support for wireless communication. The Palm Pilot, in
its current form, lacks the performance necessary for any valuable graphic work to be carried
out. PocketPC based systems, such as the iPaq, currently run on ARM or XScale [6] based
boards and in combination with the operating system provide the most powerful solution. The
latest version of the PocketPC client comes with increased memory and processing
capabilities suggesting that it is the platform of choice (See Appendix B, IPaq H550). Taking
this choice of operating system into account, the memory management and screen-handling
functionality provided by PocketPC based systems more closely matches that of normal PC
development. Furthermore, the majority of graphics-based application development for PDA
devices has been carried out on WindowsCE / PocketPC based systems.
This indicates that it is currently the most suitable platform and will remain so, with
Microsoft's Embedded Visual Tools providing a strong development platform for further work.
Future hardware such as the Smart Phone and new Intel XScale technologies will all use
PocketPC or a variant thereof, which guarantees some future proofing. As discussed in
[RHYNE02], handheld devices are slowly converging, but a common hardware and software
platform has yet to emerge.
In terms of communication protocols, cellular technologies may be useful for future
development but are currently infeasible due to the large amount of infrastructure required. Local area
wireless communication is the only alternative. Bluetooth is not suitable for wide area
networks (WANs) as it is designed purely for communication over short distances, so the 802.11b
communication protocol has been chosen. PocketGL [18] has been selected as the
graphics library on which to base further development. As a relatively cheap, high-
functionality library it offers the most visibility into its base code, and having been designed to
mimic the OpenGL standard it is the most suitable.
3. Related Work
This thesis seeks to provide solutions for fast rendering on mobile devices. In order to
achieve this goal, previous research in the areas of simplification and acceleration needs
investigation. More precisely, previous work in the areas of geometric level-of-detail, culling,
impostors and point-based rendering is discussed. These techniques are deemed the
most suitable for adaptation to a mobile paradigm. The discussion proceeds from basic
rendering techniques through to more complex algorithms that affect the underlying visual
structure of a given scene but allow for an increased frame-rate. Prior research into the
constraints imposed by network distribution, and how these are incorporated into workable
frameworks that make use of the techniques above, is also presented. Finally,
common best practices demonstrated by previous work are discussed.
3.1. Level of Detail
Surface simplification is increasingly important as the complexity of geometry increases.
New demands are also being placed on rendering systems to provide an increased level of
perception of visual information on lower powered devices than ever before. Through the
prudent application of simplification, geometry can be stored, retrieved and displayed more
efficiently. Simplified geometry may also be used as a proxy object in calculations in order to simplify their results. Examples include object intersection testing, lighting calculations and
visibility determination. More specifically, simplified objects may be used as a part of a level-
of-detail scheme to provide a set of geometry that describes an object at varying complexity.
The choice of when to display each level-of-detail then becomes a decision based on visual
fidelity and time constraints for a given frame. [HECK97] and [PUPPO97] review many of
the techniques currently in use while [GARLAND99-1] discusses future opportunities
available in the area of multi-resolution modelling. Mesh simplification methods can be
classified into those based on budgetary constraints, such as the time available to render, and
those based on fidelity. Fidelity-based methods are more complicated, as the decision becomes a
function of the application and of end-user considerations. [LUEB03] discusses these classifications
further.
Typically surface simplification is obtained by decimation of triangular meshes
[SCHROE92] (see Figure 3-1) or by re-tiling an object [TURK92]. Careful consideration is
given to determining which triangles will cause the least visible change while also giving the
greatest reduction in geometry size. [WATS01] describes techniques for measuring and
predicting visual fidelity with regard to geometric simplification and these findings can be
incorporated to ensure the least visible change. Techniques described by [RONF96,
MELAX98] function by determining simple error metrics that calculate the cost of removal as
a function of connected vertices and curvature of the surrounding area before and after
removal of a vertex or edge. [GARLAND97, HOPPE93] use more complicated heuristics that try to minimise visual distortion. Note that geometric distortion is not the only factor that
must be considered. Colour plays a large part in shape recognition and [CERT96] assigns
importance to both in order to effectively create multi-resolutional models. [KHO03] takes
this concept one step further and allows the end-user to affect the choice of areas of
simplification. Thus if a face is being modelled, users can specify to maintain complexity in
important features such as eyes.
Figure 3-1 - Edge Contraction and Gap Removal
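Metrics of the kind [RONF96, MELAX98] describe can be sketched as follows. This is an illustrative reconstruction, not code from the cited papers; the mesh representation (a vertex list plus triangles given as index triples) and the exact weighting are assumptions:

```python
import math

def normal(tri, verts):
    """Unit normal of a triangle given as three vertex indices."""
    a, b, c = (verts[i] for i in tri)
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(x * x for x in n)) or 1.0
    return [x / length for x in n]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def edge_collapse_cost(u, v, verts, tris):
    """Melax-style cost of collapsing vertex u onto vertex v: the edge
    length weighted by the curvature of the surface around u.  A flat
    neighbourhood costs nothing; a sharp crease costs proportionally more."""
    edge_len = math.dist(verts[u], verts[v])
    faces_u = [t for t in tris if u in t]
    faces_uv = [t for t in faces_u if v in t]   # faces along the edge u-v
    curvature = 0.0
    for f in faces_u:
        # deviation between f and the closest-matching face on the edge
        mincurv = min((1.0 - dot(normal(f, verts), normal(g, verts))) / 2.0
                      for g in faces_uv)
        curvature = max(curvature, mincurv)
    return edge_len * curvature
```

Collapsing the cheapest edge first removes small, flat detail before sharp, visually important features, which is precisely the behaviour the metrics above aim for.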
Methods for determining the level-of-detail representation can use the inherent scene features
as part of their heuristic for determining good candidates for removal or simplification.
[REMOL02] deals with vegetation simplification and uses the trunk and leaves as separate
areas requiring simplification. Characteristics associated with a given technique can also be
tailored and combined to produce optimal effects. [GARLAND02] describes a multiphase
approach that combines the efficiency of clustering methods with contraction methods that
typically produce higher quality images. Previous work presented by [ZUNI03] indicates that
multi-resolution techniques have value in terms of rendering on mobile devices.
3.2. Culling
A fundamental problem in computer graphics is that of avoiding extraneous work when
rendering a given frame. As the size of a dataset increases, it becomes ever more imperative
to do the minimum amount of work needed to achieve an acceptable frame-rate while
maintaining a consistent world-view on screen. So as to avoid wasting computation time on
invisible portions of a scene, culling techniques have been developed that discard polygons
that do not contribute to a given viewpoint. Traditional Z-buffer removal is prohibitively
expensive for large datasets, and even more so when no hardware graphics acceleration can be
relied upon, as is the case on mobile devices.
Culling can be broken down into three areas (see Figure 3-2): view-frustum culling refers
to the removal of objects that lie outside the extents of the viewable area on screen; back-face
culling refers to the removal of any part of an object that faces away from the camera; and
occlusion culling refers to the determination and removal of objects that overlap and hide
polygons from the current view direction.
Figure 3-2 - Culling Schemes
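The first two of these stages reduce to simple geometric tests. The sketch below is illustrative only; it assumes bounding spheres for view-frustum culling and frustum planes given in normal-plus-offset form:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sphere_in_frustum(center, radius, planes):
    """View-frustum cull: a bounding sphere survives unless it lies
    entirely on the negative side of some frustum plane.  Each plane is
    (normal, d), with a point x inside when dot(normal, x) + d >= 0."""
    for n, d in planes:
        if dot(n, center) + d < -radius:
            return False
    return True

def is_back_facing(tri_normal, tri_point, eye):
    """Back-face cull: a triangle faces away from the camera when its
    normal points along the view vector from the eye to the triangle."""
    view = [p - e for p, e in zip(tri_point, eye)]
    return dot(tri_normal, view) >= 0.0
```

Occlusion culling, the third stage, has no comparably simple per-primitive test, which is why it attracts the varied approaches surveyed below.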
Typically culling is carried out by firstly culling items from the view frustum, then
determining those items that are fully and partially occluded by overlapping and then finally
removing back-facing polygons from the remaining set. It is in the occlusion stage that
much varied work has been carried out. Determination of the best method of occlusion culling
depends on the type of scene that is to be rendered (see Figure 3-3). Scenes such as an
architectural walk-through are densely occluded and the camera can be assumed to be close to
the ground whereas in an indoor scene the occlusion can be greatly simplified by only
considering subsets of the actual screen space in the occlusion calculation. Scenes may also
contain non-trivial items, such as an object that contains high detail hidden within a
compartment. The complexity of culling that scene then depends on
whether the innards of the object are viewable or not.
Figure 3-3 - Various Scenes Requiring Different Culling Schemes
A comprehensive survey of visibility determination technologies is presented in
[COHEN00]. Using Cohen's taxonomy, culling techniques can be split into those that
perform computation based on the current point of view and those that use a priori knowledge
valid for a certain region of space. The criteria for choosing a suitable culling algorithm need
careful consideration when implementing on a low-power device. Culling can be further
subdivided into techniques that provide tight-fitting, conservative potentially visible sets of objects
(PVS) and those that use less accurate but more efficient approximate PVS. Algorithms that
account for fused occluders may also be necessary (see Figure 3-4).
Figure 3-4 - Individual Versus Fused Occluders
In addition to these criteria, the benefit of using techniques with large pre-computation stages
versus those with large run-time computation costs must be weighed up. High pre-processing
costs may remove most of the computation cost at run-time, but will incur penalties in terms
of storage and transmission costs. It is necessary to choose the technique that will allow the
cost of rendering the scene to be independent of the overall environment complexity.
[COHEN98] splits the viewspace into cells that contain the set of triangles viewable
from a point within the cell. In order to build this set, triangles that are guaranteed to be
hidden are searched for and strong occluders are found. Occluders and occludees can be
individual triangles, objects, bounding boxes or the cells themselves. Similarly [SAO99,
SCHAU00] use octrees and [COORG96] uses kd-trees to split the scene into manageable
sections from which to build PVS. [MEISS99] discusses several novel algorithms for efficient
scene subdivision for use in occlusion culling. Ultimately the use of scene subdivision should ensure that large portions of a given scene can be culled without requiring investigation of
individual triangles, which may be numerous.
As previously stated, the type of scene to be rendered greatly affects what type of
occlusion culling will be most effective. [TELL91] uses the concept of cells and portals in
order to efficiently cull objects in indoor scenes. This technique uses the scene's characteristic
features to help speed up occlusion culling and obviously does not fit the characteristics of a
typical outdoor scene. Techniques such as [COORG97, SCHAU00] are much better suited to
outdoor scenes, and more specifically architectural scenes, where individual large occluders
and groups of occluders commonly exist in a given viewpoint.
3.3. Impostors
Another technique commonly used to simplify the rendering of a complex scene is that of
impostor-based rendering. Instead of rendering high triangle-count objects every frame, an
image-based primitive can be used to replace, yet sufficiently represent, the high-cost geometric
object or area in a scene, thus ensuring that the rendering time is independent of the geometric
scene complexity. Impostors cut the cost of rendering as they typically contain only one or
few textures and have simple geometry compared to the multiple textures and complex
geometry they replace. However, it should be noted that as [RUSH00] discusses,
determination of whether the replacement of geometry with textures is perceptually effective
is not easy and depends very much on the object or scene to be replaced. [CHEN95, MACI95,
MCMILL95-1, ALIAGA96, WIMM01] provide discussion of various frameworks for the
display of scenes using pre-rendered or dynamic impostors. While each varies in specific
implementation, each commonly relies upon the use of scene subdivision to help to efficiently
generate impostors for every viewpoint in the scene (for further example see [XIONG96,
AGAR02]). Once scene subdivision has occurred, determination of where and when to
display the generated impostor is considered. [MCMILL95-2, SILLION97, ALIAGA99-1,
WILSON01] give good examples of some of the issues involved. Finally, techniques used for improving the provided impostor are implemented; warping is one such example [GOTZ02].
However, a cost is incurred to create, store and retrieve these images. Related impostor-based
work can be split into the following areas:
Work that deals with the specific application area of city walkthroughs.
Impostor-based techniques that have been applied to the display of individual objects
as a series of pre-rendered images.
Display algorithms that incorporate depth into the impostor images for added realism
and longer cache life.
Image warping methods for seamless integration of impostors with surrounding
geometric objects and for transition from image to geometry.
Memory and data management issues that surround the use of images to represent
geometry.
Specialised compression routines for efficiently storing and retrieving impostors.
The following discussion considers related impostor-based work with regard to the above
classification.
Artefacts caused by the use of impostor techniques include resolution mismatch,
rubber sheet effects, cracks, missing scene information and inconsistent representation of the
actual geometric primitives. The use of depth-based techniques can alleviate such problems;
moreover, the addition of simple depth information can add an order of magnitude of image
quality to a scene. Essentially, depth or layered images can be considered as an ordinary texture map
enhanced with a depth-per-pixel value. These values can then be used to extrude more
complex, feature-rich geometry over and above simple quad-based impostors,
combating the shortcomings of image-based methods for scene representation. [SHADE98,
DECOR99, OLIV00, WILSON03] all make use of depth disparity information to allay the
difficulties discussed above. [CHANG02] makes use of depth images to effectively render
image based objects on mobile devices giving credence to their importance when rendering
on limited devices.
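The idea of a texture map carrying a depth value per pixel can be made concrete with a small sketch. The pinhole camera model, the `focal` parameter and the use of zero depth to mark empty pixels are all illustrative assumptions, not details from [SHADE98] or [CHANG02]:

```python
def unproject(px, py, depth, width, height, focal):
    """Lift one impostor pixel into camera space using its stored depth.
    Assumes a pinhole camera with the principal point at the image centre."""
    x = (px - width / 2.0) * depth / focal
    y = (py - height / 2.0) * depth / focal
    return (x, y, depth)

def extrude_depth_image(depth_map, focal):
    """Turn a depth-per-pixel impostor into a 3D point set that can be
    re-rendered from nearby viewpoints, avoiding the cracks and
    rubber-sheet artefacts of a flat textured quad."""
    height = len(depth_map)
    width = len(depth_map[0])
    points = []
    for py, row in enumerate(depth_map):
        for px, depth in enumerate(row):
            if depth > 0.0:            # zero depth marks an empty pixel
                points.append(unproject(px, py, depth, width, height, focal))
    return points
```

The recovered points can then be splatted or re-triangulated as the viewpoint shifts, which is the extrusion of feature-rich geometry described above.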
Image objects are based on the concept of determining a set of viewpoint images for a
given object as a pre-processing step and then, at runtime, shifting between these stored
viewpoints to give the appearance of a fully 3D object (see Figure 3-5).
Figure 3-5 - Warped Impostors For Construction of 2.5D Object
These image-object techniques rely on the use of warping techniques to bridge the gap
between subsequent viewpoints as only a discrete set of orientations are stored. [MAX95,
SCHAU97] use this method to create simple but effective image object primitives while
further work by [DALLY96, BAYAK98, OLIV99] extends the concept to incorporate
complex geometric models, maximising the image fidelity while minimising the associated
overhead. Most recently, [DECOR03] uses billboard clouds to automatically build up a single
texture map and associated transparency map that most efficiently and effectively represents a
complex object. These techniques all have benefit in terms of performance and the ability to
use simplified textures as a form of level-of-detail.
In order to avoid strong noticeable distortions in the rendered image when using
impostor-based methods, warping of geometry and / or images may be used. Through the use
of warping, distortions brought about by viewpoint translation and rotation can be
compensated for. As impostor images are only valid for a certain subset of viewpoints,
warping can be used to hide the transition from geometry to image [ALIAGA96] and from
image to image [SCHRO00] (see Figure 3-6).
Figure 3-6 - Mixed Geometry and Impostors with Morphing
Warping may also be used to create new approximate images based typically on two reference
images [MARK97, RAFF98, PAJAR02] and for temporary display while the system retrieves
further information, as is the case in a network-based application where latency can cause
unavoidable delay [MARK96]. Such techniques come at a cost in terms of an extra processing
step at runtime, which must be weighed against the benefit in terms of visual fidelity. Use of a
threshold model such as in [RAMA99] may be useful for finding the perceptual threshold for
the human visual system when detecting artefacts. This model could then be used to predict
visual fidelity when geometry is replaced with impostors.
As each image-based representation generates at least one new texture, compression
techniques are essential to minimise the amount of data necessary to efficiently render a
scene. [LI01] provides an examination of several more commonly used compression
techniques while [GUEN93, COHEN99] focus specifically on exploiting frame-to-frame
coherence while animating texture-intensive scenes. [WILSON00] provides further work incorporating MPEG-based compression schemes into an impostor-based system
to render a given scene in real-time. Ultimately the method employed for compression should
ensure that the amount of work required to compress and uncompress the stored data does not
exceed the cost of transmitting and rendering the related uncompressed impostor.
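This closing condition can be written down directly as a cost-model check. The function below is only an illustrative sketch; in practice the timing and size figures would come from profiling on the target device:

```python
def compression_worthwhile(raw_bytes, compressed_bytes, bandwidth_bps,
                           compress_s, decompress_s):
    """Decide whether compressing an impostor texture pays off: the total
    time to compress, transmit and decompress must beat transmitting the
    raw texture directly over the wireless link."""
    raw_time = raw_bytes * 8.0 / bandwidth_bps
    compressed_time = (compress_s + decompress_s
                       + compressed_bytes * 8.0 / bandwidth_bps)
    return compressed_time < raw_time
```

For example, on a link carrying roughly one megabit per second, a 4:1 compression ratio only pays for itself while the combined compression and decompression time stays below the transmission time it saves.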
When using impostor-based techniques, the underlying scene properties can be used
to speed up the retrieval and use of images generated as a pre-process or during a previous
frame. Work by [SHADE96] for example exploits path coherence to generate a cache of next
most likely viewable impostors. Similarly [SCHAU96, WIMM98] make use of scene
partitioning and separate image caches to ensure that each frame can be efficiently rendered.
3.4. Point-Based Rendering
Point-based rendering is concerned with the determination of the minimum set of point
samples that cover the surface area of a given object. Level-of-detail representations can be
obtained by retrieving multiple sets of point samples per object (see Figure 3-7).
Figure 3-7 - Point-based Rendering Sample
These point-samples can then be rendered using differing primitives depending on the output
quality required by the application. [GROSS96] provides in-depth discussion of the general
concepts involved in point-sampling. By using point-sampling, increased efficiency in
rendering times can be obtained. Any geometric object can be converted to and stored in such a
representation and further techniques such as smoothing, reduction, refinement and editing
can be carried out using the point-sample alone [LIND01]. Point-based rendering also has the
added benefit of suitability for transmission across variable bandwidth conditions, such as
those found in wireless technology. As [RUSIN01, WOOL02] suggest, efficient streaming of
object data can be carried out with an incremental transmission of point samples, starting with
the lowest level of detail (smallest point sample set, largest displayed splat size) and proceeding
through to the highest level of detail (largest point sample set but smallest on-screen splat size). The sampling
and stratification scheme employed to obtain the point sample set varies and can be tailored to
suit the application. [WAND02] uses a scheme that allows for easy interpolation between
successive key-frames used in the animation of an object based on previous work in
[WAND00, WAND01]. [ZACH02] however uses a point-based technique that is suitable for
the rendering of vegetation during a terrain flyover. Work by [RUSIN00] focuses on the
interactive display of high volume data such as that obtained by laser scanning and
digitisation of an object. [ZWICK01] is concerned with point sampling an object and
maintaining connectivity between the sample set and the textures associated with the original
model. More complicated techniques that incorporate Fourier analysis [PAUL01] and
wavelets [GUTH02] can be used to form highly compressed data structures that accurately convey the object to be represented as point samples.
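The incremental transmission order suggested by [RUSIN01, WOOL02] can be sketched as follows. Splitting levels by index stride and tying the splat size to the stride are illustrative simplifications of the cited stratification schemes:

```python
def progressive_stream(points, n_levels=3):
    """Split a point sample set into progressively finer batches for
    transmission over a variable-bandwidth link: the first batch is the
    sparsest sampling (drawn with the largest splats); each later batch
    fills in detail and is drawn with smaller splats."""
    batches = []
    sent = set()
    for level in range(n_levels):
        stride = 2 ** (n_levels - 1 - level)        # e.g. 4, then 2, then 1
        batch = [p for i, p in enumerate(points)
                 if i % stride == 0 and i not in sent]
        sent.update(i for i in range(len(points)) if i % stride == 0)
        splat_size = float(stride)                  # coarser level, bigger splat
        batches.append((splat_size, batch))
    return batches
```

A client can render each batch as soon as it arrives, so the object is visible immediately at low fidelity and sharpens as bandwidth allows.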
3.5. Scene Management
Management of scene information is required when dealing with large amounts of data in
a walkthrough application. This especially applies in the case where memory capacity is
limited, quality of service for client-server communication is not guaranteed and techniques
of varying benefit are being applied to this data. [FUNK92] discusses just such an application
where the system is split into pre-processing, modelling and walkthrough phases. A cache of
suitable representations is also maintained. This framework is extended in [FUNK93] to
perform a constrained optimisation to choose the best representation from the cache for the
current frame. Predictive techniques are later included [FUNK96] in order to maximise the
benefits of the cache and to ensure that its contents remain valid for as long as possible. Both
[MANN97] and [COHEN97] present interesting work that transmits only initial reference
scene information and then subsequently allows the client to render extrapolated views.
Difference images are then transmitted only when an error metric is exceeded. [LIJ01] deals
with a system that exclusively uses image-based representations to render the scene and stores
the surrounding set of nodes for a given viewpoint while [GOBB01] deals only with the
management of multi-resolution meshes. [ZACH01] considers the use of pre-fetching policies
based on optimising the cache with regard to culling. [ROSSI03] discusses work already
carried out on the system that divides the scene into sections that are transmitted to the client
based on the position of the avatar in use. Perceptual metrics can be employed to determine
which resolution of texture should be transmitted and displayed such as in [DUMON01].
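The caching behaviour running through [FUNK92, FUNK96, ZACH01] can be approximated with a small memory-bounded cache. The least-recently-used eviction policy below is an illustrative stand-in for the predictive and culling-aware policies those papers describe:

```python
from collections import OrderedDict

class RepresentationCache:
    """Memory-bounded cache of object representations (meshes, impostors,
    point sets).  Least-recently-used entries are evicted first, a simple
    substitute for the predictive policies discussed above."""

    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.used = 0
        self.entries = OrderedDict()   # key -> (size, representation)

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as recently used
        return self.entries[key][1]

    def put(self, key, size, representation):
        if key in self.entries:
            self.used -= self.entries.pop(key)[0]
        # evict least-recently-used entries until the new one fits
        while self.entries and self.used + size > self.budget:
            _, (old_size, _) = self.entries.popitem(last=False)
            self.used -= old_size
        if self.used + size <= self.budget:
            self.entries[key] = (size, representation)
            self.used += size
```

The walkthrough loop would `get` a representation each frame and fall back to requesting (or regenerating) it on a miss, keeping total memory within the device's budget.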
Discussion of previous work in the area of ad hoc location-based services [HODES99] and
of algorithmic considerations that need revision due to the nature of mobile devices
[SATYAN01] is left to one side. However, it must be noted that aspects such as mobile
fidelity metrics, latency [ARON97], bandwidth usage, prediction [CAI99] and state
consistency must be factored into the review of previous scene management techniques.
[MACED94] relates these issues to a real-world example.
3.6. Common Themes
Many common themes run through each of the related works above. It is evident that
different techniques work more efficiently for different types of objects and scenes. For
example, point-based rendering is most useful when displaying high-detail geometric objects
while impostor-based techniques are effective when displaying large distant architectural
planar models. Meanwhile, geometric level-of-detail methods typically produce more
pleasing visual results when they are applied to high-detail non-planar objects. Culling, point
rendering and impostor techniques make heavy use of scene subdivision techniques, which
therefore must be robust and efficient. The determination of the fidelity of a scene plays an
important part in ascertaining if the technique being applied is successful. Choosing
techniques that only give benefit in terms of frame-rate is not sufficient. Use of perceptual metrics along with geometric metrics will provide the best results. Scene management
methods must be used to ensure that the overall processing time does not exceed that of
straightforward rendering of the scene. Factors that affect this condition include texture cache misses,
network transmission times, network stability and overall resource size. A combination of
techniques may be used, but careful management of which types of objects and scenes to
apply them to is required.
4. Implementation
This chapter gives details of the implementation of the system. The system, as discussed
in Chapter One, focuses on reducing the overall rendering time and ensuring that both character and architectural models are catered for. All architectural models used come from
previous work by [HAMILL03] and use the Wavefront OBJ file format. Implementation is
split into geometric level-of-detail, culling, impostor and point-based rendering techniques. A
framework within which each of these methods sits is then described.
4.1. Level of Detail
In order to combat crippling polygon counts a geometric level-of-detail approach has been
taken. This allows for the production of multi-resolutional meshes that can be iteratively
transmitted to a client device dependent on the visual quality required. Metrics based on
[MELAX98] and [GARLAND97] were implemented and pseudo-code for their calculation is
included. Figure 4-1 gives a brief overview of some geometric considerations that affect the
quality of the resultant simplified object.
Figure 4-1 - Vertex Removal Cost Estimation
Work focuses on the removal of small faces that contribute little to the overall object shape
while still ensuring that high contrasting detail remains intact. Several different cases exist
when removing and replacing triangles. Typically, small triangles should be removed before
larger triangles, as the overall visual impact is less. Smaller triangles are deemed to be those
with both smaller surface area and shorter average edge length. Surfaces that are coplanar
should be merged first rather than merging two surfaces that previously had large incident
angles. Care is also taken to avoid replacing triangles that cause the resultant normal to flip.
The change in normal to a given face indicates that the shape of the face has been drastically
changed and will be visually noticeable.
SlimModel()
    Load Object Information
    WHILE Number of Faces > Required Number of Faces
    DO  FOR all Vertices
        DO Determine Cost_Of_Moving_Vertex()
        END
        Get Minimum-Cost Vertex Removal
        Remove that Vertex
        Update Edge List
    END

Cost_Of_Moving_Vertex_Complex()
    FOR All Connecting Vertices To Current Vertex
    DO  Determine Lowest Angle Between Faces Containing This Edge
        Determine Length of Edge
        Determine Area of Triangles Possibly Removed
        IF Combined Cost < Current Cost
        THEN Mark Edge As Most Favourable Candidate For Removal
        END
    END

Cost_Of_Moving_Vertex_Simple()
    FOR All Connecting Vertices To Current Vertex
    DO  Determine Area of Triangles Possibly Removed
        Score Face Removal Based On Area And Effect On Surrounding Face Normals
    END

Table 4-1 - Geometric Level-of-Detail Pseudo-code
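The edge-cost step of the pseudo-code above can be sketched concretely. The following Python fragment is an illustrative re-statement of the [MELAX98]-style metric (edge length weighted by local curvature), not the thesis implementation itself; the argument names and adjacency representation are assumptions made for the sketch.

```python
import math

def dot(a, b):
    """Dot product of two 3-tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def edge_collapse_cost(u, v, u_faces, side_faces, normals):
    """Melax-style cost of collapsing vertex u onto v.

    u, v       -- 3D points as (x, y, z) tuples
    u_faces    -- indices of all faces touching u (hypothetical adjacency)
    side_faces -- indices of the faces containing the edge u-v
    normals    -- unit normal per face index

    Cost = edge length * curvature, so short edges on near-flat surfaces
    are collapsed first, matching the ordering described in Table 4-1.
    """
    edge_len = math.dist(u, v)
    curvature = 0.0
    for f in u_faces:
        # Smallest deviation between face f and any face on the collapse edge.
        mincurv = min((1.0 - dot(normals[f], normals[s])) / 2.0
                      for s in side_faces)
        curvature = max(curvature, mincurv)
    return edge_len * curvature
```

A coplanar neighbourhood yields zero cost (the vertex can be removed with no visual change), while a sharp fold yields a cost close to the full edge length, preserving high-contrast detail.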
As the pseudo-code in Table 4-1 shows, the algorithm is applied iteratively until the required percentage simplification is achieved. Once completed, the multiple levels-of-detail are integrated into a single file format, which allows successive levels-of-detail to be streamed from the low-detail version up to the full original high-detail model. These level-of-detail representations are
then displayed based on a distance metric and an approximate rendering-time cost. If the object is far from the camera, only a low-detail model need be displayed.
Figure 4-2- Level of Detail Based on Draw Distance
The level-of-detail then increases as the object moves towards the camera (see Figure 4-2). In addition, if the estimated frame-rate falls below a pre-determined value, a lower-detail model is chosen so that interactive frame-rates can be maintained. The frame-rate estimate is calculated from several previous frames combined with a time penalty based on the objects currently viewable. Texture quality also increases as the object moves closer to the camera. Sample levels-of-detail for an object can be seen in Figure A-7.
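The distance- and budget-driven selection just described can be sketched as a small function. This is an illustrative reconstruction, not the thesis code; the threshold distances and the frame-time budget are hypothetical values.

```python
def select_lod(distance, thresholds, est_frame_ms, budget_ms):
    """Pick a level-of-detail index (0 = coarsest) from camera distance.

    thresholds   -- distances below which detail may increase, nearest
                    first, e.g. [10, 25, 50] gives four levels (values
                    are hypothetical)
    est_frame_ms -- estimated render time from recent frames
    budget_ms    -- frame-time budget for interactive rates

    If the estimate already exceeds the budget, drop one level so that
    interactive frame-rates can be maintained, as in Section 4.1.
    """
    level = sum(1 for t in thresholds if distance < t)
    if est_frame_ms > budget_ms and level > 0:
        level -= 1
    return level
```

A nearby object under budget gets the finest level; the same object over budget is demoted one level rather than stalling the frame.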
4.2. Occlusion Culling
Culling is effective in removing large portions of a scene that are not viewable from a given viewpoint, thereby reducing the rendering time required. As detailed in Chapter 3, culling can be broken into three areas: frustum, back-face and occlusion culling. Frustum culling is implemented using work by [GRIBB01]. Back-face culling is implemented by examining each face normal relative to the viewpoint.
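The back-face test amounts to a single dot product per face. The following is a minimal sketch of that check, with assumed tuple-based arguments rather than the system's actual data structures.

```python
def is_back_facing(face_normal, face_point, eye):
    """Back-face test: a face is culled when its normal points away
    from the viewpoint.

    face_normal -- outward face normal (x, y, z)
    face_point  -- any point on the face
    eye         -- viewpoint position

    The face is back-facing when the vector from the face to the eye
    makes a non-positive dot product with the normal.
    """
    view = tuple(e - p for e, p in zip(eye, face_point))
    d = sum(n * v for n, v in zip(face_normal, view))
    return d <= 0.0
```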
Occlusion culling is concerned with determining a set of potential occluders for a given viewpoint. Using this set, occluded objects (occludees) are determined and removed from the scene. Potential occluders are deemed to be those larger than a given volume or with a dimension greater than a certain value. Once determined, the occluder set is re-examined and any objects in close proximity are fused to allow
for occluder-fusion checking. Objects are deemed occluded when, checked against the list of potential occluders, sufficient overlap has occurred. Full pseudo-code is given in Table 4-2.
Occlusion_Cull()
    Get Potential Occluder Set
    FOR every Octree Node
    DO  IF Node Outside View Frustum
        THEN Cull Node
        ELSE FOR every Potential Occluder
             DO IF Occluder_In_Front_Of_Node()
                THEN Cull Node
                ELSE Check Next Level of Octree
                END
             END
        END
    END

Occluder_In_Front_Of_Node()
    IF Eye to Center of Occluder > Eye to Center of Node
    THEN RETURN No Overlap
    ELSE Project Node Bounding Box to Screen Space
         Project Occluder Bounding Box to Screen Space
         IF Occluder Overlaps Node In Screen Space AND Overlap > Percentage
         THEN RETURN Overlap Occurred
         ELSE RETURN No Overlap
         END
    END

Table 4-2 - Occlusion Culling Pseudo-code
Overlap testing requires the screen-space coordinates of the objects to be tested. Once these screen coordinates are calculated, a simple overlap test can be performed (see Figure 4-3).
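The screen-space overlap test with a percentage threshold can be sketched as follows. Rectangles stand in for the projected bounding boxes of Table 4-2; the 90% threshold is a hypothetical value, not one taken from the thesis.

```python
def overlap_fraction(node_rect, occ_rect):
    """Fraction of the node's screen-space rectangle covered by the
    occluder's rectangle.  Rectangles are (x0, y0, x1, y1) in screen
    coordinates, a simplified stand-in for projected bounding boxes."""
    x0 = max(node_rect[0], occ_rect[0])
    y0 = max(node_rect[1], occ_rect[1])
    x1 = min(node_rect[2], occ_rect[2])
    y1 = min(node_rect[3], occ_rect[3])
    if x1 <= x0 or y1 <= y0:
        return 0.0  # rectangles are disjoint: early exit, no occlusion
    inter = (x1 - x0) * (y1 - y0)
    node_area = (node_rect[2] - node_rect[0]) * (node_rect[3] - node_rect[1])
    return inter / node_area

def is_occluded(node_rect, occ_rect, threshold=0.9):
    # Cull only when coverage exceeds the percentage threshold,
    # mirroring the check in Occluder_In_Front_Of_Node().
    return overlap_fraction(node_rect, occ_rect) >= threshold
```

The early return on disjoint rectangles reflects the design goal stated below of reporting success or failure as soon as possible.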
Figure 4-3 - Occlusion Culling Scheme
To speed up the process of overlap testing, individual objects within the scene are tested only once the bounding box in which they reside has been determined to be viewable. The scene is split by spatial subdivision, as in [SAO99]: it is placed into an octree data structure and all occlusion testing proceeds in order of the current octree node, then the current object bounding box, then individual object faces. An example subdivision is shown in Figure C-5. Testing in this order means that expensive face testing need only occur for objects with a strong possibility of being visible. The use of simplified bounding boxes greatly speeds up the culling calculations, but several problems arise from their use. As illustrated in Figure 4-4, bounding boxes can be ill-fitting. This can occur when a building consists of a large tower with a lower flat section: the bounding box then includes a large area of screen space that does not contain any object faces, causing inaccurate results when determining whether objects overlap.
Figure 4-4 - Occlusion Culling Overlap Testing
The overlap test must also take into account the amount of overlap that has occurred; if the overlap is less than a certain percentage it is ignored and the object drawn. The system also avoids inefficient testing where overlap is obviously impossible, i.e. the overlap test returns success or failure as soon as possible without working through all overlap cases.
Figure 4-5 - Occlusion Scheme Framework
The overall framework for culling is illustrated in Figure 4-5. The scene is divided into
octree regions, view frustum culling occurs, the potentially visible set is obtained, occluded
objects are removed, back-face culling occurs and then the final render is calculated. The culling algorithm does not consider dynamic objects, i.e. those that can change shape or position. Character models fall within this criterion as they move about the environment, and a sufficiently large character could have a significant effect on what is and is not occluded. For simplicity, the system assumes that such dynamic objects make little difference to the occlusion-culling calculations; this holds provided the dynamic objects are relatively small in comparison to the scene in which they reside. The only time they cause major occlusion is when they are directly in front of and very close to the camera, in which case the system simply displays the character and no other objects.
4.3. Street Impostor
Techniques such as point-based rendering and geometric level-of-detail are not well suited to rendering large architectural models; they generally work better on models with small facets, whereas architectural models have many large co-planar faces. Impostor techniques are much better suited, as they can simplify large portions of the screen with little perceivable difference in image quality. Impostor generation can be seen as a two-phase process: impostors are first generated for all possible views in a scene, then re-triangulation and 3D re-projection occur at render time. Determining all possible views is made tractable by constraining the system to architectural walk-through applications. This restricts the user camera to a position parallel to ground level, with no overhead shots of the city considered. Furthermore, image generation is constrained to the end points of streets and junctions, as in [SILLION97].
Figure 4-6 - Impostor Placement At Street Nodes
Figure 4-6 provides examples of the various junction types that commonly occur in a city layout. Impostors are placed perpendicular to the line from the street midpoint to the street start or end. Then, depending on the position of the user viewpoint, any relevant impostors are drawn orientated towards that position. This exploits a typical characteristic of street scenes: visibility is usually blocked by narrow streets, with few tall buildings viewable at the sides of the street. The only visible buildings that do not reside on the street itself are viewable at the end of the street or above the local skyline. Essentially, from any given viewpoint the scene can be split into background, foreground and the buildings on the street on which the viewer is currently standing (see Figure 4-7).
Figure 4-7 - Street Scene At Ground Level
The system simply places an impostor at the end of the street, large enough to represent all viewable geometry behind it (see Figure 4-8). Appendix Figure C-3 shows a sample city where, in the upper picture, streets are marked out and, as displayed in the lower picture, impostors are generated and placed. Sample impostors are shown in Figure C-4.
Figure 4-8 - Street Impostor Draw Distance
Figure 4-9 - Three way impostor at street joint
The use of impostors causes many problems: resolution mismatch, the rubber-sheet effect, incomplete representation and parallax errors. These problems and others are detailed in [DECOR99]. All these effects are minimal in the implemented system because of the small screen space the impostor actually occupies. To combat parallax and give further visual fidelity, each street start and end is enhanced with three impostors (see Figure 4-9). As the user moves across the width of a street, parallel to the end impostor, visual information is typically lost. If simple rotation within a given threshold were used to alleviate this, the effect would be noticeable and incorrect. Therefore, at a pre-determined point in the rotation of the end impostor, the system switches to either the left- or right-hand-side impostor.
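The switch between the three per-street impostors can be sketched as an angle test on the ground plane. This is an illustrative reconstruction of the scheme in Figure 4-9; the 15-degree switch-over threshold and the 2D representation are assumptions.

```python
import math

def pick_impostor(viewer, impostor_pos, street_dir, threshold_deg=15.0):
    """Choose which of the three per-street impostors to draw
    ('left', 'centre' or 'right').

    viewer, impostor_pos -- ground-plane positions (x, z)
    street_dir           -- unit vector along the street axis,
                            pointing from the impostor down the street
    threshold_deg        -- hypothetical switch-over angle
    """
    to_viewer = (viewer[0] - impostor_pos[0], viewer[1] - impostor_pos[1])
    # Signed angle between the street axis and the line of sight,
    # from the 2D cross and dot products.
    angle = math.degrees(math.atan2(
        to_viewer[0] * street_dir[1] - to_viewer[1] * street_dir[0],
        to_viewer[0] * street_dir[0] + to_viewer[1] * street_dir[1]))
    if angle > threshold_deg:
        return 'left'
    if angle < -threshold_deg:
        return 'right'
    return 'centre'
```

Walking straight down the street keeps the centre impostor; drifting towards either kerb past the threshold swaps in the corresponding side impostor rather than rotating the centre one.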
Calculating and creating all the relevant image impostors for a given city scene produces a large amount of information, and problems arise in its efficient storage and retrieval at runtime. A texture-caching scheme is required to manage this (see Figure 4-10).
Figure 4-10 - Texture Caching Framework
The caching scheme has two levels of texture storage, covering the far and near field of a given viewpoint. The system keeps track of the current viewpoint and direction, storing full building textures for the near field and impostor textures within a certain distance and direction for the far field. Then, depending on the texture and memory resources available, the system displays the near field and then attempts to display the far field if possible.
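The two-level scheme of Figure 4-10 can be sketched as a pair of least-recently-used pools, one per field. This is a simplified illustration, not the thesis implementation; the slot counts and the `load` callback (standing in for reading from the local store) are hypothetical.

```python
from collections import OrderedDict

class TextureCache:
    """Two-level texture cache: one LRU pool for near-field building
    textures and one for far-field impostor textures.  Capacities are
    hypothetical texture-count budgets."""

    def __init__(self, near_slots=8, far_slots=16):
        self.near = OrderedDict()
        self.far = OrderedDict()
        self.slots = {'near': near_slots, 'far': far_slots}

    def fetch(self, key, field, load):
        pool = self.near if field == 'near' else self.far
        if key in pool:
            pool.move_to_end(key)      # mark as most recently used
            return pool[key]
        if len(pool) >= self.slots[field]:
            pool.popitem(last=False)   # evict the least recently used
        pool[key] = load(key)          # emulate reading from local store
        return pool[key]
```

Keeping the fields separate means a burst of far-field impostor fetches cannot evict the near-field building textures the current view depends on.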
4.4. Splatting
A point-based method using concepts developed by [RUSIN00] generates a hierarchical sphere-tree representation of a given model (see Figure 4-11 for a simple example). A pre-processing phase splits the model along the longest edge of each successive level's bounding box. The size of the splat required to cover each level is taken as the radius of the bounding sphere of that bounding box (see Table 4-3 for full pseudo-code).
Figure 4-11 - Bounding Sphere Hierarchy
Initialise_Splat_Hierarchy()
    FOR All Vertices
    DO  Set the Bounding Radius of Current Vertex as
        Maximum Bounding Radius of Any Face That Touches Current Vertex
    END

Build_Splat_Hierarchy()
    FOR All Vertices In Parent Node
    DO  Add Parent Vertex to Current Vertex List Dependent On Axis Split
    END
    Calculate Centroid For Current Vertex List
    Calculate Bounding Radius For Current Vertex List
    IF Number of Vertices > 1 AND Bounding Radius > MIN_ALLOWED
    THEN Determine Longest Edge of Bounding Box
         Count Number of Vertices on Either Side of Longest Edge Split
         Left Child = Build_Splat_Hierarchy()
         Right Child = Build_Splat_Hierarchy()
    ELSE Mark As Leaf Node
    END

Table 4-3 - Splatting Pseudo-code
Recursion continues until a user-specified minimum number of vertices or faces remains in a given bounding sphere, or the bounding radius reaches a minimum size. Details such as colour and normal are taken as the average over the included faces.
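The recursive longest-axis split of Table 4-3 can be sketched in Python. This is an illustrative reduction to bare points (the per-vertex colour and normal averaging is omitted); class and parameter names are assumptions of the sketch.

```python
import math

class SplatNode:
    """One sphere in the hierarchy: centre, radius and child nodes."""
    def __init__(self, centre, radius, children=()):
        self.centre, self.radius, self.children = centre, radius, children

def build_splat_hierarchy(points, min_count=1, min_radius=0.01):
    """Recursively split a point set along the longest bounding-box
    edge, bounding each level by a sphere, as in Table 4-3."""
    lo = [min(p[i] for p in points) for i in range(3)]
    hi = [max(p[i] for p in points) for i in range(3)]
    centre = tuple((l + h) / 2 for l, h in zip(lo, hi))
    # Bounding-sphere radius of the bounding box: half its diagonal.
    radius = math.dist(lo, hi) / 2
    if len(points) <= min_count or radius <= min_radius:
        return SplatNode(centre, radius)           # leaf splat
    axis = max(range(3), key=lambda i: hi[i] - lo[i])  # longest edge
    mid = centre[axis]
    left = [p for p in points if p[axis] < mid]
    right = [p for p in points if p[axis] >= mid]
    if not left or not right:                      # degenerate split
        return SplatNode(centre, radius)
    return SplatNode(centre, radius,
                     (build_splat_hierarchy(left, min_count, min_radius),
                      build_splat_hierarchy(right, min_count, min_radius)))
```

Each recursion level halves the point set along its widest extent, so child sphere radii shrink towards the leaf splats.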
Figure 4-12 - Initial Bounding Sphere Estimation
Importantly, it is necessary to ensure that no visible gaps appear between splatted points. To guarantee this, a conservative over-estimate of the final splat size at each vertex is taken as the maximum radius of any face connected to that vertex (Figure 4-12).
The data structure produced by this splatting is suitable for progressive transmission, as successive levels-of-detail are stored in breadth-first order, i.e. the lowest level-of-detail is stored completely, followed by the next level, and so on. This allows partial transmission of a splatting file, which may often occur in a wireless network environment; the concept follows on from work by [RUSIN01, WOOL02]. To minimise the overhead associated with transmitting this new file format, a simplified version of the LZ adaptive
dictionary-based compression algorithm is used. This reduces the file size while avoiding costly or lossy decompression. Individual splats are rendered using traditional z-buffering and depth correction. Splat shapes and sizes can be changed to any geometric shape, with simple points being the least costly; an example is shown in Figure A-1. Note that the Lego piece (Figure A-6) does not suit splatting, while highly detailed models such as the dragon (Figure A-3) are much better suited. Similarly, architectural models such as buildings are unsuited due to their large, mainly flat, co-planar surfaces, whereas character objects typically consist of a large number of small faces that adapt easily to a sphere-tree representation.
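The lossless round-trip the thesis relies on can be illustrated with Python's `zlib` (whose DEFLATE coder is itself LZ77-based) as a stand-in for the simplified LZ dictionary scheme described above; the 16-byte splat record layout here is a hypothetical serialisation, not the thesis file format.

```python
import struct
import zlib

def pack_splats(splats):
    """Serialise splats (centre x, y, z plus radius, as 32-bit floats)
    and compress them losslessly.  zlib stands in for the simplified
    LZ adaptive dictionary coder described in the text."""
    raw = b''.join(struct.pack('<4f', *c, r) for c, r in splats)
    return zlib.compress(raw)

def unpack_splats(blob):
    """Decompress and recover the ((x, y, z), radius) records."""
    raw = zlib.decompress(blob)
    out = []
    for off in range(0, len(raw), 16):   # 4 floats * 4 bytes each
        x, y, z, r = struct.unpack_from('<4f', raw, off)
        out.append(((x, y, z), r))
    return out
```

Because the compression is lossless, the decompressed splat records are bit-identical to the originals, so no decompression artefacts are introduced on the device.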
Objects may also be subdivided by use of an octree, with successive levels-of-detail drawn as successively smaller cubes. These cubes are much better suited to models such as the Lego piece (Figure A-6), because bounding cubes can be fitted precisely to the shape of the model.
4.5. Overall Framework
As the system has many different techniques working in tandem, careful management of how and when each is invoked is required. Further organisation is needed to ensure that the server or an offline process carries out as many processor- and memory-intensive functions as possible. To reduce transmission costs, simple predictive algorithms are needed, as the mobile device never holds the full scene in memory. The distribution and segmentation algorithm has already been implemented, as discussed in [ROSSI03]; this work has been further enhanced to cache not only geometry and textures but also level-of-detail and impostor resources. As described in [FUNK96], the framework similarly consists of a pre-processing phase in which level-of-detail generation, impostor generation and spatial subdivision occur.
At render time the client determines what is next in the render queue and whether it has all pertinent information related to that object, and then proceeds to render the object (see Figure 4-13). As the actual distribution across wireless networks is not part of this thesis, it has been emulated by forcing the application to read from the local file store when information does not reside in system memory. The cache employs a least-recently-used scheme to determine which objects should be replaced. The predictive algorithm that drives caching uses a simple ellipsoid distance metric, as described earlier, based on [PASM03]; it also bears resemblance to the concept of areas of interest developed in [CAI99].
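The predictive test behind the caching decision can be sketched on the ground plane. This is a 2D illustrative stand-in for the ellipsoid metric cited above; the semi-axis lengths and the forward offset of the ellipse are assumptions of the sketch.

```python
def in_interest_ellipsoid(obj, viewer, heading, a=50.0, b=20.0):
    """Predictive-caching test: is an object inside an ellipse of
    interest elongated along the viewer's heading?

    obj, viewer -- (x, z) ground-plane positions
    heading     -- unit (x, z) direction of travel
    a, b        -- hypothetical semi-axes (major along the heading)

    The ellipse centre is pushed half a semi-axis ahead of the viewer,
    so more geometry ahead of the viewer is prefetched than behind.
    """
    cx = viewer[0] + heading[0] * a / 2
    cz = viewer[1] + heading[1] * a / 2
    dx, dz = obj[0] - cx, obj[1] - cz
    # Project the offset onto the heading (major axis) and its normal.
    along = dx * heading[0] + dz * heading[1]
    across = -dx * heading[1] + dz * heading[0]
    return (along / a) ** 2 + (across / b) ** 2 <= 1.0
```

Objects returning true would be scheduled for prefetch into the cache; objects well behind or far to the side of the viewer are ignored.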
Figure 4-13- Framework for Pre-processing, Communication and Rendering
[PASM03] presents a mathematical model for comparing geometric and image-based simplification methods, and its results provide guidelines for deciding when to change between levels-of-detail. In practice, the point at which the framework changes between representations is based on the polygonal, texture and communication budgets. The client first determines, given its next set of objects to render, whether a sufficient representation resides in its local cache. It then considers what level-of-detail can be used without excessively consuming the polygonal budget, i.e. no single object should take up the entire budget; rather, a balance should be maintained between all objects in the current viewpoint. The only exception to this rule is when displaying character objects or objects that have been assigned a high importance. The content creator assigns an importance to every object during system initialisation; importance is arbitrary and dependent on application requirements. Generally, character objects are given high priority, as emphasis is placed on interaction with virtual characters. The architecture is flexible enough to allow additional rendering pre- and post-processing, and varying caching metrics can be added.
5. Experimentation Results
This chapter presents results obtained during experimentation with the system. The results give an overview of the entire system, beginning with timings for basic rendering and proceeding to the techniques described above.
5.1. Single Model Basic Rendering
To judge the success of the techniques implemented, timings are first presented for basic rendering as provided by PocketGL [18]. Figure 5-1 graphs the number of vertices and faces per model. The models chosen for rendering are architectural models from [HAMILL03].
[Bar chart omitted: vertex and face counts per model for fireworks, doyles, kennedys, Ulster Bank Back, Becket, Luce, Physics, Botany, Dental and Stats.]

Figure 5-1 - Model Vertex and Face Count

[Bar chart omitted: number of materials and frames per second for the same models.]

Figure 5-2 - Number of Materials vs. Frames Per Second
It can be seen that the number of textures directly affects the rendering performance of the system (see Figure 5-2): as the number of textures required for display increases, the frame rate decreases dramatically.
5.2.