
Source: eiger.ddns.comp.nus.edu.sg/pubs/mis99.pdf

Immersidata1 Management: Challenges in Management of Data Generated within an Immersive Environment*

Cyrus Shahabi, Greg Barish, Brian Ellenberger, Ning Jiang, Mohammad Kolahdouzan, Seong-Rim (Angela) Nam, Roger Zimmerman

Integrated Media Systems Center and Department of Computer Science, University of Southern California

Los Angeles, 90089
{shahabi, gbarish, ellenber, njiang, kolahdoz, seongnam, rzimmerm}@usc.edu

The possibility of immersing a user into a virtual environment is quickly becoming a reality. At the Integrated Media Systems Center (IMSC) we are exploring the unique new challenges presented by such systems in the context of our own implementation, the Media Immersion Environment (MIE). A novel aspect of immersive environments is the new data types that need to be supported. In this report we present our experience with three new data types: haptic, avatar, and application coordination data. These data types have a unique set of characteristics that results in new challenges for their acquisition, analysis, content-based querying, and integration requirements associated with data-intensive immersive collaboration (e.g., scalability, synchronization, and cross-analysis needs). Currently, MIE is used to host three applications that generate experimental data streams of these new data types and hence allow us to study their requirements in a realistic setting. We have already identified significant properties of these new data types, and through our prototype implementations we hope to achieve an even better understanding of their characteristics, develop solutions that enhance immersive experiences, support advanced queries, optimize storage and retrieval, and increase system efficiency and scalability.

Introduction

With current technological advances in computer graphics and animation, high-speed networking, signal and image processing, and multimedia information systems, it is now feasible to immerse a person into a virtual environment. An obvious application for immersive environments is for people to interact, communicate and collaborate naturally in a virtual space while they actually reside in different physical locations. At IMSC [MNN+99], we are focusing on challenges in realizing such an immersive environment under a project called the Media Immersion Environment (MIE). There are applications for MIE in many domains, including those where it is:

• expensive to have physical presence (e.g., distance learning),
• impossible to have physical presence (e.g., space exploration),
• important to have multiple presences (e.g., remote medicine),
• not safe to have physical presence (e.g., nuclear studies), and
• enjoyable without requiring physical presence (e.g., entertainment industry).

An interesting extension to immersive environments is the ability to record sessions (persistent storage). Augmenting this recording with enough semantics will make it possible to query elements of the immersion and customize the results toward user preferences. Hence, a user would be able to experience the immersion as if he/she had been present in the physical environment.

One challenge of recording an immersion is managing the huge amount of heterogeneous data generated. The obvious data types (image, video, audio, and text) are supported in our prototype, but for the purposes of this paper, we will focus on the challenges in supporting three less familiar types: haptic, avatar, and application coordination data.

We term these data types collectively as immersidata (see Figure 1). Immersidata is the information acquired or rendered during an immersisession, in which one records or presents visual, auditory, haptic, and application coordination data to immerse a person with other persons, objects, places, and databases. An immersisession is an augmented or virtual reality experience that connects a person with other persons, objects, places, and databases. Our MIE system supports the acquisition, storage, querying and analysis of immersidata. Here, we report on our effort, the lessons learned about the properties of this data and some

1 This term was coined by Professor Dennis McLeod during one of our many IMSC group meetings.
* This research was supported in part by NASA/JPL contract no. 961518, unrestricted cash/equipment gifts from NCR and Intel, and NSF grants EEC-9529152 (IMSC ERC) and MRI-9724567.


future challenges. This paper serves primarily to introduce immersidata to the community, in order to encourage research towards the modeling, querying and analysis of this new data type.

Immersidata Applications and Data Types

We focus on three different applications, each one useful for illustrating a specific immersidata type considered in this study. The first is a virtual museum, a haptic application where users can touch expensive art pieces in a virtual environment [MELG97, SJM97]. Haptic data contains information on the sensations of movement, touch, and feel associated with experiencing a virtual object. Haptic experiences allow the user to physically feel many types of virtual forces via special machinery attached to computers serving as interfaces to that data. One such device we have been experimenting with is the PHANToM [Pha98] stylus. This product allows users to experience touch and movement by having them use the stylus as if it were a single finger on a real hand. With the stylus, users have a wide range of motion and rotation, and can experience many levels of resistant force. The type of data that the PHANToM captures includes movement, rotation, and forces (collisions between virtual objects). The result of this experience is a set of recorded haptic data that can later be replayed and analyzed. One future application we are considering would allow students to learn a breast cancer detection procedure. Using a more advanced haptic device, trainees could experience the procedure, recorded previously by an expert, to aid their learning.

Our second MIE application supports teleconferencing over a low-bandwidth communication channel. In order to minimize the network communication required to support animation, computer graphics scientists

Figure 2: The avatar data type - a set of facial feature points as a function of time

[Figure 1: Organization and properties of immersidata. The taxonomy shows immersidata encompassing image, video, audio, avatar, application coordination, and haptic data types; haptic data is further subdivided into tactile (texture taxonomy), grasping (prehension taxonomy), and kinesthetic (force taxonomy) data.]


have proposed tracking only a few feature points [NLE+99], utilized for avatar animation. Hereafter, we refer to the set of these feature points as the avatar2 datatype. Figure 2 shows an example set of these X, Y points for the human face.

Our third application involves collaborative distance learning [KHH97]. It is a rapid prototype of a digital microscope, part of the BioSIGHT [MNN+98] application, which allows two or more users to learn about biology by navigating through biological images (e.g., images of a blood cell) using commands such as "Select Image", "Zoom", "Scroll Left", etc. The application works similarly to an immersive collaborative whiteboard: the effects of each user action are simultaneously received by all other users. As many users collaborate on learning about, say, blood cells, their interactions with the system (i.e., the commands issued by each user), which are called application coordination data, are shared through a collaborative data server. This server then timestamps the interactions and stores them in a database. An immediate advantage of recording this data is the ability to play back a session. Thus, sessions recorded by students can be replayed for review, or by an instructor for grading purposes.

One can imagine a collaborative distance learning application (or any other cooperative immersive application) which involves the integration of several immersidata types. Students and/or their teacher could be required to see each other (avatar), touch related objects (haptic) and use software commands of shared applications (application coordination data). By storing all these data types and providing cross-analysis and query capabilities as well as synchronized playback, the learning experience can be enhanced.

MIE Architecture

Our current MIE storage architecture provides access to three organizational repositories as illustrated in Figure 3: an object-relational database management system (OR-DBMS, currently Informix Universal Server), a real-time file server (RTFS, currently Microsoft NetShow Theater Server), and the Ariadne information mediator [AAK+98]. Additionally, a collaborative server coordinates multi-user interactions. The distributed repository supports the storage and retrieval of structured, semi-structured, and unstructured media types. Middleware written in Java "glues" these repositories together to provide unified access from the immersipresence stations to the stored data. The middleware is loaded into the client space along with the application and its graphical user interface (GUI) at run time. Industry-standard communication mechanisms such as socket connections, CGI-bin, and Remote Method Invocation (RMI) provide connectivity between the middleware and the servers, and among the servers. For example, the collaborative data server (AgoraServer), which is typically used between multiple clients to synchronize, store and retrieve the session data of multi-user collaborative applications, utilizes a communications protocol based on socket connections between the different clients to keep all the participants synchronized. At the same time, all generated structured data is stored in the Informix Universal Database by invoking the database server via the Java RMI API.

Currently, the middleware functions that provide access to multiple storage servers are tightly coupled with our applications and their graphical user interfaces. We are in the process of separating the general functions required to access different servers from the application code, in order to define them as a formal API, which we then intend to release publicly for other research groups who want to Rapidly Integrate multiple Multimedia Servers (RIMS API) for their applications. The main advantage of the RIMS API is that it will provide access to multiple servers from a Java client activated via a standard web browser. Finally, as our own servers (e.g., the RTFS and collaborative data server) become available, we plan to release their code as well so that other research groups can replace commercial products with our open-source servers. This will provide them with a flexible and extensible multimedia storage infrastructure.

Many immersive data types, for example haptic data, need to be acquired and later replayed in real time. At first glance, this seems to be a trivial task. However, our preliminary computations indicate that the bandwidth and storage capacity required for real-time storage and display of haptic data can be as demanding, and hence as challenging, as those of continuous media data (i.e., video and audio). Due to the similar characteristics of haptic data and continuous media data, our architecture includes a flexible real-time file server (RTFS) that is intended to service these (and other) isochronous media types. A direct link connects the clients to the RTFS. Additional semantic content descriptions of the real-time data (such as time-of-day, participants, etc.) are forwarded to the Informix Universal Database server. One of the main advantages of this design is its operation in two-tier mode, which provides superior performance as compared with the three-tier architectures typically employed in industry. We currently utilize

2 Note that in the computer graphics literature, the term avatar is used to describe a more general 3-D representation of character animation.


Microsoft NetShow Theater Server, but are in the process of developing our own RTFS, since some of the required extended functionality is not available in NetShow.

[Figure 3 shows the MIE architecture: immersidata stations, each running the middleware, connect via sockets and CGI-bin to the collaborative data server, the WWW-accessible Ariadne information mediator, the real-time file servers (Pentium II nodes delivering real-time streams over a high-speed ATM network), and a SUN Enterprise 450 running the Informix Universal Database Server (reached through the Informix Java API using Remote Method Invocation (RMI)), with a SUN StorEdge A1000 serving as the main repository of images and data.]

Our past experience in designing and building multimedia servers [Sha96, GZS+97, ZG97] has played a key role in the design and development phases of this custom RTFS. To achieve the high bandwidth and massive storage required for a multi-user RTFS, disk drives are commonly combined into disk arrays to support many simultaneous I/O requests. A distributed, multi-node design based on commodity personal computer hardware components provides a cost-effective and scalable solution. To store a large media object X on such a platform, it is commonly striped into n blocks: X0, X1, ..., Xn-1 [Pol91]. There are two basic techniques to assign these blocks to the magnetic disk drives that form the storage system: (a) in a round-robin sequence [SGM86], or (b) in a random manner [MSB97]. Traditionally, round-robin placement utilizes a cycle-based approach [GZS+97] to scheduling of resources to guarantee a continuous display, while random placement utilizes a deadline-driven approach [RW94]. We have previously implemented paradigm (a) and now plan to extend our prototype with random block placement in conjunction with a deadline-driven scheduler. Once we have acquired some sample haptic data, we intend to investigate which of the two data placement and scheduling techniques is more appropriate for a haptic data server. Furthermore, we expect to find additional characteristics that can help us customize and fine-tune our media server to achieve better performance. For example, it will be interesting to see if the conventional performance metrics for media servers (e.g., startup latency, hiccup rates and throughput) still remain valid for haptic servers. Similar issues can be investigated for buffer management techniques and admission control policies. To achieve high reliability and availability for all the session data, as well as support for heterogeneous disk storage, we will implement parity protection and logical disks [ZG97].
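The two block-placement techniques can be sketched as follows. This is a toy illustration only; the block and disk counts are invented, and it is not our RTFS code:

```python
import random

def round_robin_placement(num_blocks, num_disks):
    """Assign block i of object X to disk (i mod num_disks), as in [SGM86]."""
    return [i % num_disks for i in range(num_blocks)]

def random_placement(num_blocks, num_disks, seed=None):
    """Assign each block to a uniformly chosen disk, as in [MSB97]."""
    rng = random.Random(seed)
    return [rng.randrange(num_disks) for _ in range(num_blocks)]

# Striping a media object X into n = 12 blocks across 4 disks:
print(round_robin_placement(12, 4))   # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
print(random_placement(12, 4, seed=7))
```

Round-robin placement makes each block's location predictable (enabling cycle-based scheduling), while random placement trades that predictability for statistically even load, which suits deadline-driven scheduling.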

Properties of Immersidata

Table 1 compares immersidata requirements with traditional media types:

Figure 3: MIE Architecture


Characteristics                  Text Fields   Images    Video/Audio      Rendered        Immersidata
                                                         and Animation    Animation
Database Research Since          ~1970         ~1985     ~1990            ~1995           ~2000
Semantics                        Very Rich     Poor      Poor             Medium          Rich
Content Extraction Complexity    Very Low      High      Very High        Medium          Low
Storage Requirements             Small         Medium    Large            Medium          Large
Spatial Content                  N/A           Coarse    N/A              Average         Fine
Temporal Content                 N/A           N/A       Coarse           Average         Fine
Real-Time Delivery               N/A           N/A       I/O intensive    CPU intensive   I/O intensive

Haptic data is an example of a data type which exhibits these immersidata properties. Haptic data from the PHANToM includes three-dimensional position vectors, a quadruple to represent rotation (a quaternion), object collision and time information. Obviously, this data implicitly contains spatial, temporal, and semantically rich information. When we further consider that such data is acquired at a high sampling rate (to make replay of the session smooth and realistic), it becomes more obvious that storage requirements and real-time collection efforts become significant problems.
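As a rough sketch of what one such sample might look like in storage: the field layout below and the 1 kHz sampling rate are illustrative assumptions, not measurements of the actual device.

```python
from dataclasses import dataclass

@dataclass
class HapticSample:
    timestamp: float      # seconds since session start
    position: tuple       # (x, y, z) position vector
    rotation: tuple       # (w, x, y, z) quaternion
    collision: bool       # collision with a virtual object?

# Back-of-envelope storage rate, assuming 64-bit floats, a 1-byte
# collision flag, and a hypothetical 1 kHz sampling rate:
bytes_per_sample = 8 * (1 + 3 + 4) + 1   # timestamp + position + quaternion + flag
rate_per_hour = bytes_per_sample * 1000 * 3600
print(round(rate_per_hour / 2**20), "MB per hour")   # roughly 223 MB per hour
```

Even this minimal record accumulates hundreds of megabytes per hour of immersisession, which illustrates why storage and real-time collection are significant problems.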

Preliminary Findings

We have already begun to address the challenges in storage, real-time data collection, and scalability of immersisessions with a few prototypical solutions.

Consider our approach to real-time haptic data collection. In our implementation, we use a multi-threaded priority buffering scheme to ensure that real-time, temporal, and storage requirements are all met. Some threads are associated with temporary (in-memory) storage of data collected in real time; others are associated with periodically flushing this in-memory buffer to disk. Thus, our implementation permits rapid buffering of frequently sampled information while incurring minimal file I/O penalties. To ensure that our recording and playback of haptic data is portable across CPUs (different speeds result in different playback rates), we identify the sampling rate (in terms of callbacks to our code). Then, during playback, we make sure to execute recorded events according to the relative speed of the playback CPU.
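The buffering scheme can be sketched roughly as follows; the batch size and the in-memory list standing in for disk writes are illustrative assumptions, not the actual MIE parameters:

```python
import queue
import threading

def collect_and_flush(samples, flush_every=64):
    """Collector thread enqueues samples in memory; a flusher thread
    drains the queue to disk in batches, so the high-rate collection
    path never blocks on file I/O."""
    buf = queue.Queue()
    flushed = []   # stand-in for the on-disk session record

    def collector():
        for s in samples:      # in MIE this would be device callbacks
            buf.put(s)
        buf.put(None)          # sentinel: end of session

    def flusher():
        batch = []
        while True:
            s = buf.get()
            if s is None:
                break
            batch.append(s)
            if len(batch) >= flush_every:
                flushed.extend(batch)   # stand-in for one disk write
                batch = []
        flushed.extend(batch)           # final partial batch

    t1 = threading.Thread(target=collector)
    t2 = threading.Thread(target=flusher)
    t1.start(); t2.start(); t1.join(); t2.join()
    return flushed

print(len(collect_and_flush(range(1000))))   # 1000
```

Batching the writes is what keeps the per-sample cost low: the collector only pays for an in-memory enqueue, while the flusher amortizes file I/O over many samples.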

We currently have a prototype which acquires and plays back haptic data via the PHANToM. With the PHANToM stylus, users have a wide range of motion and rotation and can experience many levels of resistant force. The type of data captured by the PHANToM for this application includes movement, rotations, and forces (collisions between virtual objects).

Avatar data has similar real-time acquisition needs; in particular, it must be captured at a rate of 30 frames per second. For each frame, the coordinates of about 18 points need to be captured. A technique similar to that of haptic acquisition has been employed for avatar data acquisition. This data is then integrated with off-line recorded data representing face muscles, eye regions, wrinkles, and volume morphing features in order to reconstruct and render a realistic face animation.

Finally, our application coordination data prototype is a good example of how we address collaborative data acquisition. Our Java-based prototype consists of a collaborative data server module, AgoraServer3, and platform-independent client modules. The server acts as the synchronizer between the clients in a collaborative session (e.g., the digital microscope prototype). The server is connected to the clients through sockets and, upon receiving any command (which is initially in the form of push or mouse button actions but transformed to a standard form based on a predefined schema) from a user, distributes the command to all of the participants in the order in which the commands are received. If a command requires more information that needs to be fetched from the database server (e.g., images), the coordinator server, which is also connected to the Informix object-relational database (through the native Informix Java API), downloads and distributes that information as well. Each user, including the originator of the command, experiences session changes based on these directives from the server. The server also timestamps and records the interactions of all session participants. This information, if saved as raw data in local files or blobs in the database (for the

3 Agora: a gathering place; especially: the marketplace in ancient Greece (Merriam-Webster)

Table 1: Immersidata versus traditional data types


applications whose command semantics the server has no knowledge of), can only be used for playback purposes. However, with our prototype, the semantics of application commands are known to the server, so these semantics can be automatically extracted from user interactions and stored in the database as completely structured data. Later, this data can be queried.
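The server-side behavior described above can be sketched, without real sockets or the Informix back end, roughly as follows (the class, its fields, and the command schema are hypothetical stand-ins):

```python
import time

class CoordinatorSketch:
    """Timestamps each incoming command, records it, and broadcasts it
    to every participant, including the originator."""
    def __init__(self):
        self.clients = {}   # client_id -> list of delivered commands
        self.log = []       # timestamped session record (database stand-in)

    def join(self, client_id):
        self.clients[client_id] = []

    def submit(self, client_id, command):
        entry = {"t": time.time(), "from": client_id, "cmd": command}
        self.log.append(entry)                  # persist the interaction
        for inbox in self.clients.values():     # distribute to all participants
            inbox.append(entry)

server = CoordinatorSketch()
for c in ("alice", "bob"):
    server.join(c)
server.submit("alice", {"op": "Zoom", "factor": 2})
print(len(server.clients["bob"]), len(server.log))   # 1 1
```

Because every command passes through the server in a single stream, all clients observe the same order of session changes, and the timestamped log doubles as the record used for playback and later querying.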

Future Challenges

Thus far, we have focused our implementation efforts on immersidata acquisition and ad-hoc querying and analysis. We are now turning our attention to other complex topics, such as designing a framework for the structured modeling, querying, analysis and synchronized playback of immersidata. In this section, we discuss our ideas for improving the integration of these unique datatypes into our MIE architecture.

Note that, although we are indeed interested in the overall all-encompassing immersisession, where one's entire body is engulfed in a virtual world, we typically focus on the difficult and challenging aspects of each immersidata type. In many of the examples which follow, we consider techniques which might be used at a virtual group meeting. Such events may consist of facial animation and expression recognition (avatar), handling objects from remote meeting participants (haptic), and synchronized delivery/playback of these events (application coordination).

Haptic Data

Future work for the haptic datatype involves exploring the obvious system issues as well as more semantic, modeling challenges. Certainly, we will need to solve system-level issues related to the capture and playback of real-time, voluminous haptic data. We also see the effective classification and modeling of haptic data as being key techniques towards helping us achieve our goals.

For example, with regard to haptics as applied to the hand, we have defined three distinct subclasses of haptic data: tactile data, grasping data, and kinesthetic data (Figure 1). This classification is based on the type of feedback that most haptic devices can capture and some can replay. Tactile data pertains to touching, grasping data relates to holding, and kinesthetic data relates to the forces applied to the fingers.

Haptic data needs to be acquired and later played back in real time. Our preliminary computation shows that the bandwidth and storage capacity required for real-time storage and display of haptic data can be as demanding, and hence as challenging, as those of continuous media data (i.e., video and audio). To illustrate, consider the tactile subclass, the most demanding of the three subclasses in bandwidth and storage space requirements. According to [Loomis79], the spatial location of a point is detectable within 0.15 mm by a finger pad. An area of perhaps 10 x 10 mm containing such points covers each fingertip. [Bolanowski88] claims that the tactile system can resolve vibrations of up to 1 kHz, with the highest sensitivity around 250 Hz. Suppose we are interested only in the highest-sensitivity range. A simple computation on the above numbers, assuming the needs of two hands, each with five fingers, each sensor being binary, results in a 12 Megabits per second (Mb/s) bandwidth requirement. Next, assume that grasping and kinesthetic data are captured at 60 Hz. With 26 floating point values associated with grasping and 15 floating point values associated with kinesthetics, capture at 60 Hz for two hands approaches 2 Mb/s. Thus, the total bandwidth requirement for haptic data can be casually estimated as high as 14 Mb/s, which is three times higher than the average rate for MPEG-2. In short, there are strong real-time acquisition and playback challenges associated with this new datatype.
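The tactile back-of-envelope figure can be reproduced with a short computation; the rounding behind the quoted 12 Mb/s figure is an assumption, and this version lands slightly lower:

```python
import math

# Tactile: one binary sensor per 0.15 mm point over a 10 x 10 mm fingertip,
# sampled at the 250 Hz peak-sensitivity rate, for ten fingers (two hands).
points_per_side = math.ceil(10 / 0.15)        # 67 resolvable points per side
sensors_per_finger = points_per_side ** 2     # 4489 binary sensors per fingertip
tactile_bps = sensors_per_finger * 10 * 250   # bits per second, all ten fingers
print(round(tactile_bps / 1e6, 1), "Mb/s")    # ~11.2 Mb/s, on the order of the cited 12 Mb/s
```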

We can also consider challenges in querying haptic data. For example, consider a scenario where students use haptic devices to perform a virtual medical exam on another person. Instructors might want to query about the nature of the practice exam. Did the student do everything in the right order? Was the student too gentle or too rough with the patient? Obviously, these queries involve both a high degree of semantic understanding as well as identification and ordering of temporal behaviors. By augmenting raw haptic data with semantic and structural knowledge, we hope to be able to answer these types of queries.

Avatar Data

As is the case with haptic data, acquisition of avatar data is challenging both in terms of data volume and real-time synchronization. Intuitively, one approach to avatar data acquisition is to collect the history of every bit associated with an avatar. This is obviously not a scalable solution, since the real-time demands of this type of acquisition would simply defeat the process.

Instead of capturing every point associated with an avatar, and to retain the flexibility of having high-resolution avatars, we have been exploring techniques for capturing key avatar features over time. The general idea is to identify useful points for each avatar and simply track the movement of these points. As


mentioned earlier, our initial focus has been on facial animation. To complement the avatar feature-based approach described earlier, we have been actively investigating two additional separate, but related, problems: (a) identifying muscles from images and (b) identifying expressions from muscle actions.

To achieve muscle identification, we have been exploring two methods: PCA (Principal Component Analysis) and ICA (Independent Component Analysis). PCA and ICA are two popular techniques used in face recognition and can also be used in finding representations of facial expressions. The PCA approach provides a way of representing faces using features automatically derived from the statistical structure of a set of learned faces [VAA96]. The most recent research work in this area includes Holons [CM91] and Eigenfaces [TP91]. Features provide a better level of abstraction of the spatial information in facial images.

The ICA approach is a generalization of PCA. It is based on higher-order statistics decorrelated for face recognition [BS97]. The specific approach involves extracting statistically independent components of the face images in a data set, components which can be found through an unsupervised learning algorithm [BS95]. In [BS97], three ways of applying ICA are introduced: computing the independent components from the full set of images; separating independent components by performing ICA on a subset of principal component vectors from the initial image set; and selecting a subset of the independent components as kernels according to a priority based on the between-class to within-class variability. Experimental results show that ICA can generate much more powerful representations of face images than PCA. While ICA and PCA are used for identifying muscles from images, we are also exploring the Facial Action Coding System (FACS) to identify facial expressions from muscles [EF78]. We hope to, in some manner, combine results from PCA/ICA-based approaches with FACS technology to accomplish the overall goal of expression recognition.
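As a schematic illustration of the PCA building block only (run on synthetic random vectors standing in for face images, not our actual data or pipeline):

```python
import numpy as np

def pca_basis(images, k):
    """Return the top-k principal components ("eigenfaces") of a set of
    flattened images, computed via SVD of the mean-centered data."""
    X = images - images.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:k]                       # each row is one orthonormal component

rng = np.random.default_rng(0)
faces = rng.normal(size=(50, 64))       # 50 synthetic 64-pixel "faces"
basis = pca_basis(faces, k=5)
coords = (faces - faces.mean(axis=0)) @ basis.T   # 5-number code per face
print(basis.shape, coords.shape)        # (5, 64) (50, 5)
```

Each face is thus summarized by a handful of coefficients in the learned basis rather than by raw pixels, which is the abstraction over spatial information that the PCA approach provides.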

In addition, we have noticed that much more research work has been done on extracting models from the spatial information of face images than from their temporal information. In our MIE project, we can track the avatar data on the video sequence of a face [NLE+99]. Since the number of feature points is not large, the temporal change can potentially play an important role in finding the exact model for facial expressions. Thus, we hope to also combine time-series techniques to improve expression recognition.

The end result of several of these approaches will be to eventually support content-based querying of the avatar immersidata type. Once we capture the feature points by the process described above, we can exploit symmetry information and use Euclidean distances, instead of coordinates, to reduce the set of feature points further. Subsequently, each distance measure across time can be represented as time-series data. Hence, the work on time-series surprise mining [STZ99] can be utilized here to identify certain facial expressions automatically. This would allow for more semantic querying, such as “find the percentage of meetings in which 15% of the participants were bored”.
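A minimal sketch of this reduction, with hypothetical tracked feature points and a simple z-score test as a crude stand-in for the surprise-mining technique of [STZ99]:

```python
import numpy as np

def distance_series(tracks, i, j):
    """tracks: (T, P, 2) array of P tracked 2-D feature points over T frames.
    Returns the Euclidean distance between points i and j at each frame."""
    return np.linalg.norm(tracks[:, i] - tracks[:, j], axis=1)

def surprises(series, z=2.0):
    """Frames where a distance deviates more than z std-devs from its mean
    (a crude stand-in for time-series surprise mining)."""
    return np.flatnonzero(np.abs(series - series.mean()) > z * series.std())

rng = np.random.default_rng(1)
tracks = 1.0 + 0.01 * rng.normal(size=(100, 8, 2))   # near-still face
tracks[40, 0] += 5.0                      # inject a sudden change at frame 40
mouth_open = distance_series(tracks, 0, 1)
print(surprises(mouth_open))              # frame 40 stands out
```

Each distance series (e.g., between the lip corners) becomes one time series per expression cue, and the flagged frames can then be mapped to expression labels for semantic queries.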

Application Coordination Data

We are investigating different options for extending and improving the functionality and performance of the coordinator server. One option to reduce the load on the centralized server, particularly when large objects (e.g., images) are to be sent to the clients, is a hierarchical architecture: secondary servers are deployed as nodes between groups of clients and the primary server, and the clients communicate with them. Another way to improve performance is to connect the clients directly to the database (using the RIMS API). This enables the clients to bypass the coordinator server when large objects need to be fetched from the database.
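The load-reduction idea behind the hierarchical option can be sketched as follows; the class names are illustrative, not part of the MIE coordinator:

```python
class PrimaryServer:
    """Stands in for the centralized coordinator holding large objects."""
    def __init__(self, store):
        self.store = store
        self.fetches = 0            # how often clients reached the primary

    def get(self, key):
        self.fetches += 1
        return self.store[key]

class SecondaryServer:
    """Intermediate node: serves a group of clients from a local cache,
    touching the primary only on a miss."""
    def __init__(self, primary):
        self.primary = primary
        self.cache = {}

    def get(self, key):
        if key not in self.cache:
            self.cache[key] = self.primary.get(key)
        return self.cache[key]

primary = PrimaryServer({"slide-1": b"<large image bytes>"})
node = SecondaryServer(primary)
for _ in range(10):                 # ten clients in one group request the image
    node.get("slide-1")
print(primary.fetches)              # 1
```

With one secondary per client group, the primary serves each large object once per group rather than once per client.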

The advantage of storing verbose application coordination data is that applying data-mining techniques for analysis purposes becomes easier. Information about the semantics of the digital microscope prototype, our current client program, is partly integrated into the current version of the server module. We are modifying the server to become more generic, and building tools that make it easy to define, for each application, the semantics of the application coordination data (based on the object-oriented schema) and of its objects (e.g., images). This will enable the server to automatically extract and store the semantics of the application coordination data for different collaborative applications in the database. One interesting challenge in the analysis of immersidata is to extract the behavior of individual users while they were collaborating with others.
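A generic, verbosely stored coordination record might look like the following sketch; the field names are our assumptions for illustration, not the actual MIE schema:

```python
from dataclasses import dataclass, asdict
import json
import time

@dataclass
class CoordinationEvent:
    """One verbosely stored collaborative action; a generic record like
    this lets data-mining queries run over any application's data."""
    session_id: str
    user_id: str
    action: str        # e.g. "zoom", "pan", "annotate"
    target: str        # object acted upon, e.g. an image identifier
    params: dict       # application-specific parameters
    timestamp: float

ev = CoordinationEvent("s42", "alice", "zoom", "slide-1",
                       {"factor": 2.0}, time.time())
print(json.dumps(asdict(ev), sort_keys=True)[:60])   # ready for the database
```

Because every action is stored with its user, target, and parameters, later analyses can group events by user or by object without application-specific parsing.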

Our intention here is to evaluate individuals in collaborative groups, assuming we know how to analyze a user's behavior in a single-user session. The idea is to generate models that predict and reconstruct one user's interactions within the session, as if he were working with the application alone, by analyzing his interactions in the group, his profile in other sessions, and the profiles of other people (training sets) for the same session. Subsequently, we can apply known methods for analyzing single-user sessions (e.g., clustering [SZAS97, ZASSS97]) to the reconstructed interactions.
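The clustering step on reconstructed interactions can be sketched with a minimal k-means over hypothetical per-session action-count vectors:

```python
import numpy as np

def kmeans(x, k, iters=20):
    """Minimal k-means for clustering per-session feature vectors."""
    # Deterministic seeding: spread the initial centers across the data.
    centers = x[np.linspace(0, len(x) - 1, k).astype(int)].astype(float)
    for _ in range(iters):
        labels = ((x[:, None] - centers) ** 2).sum(-1).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels, centers

# Toy per-user vectors: counts of (zoom, pan, annotate) actions per session.
rng = np.random.default_rng(2)
browsers = rng.normal([10, 1, 1], 0.5, size=(15, 3))    # mostly zoom/pan
annotators = rng.normal([1, 1, 10], 0.5, size=(15, 3))  # mostly annotate
labels, _ = kmeans(np.vstack([browsers, annotators]), k=2)
print(labels[:15], labels[15:])   # the two behavior groups separate
```

Each resulting cluster is a candidate behavior profile; a reconstructed single-user session can then be assigned to its nearest cluster for comparison across sessions.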

Conclusions

We have described how IMSC uses the MIE framework to support sessions in immersive environments. We described the properties of our unique immersidata types (application coordination, avatar, and haptic), how they are currently addressed by the system, and the challenges that remain. By using our MIE system to support the integration needs of collaborative immersipresence, we hope to better understand these data types and develop solutions that enhance the overall immersive experience while also supporting advanced queries, optimizing data storage and retrieval, and maximizing system efficiency and scalability.

Acknowledgements

We would like to thank the following people, who have helped us with related applications and with whom we have had intriguing discussions: Craig Knoblock, Chris Kyriakakis, Margaret McLaughlin, Dennis McLeod, Gerard Medioni, Ulrich Neumann, Chrysostomos "Max" Nikias, Skip Rizzo, Sandy Sawchuck, Gaurav Sukhatme, Wee Ling Wong, and Sherali Zeadally.

References

[AAK+98] Ambite, J.L.; Ashish, N.; Knoblock, C.A.; Minton, S.; Modi, P.J.; Muslea, I.; Philpot, A.; Tejada, S. “Modeling Web Sources for Information Integration”. AAAI/IAAI 1998: 211-218.
[GIZ96] Ghandeharizadeh, G.; Ierardi, D.; Zimmermann, R. “An On-Line Algorithm to Optimize File Layout in a Dynamic Environment”. Information Processing Letters, Vol. 57, 1996, pp. 75-81.
[GKSZ96] Ghandeharizadeh, G.; Kim, S.H.; Shahabi, C.; Zimmermann, R. “Placement of Continuous Media in Multi-Zone Disks”. In Multimedia Information Storage and Management, Kluwer Academic, 1996, ISBN 0-7923-9764-9.
[GZS+97] Ghandeharizadeh, S.; Zimmermann, R.; Shi, W.; Rejaie, R.; Ierardi, D.J.; Li, T.W. “Mitra: A Scalable Continuous Media Server”. Multimedia Tools and Applications Journal, Kluwer Academic Publishers, 5(1):79-108, July 1997.
[KG98] Kim, S.H.; Ghandeharizadeh, S. “Design of Multi-user Editing Servers for Continuous Media”. In Proceedings of the 8th Workshop on Research Issues in Database Engineering (RIDE’98), Feb. 1998.
[KHH99] Kozma, R.; Hoadley, C.; Hinojosa, T. “Pilot Evaluation of the BioSIGHT Design and Immunology Module”. Menlo Park, CA: Center for Technology in Learning, SRI International, 1999.
[MELG97] McLaughlin, M.L.; Ellison, L.; Lucas, J.; Goldberg, S.G. “The Interactive Electronic Exhibition: Reconstructing the Boundaries between Museums and their Constituencies”. International Communication Association Conference, Montreal, Canada, 1997.
[MNN+99] McLeod, D.; Neumann, U.; Nikias, C.; Sawchuck, A.A. “The Move Towards Media Immersion”. IEEE Signal Processing, Vol. 16, No. 1, Jan. 1999, pp. 33-43.
[MSB97] Muntz, R.R.; Santos, J.; Berson, S. “RIO: A Real-time Multimedia Object Server”. ACM Sigmetrics Performance Evaluation Review, Volume 25, Number 2, September 1997.
[NLE+99] Neumann, U.; Li, J.; Enciso, J.Y. Noh; Fidaleo, D.; Kim, T.Y. “Constructing a Realistic Head Animation Mesh for a Specific Person”. USR-TR 99-691, Feb. 1999, USC.
[Pha98] PHANToM Users Manual. SensAble Technologies, 1998.
[Pol91] Polimenis, V.G. “The Design of a File System that Supports Multimedia”. Technical Report TR-91-020, ICSI, 1991.
[RW94] Reddy, A.L.N.; Wyllie, J.C. “I/O Issues in a Multimedia System”. IEEE Computer Magazine, March 1994, Volume 27, Number 3, pp. 69-74.
[SGM86] Salem, K.; Garcia-Molina, H. “Disk Striping”. Proceedings of the International Conference on Database Engineering, February 1986.
[SM98] Santos, J.R.; Muntz, R.R. “Performance Analysis of the RIO Multimedia Storage System with Heterogeneous Disk Configurations”. ACM Multimedia Conference, Bristol, UK, 1998.
[SJM97] Schertz, P.M.; Jaskowiak, J.; McLaughlin, M.L. “Evaluation of an Interactive Art Museum”. SPECTRA, A Publication of the Museum Computer Network, 25(1), 33-37, 1997.
[Sha96] Shahabi, C. “Scheduling the Retrievals of Continuous Media Objects”. Ph.D. Dissertation, University of Southern California, Los Angeles, California, 1996.
[STZ99] Shahabi, C.; Tian, X.; Zhao, W. “Trends and Surprises Mining on Time-Series Data”. In preparation for publication.
[SZAS97] Shahabi, C.; Zarkesh, A.; Adibi, J.; Shah, V. “Knowledge Discovery from Users Web-Page Navigation”. Proceedings of the IEEE RIDE’97 Workshop, April 1997.
[ZASSS97] Zarkesh, A.; Adibi, J.; Shahabi, C.; Sadri, R.; Shah, V. “Analysis and Design of Server Informative WWW-Sites”. Proceedings of the ACM CIKM’97 Conference, 1997.
[ZG97] Zimmermann, R.; Ghandeharizadeh, G. “Continuous Display Using Heterogeneous Disk Subsystems”. In Proceedings of the 5th ACM International Multimedia Conference, Seattle, Washington, Nov. 1997.