
Nrityakosha: Preserving the Intangible Heritage of Indian Classical Dance

ANUPAMA MALLIK and SANTANU CHAUDHURY, Indian Institute of Technology, Delhi
HIRANMAY GHOSH, Tata Consultancy Services

Preservation of intangible cultural heritage, such as music and dance, requires encoding of background knowledge together with digitized records of the performances. We present an ontology-based approach for designing a cultural heritage repository for that purpose. Since dance and music are recorded in multimedia format, we use the Multimedia Web Ontology Language (MOWL) to encode the domain knowledge. We propose an architectural framework that includes a method to construct the ontology with a labeled set of training data and use of the ontology to automatically annotate new instances of digital heritage artifacts. The annotations enable creation of a semantic navigation environment in a cultural heritage repository. We have demonstrated the efficacy of our approach by constructing an ontology for the cultural heritage domain of Indian classical dance, and have developed a browsing application for semantic access to the heritage collection of Indian dance videos.

Categories and Subject Descriptors: H.m [Information Systems]: Miscellaneous

General Terms: Design, Experimentation

Additional Key Words and Phrases: Heritage preservation, ontology construction, concept recognition, video annotation, multimedia web ontology language

ACM Reference Format:
Mallik, A., Chaudhury, S., and Ghosh, H. 2011. Nrityakosha: Preserving the intangible heritage of Indian classical dance. ACM J. Comput. Cult. Herit. 4, 3, Article 11 (December 2011), 25 pages.
DOI = 10.1145/2069276.2069280 http://doi.acm.org/10.1145/2069276.2069280

1. INTRODUCTION

Preservation of heritage, and ensuring its accessibility over a prolonged period of time, has received a boost with digital multimedia technology. Tangible heritage resources like monuments, handicrafts, and sculpture can be preserved through digitization and 2D and 3D modeling techniques [Foni et al. 2010; Aliaga et al. 2011]. Preservation of intangible resources like language, art and culture, music and dance, is more complex and requires a knowledge-intensive approach. Such a living heritage lies with a set of people, who naturally become the custodians of this heritage by practising and passing on its legacy, often in an undocumented form. Thus, intangible cultural heritage is very fragile, and its preservation must capture the background knowledge that lies with its exponents, such as dancers, musicians, poets, writers, historians, and the communities at large.

India has a rich cultural heritage, where music and dance have been interwoven into the social fabric. Indian classical music and dance portray human emotions, love and devotion, narrate stories from myths and religious scriptures, and are integral parts of the celebration of life. A list pertaining to intangible cultural heritage compiled by UNESCO [UNESCO-HeritageList 2010] includes many Indian dance, theatre, and music forms. We present an ontology-based approach to the preservation of intangible cultural heritage with a case study in the domain of Indian classical dance (ICD).

Indian classical dance is an ancient heritage, more than 5000 years old. Depictions of many forms of this dance can be seen in dance postures in the form of sculptures on the walls of ancient Indian temples and monuments. Contemporary dancers still learn from these sculptures and keep them alive in their dance performances. An ancient Indian scripture called Natya Shastra,1 which is almost 2000 years old and is probably the oldest surviving text on the subject, provides a detailed listing of the grammar and rules covering music, dance, makeup, stage design, and virtually every aspect of stagecraft. The dance forms whose theory and practices can be traced back to the Natya Shastra are known as Indian classical dances. With the passage of time, these dance forms have been interpreted and performed by different artists in different ways and have been associated with a rich set of body postures and gestures, a grammar for the performance, mythological stories, sculptural depictions, and various other cultural artifacts. Thus, these dance forms embody a correlated collection of knowledge sources which can be presented through a variety of manifestations in digital form, such as digitized scriptures, digital close-ups of the dance postures and gestures, and video recordings of performances. Preservation of this intangible knowledge is assisted by the availability and integrity of these varied sources. Although large multimedia repositories of ICD heritage resources are available [Kalasampada; ASI-India], there are none that correlate the digital resources with the traditional knowledge.

The digital medium offers an attractive means for preserving and presenting tangible, intangible, and natural heritage artifacts. It not only preserves a digital representation of heritage objects against the passage of time, but also creates unique opportunities to present the objects in different ways, such as virtual walkthroughs and time-lapse techniques over the Internet. In recent times, the economics of computing and networking resources has created an opportunity for large-scale digitization of heritage artifacts for their broader dissemination over the Internet and for their preservation. Several renowned museums and cultural heritage groups have put up virtual galleries in cyberspace [LouvreMuseum; EuropeanArt; Kalasampada; ASI-India] to reach out to a global audience. Currently, most of the presentations available on these portals are hand-crafted and static. With the increase in the size of collections on the web, correlating different heritage artifacts and accessing the desired ones in a specific use-case context creates a significant cognitive load on the user. Of late, there has been some research effort to facilitate semantic access in large heritage collections. Several research groups [Hunter 2003; Hammiche 2004; Tsinaraki 2005; Petridis 2005] have proposed the use of an ontology in the semantic interpretation of multimedia data in collaborative multimedia and digital library projects. While an ontology is a useful tool for modeling a conceptual domain, it has not been designed to model multimedia data, which is perceptual in nature. The ontology and the metadata schemes are tightly coupled in these approaches, which necessitates creation of a central metadata scheme for the entire collection and prevents integration of data from heritage collections developed in a decentralized manner.

This work was funded under the heritage project entitled Managing Intangible Cultural Assets through Ontological Interlinking of the Department of Science and Technology of the Government of India.
Authors' addresses: A. Mallik, Multimedia Lab, Room 405, Block II, IIT Delhi, New Delhi-110016, India; email: [email protected]; S. Chaudhury, Multimedia Lab, Room 405, Block II, IIT Delhi, New Delhi-110016, India; email: [email protected]; H. Ghosh, TCS Innovation Labs Delhi, 249 D&E Udyog Vihar Phase IV, Gurgaon 122016, India; email: [email protected].
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected].
© 2011 ACM 1556-4673/2011/12-ART11 $10.00
DOI 10.1145/2069276.2069280 http://doi.acm.org/10.1145/2069276.2069280

ACM Journal on Computing and Cultural Heritage, Vol. 4, No. 3, Article 11, Publication date: December 2011.

1 The Natya Shastra is an ancient Indian treatise on the performing arts, encompassing theatre, dance, and music. The Natya Shastra is the foundation of the fine arts in India. (http://en.wikipedia.org/wiki/Natya_Shastra)


The significance of a heritage artifact is implied in its contextual information. Thus, the scope of digital preservation extends to preservation of the knowledge that puts artifacts in proper perspective. Establishing such contexts enables relating different heritage artifacts, to create interactive tools for exploring a digital heritage collection. A major activity in any digital heritage project is to create a domain ontology [Stasinopoulou et al. 2007; Hernendez 2007; Aliaga et al. 2011]. However, most digital heritage artifacts are recorded in multimedia format, and traditional ontology representation schemes need multimedia records to be annotated for semantic processing. Such annotation is a labor-intensive process and a major bottleneck in creating a digital heritage collection. In this context, we present a different approach wherein the domain ontology, enriched with multimedia data and carrying probabilistic associations between concepts, provides a means to curate a heritage collection by generating semiautomated annotations for the digital artifacts.

The key contribution of the work presented in this article is to utilize computing methods to preserve the living heritage of a cultural domain like Indian classical dance. This is made possible by providing ways to encode the highly specialized background knowledge of the domain in an ontology, and further providing methods to correlate this knowledge with the audio-visual recordings and other digital artifacts of ICD in a seamless fashion. Our ontology-based framework provides a conceptual linkage between the heritage resources at the knowledge level and multimedia data at the feature level. One of the key ingredients in our architecture is a cultural heritage ontology [Mallik and Chaudhury 2009] encoded in a novel multimedia ontology representation [Ghosh et al. 2007]. The ontology includes descriptions of domain concepts in terms of expected audio-visual features in multimedia manifestations, making it especially suitable for semantic interpretation of multimedia data. We have experimented with a heritage collection of ICD videos. Starting with a hand-crafted basic ontology for the ICD domain, we create a multimedia-enriched ontology for the domain by using a training set of video segments labelled by domain experts. The heritage-specific knowledge of the domain is revalidated by applying machine-learning techniques to fine-tune the ontology parameters. Once a multimedia-enriched ontology is available, it can be used to interpret the media features extracted from a larger collection of videos, to classify the video segments into different semantic groups, and to generate semantic annotations for them. The annotations enable creation of a semantic navigation environment in the cultural heritage repository. Our knowledge-based system for preserving ICD heritage is called Nrityakosha.2 It offers information technology-based means to safeguard the ancient tradition of a heritage domain like ICD and make it accessible to future generations.

The rest of the article is organized as follows. Section 2 gives an overview of our ontology-based framework. Section 3 gives a brief introduction to the domain of Indian classical dance. Section 4 explains the requirement of a multimedia ontology for preserving digital heritage artifacts and the advantage of using MOWL to build the multimedia ontology of the ICD domain. In Section 5, the annotation-generation framework is detailed along with its main module, concept recognition. Section 6 gives a brief idea of the ontology-based conceptual video browsing application. Section 7 concludes the article with a summary of our findings and directions for future work.

2. ONTOLOGY-BASED FRAMEWORK

Digital heritage resources include digital replicas of the intangible heritage exhibits, such as videos or still images of dance forms, as well as the contextual knowledge relating to the exhibits contributed by domain experts. Our architectural framework is motivated by the need for relating the digital objects with contextual knowledge, to make the former more usable. The goal behind building such a framework is to give a novel multimedia experience to the user seeking to retrieve digital resources

    2Sanskrit word Nritya means dance and Kosha means treasure, so Nrityakosha means a treasure of dance.


    Fig. 1. Architecture for ontology-based management of a digital heritage collection.

belonging to a heritage collection. The heritage collection may be comprised of independently built collections of text documents, image documents, video files, music data, and scanned images pertaining to different concepts in the heritage domain. Our framework, with its unique concept-recognition faculty, has the capability of correlating the digital resources in different collections which pertain to the same concept in the ontology. The user, given an interface to browse the high-level concepts of the heritage collection via an ontology, selects a concept, say a poem from the musical epic Geeta Govinda.3 Performing a cross-modal retrieval with the help of the domain ontology, the retrieval system can provide the following:

- lyrics of a poem from the epic: a text document;
- hand gestures and body postures corresponding to the words in the poem: images;
- a dance performed on the poem: a video;
- a sculpture from the walls of a temple depicting the poem: an image;
- a song composed using the lyrics of the poem: music.
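The cross-modal retrieval sketched above can be illustrated with a minimal concept index that groups artifacts of different media types under a shared ontology concept. This is a toy sketch, not the authors' implementation; the concept name and file names are hypothetical.

```python
# Minimal sketch of a cross-modal index: artifacts of any media type
# are registered under the ontology concept they manifest, so one
# concept lookup returns related text, images, video, and music.
from collections import defaultdict

class CrossModalIndex:
    def __init__(self):
        # concept -> list of (media_type, artifact_id)
        self._index = defaultdict(list)

    def annotate(self, concept, media_type, artifact_id):
        """Record that an artifact manifests the given concept."""
        self._index[concept].append((media_type, artifact_id))

    def retrieve(self, concept):
        """Return all artifacts, across modalities, linked to a concept."""
        return list(self._index[concept])

index = CrossModalIndex()
index.annotate("GeetaGovinda_Poem12", "text",  "lyrics_poem12.txt")
index.annotate("GeetaGovinda_Poem12", "image", "hand_gesture_042.jpg")
index.annotate("GeetaGovinda_Poem12", "video", "odissi_performance_07.mp4")
index.annotate("GeetaGovinda_Poem12", "image", "temple_sculpture_19.jpg")
index.annotate("GeetaGovinda_Poem12", "audio", "song_poem12.mp3")

for media_type, artifact in index.retrieve("GeetaGovinda_Poem12"):
    print(media_type, artifact)
```

In the actual framework the mapping from artifact to concept is produced by ontology-based concept recognition rather than manual registration.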

Thus, by correlating and presenting different heritage artifacts and providing access to the desired ones, we shift the cognitive load from the user to our system, in the process enhancing the user's multimedia experience. Figure 1 shows the architecture of our ontology-based framework for managing a digital collection of intangible heritage artifacts. There are two major tasks in the framework.

(1) Ontology Creation. To begin with, a basic ontology for the domain is hand-crafted by a group of domain experts using an ontology-editing tool like Protege. This includes the domain concepts,

3 Jayadeva's Geetagovinda, composed in the twelfth century, is an Indian classic and a part of world cultural heritage. http://www.geetagovinda.org/Geetagovinda.html


their properties, and their relations. In general, the concepts follow a hierarchy. At the lowest level, some concepts can be detected using specific media detectors, for example, specific Bhangimas (body postures) in Indian classical dance. We call them media nodes in the ontology. At the higher level, there are abstract concepts that have domain-specific relations with each other and associate with the media nodes. Taking an example from the domain of Indian classical dance, an Odissi dance is generally accompanied by Odissi music, which has characteristic audio properties. Mangalacharan is a kind of Odissi dance, and thus its concept node inherits the audio-visual properties of the Odissi dance, including the audio properties of Odissi music (Figure 4).

The domain experts also provide conceptual annotations, using a manual annotation tool, to a training set of multimedia data. Low-level media features are extracted by feature extractors and are used for training the media feature classifiers for the media nodes in the ontology. The basic ontology considers all relations to be crisp. However, different instances of an abstract concept manifest in different sets of media nodes. Moreover, different instances of media nodes have different media characteristics. This motivates us to model the ontology with probabilistic associations [Ding et al. 2004].

In the Ontology Learning module, we compute the joint probability distributions of the concept and the media nodes and create the probabilistic associations. The multimedia ontology, with concepts that have media-based properties and probabilistic associations, needs a more evolved representation scheme. We use the MOWL language (detailed in Section 4) to represent the ontological concepts and the uncertainties inherent in their media-specific relations. MOWL encoding of the ontology allows us to construct a Bayesian network (BN) equivalent to it. The multimedia ontology thus created encodes the experts' perspective and needs adjustments to attune it to real-world data. Conceptual annotations help build the case data used for applying a machine-learning technique, called Full Bayesian Network (FBN) learning [Su and Zhang 2006], to refine the ontology.

Thus, ontology learning is done by learning the structure and parameters of the BN derived from the given ontology. While the concept nodes remain unchanged, some of the hand-crafted relations between the concepts may be found statistically insignificant and get deleted. This is described in more detail in Mallik et al. [2008]. Some new significant relations may be found and added to the ontology. The technique is applied periodically as new labelled multimedia data instances are added to the collection and the ontology is updated. This semi-automated maintenance of the ontology alleviates significant effort on the part of knowledge engineers.

(2) Annotation Generation and Conceptual Browsing. The multimedia ontology is used to automatically annotate new instances of media artifacts. The media features of the new videos are extracted, and a set of media detectors is used to detect the media nodes in the ontology. The MOWL ontology can then be used to recognize the abstract domain concepts using a probabilistic reasoning framework, which is detailed in Section 5.1. The concepts so recognized are used to annotate the multimedia artifacts. Automatic annotation of digital heritage artifacts enables quick ingestion of such artifacts into a digital heritage collection. Moreover, these annotations are used to create semantic hyperlinks in the video collection and to provide an effective video browsing interface to the user, which is detailed in Section 6.
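The probabilistic-association step above can be sketched in miniature. The paper uses full Bayesian network learning; the sketch below substitutes simple frequency counting with Laplace smoothing over expert-labelled segments, and the segment labels are invented for illustration.

```python
# Illustrative estimate of the association strength between a concept C
# and a media node M from labelled training segments: count how often M
# co-occurs with C versus without C, yielding (P(M|C), P(M|not C)).
# This is a simplification of the paper's FBN learning, not its method.

def association_strengths(segments, concept, media_node, alpha=1.0):
    """Return (P(M|C), P(M|not C)) with Laplace smoothing alpha."""
    c_m = c_not_m = notc_m = notc_not_m = 0
    for seg in segments:
        has_c = concept in seg["concepts"]
        has_m = media_node in seg["media_nodes"]
        if has_c and has_m:
            c_m += 1
        elif has_c:
            c_not_m += 1
        elif has_m:
            notc_m += 1
        else:
            notc_not_m += 1
    p_m_given_c = (c_m + alpha) / (c_m + c_not_m + 2 * alpha)
    p_m_given_not_c = (notc_m + alpha) / (notc_m + notc_not_m + 2 * alpha)
    return p_m_given_c, p_m_given_not_c

# Hypothetical expert-labelled training segments.
segments = [
    {"concepts": {"OdissiDance"}, "media_nodes": {"ChawkPosture"}},
    {"concepts": {"OdissiDance"}, "media_nodes": {"ChawkPosture", "OdissiAudio"}},
    {"concepts": {"OdissiDance"}, "media_nodes": set()},
    {"concepts": set(),           "media_nodes": {"OdissiAudio"}},
]
p, p_not = association_strengths(segments, "OdissiDance", "ChawkPosture")
print(p, p_not)
```

Such pairs of values are exactly what the MOWL encoding attaches to each concept/media-node edge (see Section 4.1).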

3. INDIAN CLASSICAL DANCE

We introduce the domain of Indian classical dance, which we have selected as a test bed for our heritage preservation scheme, at this stage of the article because it contains some special terms describing its concepts, and references to these in other parts of the article cannot be understood otherwise. Dance is a perceptual domain where both audio and visual components play an important part. Hence, video as a medium is best suited to capture its knowledge. Any classical dance is that subclass of dance that


Fig. 2. Hand gestures, facial expressions, and dance posture images from the ICD domain. With permission of Madhumita Raut (source: Odissi: What, Why and How: Evolution, Revival and Technique). Vanishka Chawla (dancer).

contains highly stylized body postures, hand gestures, and sequences;4 some from ICD are shown in Figure 2.

It is important to note here that hand gestures, body movements, and facial expressions of the dancer always express a natural-language keyword or phrase in Indian classical dance. Recognition of a hand gesture or facial expression in an image or a video can lead to the discovery of the semantic concept defined by the keyword(s) expressed (e.g., in Figure 2, the first hand gesture denotes a blooming lotus flower; the second gesture denotes the face of a deer; the first facial expression is of a sad mood, while the second one expresses happiness or fun). Thus, a spatio-temporal sequence of hand gestures and body postures can lead to the discovery of high-level concepts like a mood, the words of a poem, or a portion of a mythological story. However, an ontology is needed to understand the correlations and to build the spatio-temporal associations between the various dance steps (e.g., as shown in Figure 6).

While studying the domain, we came across various facts and facets of ICD, which we mention here, as they helped construct the domain ontology.
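The gesture-to-keyword idea above can be sketched as a small lookup-and-match routine. The gesture names, keyword lexicon, and concept pattern below are hypothetical stand-ins, not the paper's actual detectors or vocabulary.

```python
# Toy sketch: detected gestures/expressions map to natural-language
# keywords, and an ordered keyword pattern suggests a high-level
# concept. A real system would derive the patterns from the ontology.

GESTURE_KEYWORDS = {
    "alapadma_hand": "lotus flower",
    "mriga_shirsha_hand": "deer",
    "sad_expression": "sorrow",
    "smiling_expression": "joy",
}

CONCEPT_PATTERNS = {
    # a concept is suggested when its keywords appear in this order
    "ForestScene": ["deer", "lotus flower"],
}

def recognize_concepts(detected_gestures):
    keywords = [GESTURE_KEYWORDS[g] for g in detected_gestures
                if g in GESTURE_KEYWORDS]
    found = []
    for concept, pattern in CONCEPT_PATTERNS.items():
        # subsequence check: pattern keywords occur in order
        it = iter(keywords)
        if all(k in it for k in pattern):
            found.append(concept)
    return keywords, found

kws, concepts = recognize_concepts(
    ["mriga_shirsha_hand", "alapadma_hand", "smiling_expression"])
print(kws, concepts)
```

The ontology's role, as the text notes, is precisely to supply and constrain these keyword and sequence associations rather than hard-coding them.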

There are at least 8 to 9 different Indian classical dance forms belonging to different parts of India, some of which are Bharatnatyam, Odissi, Kuchipudi, and Kathak. Each differs in choreography, posture, hand movements, costumes, colors, language, and music, yet all follow some common basic tenets, as laid out in ancient Indian literature.

All the dance forms are performed to a classical music rendition, which can be described broadly in terms of Raag (melody) and Taal (beat). The two main subclasses of Indian classical music are Carnatic music and Hindustani music.

A unique aspect of Indian classical dance is that the performances are mainly devotional in content. The dance performances are based on stories and poems from Indian epics like Mahabharat, Ramayan, and GeetaGovinda, and on legends from Indian mythology. Thus the dancers are constantly portraying the roles of the deities described in the legends.

Much of the knowledge about the rules and grammar of Indian classical dance and music comes from the Natya Shastra. Another piece of literature, called the Abhinaya Darpan, written by Nandikeshvar and first published in 1874, is known as the Indian classical dance manual. It contains a vast amount of textual material for the study of the technique and grammar of body movements in Indian classical dance.

The sculpture found in ancient Indian monuments and temples has preserved a complete record of dance postures from ancient times. It reflects a deep interrelationship with Indian dance-drama.

4 Basic to the Odissi dance are the postures known as chawk and tribhangi: chawk is a quadrangular posture of the body created with the help of the shoulders, hands, knees, and legs; tribhangi is a triangular pose created by the dancer's hips, waist, and head. (From www.pagesofindia.com/culture/dances-in-india.htm.)


Fig. 3. Perceptual model. With permission of Madhumita Raut (source: Odissi: What, Why and How: Evolution, Revival and Technique). Vanishka Chawla (dancer).

With this background, we explain in the next section the need for a multimedia ontology for preserving the knowledge of this domain, along with its digital resources.

4. MULTIMEDIA ONTOLOGY FOR INDIAN CLASSICAL DANCE

Cultural heritage artifacts generally need to be digitized in audio-visual formats. Conventional ontology languages, such as OWL, use natural-language constructs to represent concepts and their relations, making them convenient for processing textual information. An attempt to use the conceptual model to interpret multimedia data gets severely impaired by the semantic gap that exists between the two. Current implementations of semantic web technology for cultural heritage repositories rely on accompanying annotations, which are manually created, contextually derived (e.g., camera parameters), or generated through media processing techniques. However, the concepts have their roots in the perceptual experience of human beings, and the apparent disconnect between the conceptual and the perceptual worlds is rather artificial. The key to semantic processing of media data lies in harmonizing the seemingly isolated conceptual and perceptual worlds.

Concepts are formed in human minds as a result of many perceptual observations of real-world objects and abstraction of the experience, and are labeled with linguistic constructs to facilitate communication in human society. When a natural-language construct is used to specify a concept, it gives rise to the expectation of some perceptible media properties. Observation of those media properties forms the basis of concept recognition in the real world as well as in multimedia recordings. For example, Figure 3 depicts the formation of the concept Pranam (a Sanskrit word meaning salutation) and the abstracted visual patterns that are expected in an embodiment of the concept in a multimedia artifact. Note that the different instances of the concept have significant variations, and the perceptual model needs to cope with the uncertainty.

Thus, we can form a model of the world where the presence of a concept causes some media patterns to manifest in the multimedia data instances with a definite probability. Detection of any such media pattern is evidence for the presence of the concept. A concept can be recognized on the


Fig. 4. Multimedia ontology snippet from the Indian classical dance domain. With permission of Madhumita Raut (source: Odissi: What, Why and How: Evolution, Revival and Technique). Vanishka Chawla (dancer).

basis of the accumulated evidential value resulting from the detection of a number of expected media patterns, with a closed-world assumption. The Multimedia Web Ontology Language [Ghosh et al. 2007] is based on this causal model of the world. The model, based on probabilistic cause-effect relations, motivates us to use Bayesian network-based evidential reasoning for concept recognition. Associating media examples with concepts in an ontology for multimedia data processing is described in Bertini et al. [2009]. However, it does not include a scheme for reasoning with the media data.

Figure 4 depicts a small section of an ontology encoding media-based descriptions of concepts. The ontology snippet is part of the ICD ontology and shows some of the concepts related to the Odissi dance form, which is a subclass of Indian classical dance. An Odissi dance performance is generally accompanied by an Odissi music score. This relation is shown by an isAccompaniedBy relation between the two nodes. The Odissi music concept in the ontology has an audio media pattern attached to it, which can be detected by an audio media detector. Similarly, visual properties of the folded-hand posture describe the concept of Pranam. The Mangalacharan dance contains a sequence called BhumiPranam. A unique posture of the dancer, called chawk, can be used to identify that the dance performance is of the Odissi dance form. Visual properties of these media patterns can be used to describe these concepts with some uncertainty. Individual media patterns are often connected by spatial or temporal relationships in the context of a concept or an event. Such spatio-temporal relations can be formally specified in MOWL, following the scheme proposed in Wattamwar and Ghosh [2008].

4.1 MOWL Language Constructs

The MOWL language has been designed as an extension of OWL to ensure compatibility with the W3C standards. It uses OWL constructs to define classes, individuals, and properties. In addition, it proposes some language extensions for the following:

Encoding media properties. We use these constructs to attach media observables such as media examples and media features (MPEG-7-based, such as color or edge histogram, or composite media


Fig. 5. MOWL relations. With permission of Madhumita Raut (source: Odissi: What, Why and How: Evolution, Revival and Technique).

patterns like postures and human actions) to concepts in our ontology. Figure 5(a) shows how the concept Pranam is associated with its media properties.

Specifying property propagation rules that are unique to media-based descriptions of concepts. For hierarchy relations, media properties propagate from superclass to subclass, and media examples propagate from subclass to superclass. There are relations in an ontology, different from hierarchy relations, which also permit the flow of media properties. For example, if Odissi dance is accompanied by Odissi music, then the former generally inherits the audio properties of the latter, although the relation isAccompaniedBy does not imply a concept hierarchy (see Figure 5(b)). To allow specification of such extended semantics, MOWL declares several subclasses of OWL:OBJECTPROPERTY.
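The propagation rule just described can be sketched as a small graph traversal. This is an assumption-laden toy, not the MOWL implementation: the relation names, concept names, and property labels are illustrative.

```python
# Sketch of MOWL-style media-property propagation: expected properties
# flow from superclass to subclass along isA, and along designated
# non-hierarchical relations such as isAccompaniedBy.

PROPAGATING_RELATIONS = {"isA", "isAccompaniedBy"}

# concept -> list of (relation, concept whose properties may flow in)
ONTOLOGY = {
    "OdissiDance":   [("isA", "IndianClassicalDance"),
                      ("isAccompaniedBy", "OdissiMusic")],
    "Mangalacharan": [("isA", "OdissiDance")],
}

DIRECT_PROPERTIES = {
    "IndianClassicalDance": {"stylized_posture_visuals"},
    "OdissiMusic": {"odissi_audio_signature"},
    "OdissiDance": {"chawk_posture_visuals"},
}

def expected_properties(concept, seen=None):
    """Collect direct plus inherited media properties for a concept."""
    seen = seen if seen is not None else set()
    if concept in seen:          # guard against cycles
        return set()
    seen.add(concept)
    props = set(DIRECT_PROPERTIES.get(concept, set()))
    for relation, source in ONTOLOGY.get(concept, []):
        if relation in PROPAGATING_RELATIONS:
            props |= expected_properties(source, seen)
    return props

print(sorted(expected_properties("Mangalacharan")))
```

Note how Mangalacharan picks up the audio signature of Odissi music through isAccompaniedBy even though the two concepts are not in a subclass relation, which is the extended semantics the text describes.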

Specifying spatio-temporal relations. Many concepts observed in videos are characterized by specific spatial or temporal arrangements of component entities. The occurrence of such a relation is termed an event, for example, a goal event in soccer or a dance step in ICD. A multimedia ontology should be able to specify such concepts in terms of spatial/temporal relations between the components. MOWL defines a new property, MOWL:SPATIOTEMPORALPROPERTY, as a subclass of OWL:OBJECTPROPERTY to instantiate specific spatio-temporal relations.

Classical dance sequences are choreographed sequences of short dance steps, which are further choreographed as temporal sequences of classical dance postures performed by the dancer. We illustrate the spatio-temporal definition in MOWL of a dance step labeled PranamDanceStep in the ICD domain in Figure 6.
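A followedBy chain like the PranamDanceStep of Figure 6 can be sketched as an ordering check over timestamped detections. The timestamps and detector labels are invented for illustration; MOWL specifies such events declaratively rather than procedurally.

```python
# Toy check for a temporal event: the component observations of
# PranamDanceStep must occur in followedBy (increasing-time) order.

# PranamDanceStep := folding_of_hands followedBy closing_of_eyes
#                    followedBy bowing_of_head
EVENT_SEQUENCE = ["folding_of_hands", "closing_of_eyes", "bowing_of_head"]

def event_occurs(detections, sequence):
    """detections: list of (timestamp, label). True if the labels of
    `sequence` occur as a subsequence in temporal order."""
    ordered = sorted(detections)           # sort by timestamp
    labels = iter(label for _, label in ordered)
    return all(step in labels for step in sequence)

detections = [(2.0, "closing_of_eyes"), (0.5, "folding_of_hands"),
              (3.1, "bowing_of_head"), (1.2, "arm_extension")]
print(event_occurs(detections, EVENT_SEQUENCE))
```

Unrelated detections (here, arm_extension) are tolerated between the steps, mirroring the observation-graph construction in Figure 6 where each followedBy pair forms its own event entity node.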

Specification of probabilistic associations between concepts and their media properties. The uncertainty in a domain can be characterized as a joint probability distribution over all the domain classes and individuals, which can then be factored into several components, each a local distribution involving fewer variables. The uncertainty of a variable can be specified as a conditional probability distribution over the set of variables which are its direct causes. Uncertainty is encoded by specifying the causal strength of an association between concepts, or between a concept and a media property specification, as a pair of probability values P(M | C) and P(M | ¬C), as shown in Figure 7, where C


Fig. 6. Part (a) shows the construction of the observation graph for an event related to the Pranam concept from the ICD ontology. The smallest event unit is the temporal relation followedBy. Hence we make an event entity node Folding of hands followedBy (Closing of eyes followedBy Bowing of head). The set of observation nodes connected to this event entity node is virtual evidence for the observation node Folding of hands and the event entity node Closing of eyes followedBy Bowing of head. Likewise, for the second temporal relation, followedBy, we have the two observation nodes, Closing of eyes and Bowing of head, and the followedBy event. Part (b) shows the observation graph when the observable media objects (images of postures) are also specified in the ontology. The observation nodes are shown linked to the corresponding concept nodes. With permission of Vanshika Chawla (dancer).

    [Figure: an observation model rooted at the concept node Mangalacharan (posterior 0.97), connected through relations such as isPartOf, contains, isAccompaniedBy, and isDancedBy to the concept nodes Odissi (0.99), BhumiPranam (0.84), Chawk (0.99), Pranam (0.95), OdissiMusic (0.34), MadhumitaRaut (0.05), and PranamDanceStep (0.39); media nodes ChawkPosture, PranamPosture, OdissiAudio, and MadhuFace are attached via hasMF links annotated with pairs of probability values.]

    Fig. 7. Concept recognition in an observation model.

    is a concept and M represents an associated concept or an associated media pattern. Thus MOWL supports probabilistic reasoning with Bayesian networks, in contrast to the crisp description-logic-based reasoning of traditional ontology languages.
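The role of the pair of probability values attached to each association can be sketched with a single Bayes update. This is an illustrative sketch; the prior and the numbers below are ours, and the real system propagates beliefs through a full Bayesian network rather than a single link.

```python
def posterior(prior_c, p_m_given_c, p_m_given_not_c):
    """P(C | M) from a prior P(C) and the association pair P(M|C), P(M|not C)."""
    num = p_m_given_c * prior_c
    den = num + p_m_given_not_c * (1.0 - prior_c)
    return num / den

# A hasMF pair such as (0.90, 0.00) makes the media pattern decisive evidence:
# observing it drives belief in the concept to 1.0 even from a moderate prior.
print(round(posterior(0.5, 0.90, 0.00), 2))  # 1.0
```

The second value of the pair matters as much as the first: a pattern that is also common when the concept is absent (high P(M | ¬C)) raises the posterior only slightly.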

    4.2 Reasoning in MOWL

    There are two distinct stages of reasoning with MOWL for concept recognition. The first stage involves identifying a set of media properties that are likely to manifest when a concept is present in a multimedia instance. In general, the set can include the media properties of the concept as well as those of some related concepts in the domain. Referring to the ICD example mentioned earlier, an Odissi dance is generally accompanied by Odissi music. Thus, the audio characteristics of Odissi music are likely to manifest in a video recording of Odissi dance. This form of media property inheritance is quite


    distinct from the property inheritance rules supported by existing ontology models. MOWL constructs include typing of the relations between concepts and define media property inheritance rules.

    Once the expected media properties for a concept are obtained, they, together with the concepts

    involved, are organized as a connected graph, called an observation model (OM) for the concept. The OM is, in effect, a specification of the concept in terms of searchable media patterns. It is modeled as a Bayesian network. The joint probability distribution tables that signify the causal strength (significance) of the different media properties towards recognizing the concept are computed from the probabilistic associations specified in the ontology. Figure 7 depicts a typical OM for the concept Mangalacharan, which is an invocation dance in the Odissi dance form. The media nodes are shown connected to their respective abstract concepts with a hasMF (hasMediaFeature) relation, a MOWL relation described in Section 4. The OM, derived from the associations in the ontology (part of which is shown in Figure 4), shows how the concept node Mangalacharan is specified in terms of the following:

    - PranamDanceStep, a composite concept (from the related concept BhumiPranam);
    - PranamPosture, a folded-hands posture (from the related concept Pranam);
    - ChawkPosture (from the parent node Odissi dance); and
    - OdissiMusic (from the parent node Odissi dance), and so on.

    The PranamDanceStep concept in the OM, shown as an ellipse with a double boundary, is a composite concept. It can be described in terms of other concepts along with the spatio-temporal relations between them. Figure 6 shows the construction of the observation graph for the spatio-temporal event PranamDanceStep, related to the Pranam concept.

    Once an OM for a semantic concept is created, the presence of the expected media patterns can be

    detected in a multimedia artifact using appropriate media-detector tools. Such observations lead to the instantiation of some of the media nodes, which in turn results in belief propagation in the Bayesian network. The posterior probability of the concept node resulting from such belief propagation represents the degree of belief in the concept.
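Treating an OM whose media observations are conditionally independent given the root concept as a naive Bayes model gives a minimal sketch of this belief update. This is a simplification of full Bayesian-network propagation, and the prior and probability pairs below are illustrative.

```python
def om_posterior(prior, observations):
    """Posterior belief in a concept after observing several media patterns.

    observations: list of (p_m_given_c, p_m_given_not_c) pairs, one per
    detected media pattern, combined under a naive independence assumption.
    """
    like_c = like_not_c = 1.0
    for p_c, p_nc in observations:
        like_c *= p_c
        like_not_c *= p_nc
    num = like_c * prior
    return num / (num + like_not_c * (1.0 - prior))

# Two detected media patterns with pairs like those annotated in Figure 7:
belief = om_posterior(0.3, [(0.89, 0.01), (0.85, 0.20)])
print(round(belief, 3))
```

Even from a low prior, two patterns that are much more likely under the concept than under its absence push the posterior close to 1, which matches the high posterior values shown at the concept nodes of Figure 7.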

    4.3 Nrityakosha Compilation

    The ICD heritage collection for our Nrityakosha system was compiled by gathering dance videos from different sources. These include a highly specialized collection called Symphony Celestial Gold Edition, purchased from Invis Multimedia,5 which contains videos of classical dance performances by eminent Indian artists. Another set of high-quality dance performance videos was obtained from the Doordarshan Archives of India.6 Many dance DVDs were donated for research purposes by well-known artists of ICD. The videos contain dance and music performances, training and tutorials on different dance forms, as well as many interviews and talks on ICD. We started work with a data set of approximately 20 hours of dance videos, consisting mainly of performances of two Indian classical dance forms: Bharatnatyam and Odissi. The ICD ontology was constructed by encoding specialized knowledge gathered from domain experts, as well as from dance manuals like the Natya Shastra and Abhinaya Darpan. The ontology is written in MOWL. The experts provided conceptual labels for the content of about 30% of the videos, which were then used as a training set to fine-tune the ICD ontology. Our ICD ontology contains around 500 concepts related to Indian dance and music, of which about 260 have media-observable patterns (features/examples) associated with them.

    5 http://www.invismultimedia.com.
    6 http://www.ddindia.gov.in/About DD/Programme Archives.



    [Figure: a graph of ontology nodes rooted at ICDConcept, with subclasses including Music, DanceForm, Artist, Composition, Taal, Posture, BodyMovement, HandGesture, DanceStep, Role, Story, and DanceEvent; subclass (isa) links such as HindustaniMusic and CarnaticMusic under Music, Bharatnatyam and Odissi under DanceForm, MisrachapuTaal under Taal, and KrishnaStory and KrishnaYashodaStory under Story; property links such as hasTaal, hasMusic, hasRoles, belongsTo, hasDanceStep, hasPosture, performedBy, and is_of_music; instance (io) links for KrishnaYashodaPerformance_29, NaughtyKrishna, and VanshikaChawla; and media nodes VanshikaFace, MisrachapuAudio, KrishnaDanceAction, and KrishnaFlutePosture attached via hasMediaFeature links.]

    Fig. 8. A snippet from the MOWL ontology of Indian classical dance.

    Figure 8 shows a snippet of the Indian Classical Dance ontology in MOWL, represented graphically. The root node represents an ICD concept, whose subclasses are shown as Music, DanceForm, Artist, and Composition. This snippet focuses on an important mythological figure: an Indian deity named Krishna. Stories about Krishna abound in folklore, and all the classical dances of India have performances dedicated to events in his life. One of the events depicted here is the enactment of a scene between Krishna and his mother Yashoda through a performance in the Bharatnatyam dance form. Linkages and dependencies between a Story, a Role, a DancePerformance, a DanceForm, a Dancer, and various Body Movements are encoded in the MOWL ontology.

    The pink-colored leaf nodes in elliptical shape denote media nodes, which can be associated with various media classifiers, such as a posture recognizer, a face recognizer, and so on. The hierarchical relations in the ontology are denoted by black-colored edges labeled isa. All other relations, which denote media-property propagation between nodes, are shown as blue-colored, dotted edges. Nodes that are red-lined are individuals, or instances, of the respective class to which they are connected with a


    Fig. 9. Annotation tool architecture.

    red-colored io (instance-of) link. This graphical representation of the ICD ontology represents a Bayesian network. The conditional probability values are not shown here in order to preserve the visual clarity of the diagram.

    To recognize concepts in a new video of ICD, evidence is gathered at the leaf nodes as different

    media features are recognized or classified in the video by the media classifiers. If the evidence at a node is above a threshold, the media feature node is instantiated. In this snippet, the media nodes that have been instantiated are MisrachapuAudio, KrishnaDanceAction, and KrishnaFlutePosture. These instantiations result in belief propagation in the Bayesian network, and the posterior probability at the associated concept nodes is computed. The nodes in gray that are linked directly to the instantiated media nodes are MisrachapuTaal, KrishnaDanceStep, and KrishnaPose. After belief propagation, these nodes have high posterior probability. As they get instantiated, we find a high level of belief in the existence of other high-level abstract nodes (in cyan): CarnaticMusic, and hence Bharatnatyam; NaughtyKrishna, and hence KrishnaRole; KrishnaRole, hence KrishnaYashodaStory and KrishnaStory; KrishnaRole and Bharatnatyam, hence KrishnaYashodaDance. Thus we surmise that the video is of a Bharatnatyam dance performed to the accompaniment of Carnatic music, on the theme of a Krishna-Yashoda story, in which the dancer performed the Naughty Krishna role.

    5. SEMANTIC ANNOTATION OF THE HERITAGE ARTIFACTS

    An important contribution of our framework is the attachment of conceptual annotations to multimedia data belonging to the heritage domain, thus preserving the background knowledge and enhancing the usability of this data through digital access. This annotation is akin to curating or authoring a domain-specific collection, where a curator selects possible domain concepts for annotating a media file and the system helps by providing evidence for the presence or absence of the concept(s) and suggesting related concepts based on media content analysis and ontological relations. Figure 9 shows the architecture of the annotation generation framework. It consists of five functional components. The basis of this whole framework is the MOWL ontology, created from domain knowledge, enriched with multimedia data, and then refined by learning from annotated examples of the domain.



    The most important component of this module is the concept recognizer. The task of this module is to recognize high-level semantic concepts in multimedia data with the help of low-level, media-based features. The belief propagation in the BN which leads to the recognition of high-level semantic concepts is explained in Section 5.1. OMs for the high-level concepts selected by the curator of the collection are generated from the MOWL ontology by the MOWL parser and given as input to this module. Low-level media features (SIFT features, spatio-temporal interest points, MFCC features, etc., detailed later in this section) are extracted from the digital artifacts, which can be in different formats (image, audio, video), and provided to the concept recognizer by the media feature extractor. Media pattern classifiers, trained with feature vectors extracted from the training set of multimedia data, help detect the media patterns (postures, actions, music, and so on) in the digital artifacts. Some of these classifiers are detailed in Section 5.1.

    In the initial stages of building the collection, data is labeled with the help of manual annotations provided by the domain experts. The XML-based annotation generator is responsible for generating the annotation files in XML. The inputs to this module are the conceptual annotations (manual, supervised, and automatic) and the features of the multimedia data (in MPEG-7 format); the output is an XML file conforming to the MPEG-7 standard (an example snippet is shown in Figure 13(b)).
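A minimal sketch of what such an XML annotation generator might emit follows. The element and attribute names below are illustrative placeholders of our own, not the actual MPEG-7 schema used by the system.

```python
import xml.etree.ElementTree as ET

def annotation_xml(video_id, concept, start_frame, end_frame, probability=1.0):
    """Emit a conceptual annotation for a spatio-temporal region of a video.

    Element names (Annotation, Location, Concept) are hypothetical stand-ins
    for the MPEG-7 description scheme used in the actual system.
    """
    root = ET.Element("Annotation", {"video": video_id})
    loc = ET.SubElement(root, "Location")
    ET.SubElement(loc, "StartFrame").text = str(start_frame)
    ET.SubElement(loc, "EndFrame").text = str(end_frame)
    concept_el = ET.SubElement(root, "Concept",
                               {"probability": f"{probability:.2f}"})
    concept_el.text = concept
    return ET.tostring(root, encoding="unicode")

print(annotation_xml("odissi_01", "Mangalacharan", 120, 540, 0.97))
```

Each annotation thus records the three things the framework needs to preserve: the ontology concept, the spatio-temporal location it refers to, and the recognizer's confidence.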

    5.1 Concept Recognition

    Preserving heritage resources belonging to the performing arts in a digital form that gives a holistic view to the audience requires a solution to the problem of recognizing elementary domain concepts, like postures, actions, and audio, in video sequences. We have tackled this problem with the help of various media detectors, details of which are given below. We follow a common representation scheme for the different media patterns that are to be classified in concept recognition. Although the low-level media features differ for the various media detectors, the basic steps to build the feature vectors for the classifiers are as follows.

    - Collect labeled examples of media patterns (images or video snippets) from different dance videos.
    - Extract descriptors for all the examples, and quantize the descriptors with the K-means clustering algorithm to obtain a discrete set of Nf local feature words.

    - An example media pattern Ei is represented by an indicator vector iv(Ei), which is a histogram of its constituent descriptor words,

    iv(Ei) = {n(Ei, f1), . . . , n(Ei, fj), . . . , n(Ei, fNf)}    (1)

    where n(Ei, fj) is the number of local descriptors in image Ei quantized into feature word fj through some similarity computation.

    - Train a support vector machine (SVM) classifier [Burges 1998] with the indicator vectors to classify the media patterns.
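The quantization and histogram steps of Equation (1) can be sketched as follows. Nearest-centroid assignment stands in for the "similarity computation," and the two-word vocabulary and toy descriptors are illustrative only.

```python
def quantize(descriptor, feature_words):
    """Index of the nearest feature word (cluster centroid) to a descriptor."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(feature_words)),
               key=lambda j: dist2(descriptor, feature_words[j]))

def indicator_vector(descriptors, feature_words):
    """Histogram n(E_i, f_j) of descriptor-to-word assignments, as in Eq. (1)."""
    iv = [0] * len(feature_words)
    for d in descriptors:
        iv[quantize(d, feature_words)] += 1
    return iv

words = [(0.0, 0.0), (1.0, 1.0)]             # toy two-word vocabulary
descs = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.9)]  # toy local descriptors
print(indicator_vector(descs, words))         # [1, 2]
```

The resulting fixed-length histograms are what the SVM is trained on, regardless of whether the underlying descriptors are SIFT keypoints, spatio-temporal interest points, or MFCC-based audio terms.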

    5.1.1 Dance Action Categorization. We have used spatio-temporal interest points [Niebles et al. 2008] for detecting human action categories for ICD dance actions. Spatio-temporal interest points are extracted from the frames of a video. The extracted interest points are used in the bag-of-words approach to summarize the videos in the form of spatio-temporal words. The results of dance action categorization experiments performed for some recognizable Odissi dance actions are shown in Table I. Approximately 30 video shots of each action were submitted for training. We use the Weka machine-learning platform (www.cs.waikato.ac.nz/ml) to train the SVM classifiers and then perform 10-fold cross-validation tests on 77 videos to test the classification of the various dance actions. The accuracy of classification was found to be approximately 88.8% on average, with cluster sizes of 50, 100, 200, and 500 across the various tests.
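The 10-fold cross-validation protocol used throughout these experiments can be sketched as below. The experiments themselves used SVM classifiers in Weka; the fixed-threshold toy classifier and toy data here are stand-ins for illustration.

```python
def k_fold_accuracy(samples, labels, train_fn, predict_fn, k=10):
    """Average held-out accuracy over k contiguous folds."""
    n = len(samples)
    accs = []
    for i in range(k):
        lo, hi = i * n // k, (i + 1) * n // k
        train_x = samples[:lo] + samples[hi:]   # all samples outside the fold
        train_y = labels[:lo] + labels[hi:]
        model = train_fn(train_x, train_y)
        fold = list(zip(samples[lo:hi], labels[lo:hi]))
        if not fold:
            continue
        correct = sum(predict_fn(model, x) == y for x, y in fold)
        accs.append(correct / len(fold))
    return sum(accs) / len(accs)

# Toy data that a fixed-threshold "classifier" separates perfectly:
xs = list(range(20))
ys = [0 if x < 10 else 1 for x in xs]
train = lambda X, y: None                 # toy model: nothing to fit
pred = lambda model, x: 0 if x < 10 else 1
print(k_fold_accuracy(xs, ys, train, pred))  # 1.0
```

In the actual experiments the fold accuracies would vary per class, which is what the per-class TP/FP rates in Table I summarize.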


    [Figure panels: Ardhacheera, BhumiPranam, Manjira, Veena]

    Fig. 10. Examples of Odissi dance actions with spatio-temporal features. With permission of Madhumita Raut (source: Odissi: What, Why and How - Evolution, Revival and Technique).

    Table I. SVM Classification Results for Odissi Dance Actions

    Classes             TP Rate   FP Rate   Precision   Recall   F-Score   ROC Area
    Ardhacheera         0.833     0.011     0.909       0.833    0.87      0.911
    Bhasa               1         0.024     0.875       1        0.933     0.988
    BhumiPranam         0.846     0.047     0.733       0.846    0.786     0.9
    Goithi              1         0.011     0.917       1        0.957     0.994
    JagranNritya        1         0         1           1        1         1
    Manjira             0.846     0         1           0.846    0.917     0.923
    TribhangiBhramari   0.75      0.011     0.857       0.75     0.8       0.87
    Veena               0.917     0         1           0.917    0.957     0.958
    VipritBhramari      0.75      0.022     0.75        0.75     0.75      0.864

    5.1.2 Dance Posture Recognition. We have used the scale-invariant feature transform (SIFT) approach [Lowe 2003] to recognize dance postures in still images taken from dance videos. This approach transforms image data into scale-invariant coordinates relative to local features. Using the representation scheme detailed above, an SVM classifier is trained with indicator vectors to classify the postures. We extracted about 628 images of various Bharatnatyam dance postures (by 13 different dancers) and about 288 frames depicting various Odissi dance postures (by 6 dancers) from our set of ICD videos. These postures were labeled by the experts. The Bharatnatyam postures were classified into 16 classes and the Odissi postures into 12 classes. Some of the Odissi postures with their SIFT features are shown in Figure 11. A 10-fold cross-validation using an SVM classifier (linear kernel, cost factor of 2.0) on the Weka machine-learning framework yielded an accuracy of 92.78%; per-class results are given in Table II.

    5.1.3 Music Form Classification. An important component of Indian classical dance is the music accompanying the dance performance. For our experiments on the classification of music forms, we selected 160 video clips, of duration 45 seconds to 2 minutes, from different dance forms, spanning five different music categories. Using the concept of audio terms discovered in an audio file, each audio file was represented by a feature vector composed of an aural feature vector, a Mel-frequency cepstral coefficient (MFCC) [Logan 2000] feature vector, and a combination of both. Many tests with 10-fold cross-validation using the SVM classifier (linear kernel, cost factor of 2.0) on the Weka machine-learning framework were conducted. The results are shown in Figure 12 via a comparative study of the various tests done using different numbers of clusters, and a confusion matrix for one of the tests. The accuracy of classification was high, averaging around 95%.



    [Figure panels: Alasa (indolent, languid); Chawk (square stance); Darpani (looking into a mirror); Mardala (playing the 'mardala' instrument); Manjira (playing 'manjira')]

    Fig. 11. Odissi dance postures with SIFT features. With permission of Madhumita Raut (source: Odissi: What, Why and How - Evolution, Revival and Technique).

    Table II. SVM Classification Results for Odissi Dance Postures

    Classes          TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area
    Alasa            1         0.011     0.9         1        0.947       0.994
    Ardhacheera      1         0         1           1        1           1
    Chawk            0.955     0.013     0.955       0.955    0.955       0.971
    DarpaniLeft      1         0         1           1        1           1
    DarpaniRight     0.833     0         1           0.833    0.909       0.917
    Flute            0.5       0         1           0.5      0.667       0.75
    Manjira          0.875     0.011     0.875       0.875    0.875       0.932
    Mardala          0.833     0.011     0.833       0.833    0.833       0.833
    Tribhangi        1         0         1           1        1           1
    TribhangiRight   1         0.037     0.842       1        0.914       0.981

    [Figure: (a) a chart of classification accuracy for audio features; (b) a near-diagonal confusion matrix of the combined vectors with 50 clusters, over the classes Carnatic, Hindustani, Manipuri, Odissi, and Satriya, with diagonal entries including 1.00, 0.9375, and 0.8750.]

    Fig. 12. Experimental results of music form classification.

    To illustrate concept recognition using these media classifiers, we refer to the OM for the concept Mangalacharan detailed in Figure 7. Here, the BN corresponding to the OM is shown after some media patterns were detected in an Odissi dance video and the corresponding media nodes were instantiated. The concept Odissi Dance is related to the root node Mangalacharan as its parent. Basic to Odissi dance are the two postures known as chawk and tribhangi. One of the famous Odissi dancers is


    Madhumita Raut. Mangalacharan includes the dance action BhumiPranam, which contains the Pranam posture (Figure 3).

    Media nodes are shown as ellipses in a light pink color and are attached to their associated concept nodes by pink-colored links labeled hasMF (hasMediaFeature). The pair of values at each link denotes the probabilities P(M | C) and P(M | ¬C), where C is a concept and M represents an associated concept or media pattern. The OM is constructed from the ICD ontology refined with FBN learning, so the probability values shown correspond to real-world data. The bracketed value beside the name of each node denotes its posterior probability after the media nodes have been instantiated and belief propagation has taken place in the BN.

    For a new video, media features are extracted and media patterns detected to initiate concept recognition. Concept recognition occurs with belief propagation in the BN representing the OM. In this video, the media patterns detected are ChawkPosture and PranamPosture, shown as dark pink ellipses. The concept nodes Chawk and Pranam, highlighted in gray, are the low-level concepts that are recognized due to the presence of these media patterns in the data. Due to belief propagation in the BN, higher-level concept nodes (in cyan) are recognized as present in the video. The presence of the Chawk concept causes Odissi Dance to be recognized. The presence of Pranam and BhumiPranam leads to the recognition of the Mangalacharan concept, which is further confirmed by the recognition of the Odissi Dance concept in the video. Conceptual annotations are generated and attached to the video through the annotation generation framework detailed earlier in this section.

    5.2 Video Annotation

    Here we illustrate an application based on the annotation generation framework described above. This is a video annotation tool that allows the curator of a heritage video collection to attach conceptual annotations to the videos in the collection. The spatio-temporal character of video data means that an annotation can be associated with the video content at any level of spatio-temporal detail. A comprehensive annotation scheme needs to take care of video analysis at the multiple levels of granularity at which the entities may exist and interact. For example, annotations can be associated with a spatial region within a frame, a complete frame, a spatio-temporal region spanning a set of frames, a set of frames within a shot, a complete shot, a set of shots, or the complete video.

    Events can also be specified at multiple levels of granularity. A spatial event can be associated with

    a spatial region in a frame where two spatial entities happen to be in a specified spatial relation. Temporal events occur when two temporal entities exhibit a specified temporal relation. Events (spatial and temporal) can also correspond to a change of state (e.g., a change of spatial or temporal relation with another entity). An event that depicts a change of state can be associated with a spatial region, that is, the spatial expanse of the aggregate of the spatial regions corresponding to the initial and final states, or a temporal range, that is, the duration over which the change of state occurs. Some examples of the different kinds of events from the ICD domain are the following.

    Spatial events. Hand gestures, body postures, facial features, facial expressions.

    Temporal events. A choreographic sequence related to a dance; steps like making a circle or walking to the left side; a sequence of hand gestures and body postures that express the words of a song. Certain dance sequences in classical dance have a set pattern of steps following each other, and such a sequence is denoted by a name. For example, Bhumi Pranam (bowing to the earth) is a choreographic sequence where the dancer squats, bending her knees, and touches her forehead with both palms after touching the floor.

    Spatio-temporal events. These help recognize the different roles played by different objects in a video shot. In the case of a group dance, there is a main dancer accompanied by other dancers. A sequence



    (a) Video annotation tool snapshot. With permission from Vanshika Chawla, dancer.

    (b) Example XML snippet from a video annotation file. With permission from Vanshika Chawla, dancer.

    Fig. 13.

    depicting a dialogue between, say, a mother and son (Yashoda and Krishna) has two roles being enacted: one actor-dancer plays the mother, and the second plays the role of the son.

    In our work, we specify an annotation as a triplet: (concept, location, probability). The concept could be a high-level or low-level concept from the domain ontology. The parameter location specifies the spatio-temporal region to which the concept refers. The parameter probability provides


    (a) Conceptual video browser screen shot showing the different panes. With permission from Vanshika Chawla, dancer.

    (b) Conceptual video browser screen shot showing selection of concepts through a textual representation of the ontology. With permission from Vanshika Chawla, dancer.

    Fig. 14.

    a graded measure of the likelihood that the concept is relevant to the specified location. This parameter is optional; by default, the concept is considered relevant (with probability 1) to the specified location.

    The ontology-derived Bayesian network can now be considered to model the dependencies between

    concepts and hypotheses. A concept hypothesis is the pair (concept, location), where the location for every concept is hypothesized to correspond to some spatio-temporal region(s). The Bayesian network would then make use of the evidential observations at those hypothesized locations and infer probabilities for the relevance of the concepts to their corresponding hypothesized locations.
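The annotation triplet described above can be rendered as a small data structure. The field names and the frame-range encoding of location below are ours, for illustration only.

```python
from dataclasses import dataclass

@dataclass
class ConceptAnnotation:
    """The (concept, location, probability) triplet; probability defaults
    to 1.0 when omitted, matching the optional-parameter semantics above."""
    concept: str
    location: tuple            # e.g. an illustrative (start_frame, end_frame)
    probability: float = 1.0

a = ConceptAnnotation("Pranam", (120, 180))
print(a.probability)  # 1.0
```

A concept hypothesis is then the same structure with the probability left to be inferred: the Bayesian network fills it in from the evidence observed at the hypothesized location.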

    6. BROWSING AND QUERYING VIDEO CONTENT

    The video database in our Nrityakosha system consists of videos of Indian classical dance as well as their annotation files. These XML files storing the conceptual annotations, generated by the video annotation application (Section 5.2), contain sufficient detail to allow access to any section of the



    video, right from a scene, to a shot, to a set of frames depicting an event, to a single frame denoting a concept, to an object in an audio-visual sequence. Each of these entities is labeled with one or more concepts in the domain ontology, through the manual, supervised, and automatic annotation generation detailed in Section 5.

    The video annotation files in XML, along with the actual videos and a MOWL ontology of the domain,

    constitute the inputs to the video browsing application, called the conceptual video browser (CVB). The domain ontology, enriched with multimedia videos of the domain, is parsed to produce a textual or graphical representation. Users can view the domain concepts along with their relations and select domain concepts to access. This browsing interface provides the user with concept-based navigation of the video database, with which he or she can easily browse and navigate progressively towards more relevant content. Using this interface, a user can retrieve video content from different videos, view annotations attached to a video entity, and navigate to other similar video entities using hyperlinks. Instances of the same concept entity, or related concept entities, are hyperlinked for this purpose.

    We illustrate how the CVB can be used to browse the ICD video collection, with a MOWL ontology of the

    ICD domain as an input to the browsing tool, with some example snapshots on the following pages. Users, who could typically be ICD dancers or students of ICD, can use the CVB

    - to learn about specific dance postures, dance steps, hand gestures, and so on, by querying about them and viewing the various video clips and images returned as search results;

    - to watch performances related to a particular poem (e.g., the Geeta Govinda) or story (e.g., the Mahabharata) in mythology;

    - to watch portrayals of a mythological character, say Krishna or Yashoda, in different dance forms;
    - to see dance performances pertaining to a particular music form, like Carnatic music; or
    - to select a musical beat (called Taal in ICD) or melody (Raag), and so on.

    The system is able to show videos that are related to the search results on the basis of various ontological relations, and thus provides a comprehensive browsing and navigational interface for the ICD collection.

    We invited 15 exponents and dancers in the ICD domain to use the conceptual video browser for

    searching and browsing the heritage collection of ICD videos. They were happy to find an interactive, computer-based, comprehensive system that provides hands-on knowledge about the basic dance postures, music forms, and dance steps in ICD. The concepts and relationships in the ontology were validated and approved by them, as they could see them in the context of the dance videos. While using the CVB, they were able to search for dance performances by entering different queries, related to a dance, a dancer, a music form, a character portrayed, a geographic region, a composition, or even the name of a posture or dance step. Sometimes, when they could not formulate what they were looking for, the ontology-guided browsing interface helped them navigate to the relevant content. Thus, they were able to view different aspects of dance and visit different sections of the graphically represented ICD domain. A subjective user-satisfaction score was given by the users on a scale of 1 to 5, in increasing order of their satisfaction with the CVB. They were asked to give a score on several parameters, such as the relevance of the search results, the ease of finding a dance video, the correctness of the ontological relations, the comprehensiveness of the dance ontology, and the user-friendliness of the GUI. The mean opinion score, computed by averaging the subjective scores, was 4.

    7. CONCLUSION

    This work is an attempt to help preserve the knowledge about Indian classical dance, which is an ancient heritage. It makes use of technology to capture the grammar, the rules, and the ethics of


    (a) CVB screen shot showing selection of Bharatnatyam Dance concept. Related videos show concepts CarnaticMusic and TamilLanguage.

    (b) CVB screen shot showing the observation graph for BharatnatyamDance and its ontological relations with the concepts CarnaticMusic and TamilLanguage.

    Fig. 15. Example I. The user wishes to view videos of a particular dance form, with the intention of viewing dance performances that pertain to a particular kind of music and beat. (a) The user selects a major dance form, BharatnatyamDance, from the ontology and submits a conceptual query for search. The search results show thumbnails of videos of BharatnatyamDance. The hyperlink pane shows related videos under two columns, labeled CarnaticMusic and TamilLanguage. The user selects a video from the search results to play by clicking on its thumbnail. (b) The user chooses to view the ontological relations of BharatnatyamDance by pressing the View OM button. To navigate further, the user can select the music form CarnaticMusic or the language TamilLanguage and browse videos pertaining to the selected concept. With permission from Vanshika Chawla (dancer).



    Fig. 16. Example I (continued). The user selects to view the OM for CarnaticMusic in order to navigate further. After viewing the ontological relations, he can select a musical beat (Taal), AdiTaal, which is related to CarnaticMusic. This CVB screen shot shows the observation graph for the CarnaticMusic concept and its relations with other concepts, including the AdiTaal concept. With permission of Vanshika Chawla (dancer).

Fig. 17. Example II. The user selects the role of the mythological figure Krishna to view the portrayals of this character in different dance forms. The CVB screen shot here shows selection of the KrishnaRole concept, along with the observation graph for KrishnaRole showing linkages with media features like KrishnaDanceStep and KrishnaFlutePosture (refer to Section 3 and Figure 8). With permission of Vanshika Chawla (dancer).



(a) CVB screen shot showing selection of the VanshikaChawla concept and its observation graph.

(b) CVB screen shot showing the observation graph for BharatnatyamDancer.

Fig. 18. Example III. The user wishes to view dance performances of a particular dancer, and then may wish to view dance performances of other dancers of the same dance form. (a) The user selects a dancer's name, VanshikaChawla, from the ontology. After viewing the OM for the dancer, he discovers that she is a Bharatnatyam dancer. (b) He then selects the concept BharatnatyamDancer in order to search the larger set of dance performances by all Bharatnatyam dancers. The figure shows the OM for the BharatnatyamDancer concept, which has two dancers, VanshikaChawla and AnitaRatnam, as its instances; the media examples of these two dancers therefore propagate to this concept due to the MOWL interpretation of such relations (refer to Section 4.1). The user thus gets to view dance performances by all Bharatnatyam dancers. With permission of Vanshika Chawla (dancer).



dance as laid down in ancient scripts and treatises. It also tries to capture the various interpretations of this knowledge in actual performances by dancers and dance gurus over time, as well as in the contemporary world. This has been made possible by using an ontology to encode the interdependencies between the various facets of dance and music, and by building a knowledge base enriched with multimedia examples of different aspects of dance. With the help of this multimedia-enriched ontology, we are able to attach conceptual metadata to a larger collection of digital artifacts in the heritage domain and provide holistic, semantically integrated navigational access to them.

In this work, we have experimented with videos of Indian classical dance belonging to the two main classical dance forms, Bharatnatyam and Odissi. We plan to extend the knowledge base by adding videos of other Indian dances. The media properties studied and analyzed are mainly dance postures, short dance actions, music forms, and some textual and face-recognition features. We would like to include hand gestures that denote specific things or characters, and facial expressions that are used to express emotions and moods, as they are an important component of Indian dance. Content-based analysis tools, including hand-gesture recognition and facial-expression interpreters, would greatly enhance concept recognition and annotation generation in this domain. Another enhancement could come from extending the browsing tool to personalize retrieval of search results and from adding contextual and geographical connotations to the ontology.

ACKNOWLEDGMENTS

The ICD dance videos were contributed for research by the PadmaShri-awarded Odissi dancer Guru Mayadhar Raut and his daughter and disciple, Ms. Madhumita Raut [Raut 2007], and by Ms. Vanshika Chawla, a Bharatnatyam dancer based in New Delhi. The required third-party permissions were obtained for using the images from their performances in this article. A few images from the Wikimedia Commons database [Wikimedia] have also been used in this article.

REFERENCES

ALIAGA, D. G., BERTINO, E., AND VALTOLINA, S. 2011. DeCHO: A framework for the digital exploration of cultural heritage objects. J. Comput. Cult. Herit. 3, 12:1–12:26.

ASI-INDIA. Archaeological Survey of India home page. http://www.asi.nic.in/index.asp.

BERTINI, M., BIMBO, A. D., SERRA, G., TORNIAI, C., CUCCHIARA, R., GRANA, C., AND VEZZANI, R. 2009. Dynamic pictorially enriched ontologies for digital video libraries. IEEE Multimedia 16, 42–51.

BURGES, C. J. C. 1998. A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 2, 121–167.

DING, Z., PENG, Y., AND PAN, R. 2004. A Bayesian approach to uncertainty modeling in OWL ontology. In Proceedings of the International Conference on Advances in Intelligent Systems: Theory and Applications.

EUROPEANART. Web gallery of art, image collection, virtual museum, searchable database of European fine arts. http://www.wga.hu/.

FONI, A. E., PAPAGIANNAKIS, G., AND MAGNENAT-THALMANN, N. 2010. A taxonomy of visualization strategies for cultural heritage applications. J. Comput. Cult. Herit. 3, 1:1–1:21.

GHOSH, H., CHAUDHURY, S., KASHYAP, K., AND MAITI, B. 2007. Ontology Specification and Integration for Multimedia Applications. Springer, Berlin.

HAMMICHE, S. 2004. Semantic retrieval of multimedia data. In Proceedings of the 2nd ACM International Workshop on Multimedia Databases. ACM, New York, 36–44.

HERNENDEZ, F. 2007. Semantic web use cases and case studies: An ontology of Cantabria's cultural heritage. http://www.w3.org/2001/sw/sweo/public/UseCases/FoundationBotin/.

HUNTER, J. 2003. Enhancing the semantic interoperability of multimedia through a core ontology. IEEE Trans. Circuits Syst. Video Technol. 49–58.

KALASAMPADA. Digital library: Resources of Indian cultural heritage. http://www.ignca.nic.in/dlrich.html.

LOGAN, B. 2000. Mel frequency cepstral coefficients for music modelling. Tech. rep., Cambridge Research Lab, Cambridge, MA.

LOUVRE MUSEUM. Official website. http://www.louvre.fr/llv/commun/home.jsp?bmLocale=en.

LOWE, D. 2003. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 20, 91–110.

MALLIK, A. AND CHAUDHURY, S. 2009. Using concept recognition to annotate a video collection. In Proceedings of the 3rd International Conference on Pattern Recognition and Machine Intelligence (PReMI'09). Springer, Berlin, 50751.

MALLIK, A., PASUMARTHI, P., AND CHAUDHURY, S. 2008. Multimedia ontology learning for automatic annotation and video browsing. In Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval (MIR'08). ACM, New York, 387–394.

NIEBLES, J. C., WANG, H., AND FEI-FEI, L. 2008. Unsupervised learning of human action categories using spatial-temporal words. Int. J. Comput. Vision, 299–318.

PETRIDIS, K. 2005. Knowledge representation and semantic annotation for multimedia analysis and reasoning. In IEE Proceedings on Vision, Image and Signal Processing. 255–262.

RAUT, M. 2007. Odissi: What, Why and How: Evolution, Revival and Technique. B.R. Rhythms, Delhi.

STASINOPOULOU, T., BOUNTOURI, L., KAKALI, C., LOURDI, I., PAPATHEODOROU, C., DOERR, M., AND GERGATSOULIS, M. 2007. Ontology-based metadata integration in the cultural heritage domain. In Proceedings of the 10th International Conference on Asian Digital Libraries (ICADL'07). Springer, Berlin, 165–175.

SU, J. AND ZHANG, H. 2006. Full Bayesian network classifiers. In Proceedings of the 23rd International Conference on Machine Learning. 897–904.

TSINARAKI, C. 2005. Ontology-based semantic indexing for MPEG-7 and TV-Anytime audiovisual content. In Special issue on video segmentation, semantic annotation and transcoding, Multimedia Tools and Applications J. 26, 299–325.

UNESCO-HERITAGELIST. 2010. The list of intangible cultural heritage in need of urgent safeguarding. http://www.unesco.org/culture/ich/index.php?lg=en&pg=00011.

WATTAMWAR, S. S. AND GHOSH, H. 2008. Spatio-temporal query for multimedia databases. In Proceedings of the 2nd ACM Workshop on Multimedia Semantics (MS'08). ACM, New York, 48–55.

WIKIMEDIA. Wikimedia Commons database: A database of freely usable media files to which anyone can contribute. http://commons.wikimedia.org/wiki/Main_Page.

    Received March 2011; accepted April 2011

    ACM Journal on Computing and Cultural Heritage, Vol. 4, No. 3, Article 11, Publication date: December 2011.