Bharatanatyam Dance Transcription using Multimedia Ontology and Machine Learning

04/24/2020
by   Tanwi Mallick, et al.
IIT Kharagpur

Indian Classical Dance is an over 5000 years' old multi-modal language for expressing emotions. Preservation of dance through multimedia technology is a challenging task. In this paper, we develop a system to generate a parseable representation of a dance performance. The system will help to preserve intangible heritage, annotate performances for better tutoring, and synthesize dance performances. We first attempt to capture the concepts of the basic steps of an Indian Classical Dance form, named Bharatanatyam Adavus, in an ontological model. Next, we build an event-based low-level model that relates the ontology of Adavus to the ontology of multi-modal data streams (RGB-D of Kinect in this case) for a computationally realizable framework. Finally, the ontology is used for transcription into Labanotation. We also present a transcription tool for encoding the performances of Bharatanatyam Adavus to Labanotation and test it on our recorded data set. Our primary aim is to document the complex movements of dance in terms of Labanotation using the ontology.

1 Introduction

Dance is a form of art that may tell a story, set a mood, or express emotions. Indian Classical Dance (ICD) is an ancient heritage of India that is more than 5000 years old. With the passage of time, the dance has been performed, restructured, reformulated, and re-expressed by several artists. New choreography has been composed using the basic forms. Hence, the dance forms have come to be associated with a rich set of rules, formations, postures, gestures, stories, and other artifacts. To date, however, ICD has been passed on from teacher to student, from one generation to the next, through the traditional method of Guru-Shishya Parampara. This is the typically acknowledged Indian style of education in which the teacher (Guru) personally trains her / his disciple (Shishya) to keep up a continuity (Parampara) of education, culture, learning, or skills. Hence, there is a need to preserve the intangible heritage of the dance artifacts.

Recently, many significant systems have been developed to preserve cultural heritage through digital multimedia technology. Tangible heritage resources like monuments, handicrafts, and sculpture can be preserved through digitization and through 2D and 3D modeling techniques. Preservation of intangible resources like language, art and culture, music and dance is more complex and requires a knowledge-intensive approach. As a result, little work has been carried out on the preservation of dance heritage.

These dance forms embody a collection of knowledge that can be preserved either by creating digital transcriptions of the performances or by annotating the video recordings of performances. Analysis of dance can help convert the audio-visual information of dance into a graphical notation. Dance transcription, still a rarity, can be handy for preserving the heritage of a country like India, which boasts diverse types of classical dance forms. Transcription can also help performers exchange dance ideas. Another way of preserving the intangible heritage of dance is dance media annotation, that is, attaching conceptual metadata to the collection of digital artifacts. A collection of digital artifacts with conceptual metadata can support semantic access to the heritage collection.

Mallik et al. [mallik2011nrityakosha] present an ontology based approach for designing a cultural heritage repository. A Multimedia Web Ontology Language (MOWL) is proposed to encode the domain knowledge of a choreography. The suggested architectural framework includes a method to construct the ontology with a labeled set of training data and the use of the ontology to automatically annotate new instances of digital heritage artifacts. The annotations enable creation of a semantic navigation environment in the repository. The efficacy of the approach is demonstrated by constructing an ontology for the domain of ICD in an automated fashion, and with a browsing application for semantic access to the heritage collection of ICD videos.

Use of notation is another way of recording dance for future use. A system of notation is required for recording the details of postures and movements in the domain of dance. Labanotation [guest2014labanotation] is a widely used notation system for recording human movements in terms of graphical primitives and symbols. Karpen [Karpen90] first attempted to manually encode the movements of Bharatanatyam on paper using Labanotation. That work demonstrated by examples that the body movement, space, time, and dynamics of ICD, in particular Bharatanatyam, can be described through Labanotation. Hence it is argued that Labanotation, coupled with video filming, is a good way to record ICD. According to the author, hand gestures can also be easily implemented in Labanotation together with palm facing and specification of the quality of movement. However, no attempt was made in that paper to automate the process, and for about the next three decades no work was done on transcribing ICD in Labanotation. Some research on dance preservation using notation has been carried out for other dance forms like Thai dance and Contemporary dance. Raheb et al. [el2011labanotation] use the Web Ontology Language (OWL) to encode the knowledge of dance. The semantics of the Labanotation system is used to build elements of the ontology. Tongpaeng et al. [tongpaeng2017thai] propose a system to archive the knowledge of Thai dance using Labanotation and then use the score of the notation to represent the dance in 3D animation. To date, automatic generation of Labanotation from recorded dance video has not been attempted.

This work has been inspired by the idea of musical notations. Similar dance transcription systems may be useful in several ways. The system can generate a parseable representation of a dance performance, help to preserve intangible heritage, help to annotate performances for better tutoring, and can be used as a front-end for dance synthesis. We first attempt to capture the concepts in Bharatanatyam Adavus in an ontological model. At the top level, a Bharatanatyam Adavu can be expressed as a dance (sequence of visual postures) accompanied by music. Further, we identify the concepts of audio and video structures of a Bharatanatyam Adavu. We next build an event-based low-level model that relates the ontology of Adavus to the ontology of multi-modal data streams (RGB-D of Kinect in this case) for a computationally realizable framework. An event denotes the occurrence of an activity (called Causal Activity) in the audio or the video stream of an Adavu. The events of audio and video, and their synchronization, are thus related to the corresponding concepts of the ontological model. We use this ontology and event characterization for transcription into Labanotation using the Laban ontology. We also present a transcription tool for encoding the performances of Bharatanatyam Adavus to Labanotation and test it on our recorded data set. Our aim is to examine the ways in which Labanotation can be used for documenting the dance movements.

2 Indian Classical Dance: Bharatanatyam and its Adavus

We introduce the domain of Bharatanatyam Adavu in the context of our knowledge capture and heritage preservation scheme, for ease of understanding throughout the paper.

Bharatanatyam is one of the eight Indian Classical Dance forms (ICD has eight distinct styles as recognized by the Ministry of Culture, Government of India: Bharatanatyam, Kathak, Odissi, Kathakali, Kuchipudi, Manipuri, Mohiniyattam, and Sattriya). Like most dance forms, Bharatanatyam Adavu too is deeply intertwined with music. It is usually accompanied by instrumental (Tatta Kazhi, Mridangam, Flute, Violin, Veena, etc.) and / or vocal music (Carnatic style – with or without lyrics) called Sollukattu. For the Tatta Kazhi, a wooden stick is beaten on a wooden block to produce the instrumental sound. Adavus are the basic units of Bharatanatyam that are combined to create a dance performance. An Adavu involves various postures and gestures of the body including torso, head, neck, hands, fingers, arms, legs and feet, and eyes. While performing Adavus, the dancer stamps, rubs, touches, and slides on the ground in different ways in synchronization with the Sollukattu. There are 15 basic Adavus in Bharatanatyam, most having one or more Variants. In total, we deal with 58 Adavu variants. There exists a many-to-one mapping from the Adavus to the Sollukattus.

2.1 Sollukattus and Bols – the Music of Adavus

Bharatanatyam is deeply intertwined with music. It is usually accompanied by Instrumental (Tatta Kazhi, Mridangam, Flute, Violin, Veena, etc.) and / or Vocal music (Carnatic style – with or without lyrics). The music is strung together in sequences to create different rhythmic patterns, called Taalam, to accompany dance performances (Taalam is the Indian system for organizing and playing metrical music). A repeated cycle of Taalam consists of a number of equally spaced beats, which are grouped into combinations of patterns. The time interval between any two consecutive beats is always equal. The specific way the performers mark the beats (by tapping their laps with their fingers, palm, and back of the hand, or by a specific instrument) is determined by these patterns of beats, that is, by the Taalam.

Sollukattu | # Beats | Description of Bols
--- | --- | ---
Joining A | 8 | tat dhit ta [B] tat dhit ta [B]
Joining B | 6 | [dhit dhit] tei [dhit dhit] tei [dhit dhit] tei [dhit dhit] tei
Joining C | 8 | tei tei [dhit dhit] tei tei tei [dhit dhit] tei
KUMS | 6 | [tan gadu] [tat tat] [dhin na] [tan gadu] [tat tat] [dhin na]
Mettu | 8 | tei hat tei hi tei hat tei hi
Nattal A | 8 | tat tei tam [B] dhit tei tam [B]
Nattal B | 8 | [tat tei] tam [dhit tei] tam [tat tei] tam [dhit dhit] tei
Tattal | 8 | tat tei ta ha dhit tei ta ha
Natta | 8 | [tei yum] [tat tat] [tei yum] ta [tei yum] [tat tat] [tei yum] ta
Paikkal | 8 | [dhit tei da] [ta tei] [dhit tei da] [ta tei] [dhit tei da] [ta tei] [dhit tei da] [ta tei]
Pakka | 8 | ta tei tei tat dhit tei tei tat
Sarika | 8 | tei a tei e tei a tei e
Tatta A | 8 | [tei ya] tei [tei ya] tei [tei ya] tei [tei ya] tei
Tatta B | 6 | tei tei tam tei tei tam
Tatta C | 8 | [tei ya] [tei ya] [tei ya] tei [tei ya] [tei ya] [tei ya] tei
Tatta D | 8 | tei tei [tei tei] tam tei tei [tei tei] tam
Tatta E | 8 | tei tei tam [B] tei tei tam [B]
Tatta F | 8 | tei tei tat tat tei tei tam [B]
Tatta G | 6 | tei tei tei tei [dhit dhit] tei
TTD | 8 | [tei tei] [dhat ta] [dhit tei] [dhat ta] [tei tei] [dhat ta] [dhit tei] [dhat ta]
Tirmana A | 12 | ta [tat ta] jham [ta ri] ta [B] jham [ta ri] jag [ta ri] tei [B]
Tirmana B | 12 | [tat ding] [gin na] tom [tak ka] [tat ding] [gin na] tom [tak ka] [dhi ku] [tat ding] [gin na] tom
Tirmana C | 12 | [ki ta ta ka] [dha ri ki ta] tom tak [ki ta ta ka] [dha ri ki ta] tom [tak ka] [dhi ku] [ki ta ta ka] [dha ri ki ta] tom

Multiple bols at the same beat are enclosed within [].
[B] stands for a beat without any bols, typically called a stick-beat.
KUMS, Mettu, Nattal, Tattal, and TTD stand for Kartati–Utsanga–Mandi–Sarikkal, Kuditta Mettu, Kuditta Nattal, Kuditta Tattal, and Tei Tei Dhatta respectively.
Table 1: List of Sollukattus with bol compositions

Taalams necessarily synchronize the movements of various parts of the body with the music through a structured harmonization of four elements: (a) rhythmic beats of the Taalam, (b) Mridangam beats from percussion, (c) musical notes or Swaras (Swara, in Sanskrit, connotes a note in the successive steps of the octave), and (d) steps of the Adavus. A number of different Taalams are used in Bharatanatyam. The Taalams commonly used in Adavus are Adi taalam (8 beats' pattern) and Roopakam taalam (6 beats' pattern); Adavus can be performed in all 7 taalams, but the rest are less popular. Finally, a Taalam is devoid of a physical unit of time and is acceptable as long as it is rhythmic in some temporal unit. With a base time unit, however, Bharatanatyam deals with three speeds, called Kaalam or Tempo. The Taalams are played mainly in 3 different tempos: Vilambitha Laya or slow speed, Madhya Laya (double of Vilambitha Laya) or medium speed, and Drutha Laya (quadruple of Vilambitha Laya) or fast speed.
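
The relation between Taalam, tempo, and beat timing described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the system described in the paper: the function name `beat_times`, the two lookup tables, and the base interval `base_interval_s` are assumptions introduced here. The text fixes only the beat counts (8 for Adi, 6 for Roopakam) and the relative speeds (1x, 2x, 4x).

```python
# Beats per cycle for the two Taalams named in the text.
BEATS_PER_CYCLE = {"Adi": 8, "Roopakam": 6}

# Speed of each tempo (Laya) relative to Vilambitha Laya.
TEMPO_FACTOR = {"Vilambitha": 1, "Madhya": 2, "Drutha": 4}


def beat_times(taalam, tempo, base_interval_s=1.0, cycles=1):
    """Return beat time stamps (in seconds) for repeated cycles of a Taalam.

    base_interval_s is a hypothetical beat spacing at Vilambitha Laya;
    a Taalam by itself carries no physical unit of time.
    """
    interval = base_interval_s / TEMPO_FACTOR[tempo]
    n_beats = BEATS_PER_CYCLE[taalam] * cycles
    # Beats are equally spaced, so the i-th beat falls at i * interval.
    return [i * interval for i in range(n_beats)]
```

For example, `beat_times("Adi", "Madhya")` yields eight time stamps spaced half the base interval apart.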

A phrase of rhythmic syllables (Sollukattu) is linked to specific units of dance movement in an Adavu. The word Sollukattu combines sollum (syllables) and kattu (speaking). A Sollukattu is a specific rhythmic musical pattern created by a combination of instrumental and vocal sounds. Traditionally, a Tatta Kazhi (wooden stick) is beaten on a Tatta Palahai (wooden block) for the instrumental sound, and an accompanist of the dancer speaks out a distinct vocalization of rhythm, like tat, tei, ta etc., called Bols. Bols (from bolna, to speak) are mnemonic syllables for beats in the taalam. In a Sollukattu, both the instrument and the voice follow in sync to create a pattern of beats. Every beat is usually marked by a synchronous beating (instrumental) sound, though some beats may be silent. In some cases, there may be a beating (instrumental) sound at positions that are not beats (according to the periodicity). The list of Sollukattus is given in Table 1. Adavus are performed along with the rhythmic syllables of a Sollukattu that continues to repeat in cycles. Rhythm thus performs the role of a timer, with beats as temporal markers. In the interval between beats, the dancer changes her posture.
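
To make the bol notation of Table 1 concrete, the following Python sketch splits a bol composition into beats. It assumes only the two conventions stated with the table: bols sharing a beat are enclosed in [], and [B] marks a stick-beat without bols. The function name `parse_bols` and the choice to represent a stick-beat as an empty list are illustrative assumptions.

```python
import re

STICK_BEAT = "B"  # [B] marks a beat without any bols (stick-beat)


def parse_bols(composition):
    """Split a Table 1 style bol string into a list of beats.

    Each beat is returned as a list of bols; a stick-beat is an empty list.
    """
    beats = []
    # A token is either a bracketed group "[...]" or a single bare bol.
    for group, single in re.findall(r"\[([^\]]*)\]|(\S+)", composition):
        if single:
            beats.append([single])
        elif group.strip() == STICK_BEAT:
            beats.append([])
        else:
            beats.append(group.split())
    return beats
```

Applied to the Joining A entry, `parse_bols("tat dhit ta [B] tat dhit ta [B]")` gives eight beats, the fourth and eighth of which are stick-beats.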

2.2 Adavus – the Postures and Movements

Adavus are the basic units of Bharatanatyam that are combined to form a dance sequence. Adavus form the foundation stone on which the entire Nritta rests. An Adavu involves various postures and gestures of the body, hands, arms, feet, and eyes (the current work does not consider hand and eye movements because of sensor limitations). While performing Adavus, the dancer stamps, rubs, touches, and slides on the ground in different ways in synchronization with the Sollukattu (bol), or the syllables used. Adavus are classified according to the rhythmic syllables on which they are based and the style of footwork employed. According to the Kalakshetra school of training, there are 15 Adavus. Most Adavus have two or more Variants. Variants of an Adavu bear similarity of intent and style, but differ in details. A total of 58 Adavus and 23 Sollukattus are used in Kalakshetra. (There are four major styles of Bharatanatyam: Thanjavur, Pandanallur, Vazhuvoor, and Mellatur. Kalakshetra, promulgated by the Kalakshetra Foundation founded by Rukmini Devi, is the modern style of Bharatanatyam and is reconstructed from the Pandanallur style.) The details are listed in Table 2. Every (variant of an) Adavu uses a fixed Sollukattu, while a given Sollukattu may be used in multiple Adavus. Each posture of an Adavu is a combination of leg support (Mandalam), legs position (Pada Bheda), arms position (Bahu Bheda), head position (Shiro Bheda), hand position (Hasta Mudras), neck position (Griba Bheda), and eyes position (Drishti Bheda).

# | Adavu Name | Adavu Variants | Taalam | Sollukattu
--- | --- | --- | --- | ---
1 | Joining | Joining 1 | Adi | Joining A
 | | Joining 2 | | Joining B
 | | Joining 3 | | Joining C
2 | Kati or Kartari | Kati or Kartari 1 | Roopakam | KUMS
3 | Kuditta Mettu | Kuditta Mettu 1–4 | Adi | Kuditta Mettu
4 | Kuditta Nattal | Kuditta Nattal 1–3 | Adi | Kuditta Nattal A
 | | Kuditta Nattal 4–5 | | Kuditta Nattal B
 | | Kuditta Nattal 6 | | Kuditta Nattal A
5 | Kuditta Tattal | Kuditta Tattal 1–5 | Adi | Kuditta Tattal
6 | Mandi | Mandi 1–2 | Roopakam | KUMS
7 | Natta | Natta 1–8 | Adi | Natta
8 | Paikkal | Paikkal 1–3 | Adi | Paikkal
9 | Pakka | Pakka 1–4 | Adi | Pakka
10 | Sarika | Sarika 1–4 | Adi | Sarika
11 | Sarrikkal | Sarrikkal 1–3 | Roopakam | KUMS
12 | Tatta | Tatta 1–2 | Adi | Tatta A
 | | Tatta 3 | Roopakam | Tatta B
 | | Tatta 4 | Adi | Tatta C
 | | Tatta 5 | | Tatta D
 | | Tatta 6 | | Tatta E
 | | Tatta 7 | | Tatta F
 | | Tatta 8 | Roopakam | Tatta G
13 | Tei Tei Dhatta | Tei Tei Dhatta 1–3 | Adi | Tei Tei Dhatta
14 | Tirmana | Tirmana 1 | Roopakam | Tirmana A
 | | Tirmana 2 | | Tirmana B
 | | Tirmana 3 | | Tirmana C
15 | Utsanga | Utsanga 1 | Roopakam | KUMS

Blank cells continue the entry from the row above (merged cells in the original layout).
KUMS stands for Kartati–Utsanga–Mandi–Sarikkal
Table 2: List of Adavus with accompanying Sollukattu
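
The many-to-one mapping from Adavu variants to Sollukattus can be illustrated with a small Python sketch. The dictionary below copies only a handful of rows from Table 2, and the helper name `adavus_by_sollukattu` is an assumption made for this illustration; it is not taken from the paper.

```python
from collections import defaultdict

# A few (Adavu variant -> Sollukattu) pairs copied from Table 2.
ADAVU_TO_SOLLUKATTU = {
    "Joining 1": "Joining A",
    "Kati or Kartari 1": "KUMS",
    "Mandi 1": "KUMS",
    "Mandi 2": "KUMS",
    "Tatta 1": "Tatta A",
    "Tatta 2": "Tatta A",
    "Tatta 3": "Tatta B",
    "Utsanga 1": "KUMS",
}


def adavus_by_sollukattu(mapping):
    """Invert the many-to-one mapping: Sollukattu -> list of Adavu variants."""
    inverse = defaultdict(list)
    for adavu, sollukattu in mapping.items():
        inverse[sollukattu].append(adavu)
    return dict(inverse)
```

Each Adavu variant has exactly one Sollukattu (a plain dictionary lookup), while the inverted view shows that a Sollukattu such as KUMS is shared by several Adavus.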

Since Adavus are elementary units and are used for training, each Adavu has a specific purpose (as shown in Table 3). For example, Tatta Adavus focus on striking the floor with the foot. The body remains in a posture called Araimandi and the feet, by rotation, strike the floor alternately with the sole. There are 8 variants of Tatta Adavu. The features of Variant 1 of Tatta Adavu (say, Tatta 1) are: (a) strike on the floor, (b) heel to touch hip during strike, (c) no hand gesture, and (d) no movements. The Sollukattu used in Tatta 1 is tei a tei (say, Tatta_A). This follows the Adi Taalam or 8 beats' pattern as shown in Table 4. The bols on each beat are shown in three different tempos.

Adavu | Purpose of the Adavu
--- | ---
Joining | Simple connecting Adavus to be used while building longer sequences of postures
Kati or Kartari | Paidhal itself includes a variety of leaps and may also be coupled with spins (Bramhari). It also includes the famous Kartari (Scissors) adavu, where the movement of the hand and feet trace crisscross patterns in space.
Kuditta Mettu | Jumping on the toes and then striking the heels
Kuditta Nattal | Striking the floor by leg, jumping on toes, stretching legs and hands, and also circular movement of hand
Kuditta Tattal | Striking the floor, jumping on toes, stretching hands, circular movements of hands, neck and head with the bending of torso and waist; hand movements define different planes in space
Mandi | Mandi in some Indian languages refers to the area around the thigh and knee. In some instances it can refer to a bent knee. For example, Araimandi is where the knee is half bent. Muzhumandi or Poorna Mandala is where the knee is fully bent. Mandi adavus often make use of the Muzhumandi position. Steps could vary from jumps in Poorna Mandala to jumping and touching one knee on the floor.
Natta | Stretching of legs
Paikkal | Paikkal (Paidhal or Paichal) is a Tamil term that means to leap. It differs from the Kuditta Mettu in that the dancer covers space while doing the Paikkal, whereas in Kuditta Mettu she / he jumps in the same spot. A very graceful step in itself, Paikkal is usually seen at the end of Korvai (a string of Adavus) as part of Ardhis.
Pakka | Moving towards the sides
Sarika | Sarika means a thing of beauty or nature
Sarrikkal | Sarrikkal means to slide. Here, as one foot is lifted and placed, the other foot slides towards it.
Tatta | Striking the floor with feet
Tei Tei Dhatta | Use of half and full seating, stretching legs and hand, jumping with linear and circular movements of hands
Tirmana | Tirmana (or Teermanam) means to conclude, an ending, or a final stage. Thus the steps in these adavus are used to end a dance sequence or jathis. It is done in a set of three steps or repeated thrice.
Utsanga | Use of different hand positions to enhance the stretching on half seating, straight standing, jump on heels, striking the floor. Also use of linear and circular movements of waist and stretching of hands.

Source: [preetivasudevan] and personal communication with Debaldev Jana
Table 3: Purpose of various Adavus

Speed | Beat 1 | Beat 2 | Beat 3 | Beat 4 | Beat 5 | Beat 6 | Beat 7 | Beat 8
--- | --- | --- | --- | --- | --- | --- | --- | ---
1 | tei a | tei | tei a | tei | tei a | tei | tei a | tei
2 | tei a tei | tei a tei | tei a tei | tei a tei | tei a tei | tei a tei | tei a tei | tei a tei
3 | tei a tei tei a tei | tei a tei tei a tei | tei a tei tei a tei | tei a tei tei a tei | tei a tei tei a tei | tei a tei tei a tei | tei a tei tei a tei | tei a tei tei a tei

Time Measure: Adi Taalam
Table 4: Beat pattern of Tatta 1 (Tatta Adavu Variant 1) in Adi Taalam
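
The way the bols of Tatta 1 fill the 8 beats of Adi Taalam at the three speeds can be reproduced with a short Python sketch. It rests on an interpretation of Table 4 (each doubling of speed packs twice as many bols into a beat while the number of beats per cycle stays fixed), and the function name `bols_at_speed` and the list `TATTA_1` are names introduced here for illustration only.

```python
def bols_at_speed(cycle, speed):
    """Pack the bols of `cycle` into beats at speed 1, 2, or 3.

    cycle: list of beats at speed 1, each beat being a list of bols.
    At speed k, each beat holds 2**(k-1) consecutive speed-1 beats, and
    the cycle is repeated so that the number of beats stays unchanged.
    """
    per_beat = 2 ** (speed - 1)
    stream = cycle * per_beat  # repeat the cycle to fill every beat
    return [
        [bol for beat in stream[i:i + per_beat] for bol in beat]
        for i in range(0, len(stream), per_beat)
    ]


# Speed-1 pattern of Tatta 1: "tei a" and "tei" on alternate beats.
TATTA_1 = [["tei", "a"], ["tei"]] * 4
```

With this sketch, every beat at speed 2 carries "tei a tei" and every beat at speed 3 carries "tei a tei tei a tei", in line with the rows of Table 4.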

The posture of a dancer is synchronized with the beats. The postures synchronized with the beats are shown in Figure 1. Here, the dancer strikes her right and left foot alternately, one strike per beat.

Beat | Bols | Posture
--- | --- | ---
1 | tei a | Right Strike
2 | tei | Left Strike
3 | tei a | Right Strike
4 | tei | Left Strike
5 | tei a | Right Strike
6 | tei | Left Strike
7 | tei a | Right Strike
8 | tei | Left Strike

Figure 1: Example performance of Tatta 1 (Tatta Adavu Variant 1)

Like Tatta 1, all Adavus are combinations of:

  • Position of the legs (Sthanakam) / Posture of standing (Mandalam): Adavus are performed in postures that are (Figure 2) – (a) Samapadam or the standing position, (b) Araimandi / Ardha Mandalam or the half sitting posture, and (c) Muzhumandi or the sitting posture.

    Figure 2: Three Mandalams (types of leg support) of Bharatanatyam: Samapadam (Standing), Araimandi (Half Sitting), and Muzhumandi (Full Sitting)
    Source: Leg Postures in Bharatanatyam by Nysa Dance Academy, https://nysadancecom.wordpress.com/2015/09/26/leg-posture-aramandi-or-ardhamandala/

  • Jumps (Utplavana): Based on the mode of performances Utplavanas are classified into Alaga, Kartari, Asva, Motita, and Kripalaya.

  • Walking Movement (Chari): Chari are used for gaits. According to Abhinayadarpana, there are eight kinds of Charis – Chalana, Chankramana, Sarana, Vehini, Kuttana, Luhita, Lolita, and Vishama Sanchara.

  • Hand Gestures (Nritta Hastas): Bharatanatyam primarily uses two types of Hasta Mudras (Figure 3) that play a significant role in communication: 28 single hand gestures (Asamyutha Hasta) and 23 combined (both) hand gestures (Samyutha Hasta). A few other types, like Nritya Hasta, are used at times. There are twelve major hand gestures for Adavus: Pataka, Tripataka, Ardhachandra, Kapittha, Katakamukha, Suchi, Musthi, Mrigasirsha, Alapadma, Kartarimukha, Shikhara, and Dola.

In Bharatanatyam, Adavu is used in dual sense. It either denotes just the dance part (postures and movements) or the dance and the accompanying music together. To maintain clarity of reference, in this paper, we refer to the dance simply by Adavu and the composite of dance and music by Bharatanatyam Adavu.


Figure 3: Hasta Mudras of Bharatanatyam: (a) sets of Asamyutha Hasta Mudras, (b) sets of Samyutha Hasta Mudras
Image Source: https://grade1.weebly.com/theory.html

3 Object-based Modeling of Adavus

To express the ontology we follow an extended object-based modeling framework comprising a set of classes (Table 5), a set of instances (Table 6), and a set of relations (Table 7). Classes are used to represent generic as well as specific concepts. These can be Abstract or Concrete. A concrete class has one or more instances while an abstract class has one or more specialized classes. Relations are usually binary and are defined between two classes, between a class and an instance, or between two instances.

Class | Type | Remarks
--- | --- | ---
Sollukattu | Concrete | The music (audio) of Adavus (Table 1)
Adavu | Abstract | The movements (video) of Adavus (Table 2)
Tatta Adavu, Natta Adavu, ... | Concrete | Types of Adavus (Table 2)
Carnatic Music | Abstract | The style of Bharatanatyam music
Sequence | Abstract | Ordered list of elements of one kind
Beat | Abstract | Basic unit of time – an instance on timescale
Bol | Concrete | Mnemonic syllable or vocal utterances (Table 8)
Posture | Abstract | Standing or sitting position of a dancer
Taalam | Concrete | Rhythmic pattern of beats
Tempo | Concrete | Beats per minute – defines speed
Instrumental Strike | Concrete | Beating of a percussion
Position (Time Stamp) | Concrete | Instant of time
Key Posture | Concrete | Momentarily stationary posture (Figure 1)
Transition Posture | Abstract | Non-stationary posture
Trajectorial Transition Posture | Concrete | Transitions along a well-defined trajectory
Natural Transition Posture | Concrete | Natural posture transitions by the dancer
Leg Support (Mandalam) | Concrete | Ways to support the body (Figure 2)
Legs Position (Pada Bheda) | Concrete | Positions of both legs in Bharatanatyam
Arms Position (Bahu Bheda) | Concrete | Positions of both arms in Bharatanatyam
Head Position (Shiro Bheda) | Concrete | Positions of head in Bharatanatyam
Neck Position (Griba Bheda) | Concrete | Positions of neck in Bharatanatyam
Eyes Position (Drishti Bheda) | Concrete | Eye movements depicting navarasa
Hands Position (Hasta Mudra) | Abstract | Positions of both hands in Bharatanatyam
Single Hand Gesture | Concrete | Asamyukta Hasta Mudras (Figure 3)
Double Hand Gesture | Concrete | Samyukta Hasta Mudras (Figure 3)
Left Leg (Formation) | Concrete | Left leg in Pada Bheda (Table 9)
Right Leg (Formation) | Concrete | Right leg in Pada Bheda (Table 9)
Left Arm (Formation) | Concrete | Left arm in Bahu Bheda (Table 10)
Right Arm (Formation) | Concrete | Right arm in Bahu Bheda (Table 10)
Left Hand (Formation) | Concrete | Left hand in Hasta Mudra (Table 12)
Right Hand (Formation) | Concrete | Right hand in Hasta Mudra (Table 12)

Table 5: List of Classes for the ontology of Bharatanatyam Adavu

Class: Instance | Remarks
--- | ---
Sollukattu: Tatta_A, ..., Tatta_G, Natta, Kuditta Mettu | 23 types of Sollukattus (Table 1)
Adavu: Tatta 1, ..., Tatta 8, Natta, Kuditta Mettu | 58 types of Adavus (Table 2)
Bol: tei, yum, tat, ... | 31 types of Bols (Table 8)
Taalam: Adi Taalam, Roopakam Taalam | 2 of the 7 types of Taalams
Tempo (Laya): Vilambit Laya, Madhya Laya, Drut Laya | 3 types of Layas (speed) or tempo
: Spinal Bending (boolean) | Spine may or may not be bent
Key Postures: Natta1P1, Natta1P2, Natta1P3, ... | Key Postures of Natta Adavu Variant 1
Leg Support: Samapadam (Standing), Araimandi (Half-Sitting), Muzhumandi (Full Sitting) | 3 types of leg support
Legs Position: Aayata [S], Prenkhanam [M], ... | Types of both legs positions
Arms Position: Natyarambhe [S], Natyarambhe [M], ... | Types of both arms positions
Head Position: Samam, Left Paravrittam, Right Paravrittam, ... | Types of head positions (Table 11)
Hands Position: Tripataka [S], ... | Types of both hands gestures
Left / Right Leg (Formation): Aayata, Anchita, ... | Types of single leg formations (Table 9)
Left / Right Arm (Formation): Natyarambhe, Kunchita Natyarambhe, ... | Types of single arm formations (Table 10)
Left / Right Hand (Formation): Tripataka, ... | Types of single hand gestures

[S]: Denotes symmetric positions between left and right limbs
[M]: Denotes asymmetric positions between left and right limbs and their mirror
Instances of Neck Position, Eyes Position, and Double Hand Gestures are not considered
Table 6: List of Instances for the ontology of Bharatanatyam Adavu

Relation | Domain | Co-Domain | Remarks
--- | --- | --- | ---
is_a | Class | Class | Specialization / Generalization or is_a hierarchy of Object-based Modeling. This is used to build the taxonomy. For example, Tatta Adavu is_a Adavu.
has_a | Class | Class / Instance | Composition or has_a hierarchy of Object-based Modeling. This is used to build the partonomy. For example, Sollukattu has_a Taalam.
isInstanceOf | Instance | Class | Distinct instances of a class
isAccompaniedBy | Class | Class | isAccompaniedBy captures the association between video and audio streams. Hence, Adavu isAccompaniedBy Sollukattu.
isSyncedWith | Class | Class | Expresses high-level synchronization between audio and video streams. Every Adavu isSyncedWith a unique Sollukattu.
isSequenceOf | Class | Class | isSequenceOf builds a sequence from elements of the same type. For example, every Sollukattu (Adavu) has_a sequence of beats (postures) constructed from beats (postures) by the isSequenceOf relation. isFollowedBy is a dual of this relation.
isAccentedBy | Class | Class | A beat isAccentedBy a bol.
isFollowedBy | Class | Class | Ordering of audio events (like beats) or video events (like postures): one event isFollowedBy another event. isSequenceOf is a dual of this relation.
triggers | Instance | Instance | Expresses low-level synchronization between audio and video events. Hence, a beat triggers a posture as the dance is driven by the music.
repeats | Class | Class | Once a taalam completes a bar, it may repeat itself.

Table 7: List of (Binary) Relations for the ontology of Bharatanatyam Adavu
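
As a rough illustration of how the classes, instances, and binary relations of Tables 5 to 7 could be held in memory, the following Python sketch stores a few facts as subject–relation–object triples. The triple representation, the names `Triple`, `FACTS`, `objects_of`, and `is_kind_of`, and the selection of facts are all assumptions made for this sketch; the paper does not prescribe this encoding.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "relation", "object"])

# A few facts taken from the remarks in Tables 5-7.
FACTS = [
    Triple("Tatta Adavu", "is_a", "Adavu"),
    Triple("Natta Adavu", "is_a", "Adavu"),
    Triple("Sollukattu", "has_a", "Taalam"),
    Triple("Tatta 1", "isInstanceOf", "Tatta Adavu"),
    Triple("Adavu", "isAccompaniedBy", "Sollukattu"),
    Triple("Beat", "isAccentedBy", "Bol"),
]


def objects_of(facts, subject, relation):
    """Return every object linked to `subject` by `relation`."""
    return [t.object for t in facts
            if t.subject == subject and t.relation == relation]


def is_kind_of(facts, name, ancestor):
    """Follow isInstanceOf / is_a links upward from `name` to `ancestor`."""
    frontier, seen = [name], set()
    while frontier:
        current = frontier.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        for relation in ("isInstanceOf", "is_a"):
            frontier.extend(objects_of(facts, current, relation))
    return False
```

Here `is_kind_of(FACTS, "Tatta 1", "Adavu")` holds because Tatta 1 isInstanceOf Tatta Adavu and Tatta Adavu is_a Adavu, which mirrors the taxonomy built by the is_a relation.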

3.1 Ontology of Bharatanatyam Adavus – Top Level

At the top level, a Bharatanatyam Adavu can be expressed simply as a dance (Adavu) accompanied and driven by (isAccompaniedBy) music (Sollukattu) (Figure 4). In other words, the musical meter of an Adavu (the meter of music is its rhythmic structure) is called a Sollukattu, which is a sequence of beats / bols. An Adavu is a sequence of postures. We also note that Sollukattu is a form of Carnatic Music.

Figure 4: Ontology of Bharatanatyam Adavu at the abstract level

Elaborating on the basic concept of Adavus, we show in Figure 5 that there are several specializations of Adavus like Tatta Adavu or Natta Adavu having instances Tatta Adavu 1, ..., Tatta Adavu 8 etc., and there are several instances of Sollukattus like Tatta A, Kuditta Mettu, etc. Specifically, every Adavu is synchronized with (isSyncedWith) a unique Sollukattu.

Figure 5: Ontology of Bharatanatyam Adavu with specializations and instances

3.2 Ontology of Sollukattus

Next, we elaborate the ontology of a Sollukattu (Section 2.1) in Figure 6. A Sollukattu is performed in a Taalam that designates a specific pattern of rhythm. A Taalam is composed of a sequence of beats (isSequenceOf) going at a certain tempo (speed). At the end of the sequence of beats (or the bar), the Taalam repeats itself. A tempo corresponds to the speed of the rhythm which may be carried out in one of the three speeds (Laya) – slow, medium, and fast. Adi Taalam and Roopakam Taalam are the typical rhythms used in Bharatanatyam.

Figure 6: Ontology of Sollukattus

A beat is an instant in time that may be marked by the beating of a stick and optionally accented by a bol. Hence, it has_a temporal position (time stamp), an instrumental strike (for example, beating of Tatta Kazhi), and a bol like tei, yum, tat, ... (the vocabulary of Bols identified by Bharatanatyam experts is given in Table 8).
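
The has_a composition of a beat can be mirrored by a small record type. The Python sketch below is one possible encoding under stated assumptions: the class name `Beat`, its field names, the use of a tuple to allow several bols on one beat, and the helper `make_sequence` are all introduced here for illustration and are not part of the ontology itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Beat:
    """A beat has_a time stamp, an instrumental strike, and optional bols."""
    time_stamp: float                # temporal position, in seconds
    instrumental_strike: bool = True  # e.g. beating of the Tatta Kazhi
    bols: Tuple[str, ...] = ()       # empty when the beat has no bol

    @property
    def is_stick_beat(self):
        # A stick-beat is struck on the instrument but carries no bol.
        return self.instrumental_strike and not self.bols


def make_sequence(bols_per_beat, interval_s):
    """Build an equally spaced sequence of beats from per-beat bol tuples."""
    return [Beat(time_stamp=i * interval_s, bols=tuple(bols))
            for i, bols in enumerate(bols_per_beat)]
```

For instance, the opening of Nattal A in Table 1 ("tat tei tam [B]") becomes four `Beat` records, the last of which reports `is_stick_beat` as true.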

Sl. # | Bol | Sl. # | Bol | Sl. # | Bol
--- | --- | --- | --- | --- | ---
1 | a | 12 | ha | 23 | tak
2 | da | 13 | hat | 24 | tam
3 | dha | 14 | hi | 25 | tan
4 | dhat | 15 | jag | 26 | tat
5 | dhi | 16 | jham | 27 | tei
6 | dhin | 17 | ka | 28 | tom
7 | dhit | 18 | ki | 29 | tta
8 | ding | 19 | ku | 30 | ya
9 | e | 20 | na | 31 | yum
10 | gadu | 21 | ri | 32 | Stick Beat
11 | gin | 22 | ta | |

Stick Beat is treated as a pseudo-bol. The bols shown in the table are typical, as Bharatanatyam does not follow a strictly fixed set of bols.
Table 8: Bol vocabulary of Sollukattus

3.3 Ontology of Adavus

We elaborate the ontology of an Adavu (Section 2.2) in Figure 7. An Adavu is created by a sequence of Postures and intervening Movements like Utplavana (Jumps), Chari (Walking), or Karana (synchronized movement of hands and feet). Karanas ('doing' in Sanskrit) are the 108 key transitions described in Natya Shastra. A posture may be a Key Posture or a Transition Posture. A Key Posture is defined as a momentarily stationary pose taken by the dancer with well-defined positions for the Legs (Pada Bheda), the Arms (Bahu Bheda), the Head (Shiro Bheda), the Neck (Griba Bheda), the Eyes (Drishti Bheda), and the Hands (Hasta Mudra). Every Key Posture is also defined with a specific Leg Support and Spinal Bending to support and balance the body. A Transition Posture, in turn, is a transitory pose (ill-defined, at times) between two consecutive Key Postures in a sequence or a pose assumed as a part of a movement. It may be Trajectorial or Natural. While a Trajectorial Transition Posture occurs in a well-defined trajectory path of body parts, a Natural Transition Posture may be suitably chosen by a dancer to move from one Key Posture to the next.

Figure 7: Ontology of Adavus

In the current work, we focus only on Key Postures and do not model or analyze movements and transitions. Hence, we do not elaborate the ontology for Transition Postures or movements. However, the concept of Key Postures is detailed in Figure 8.

3.3.1 Vocabulary of Positions and Formations

To elaborate the ontology for a Key Posture, we introduce the notions of positions and formations of constituent limbs or body parts. A formation describes the specific manner in which a body part is posed in the posture. For body parts that occur in pairs (like leg, arm, hand, eye), the combined formation of the individual (left and right) parts defines a position. For the rest (like head, neck), position and formation are taken to be synonymous. Accepted nomenclature (as identified by the experts) exists for many positions / formations of most of the body parts in Bharatanatyam. Naturally, we adopt those. For the rest, we assign names based on crisp descriptors of the positions. We observe that the postures are mostly distinguishable based on the four major body parts: leg, arm, head, and hand. Hence, we have not considered the eyes and the neck in building the posture ontology.

In Table 9, we list the vocabulary for formations of left and right legs as well as their combined legs positions. Some of the positions are asymmetric in which the left and the right leg assume different formations. For example, if the left leg is in Anchita formation and the right leg is in Samapadam formation, the combined legs position is named as Ardha Prenkhanam. Naturally, every asymmetric position has a position which is a mirror image of the other one, marked by [M] (Mirror), where the formations of the legs are swapped. That is, in Ardha Prenkhanam [M], the right leg is in Anchita formation and the left leg is in Samapadam formation. In the table, we have listed only one of these mirrored positions. Remaining leg positions are symmetric in which both legs assume the same formation. In such cases, the position is marked with an [S] (Symmetric) and the same name is used for the formation and the position. Hence in Aayata [S] position, both legs are in Aayata formation.

Left Leg Formation Right Leg Formation Leg Position
Asymmetric Positions
Anchita Samapadam Ardha Prenkhanam
Aayata Back Swastikam Back Swastikam
Agratala Sanchara Samapadam Chalan Chari
Aayata Diagona Anchita Diagonal Prenkhanam
Bend On Knee Support Ekapadam
Aayata Front Anchita Front Prenkhanam
Aayata Front Swastikam Front Swastikam
Aayata Prerita Prerita
Parsasuchi Bisamasuchi Garudamandalam
Aayata Forward / Side Low Lolita Chari
Aayata Anchita Prenkhanam
Aayata Side Middle / Low Prenkhanam Above Floor
Aayata Kunchita Aaleeda ([M] = Pratyaaleeda)
Kunchita Aayata Pratyaaleeda
Symmetric Positions
Aayata Anchita Ekapadam Bhramari
Samapadam Motita Mandal Side Chankramanang
Muzmandi Slip With Left Knee Chankramanang
Kuttana Slip With Right Knee Back Chankramanang
Parswa Aayata
Table 9: Vocabulary of formations and positions of legs (Pada Bheda)
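
The naming convention for legs positions ([S] for symmetric, [M] for mirrored) can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name `legs_position` and the dictionary `ASYMMETRIC_POSITIONS` are hypothetical, and only two asymmetric pairs that the text describes explicitly are included.

```python
# Illustrative sketch of the legs-position naming convention.
# Names (ASYMMETRIC_POSITIONS, legs_position) are hypothetical, not from the paper.
# Keys are (left leg formation, right leg formation) pairs described in the text.
ASYMMETRIC_POSITIONS = {
    ("Anchita", "Samapadam"): "Ardha Prenkhanam",
    ("Aayata", "Anchita"): "Prenkhanam",
}


def legs_position(left: str, right: str) -> str:
    """Name the combined legs position for a pair of leg formations."""
    if left == right:
        # Symmetric position: the formation name is reused and marked [S].
        return f"{left} [S]"
    if (left, right) in ASYMMETRIC_POSITIONS:
        return ASYMMETRIC_POSITIONS[(left, right)]
    if (right, left) in ASYMMETRIC_POSITIONS:
        # Mirror image: the formations of the two legs are swapped.
        return ASYMMETRIC_POSITIONS[(right, left)] + " [M]"
    raise KeyError(f"no legs position listed for ({left!r}, {right!r})")
```

A full implementation would need the complete Table 9 and special handling for mirrored positions that carry their own name (for example, Aaleeda and Pratyaaleeda).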

In Table 10, we list the vocabulary for the formations of the arms. Either arm can assume any of these formations. In the case of arms, no specific names are used for combined arms positions. Hence, they are referred to by the names of both formations if they are different. For example, if the left arm is in Kunchita Natyarambhe formation and the right arm is in Natyarambhe formation, the combined arms position is named Natyarambhe–Kunchita Natyarambhe. If, however, both formations are the same, we name the position with an [S]. Hence, Natyarambhe [S] has Natyarambhe formation for both arms. In Table 11, we list the vocabulary for the formations of the head. Naturally, there is no position descriptor here. Next, we list the vocabulary for the formations of the hands (Hasta Mudra) in Table 12. Like arms, these are denoted with the formations of single hands only, and the combined hands position is named similarly. It may be noted that the vocabulary listed here is a subset of Asamyutha Hasta, or single hand gestures, as commonly observed in the Adavus. We do not consider Samyutha Hasta, or combined (both) hand gestures, in building the vocabulary.

Above Head Natyarambhe Diagonal High Kunchita Natyarambhe
Above Head Natyarambhe Diagonal Middle Left Diagonal High
(Joined ) Elbow Down Anchita Natyarambhe
Anchita Forward High Right Diagonal High
Anchita Above Left Ear Forward High Above Head Right Diagonal Middle
Anchita Above Right Ear Forward Low Side High
Ardha Vithi Forward Middle Side High Natyarambhe
Backward High Front Natyarambhe Side Low
Backward Low Katyang Behind Waist Side Middle
Backward Middle Kunchita Utsanga
Cross Kunchita Kunchita Above Shoulder
Table 10: Vocabulary of formations of arms (Bahu Bheda)

3.3.2 Ontology of Key Postures

We elaborate the ontology of Key Postures in Figure 8. Consider the Legs Positions. For the Prenkhanam Legs Position in Natta1P2 in the figure, the left leg makes the Aayata (bent at knee) and the right leg makes Anchita formation (straight and stretched). Prenkhanam [M] is a mirror image position of Prenkhanam where the formations of the two legs are swapped. Natta1P3 is a mirrored posture of Natta1P2 and has Prenkhanam [M] for the legs positions. With symmetry Natta1P1 has Aayata [S] legs position.

Consider instances of 3 key postures – Natta1P1, Natta1P2, and Natta1P3 – of Natta 1 Adavu. For example, for instance Natta1P1, we have Legs Position = Aayata [S], Arms Position = Natyarambhe [S], Hands Position = Tripataka [S], and Head Position = Samam.

Samam Left Adhomukham Right Adhomukham
Adhomukham Left Ardha Paravrittam Right Ardha Paravrittam
Back Paravrittam Left Paravrittam Right Paravrittam
Udvahitam Left Utshiptam Right Utshiptam
Ardha Aalolitam
Table 11: Vocabulary of positions / formations of head (Shiro Bheda)
Alapadma Kartarimukha Mushti Suchi
Avahitya Katakamukha Pataka Tripataka
Dola Mrigashirsha Shikhara
Table 12: Vocabulary of formations of hands (Hasta Mudra)

We identify 361 distinct postures and 48 distinct movements in the 58 Adavus.

Dotted lines denote isInstanceOf between an instance and a class
Dashed lines denote has_a between two instances
Natta1P1 Natta1P2 Natta1P3
Figure 8: Ontology of Key Postures

3.4 Ontology of Audio-Visual Sync between Sollukattu & Adavu

With the ontology of music (Sollukattu) and (visual) sequence of postures (Adavu) of Bharatanatyam, we next capture the synchronization of the events. As the postures are driven by and are synchronized with the beats of the music, and as the performance repeats after a bar of the rhythm, we capture the ontology of synchronization between an Adavu and its Sollukattu as in Figure 9. Here specific instances of beats, Beat 1, Beat 2, …, Beat $\lambda$ (where $\lambda$ is the number of beats in a bar), form the sequence of beats in a Sollukattu. So we express that Beat 1 isFollowedBy Beat 2, Beat 2 isFollowedBy Beat 3, and so on. Finally, after Beat $\lambda$, the bar repeats, and hence, Beat $\lambda$ isFollowedBy Beat 1. Similarly, instances of key postures, Posture 1, Posture 2, …, Posture $\lambda$, form the sequence of postures in an Adavu that also repeats. Being driven by music, every beat triggers the corresponding posture. In the figure, we show only one cycle (bar) of the Taalam. In an Adavu, usually 1, 2, 4, 6, 8, or more repetitions are performed by the dancer. Explicit instances of bols and time instants are omitted from the diagram for better clarity.

Figure 9: Ontology of Audio-Visual Sync in Bharatanatyam
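
The cyclic isFollowedBy structure and the beat-to-posture triggering described above can be sketched as plain dictionaries. This is a minimal illustration, not the authors' ontology encoding: the function name `build_sync_cycle` and the key name `triggers` are assumptions made for the example.

```python
# Minimal sketch of one bar of the audio-visual sync ontology.
# build_sync_cycle and the "triggers" key are hypothetical names.
def build_sync_cycle(n_beats: int) -> dict:
    """Build the relations for one bar with n_beats beats and postures."""
    if n_beats < 1:
        raise ValueError("a bar needs at least one beat")
    beats = [f"Beat {i}" for i in range(1, n_beats + 1)]
    postures = [f"Posture {i}" for i in range(1, n_beats + 1)]

    def cyclic_successor(items):
        # The last item is followed by the first one, because the bar repeats.
        return {item: items[(k + 1) % len(items)] for k, item in enumerate(items)}

    return {
        "beat_isFollowedBy": cyclic_successor(beats),
        "posture_isFollowedBy": cyclic_successor(postures),
        # Every beat triggers the corresponding posture.
        "triggers": dict(zip(beats, postures)),
    }
```

Note that such a structure records only order, not elapsed time, which is the limitation that motivates the event-based model of Section 4.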

In this section, we have captured the central concepts of Bharatanatyam Adavus in terms of a set of object-based ontological models. These models identify the key items with their interrelationships and help the annotation of data sets for training as well as testing. Naturally, they lead to algorithms for the analysis and recognition of various items (like bols and postures). However, these models are structural, and hence, are limited in their temporal specification.

4 Event-based Modeling of Adavus

The framework used so far is good for taxonomical and partonomical representation but lacks expressiveness in temporal terms. Dance, however, is multimedia in nature, with music driving the steps. In order to capture the dynamic association between music and video, we first tried to use the concept of triggers to model the synchronization of events. The progression of time is captured by simple sequences (isFollowedBy) of occurrences of bols and beats. This approach is illustrated in Figure 9. Since a simple sequence of bols and beats does not record the actual duration of each time slice, it cannot deal with triggers between beat and posture actions, and cannot ensure an equal time gap between beats. Temporal behavioral models are necessary to analyze and recognize such temporal and synchronization details in depth. Hence, we introduce an event-based modeling framework that, on the one hand, relates to the key concepts introduced above and, on the other, is defined in terms of temporal relationships.

This event-based framework treats a performance as a multimedia stream and takes the models closer to the structure of the data that we capture later by Kinect. A Bharatanatyam Adavu, therefore, consists of (1) a Composite Audio Stream (Sollukattu) containing (a) an Instrumental Sub-stream generated by instrumental strikes and (b) a Vocal Sub-stream generated by vocalizations or bols; (2) a Video Stream of frames, each containing either (a) a Key Posture (called a K-Frame) or (b) a Transition Posture (called a T-Frame); and (3) Synchronization (Sync) of the Position, Posture, Movement, and Gesture of an Adavu, performed in synchronization among themselves and with the rhythm of the music. In the Instrumental and Vocal Sub-streams of a Sollukattu, beating and bols are usually generated in sync. The rules or structure of synchronization have been defined for every Sollukattu in Bharatanatyam.

4.1 Events of Adavus

An Event denotes the occurrence of an activity (called Causal Activity) in the audio or the video stream of an Adavu. Further, sync events are defined between multiple events based on temporal constraints. Sync events may be defined jointly between audio and video streams. An event is described by:

Event Event Event Event
Category Type Description Label
Audio Full beat with bol bol, downbeat, upbeat
Audio Half beat with bol bol
Audio Quarter beat with bol bol
Audio Full beat having no bol upbeat
Audio Half beat having no bol
Audio bol is vocalized bol
Video No motion Range of Frames, Key Posture
Video Transition Motion Range of Frames
Video Trajectory Motion Range of Frames, Trajectory
Sync bol @ Full beat bol
Sync bol @ Half beat bol
Sync No motion @ Full beat Key Posture
Sync No motion @ Half beat Key Posture
1: A (full) beat is the basic unit of time – an instant on the timescale
2: Vocalized bols accompany some beats
3: The first beat of a bar
4: The last beat in the previous bar which immediately precedes the downbeat
5: Half beats are soft strikes at the middle of a tempo period
6: Quarter beats strike at the middle of a Full-to-Half or a Half-to-Full beat
7: Frames over which the dancer does not move (assumes a Key Posture)
8: Sequence of consecutive frames over which the event spreads
9: A Key Posture is a well-defined and stationary posture
10: Transitory motion to change from one Key Posture to the next
11: Motion that follows a well-defined trajectory of movement for limbs
12: and in sync. That is,
Table 13: List of Events of Adavus
  1. Category: The nature of the event based on its origin (audio, video or sync).

  2. Type: Type relates to the causal activity of an event in a given category. Event types are listed in Table 13 with brief description.

  3. Time-stamp / range: The time of occurrence of the causal activity of the event. This is the elapsed time from the beginning of the stream and is marked by a function $\tau(\cdot)$. Often a causal activity may spread over an interval, which is then associated with the event. For video events, we use the range of video frame numbers as the temporal interval. Since the video has a fixed rate of 30 fps, for any event we interchangeably use its time-stamp or its frame number, as is appropriate in a context.

  4. Label: Optional labels may be attached to an event for annotating details.

  5. ID: Every instance of an event in a stream is distinguishable. These are sequentially numbered in the temporal order of their occurrence (Table 16).

The list of events is given in Table 13, and the events are characterized in the next sections.
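
The five descriptors of an event (category, type, time-stamp / range, label, and ID) map naturally onto a small record type. The following is a sketch under assumptions, not the authors' data model: the class name `Event`, its field names, and the helper `number_events` are hypothetical, and the helper numbers all the events it is given in a single sequence.

```python
# Hypothetical record type for the event descriptors listed above.
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple

CATEGORIES = {"audio", "video", "sync"}


@dataclass(frozen=True)
class Event:
    category: str                     # origin of the event: audio, video, or sync
    event_type: str                   # e.g. "full beat with bol", "no motion"
    time_range: Tuple[float, float]   # (start, end) in seconds; equal for an instant
    label: Optional[str] = None       # optional annotation, e.g. the bol
    event_id: int = 0                 # assigned later, in temporal order

    def __post_init__(self) -> None:
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.time_range[1] < self.time_range[0]:
            raise ValueError("time_range must be (start, end) with start <= end")


def number_events(events: List[Event]) -> List[Event]:
    """Return copies of the events, numbered sequentially by start time."""
    ordered = sorted(events, key=lambda e: e.time_range[0])
    return [replace(e, event_id=k) for k, e in enumerate(ordered, start=1)]
```

A frozen dataclass is used so that an event, once numbered, cannot be modified by accident; whether IDs should be assigned per stream or globally is left open here.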

4.2 Characterization of Audio Events

A Sollukattu is the musical meter of an Adavu. Traditionally, a Tatta Kazhi (wooden stick) is periodically struck on a Tatta Palahai (wooden block) in the rhythmic pattern of Adi or Roopakam Taalams to produce the periodic beats (or full-beat events in Table 14). Usually beats repeat in a bar of $\lambda$ = 6 or 8 beats. (A bar, or measure, is a segment of time corresponding to a specific number of beats; Sollukattus also use longer bars of 12, 16, 24, or 32 beats.) The tempo of a meter is measured in beats per minute (bpm) and can be slow, medium, or fast. We use the Tempo Period (or Period) $T$, the time interval between two consecutive beats in seconds, as the temporal measure for a meter.

In the current work we use only the slow tempo. While there is no fixed definition for the bpm of a slow tempo (medium and fast progressively doubles relative to the slow one), it is typically found to be between 75 (period = 0.8 sec.) and 30 (period = 2 sec.) in most of the performances. Theoretically, the tempo period should not vary during the performance of a specific Sollukattu or across Sollukattus. However, in reality it does vary depending on the skill of the beat player. Naturally, the event model needs to take care of such variations.
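
The relation between tempo in bpm and tempo period quoted above (75 bpm corresponds to 0.8 sec., 30 bpm to 2 sec.) is simply a division of 60 seconds by the bpm. A minimal sketch follows; the function names and the `LAYA_SPEEDUP` table are illustrative assumptions, with medium and fast taken as 2 and 4 times the slow bpm, following the progressive doubling mentioned in the text.

```python
# Illustrative helpers relating tempo (bpm) and tempo period (seconds).
# Function names and LAYA_SPEEDUP are hypothetical.
LAYA_SPEEDUP = {"slow": 1, "medium": 2, "fast": 4}


def period_from_bpm(bpm: float) -> float:
    """Tempo period in seconds for a tempo given in beats per minute."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    return 60.0 / bpm


def bpm_from_period(period: float) -> float:
    """Tempo in beats per minute for a tempo period given in seconds."""
    if period <= 0:
        raise ValueError("period must be positive")
    return 60.0 / period


def laya_period(slow_bpm: float, laya: str) -> float:
    """Tempo period of a laya, assuming each step doubles the slow bpm."""
    return period_from_bpm(slow_bpm * LAYA_SPEEDUP[laya])
```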

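Next, let us consider two consecutive beat events $\alpha^{fb}_i$ and $\alpha^{fb}_{i+1}$ in a bar of length $\lambda$, where $T$ denotes the (tempo) period and $\tau(\cdot)$ the time-stamp of an event. The time-stamps of the respective events are then related as $\tau(\alpha^{fb}_{i+1}) = \tau(\alpha^{fb}_i) + T$. Further, the bar repeats after an equal time interval of $T$. That is, the first beat of the next bar occurs at $\tau(\alpha^{fb}_\lambda) + T$. We refer to such beats as full beats, and hence the superscript fb in the events. The first beat $\alpha^{fb}_1$ (last beat $\alpha^{fb}_\lambda$) of a bar is referred to as a downbeat (upbeat). We mark these on the events as labels. In many Sollukattus beating is also performed at the middle of a period. These are called half beats and produce the events $\alpha^{hb}_i$ in the $i$-th period. Naturally, $\tau(\alpha^{hb}_i) = \tau(\alpha^{fb}_i) + T/2$.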

Often in a Sollukattu the beat player (an accomplice of the dancer) also utters bols. These are done in sync with a full beat or a half beat. We represent bols as labels of the respective full-beat or half-beat events. A bol is optional for an event.

It may be noted that a beat is actually an instant of time that occurs once in every tempo period ($T$ secs). So it is possible that a beat has no beating (and obviously no bol). Such cases, however, are not in the scope of the present study, and we always work with a beating at a beat.

There are 23 Sollukattus. We illustrate a few here to understand various meters. All Sollukattus are shown in slow tempo or Vilambit Laya.

  1. Kuditta Mettu (period $T \approx 1.2$ secs, bar length $\lambda = 8$): We show two bars in Table 14 with bols and time-stamps. In Figure 10, we illustrate the signal for a Kuditta Mettu recording, highlighting various events, time-stamps, and bols. While this Sollukattu has only full-beat events by definition, some incidental half-beat events can still be seen in the signal. These will need to be removed later.

    Table 16 shows its relationship with the Adavu.

    Event (Bar 1) Time (sec.) Beat Offset (sec.) Event (Bar 2) Time (sec.) Beat Offset (sec.)
    (tei) 2.681 - (tei) 12.271 1.207
    (hat) 3.912 1.231 (hat) 13.386 1.115
    (tei) 5.108 1.196 (tei) 14.512 1.126
    (hi) 6.269 1.161 (hi) 15.603 1.091
    (tei) 7.523 1.254 (tei) 16.764 1.161
    (hat) 8.742 1.219 (hat) 17.902 1.138
    (tei) 9.891 1.149 (tei) 19.028 1.126
    (hi) 11.064 1.173 (hi) 20.178 1.150
    sec.,
    Table 14: Pattern of Kuditta Mettu Sollukattu (Figure 10 (a))
    Parameters: No. of bars = 2, $\lambda$ = 8 and $T \approx$ 1.2 sec.
    Full-beat event positions are highlighted (yellow blobs) and the corresponding bols and time-stamps are shown (Table 14). Note that several half-beat events are also visible in the signals. These are rather incidental and not intended in the Sollukattu. Also, the beatings before the downbeat are ignored. Right-sided Key Postures (Figure 12) are also shown for the first 8 beats. Left-sided Key Postures are performed for the next 8 beats.
    Figure 10: Marking of beats and annotations of bols for Kuditta Mettu Sollukattu
  2. Tatta_C (period $T \approx 1.6$ secs, bar length $\lambda = 8$): It has full-beat as well as half-beat events (Table 15 and Figure 11).

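    Event Time (sec.) Beat Offset (sec.) 1/2–Beat Offset (sec.)
    Full beat (tei) 6.571 - -
    Half beat (ya) 7.395 - 0.82
    Full beat (tei) 8.185 1.61 -
    Half beat (ya) 8.962 - 0.78
    Full beat (tei) 9.752 1.57 -
    Half beat (ya) 10.565 - 0.81
    Full beat (tei) 11.366 1.61 -
    Full beat (tei) 13.003 1.64 -
    Half beat (ya) 13.815 - 0.81
    Full beat (tei) 14.628 1.63 -
    Half beat (ya) 15.441 - 0.81
    Full beat (tei) 16.184 1.56 -
    Half beat (ya) 17.031 - 0.85
    Full beat (tei) 17.809 1.63 -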
    sec.,
    Table 15: Patterns of Tatta_C Sollukattu (Figure 10 (b))
    Parameters: No. of bars = 2, $\lambda$ = 8 and $T \approx$ 1.6 sec.
    Full-beat (yellow blobs) and half-beat (green blobs) event positions are highlighted and the corresponding bols and time-stamps are shown (Table 15).
    Figure 11: Marking of beats and annotations of bols for Tatta_C Sollukattu
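  3. Kuditta Nattal_A & Tatta_E (period $T \approx 1.0$ secs, bar length $\lambda = 8$): In addition to full beats with bols, full-beat and half-beat events having no bol, where there is only beating, are also found (Table 16).

  4. Joining_B (period $T \approx 1.5$ secs, bar length $\lambda = 8$): As such, it uses only full-beat events (Table 16).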
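
The beat offsets reported for Kuditta Mettu in Table 14 can be recomputed from the time-stamps, which also gives a simple way to flag the tempo variations mentioned earlier. The sketch below is illustrative only: the function names and the tolerance value are assumptions, and the time-stamps are copied from Table 14.

```python
# Recompute beat offsets from the Table 14 time-stamps (seconds).
# Function names and the tolerance below are hypothetical choices.
BAR_1 = [2.681, 3.912, 5.108, 6.269, 7.523, 8.742, 9.891, 11.064]
BAR_2 = [12.271, 13.386, 14.512, 15.603, 16.764, 17.902, 19.028, 20.178]


def beat_offsets(times):
    """Gaps between consecutive beats, rounded to milliseconds."""
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


def mean_period(times):
    """Average gap between consecutive beats."""
    return (times[-1] - times[0]) / (len(times) - 1)


def irregular_gaps(times, tolerance):
    """Indices k (0-based) whose gap between beat k and k+1 deviates
    from the mean period by more than the tolerance (in seconds)."""
    mean = mean_period(times)
    return [k for k, gap in enumerate(beat_offsets(times))
            if abs(gap - mean) > tolerance]


all_beats = BAR_1 + BAR_2
offsets = beat_offsets(all_beats)      # 15 gaps for 16 beats
average = mean_period(all_beats)       # close to the nominal 1.2 secs
```

With a tolerance of 0.08 sec., only the gap between the fourth and fifth beats of the first bar (1.254 sec.) is flagged for this recording.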

All Sollukattus in terms of the Bols are listed in Table 1.

Sollukattu Description of Bol / Adavus
Kuditta (tei) (hat) (tei) (hi)
Mettu (tei) (hat) (tei) (hi)
2-2 Adavu: Kuditta_Mettu 1, 2, 3, 4
Kuditta (tat) (tei) (tam)
Nattal A (dhit) (tei) (tam)
2-2 Adavu: Kuditta_Nattal 1, 2, 3, 6
Tatta E (tei) (tei) (tam)
(tei) (tei) (tam)
2-2 Adavu: Tatta 6
Joining B (dhit) (dhit) (tei)
(dhit) (dhit) (tei)
2-2 Adavu: Joining 2
Table 16: Variations in the patterns of Sollukattus with Adavus

4.3 Characterization of Video Events

While performing an Adavu the dancer closely follows the beats of the accompanying music. At a beat, the dancer assumes a Key Posture and holds it for a little while before quickly changing to the next Key Posture at the next beat. Consequently, while the dancer holds the key posture, she stays almost stationary and there is no or very slow motion in the video. This leads to no-motion events. Further, while the dancer changes to the next key posture, we observe transition or trajectory motion events. Since a frame is an atomic observable unit in a video, we can classify the frames of the video of an Adavu into 2 classes:

Figure 12: Right-sided Key Postures of Kuditta Mettu Adavu (Variant 2; Sollukattu = Kuditta Mettu) with bols for Bar 1. Panels (a) to (h) show the key postures at the eight beats of the bar, with bols tei, hat, tei, hi, tei, hat, tei, hi. From a tei to the next hat or hi the dancer sharply lowers her raised feet. Further, 8 left-sided Key Postures are performed for the next 8 beats in Bar 2.
  1. K-frames or Key Frames: These frames contain key postures where the dancer holds the Posture. Evidently, a no-motion event has the sequence of K-frames as labels. All K-frames of a no-motion event contain the same key posture.

  2. T-frames or Transition Frames: These are transition frames between two K-frames while the dancer is rapidly changing posture to assume the next key posture from the previous one. T-frames contain Natural Transition Postures (leading to transition motion events) or Trajectorial Transition Postures (leading to trajectory motion events). A transition or trajectory motion event has the corresponding sequence of T-frames as labels. In the current work, we do not deal with movements and transitions. Hence, we ignore T-frames.

In Figure 12 we show the key postures of Kuditta Mettu Adavu at every beat of the first bar of Kuditta Mettu Sollukattu. The corresponding video and audio events are marked in Table 17 with K-/T-Frames. These are also marked on the Sollukattu in Figure 10. Note that only the right-sided half of the postures are shown in both figures.

Events K-/T-Frames Events K-/T-Frames
Range # of Range # of
[(tei)] 70–99 30 [(tei)] 359–386 28
100–103 4 387–390 4
[(hat)] 104–124 21 [(hat)] 391–410 20
125–145 21 411–429 19
[(tei)] 146–172 27 [(tei)] 430–451 22
173–176 4 452–455 4
[(hi)] 177–191 15 [(hi)] 456–470 15
192–214 23 471–492 22
[(tei)] 215–245 31 [(tei)] 493–521 29
246–249 4 522–525 4
[(hat)] 250–262 13 [(hat)] 526–542 17
263–287 25 543–564 22
[(tei)] 288–314 27 [(tei)] 565–587 23
315–317 3 588–590 3
[(hi)] 318–345 28 [(hi)] 591–620 30
346–358 13 621–
Table 17: Patterns of Kuditta Mettu Adavu (Figure 12)

4.4 Characterization of Synchronization

A Bharatanatyam dancer intends to perform the key postures of an Adavu in synchronization with the beats. Hence, audio events like full beats and the corresponding video events like no-motion should be in sync. Every Adavu has a well-defined set of rules that specifies this synchronization based on its associated Sollukattu. For example, in Figure 12, we show how different key postures of Kuditta Mettu Adavu should be assumed at every beat of the Kuditta Mettu Sollukattu, that is, how the full-beat events of a bar in the audio should sync with the no-motion events of the video. Other Adavus require several other forms of synchronization between the audio-video events, including sync between beats and trajectory-based body movements.

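We assert a sync event $\psi^{fb}$ if a key posture (no-motion event $\nu^{nm}$) should sync with a corresponding (full) beat (event $\alpha^{fb}$). In simple terms, a $\psi^{fb}$ occurs if the time intervals of the $\alpha^{fb}$ and $\nu^{nm}$ events overlap. That is, $\tau(\alpha^{fb}) \cap \tau(\nu^{nm}) \neq \emptyset$. Similar sync events may be defined between other audio and video events according to the rules of Adavus.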
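
The overlap test for a sync event can be tried out on the data already tabulated for Kuditta Mettu. The sketch below is not the authors' algorithm; it assumes that the beat time-stamps of Table 14 and the K-frame ranges of Table 17 come from the same recording on a common clock, that frame $n$ covers the interval $[n/30, (n+1)/30)$ seconds, and that a beat is treated as a zero-length interval. All function names are hypothetical.

```python
# Sketch: detect sync between full beats (Table 14, Bar 1) and the
# no-motion K-frame ranges (Table 17, Bar 1). Names are hypothetical.
FPS = 30
BEAT_TIMES = [2.681, 3.912, 5.108, 6.269, 7.523, 8.742, 9.891, 11.064]
K_FRAME_RANGES = [(70, 99), (104, 124), (146, 172), (177, 191),
                  (215, 245), (250, 262), (288, 314), (318, 345)]


def frame_range_to_interval(first, last, fps=FPS):
    """Time interval in seconds covered by frames first..last (assumed model)."""
    return (first / fps, (last + 1) / fps)


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_sync_events(beat_times, k_frame_ranges):
    """Pairs (beat index, K-range index) whose time intervals overlap."""
    sync = []
    for i, t in enumerate(beat_times):
        for j, (first, last) in enumerate(k_frame_ranges):
            if overlaps((t, t), frame_range_to_interval(first, last)):
                sync.append((i, j))
                break
    return sync


pairs = find_sync_events(BEAT_TIMES, K_FRAME_RANGES)
```

Under these assumptions every beat of the first bar falls inside the matching no-motion range, whereas a beat that lands between two ranges (a lagging or leading posture) produces no pair.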

Perfect synchronization is always intended and desirable for a performance. However, we often observe the lack of it due to various reasons. The beating instrument, vocal bols, and body postures each have a different latency. If a posture is assumed after hearing the beat, the no-motion event will lag the beat event. If the dancer assumes the posture in anticipation, the no-motion event may lead the beat event. Lack of sync may also arise due to imperfect performance of the dancer, the beater, or the vocalist. Hence, analysis and estimation of sync is critical for processing an Adavu.

While sync between the audio and video streams is fundamental to the choreography, there are a variety of other synchronization issues that need to be explored. These include sync between beating (instrumental) beats and (vocalized) bols, uniformity of time gap between consecutive beats, sync between different body limbs while changing from one key posture to the next, and so on.

5 Ontology of Events and Streams

We have captured the structural models of Sollukattus and Adavus in Section 3 and then, the temporal behavioral models in Section 4 based on these structures. Now, we would like to relate these to the actual recording data of the performances. For the current work we capture the performances of Bharatanatyam Adavus using Kinect XBox 360 (Kinect 1.0) sensor. So in this section, we model the relationships between the events and the Kinect streams to facilitate the formulation of the algorithms later.

Kinect 1.0 is an RGBD sensor that captures a multi-channel audio stream along with 3 video streams (RGB, Depth, and Skeleton) in its data file. The video streams are captured at 30 frames per second (fps). The RGB stream comprises frames containing color intensity images. The depth stream comprises frames containing depth images. The skeleton stream comprises frames containing 20-joint skeleton images of the human beings in view. The video streams are synchronized among themselves. Hence, for any RGB frame, the corresponding depth and skeleton frames carry the same frame number. The audio is also synchronized with the video by the same clock. Hence, any time $t$ (in seconds) on the audio stream corresponds to the RGB (depth, skeleton) frame with frame number $\eta \approx 30 \times t$.
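
Because the three video streams share frame numbers and the audio shares the clock, a time on the audio stream can be mapped to a triplet of frames. The following sketch assumes frame numbers start at 0 and that the mapping rounds down; neither convention is stated in the text, and the names `frame_number`, `triplet_at`, and `FrameTriplet` are hypothetical.

```python
# Sketch of the audio-time to video-frame mapping at 30 fps.
# Rounding down and 0-based frame numbers are assumptions.
import math
from typing import NamedTuple

FPS = 30  # frame rate of the Kinect video streams, as stated in the text


class FrameTriplet(NamedTuple):
    rgb: int
    depth: int
    skeleton: int


def frame_number(t_seconds: float, fps: int = FPS) -> int:
    """Frame number that contains the given time on the shared clock."""
    if t_seconds < 0:
        raise ValueError("time must be non-negative")
    return math.floor(t_seconds * fps)


def triplet_at(t_seconds: float) -> FrameTriplet:
    """RGB, depth, and skeleton frames in sync at the given time."""
    n = frame_number(t_seconds)
    # Synchronized streams carry the same frame number.
    return FrameTriplet(rgb=n, depth=n, skeleton=n)
```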

We now present a combined ontology for the events (as introduced in the last section) and the streams (of a Kinect data file), and capture their interrelationships. For this we identify sets of classes (Table 18), instances (Table 19), and relations (Table 20).

Class Type Class Type
Kinect Data File Concrete Audio-Event Stream Concrete
Audio Stream Concrete Video-Event Stream Concrete
Video Stream Concrete Audio Event Abstract
RGB Stream Concrete Video Event Abstract
Depth Stream Concrete Beat Event Abstract
Skeleton Stream Concrete Bol Event Concrete
RGB Frame Concrete Full Beat Event Concrete
Depth Frame Concrete Half Beat Event Concrete
Skeleton Frame Concrete Full Beat with bol (FB+B) Event Concrete
K-Frame Concrete Half Beat with bol (HB+B) Event Concrete
T-Frame Concrete No-Motion Event Concrete
Transition Event Concrete
Table 18: List of Classes for the ontology of Events and Streams
Class:Instance Remarks
Full Beat Event: FBB1, FBB2, … Instances of full beat with bol events
Half Beat Event: HBB1, HBB2, … Instances of half beat with bol events
No-Motion Event: NM1, NM2, … Instances of no motion events
K-Frame: , , , Intensity (RGB) image frames from no. to
T-Frame: , , , Intensity image frames from no. to , where
:, , Depth image frames from number
:, , Skeleton image frames from number
Table 19: List of Instances for the ontology of Events and Streams
Relation Domain Co-Domain Remarks
is_a Class Class As in Table 7
has_a Class Class As in Table 7
isInstanceOf Instance Class As in Table 7
isSyncedWith Instance Instance Expresses low-level synchronization – between audio / video events and video frames. For example, an audio event FBB1 (instance of ’full-beat with bol’) isSyncedWith a unique K-Frame.
isSequenceOf Class Class As in Table 7
isExtractedFrom Class Class An event isExtractedFrom Kinect video
isInSync Relation over 3 Instances Expresses the inherent synchronization in data – between audio and multiple video streams – RGB, Depth and Skeleton. Every RGB Frame isInSync with a corresponding Depth Frame or Skeleton Frame.
All relations, with the exception of ’isInSync’, are binary
Table 20: List of Relations for the ontology of Events and Streams

The ontology is presented in Figure 13. The following points about the ontology may be noted:

  • The event-side is shown in blue and the stream-side is shown in black.

  • A K-frame is a semantic notion that is instantiated as a triplet of RGB, Depth, and Skeleton frames. Also, it actually represents a sequence of consecutive frames in the video having no motion. T-frames are treated similarly.

  • isExtractedFrom represents the processes of extraction (or detection, estimation etc.) of audio (video) events from audio (video) streams. These are not directly available from the Kinect streams and need to be computationally determined. Specific algorithms required include:

    • Beat detection to produce full-beat or half-beat events

    • Bol recognition to produce full-beat-with-bol (FB+B) or half-beat-with-bol (HB+B) events

    • No-Motion detection to produce no-motion events

  • isInSync represents the fact that streams in Kinect are synchronized by the sensor.

  • In contrast, isSyncedWith denotes the explicit attempt of the dancer to synchronize her / his moves and postures with the beats and bols. These are the sync events at full or half beats (Table 13). To estimate isSyncedWith, K-Frames and T-frames need to be extracted.

Figure 13: Ontology of Kinect Data File, Streams and Audio-Video Events

6 Representation of Adavus in Labanotation

We intend to represent the Bharatanatyam ontology according to the ontology of a parse-able standard notation. Labanotation [guest2005labanotation] (often referred to as Laban Encoding or simply Laban) is a standard notation system used for recording human movements. To record a movement, the Laban system symbolizes space, time, energy, and body parts. Here, we introduce a limited set of symbols that are particularly used for representing the postures of Bharatanatyam Adavus. A posture encoded in Laban is called a frame, and the Laban frames are stacked on the Laban staff as shown in Figure 14. When there is a sequence of postures or gestures changing over time, we stack their symbols on the staff vertically to show the progression over time. The center line of the staff indicates the time. The symbols are read from the bottom to the top of the staff.

Figure 14: Ontology of Labanotation (panels (a) and (b))

The Staff represents the body. The Center Line divides the body into two parts, Left and Right. The columns immediately next to the center line are the Support Columns. The symbols placed in these columns indicate the body parts that carry the weight of the body. The other columns represent the gestures of other body parts such as Leg, Body (torso), Arm, and Head. Except for the head, the body parts have left and right columns. Labanotation captures the movements of the human body parts in terms of the directions and levels of the movement. The direction symbols indicate the direction in space in which a movement occurs, and a movement in any direction can have one of three levels, namely, upward or high, horizontal or middle, and downward or low. Every body part can be expressed in terms of direction and level by placing the respective symbols in the designated columns.

Figure 15: Ontology of Labanotation

The arms and the legs do not always remain straight while performing an Adavu. A few joints of the body, like the knee and the elbow, can get folded. Hence, the degree of folding is useful for these joints. There are a total of six degrees of folding. Bharatanatyam also involves a lot of footwork. Hence, we need to encode the type of touch between the foot and the ground and also which part of the foot is in contact with the ground. The Labanotation system has symbols to diagrammatically illustrate the specific part of the foot that contacts the ground. This attribute is called touch in Labanotation. There are 11 parts of the foot that can touch the ground. The concepts are shown in Figure 16.

Figure 16: Ontology of Labanotation
Figure 17: Ontology of Transcription/ Sensor to parser representation

We want to use the concepts of Labanotation to transcribe the data captured by the sensor into a machine parse-able form. We map the Kinect data to the concepts of Bharatanatyam in Figure 13. Now, we intend to map the concepts of Bharatanatyam to the concepts of Labanotation, as our goal is to generate a parse-able XML descriptor of a Bharatanatyam Adavu. The ontology is shown in Figure 17. There are 4 layers in the ontology:

  1. Input or Sensor Layer: This layer contains the data captured by the sensor. We capture the video of the dance using the Kinect sensor. The data contains the K-frames as well as the T-frames. Here we intend to transcribe only the K-frames.

  2. Dance Layer: According to the ontology shown in Figure 13, the K-frames contain No-Motion events. The No-Motion events are nothing but the Key Postures of a Bharatanatyam Adavu as shown in Figure 8. Here, we map the key posture in terms of the direction, level, degree of folding, and touch concepts of Labanotation. The leg, arm, and head of a key posture are mapped to the Laban concepts.

  3. Laban Descriptor Layer: Each key posture has a corresponding Laban frame in the Laban staff. The legs are described in terms of the leg and support columns of Labanotation. Arm and head have a one-to-one mapping from the Bharatanatyam ontology to the Laban ontology.

  4. Transcript Layer: Finally, the Laban ontology is encoded into a parse-able XML format so that the Labanotation can be visualized or an animation can be generated from the XML.
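
To make the Transcript Layer concrete, the sketch below serializes one Laban frame to XML with Python's standard library. The element and attribute names (`LabanFrame`, `BodyPart`, `direction`, `level`) are purely hypothetical, since the schema of the XML descriptor is not given here, and the codes in the usage note are placeholders rather than the actual encoding of any posture.

```python
# Hypothetical XML serialization of one Laban frame.
# Tag and attribute names are assumptions; the paper's schema is not shown.
import xml.etree.ElementTree as ET


def laban_frame_to_xml(posture: str, beat: int, body_parts: dict) -> str:
    """Serialize a posture's per-body-part codes as an XML string."""
    frame = ET.Element("LabanFrame", {"posture": posture, "beat": str(beat)})
    for name, codes in body_parts.items():
        attributes = {"name": name}
        # XML attribute values must be strings.
        attributes.update({key: str(value) for key, value in codes.items()})
        ET.SubElement(frame, "BodyPart", attributes)
    return ET.tostring(frame, encoding="unicode")
```

For example, `laban_frame_to_xml("Natta1P1", 0, {"head": {"direction": 1, "level": 2}})` yields a single `LabanFrame` element with one `BodyPart` child, which can be parsed back with `ET.fromstring`.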

7 Laban Encoding of an Adavu Posture

We represent an Adavu as a sequence of key postures. To transcribe an Adavu, we need to transcribe every key posture that occurs in the Adavu. For this purpose, we encode the symbols of direction and level in Table 21, the degree of folding in Table 22, and the touch attribute in Table 23.

Direction Encoding
Place 1
Left Side 2
Right Side 3
Left Forward 4
Right Forward 5
Left Backward 6
Right Backward 7
Left Forward Diagonal 8
Right Forward Diagonal 9
Left Backward Diagonal 10
Right Backward Diagonal 11
Level Encoding
HIGH 1
MID 2
LOW 3
Table 21: Encoding of Directions and Levels in Labanotation
Degree of Folding Encoding
No Fold 0
Fold Degree 1 1
Fold Degree 2 2
Fold Degree 3 3
Fold Degree 4 4
Fold Degree 5 5
Full Fold 6
Table 22: Degree of Folding

Foot Parts                  Encoding
No touch                    0
Full heel                   1
One half heel               2
Whole foot                  3
One eighth ball             4
One fourth ball             5
One half ball               6
Full ball                   7
Pad of toe                  8
Full toe                    9
Nail of toe                 10
Table 23: Type of touch with floor
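
The encodings of Tables 21 to 23 can be held as plain lookup tables. The following is a minimal Python sketch of that idea; the dictionary names and the `encode` / `decode` helpers are illustrative assumptions and are not part of the tool described in this paper. Only the symbol names and numeric codes are taken from the tables.

```python
# Symbol-to-code tables copied from Tables 21, 22 and 23.
# All names below (DIRECTION, encode, ...) are hypothetical.
DIRECTION = {
    "Place": 1, "Left Side": 2, "Right Side": 3,
    "Left Forward": 4, "Right Forward": 5,
    "Left Backward": 6, "Right Backward": 7,
    "Left Forward Diagonal": 8, "Right Forward Diagonal": 9,
    "Left Backward Diagonal": 10, "Right Backward Diagonal": 11,
}
LEVEL = {"HIGH": 1, "MID": 2, "LOW": 3}
FOLDING = {
    "No Fold": 0, "Fold Degree 1": 1, "Fold Degree 2": 2,
    "Fold Degree 3": 3, "Fold Degree 4": 4, "Fold Degree 5": 5,
    "Full Fold": 6,
}
TOUCH = {
    "No touch": 0, "Full heel": 1, "One half heel": 2, "Whole foot": 3,
    "One eighth ball": 4, "One fourth ball": 5, "One half ball": 6,
    "Full ball": 7, "Pad of toe": 8, "Full toe": 9, "Nail of toe": 10,
}


def encode(table, symbol):
    """Return the numeric code of a Laban symbol, or raise ValueError."""
    try:
        return table[symbol]
    except KeyError:
        raise ValueError(f"unknown symbol: {symbol!r}") from None


def decode(table, code):
    """Return the symbol name for a numeric code, or raise ValueError."""
    for symbol, value in table.items():
        if value == code:
            return symbol
    raise ValueError(f"unknown code: {code!r}")


if __name__ == "__main__":
    print(encode(DIRECTION, "Right Side"), decode(TOUCH, 3))
```

Keeping the tables as data rather than as branching code makes it straightforward to check an encoded posture against the vocabulary before it is written out.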

Posture Name   Start Frame   End Frame   Beat Number   Bols
(a)            (b)           (c)         (d)           (e)
Natta1P1       70            89          0             No Bol
Natta1P2       101           134         1             tei yum
Natta1P1       144           174         2             tat ta
Natta1P3       189           218         3             tei yum
Natta1P1       231           261         4             ta
Table 24: Annotation of the video of Natta Adavu Variation 1

The sequence of key postures occurring in the first 4 beats of Natta Adavu Variation 1 is shown in Table 24. A posture is described in terms of legs, arms, head, and hands using the vocabulary (Section 3.3.1) for the annotation of the limbs. To transcribe a posture, we need to encode its body parts from Bharatanatyam terminology into Labanotation descriptors. For example, consider the key posture Natta1P1 of Natta Adavu Variation 1, which we describe using Labanotation symbols. The posture is shown in Figure 18. The different body parts of the posture are marked in different colors; for example, the arm is marked in yellow. The annotation of the body parts of posture Natta1P1 is given in Table 25 (we exclude Hasta Mudra from the transcription work).

Figure 18: Natta1P1 Posture annotated for Laban encoding
Body Position Formation Vocab
Part Left Right
Posture = Natta1P1
Leg Aayata [S] Aayata Aayata Table 9
Arm Natyarambhe [S] Natyarambhe Natyarambhe Table 10
Head Samam Table 11
Table 25: Annotation of the body parts of postures in Natta Adavu 1

The next challenge is to map the Bharatanatyam ontology to the Labanotation ontology. As an example, we encode the posture Natta1P1 (Figure 18, Table 25) to Laban in Table 26 and Figure 19.

  1. Leg: The leg is in Aayata position which means:

    • The weight of the body is on both legs, so both legs are in support. The Support Direction and Support Level are encoded accordingly in Table 26.

    • The left (right) foot is in left (right) direction. The legs are not stretched in any direction, so the legs are in place.

    • The folding of the legs indicates that the level of the leg is low.

    • The legs are not crossing each other, so the Leg Crossing = 0.

    • We mark the symmetric position of both legs using Mirror = 1. If Mirror = 1, then the direction of the right leg is the opposite of that of the left leg.

    • The body weight is not on hip so Hip Support = 0.

    • Both legs are folded at the knee. The knee fold corresponds to folding degree 3, so Knee Folding = 3 (Table 22).

    • The whole feet are touching the ground, so Touch = 3 (Table 23).

  2. Arm: The arms are in Natyarambhe which means:

    • The arms are stretched to the left and right sides of the body at shoulder level and are slightly folded at the elbow. So, Arm Direction = 2, Arm Level = 2 and Elbow Folding = 1.

    • The arms do not occlude the body (Body Inclusion = 0) and

    • Both arms are similar (Mirror = 1).

  3. Head: The head is in Samam which means:

    • The head is straight and forward (Head Direction = 1 and Level = Middle).

The complete Laban encoding for Natta1P1 is shown in Table 26. In a similar manner, we have encoded the other postures used in Bharatanatyam Adavus. This has been done with the help of experts.

Leg Vocab     Support Direction   Support Level   Leg Direction   Leg Level   Leg Crossing   Mirror
Aayata        1                   3               0               0           0              1

Leg Vocab     Hip Support   Knee Folding   Touch
Aayata        0             3              3

Arm Vocab     Arm Direction   Arm Level   Arm Crossing   Elbow Folding   Body Inclusion   Mirror
Natyarambhe   2               2           0              1               0                1

Head Vocab    Direction   Level
Samam         1           2
Table 26: Laban Encoding of Leg, Arm and Head of Posture = Natta1P1
Figure 19: Labanotation of Leg, Arm and Head of Posture = Natta1P1
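
The encoding of Table 26 can be viewed as one record per posture. The sketch below shows one possible way to hold such a record in Python. The class names, field names, and the `mirror_direction` helper are assumptions made for illustration. In particular, the helper assumes that a stored direction describes the left limb and that mirroring swaps each left direction code of Table 21 with its right counterpart (2 with 3, 4 with 5, and so on), which agrees with the left and right arm directions of Natta1P1 but is not stated as a general rule in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegCode:
    support_direction: int
    support_level: int
    leg_direction: int
    leg_level: int
    leg_crossing: int
    mirror: int
    hip_support: int
    knee_folding: int
    touch: int


@dataclass(frozen=True)
class ArmCode:
    direction: int
    level: int
    crossing: int
    elbow_folding: int
    body_inclusion: int
    mirror: int


@dataclass(frozen=True)
class HeadCode:
    direction: int
    level: int


@dataclass(frozen=True)
class PostureCode:
    name: str
    leg: LegCode
    arm: ArmCode
    head: HeadCode


def mirror_direction(code):
    """Swap a left direction code with its right counterpart (assumed rule).

    In Table 21 the left/right pairs are (2, 3), (4, 5), ..., (10, 11);
    Place (1) and the unset value 0 have no side and are returned unchanged.
    """
    if code <= 1:
        return code
    return code + 1 if code % 2 == 0 else code - 1


# Values copied from Table 26.
NATTA1P1 = PostureCode(
    name="Natta1P1",
    leg=LegCode(1, 3, 0, 0, 0, 1, 0, 3, 3),
    arm=ArmCode(2, 2, 0, 1, 0, 1),
    head=HeadCode(1, 2),
)

if __name__ == "__main__":
    left = NATTA1P1.arm.direction
    print("left arm:", left, "right arm:", mirror_direction(left))
```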

7.1 LabanXML

While the graphical symbolization of Laban and our encoding in tabular format as above are both forms of transcription, neither is amenable to machine processing. To visualize the postures and to build further applications based on the transcripts, we need a searchable and parseable representation. So we adopt LabanXML [nakamura2006xml], an eXtensible Markup Language (XML) design for Labanotation. LabanXML bundles the columns of the staff in four groups: left, right, support and head. Left and right, in turn, contain arm and leg.

The tags of LabanXML are as follows:

  • <laban>: This is the root tag, which includes the <attribute> and <notation> tags.

  • <attribute>: This includes tag <title> used to name the XML file.

  • <notation>: This includes tag <measure>.

  • <measure>: Gives the position of the current pose on the time line.

  • <left>: Contains tags for columns appearing on left side of Labanotation.

  • <right>: Contains tags for columns appearing on right side of Labanotation.

  • <support>: Describes the support element in Labanotation columns. It has attribute side having value left or right indicating the side of the support.

  • <arm>, <leg>, <foot>, <head>, and <support>: These tags include
    <direction> and <level> tags of the respective limb.

  • <elbow>, <knee>: These tags include <degree> for degree of folding.

  • <touch>: This tag is included in the <support> tag and the <leg> tag. It describes how the foot touches the floor.

Using the above tags, we represent the information from Table 26 in XML format in Table 27. The graphical representation of Laban encoding is shown in Figure 19. The symbols described earlier are used to write the XML tags in Laban staff.

<laban>
   <attribute>
      <title>natta_1</title>
   </attribute>
   <notation>
      <measure num="0">
         <left>
            <arm duration="1">
               <direction>2</direction>
               <level>2</level>
            </arm>
            <elbow duration="1">
               <Degree>1</Degree>
            </elbow>
            <foot>
               <touch>3</touch>
            </foot>
            <knee duration="1">
               <Degree>3</Degree>
            </knee>
         </left>
         <right>
            <arm duration="1">
               <direction>3</direction>
               <level>2</level>
            </arm>
            <elbow duration="1">
               <Degree>1</Degree>
            </elbow>
            <foot>
               <touch>3</touch>
            </foot>
            <knee duration="1">
               <Degree>3</Degree>
            </knee>
         </right>
         <support side="left">
            <direction>1</direction>
            <level>3</level>
         </support>
         <support side="right">
            <direction>1</direction>
            <level>3</level>
         </support>
         <head>
            <direction>1</direction>
            <level>2</level>
         </head>
      </measure>
   </notation>
</laban>
Table 27: LabanXML of Posture Natta1P1
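
Because the transcript is ordinary XML, it can be read back with any standard XML parser. As an illustration only, the following Python sketch parses an abridged copy of the Natta1P1 transcript with the standard-library `xml.etree.ElementTree` module. The function name `read_measures` and the layout of the returned dictionaries are assumptions of this sketch and are not part of LabanXML or of our tool.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the Natta1P1 transcript (Table 27), embedded so that the
# example is self-contained.
LABAN_XML = """<laban>
  <attribute><title>natta_1</title></attribute>
  <notation>
    <measure num="0">
      <left>
        <arm duration="1"><direction>2</direction><level>2</level></arm>
        <elbow duration="1"><Degree>1</Degree></elbow>
      </left>
      <right>
        <arm duration="1"><direction>3</direction><level>2</level></arm>
      </right>
      <support side="left"><direction>1</direction><level>3</level></support>
      <head><direction>1</direction><level>2</level></head>
    </measure>
  </notation>
</laban>"""


def _values(element):
    """Collect the integer leaf values of one limb element, keyed by tag."""
    return {child.tag.lower(): int(child.text) for child in element}


def read_measures(xml_text):
    """Return one dictionary per <measure>, keyed by a limb label."""
    root = ET.fromstring(xml_text)
    measures = []
    for measure in root.iter("measure"):
        limbs = {}
        for side in ("left", "right"):
            side_el = measure.find(side)
            if side_el is None:
                continue
            for limb in side_el:
                limbs[f"{side}.{limb.tag}"] = _values(limb)
        for support in measure.findall("support"):
            limbs[f"support.{support.get('side')}"] = _values(support)
        head = measure.find("head")
        if head is not None:
            limbs["head"] = _values(head)
        measures.append({"num": int(measure.get("num")), "limbs": limbs})
    return measures


if __name__ == "__main__":
    for entry in read_measures(LABAN_XML):
        print(entry["num"], entry["limbs"])
```

Tag names are lower-cased when they are read, so the sketch accepts both the `<degree>` spelling of the tag list and the `<Degree>` spelling of Table 27.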

7.2 Tool Overview

To build the Adavu Transcription Tool, we first encode our ontological models of Adavus, especially the key postures and their sequences, and the video annotations in the Laban ontology, following the approach illustrated in Section 7. This cross-ontology of concepts (called the Posture Ontology) is then represented in a mapping database indexed by the posture ID. This database is used by the Adavu Transcription Tool, as shown in Figure 20. We explain the modules below.

Figure 20: Architecture of the Adavu Transcription Tool
Figure 21: 23 Key Postures of Natta Adavus with Class / Posture IDs (panels labelled C01 to C23)
Posture ID   Training data   Test data     Posture ID   Training data   Test data
C01          6154            1457          C13          235             80
C02          3337            873           C14          393             117
C03          3279            561           C15          404             121
C04          1214            219           C16          150             48
C05          1192            268           C17          161             51
C06          1419            541           C18          323             81
C07          1250            475           C19          175             46
C08          284             112           C20          168             43
C09          306             133           C21          19              6
C10          397             162           C22          21              6
C11          408             117           C23          118             61
C12          229             84
Numbers indicate the number of K-frames. Each K-frame is given by the frame number of the RGB frame in the video. Associated depth and skeleton frames are used as needed. Various position and formation information on body parts is available for every K-frame from the annotation.
Table 28: Data Set for Posture Recognition using 23 posture classes in Figure 21

Posture Recognizer

This is a machine-learning-based system [mallick2019posture] that recognizes a unique posture ID when the RGB frame of a key posture is given. We first extract the human figure, eliminate the background, and convert the RGB image into a grayscale image. We next compute the Histogram of Oriented Gradients (HOG) descriptor for each posture frame. Finally, we use the HOG features to train an SVM classifier. There are a total of 23 key postures in Natta Adavus. To classify the postures into 23 posture classes, we use a One vs. Rest multi-class SVM. The data set shown in Table 28 is used for training and testing the SVM. For testing, we use the trained SVM models to predict the class labels. The accuracy of posture recognition is 97.95%.

We then use the trained classifier to recognize the input sequence of key postures. The key posture recognizer extracts the sequence of key postures, in terms of their posture IDs, from the video of an Adavu performance.
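
To make the One vs. Rest step concrete, the sketch below shows the usual decision rule for such a classifier: one binary scorer per posture class, with the predicted class being the one whose scorer returns the largest decision value. This is a toy illustration under stated assumptions and not our trained recognizer. The two-dimensional features, the hand-set weights, and the names `linear_scorer` and `one_vs_rest_predict` are all hypothetical; in the actual system the features are HOG descriptors and the scorers are trained SVMs.

```python
def linear_scorer(weights, bias):
    """Build a binary decision function f(x) = w . x + b (hypothetical weights)."""
    def score(features):
        return sum(w * x for w, x in zip(weights, features)) + bias
    return score


def one_vs_rest_predict(features, scorers):
    """Return the class ID whose 'this class vs. rest' score is largest."""
    return max(scorers, key=lambda class_id: scorers[class_id](features))


# Three toy classes stand in for the 23 posture classes C01 to C23.
SCORERS = {
    "C01": linear_scorer([1.0, 0.0], -0.5),
    "C02": linear_scorer([0.0, 1.0], -0.5),
    "C03": linear_scorer([-1.0, -1.0], 0.5),
}

if __name__ == "__main__":
    # Each inner list plays the role of the feature vector of one K-frame.
    frames = [[0.9, 0.1], [0.1, 0.9], [0.0, 0.0]]
    print([one_vs_rest_predict(f, SCORERS) for f in frames])
```

Applying the rule frame by frame to the K-frames of a performance yields the sequence of posture IDs that the rest of the tool consumes.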

Indexing Laban Descriptor by Posture ID

Given a posture ID, we look up the Posture Ontology to get the Laban descriptor values for the different limbs in terms of a database record.

LabanXML Generator

From the database record of Laban descriptors an equivalent LabanXML file is generated using the definition of tags as in Section 7.1.
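
The lookup and generation steps can be sketched together as follows. This is a minimal Python illustration using the standard-library `xml.etree.ElementTree` module, not the implementation of our tool. The in-memory dictionary `POSTURE_ONTOLOGY` stands in for the mapping database, the key "Natta1P1" is used in place of a class ID, and the function names are hypothetical. The sketch also assumes that every posture is mirrored (Mirror = 1), that the stored arm direction refers to the left arm, and that measures are numbered by position in the sequence. Element names, including the capitalised `<Degree>`, follow Table 27.

```python
import xml.etree.ElementTree as ET

# Stand-in for the mapping database; values for Natta1P1 are from Table 26.
POSTURE_ONTOLOGY = {
    "Natta1P1": {
        "arm": {"direction": 2, "level": 2, "elbow_folding": 1},
        "leg": {"support_direction": 1, "support_level": 3,
                "knee_folding": 3, "touch": 3},
        "head": {"direction": 1, "level": 2},
    },
}


def mirror_direction(code):
    """Swap left and right direction codes of Table 21 (assumed rule)."""
    if code <= 1:
        return code
    return code + 1 if code % 2 == 0 else code - 1


def _leaf(parent, tag, value):
    """Append <tag>value</tag> to parent."""
    ET.SubElement(parent, tag).text = str(value)


def build_laban_xml(title, posture_ids, ontology=POSTURE_ONTOLOGY):
    """Return a LabanXML string with one <measure> per posture ID."""
    root = ET.Element("laban")
    _leaf(ET.SubElement(root, "attribute"), "title", title)
    notation = ET.SubElement(root, "notation")
    for num, posture_id in enumerate(posture_ids):
        record = ontology[posture_id]  # look up the Laban descriptor
        arm, leg, head = record["arm"], record["leg"], record["head"]
        measure = ET.SubElement(notation, "measure", num=str(num))
        for side in ("left", "right"):
            side_el = ET.SubElement(measure, side)
            direction = arm["direction"]
            if side == "right":  # mirrored posture assumed
                direction = mirror_direction(direction)
            arm_el = ET.SubElement(side_el, "arm", duration="1")
            _leaf(arm_el, "direction", direction)
            _leaf(arm_el, "level", arm["level"])
            elbow_el = ET.SubElement(side_el, "elbow", duration="1")
            _leaf(elbow_el, "Degree", arm["elbow_folding"])
            _leaf(ET.SubElement(side_el, "foot"), "touch", leg["touch"])
            knee_el = ET.SubElement(side_el, "knee", duration="1")
            _leaf(knee_el, "Degree", leg["knee_folding"])
        for side in ("left", "right"):
            support_el = ET.SubElement(measure, "support", side=side)
            _leaf(support_el, "direction", leg["support_direction"])
            _leaf(support_el, "level", leg["support_level"])
        head_el = ET.SubElement(measure, "head")
        _leaf(head_el, "direction", head["direction"])
        _leaf(head_el, "level", head["level"])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_laban_xml("natta_1", ["Natta1P1"]))
```

Postures that are not mirrored would need separate left and right values in the record, which this sketch does not model.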

Laban Visualizer

Since Labanotation is graphical, it is important to visualize it in terms of its icons. So we implement a converter from LabanXML to Scalable Vector Graphics (SVG). SVG is an XML-based vector image format for two-dimensional graphics with support for interactivity and animation. Like any XML file, an SVG image can be created and edited with a text editor as well as with drawing software. The SVG converter is written in C++ on Cygwin64 using the libxml XML parser. The SVG images are also rendered as PNG (using Inkscape) to provide easy-to-use offline notation.

Figure 22: Key postures of Natta Adavu Variant 1 with transcription in LabanXML (a part) and depiction in Laban Staff by our tool

7.3 Results and Discussion

Our tool is able to generate the transcription for a sequence of key frames. Given a sequence of RGB frames, the Posture Recognizer generates their posture IDs. These posture IDs are mapped to the corresponding cluster IDs in the Laban ontology. Using the posture IDs and the ontology, a Laban transcription for all frames is encoded in LabanXML. From the LabanXML, a stack of Laban frames for the key postures of the Bharatanatyam Adavu is generated. The LabanXML and the stack of postures in the Laban staff, as generated by our tool for the sequence of key postures of Natta Adavu Variation 1, are shown in Figure 22. The RGB frames are shown on the left and the corresponding Laban descriptors are shown on the staff on the right. An initial part of the LabanXML is given in the middle.

8 Conclusion

In this paper, we demonstrate a system that generates a parseable representation of a Bharatanatyam dance performance and documents it using Labanotation. The system uses a unique combination of multimedia ontology and machine learning techniques. To the best of our knowledge, this is the first work towards automatic documentation of dance using any notation.

In the process of developing the system, we have also presented a detailed ontology for Bharatanatyam Adavus, which is the first such attempt for any Indian Classical Dance. Finally, we have captured and annotated a sizable dataset for Adavus, part of which is also available for use at: [icd_dataset].

In the future, we intend to extend our work to document a finer description of each posture. We are also interested in capturing movements (T-frames), which we have not used for this study. Finally, we also want to extend our work to generate the ontology automatically, guided by the grammar of the dance form.

Acknowledgment

The authors would like to thank Tata Consultancy Services (TCS) for providing the fund and support for this work.

References