Feature Selection for Traditional Malay Musical Instruments Sounds Classification using Rough Set



    JOURNAL OF COMPUTING, VOLUME 3, ISSUE 2, FEBRUARY 2011, ISSN 2151-9617

    HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING/

WWW.JOURNALOFCOMPUTING.ORG

Norhalina Senan, Rosziati Ibrahim, Nazri Mohd Nawi, Iwan Tri Riyadi Yanto and Tutut Herawan

Abstract: Finding the most relevant features is crucial in data mining tasks, including the musical instruments sounds classification problem. Various feature selection techniques have been proposed in this domain, focusing on Western musical instruments. However, the study of rough set theory for feature selection of non-Western musical instruments sounds is insufficient and still needs further exploration. Thus, in this paper, an alternative feature selection technique using maximum attributes dependency based on rough set theory for Traditional Malay musical instruments sounds is proposed. The modelling process comprises eight phases: data acquisition, sound editing, data representation, feature extraction, data discretization, data cleansing, feature selection using the proposed technique, and finally feature evaluation via classifiers. The results show that the selected features generated by the proposed technique are able to reduce the complexity of the process and improve the classification performance significantly.

Index Terms: Rough set theory; Dependency of attributes; Feature selection; Classification; Traditional Malay musical instruments sounds.

1 INTRODUCTION

With the growing volume of digital audio data and feature schemes, feature selection has become a vital aspect of musical instruments sounds classification problems. In general, the purpose of feature selection is to alleviate the effect of the curse of dimensionality, while from the classification point of view, the main idea of feature selection is to construct an efficient and robust classifier. It has been proven in practice that even an optimal classifier has difficulty classifying accurately if poor features are presented as the input. This is because some of the input features have poor capability to discriminate among different classes and some are highly correlated [1]. As a consequence, the overall classification performance might decrease with this large number of features available. Therefore, finding only the relevant subset of features may significantly reduce the complexity of the process and improve the classification performance by eliminating irrelevant and redundant features.

This shows that the problem of feature selection must be addressed appropriately. For that, various feature selection algorithms for musical instruments sounds classification have been proposed by several researchers [1,2,3]. Liu and Wan [1] carried out a study on classifying musical instruments into five families (brass, keyboard, percussion, string and woodwind) using NN, k-NN and the Gaussian mixture model (GMM). Three categories of feature schemes, namely temporal features, spectral features and coefficient features (with a total of 58 features), were exploited. Sequential forward selection (SFS) was used to choose the best features. The k-NN classifier using 19 features achieved the highest accuracy of 93%. In [2], the authors conducted a study on selecting the best feature schemes based on their classification performance. The 44 features from three categories of feature schemes, namely human perception, cepstral features and MPEG-7, were used. To select the best features, three entropy-based feature selection techniques, namely Information Gain, Gain Ratio and Symmetrical Uncertainty, were utilized. The performance of the selected features was assessed and compared using five classifiers: k-nearest neighbor (k-NN), naive Bayes, support vector machine (SVM), multilayer perceptron (MLP) and radial basis function (RBF). They found that Information Gain produces the best classification accuracy of up to 95.5% for the 20 best features with the SVM and RBF classifiers. Benetos et al. [3] applied a subset selection algorithm with a branch-and-bound search strategy for feature reduction. A combination of 41 features from general audio data, MFCC and MPEG-7 was used. By using the best 6 features, the non-negative matrix factorization (NMF) classifier yielded an accuracy rate of 95.2% at best. They found that the

N. Senan is with the Universiti Tun Hussein Onn Malaysia, 86400 Batu Pahat, Johor, Malaysia.
R. Ibrahim is with the Universiti Tun Hussein Onn Malaysia, 86400 Batu Pahat, Johor, Malaysia.
N.M. Nawi is with the Universiti Tun Hussein Onn Malaysia, 86400 Batu Pahat, Johor, Malaysia.
I.T.R. Yanto is with the Universitas Ahmad Dahlan, Yogyakarta 55166, Indonesia.
T. Herawan is with the Universiti Malaysia Pahang, 26300 Gambang, Kuantan, Malaysia.



feature subset selection method adopted in their study was able to increase the classification accuracy. Overall, all these works demonstrate that the reduced features are able to produce a high classification rate with less computational time. On the other hand, Deng et al. [2] claimed that benchmarking is still an open issue in this area of research. This suggests that the existing feature selection approaches applied to various sound files may not work effectively under other conditions. Therefore, there is a significant need to explore other feature selection methods with different types of musical instruments sounds in order to find the best solution.

One of the potential techniques for dealing with this problem is based on rough set theory. The theory of rough sets, proposed by Pawlak in the 1980s [4], is a mathematical tool for dealing with vague and uncertain data. Rough set theory is one of the useful tools for feature selection [5,6,7]. Banerjee et al. [5] claimed that the concept of the core in rough set theory is relevant in feature selection for identifying the essential features amongst the non-redundant ones. The attractive characteristics of rough sets in tackling the problems of imprecision, uncertainty, incompleteness, irrelevance and redundancy in large datasets have attracted researchers in wide areas of the data mining domain to utilize rough sets for feature selection. However, to date, studies on rough sets for feature selection in musical instruments sounds classification are scarce and still need intensive research. It is well known that one of the most crucial aspects of musical instruments sounds classification is finding the best feature schemes. With the special capability of rough sets for feature selection, we apply this technique to musical instruments sounds classification to overcome this issue.

In this paper, an alternative feature selection technique based on rough set theory for Traditional Malay musical instruments sounds classification is proposed. This technique is developed based on rough set approximation using the maximum degree of dependency of attributes proposed by [8]. The idea of this technique is to choose the most significant features by ranking the relevant features based on the highest dependency of attributes on the dataset and then to remove the redundant features with a similar dependency value. To accomplish this study, the quality of the instruments sounds is first examined. Then, 37 features from a combination of two feature schemes, namely perception-based features and Mel-Frequency Cepstral Coefficients (MFCC), are extracted [9]. In order to employ rough set theory, this original dataset (continuous values) is then discretized into categorical values by using the equal width and equal frequency binning algorithm [10]. Afterwards, a data cleansing process is carried out to remove the irrelevant features. The proposed technique is then adopted to rank and select the best feature set from the large number of features available in the dataset. Finally, the performance of the selected features in musical instruments sounds classification is evaluated with two classifiers, namely rough set and Multi-Layer Perceptron (MLP).

The rest of this paper is organized as follows: Section 2 presents the theory of rough sets. Section 3 describes the details of the modelling process. A discussion of the results is presented in Section 4, followed by the conclusion in Section 5.

2 ROUGH SET THEORY

In the 1980s, Pawlak [4] introduced rough set theory to deal with the problem of imprecise knowledge. Like fuzzy set theory, it is not an alternative to classical set theory but is embedded in it. Fuzzy sets and rough sets are not competitive but complementary to each other [11,12]. Rough set theory has attracted the attention of many researchers and practitioners all over the world, who have contributed essentially to its development and applications. The original goal of rough set theory is the induction of approximations of concepts. The idea consists of the approximation of a subset by a pair of two precise concepts called the lower approximation and the upper approximation. Intuitively, the lower approximation of a set consists of all elements that surely belong to the set, whereas the upper approximation of the set consists of all elements that possibly belong to the set. The difference between the upper and the lower approximation is a boundary region, which consists of all elements that cannot be classified uniquely into the set or its complement by employing the available knowledge. Thus, any rough set, in contrast to a crisp set, has a non-empty boundary region. The motivation for rough set theory comes from the need to represent a subset of a universe in terms of equivalence classes of a partition of the universe. In this section, the basic concepts of rough set theory in terms of data are presented.

2.1 Information System

Data are often presented as a table whose columns are labeled by attributes, rows by objects of interest, and whose entries are attribute values. An information system is a 4-tuple (quadruple) S = (U, A, V, f), where U is a non-empty finite set of objects, A is a


non-empty finite set of attributes, V = ∪_{a ∈ A} V_a, where V_a is the domain (value set) of attribute a, and f : U × A → V is a total function such that f(u, a) ∈ V_a for every (u, a) ∈ U × A, called the information (knowledge) function. An information system is also called a knowledge representation system or an attribute-valued system and can be intuitively expressed in terms of an information table (refer to Table 1).

In many applications, there is an outcome of classification that is known. This a posteriori knowledge is expressed by one (or more) distinguished attribute called the decision attribute; the process is known as supervised learning. An information system of this kind is called a decision system. A decision system is an information system of the form D = (U, A ∪ {d}, V, f), where d ∉ A is the decision attribute. The elements of A are called condition attributes. A simple example of a decision system is given in Table 2.

TABLE 1
AN INFORMATION SYSTEM

U      a_1            a_2            ...  a_|A|
u_1    f(u_1, a_1)    f(u_1, a_2)    ...  f(u_1, a_|A|)
u_2    f(u_2, a_1)    f(u_2, a_2)    ...  f(u_2, a_|A|)
...    ...            ...            ...  ...
u_|U|  f(u_|U|, a_1)  f(u_|U|, a_2)  ...  f(u_|U|, a_|A|)

Example 2.1. Suppose we are given data about 6 students, as shown in Table 2.

TABLE 2
A DECISION SYSTEM

Student  Analysis  Algebra  Statistics  Decision
1        bad       good     medium      accept
2        good      bad      medium      accept
3        good      good     good        accept
4        bad       good     bad         reject
5        good      bad      medium      reject
6        bad       good     good        accept

From Table 2, we have

U = {1, 2, 3, 4, 5, 6},
C = {Analysis, Algebra, Statistics}, D = {Decision}, A = C ∪ D,
V_Analysis = {bad, good},
V_Algebra = {bad, good},
V_Statistics = {bad, medium, good},
V_Decision = {accept, reject}.

A relational database may be considered as an information system in which rows are labeled by objects (entities), columns are labeled by attributes, and the entry in row u and column a has the value f(u, a). Note that each object u_i ∈ U is associated with a tuple t(u_i) = (f(u_i, a_1), f(u_i, a_2), ..., f(u_i, a_|A|)), for 1 ≤ i ≤ |U|, where |X| denotes the cardinality of X. Note also that the tuple t is not necessarily associated with an entity uniquely (refer to students 2 and 5 in Table 2). In an information table, two distinct entities could have the same tuple representation (a duplicated/redundant tuple), which is not permissible in relational databases. Thus, the concepts in information systems are a generalization of the same concepts in relational databases.

2.2 Indiscernibility Relation

From Table 2, note that students 2, 3 and 5 are indiscernible (similar or indistinguishable) with respect to the attribute Analysis. Meanwhile, students 3 and 6 are indiscernible with respect to the attributes Algebra and Decision, and students 2 and 5 are indiscernible with respect to the attributes Analysis, Algebra and Statistics. The starting point of rough set theory is the indiscernibility relation, which is generated by information about objects of interest. The indiscernibility relation is intended to express the fact that, due to a lack of knowledge, it is difficult to discern some objects employing the available information. This means that, in general, we are unable to deal with single objects; instead, clusters of indiscernible objects must be considered. The notion of the indiscernibility relation between two objects can now be defined precisely.

Definition 2.1. Let S = (U, A, V, f) be an information system and let B be any subset of A. Two elements x, y ∈ U are said to be B-indiscernible (indiscernible by the set of attributes B ⊆ A in S) if and only if f(x, a) = f(y, a) for every a ∈ B.

Obviously, every subset of A induces a unique indiscernibility relation. Notice that the indiscernibility relation induced by the set of attributes B, denoted by IND(B), is an equivalence relation. It is well known that an equivalence relation induces a unique partition.
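Definition 2.1 can be made concrete with a few lines of code. The following is a minimal sketch (our own illustration, not the authors' implementation) that computes the partition U/B induced by IND(B) for the decision system of Table 2; the helper name `partition` and the dictionary encoding are assumptions we introduce here.

```python
# Decision system of Table 2, encoded as object -> attribute -> value.
TABLE2 = {
    1: {"Analysis": "bad",  "Algebra": "good", "Statistics": "medium", "Decision": "accept"},
    2: {"Analysis": "good", "Algebra": "bad",  "Statistics": "medium", "Decision": "accept"},
    3: {"Analysis": "good", "Algebra": "good", "Statistics": "good",   "Decision": "accept"},
    4: {"Analysis": "bad",  "Algebra": "good", "Statistics": "bad",    "Decision": "reject"},
    5: {"Analysis": "good", "Algebra": "bad",  "Statistics": "medium", "Decision": "reject"},
    6: {"Analysis": "bad",  "Algebra": "good", "Statistics": "good",   "Decision": "accept"},
}

def partition(table, B):
    """Return U/B, the set of equivalence classes of IND(B)."""
    classes = {}
    for u, row in table.items():
        # x IND(B) y  iff  f(x, a) = f(y, a) for every a in B
        key = tuple(row[a] for a in B)
        classes.setdefault(key, set()).add(u)
    return {frozenset(c) for c in classes.values()}

# Students 2, 3 and 5 fall into one class with respect to Analysis alone,
# while students 2 and 5 stay indiscernible under all condition attributes.
print(sorted(sorted(c) for c in partition(TABLE2, ["Analysis"])))
print(sorted(sorted(c) for c in partition(TABLE2, ["Analysis", "Algebra", "Statistics"])))
```

Because IND(B) is an equivalence relation, a dictionary keyed by the tuple of B-values is enough to build the whole partition in a single pass over the table.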


The partition of U induced by IND(B) in S = (U, A, V, f) is denoted by U/B, and the equivalence class in the partition U/B containing x ∈ U is denoted by [x]_B.

Given an arbitrary subset X ⊆ U, in general, X might not be expressible as a union of some equivalence classes in U. This means it may not be possible to describe X precisely in S. Instead, X might be characterized by a pair of its approximations, called the lower and upper approximations. It is here that the notion of rough set emerges.

2.3 Set Approximations

The indiscernibility relation will next be used to define the approximations that are the basic concepts of rough set theory. The notions of the lower and upper approximations of a set are defined as follows.

Definition 2.2. Let S = (U, A, V, f) be an information system, let B be any subset of A and let X be any subset of U. The B-lower approximation of X, denoted by B_*(X), and the B-upper approximation of X, denoted by B^*(X), are defined respectively by

B_*(X) = {x ∈ U | [x]_B ⊆ X} and
B^*(X) = {x ∈ U | [x]_B ∩ X ≠ ∅}.

The accuracy of approximation (accuracy of roughness) of any subset X ⊆ U with respect to B ⊆ A, denoted α_B(X), is measured by

α_B(X) = |B_*(X)| / |B^*(X)|,

where |X| denotes the cardinality of X. For the empty set ∅, α_B(∅) = 1 is defined. Obviously, 0 ≤ α_B(X) ≤ 1. If X is a union of some equivalence classes of U, then α_B(X) = 1; thus, the set X is crisp (precise) with respect to B. If X is not a union of some equivalence classes of U, then α_B(X) < 1; thus, the set X is rough (imprecise) with respect to B.
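Definition 2.2 can be exercised on Table 2 with a small sketch (our own illustration, not the authors' code): take X as the set of accepted students and B as the three condition attributes, and compute B_*(X), B^*(X) and α_B(X).

```python
# Table 2 again: object -> attribute -> value.
TABLE2 = {
    1: {"Analysis": "bad",  "Algebra": "good", "Statistics": "medium", "Decision": "accept"},
    2: {"Analysis": "good", "Algebra": "bad",  "Statistics": "medium", "Decision": "accept"},
    3: {"Analysis": "good", "Algebra": "good", "Statistics": "good",   "Decision": "accept"},
    4: {"Analysis": "bad",  "Algebra": "good", "Statistics": "bad",    "Decision": "reject"},
    5: {"Analysis": "good", "Algebra": "bad",  "Statistics": "medium", "Decision": "reject"},
    6: {"Analysis": "bad",  "Algebra": "good", "Statistics": "good",   "Decision": "accept"},
}

def eq_class(table, B, x):
    """[x]_B: every object agreeing with x on all attributes in B."""
    return {u for u, row in table.items()
            if all(row[a] == table[x][a] for a in B)}

def approximations(table, B, X):
    lower = {x for x in table if eq_class(table, B, x) <= X}   # [x]_B subset of X
    upper = {x for x in table if eq_class(table, B, x) & X}    # [x]_B meets X
    return lower, upper

B = ["Analysis", "Algebra", "Statistics"]
X = {1, 2, 3, 6}                            # students with Decision = accept
lower, upper = approximations(TABLE2, B, X)
accuracy = len(lower) / len(upper)          # alpha_B(X)
print(sorted(lower), sorted(upper), accuracy)  # [1, 3, 6] [1, 2, 3, 5, 6] 0.6
```

Since 0 < α_B(X) < 1 here, the accept set is rough with respect to the condition attributes: students 2 and 5 share all condition values but differ in Decision, so they land in the boundary region.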


2.4 Dependency of Attributes

Another basic concept of rough set theory is the dependency between attributes. Let S = (U, A, V, f) be an information system and let D and C be any subsets of A. D depends on C in a degree k (0 ≤ k ≤ 1), denoted C ⇒_k D, where

k = γ(C, D) = |POS_C(D)| / |U|,    (1)

and POS_C(D) = ∪_{X ∈ U/D} C_*(X) is the positive region of the partition U/D with respect to C, i.e., the set of all elements of U that can be uniquely classified to the classes of U/D by means of C. D depends totally on C if k = 1; otherwise, D depends partially on C. Thus, D fully (partially) depends on C if all (some) elements of the universe U can be uniquely classified to equivalence classes of the partition U/D, employing C.

Example 2.3. From Table 2, there are no total dependencies whatsoever. If, in Table 2, the value of the attribute Statistics for student 5 were bad instead of medium, there would be a total dependency {Statistics} ⇒ {Decision}, because to each value of the attribute Statistics there would correspond a unique value of the attribute Decision. For the dependency {Analysis, Algebra, Statistics} ⇒ {Decision}, we obtain k = 4/6 = 2/3, because four out of six students can be uniquely classified into the Decision classes, employing the attributes Analysis, Algebra and Statistics.

Note that a table may be redundant in two ways. The first form of redundancy is easy to notice: some objects may have the same features. This is the case for the condition parts of tuples 2 and 5 of Table 2. A way of reducing data size is to store only one representative object for every set of so-called indiscernible tuples, as in Definition 2.1. The second form of redundancy is more difficult to locate, especially in large data tables: some columns of a table may be erased without affecting the classification power of the system. This concept can also be extended to information systems where the condition and decision attributes are not distinguished. Using the entire attribute set for describing the property is time-consuming, and the constructed rules may be difficult to understand, to apply or to verify [13]. In order to deal with this problem, attribute reduction is required. The objective of reduction is to reduce the number of attributes and, at the same time, preserve the property of information.

2.5 Reducts and Core

A reduct is a minimal set of attributes that preserves the indiscernibility relation. A core is the common part of all reducts. In order to express this idea more precisely, some preliminary definitions are needed.

Definition 2.5. Let S = (U, A, V, f) be an information system, let B be any subset of A and let a belong to B. We say that a is dispensable (superfluous) in B if U/B = U/(B − {a}); otherwise, a is indispensable in B.

To further simplify an information system, some dispensable attributes can be eliminated from the system in such a way that the objects in the table are still discernible as in the original.

Definition 2.6. Let S = (U, A, V, f) be an information system and let B be any subset of A. B is called an independent (orthogonal) set if all its attributes are indispensable.

Definition 2.7. Let S = (U, A, V, f) be an information system and let B be any subset of A. A subset B* of B is a reduct of B if B* is independent and U/B* = U/B.

Thus, a reduct is a set of attributes that preserves the partition. It means that a reduct is a minimal subset of attributes that enables the same classification of elements of the universe as the whole set of attributes. In other words, attributes that do not belong to a reduct are superfluous with regard to the classification of elements of the universe. While computing equivalence classes is straightforward, the problem of finding minimal reducts in information systems is NP-hard. Reducts have several important properties. One of them is the core.

Definition 2.8. Let S = (U, A, V, f) be an information system and let B be any subset of A. The intersection of all reducts of B is called the core of B, i.e.,

Core(B) = ∩ Red(B),

where Red(B) denotes the set of all reducts of B. Thus, the core of B is the set of all indispensable attributes of B. Because the core is the intersection of all reducts, it is included in every reduct, i.e., each element of the core belongs to every reduct. Thus, in a sense, the core is the most important subset of attributes, for none of its elements can be removed without affecting the classification power of the attributes.

Example 2.4. To illustrate the finding of reducts and the core, the information system shown in Table 3 is considered. The information system is modified from Example 3 in [14].

TABLE 3
A MODIFIED INFORMATION SYSTEM [14]

#  A     B     C     D

    1 low bad loss small

    2 low good loss large

    3 high good loss medium

    4 high good loss medium

    5 low good loss large

Let X = {A, B, C, D}, X1 = {A, B, C} and X2 = {C, D}. These sets of attributes produce the partitions

U/X = {{1}, {2, 5}, {3, 4}}, U/X1 = {{1}, {2, 5}, {3, 4}} and U/X2 = {{1}, {2, 5}, {3, 4}},

respectively. Therefore, by Definition 2.5, the sets {D} and {A, B} are dispensable (superfluous). From Definition 2.6, the sets X1 and X2 are independent (orthogonal). Hence, from Definition 2.7, X1 and X2 are confirmed to be reducts of X. Furthermore, from Definition 2.8, the intersection X1 ∩ X2 = {C} is the core of X.

3 THE MODELING PROCESS

In this section, the process of this study is presented. There are seven main phases: data acquisition, sound editing, data representation, feature extraction, data discretization, data cleansing and feature selection using the proposed technique. Figure 1 illustrates the phases of this process. To conduct this study, the proposed model is implemented in MATLAB version 7.6.0.324 (R2008a). It is executed on an Intel Core 2 Duo processor with 2 gigabytes of main memory under the Windows Vista operating system. The details of the modelling process are as follows.

3.1 Data Acquisition, Sound Editing, Data Representation and Feature Extraction

The 150 sound samples of Traditional Malay musical instruments were downloaded from a personal web page [15] and the Warisan Budaya Malaysia web page [16]. The dataset comprises four different families: membranophones, idiophones, aerophones and chordophones. The distribution of the sounds into families is shown in Table 4. This original dataset is non-benchmark (real-world) data. The number of original sounds per family is imbalanced, and the recordings also differ in length. It is well known that the quality of the data is one of the factors that might affect the overall classification task. For this reason, the dataset is first edited and trimmed. Afterwards, two categories of feature schemes, namely perception-based and MFCC features, were extracted. All 37 extracted features from these two categories are shown in Table 5: features 1-11 represent the perception-based features and features 12-37 are the MFCC features. The mean and standard deviation were then calculated for each of these features. In order to avoid biased classification, the dataset is then reduced to a uniform size. The details of these phases can be found in [17].

Fig 1. The modelling process for feature selection of the Traditional Malay musical instruments sounds classification.

TABLE 4
DATA SETS

Family          Instruments
Membranophone   Kompang, Geduk, Gedombak, Gendang, Rebana, Beduk, Jidur, Marwas, Nakara
Idiophone       Gong, Canang, Kesi, Saron, Angklung, Caklempong, Kecerik, Kempul, Kenong, Mong, Mouth Harp
Aerophone       Serunai, Bamboo Flute, Nafiri, Seruling Buluh
Chordophone     Rebab, Biola, Gambus


TABLE 5
FEATURES DESCRIPTIONS

No  Feature    Description
1   ZC         Zero Crossing
2   MEANZCR    Mean of Zero Crossings Rate
3   STDZCR     Standard Deviation of Zero Crossings Rate
4   MEANRMS    Mean of Root-Mean-Square
5   STDRMS     Standard Deviation of Root-Mean-Square
6   MEANC      Mean of Spectral Centroid
7   STDC       Standard Deviation of Spectral Centroid
8   MEANB      Mean of Bandwidth
9   STDB       Standard Deviation of Bandwidth
10  MEANFLUX   Mean of Flux
11  STDFLUX    Standard Deviation of Flux
12  MMFCC1     Mean of the MFCCs #1
13  MMFCC2     Mean of the MFCCs #2
14  MMFCC3     Mean of the MFCCs #3
15  MMFCC4     Mean of the MFCCs #4
16  MMFCC5     Mean of the MFCCs #5
17  MMFCC6     Mean of the MFCCs #6
18  MMFCC7     Mean of the MFCCs #7
19  MMFCC8     Mean of the MFCCs #8
20  MMFCC9     Mean of the MFCCs #9
21  MMFCC10    Mean of the MFCCs #10
22  MMFCC11    Mean of the MFCCs #11
23  MMFCC12    Mean of the MFCCs #12
24  MMFCC13    Mean of the MFCCs #13
25  SMFCC1     Standard Deviation of the MFCCs #1
26  SMFCC2     Standard Deviation of the MFCCs #2
27  SMFCC3     Standard Deviation of the MFCCs #3
28  SMFCC4     Standard Deviation of the MFCCs #4
29  SMFCC5     Standard Deviation of the MFCCs #5
30  SMFCC6     Standard Deviation of the MFCCs #6
31  SMFCC7     Standard Deviation of the MFCCs #7
32  SMFCC8     Standard Deviation of the MFCCs #8
33  SMFCC9     Standard Deviation of the MFCCs #9
34  SMFCC10    Standard Deviation of the MFCCs #10
35  SMFCC11    Standard Deviation of the MFCCs #11
36  SMFCC12    Standard Deviation of the MFCCs #12
37  SMFCC13    Standard Deviation of the MFCCs #13

3.2 Data Discretization

The features (attributes) extracted in the dataset are in the form of continuous, non-categorical values. In order to employ the rough set approach in the proposed technique, it is essential to transform the dataset into a categorical one. For that, the discretization technique known as equal width binning [10] is applied. In this study, this unsupervised method is modified to suit the classification problem. The algorithm first sorts the continuous-valued attribute, then determines the minimum x_min and the maximum x_max of that attribute. The interval width, w, is then calculated by

w = (x_max − x_min) / k*,

where k* is a user-specified parameter for the number of intervals into which each target class is discretized. The interval boundaries are specified as x_min + iw, where i = 1, 2, ..., k − 1. Afterwards, the equal frequency binning method is used to divide the sorted continuous values into k intervals, where each interval contains approximately n/k data instances with adjacent values of each class. In this study, different values of k (from 2 to 10) are examined. The purpose is to identify the best k value, i.e., the one able to produce the highest classification rate. For that, the rough set classifier is used.
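The equal width binning step described above can be sketched as follows (an illustration of the description, not the authors' MATLAB code; the function names and sample values are ours): compute w = (x_max − x_min)/k and cut the range at x_min + i·w.

```python
from bisect import bisect_right

def equal_width_cuts(values, k):
    """Interior boundaries x_min + i*w, i = 1..k-1, with w = (x_max - x_min)/k."""
    x_min, x_max = min(values), max(values)
    w = (x_max - x_min) / k              # interval width
    return [x_min + i * w for i in range(1, k)]

def discretize(values, k):
    """Map each continuous value to a categorical bin label 0..k-1."""
    cuts = equal_width_cuts(values, k)
    return [bisect_right(cuts, v) for v in values]

feature = [0.10, 0.40, 0.35, 0.80, 0.95, 0.20]   # one continuous feature column
print(equal_width_cuts(feature, 3))
print(discretize(feature, 3))  # [0, 1, 0, 2, 2, 0]
```

The paper then follows this with equal frequency binning, splitting the sorted values per class into k groups of roughly n/k instances each; that refinement is omitted from this sketch.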

3.3 Data Cleansing using Rough Set

As mentioned in Section 1, the dataset used in this study is raw data obtained from multiple sources (non-benchmark data). In the sound editing and data representation phases, the reliability of the dataset has been assessed. However, the dataset may still contain irrelevant features. Generally, the irrelevant features present in a dataset are features that have no impact on processing performance; however, their existence in the dataset might increase the response time. For that reason, in this phase, a data cleansing process based on the rough set approach explained in sub-section 2.5 is performed to eliminate the irrelevant features from the dataset.

3.4 The Proposed Technique

In this phase, the construction of the feature selection technique using rough set approximation in an information system based on the dependency of attributes is presented. The idea of this technique is derived from [8]. The relation between the roughness of a subset X ⊆ U and the dependency between two attributes is first presented in Proposition 3.1.

Proposition 3.1. Let S = (U, A, V, f) be an information system and let D and C be any subsets of A. If D depends totally on C, then


α_D(X) ≤ α_C(X),

for every X ⊆ U.

Proof. Let D and C be any subsets of A in the information system S = (U, A, V, f). From the hypothesis, the inclusion IND(C) ⊆ IND(D) holds. Furthermore, the partition U/C is finer than the partition U/D; thus, it is clear that any equivalence class induced by IND(D) is a union of some equivalence classes induced by IND(C). Therefore, for every x ∈ X ⊆ U, the equivalence classes satisfy

[x]_C ⊆ [x]_D.

Hence, for every X ⊆ U, we have the relation

D_*(X) ⊆ C_*(X) ⊆ X ⊆ C^*(X) ⊆ D^*(X).

Consequently,

α_D(X) = |D_*(X)| / |D^*(X)| ≤ |C_*(X)| / |C^*(X)| = α_C(X). ∎

The generalization of Proposition 3.1 is given below.

Proposition 3.2. Let S = (U, A, V, f) be an information system and let C1, C2, ..., Cn and D be any subsets of A. If C1 ⇒_k1 D, C2 ⇒_k2 D, ..., Cn ⇒_kn D, where kn ≤ kn−1 ≤ ... ≤ k2 ≤ k1, then

α_D(X) ≤ α_Cn(X) ≤ α_Cn−1(X) ≤ ... ≤ α_C2(X) ≤ α_C1(X),

for every X ⊆ U.

Proof. Let C1, C2, ..., Cn and D be any subsets of A in the information system S. From the hypothesis and Proposition 3.1, the accuracies of roughness are given as

α_D(X) ≤ α_C1(X),
α_D(X) ≤ α_C2(X),
...,
α_D(X) ≤ α_Cn(X).

Since kn ≤ kn−1 ≤ ... ≤ k2 ≤ k1, then

[x]_Cn ⊆ [x]_Cn−1, [x]_Cn−1 ⊆ [x]_Cn−2, ..., [x]_C2 ⊆ [x]_C1.

Obviously,

α_D(X) ≤ α_Cn(X) ≤ α_Cn−1(X) ≤ ... ≤ α_C2(X) ≤ α_C1(X). ∎

Figure 2 shows the algorithm of the proposed technique. The technique uses the dependency of attributes in rough set theory in information systems. It consists of five main steps. The first step deals with the computation of the equivalence classes of each attribute (feature). The equivalence classes of the set of objects U can be obtained using the indiscernibility relation of attribute a_i ∈ A in the information system S = (U, A, V, f). The second step deals with the determination of the dependency degree of attributes, which can be determined using the formula in equation (1). The third step deals with selecting the maximum dependency degree. In the next step, the attributes are ranked from the highest to the lowest maximum dependency degree. Finally, all the redundant attributes are identified, and the attribute with the highest value of the maximum degree of dependency within these redundant attributes is selected.

Algorithm: FSDA
Input: Data set with categorical values
Output: Selected non-redundant attributes
Begin
Step 1. Compute the equivalence classes using the indiscernibility relation on each attribute.
Step 2. Determine the dependency degree of attribute a_i with respect to each a_j, where i ≠ j.
Step 3. Select the maximum dependency degree of each attribute.
Step 4. Rank the attributes from the highest to the lowest maximum dependency degree.
Step 5. Select the attribute with the highest value of the maximum degree of dependency within the redundant attributes.
End

Fig 2. The FSDA algorithm
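As a concrete illustration, the five steps of FSDA can be sketched in Python. This is a minimal sketch, not the authors' implementation: the information system is a plain dictionary mapping objects to attribute-value rows, ties are broken here by keeping the first attribute per tied degree rather than by the full next-highest-degree rule of [8], and all names and the toy table are illustrative.

```python
def equivalence_classes(table, attr):
    """Step 1: partition objects by their value on `attr` (indiscernibility relation)."""
    blocks = {}
    for obj, row in table.items():
        blocks.setdefault(row[attr], set()).add(obj)
    return list(blocks.values())

def dependency_degree(table, c, d):
    """Step 2: k(c => d) = |POS_c(d)| / |U|, as in equation (1)."""
    d_blocks = equivalence_classes(table, d)
    pos = set()
    for block in equivalence_classes(table, c):
        if any(block <= db for db in d_blocks):  # block lies wholly inside a d-class
            pos |= block
    return len(pos) / len(table)

def fsda(table, attrs):
    """Steps 3-5: rank by maximum dependency degree, keep one attribute per tie."""
    max_dep = {a: max(dependency_degree(table, b, a) for b in attrs if b != a)
               for a in attrs}
    ranked = sorted(attrs, key=lambda a: max_dep[a], reverse=True)
    selected, seen = [], set()
    for a in ranked:
        if max_dep[a] not in seen:  # drop attributes tied with an already-kept one
            seen.add(max_dep[a])
            selected.append(a)
    return selected, max_dep

# Toy information system: 4 objects, 3 categorical attributes.
table = {1: {'A': 0, 'B': 0, 'D': 0},
         2: {'A': 0, 'B': 1, 'D': 0},
         3: {'A': 1, 'B': 1, 'D': 1},
         4: {'A': 1, 'B': 1, 'D': 1}}
selected, max_dep = fsda(table, ['A', 'B', 'D'])
```

On this toy table, A and D induce the same partition, so both reach the maximum degree 1; the sketch keeps one of them and retains B.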

Finding the degree of dependency of attributes of an information system using the formula in equation (1) is illustrated in Example 3.1.


Example 3.1. To illustrate finding the degree of dependency of attributes, the information system shown in Table 3 is considered. From Table 3, there are four partitions of U induced by the indiscernibility relation on each attribute, i.e.,

U/A = {{1, 2, 5}, {3, 4}}, U/B = {{1}, {2, 3, 4, 5}},

U/C = {{1, 2, 3, 4}, {5}} and U/D = {{1, 2}, {5}, {3, 4}}.

Based on the formula in equation (1), the degree of dependency of attribute B on attribute A, denoted k for A ⇒ B, can be calculated as follows:

A ⇒_k B, k = Σ_{X ∈ U/B} |A̲(X)| / |U| = |{3, 4}| / |{1, 2, 3, 4, 5}| = 0.4.

In the same way, the following degrees are obtained:

B ⇒_k C, k = Σ_{X ∈ U/C} |B̲(X)| / |U| = |{1}| / |{1, 2, 3, 4, 5}| = 0.2,

C ⇒_k D, k = Σ_{X ∈ U/D} |C̲(X)| / |U| = |{5}| / |{1, 2, 3, 4, 5}| = 0.2.

The degrees of dependency of all attributes of Table 3 are summarized in Table 6.

TABLE 6
THE DEGREE OF DEPENDENCY OF ATTRIBUTES OF TABLE 3

Attribute   Degree of dependency (depends on)    Maximum dependency of attribute
A           B: 1    C: 0.2   D: 0.2              1
B           A: 1    C: 0.4   D: 0.2              1
C           A: 0.6  B: 0.4   D: 0.2              0.6
D           A: 0.4  B: 0.4   C: 0.2              0.4

From Table 6, the attributes A, B, C and D are ranked based on the maximum degree of dependency. It can be seen that attributes A and B have the same maximum degree of dependency. In order to select the best attributes and reduce the dimensionality, only one of these redundant attributes is chosen. To do this, the selection approach in [8] is adopted, where it is suggested to look at the next highest maximum degree of dependency within the attributes that are tied, and so on, until the tie is broken. In this example, attribute A is deleted from the list.
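The dependency degrees of Example 3.1 can be checked mechanically. The short sketch below (an illustration, not the authors' code) computes k as the size of the positive region divided by |U|, directly from the example's partitions:

```python
def degree(cond_partition, dec_partition, n):
    """k = |POS| / |U|: union of condition blocks wholly inside some decision class."""
    pos = set()
    for block in cond_partition:
        if any(block <= X for X in dec_partition):
            pos |= block
    return len(pos) / n

# Partitions of U = {1, ..., 5} over attributes A, B, C and D (Example 3.1).
UA = [{1, 2, 5}, {3, 4}]
UB = [{1}, {2, 3, 4, 5}]
UC = [{1, 2, 3, 4}, {5}]
UD = [{1, 2}, {5}, {3, 4}]

k_AB = degree(UA, UB, 5)  # B depends on A to degree 0.4
k_BC = degree(UB, UC, 5)  # C depends on B to degree 0.2
k_CD = degree(UC, UD, 5)  # D depends on C to degree 0.2
```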

3.5 Feature Evaluation via Classification

The performance of the best features generated in Section 3.4 is then further evaluated using two different classifiers, rough set and MLP. The classifiers are used to verify the performance of the selected features. The accuracy rate and response time achieved by each classifier are analysed to identify the effectiveness of the selected features. A high accuracy rate is important to ensure that the selected features are the most relevant ones and serve the classification architecture well, while a low response time allows the classifier to operate more efficiently. At the end of this phase, the results for the full features and the selected features are compared, in order to identify the effectiveness of the selected features in handling the classification problem.
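A harness for these two evaluation criteria might look like the following sketch. It is generic (a toy nearest-centroid stand-in, not the rough set or MLP classifiers used in this study) and only illustrates how accuracy rate and response time would be measured for a given feature set; all names and data are illustrative.

```python
import time

def evaluate(classify, X_test, y_test):
    """Return (accuracy rate in %, response time in seconds) for a classifier."""
    start = time.perf_counter()
    predictions = [classify(x) for x in X_test]
    response_time = time.perf_counter() - start
    accuracy = 100.0 * sum(p == y for p, y in zip(predictions, y_test)) / len(y_test)
    return accuracy, response_time

def make_nearest_centroid(X_train, y_train):
    """Toy stand-in classifier: predict the class with the nearest centroid."""
    groups = {}
    for x, y in zip(X_train, y_train):
        groups.setdefault(y, []).append(x)
    centroids = {y: [sum(col) / len(rows) for col in zip(*rows)]
                 for y, rows in groups.items()}
    def classify(x):
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))
    return classify

# Tiny synthetic example with two instrument families.
clf = make_nearest_centroid([(0, 0), (0, 1), (5, 5), (5, 6)],
                            ['membranophone', 'membranophone',
                             'idiophone', 'idiophone'])
accuracy, response_time = evaluate(clf, [(0, 0), (5, 5)],
                                   ['membranophone', 'idiophone'])
```

Running the same harness once with all features and once with the selected subset yields the two (accuracy, time) pairs that are compared at the end of the phase.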

4 RESULTS AND DISCUSSION

The main objective of this study is to select the best features using the proposed technique. Afterwards, the performance of the selected features is assessed using two different classifiers, rough set and MLP. As mentioned, the assessment of the performance is based on the accuracy rate and response time achieved. Thus, in this section, the results of this study are presented as follows.

4.1 The Best k Value for Discretization is Determined

The original dataset, in continuous values, is discretized into categorical form in order to employ rough set theory. For that, the modified equal width binning technique is employed. In this study, values of k (the number of intervals) from 2 to 10 are investigated. The best k value is determined based on the highest classification accuracy achieved by the rough set classifier. The finding reveals that k = 3 generates the highest classification accuracy, up to 99%, as shown in Table 7. This k value is then applied in the proposed feature selection technique to identify the best features for the dataset.
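For illustration, plain equal-width binning can be sketched as below. This is the unmodified baseline of the technique in [10], not the authors' modified variant, and the sample values are hypothetical.

```python
def equal_width_bins(values, k):
    """Discretize continuous values into k equal-width intervals labelled 0..k-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:                    # constant feature: everything in a single bin
        return [0] * len(values)
    labels = [int((v - lo) / width) for v in values]
    return [min(b, k - 1) for b in labels]   # clamp the maximum into the last bin

bins = equal_width_bins([0.1, 0.2, 0.35, 0.6, 0.9], 3)
```

With k = 3, each continuous feature value maps to one of three categorical labels, matching the discretization applied before feature selection.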

4.2 Irrelevant Features are Eliminated

The dataset is represented in decision table form as S = (U, A ∪ {d}, V, f). There are 1116 instances in the universe U, with the family of the instruments as the decision attribute d and all other attributes shown in Table 5 as the set of condition attributes A. The distribution of instances over the classes is uniform, with no missing values in the data. From the data cleansing step, it is found that {MMFCC1, SMFCC1} is the dispensable (irrelevant) set of features. This means that the number of relevant features is 35 out of the 37 original features. Thus, the relevant features can be represented as A − {MMFCC1, SMFCC1}.

4.3 Finding the Best Features

In this experiment, the proposed technique is employed to identify the best features for Traditional Malay musical instruments sounds classification. As demonstrated in Table 8, all 35 relevant features are ranked from the highest to the lowest maximum degree of attribute dependency. From the table, it is interesting to see that some of the features adopted in this study are redundant. In order to reduce the dimensionality of the dataset, only one feature from each redundant group is selected. It is revealed that the proposed feature selection technique successfully selects the best 17 features out of the 35 available. The best selected features are given in Table 9.

4.4 The Performance of the Selected Features

In this study, two datasets, one consisting of the full features and one of the selected features (generated by the proposed technique), are used as input to classify the Traditional Malay musical instruments sounds into four families: membranophone, idiophone, chordophone and aerophone. This approach is meant to assess the performance of the selected features as compared to the full features. The performance is determined based on two factors: the accuracy rate and the response time. For that, two different classifiers, rough set and MLP, are exploited. From Table 10, there was a slight improvement in terms of both the accuracy rate and the response time with the best 17 features as compared to the 35 full features using the MLP classifier. The overall performance of this classifier is satisfactory, up to 95%. On the other hand, with the rough set classifier, the accuracy rate of 99% achieved by the selected features is the same as with the full features. However, it is notable that the response time is up to 80% faster for the selected features as compared to the full features. These results show that the proposed feature selection technique is able to select the best features for Traditional Malay musical instruments sounds and improve the classification performance, especially the response time.

TABLE 7
FINDING THE BEST K VALUE FOR DISCRETIZATION

k                              2     3     4     5     6     7     8     9     10
Classification accuracy (%)   93.6  98.9  98.6  98.4  98.4  98.3  98.3  98.3  98.3

TABLE 8
FEATURE RANKING USING THE PROPOSED METHOD

Number of Feature   Name of Feature   Maximum Degree of Dependency of Attributes
3                   STDZCR            0.826165
36                  SMFCC12           0.655914
23                  MMFCC12           0.52509
24                  MMFCC13           0.52509
22                  MMFCC11           0.237455
30                  SMFCC6            0.208781
31                  SMFCC7            0.208781
1                   ZC                0.193548
37                  SMFCC13           0.1819
32                  SMFCC8            0.108423
33                  SMFCC9            0.108423
34                  SMFCC10           0.108423
35                  SMFCC11           0.108423
27                  SMFCC3            0.087814
29                  SMFCC5            0.087814
11                  STDFLUX           0.077061
21                  MMFCC10           0.077061
20                  MMFCC9            0.074373
6                   MEANC             0.065412
19                  MMFCC8            0.065412
18                  MMFCC7            0.056452
28                  SMFCC4            0.056452
7                   STDC              0.042115
8                   MEANB             0.042115
9                   STDB              0.042115
13                  MMFCC2            0.031362
16                  MMFCC5            0.031362
17                  MMFCC6            0.031362
5                   STDRMS            0.021505
10                  MEANFLUX          0.011649
2                   MEANZCR           0
4                   MEANRMS           0
14                  MMFCC3            0
15                  MMFCC4            0
26                  SMFCC2            0

TABLE 9
THE BEST SELECTED FEATURES

Number of Feature   Name of Feature   Maximum Degree of Dependency of Attributes
3                   STDZCR            0.826165
36                  SMFCC12           0.655914
23                  MMFCC12           0.52509
22                  MMFCC11           0.237455
30                  SMFCC6            0.208781
1                   ZC                0.193548
37                  SMFCC13           0.1819
32                  SMFCC8            0.108423
27                  SMFCC3            0.087814
11                  STDFLUX           0.077061
20                  MMFCC9            0.074373
6                   MEANC             0.065412
18                  MMFCC7            0.056452
7                   STDC              0.042115
13                  MMFCC2            0.031362
5                   STDRMS            0.021505
10                  MEANFLUX          0.011649


TABLE 10
THE COMPARISON OF FEATURES CAPABILITIES VIA CLASSIFICATION PERFORMANCE

              Rough Set                              MLP
Features      Accuracy (%)   Response time (sec)    Accuracy (%)   Response time (sec)
All 35        99             2.075                  94             125.53
Best 17       99             0.405                  95             120.72

5 CONCLUSION AND FUTURE WORKS

In this study, an alternative feature selection technique using rough set theory, based on the maximum dependency of attributes, for Traditional Malay musical instruments sounds is proposed. A non-benchmark dataset of Traditional Malay musical instruments sounds is utilized. Two categories of feature schemes, perception-based and MFCC, comprising 37 attributes in total, are extracted. Afterwards, the dataset is discretized into 3 categorical values. The proposed technique is then adopted for feature selection through feature ranking and dimensionality reduction. Finally, two classifiers, rough set and MLP, are employed to evaluate the performance of the selected features in terms of the accuracy rate and response time produced. Overall, the findings show that the relevant features selected by the proposed model reduce the complexity of the process and produce high classification accuracy. Future work will investigate the effectiveness of the proposed technique in other musical instrument sound domains and apply different types of classifiers to validate the performance of the selected features.

ACKNOWLEDGEMENT

This work was supported by Universiti Tun Hussein Onn Malaysia (UTHM).

REFERENCES

[1] Liu, M., and Wan, C.: Feature Selection for Automatic Classification of Musical Instrument Sounds. In: Proceedings of the 1st ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL '01, 247–248 (2001)

[2] Deng, J.D., Simmermacher, C., and Cranefield, S.: A Study on Feature Analysis for Musical Instrument Classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 38 (2), 429–438 (2008)

[3] Benetos, E., Kotti, M., and Kotropoulos, C.: Musical Instrument Classification using Non-Negative Matrix Factorization Algorithms and Subset Feature Selection. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006, 5, 221–224 (2006)

[4] Pawlak, Z.: Rough Sets. International Journal of Computer and Information Science 11, 341–356 (1982)

[5] Banerjee, M., Mitra, S., and Anand, A.: Feature Selection using Rough Sets. In: Multi-Objective Machine Learning, Studies in Computational Intelligence 16, 3–20 (2006)

[6] Modrzejewski, M.: Feature Selection using Rough Sets Theory. In: Proceedings of the 11th International Conference on Machine Learning, LNCS 667, 213–226 (1993)

[7] Li, H., Zhang, W., Xu, P., and Wang, H.: Rough Set Attribute Reduction in Decision Systems. In: Proceedings of the 7th International Conference on Artificial Immune Systems, LNCS 5132, 132–141 (2008)

[8] Herawan, T., Mustafa, M.D., and Abawajy, J.H.: Rough set approach for selecting clustering attribute. Knowledge-Based Systems 23 (3), 220–231 (2010)

[9] Senan, N., Ibrahim, R., Nawi, N.M., and Mokji, M.M.: Feature Extraction for Traditional Malay Musical Instruments Classification. In: Proceedings of the International Conference of Soft Computing and Pattern Recognition, SOCPAR '09, 454–459 (2009)

[10] Palaniappan, S., and Hong, T.K.: Discretization of Continuous Valued Dimensions in OLAP Data Cubes. International Journal of Computer Science and Network Security 8, 116–126 (2008)

[11] Pawlak, Z.: Rough sets and fuzzy sets. Fuzzy Sets and Systems 17, 99–102 (1985)

[12] Pawlak, Z., and Skowron, A.: Rudiments of rough sets. Information Sciences 177 (1), 3–27 (2007)

[13] Zhao, Y., Luo, F., Wong, S.K.M., and Yao, Y.Y.: A general definition of an attribute reduct. LNAI 4481, 101–108 (2007)

[14] Pawlak, Z.: Rough classification. International Journal of Human-Computer Studies 51, 369–383 (1999)

[15] Warisan Budaya Malaysia: Alat Muzik Tradisional, http://malaysiana.pnm.my/kesenian/Index.htm

[16] Shriver, R.: Webpage, www.rickshriver.net/hires.htm

[17] Senan, N., Ibrahim, R., Nawi, N.M., Mokji, M.M., and Herawan, T.: The Ideal Data Representation for Feature Extraction of Traditional Malay Musical Instrument Sounds Classification. To appear in: De-Shuang Huang et al. (eds.): ICIC 2010, LNCS (2010)

Norhalina Senan received her B.Sc. and M.Sc. degrees in Computer Science from Universiti Teknologi Malaysia. She is currently pursuing her Ph.D. degree on Feature Selection for Traditional Malay Musical Instruments Sounds Classification using Rough Set at Universiti Tun Hussein Onn Malaysia. She is a lecturer at the Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia. Her research areas include data mining, multimedia and rough set theory.

Rosziati Ibrahim is with the Software Engineering Department, Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia (UTHM). She obtained her PhD in Software Specification from the Queensland University of Technology (QUT), Brisbane, and her MSc and BSc (Hons) in Computer Science and Mathematics from the University of Adelaide, Australia. Her research area is Software Engineering, covering Software Specification, Software Testing, Operational Semantics, Formal Methods, Data Mining, Image Processing and Object-Oriented Technology.

Nazri Mohd Nawi received his B.S. degree in Computer Science from the University of Science Malaysia (USM), Penang, Malaysia. His M.Sc. degree in Computer Science was received from the University of Technology Malaysia (UTM), Skudai, Johor, Malaysia. He received his Ph.D. degree from the Mechanical Engineering Department, Swansea University, Wales. He is currently a senior lecturer in the Software Engineering Department at Universiti Tun Hussein Onn Malaysia (UTHM). His research interests are in optimization, data mining techniques and neural networks.

Iwan Tri Riyadi Yanto received his B.Sc. degree in Mathematics from Universitas Ahmad Dahlan, Yogyakarta. He is a Master candidate in Data Mining at Universiti Tun Hussein Onn Malaysia (UTHM). His research areas include data mining, KDD, and numeric computation.

Tutut Herawan received his B.Ed. and M.Sc. degrees in Mathematics from Universitas Ahmad Dahlan and Universitas Gadjah Mada, Yogyakarta, respectively. He obtained his Ph.D. from Universiti Tun Hussein Onn Malaysia. Currently, he is a senior lecturer at the Computer Science Program, Faculty of Computer Systems and Software Engineering, Universiti Malaysia Pahang (UMP). He has published more than 40 research papers in journals and conferences. His research areas include data mining and KDD, and rough and soft set theories.