
DETERMINATION OF THE PROVENANCE OF VINICA TERRA COTTA ICONS USING SUPPORT VECTOR MACHINES
Vinka Tanevska, Igor Kuzmanovski*, Orhideja Grupče and

Vinica terra cotta icons/reliefs (Figure 1) were found during systematic archaeological excavations in 1985 in the Vinica Fortress, southwest of the town of Vinica, in the eastern part of the Republic of Macedonia (Figure 2a). Fifty undamaged terra cotta icons and over a hundred fragments, with more than fifteen different scenes, have been discovered so far. According to art historians, they date from the 6th to 7th century AD and represent exceptional examples of our Christian cultural heritage [1, 2].

Ten samples of partially preserved fragments of terra cotta icons and thirty-three clay samples from eight different sites within a radius of 12 km from Vinica (Figure 2b) were analyzed. Nineteen elements were determined using the following instrumental techniques: X-ray fluorescence, atomic absorption spectrophotometry and flame photometry. A simple comparison of the obtained data did not reveal the exact location of the clay used for the terra cotta icons [3]. Based on previous chemometric experience [4], a method using support vector machines (SVM) was developed to determine the provenance.

Figure 1. Vinica terra cotta icons

The entire procedure using genetic algorithms was repeated several times. To force the genetic algorithm to search for simpler models, a large penalty was imposed on models defined by more than six elements. This approach helps to exclude from the models the elements that do not have discriminative power.

More than 68 models (combinations of elements, penalty parameter and width of the Gaussian function) were able to classify the samples from the training set correctly using leave-one-out cross-validation. The frequency of appearance of the analysed elements in the models composed of fewer than seven elements is presented in Figure 7. The two elements that appear most frequently are K₂O and Sr. The elements Cr₂O₃, V₂O₅, Pb, Ni, Co, Zn, Cu and Ag are not as important for classification as the rest of the elements.
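The combination of leave-one-out validation and a penalty on models with more than six elements can be sketched as a fitness function. This is only an illustration under stated assumptions: the nearest-centroid classifier is a stand-in for the SVM, and the names `fitness`, `loo_errors`, the penalty value of 1000 and the toy data are hypothetical, not taken from the poster.

```python
import math

def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT an SVM): pick the class with the closest centroid."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, math.inf
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        dist = math.dist(centroid, x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_errors(X, y, predict=nearest_centroid_predict):
    """Leave-one-out: hold out each sample once, train on the rest, count mistakes."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) != y[i]:
            errors += 1
    return errors

def fitness(mask, X, y, max_elements=6, penalty=1000.0):
    """Lower is better: LOO errors, plus a large penalty for too many elements."""
    selected = [j for j, keep in enumerate(mask) if keep]
    if not selected:
        return math.inf  # a model with no elements cannot classify anything
    X_sub = [[row[j] for j in selected] for row in X]
    score = float(loo_errors(X_sub, y))
    if len(selected) > max_elements:
        score += penalty  # assumed penalty size; the poster only says "large"
    return score

# Hypothetical toy data: 8 "elements", only the first one separates the classes.
X_toy = [[0.1 * i] + [1.0] * 7 for i in range(3)] + \
        [[10.0 + 0.1 * i] + [1.0] * 7 for i in range(3)]
y_toy = ["A"] * 3 + ["B"] * 3

score_simple = fitness([1, 0, 0, 0, 0, 0, 0, 0], X_toy, y_toy)
score_full = fitness([1] * 8, X_toy, y_toy)
```

Both masks classify the toy samples without error, but the eight-element mask receives the penalty, so a search that minimizes this fitness prefers the simpler model.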

Figure 2. a – Map of the Republic of Macedonia; b – Vinica region (clay pits: 1 – 8)

Results and discussion

Support vector machines (SVM) are an algorithm suitable for binary classification of linearly separable classes. SVM are a very fast, simple and promising algorithm with good generalization performance. Using an appropriate kernel function (Figure 3), SVM can be transformed into a non-linear classifier.

Multi-category classification is performed by consecutive construction of several binary classifiers (Figure 4). The parameters of the SVM models (the penalty parameter and the width of the Gaussian kernel function), as well as the most suitable elements for the classification of the clay samples, were chosen using genetic algorithms. The chromosomes in the population were encoded as presented in Figure 5.
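As an illustration of the kernel idea, a Gaussian kernel can be written in a few lines. This is a minimal sketch, assuming the common parametrization K(x, z) = exp(-||x - z||² / (2·width²)); the poster does not state which form of the width parameter was used, and the function names are hypothetical.

```python
import math

def gaussian_kernel(x, z, width):
    """Similarity of two samples: 1.0 for identical points, tending to 0 with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def kernel_matrix(samples, width):
    """All pairwise kernel values; a kernel SVM works with this matrix, not raw features."""
    return [[gaussian_kernel(a, b, width) for b in samples] for a in samples]

# Hypothetical two-element compositions of three samples.
K = kernel_matrix([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]], width=1.0)
```

A small width makes the kernel values drop quickly with distance, giving a more flexible decision boundary; a large width makes the classifier behave more like a linear one.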

During the preliminary investigation, the width of the kernel function and the penalty parameter were searched in the intervals presented in Figure 6. In the final phase of the analysis, genetic algorithms were used to search the penalty parameter of the models and the width of the kernel function in the intervals that produce models with no classification errors (for the samples in the training set) and, at the same time, a smaller number of support vectors. In this phase, the best combination of elements suitable for classification was also determined.
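One way such a search could be set up is sketched here. The chromosome layout (one bit per element followed by two genes in [0, 1]), the logarithmic search intervals and all names are assumptions made for illustration; the actual encoding is the one shown in Figure 5 and the actual intervals those in Figure 6. The ranking follows the stated criteria: fewest training errors first, then fewest support vectors.

```python
import math

def decode_chromosome(chromosome, n_elements,
                      c_log10_range=(-2.0, 4.0), width_log10_range=(-2.0, 2.0)):
    """Split a chromosome into an element mask, a penalty parameter C and a kernel width."""
    mask = [bool(gene) for gene in chromosome[:n_elements]]
    c_gene, width_gene = chromosome[n_elements], chromosome[n_elements + 1]

    def scale(gene, bounds):
        # Map a gene in [0, 1] onto a logarithmic interval (assumed, not from the poster).
        low, high = bounds
        return 10.0 ** (low + gene * (high - low))

    return {"mask": mask,
            "C": scale(c_gene, c_log10_range),
            "width": scale(width_gene, width_log10_range)}

def rank_key(model):
    """Prefer no training errors first, then fewer support vectors."""
    return (model["errors"], model["n_support_vectors"])

params = decode_chromosome([1, 0, 1, 0.0, 1.0], n_elements=3)

# Hypothetical evaluation results for three candidate models.
candidates = [
    {"name": "m1", "errors": 0, "n_support_vectors": 21},
    {"name": "m2", "errors": 1, "n_support_vectors": 9},
    {"name": "m3", "errors": 0, "n_support_vectors": 14},
]
best = min(candidates, key=rank_key)
```

Here `m3` is preferred: it makes no training errors and uses fewer support vectors than `m1`, while `m2` is ranked last despite being the most compact.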

References

1. K. Balabanov, Terakotnite ikoni od Makedonija, Tabernakul, Skopje, 1995.
2. E. Dimitrova, Keramičkite reljefi od Viničkoto Kale, Gjurgja, Skopje, 1993.
3. S. Pavlovska–Josifovska, Hemiski ispituvanja na viničkite terakoti, M.Sc. thesis, Univerzitet “Sv. Kiril i Metodij”, Prirodno–matematički fakultet, Institut za hemija, Skopje, 1996.
4. V. Tanevska, I. Kuzmanovski, O. Grupče, Ann. Chim-Rome, 97 (2007) 541–552.
5. V.N. Vapnik, The Nature of Statistical Learning Theory, Wiley, New York, 1995.
6. L. Davis, The Handbook of Genetic Algorithms, Van Nostrand Reinhold, New York, 1991.

Figure 7. The frequency of appearance of different elements in the SVM models

Figure 6. The performances of the SVM models during the preliminary investigation (a – number of misclassified samples; b – number of support vectors for the same models)

Figure 5. The encoding of the chromosomes used for optimization of SVM models with GA

Genetic Algorithms

Support Vector Machines

Figure 3. Nonlinear mapping in higher dimensional feature space

Figure 4. Use of the SVM algorithm for classification of a data set consisting of three classes
