
International Journal of Turkish Education and Training

Uluslararası Türkçe Eğitimi ve Öğretimi Dergisi

Pages: 1-17, Issue (Sayı): 2, Year (Yıl): 1

ISSN: 2458-9462

Türkçe İçin Otomatik Zıt Anlamlılar Sözlüğü Oluşturma Aracı (AADICT)

Automated Antonym Dictionary Generation Tool for Turkish (AADICT)

Çağdaş Can Birant 1, Özlem Aktaş 2, Özgün Koşaner 3, Belgin Aksu 4, Yalçın Çebi 5

Özet

Bu çalışmada, Türkçe için geliştirilen Otomatik Zıt Anlamlılar Sözlüğü Oluşturma

Aracı (AADICT) kısaca açıklanmış ve algoritmaların gelişim süreci ayrıntılı olarak anlatılmıştır.

Türk Dil Kurumu (TDK: Türk Dil Kurumu) tarafından yayınlanan Güncel Türkçe Sözlük

verilerine AADICT uygulanarak zıt ve karşıt kelime veritabanı elde edilmiştir. Üç seviyede

işlemler uygulanarak zıt anlamlılar sözlüğü oluşturma süreci gerçekleştirilmiştir. Bu işlemlerin

bir sonucu olarak belirlenen zıt anlamlı sözcükler Kesin Zıt Anlamlı (Definite Antonym (Dn))

olarak sınıflandırılmış ve Zıt Anlamlı Sözcükler Listesi’ne (Antonym List (ALi)) konulmuştur. Kesin

Zıt Anlamlı (Dn) olarak sınıflandırılamayan bazı sözcükler Belirsiz Dosya (Ambiguity File (AF))

olarak adlandırılan ve ileride daha güvenilir zıt anlamlılar veritabanını oluşturabilmek

amacıyla denetimli yöntemlerle kontrol edilebilecekleri bir dosyada saklanmıştır.

Kesin Zıt Anlamlılar Veritabanı (Definite Antonyms Database (DADB)) olarak adlandırılan

zıt anlamlılar listesi, Güncel Türkçe Sözlük üzerine AADICT uygulanarak inşa edilmiş ve TDK

resmi internet sitesinde "Türkçe Karşıt Anlamlılar Sözlüğü" olarak yayınlanmıştır. Geliştirilen

bu sözlük, okullarda herhangi bir seviyedeki öğrencilere ve yabancı dil olarak Türkçe öğrenen

öğrencilere Türkçeyi doğru şekilde öğrenmelerine yardımcı olabilecektir. Ayrıca, bilişimsel

dilbilim alanında yapılan çalışmalarda da kullanılabilecektir.

Anahtar Kelimeler: zıt anlam, sözlük, Türkçe, dilbilim.

1 Res. Asst. Dr., Dokuz Eylul University Computer Engineering Department, e-mail: [email protected]

2 Asst. Prof. Dr., Dokuz Eylul University Computer Engineering Department, e-mail: [email protected], Corresponding Author

3 Asst. Prof. Dr., Dokuz Eylul University Linguistics Department, e-mail: kosaner.ozgun@deu.edu.tr

4 Türk Dil Kurumu, e-mail: [email protected]

5 Prof. Dr., Dokuz Eylul University Computer Engineering Department, e-mail: [email protected]

IJTET (Uluslararası Türkçe Eğitimi ve Öğretimi Dergisi: Kuram ve Uygulama), Volume 1, Issue 2, July 2016

Abstract

In this paper, an Automated Antonym Dictionary Generation Tool for Turkish (AADICT) is briefly described and the development of its algorithms is presented in detail. By applying AADICT to the data of the Contemporary Turkish Dictionary, published by the Turkish Linguistic Association (TDK: Türk Dil Kurumu), a database of antonyms and opposite words was obtained. The antonym dictionary was generated in three processes. The words identified as antonyms by these processes were classified as Definite Antonyms (Dn) and put into the Antonym List (ALi). Some words, which could not be classified as Dn, were classified as “Ambiguity” and stored in a file called the Ambiguity File (AF), to be checked by supervised methods in order to build a more reliable antonym database.

The antonym database built by applying AADICT to the Contemporary Turkish Dictionary, called the “Definite Antonyms Database (DADB)”, has been published as the “Turkish Antonyms Dictionary” on the official web site of TDK. This dictionary will help students at any school level, as well as those learning Turkish as a foreign language. It may also be used in computational linguistics research.

Keywords: antonym, dictionary, Turkish, linguistics.

1. Introduction

A dictionary is a collection of words (also called headwords) in a specific language, often listed alphabetically, with definitions, etymologies, phonetics, pronunciations, and other information (Agnes, 1999). Dictionaries are commonly printed in book form, but nowadays they can also be used online via the Internet.

Specialized dictionaries cover specific areas, such as idioms, proverbs, synonyms, acronyms, and antonyms. In a language, some words express concepts that are opposite to those expressed by other words. Such words are called antonyms. Antonyms are pairs of words with opposite meanings. They may have fully or nearly opposite meanings, and they make communication easier by allowing thoughts to be expressed appropriately. Linguists have identified three general categories of antonyms (Juveland, 2012):


I. Complementary Pairs

Complementary pairs of antonyms are perfect opposites with no intermediate states between them. For example, hungry is the complementary antonym of full, and far is the complementary antonym of near.

II. Gradable Pairs

Gradable pairs of antonyms are opposites at the extreme ends of a continuum of states. For example, first is the gradable antonym of last.

III. Relational Opposites

Relational opposites are words with opposite meanings that are used in similar conditions. For example, teach is the relational opposite of learn.

Antonym dictionaries are useful for at least the following purposes, both for general users and for researchers, especially researchers in linguistics:

- Automated data extraction from a large text corpus,

- Query clustering applications to help search engines that use a question-answer structure (Wen, Nie and Zhang, 2002),

- Automatic indexing procedures that assign each word stem to a concept class (Salton, 1971),

- Automatic machine translation studies (Edmonds, 1999),

- Automatic author recognition through lexical choice (Reiter and Sripada, 2002),

- Defining the conceptual structures and event types of verbs in order to provide more complete verb frames for syntactic parser software (Chu-Ren and Hong, 2005),

- Producing more lexically cohesive texts for authors from various fields (Donnely, 1994),

- Revealing the interactional relationship between syntax and semantics (Chief et al., 2000),

- Helping foreign language teachers teach vocabulary, and helping second language learners use vocabulary appropriate to the situation (Martin, 1984).

In this paper, the Automated Antonym Dictionary Generation Tool (AADICT) developed for Turkish is described in detail.

2. Automated Antonym Dictionary Generation Tool for Turkish (AADICT)

2.1. The Structure of the Data Used for AADICT

The basic data source used in this study is the Contemporary Turkish Dictionary, which includes more than 70,000 words and is published by the Turkish Linguistic Association (TDK). Antonyms are given in different forms in this dictionary database. After the database was obtained from the Turkish Linguistic Association, unnecessary fields and tables were removed and the database structure was simplified as given in Table 1.

Table 1. The Structure of the Source Data

Head Word | Meaning
Aç (Hungry) | Yemek yemesi gereken, tok karşıtı. (In need of food; opposite of satiated)
 | Yiyecek bulamayan (Unable to find food)
 | Gözü doymaz, haris (Insatiable, greedy)
 | Çok istekli, hevesli (Very eager, enthusiastic)
 | Karnı doymamış (Not having eaten one's fill)
Tok (Full) | Açlığını gidermiş, doymuş, aç karşıtı (Having satisfied one's hunger, satiated; opposite of hungry)
 | Kalın ve gür (ses) (Deep and strong (voice))
 | Sevgi, sevecenlik, başarı, para, mal vb. şeyleri elde etmiş ve bunlara kavuşmuş olan (One who has obtained and attained love, compassion, success, money, goods, etc.)

As shown in Table 1, some words have more than one meaning, and some meanings include both synonyms and antonyms. In general, synonyms appear as separate words in the meaning, but in some cases they are placed at the end of a descriptive sentence. Antonyms are placed before the word opposite (“karşıtı”) at the end of the meaning. In some cases, however, several comma-separated words precede the word opposite, and these may include synonyms of the main word as well as its antonyms, for example “Kara: siyah, ak, beyaz karşıtı”. Therefore, some ambiguities occurred during the antonym extraction process.

2.2. General Flow of AADICT

In this study, an Automated Antonym Dictionary Generation Tool (AADICT) was developed on the basis of both the characteristics of Turkish and the structure of the Turkish Dictionary. The general workflow diagram of the tool is given in Figure 1.

[Diagram components: Turkish Dictionary, DB Simplification, Antonym Parser, Definite Antonyms Database (DADB), Ambiguity File (AF), Ambiguity Resolver]

Figure 1: General Workflow Diagram.

In order to extract antonyms from the database and put them in order, an antonym parser is needed. Using the simplified dictionary database, which includes the head words and the meanings of each word, the “Antonym Parser (AP)” module determines the antonyms and the words causing ambiguities.

The words that are definitely classified as “antonyms” are stored in another database for further processing, and the words or phrases classified as “ambiguities” are stored in a file for a resolving process carried out by supervised techniques.

In the dictionary database simplification step, the meanings of the head words were taken one by one from different tables in the current Turkish dictionary database, and a new simplified database was generated. After the simplified database is generated, the antonym parsing process is applied by the “Antonym Parser (AP)” module. The flowchart of the AP module is given in Figure 2.

[Flowchart steps: Turkish Dictionary; Read headword (main word); Read meaning of main word; Antonym Parsing Process; Get a D; Store D in Antonym List (AL); Get a PO; Cross Check Process; Get a PD; Chain Cross Check Process; Store antonyms in AL into Definite Antonyms Database (DADB). Decision points: last PO, last PD, last meaning, last head word.]

Figure 2: General workflow diagram of “Antonym Parser” module

2.3. Description of AADICT

In the first step of the process, which can be defined as antonym parsing, the meanings of head words are taken one by one and examined (Figure 3).

[Flowchart steps: decision “Meaning includes the word opposite”; if yes, get the last word between the comma and the word “opposite” and indicate it as Dn; get the words between commas and Dn and indicate them as PDn; get POs.]

Figure 3: Workflow Diagram of “Antonym Parsing Process”

The head word whose antonyms are sought is called the “main word”. Each meaning of the main word is parsed to determine whether it is a whole sentence, a single word, or a sequence of single words separated by commas (“,”) with the word “opposite of” located at the end of the sentence. A word in the meaning of the main word is accepted as a “Possible (POn)” antonym when it stands alone in the meaning, or is a single word between commas, in a meaning that ends with the word “opposite of”. The word located just before the word “opposite” is accepted as the antonym of the main word, classified as a “Definite (Dn)” antonym, and stored in the “Antonym List (AL)”.

For the data given in Table 1, the “Antonym Parsing Process” module classified the word “tok (full)” as a Definite antonym (D) and stored it in the “Antonym List (AL)” for the main word “aç (hungry)”. For the main word “tok”, the word “doymuş (satiated)” was classified as a Possible antonym (PO), and the word “aç (hungry)” was classified as a Definite antonym (D) and stored in the AL.
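The parsing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `parse_meaning` is hypothetical, the marker is assumed to be the literal word “karşıtı” at the very end of the meaning, and Python's default lowercasing is used even though it does not handle the Turkish dotted and dotless i correctly.

```python
OPPOSITE = "karşıtı"  # the dictionary's marker word, "opposite of"

def parse_meaning(meaning):
    """Return (definite, possibles) for one meaning string.

    definite  -> the word just before "karşıtı" (D), or None
    possibles -> single words standing alone between commas (PO)
    """
    words = meaning.strip().rstrip(".").split()
    if len(words) < 2 or words[-1] != OPPOSITE:
        return None, []          # meaning does not end with the marker
    # Drop the marker, then split the rest on commas.
    parts = [p.strip() for p in " ".join(words[:-1]).split(",")]
    tail = parts[-1].split()
    if not tail:
        return None, []          # nothing between the last comma and the marker
    # Multi-word segments are descriptive text, so only single words count.
    possibles = [p.lower() for p in parts[:-1] if len(p.split()) == 1]
    return tail[-1].lower(), possibles

# Meanings taken from Table 1.
print(parse_meaning("Yemek yemesi gereken, tok karşıtı."))
print(parse_meaning("Açlığını gidermiş, doymuş, aç karşıtı"))
print(parse_meaning("Yiyecek bulamayan"))
```

For the two Table 1 meanings that end with “karşıtı”, the sketch yields “tok” and “aç” as definite antonyms and “doymuş” as the only possible antonym, which agrees with the classification reported above.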

After the antonym parsing process, all POn antonyms are cross-checked against the head words in the dictionary (Figure 4). If a POn is found as a head word in the dictionary, it is classified as a “Pre-Definite (PDn)” antonym to be processed in the next module; otherwise it is discarded.

[Flowchart steps: decision “PO is a head word”; if yes, classify as “Pre-Definite Antonym (PDn)”; if no, ignore PO.]

Figure 4: Workflow Diagram of “Cross Check Process”
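As a minimal sketch of the cross check, assuming the head words are available as an in-memory set (the paper does not say how the lookup is performed), the step reduces to a membership test. The function name `cross_check` and the token “xyz”, a made-up non-entry, are assumptions for illustration only.

```python
def cross_check(possibles, headwords):
    """Split possible antonyms (PO) into pre-definite ones (PD) and discarded words."""
    pre_definite = [w for w in possibles if w in headwords]
    discarded = [w for w in possibles if w not in headwords]
    return pre_definite, discarded

# Head words from the paper's "beyaz" example; "xyz" is a made-up non-entry.
headwords = {"beyaz", "ak", "siyah", "kara"}
pre_def, dropped = cross_check(["ak", "kara", "xyz"], headwords)
print(pre_def, dropped)
```

Here “ak” and “kara” are kept as pre-definite antonyms, and the made-up token is dropped.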

After a POn antonym is classified as a PDn antonym, the chain cross check process is applied (Figure 5). In this process, the meanings of each PDn are checked to see whether the main word appears just before the word “opposite” located at the end of the meaning. If so, the PDn is classified as Dn; otherwise it is stored in the Ambiguity File to be processed later.

[Flowchart steps: read meaning of a PD; decision “Meaning includes opposite of main word”; if yes, classify PD as D.]

Figure 5: Workflow Diagram of “Chain Cross Check Process”

The chain cross check process continues until no more words are found

to be classified as Dn. In some cases, the entire dictionary is required to be

checked to find the antonyms of the main word.
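A single pass of the chain cross check might look like the sketch below. It assumes the dictionary is a plain mapping from head word to a list of meaning strings and that the marker word is “karşıtı”. The names `word_before_opposite` and `chain_cross_check` are hypothetical, and the repetition “until no more words are found” is left out. The two sample entries are the meanings of “ak” and “kara” listed in Table 2.

```python
OPPOSITE = "karşıtı"  # marker word, "opposite of"

def word_before_opposite(meaning):
    """Return the word just before a trailing 'karşıtı', or None."""
    words = meaning.strip().rstrip(".").split()
    if len(words) >= 2 and words[-1] == OPPOSITE:
        return words[-2].strip(",").lower()
    return None

def chain_cross_check(main_word, pre_definites, dictionary):
    """Classify each PD as definite (D) or ambiguous (sent to the Ambiguity File)."""
    definite, ambiguous = [], []
    for pd_word in pre_definites:
        meanings = dictionary.get(pd_word, [])
        if any(word_before_opposite(m) == main_word for m in meanings):
            definite.append(pd_word)
        else:
            ambiguous.append(pd_word)
    return definite, ambiguous

sample = {
    "ak": ["Kar, süt vb.nin rengi, beyaz, kara ve siyah karşıtı", "Bu renkte olan"],
    "kara": ["En koyu renk, siyah, ak, beyaz karşıtı"],
}
print(chain_cross_check("beyaz", ["ak", "kara"], sample))
```

With these entries, “kara” is confirmed because its meaning ends in “beyaz karşıtı”, while “ak” goes to the ambiguous list because the word before “karşıtı” in its meaning is “siyah”.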

The notation used in the “Antonym Parser” module is as follows.

$$\mathrm{PO}(w) = \{\, W_j \mid j = 1, \dots, m \,\}$$

where $m$ is the number of single words separated by commas before the word “opposite” in the meanings of the main word $w$;

$$\mathrm{PD}(w) = \{\, W_i \mid i = 1, \dots, n \,\}$$

where $n$ is the number of POs processed in the “Cross Check Process” module;

$$\mathrm{D}(w) = \{\, \mathrm{PD}_a \mid a = 1, \dots, n \,\}$$

where $n$ is the number of PDs processed in the “Chain Cross Check Process”;

$$\mathrm{Antonym}(w) = \{\, \mathrm{D}_i(w) \mid i = 1, \dots, n \,\}$$

where $n$ is the number of Ds in the AL.

The processes given above can be explained on a sample word. The

selected main word for this process is “beyaz (white)” which has five different

meanings in Turkish. The meanings of the main word, the Pre-Definite and

Possible Antonyms obtained during the processes are given in Table 2.

Table 2. The Meanings of Main Words, Pre-definite and Possible Antonyms

Head Word | Meaning
Beyaz (White) | Ak, kara, siyah karşıtı. (Hoar; opposite of dark, black)
 | Bu renkte olan. (Being of this colour)
 | Beyaz ırktan olan kimse. (A person belonging to the white race)
 | Baskıda normal karalıkta görünen harf çeşidi. (A typeface that appears at normal weight in print)
 | Beyaz zehir. (White poison/heroin)
Ak (Hoar) | Kar, süt vb.nin rengi, beyaz, kara ve siyah karşıtı (The colour of snow, milk, etc.; white; opposite of dark and black)
 | Bu renkte olan (Being of this colour)
 | Beyaz leke (White stain)
 | Temiz (Clean)
 | Dürüst (Honest)
 | Sıkıntısız, rahat (Untroubled, comfortable)
Siyah (Black) | Kara, ak, beyaz karşıtı (Dark; opposite of hoar and white)
 | Bu renkte olan (Being of this colour)
 | Baskıda başka harflerden daha kalın görünen harf türü (A typeface that appears bolder than other letters in print)
Kara (Dark) | En koyu renk, siyah, ak, beyaz karşıtı (The darkest colour; black; opposite of hoar, white)

In the first step of the general process, called the “Antonym Parsing Process”, the word “siyah (black)” was classified as a Definite Antonym (D1) and stored in the Antonym List for the main word “beyaz (white)”; the words “ak (hoar)” and “kara (dark)” were classified as Possible Antonyms (PO1 and PO2) (Figure 6).


Figure 6: Antonym Parsing Process for “beyaz” Main Word

After the “Antonym Parsing Process” module, the “Cross Check Process” module was run, and the words “ak (hoar)” and “kara (dark)” were checked to see whether they exist in the dictionary as head words. Both words were found in the dictionary and were classified as PD1 and PD2 respectively (Figure 7).

Figure 7: Cross Checking Process for “ak (hoar)” and “kara (dark)”.

In the “Chain Cross Check Process” step, the meanings of PD1 and PD2 were first checked to see whether they include the word “opposite”. In the first meaning of PD1, “ak (hoar)”, the word located just before the word “opposite” was not the same as the main word, so PD1 was not classified as definite and was put into the ambiguity list to be processed later. In the first meaning of PD2, “kara (dark)”, the word located just before the word “opposite” was the main word, so PD2 was classified as D2 and stored in the Antonym List (AL) (Figure 8).


Figure 8: Definite Antonyms for The Main Word “beyaz”.
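The walkthrough for “beyaz” can be replayed end to end with a short, self-contained sketch. The in-memory `DICTIONARY` mapping holds a subset of the Table 2 meanings. The helper names `split_meaning` and `find_antonyms` are hypothetical, and the code illustrates the three steps as described in the text rather than reproducing the actual AADICT code.

```python
OPPOSITE = "karşıtı"  # marker word, "opposite of"

# A subset of the meanings listed in Table 2.
DICTIONARY = {
    "beyaz": ["Ak, kara, siyah karşıtı.", "Bu renkte olan."],
    "ak": ["Kar, süt vb.nin rengi, beyaz, kara ve siyah karşıtı", "Bu renkte olan"],
    "siyah": ["Kara, ak, beyaz karşıtı", "Bu renkte olan"],
    "kara": ["En koyu renk, siyah, ak, beyaz karşıtı"],
}

def split_meaning(meaning):
    """Return (word before the marker, single words between commas), or (None, [])."""
    words = meaning.strip().rstrip(".").split()
    if len(words) < 2 or words[-1] != OPPOSITE:
        return None, []
    parts = [p.strip() for p in " ".join(words[:-1]).split(",")]
    tail = parts[-1].split()
    if not tail:
        return None, []
    singles = [p.lower() for p in parts[:-1] if len(p.split()) == 1]
    return tail[-1].lower(), singles

def find_antonyms(main_word, dictionary):
    """Run parsing, cross check and one chain cross check pass for main_word."""
    definite, ambiguous = [], []
    for meaning in dictionary[main_word]:
        d, possibles = split_meaning(meaning)      # antonym parsing
        if d is None:
            continue
        definite.append(d)
        for po in possibles:
            if po not in dictionary:               # cross check: not a head word
                continue
            befores = [split_meaning(m)[0] for m in dictionary[po]]
            if main_word in befores:               # chain cross check
                definite.append(po)
            else:
                ambiguous.append(po)
    return definite, ambiguous

print(find_antonyms("beyaz", DICTIONARY))
```

On this data the sketch returns “siyah” and “kara” as definite antonyms of “beyaz” and leaves “ak” in the ambiguous list, matching the outcome described above.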

After the definite antonyms were found by AADICT, they were stored in the “Definite Antonyms Database (DADB)” in the form shown in Table 3.

Table 3. Sample Data in DADB

Head Word | Antonyms
beyaz | siyah
ak | kara
ince | iri / kaba / kalın / pes
kaba | ince
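One possible way to hold such data is a two-column table with one row per head word and antonym pair. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table name `dadb`, its column names, and the helper functions are assumptions made for illustration, since the paper does not describe the actual DADB schema.

```python
import sqlite3

def build_dadb(pairs):
    """Create an in-memory table with one row per (head word, antonym) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dadb (head_word TEXT, antonym TEXT, "
        "PRIMARY KEY (head_word, antonym))"
    )
    # INSERT OR IGNORE silently skips duplicate pairs.
    conn.executemany("INSERT OR IGNORE INTO dadb VALUES (?, ?)", pairs)
    return conn

def antonyms_of(conn, head_word):
    """Return the stored antonyms of head_word, ordered by the antonym text."""
    rows = conn.execute(
        "SELECT antonym FROM dadb WHERE head_word = ? ORDER BY antonym",
        (head_word,),
    )
    return [row[0] for row in rows]

# Rows taken from Table 3.
conn = build_dadb([
    ("beyaz", "siyah"), ("ak", "kara"),
    ("ince", "iri"), ("ince", "kaba"), ("ince", "kalın"), ("ince", "pes"),
    ("kaba", "ince"),
])
print(" / ".join(antonyms_of(conn, "ince")))
```

Storing one pair per row keeps head words with several antonyms, such as “ince”, easy to query, and the slash-separated form of Table 3 can be produced at display time.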

2.4. Ambiguities and Solving Methods

Like synonyms, the antonyms of a head word are written between commas within the meaning. Therefore, some ambiguities appear during the antonym search, as in the head word “beyaz (white)”, which has five different meanings as given in Table 2.

In the first meaning of the head word “beyaz”, the definition is given as “Ak, kara, siyah karşıtı”. The words between commas, “ak” and “kara”, are accepted as POs. After the method described above is applied to these words, both are determined to be Pre-Definite antonyms (PDn) of the word “beyaz”.


However, while “kara” is an antonym of “beyaz”, “ak” is a synonym. Since the word “ak” also appears in the meaning of “kara” and is determined as its pre-definite antonym, this problem cannot be solved during the cross check process. Meanings that contain such ambiguities also have the phrase “… karşıtı (opposite of …)” at the end. In most cases, the problem can be handled by taking the word immediately preceding that phrase. The PDs or POs that have such ambiguities and cannot be resolved by the AADICT system are stored in a file named the “Ambiguity File (AF)” for further supervised resolution.

The supervised process is carried out by experts at the Turkish Linguistic Association (TDK) and the Dokuz Eylul University Linguistics Department. After the verification process, the antonyms denoted as D for any head word are stored in the DADB.

Conclusion

Antonym dictionaries help students at any school level, as well as those learning Turkish as a foreign language. They are also used by researchers working in the field of Natural Language Processing. Studies on the computerized analysis of Turkish, such as n-gram analysis, dependency analysis of words, automated machine translation, and automatic author detection, began in the 1990s, but until now no antonym dictionary had been created. Therefore, in collaboration with the Turkish Linguistic Association (TDK: Türk Dil Kurumu), an Automated Antonym Dictionary Generation Tool for Turkish (AADICT) was developed.

In this study, AADICT is described in general. The data used throughout this study were taken from the Contemporary Turkish Dictionary, which is published by TDK. Using these data, an antonym database for Turkish was developed, and it has been published by TDK on its official web site at www.tdk.org.tr (Turkish Language Association, 2012) (Figure 9).


Figure 9: Antonyms of the Word “ince” in the “Online Antonyms Dictionary of TDK”

Since the original dictionary database was not directly suitable for processing and included unnecessary data, a simplification and filtering process was carried out by removing unnecessary fields and tables. Although AADICT in general determined the antonyms in the dictionary correctly, in some cases, depending on both the nature of Turkish and the structure of the dictionary, some words were misidentified. For example, some words have more than one meaning, and some meanings in the data source include both synonyms and antonyms, resulting in ambiguities in the antonym extraction process. In general, antonyms are located at the end of the meaning list, before the word “opposite”, as in “aç: yemek yemesi gereken, tok karşıtı”, but in some cases they are located at the end of the descriptive sentence after the “,” (comma) mark, as in “beyaz: ak, kara, siyah karşıtı”. However, words located in the same position as antonyms may cause ambiguity, as in the meaning of “beyaz”. In order to generate a reliable antonym dictionary and overcome such ambiguities, supervised methods are required. For the antonym dictionary, all ambiguities were reviewed and resolved by experts at the Turkish Linguistic Association (TDK) and the Dokuz Eylül University Linguistics Department.


References

Agnes, M. E. (1999). Webster's New World College Dictionary. Cleveland: New World Dictionaries.

Chief, L., Huang, C., Chen, K., Tsai, M. & Chang, L. (2000). What can near synonyms tell us? Computational Linguistics and Chinese Language Processing, 5 (1), pp. 47-60.

Chu-Ren, H. & Hong, J. (2005). Deriving Conceptual Structures from Sense: A Study of Near Synonymous Sensation Verbs. Journal of Chinese Language and Computing (JCLC), 15 (3), Singapore.

Donnely, C. (1994). Linguistics for Writers. Buffalo: State University of New York Press.

Edmonds, P. (1999). Semantic Representations of Near-Synonyms for Automatic Lexical Choice (Ph.D. thesis). Toronto: Computer Science Department, University of Toronto.

Juveland, S. (2012). What is an Antonym? http://www.ehow.com/facts_7227940_antonym_.html#ixzz294W8QxR2 (last accessed: 14.11.2012).

Martin, M. (1984). Advanced Vocabulary Teaching: The Problem of Synonyms. The Modern Language Journal, 68 (2), pp. 130-137.

Reiter, E. & Sripada, S. (2002). Human Variation and Lexical Choice. Computational Linguistics, 28 (4), pp. 545-553.

Salton, G. (1971). The SMART Retrieval System: Experiments in Automatic Document Processing. Englewood Cliffs, NJ: Prentice-Hall Inc.

Turkish Language Association (Türk Dil Kurumu: TDK). (2012). Zıt Anlamlı Kelimeler Sözlüğü, tdk.org.tr/esveyakin/ (last accessed: 14.11.2012).

Wen, J., Nie, J. & Zhang, H. (2002). Query clustering using user logs. ACM Transactions on Information Systems, 20 (1), pp. 59-81.