Corpus and OUHK

Charles Ko Ka Shing

Upload: charlesko

Post on 12-Nov-2014


DESCRIPTION

Aim and Objectives:
• To introduce readers to the power of Corpus Linguistics
• To stimulate readers’ curiosity about using corpora as a research methodology in any field

TRANSCRIPT

Page 1: Corpus and OUHK

Corpus and OUHK

Charles Ko Ka Shing

Page 2: Corpus and OUHK

Aim

To introduce the audience to the power of Corpus Linguistics

Page 3: Corpus and OUHK

Foreword

What is a corpus? Corpus (Latin plural corpora, English plural corpuses or corpora) is Latin for "body". It may refer to:

• Habeas corpus, a legal mechanism to end detention of a suspect
• Corpus delicti, a legal term meaning "body of the crime...

Source: http://en.wikipedia.org/wiki/Corpus

[Corpus in Linguistics, Applied Linguistics and Corpus Linguistics]

“Text corpus, in linguistics, a large and structured set of texts

Speech corpus, in linguistics, a large set of speech audio files”

Page 4: Corpus and OUHK

Find Corpora on Yahoo! http://tw.search.yahoo.com/search?fr=fp-tab-web-t&ei=UTF-8&p=corpus

Example: “The Corpus of Contemporary American English (COCA) is the largest freely-available corpus of English, and the only large and balanced corpus of American English. The corpus was created by Mark Davies of Brigham Young University, and it is used by tens of thousands of users every month (linguists, teachers, translators, and other researchers). COCA is also related to other large corpora that we have created.”

Source: http://corpus.byu.edu/coca/

Page 5: Corpus and OUHK

What is Corpus Linguistics?

“[…] corpus linguistics is a whole system of methods and principles of how to apply corpora in language studies and teaching/learning, it certainly has a theoretical status. Yet theoretical status is not theory in itself…”

(McEnery et al. 2006: 7f.)

Page 6: Corpus and OUHK

A taste of GloWbE

In this presentation, I use the most up-to-date corpus, the Corpus of Global Web-based English (GloWbE).

GloWbE (http://corpus2.byu.edu/glowbe/), which is hosted on the corpus2 website, covers 20 varieties of English.

Page 7: Corpus and OUHK

10 selected Hong Kong “universities”

Familiarity with the names of the following selected institutions (N.B. the emphasis is on OUHK):

  1. The Open University of Hong Kong (OUHK)
  2. The Hong Kong Polytechnic University (PolyU)
  3. City University of Hong Kong (CityU)
  4. Hong Kong Shue Yan University (HKSYU)
  5. Hong Kong Academy for Performing Arts (HKAPA)
  6. The Hong Kong Institute of Education (HKIEd)
  7. The Chinese University of Hong Kong (CUHK)
  8. The University of Hong Kong (HKU)
  9. Hong Kong Baptist University (HKBU)
  10. The Hong Kong University of Science & Technology (HKUST)

Page 8: Corpus and OUHK

Methods

I typed the abbreviation of each university’s name into the GloWbE search, one at a time, to find out which university’s name is mentioned most often across the varieties of English (or Englishes) around the world.
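The searches in this presentation were run through the GloWbE web interface. As an offline illustration of the same idea, here is a minimal Python sketch that counts exact, case-sensitive matches of each abbreviation in a piece of text, with no wildcards. The function name `count_abbreviations` and the sample sentence are hypothetical and are not taken from GloWbE.

```python
import re
from collections import Counter

# Abbreviations from the list of ten institutions in this presentation.
ABBREVIATIONS = ["OUHK", "PolyU", "CityU", "HKSYU", "HKAPA",
                 "HKIEd", "CUHK", "HKU", "HKBU", "HKUST"]

def count_abbreviations(text, abbreviations=ABBREVIATIONS):
    """Count exact, case-sensitive whole-token matches of each abbreviation."""
    # Split the text into alphabetic tokens, so "HKUST" never counts as "HKU".
    tokens = Counter(re.findall(r"[A-Za-z]+", text))
    return {abbr: tokens[abbr] for abbr in abbreviations}

# Hypothetical sample text for illustration only (NOT corpus data).
SAMPLE = ("She studied at HKU before moving to CUHK. "
          "Her brother took a distance course at OUHK, "
          "and HKUST is not counted as HKU.")

counts = count_abbreviations(SAMPLE)
for abbr, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(abbr, n)
```

Matching whole tokens rather than substrings matters here, because several abbreviations share a prefix (HKU, HKUST).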

Page 9: Corpus and OUHK

Analysis and Evaluation

The world “status” (number of occurrences in the GloWbE corpus) of the University of Hong Kong is the highest, while that of the Hong Kong Shue Yan University is the lowest.

Page 10: Corpus and OUHK

In addition, the following table shows the word frequency* of all ten selected Hong Kong institutions, in order:

Rank   Word type   Occurrences in GloWbE
1      HKU         1124
2      CUHK        640
3      HKUST       590
4      PolyU       341
5      CityU       164
6      OUHK        161
7      HKIEd       78
8      HKBU        65
9      HKAPA       28
10     SYU         14

* The first two columns give the words, or word types, by rank (1 = highest frequency; 10 = lowest frequency). The last column gives the number of occurrences in the GloWbE corpus.
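The ranking can be reproduced from the ten counts reported on this slide. The short Python sketch below sorts the counts and adds each name's share of the combined total. The share column is an extra illustration, not something computed in the original presentation, and the name `rank_by_frequency` is hypothetical.

```python
# Occurrence counts as reported on this slide (GloWbE, exact word types).
FREQUENCIES = {
    "HKU": 1124, "CUHK": 640, "HKUST": 590, "PolyU": 341, "CityU": 164,
    "OUHK": 161, "HKIEd": 78, "HKBU": 65, "HKAPA": 28, "SYU": 14,
}

def rank_by_frequency(freqs):
    """Return (rank, name, count, share) tuples, highest count first."""
    total = sum(freqs.values())
    ordered = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
    return [(i, name, n, n / total)
            for i, (name, n) in enumerate(ordered, start=1)]

ranking = rank_by_frequency(FREQUENCIES)
for rank, name, n, share in ranking:
    print(f"{rank:>2}  {name:<6} {n:>5}  {share:6.1%}")
```

The shares are relative to these ten search terms only, not to the size of the corpus.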

Page 11: Corpus and OUHK

Surprisingly, in the world of the GloWbE corpus the OUHK ranks higher than both the HKBU and the HKIEd, although it is generally agreed that the academic status of the OUHK is lower than that of these two tertiary institutions (see, e.g., the figure from http://www.4icu.org/hk/, sorted by the 2013 university web ranking according to 4icu.org). In terms of general facilities and academic support, the OUHK does not provide more than the HKBU or the HKIEd.

Page 12: Corpus and OUHK

However, the result may also be interpreted differently: the OUHK may have promoted open learning, especially during 2012-13, better than the other two institutions (i.e. HKBU and HKIEd). More people could then receive its education online, get to know its name and mention it, so its name occurs more often in the corpus than HKBU and HKIEd. (I did not, however, look closely at the KWIC (Key Word in Context) lines for each word type, so the results may not be completely accurate: some hits may not refer to the selected universities at all, and other names may be used in the corpus to refer to the HKBU and the HKIEd.)

The GloWbE corpus was released in April 2013, and it is a 2012-2013 corpus (http://corpus.byu.edu/). N.B. Throughout the data collection I did not use any wildcards, and no tag set was involved.
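The KWIC check mentioned above was not carried out in this study. To show what such a check involves, here is a minimal Python sketch of a Key Word in Context listing: each exact match is shown with a few words on either side, so a reader can judge whether the hit really refers to the university. The function `kwic`, the `width` parameter and the sample text are hypothetical and do not reproduce the GloWbE interface.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, keyword, right) context tuples for each exact token match."""
    tokens = re.findall(r"\w+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            # Up to `width` tokens of context on each side of the match.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Hypothetical sample text for illustration only (NOT corpus data).
SAMPLE = ("He enrolled at OUHK last year. "
          "The OUHK library opens late on Fridays.")

concordance = kwic(SAMPLE, "OUHK")
for left, kw, right in concordance:
    print(f"{left:>25} | {kw} | {right}")
```

This simple tokenizer drops punctuation, so the context can run across sentence boundaries.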

Page 13: Corpus and OUHK

In the future, there is definitely room for scholars to research the most-mentioned names of Hong Kong universities in each variety of English, adding an extra dimension to the research;

Page 14: Corpus and OUHK

or even, researchers can study the most-mentioned names of world universities in English, adding one more possible dimension.

Page 15: Corpus and OUHK

Conclusions?

Page 16: Corpus and OUHK

Try the latest Corpus of Global Web-based English (GloWbE): http://corpus2.byu.edu/glowbe/

Page 17: Corpus and OUHK

Reference

McEnery, T., Xiao, R. & Tono, Y. (2006). Corpus-based language studies: An advanced resource book. London/New York: Routledge.

Page 18: Corpus and OUHK

Thanks.