11742 neon hausa transcription guidelines (hau_asr002)-v10-20150904_0651
Projects
11742 Neon Hausa Transcription Guidelines (HAU_ASR002)
URL: https://wiki.appen.com/pages/viewpage.action?pageId=40820112
Date: 04-Sep-2015 06:51
Author: Bushra Zawaydeh
Table of Contents
1 Writing
1.1 Punctuation
1.2 Capital letters
1.3 Numbers
1.4 Abbreviations
1.5 Acronyms
1.6 Initialisms
1.7 Mixed Initialisms
1.8 Email and website addresses
1.9 Fragments
1.10 Interjections
2 Span Tags (highlighting)
3 Tags
3.1 Fillers
3.2 Foreign words
3.3 Unintelligible Speech
3.4 No Speech
3.5 Pause
3.6 Speaker noises
3.7 Other noises
3.8 Truncations
4 Less common tags
4.1 Overlapping speech
4.2 Speaker change (male < - > female)
4.3 Prompt
4.4 Untranscribable
The audio you will be listening to consists of recorded conversations in Hausa. This means that you will
hear the same speaker through more than one utterance and these utterances will be in sequence so that
the conversation makes sense.
Carefully read the guidelines below. Contact your supervisor if you have any questions about these
guidelines, as it is most important that you understand them and are able to use them correctly in your work.
1 Writing
1.1 Punctuation
Do not use any sentence punctuation (e.g. full stops, commas, question marks).
You may use punctuation only when a word requires it to be spelled correctly (e.g. the apostrophe or hyphen).
Hyphens appear mainly in English words to which you add a Hausa suffix:
America-wa
Guardiola-n
aunty-n
company-nunukan
lecture-cin
lecturer-in
The apostrophe is used for the glottal stop sound, which appears in words like the following:
wa'azi
ta'azi
sana'a
sa'a
The apostrophe is also used for <'y>, as in:
'ya'ya
'ya'yan
'yan
wa'yannan
wa'yansu
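The punctuation rule in section 1.1 can be checked mechanically. The following is a minimal sketch, not part of the project tooling: the function name `find_sentence_punctuation` is hypothetical, and the set of forbidden characters extends the three examples in the guidelines (full stop, comma, question mark) with `!`, `;` and `:` as an assumption.

```python
import re

# Sentence punctuation that should not appear in a transcript.
# Apostrophes and hyphens are deliberately absent because they can be
# part of a word (e.g. sana'a, America-wa).
# Assumption: "!", ";" and ":" are treated like the listed examples.
SENTENCE_PUNCTUATION = re.compile(r"[.,?!;:]")


def find_sentence_punctuation(transcript: str) -> list[str]:
    """Return every forbidden punctuation character found in the transcript."""
    return SENTENCE_PUNCTUATION.findall(transcript)
```

An empty result means the line contains no sentence punctuation; words such as 'ya'yan or lecturer-in pass untouched.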
1.2 Capital letters
Named entities (e.g. person names, place names, some time words) should be spelled with a capital letter as
per usual writing conventions for Hausa.
Examples:
Correct             Incorrect
Champions League    champions league
christian           Christian
Bompai              bompai
If a business name is spelled with a capital letter in the middle of the word, this is okay.
Example
eBay
iPhone
YouTube
Do not use a capital letter if the only reason is that the word is at the start of a sentence.
Example of a sentence that does not start with a capital letter. Do not capitalize sentence-initial
words unless the first word is a proper noun.
TRANSCRIPTION: yaya jiya yaya labari jiya ka je wurin to kua #um
In these examples, the first word is capitalized because it is a proper name:
TRANSCRIPTION: Allah ya jiya
TRANSCRIPTION: Bashir fa ya dan ne min zani na Bahijja
Use the name tag to highlight all names (see Span Tags (highlighting) below).
1.3 Numbers
Do not use any digits (e.g. 1 2 3 4 5 ...). All numbers must be spelled out as full words in the way they
were pronounced.
Example - the number '2012' may be pronounced in many different ways:
2012 ==> dubu biyu da goma sha biyu
2012 ==> alif dubu biyu da goma sha biyu (the Arabized ‘alif’ is rarely used, and then mainly in year dates)
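Because numbers must be written the way they were pronounced, they cannot be converted automatically, but leftover digits can at least be flagged for review. A minimal sketch, assuming a hypothetical helper named `find_digits`:

```python
import re


def find_digits(transcript: str) -> list[str]:
    """Return each run of digits that still needs to be spelled out in words."""
    return re.findall(r"\d+", transcript)
```

Any non-empty result points to a number the transcriber should listen to again and write out as spoken.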
1.4 Abbreviations
Do not use any abbreviations. Words must be spelled out in full.
Example from English:
Correct             Incorrect        Note
Saturday            Sat              Don't write Sat or Sat.; write Saturday, assuming they said the full word.
Elizabeth Street    Elizabeth St.    People will probably say "street", not "st", so you would write the full word.
The only exception is if someone pronounces the word as an abbreviation.
Example
Appen Butler Hill Inc ==> Appen Butler Hill Inc (if the person pronounced 'Inc' as 'Inc', not
'Incorporated')
1.5 Acronyms
An acronym is a word made up of the first letters of other words that is spoken as a word (e.g. NASA, FIFA).
Acronyms are spelled using capital letters joined with no space.
Example
NASA
FIFA
1.6 Initialisms
An initialism is an abbreviation made up of the first letters of other words where each letter is pronounced
separately (e.g. IBM, CPU, ADHD). Initialisms are spelled using capital letters joined by underscores.
Example
I_B_M
C_P_U
A_D_H_D
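The difference between the acronym and initialism spellings can be sketched in a few lines. This is an illustration only: the function name `format_letter_abbreviation` and the `spoken_as_word` flag are hypothetical, and the transcriber still has to decide by ear which case applies.

```python
def format_letter_abbreviation(letters: str, spoken_as_word: bool) -> str:
    """Spell an acronym (spoken as a word) or an initialism (spoken letter by letter).

    Acronyms are capital letters joined with no space; initialisms are
    capital letters joined by underscores.
    """
    # Keep alphabetic characters only, so input such as "I.B.M." also works.
    capitals = [ch.upper() for ch in letters if ch.isalpha()]
    separator = "" if spoken_as_word else "_"
    return separator.join(capitals)
```

For example, `format_letter_abbreviation("nasa", spoken_as_word=True)` gives `NASA`, while `format_letter_abbreviation("IBM", spoken_as_word=False)` gives `I_B_M`.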
1.7 Mixed Initialisms
Mixed initialisms involve combinations of words, letters, and numbers. When a single concept is expressed,
all parts are written together with an underscore. Models like 4S (below) are written separately from the
brand name. Numbers in a proper name are capitalised when written out.
iPhone four_S
Seven_Eleven
A_K_forty_seven
M_P_three
1.8 Email and website addresses
If you need to transcribe an email address or website address, part of it may be a 'nonsense' word that does
not mean anything. To identify the nonsense word, add an underscore at the start of the word.
Example
www.pjojeou.com ==> W_W_W dot _pjojeou dot com
jsmith@pjojeou.com ==> J_Smith at _pjojeou dot com
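The address rule can be sketched as a small function. It is a sketch under stated assumptions: `spell_out_address` is a hypothetical name, the caller supplies which parts are nonsense words, and the caller also supplies the spoken form of parts such as `www` (an initialism) because the code cannot know how they were pronounced.

```python
import re

# Spoken forms of the separators inside an address.
SEPARATORS = {".": "dot", "@": "at"}


def spell_out_address(
    address: str, nonsense_words: set[str], spoken_forms: dict[str, str]
) -> str:
    """Write an email or website address the way it was spoken."""
    words = []
    # The capture group makes re.split keep the separators in the result.
    for piece in re.split(r"([.@])", address):
        if not piece:
            continue
        if piece in SEPARATORS:
            words.append(SEPARATORS[piece])
        elif piece in nonsense_words:
            # Nonsense words are marked with a leading underscore.
            words.append("_" + piece)
        else:
            words.append(spoken_forms.get(piece, piece))
    return " ".join(words)
```

With `nonsense_words={"pjojeou"}` and `spoken_forms={"www": "W_W_W"}`, the address `www.pjojeou.com` becomes `W_W_W dot _pjojeou dot com`.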
1.9 Fragments
When a speaker pronounces only part of a word, write that part of the word and attach a hyphen to it. We
call this a fragment.
Example - someone begins to say 'motorcycle' but stops after 'moto'
she came to work today by moto- I mean car
Example: someone begins to say 'onions' but stops after 'on-' and then repeats the word in full
my eyes hurt when I cut on- onions
Make sure there is a space after the hyphen.
If it is not clear what the full word was going to be, do not transcribe the word and instead use the
unintelligible speech tag (see Unintelligible Speech below).
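Since a fragment always ends in a hyphen followed by a space, fragments can be told apart from hyphenated words such as America-wa, where the hyphen sits inside the word. A minimal sketch, assuming a hypothetical helper `find_fragments` and whitespace-separated tokens:

```python
def find_fragments(transcript: str) -> list[str]:
    """Return tokens that end in a hyphen, i.e. words the speaker did not finish."""
    return [
        token
        for token in transcript.split()
        # A lone "-" is not a fragment; a hyphen inside a word is a suffix join.
        if len(token) > 1 and token.endswith("-")
    ]
```

Running it on `my eyes hurt when I cut on- onions` returns `["on-"]`, while `lecturer-in` and the interjection `a-a` are not reported.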
1.10 Interjections
Interjections are very common in spoken Hausa, but strictly speaking they are not 'words' and would be
unlikely to show up in a dictionary or a newspaper article. You should write all interjections and spell them
as per the table below.
Description Sounds like ...
Agreement (yes) eee, mm, mhm, ooo
Disagreement (no) a-a, m-m
2 Span Tags (highlighting)
There are two types of tags: span tags (colored) and event tags (grey). Look for these in the screenshot
below.
Event tags are inserted between words, while span tags are used to highlight words.
Span Tag: How to use it
Foreign word: Use this to highlight any foreign words you can understand. This tag should not be used for
words that are completely unknown to you, or for foreign names (places, businesses, personal names).
See Foreign words below.
English loanword: English loanwords are words borrowed from English.
They are NOT considered foreign words for the purposes of this project and they should not receive a
foreign span tag NOR should they be replaced with a foreign tag. Please spell all English words in
English.
Example:
Correct Spelling    Incorrect Spelling
captain             kyaftin
Arabic loanword: Arabic loanwords are words borrowed from Arabic.
They are NOT considered foreign words for the purposes of this project and they should not receive a
foreign span tag NOR should they be replaced with a foreign tag. Please spell all Arabic words in
Latin script, using the guidelines written in the Spelling Guidelines document.
Example:
Correct Spelling    Incorrect Spelling
sallallahu          s`allallahu
subhan allahi       subhanallahi
Interjection: Use this to highlight any words that are classified as interjections.
Interjections are words that express emotions and reactions and are very common in spoken Hausa, but
are unlikely to show up in a dictionary or newspaper article.
For example, in English, if someone is surprised they may say "ooh".
Some Hausa examples are provided above (see Interjections).
Mispronunciation: Use this to highlight any words that were accidentally mispronounced by the person
speaking.
Spell the word in the normal (correct) way, then highlight it.
There is no need to use this if someone has an accent - it should only be used when the person
accidentally said something the wrong way.
When in doubt ask yourself "would this person pronounce the word differently if I asked them to repeat
themselves?"
If they would, it can be classified as a mispronunciation.
Unsure spelling: Use this to highlight any words that you are not sure how to spell.
This should not be used often, because you have spelling guidelines and you can search Google or Bing
for the names of people and places.
Name: Use this to highlight person names, place names, and brand / product / organization / business /
movie names.
Not all words that are capitalized are names.
You do NOT need to tag adjectives of nationality, holidays, days of the week, months of the year, etc.,
since these words do not denote a person, place or company name.
3 Tags
3.1 Fillers
Fillers are the sounds people make while they are thinking of what to say next:
Choose the tag which most closely resembles the sound the speaker makes when hesitating.
Example: speaker says "ee" after some noise:
gun budurwa zai je man
3.2 Foreign words
You may hear someone speaking in a foreign language. If you cannot understand the foreign speech, just
place a "foreign" tag in place of the words you cannot understand.
Example: "ariyo nono achiel ariyo" ==> you would tag this as
If someone uses just the occasional foreign word and you know how to spell it, write out the word and then
highlight it using the "foreign word" highlighting tag. See Span Tags (highlighting) above.
Example: the words "ich spreche" should be highlighted because they are not Hausa
ich spreche Hausa
Note, foreign names (people's names, place names, festival names, etc.) do NOT constitute foreign words
and should be spelled out. If you are unsure of the spelling, you can make your best guess and highlight it. If a
foreign name is particularly difficult to spell, you could search for it in Google to find the most common
variant of spelling.
Similarly, you must consider whether the 'foreign' word is in fact a 'loanword', meaning that it could be
considered part of Hausa now. If a word of foreign origin is commonly used and/or understood by
speakers (or a community of speakers) of the Hausa you are transcribing, it should be transcribed.
It is very important that we are consistent in the treatment of loanwords, so when in doubt,
choose to spell the word and highlight it with the loanword tag (English or Arabic) rather than using
the 'foreign' tag to replace it (see Span Tags above).
3.3 Unintelligible Speech
If you come across a word or several words that are not clear because there is interference, audio
problems, or because the person is not talking clearly, enter this tag in place of the unintelligible speech.
Of course you should try your best to listen and determine what was said, but in natural speech there will be
unintelligible words often. As a guide you should try at least three times to understand what was being said.
If it is not clear, insert the tag and move on.
Example - speaker mumbles something after "tun" and then continues speaking. The
tag replaces the unintelligible speech.
oke ba shi ke nan tun mu bare haka zuwa an jima zai yi wannan magana zai dinga
fita sosai mu ci gaba haka in ka na ji na dai mu ci gaba haka
3.4 No Speech
If an entire utterance contains no speech (e.g. there is only silence or noises), insert the no speech tag only and move on. Do not tag the noises in such utterances.
Unintelligible speech, fillers and interjections ARE considered speech.
All other noises - human and non-human i.e. lipsmack, laugh, breath, cough, click, ring, dtmf and
short_noise and long_noise, are NOT considered speech.
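The decision of whether an utterance counts as "no speech" can be sketched as follows. The noise tag names come from the list above, but the bracketed token syntax (`[cough]`), the `pause` entry, and the function name `contains_speech` are assumptions made for illustration; the actual tool inserts tags through its interface.

```python
# Noise tags that are NOT considered speech (names taken from the guidelines).
# Assumption: a pause is silence, so it is treated as non-speech as well.
NON_SPEECH_TAGS = {
    "lipsmack", "laugh", "breath", "cough", "click",
    "ring", "dtmf", "short_noise", "long_noise", "pause",
}


def contains_speech(tokens: list[str]) -> bool:
    """Return True if any token is a word or a speech-like tag.

    Tokens are words or tags; tags are written in square brackets
    (hypothetical syntax). Anything that is not a listed noise tag counts
    as speech, which covers unintelligible speech, fillers and interjections.
    """
    for token in tokens:
        is_tag = token.startswith("[") and token.endswith("]")
        if is_tag and token[1:-1] in NON_SPEECH_TAGS:
            continue
        return True
    return False
```

When `contains_speech` returns `False`, the whole utterance receives only the no speech tag and the individual noises are left untagged.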
3.5 Pause
Whenever there is a pause in speech for a period of 1 second or more, insert this tag.
Example - speaker takes a two second pause between "yi" and "yanzu" in the sentence below.
You would insert the pause tag.
yaya a ka yi yanzu ya back wannan magana sa dai nan dai ba bai sake yin irin
Use the tag for pauses of 1 second or more within speech (between words) and also for silence of 1
second or more before the person commences speaking or after they finish.
If noises such as lipsmack, laugh, breath, cough, click, ring, dtmf and short_noise occur in the
foreground during pauses of 1 second or more within speech, do not tag these noises - simply put
only a pause tag.
If there is no speech at all within an utterance, use the 'no speech' tag (see No Speech above).
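The one-second rule can be illustrated with word timings. This is only a sketch: transcribers judge pauses by ear, so the word-level timestamps, the `[pause]` token syntax, and the function name `insert_pauses` are all assumptions.

```python
PAUSE_TAG = "[pause]"        # hypothetical tag syntax
MIN_PAUSE_SECONDS = 1.0      # threshold stated in the guidelines


def insert_pauses(
    words: list[tuple[str, float, float]],
    utterance_start: float,
    utterance_end: float,
) -> list[str]:
    """Insert a pause tag wherever silence lasts one second or more.

    Each word is (text, start_time, end_time) in seconds. Silence is checked
    before the first word, between words, and after the last word.
    """
    tokens = []
    previous_end = utterance_start
    for text, start, end in words:
        if start - previous_end >= MIN_PAUSE_SECONDS:
            tokens.append(PAUSE_TAG)
        tokens.append(text)
        previous_end = end
    # Trailing silence only matters if there was speech at all; an utterance
    # without speech is handled by the no speech tag instead.
    if words and utterance_end - previous_end >= MIN_PAUSE_SECONDS:
        tokens.append(PAUSE_TAG)
    return tokens
```

Noises that fall inside such a gap are not passed in as words, which matches the rule that only the pause tag is written for them.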
3.6 Speaker noises
All noises made by the main speaker must be marked with one of the tags below.
Insert the tag exactly where the noise first occurs.
If it occurs at the same time as a word, put the tag BEFORE the word.
If the noise occurs more than once in sequence, you only need a single tag.
Tag: When to use it
lipsmack: lip smacks, tongue clicks
breath: loud inhalation and exhalation between words, yawning
cough: coughing, throat clearing, sneezing
laugh: laughing, chuckling
3.7 Other noises
Insert the relevant tag when you hear a noise that is not made by the speaker and which is at a comparable
volume to the speech.
Insert the tag exactly where the noise first occurs.
If it occurs at the same time as a word, put the tag BEFORE the word.
If the noise occurs more than once in sequence, you only need a single tag.
Tag: When to use it
click: Any interference from the phone line (e.g. crackling sounds) or click.
ring: The sound of a phone ringing.
dtmf: The sound made by pressing the telephone keypad (DTMF stands for Dual Tone
Multi-Frequency).
short_noise: Any other short noises that do not continue over several words (generally lasting less
than one second), for example: door slams, a loud cough by a person in the
background, car horns.
long_noise: Any other long noises that continue over longer periods of time and perhaps
multiple words (generally lasting more than one second), for example: wind, rain,
background speech or music. This tag is used when the noise begins. The point at
which the long noise ends is not marked. Low level background sounds are expected
and do not need to be tagged.
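The rule that a noise repeated in sequence needs only a single tag applies to both speaker noises and other noises. A minimal sketch of that clean-up step, assuming tags are written as bracketed tokens such as `[cough]` (a hypothetical syntax) and using the hypothetical name `collapse_repeated_tags`:

```python
def collapse_repeated_tags(tokens: list[str]) -> list[str]:
    """Keep one tag where the same noise tag occurs several times in a row.

    Repeated words are left alone, because speakers do repeat words and
    those repetitions must be transcribed.
    """
    result = []
    for token in tokens:
        is_tag = token.startswith("[") and token.endswith("]")
        if is_tag and result and result[-1] == token:
            continue  # same noise again with nothing in between
        result.append(token)
    return result
```

A tag that recurs after an intervening word is kept, since it is no longer part of the same sequence.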
3.8 Truncations
If a word gets cut off at the end of an utterance because the computer has not cut up the audio correctly,
this is called a truncation. This is different from a fragment (where the person stops talking part way through
a word). In a truncation, the recording has cut someone off while they were saying a word. Therefore,
truncations only occur at the start or end of an utterance.
When you hear a truncation at the end of an utterance, write out the truncated word in full followed by the
truncation tag.
Then, when you hear the remainder of the word in the second utterance, insert the truncation tag
ONLY.
Example - the sentence below is split across two separate utterances, and the word 'kawai' got
cut off at the end of the first utterance and at the start of the second
kar ka damu kawai
mu je dai a hakan
So in the example above, we add the truncation tag after "kawai", although the audio had just "kaw".
Then in the other audio, which has the remaining part, you add only the truncation tag.
If you can tell that a word was truncated but you don't know what the word is, simply insert the
unintelligible speech tag in place of the word and the truncation tag at the end of the first utterance,
and the truncation tag at the start of the second utterance.
Example - the sentence below is split across two separate utterances, and the word 'kawai' got
cut off at the end of the first utterance, but you couldn't understand what the truncated word
was:
kar ka damu
mu je dai a hakan
4 Less common tags
You are not likely to need the tags below because you are listening to half of a telephone
conversation. However, you need to learn how to use them in case they are needed.
4.1 Overlapping speech
When two people are talking at the same time and at a similar volume this is called overlapping speech. If
you hear this, do NOT transcribe the speech that is overlapped, and instead insert the overlap tag
in place of those words.
Someone talking in the background (quieter than the main speaker) is not an overlap and should
be treated as noise.
Noises overlapping with speech do not constitute an overlap. Only mark overlaps when two people
are speaking into the phone at once.
4.2 Speaker change (male < - > female)
Insert the relevant tag if a new person starts talking part way through a call, and they are of a different
gender.
Example - a female speaker has been talking then a male speaker starts talking
Example - a male speaker has been talking then a female speaker starts talking
If the speaker changes after a section of overlapping speech, this tag should be inserted AFTER
the overlap tag if the gender of the speaker has now changed.
4.3 Prompt
Use this tag if you hear any speech coming from a computer or a background recording, rather than from a
real person. For example:
computer generated voice
pre-recorded voicemail message
a call centre prompt
You should NOT transcribe the words. Insert the prompt tag in place of the words.
Example - you can hear a computer generated voice telling the speaker to press '1' to
speak with an operator.
YOU TRANSCRIBE:
4.4 Untranscribable
If an entire utterance contains persistent or overwhelming distortion, static or background noise which
makes it impossible to transcribe, insert the untranscribable tag and move on to the next utterance. In
reality this tag is not likely to be necessary.