bibletech2010.ppt
TRANSCRIPT
Andi Wu
Asia Bible Society
From Identical Strings to Similar Strings
Intelligent Search of Biblical Texts Based on Syntax and Semantics
Original Motivation
• Systematic approach to Bible translation
• To make a translation consistent, translators need to find not only identical phrases but also phrases that differ in wording yet are similar in meaning.
亚洲圣经协会
Identical Strings vs. Similar Strings
• Traditional Search:
  • Based on matches in form
  • Same words
  • Same word orders
• Intelligent Search:
  • Based on matches in meaning
  • Words can be different
  • Word orders can be different
Example of Traditional Search: Concordance
Genesis 1:1
בראשית ברא אלהים את השמים ואת הארץ׃
Deuteronomy 31:28
הקהילו אלי את־כל־זקני שבטיכם ושטריכם ואדברה באזניהם את הדברים האלה ואעידה בם את־השמים ואת־הארץ׃
Jeremiah 32:17
אהה אדני יהוה הנה אתה עשית את־השמים ואת־הארץ בכחך הגדול ובזרעך הנטויה לא־יפלא ממך כל־דבר׃
Haggai 2:21
אמר אל־זרבבל פחת־יהודה לאמר אני מרעיש את־השמים ואת־הארץ׃
Example of Similar Strings:
• Same words in different orders
Jeremiah 2:1
Ezekiel 24:20
Example of Similar Strings:
• Different words in different orders
Proverbs 1:7
Psalms 111:10
Similar Strings
• Strings that are similar in meaning
• Similar words in similar syntactic relationships
• Needed in Bible translation
The Importance of Syntactic Relations
• Similar strings != strings containing similar words
• The same words in different syntactic relations can mean very different things
An old man with a dog chased a young lady with an umbrella.
vs.
An old lady with a dog chased a young man with an umbrella.
Semantic Units of Sentences
Triples: dependency relationships between two words
e.g. In the beginning God created the heavens and the earth.
• God – create (subject-verb)
• create – heavens (verb-object)
• create – earth (verb-object)
• create – in the beginning (verb-adverbial)
• heavens – earth (conjunction)
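The triples listed above can be held in a very small data structure. What follows is a minimal Python sketch, not the Asia Bible Society implementation: the `Triple` class and its field names `first`, `second`, and `relation` are assumptions made for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A dependency relationship between two words (hypothetical layout)."""
    first: str     # first word of the pair, e.g. "God"
    second: str    # second word of the pair, e.g. "create"
    relation: str  # relation label, e.g. "subject-verb"


# Triples for "In the beginning God created the heavens and the earth."
genesis_1_1 = {
    Triple("God", "create", "subject-verb"),
    Triple("create", "heavens", "verb-object"),
    Triple("create", "earth", "verb-object"),
    Triple("create", "in the beginning", "verb-adverbial"),
    Triple("heavens", "earth", "conjunction"),
}

# Because a NamedTuple is hashable, membership tests on the set are direct.
print(Triple("God", "create", "subject-verb") in genesis_1_1)  # True
print(len(genesis_1_1))  # 5
```

Storing triples in a set, rather than a list, means word order in the sentence plays no role once the triples are extracted.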
Different Strings With the Same Triples
God created the heavens and the earth.
The heavens and the earth were created by God.
God created the heavens and He created the earth.
It is God who created the heavens and the earth.
• God – create (subject-verb)
• create – heavens (verb-object)
• create – earth (verb-object)
• heavens – earth (conjunction)
Different Strings With Similar Triples
God created man in his own image.
Adam is the man that God created.
Man was created by God on the sixth day.
I am a man created by God.
Triples in common:
• God – create (subject-verb)
• create – man (verb-object)
Similar Triples With Different Words
His troops were annihilated.
His army was destroyed.
His forces were wiped out.
• annihilate – troops (verb-object)
• destroy – army (verb-object)
• wipe out – forces (verb-object)
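One way to make triples with different words compare as equal is to replace every word by a representative of its synonym group before comparing. The sketch below is an assumption about how that could be done, not a description of the actual synonym database; the `SYNONYM_GROUP` table and the function names are hypothetical.

```python
from typing import Tuple

TripleT = Tuple[str, str, str]  # (word, word, relation)

# Hypothetical synonym table: each word maps to a representative of its group.
SYNONYM_GROUP = {
    "annihilate": "destroy",
    "destroy": "destroy",
    "wipe out": "destroy",
    "troops": "army",
    "army": "army",
    "forces": "army",
}


def canonical(word: str) -> str:
    """Return the group representative, or the word itself if it is unlisted."""
    return SYNONYM_GROUP.get(word, word)


def normalize(triple: TripleT) -> TripleT:
    """Replace both words of a triple by their group representatives."""
    first, second, relation = triple
    return (canonical(first), canonical(second), relation)


triples = [
    ("annihilate", "troops", "verb-object"),
    ("destroy", "army", "verb-object"),
    ("wipe out", "forces", "verb-object"),
]
normalized = {normalize(t) for t in triples}
print(normalized)  # {('destroy', 'army', 'verb-object')}
```

After normalization the three sentences contribute one and the same triple, so a plain set comparison treats them as matching.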
Data Requirement
To recognize similar strings in Biblical texts, we need:
• Syntactic analysis of the original Hebrew and Greek texts
• Synonym database of Hebrew and Greek
Both have already been developed at Asia Bible Society.
Triples
• Extracted from the syntax trees
• Strings for comparison: text covered by each node/subtree
• Similar strings: subtrees containing similar triples
Compute Similarities Between Subtrees
• Semantic space of a subtree: the set of triples (including their synonymous expansions) contained in the subtree
• Similar subtrees: subtrees whose semantic spaces overlap (set intersection)
• Degree of similarity: Set Intersection / Set Union
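The degree of similarity defined above is the size of the intersection divided by the size of the union (often called the Jaccard index). A minimal Python sketch follows; the function name `similarity` and the handling of two empty sets are assumptions, and the example triples are taken from the creation sentences shown earlier.

```python
def similarity(space_a: set, space_b: set) -> float:
    """Degree of similarity: size of the intersection over size of the union."""
    union = space_a | space_b
    if not union:
        return 0.0  # assumption: two empty semantic spaces count as dissimilar
    return len(space_a & space_b) / len(union)


creation = {
    ("God", "create", "subject-verb"),
    ("create", "heavens", "verb-object"),
    ("create", "earth", "verb-object"),
    ("heavens", "earth", "conjunction"),
}
man = {
    ("God", "create", "subject-verb"),
    ("create", "man", "verb-object"),
}
print(similarity(creation, man))       # 0.2 (1 shared triple out of 5)
print(similarity(creation, creation))  # 1.0
```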
Semantic Distance
= log( |Intersection| / |Union| ) * -1, where |X| is the number of elements in X and log is the natural logarithm
Set A = { a, b, c } Set B = { b, c, d, e }
Intersection = { b, c }
Union = { a, b, c, d, e }
Distance(A,B) = log(2/5)* -1 = 0.9162907318742
Set C = { a, b, c, d } Set D = { c, e, f, g, h }
Intersection = { c }
Union = { a, b, c, d, e, f, g, h }
Distance(C,D) = log(1/8)* -1 = 2.0794415416798
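The two worked examples can be reproduced with a few lines of Python. This is a sketch under two assumptions: the logarithm is the natural one (which matches the values shown), and sets with nothing in common are given an infinite distance, since log(0) is undefined. The function name `semantic_distance` is hypothetical.

```python
import math


def semantic_distance(a: set, b: set) -> float:
    """log(|A ∩ B| / |A ∪ B|) * -1, written as log(|A ∪ B| / |A ∩ B|)."""
    shared = len(a & b)
    if shared == 0:
        return math.inf  # assumption: no overlap means "infinitely far"
    return math.log(len(a | b) / shared)


A, B = {"a", "b", "c"}, {"b", "c", "d", "e"}
C, D = {"a", "b", "c", "d"}, {"c", "e", "f", "g", "h"}
print(round(semantic_distance(A, B), 4))  # 0.9163
print(round(semantic_distance(C, D), 4))  # 2.0794
print(semantic_distance(A, A))            # 0.0
```

Identical semantic spaces give a distance of 0, and the distance grows as the shared portion of the union shrinks.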
• Semantic Space of Joshua 18:3
= { fathers~you(Poss), God~fathers(Poss), Yahweh~God(Appos), Yahweh~give(S-V), give~you(V-O), land~give(Mod) }
• Semantic Space of Leviticus 4:1
= { fathers~you(Poss), God~fathers(Poss), Yahweh~God(Appos), Yahweh~give(S-V), give~you(V-O), land~give(Mod) }
• Intersection
= { fathers~you(Poss), God~fathers(Poss), Yahweh~God(Appos), Yahweh~give(S-V), give~you(V-O), land~give(Mod) }
• Union
= { fathers~you(Poss), God~fathers(Poss), Yahweh~God(Appos), Yahweh~give(S-V), give~you(V-O), land~give(Mod) }
• Semantic distance = log(6/6) * -1 = 0.0
• Semantic Space of Psalms 14:12
= { repay~person(V-O), as~deed(P-O), deed~him(Poss), repay~as(V-PP) }
• Semantic Space of Psalms 62:1
= { reward~everyone(V-O), as~deed(P-O), deed~him(Poss), reward~as(V-PP), you~reward(S-V) }
• Intersection = { repay/reward~person/everyone(V-O), as~deed(P-O), deed~him(Poss), repay/reward~as(V-PP) }
• Union = { repay/reward~person/everyone(V-O), as~deed(P-O), deed~him(Poss), repay/reward~as(V-PP), you~reward(S-V) }
The computation
• Pair-wise comparison of all phrases
• Keep pairs with semantic distance < 9.0
• 1,607,721 in the database
• More than 24 hours on a single machine for the computation
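The pair-wise step can be sketched as a loop over every unordered pair of phrases, keeping those below the cutoff. This is an illustration under stated assumptions, not the production code: the phrase ids, the toy semantic spaces, and the function names are hypothetical, and pairs with no overlap are treated as infinitely distant.

```python
import math
from itertools import combinations

THRESHOLD = 9.0  # cutoff quoted in the list above


def semantic_distance(a: frozenset, b: frozenset) -> float:
    """Natural-log distance; infinite when the two spaces share nothing."""
    shared = len(a & b)
    return math.inf if shared == 0 else math.log(len(a | b) / shared)


def similar_pairs(spaces: dict) -> list:
    """Compare every unordered pair once and keep those under the cutoff."""
    kept = []
    for (id_a, space_a), (id_b, space_b) in combinations(spaces.items(), 2):
        distance = semantic_distance(space_a, space_b)
        if distance < THRESHOLD:
            kept.append((id_a, id_b, distance))
    return sorted(kept, key=lambda item: item[2])  # closest pairs first


# Hypothetical phrase ids mapped to toy semantic spaces.
spaces = {
    "p1": frozenset({"a", "b", "c"}),
    "p2": frozenset({"b", "c", "d", "e"}),
    "p3": frozenset({"x", "y"}),
}
for id_a, id_b, distance in similar_pairs(spaces):
    print(id_a, id_b, round(distance, 4))  # p1 p2 0.9163
```

With n phrases this loop performs n(n-1)/2 comparisons, which is consistent with the long running time reported for a single machine.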
Linking OT and NT
Hebrew OT → Septuagint → Greek NT
• Automatic alignment
• Strong's number matching
  • Greek Strong's numbers for all words in the OT which occur in the NT
  • Match based on Greek Strong's numbers
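The matching step can be pictured as follows: each Hebrew word that has an aligned Septuagint word inherits that word's Greek Strong's number, and an OT phrase is then compared with an NT phrase by the numbers they share. The sketch below is only an illustration of that idea. The `HEBREW_TO_GREEK` table stands in for the output of the automatic alignment and is hand-entered, and the phrase contents and function names are hypothetical.

```python
# Hypothetical output of the Hebrew-to-Septuagint alignment:
# Hebrew Strong's number -> Greek Strong's number of the aligned word.
HEBREW_TO_GREEK = {
    "H430": "G2316",   # God
    "H8064": "G3772",  # heaven
    "H776": "G1093",   # earth
}


def greek_numbers(hebrew_numbers: list) -> set:
    """Greek Strong's numbers for the Hebrew words that have an aligned word."""
    return {HEBREW_TO_GREEK[n] for n in hebrew_numbers if n in HEBREW_TO_GREEK}


def shared_numbers(ot_phrase: list, nt_phrase: list) -> set:
    """Greek Strong's numbers common to an OT phrase and an NT phrase."""
    return greek_numbers(ot_phrase) & set(nt_phrase)


ot_phrase = ["H430", "H8064", "H776", "H1254"]  # H1254 is absent from the toy table
nt_phrase = ["G2316", "G3772", "G1093", "G2962"]
print(sorted(shared_numbers(ot_phrase, nt_phrase)))  # ['G1093', 'G2316', 'G3772']
```

Hebrew words without an aligned Greek word simply contribute nothing to the comparison in this sketch.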
Search in Bible translations
• Alignment between translations and original texts
• Queries in other languages → queries in Hebrew/Greek
• Search always done in Hebrew/Greek
Further Improvements
The results will be better if:
• All the references are annotated
• The alignment between the Hebrew OT and the Septuagint is improved