Conserving Linguistic Heritage the FOSS Way
![Page 1: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/1.jpg)
Conserving Linguistic
Heritage the FOSS way...
![Page 2: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/2.jpg)
Hello! I am Omshivaprakash
I’m a Bengaluru-based Wikimedian and a FOSS contributor.
I’m here to share my experience helping reuse/conserve the linguistic heritage of Kannada the FOSS way!
![Page 3: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/3.jpg)
2013-14
Vachana Sanchaya
11th and 12th Century literature & the need of the hour...
![Page 4: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/4.jpg)
“We need to be able to do research on Vachana Sahitya. We should be able to search Vachanas on the net. We need data to understand Sahitya much better.”
- Sri OL Nagabhushana Swamy
- Sri Vasudendra
![Page 5: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/5.jpg)
Challenges
▣ ANSI data available on the GoK website
▣ GoK website not intuitive
▣ 15 large volumes of printed books + others
▣ No real tool to analyze the data at one's fingertips
▣ Hot discussions on public forums needed concordance & numerical data to debate on literature
Researchers wanted authentic data to come to a consensus via research… but how?
![Page 6: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/6.jpg)
Digitize in Unicode
The idea was to get our hands on the digitized data in a reusable format & in Unicode
![Page 7: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/7.jpg)
Scrape
We found that the data was available in digital format on the GoK website http://vachanasahitya.gov.in but in ANSI format.
We pulled the data with wget and wrote a Python script to systematically extract the data and convert the text to Unicode.
ALL IN FLAT FILES
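A minimal sketch of what the conversion step could look like, assuming the "ANSI" text is a legacy font encoding where short code sequences stand for Kannada letters. The talk does not show the actual script or the mapping table, so `TOY_MAPPING` and `convert` below are hypothetical placeholders for illustration only.

```python
# HYPOTHETICAL placeholder table. The real legacy-code-to-Unicode mapping
# used for the GoK data is not given in the talk and would be much larger.
TOY_MAPPING = {
    "ka": "\u0c95",        # KANNADA LETTER KA
    "k": "\u0c95\u0ccd",   # KA followed by the virama sign
    "a": "\u0c85",         # KANNADA LETTER A
}


def convert(text, mapping=TOY_MAPPING):
    """Replace legacy codes with Unicode text, trying the longest codes first."""
    keys = sorted(mapping, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(mapping[key])
                i += len(key)
                break
        else:
            # No code matches at this position: keep the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Trying the longest codes first matters because one code can be a prefix of another (here "k" and "ka"); a plain chain of string replacements could convert the shorter code too early.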
Getting to work on data
But... It was not really enough. How does anyone take all the text in files and do research?
We proposed to push this to a database and provide simple GUI tools to search the text and look at results.
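As a rough illustration of the "push it to a database and search it" idea, here is a small sketch using Python's built-in `sqlite3` module. The table name `vachanas`, its columns, and the sample rows are assumptions made for this example (the rows are made-up words, not real Vachanas); the project's actual schema is not described in the talk.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory database and load (author, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vachanas ("
        "id INTEGER PRIMARY KEY, author TEXT NOT NULL, body TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO vachanas (author, body) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def search(conn, term):
    """Return (author, body) rows whose body contains `term` as a substring.

    Note: '%' and '_' inside `term` act as LIKE wildcards in this sketch.
    """
    cur = conn.execute(
        "SELECT author, body FROM vachanas WHERE body LIKE ? ORDER BY id",
        ("%" + term + "%",),
    )
    return cur.fetchall()


if __name__ == "__main__":
    # Made-up sample rows for illustration only.
    sample = [("author_a", "ನೀರು ಮರ ಹೂವು"), ("author_b", "ಮರ ಹಣ್ಣು")]
    for author, body in search(build_db(sample), "ಮರ"):
        print(author, body)
```

A search box in a GUI would only need to pass the typed word to something like `search`, so end users never have to write SQL themselves.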
![Page 8: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/8.jpg)
more challenges...
Technical difficulties
Providing the end results to a large number of people.
Helping them understand how to use tools such as MySQL Workbench / SQLite Manager etc...
Awareness
Text input methods
SQL syntax
OS compatibility
Expanding scope
What about other research requirements?
How many queries can we write and keep sharing with linguists who are not computer-savvy?
![Page 9: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/9.jpg)
An opportunity to build something
For a language that is close to our heart, with a few like-minded people, over a cup of coffee, during weekends, whenever we had some time to scribble through the needs of our people…
IT WAS FUN...
![Page 10: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/10.jpg)
We built Vachana Sanchaya
http://vachana.sanchaya.net
![Page 11: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/11.jpg)
Portal for linguistic research
![Page 12: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/12.jpg)
Visualization, Discussion board, Concordance & more...
![Page 13: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/13.jpg)
Enable everyone
Students, Researchers, Common Man
![Page 14: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/14.jpg)
To unearth the wealth of literature
▣ by reading and searching through 21 thousand Vachanas
▣ written by 250 Vachanakaaras
▣ Researching at your fingertips via Concordance & quick visualizations
▣ Building a corpus of 2 lakh+ (200,000+) unique words
▣ Building biodata of all male & female Vachanakaaras
▣ enabling a crowd-sourced review solution
▣ opening up new possibilities for linguistic research across other literary works of Kannada.
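To make "concordance" and "corpus of unique words" concrete, here is a small sketch. It assumes simple whitespace tokenization, which is a simplification; the function names are hypothetical and this is not the portal's actual code.

```python
from collections import Counter


def tokenize(text):
    """Split text on whitespace (a simplification; punctuation is not handled)."""
    return text.split()


def word_frequencies(texts):
    """Count how often each word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def concordance(texts, keyword, window=2):
    """List every occurrence of `keyword` with up to `window` words of context.

    Each hit is (text index, left context, keyword, right context).
    """
    hits = []
    for doc_id, text in enumerate(texts):
        tokens = tokenize(text)
        for pos, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, pos - window):pos])
                right = " ".join(tokens[pos + 1:pos + 1 + window])
                hits.append((doc_id, left, token, right))
    return hits
```

The number of unique words in the corpus is then `len(word_frequencies(texts))`, and the same counts can feed quick visualizations such as most-frequent-word charts.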
![Page 15: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/15.jpg)
We reached masses across the world...
![Page 16: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/16.jpg)
FOSS
All because of the FOSS tools around us and the FOSS philosophy
that we believed in...
![Page 17: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/17.jpg)
Rails, Nginx, Passenger, Memcached, MySQL, Python, GitLab, WordPress & more...
Only server cost to keep it running
Localized & being adopted for other projects too...
It is being reviewed to be contributed to Wikisource & Wikipedia
![Page 18: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/18.jpg)
Moving forward
Bring more literary works online
Standardize Research platform for language
Create timeline for Centuries of Heritage
![Page 19: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/19.jpg)
How are we planning to do this?
Collaboration
Enable community collaboration to build research documents around our literary heritage
Engage
Engage students and others to work together on our code to build robust and futuristic tools for all types of literary works (text, poems, Old Kannada, etc.)
Evolve
Evolve over a period of time, adopting learnings from mistakes, reviews and feedback
Consult with communities
We would like to consult and learn from multiple language communities, because Vachana Sahitya has been translated into more than 15 languages
Keep tweaking
We keep tweaking the tool to make it robust enough to be used as a platform for our upcoming projects
Reaching goals
We are determined to reach our goal of building a unified search tool with a timeline for centuries of Kannada literature the FOSS way...
![Page 20: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/20.jpg)
We are on Social Media - FB/Twitter/Google+
Embed us on WordPress via a plugin
We will be on Mobile Soon…
We are opening up APIs to reuse data or build tools around Kannada literature
Adding English and other translated works too....
There is a lot more to share
So, Keep in touch!!!
![Page 21: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/21.jpg)
Our Team
Pavithra, Myself, OLN, Vasudendra, Devaraj
![Page 22: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/22.jpg)
Thanks!
Any questions?
You can find me at:
Kn/En Wiki: User:Omshivaprakash
Project Page: http://vachana.sanchaya.net
Main Project: http://kannada.sanchaya.net
@omshivaprakash | @vachanasanchaya
![Page 23: Conserving Linguistic Heritage the FOSS way](https://reader035.vdocument.in/reader035/viewer/2022081404/55a20ba01a28abbd4e8b4633/html5/thumbnails/23.jpg)
Credits
Special thanks to all the people who made and released these awesome resources for free:
▣ Team photo by Amit Mrugvadhe
▣ To my team for having made this possible
▣ Minicons by Webalys
▣ Presentation template by SlidesCarnival
▣ Photographs by Unsplash