
Natural and programming languages

07.01.2008 v0.2 – initial draft, Pikaro Tarmo

11.02.2011 v0.3 – updated, Pikaro Tarmo


Language categories

How easy to understand

• Natural language:
– Native; tends to be the simplest to understand.
– As the saying goes, "communication usually fails, except by accident". Native language suffers from being unclear (e.g. in politics), and communication fails as a result.
• Artificial natural language (Esperanto, Toki Pona, Lojban):
– Tries to address some of the problems present in natural language.
• Programming language:
– Tends to be more cryptic than native language.
– Better structured than natural language.

Has a clear structure (or design/architecture)

• Natural language:
– Tends to follow an "exception from exception" pattern.
– No clear structure. Computer-based text parsers are too complex (e.g. lex, yacc).
• Artificial natural language:
– A clear structure exists, but design decisions are not necessarily documented.
– Natural languages cannot be used as a programming language.
• Programming language:
– A clear structure exists; however, the language architecture/design, even if it exists, is not necessarily documented properly.


Programming languages and software development

• Definition: URL: http://en.wikipedia.org/wiki/Programming_language
• There is a huge number of artificially created programming languages: C, C++, Perl, Python, assembler, Java, C# and so on.
• Software development goes hand in hand with natural language: the use of clear, consistent, self-describing terminology improves software quality and simplifies communication between development teams.
• Reference links:
– The Importance Of Terminology, URL: http://www.computing.surrey.ac.uk/ai/pointer/report/section1.html
– Case study: Good documentation reduces costs and increase sales, URL: http://www.techscribe.co.uk/techw/cssdl.htm
• Programming does not consist only of writing code in a programming language; it involves documentation as well. For example, the GNU coding standards require documentation to be written: URL: http://www.gnu.org/prep/standards/standards.html#Documentation
• If software is not documented properly, the worst-case scenario is to end up with something analogous to:
– The International Obfuscated C Code Contest, URL: http://www.ioccc.org/ (highly complicated code, which is understood only by the compiler, not by the people who read it)
– Symbian OS Design Faults, URL: http://www.codeproject.com/KB/mobile/Symbian_OS_design_faults.aspx
• A programming language by itself is nothing without natural language.
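The point about clear, self-describing terminology can be illustrated with a small sketch. The two functions below are hypothetical (they do not come from the slides) and compute the same value; only the naming and the documentation differ.

```python
# Hypothetical example: both functions compute exactly the same value.

def f(a, b, c):
    # Cryptic: the reader has to guess what a, b and c stand for.
    return a * b * (1 + c)


def total_price(unit_price, quantity, vat_rate):
    """Return the price of `quantity` items including VAT.

    `vat_rate` is a fraction, e.g. 0.2 for 20 %.
    """
    net_price = unit_price * quantity
    return net_price * (1 + vat_rate)
```

A reader of `total_price` can check the code against its stated intent; a reader of `f` can only check it against itself.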


Natural languages and communication

• Definition: URL: http://en.wikipedia.org/wiki/Natural_language
• Lacks a clear language structure; the structure is mostly bound to history.
• Takes time to learn.
• Can easily be misused and misunderstood; for example, 'politics' is quite often associated with the kind of discussion that "does not end up anywhere".
• "Communication usually fails, except by accident."
• Synonyms (words with similar meaning) quite often make it more difficult to identify what is common and what is not, for example in software development. (They pollute the language.)


Artificial natural languages (or constructed languages)

• Definition: URL: http://en.wikipedia.org/wiki/Constructed_language
• Quite often designed with some particular idea in mind, suitable for people and communication.
• Commonly not widely used, because learning each language takes its time.
• Language: Lojban
– URLs: http://en.wikipedia.org/wiki/Lojban , http://www.lojban.org/tiki/tiki-index.php?page=Home+Page&bl
– Not used very widely. Designed to simplify computer parsing. Cannot be used as a programming language.
• Language: Toki Pona
– URL: http://www.tokipona.org/
– Simplified natural language, which demonstrates how efficiently a language can be organized: the language itself contains only 118 words.
– There already exists an attempt to use this language in psychology.
• Language: Inform 7
– URL: http://www.inform-fiction.org/I7/Inform%207.html
– Attempts to use English as a basis for writing literature (interactive fiction).


Conclusion

• Learning each language takes time, no matter whether it is a programming language or a natural language. It is time for a language to offer more than it normally offers, meaning that natural language and programming language must be merged together.
• The desired new language, let's call it 'Simple language' (for the time being), should possess features of natural languages (e.g. being native and easily understandable) and features of programming languages (being able to run on a computer).
• The starting point of the new language design is English (natural). If it does not correspond to the language design needs, it needs to be tweaked in the right direction (modularization and structurization).


Natural language: More ideas

• Every (programming and natural) language has a certain lexicon and semantics.
• The lexicon and semantics of a language allow or restrict flexibility of expression: what you can and cannot express in that language.
• Natural language typically has more flexible rules than a programming language, allowing the language to be "compressed": you can use fewer words to express more information.
• This flexibility comes at a cost: a sentence that is understandable to one person may not be understood by another.
• Language also reflects our mindset. We typically think "in natural language".
• Misuse of language allows language "shortcuts" to happen, like 'terrorism'.


Simple language requirements

• Usable as a natural language (native).
• Usable as a programming language (understood by a machine).
• As easy to use as possible.
• Clear, consistent, understandable.
• As simply structured as possible.
• It should be possible to learn it, for example in parallel with a normal natural language, e.g. through a user interface that displays corresponding English and Simple language words side by side.
• Simple language must be designed for people, from a people perspective, not for computer / compiler architectures.
• All design subjects and decisions need to be documented from the very beginning, in case the language needs to be redesigned/restructured at some point.


Simple language: "Words" of simple language

• Main data types will be similar to those of Perl (http://perldoc.perl.org/perlguts.html , http://perldoc.perl.org/perlapi.html):
– Scalar (= String or integer or
– Array
– Hash / associative array
• Generalized into 'Word' (the same as a programming language's 'variable').
• Any more complex data structure will be composed of those three types. (Complex "type trees" as well.)
• Design: Typically in C we operate on structures; they are hashes, optimized for processor access. C++ libraries provide a wide range of data types, like std::string and std::vector; however, all of them lack structure recursion. (Hash of hashes of arrays of hashes of scalars…)
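The three "Word" types and their recursion can be sketched in Python, whose `dict` and `list` play the role of Perl's hash and array. This is only an illustration under stated assumptions: the names `Scalar`, `Word`, `describe` and the sample data are hypothetical, and since the scalar definition in the slides is cut off, only strings and integers are treated as scalars here.

```python
from typing import Dict, List, Union

# Hypothetical "Word": a scalar, an array of Words, or a hash of Words.
# Assumption: a scalar is a string or an integer (the slide is truncated).
Scalar = Union[str, int]
Word = Union[Scalar, List["Word"], Dict[str, "Word"]]


def describe(word: Word) -> str:
    """Return the shape of a Word, e.g. 'hash of array of scalar'."""
    if isinstance(word, dict):
        shapes = sorted({describe(value) for value in word.values()})
        return "hash of " + ("/".join(shapes) if shapes else "nothing")
    if isinstance(word, list):
        shapes = sorted({describe(item) for item in word})
        return "array of " + ("/".join(shapes) if shapes else "nothing")
    return "scalar"


# Hash of hashes of arrays of hashes of scalars.
project: Word = {
    "modules": {
        "parser": [
            {"name": "lexer", "lines": 120},
            {"name": "grammar", "lines": 340},
        ],
    },
}

print(describe(project))  # hash of hash of array of hash of scalar
```

Because every level is again a Word, the same three cases are enough to walk a structure of any depth.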


Simple language: Compiler / generator

• It should be possible to restructure each program into a complex type tree using Simple language "words".
• Back translation into the original language (for example source code generation) should be provided.
• The Simple language compiler will consist of:
– Bootstrap (in ANSI C): modules responsible for "bootstrapping" the Simple language processor / compiler.
– Simple language (in Simple language): main modules responsible for everything else (code / source generation).
• Design: Help to improve existing software development.
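The idea of restructuring a program into a type tree and translating it back can be tried out with Python's standard `ast` module standing in for the not yet specified Simple language. This is a rough sketch, not the planned compiler: `to_tree` and `from_tree` are hypothetical helper names, and `ast.unparse` requires Python 3.9 or newer.

```python
import ast


def to_tree(node):
    """Convert an ast node into nested dicts (hashes), lists (arrays) and scalars."""
    if isinstance(node, ast.AST):
        tree = {"_type": type(node).__name__}
        for field, value in ast.iter_fields(node):
            tree[field] = to_tree(value)
        return tree
    if isinstance(node, list):
        return [to_tree(item) for item in node]
    return node  # scalar: str, int, None, ...


def from_tree(tree):
    """Rebuild an ast node from the structure produced by to_tree."""
    if isinstance(tree, dict):
        fields = {key: from_tree(value) for key, value in tree.items() if key != "_type"}
        return getattr(ast, tree["_type"])(**fields)
    if isinstance(tree, list):
        return [from_tree(item) for item in tree]
    return tree


source = "total = price * quantity"
tree = to_tree(ast.parse(source))

# Back translation: type tree -> ast -> source code.
rebuilt = ast.fix_missing_locations(from_tree(tree))
regenerated = ast.unparse(rebuilt)
print(regenerated)
```

The intermediate `tree` contains nothing but hashes, arrays and scalars, which is the property the slides ask of a program expressed in "words".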


Notes to myself…

• …


Software: Data type reflection

Some useful links on data type reflection:
• Java suffers from slowness, and reflection is one of the reasons Java became so slow in the first place. URL: http://www.awprofessional.com/articles/article.asp?p=26872&seqNum=1&rl=1
• C# type information - see previous pages (on intermediate language)
• XML is a relatively new technology, which suffers heavily from over-abstraction and unnecessary complexity. W3C XML Schema provides something similar to type information.
– XSD links: URL: http://www.w3.org/TR/xmlschema-1/ URL: http://www.w3.org/XML/Schema
– Relax NG: URL: http://www.relaxng.org/
– RelaxNG compared to W3C XSD specification: http://www.xml.com/lpt/a/2002/01/23/relaxng.html
• Nemerle, a new language born with the .NET / C# platform, provides its own type information. URL: http://nemerle.org/Code_Completion#Building_the_Type_Tree http://nemerle.org/Image:TypeTreeNew.png (quite good pictures of a type tree)
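As a small illustration of what data type reflection provides, the sketch below uses Python's standard `dataclasses` and `typing` modules to turn a type into a "type tree" of hashes, arrays and scalar names at run time. The `Author` and `Document` classes and the `type_tree` helper are hypothetical examples, not taken from any of the linked material.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Dict, List, get_args, get_origin


@dataclass
class Author:  # hypothetical example type
    name: str
    year: int


@dataclass
class Document:  # hypothetical example type
    title: str
    authors: List[Author]
    tags: Dict[str, str]


def type_tree(tp):
    """Describe a type as nested dicts and lists by inspecting it at run time."""
    if is_dataclass(tp):
        return {field.name: type_tree(field.type) for field in fields(tp)}
    origin = get_origin(tp)
    if origin is list:
        return [type_tree(get_args(tp)[0])]
    if origin is dict:
        key_type, value_type = get_args(tp)
        return {type_tree(key_type): type_tree(value_type)}
    return tp.__name__  # scalar type such as str or int


print(type_tree(Document))
```

The result is itself a plain nested structure, so it can be stored, compared or used to drive code generation without access to the original class definitions.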