OOoCon Budapest
2 September 2010

Improving Writing Aids,
the Community Way

Andrea Pescetti

Italian N-L Project Lead

Getting Writing Aids Started

Writing Aids: Overview

Spell Checker

Thesaurus

Hyphenation Patterns

Grammar Checker

Spell Checker

The spell checking engine Hunspell is integrated in all versions of OOo.

Hunspell dictionaries (suitable for OOo, Thunderbird and more) are available for about 100 languages.

http://hunspell.sf.net

Thesaurus

Engine: integrated in all recent versions of OOo.

OOo-specific tool and format: you will usually have to start from scratch.

Documentation: OOo project lingucomponent.openoffice.org

Hyphenation Patterns

Engine: Hyphen, included in the Hunspell project; integrated in all versions of OOo.

Format: tool-specific, but conversion from TeX patterns is available (with caveats), so start from TeX patterns!

TeX Archive: http://ctan.org/

Grammar Checker

Not integrated in OOo as a user-visible tool as of 3.2.1, but API available.

Several options available as extensions: LanguageTool, LightProof, CoGrOO and more.

Rules for your language: tool-dependent format.

Licensing Issues

Mere Aggregation

Wide spectrum of licenses for writing aids; most are incompatible with the OOo license, LGPLv3.

But they are pure data files.

FSF: this is mere aggregation, licenses do not need to be compatible: issue 65039.

Extensions OXT

Data for writing aids (except grammar) have been packaged as extensions since OOo 3.x.

This reinforces the mere aggregation concept.

Data files within the extension may have different licenses: still mere aggregation.

Choose your license

LGPLv3 (latest) is compatible with the OOo codebase and ensures that any distributed modified versions remain free.

GPLv3: in OOo, no significant differences (mere aggregation).

AGPLv3: usage on a network (WWW) counts as distribution.

Meet Sun/Oracle legal

Licenses aside, copyright holders must sign the OCA for their work to appear in the OOo code repository.

Usual choice: external contribution, no OCA required.

Sun legal was very slow; but Oracle legal froze the process!

Distributed Management

Use a repository

Make writing aids available to all contributors in an online repository.

Use version control.

Expose an easy, web-based change-tracking interface to show differences between revisions.

Spell Checker

One file in text format.

Human readable, except rules.

Good for collaborative editing.

Thesaurus

One file in text format and an automatically generated index.

Human readable.

Good for collaborative editing.
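The automatically generated index can be rebuilt by a short script after every edit of the text file. The sketch below is an illustration only: it assumes a MyThes-style layout (an encoding name on the first line, then `word|count` entries each followed by `count` meaning lines) and an index that lists the encoding, the number of entries, and a sorted `word|byte offset` table. The function name `build_index` is hypothetical; check the lingucomponent documentation for the exact format before relying on it.

```python
def build_index(dat_bytes: bytes) -> str:
    """Build index text for thesaurus data (assumed MyThes-style layout)."""
    lines = dat_bytes.split(b"\n")
    encoding = lines[0].decode("ascii").strip()
    offset = len(lines[0]) + 1  # byte offset of the first entry line
    entries = []
    i = 1
    while i < len(lines) and lines[i].strip():
        head = lines[i].decode(encoding)
        word, count = head.rsplit("|", 1)
        entries.append((word, offset))
        # Advance past the entry line and its `count` meaning lines.
        for line in lines[i : i + 1 + int(count)]:
            offset += len(line) + 1
        i += 1 + int(count)
    entries.sort()
    out = [encoding, str(len(entries))]
    out += [f"{word}|{pos}" for word, pos in entries]
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = b"UTF-8\nsimple|1\n(adj)|easy|plain\neasy|1\n(adj)|simple\n"
    print(build_index(sample), end="")
```

Because the index is derived entirely from the data file, only the data file needs to be edited by contributors; the index can be regenerated at packaging time.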

Hyphenation

One text file.

Format: as arcane as it can get!

Changes very rarely.

Fix bugs upstream, in TeX.
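To make the pattern format less arcane, here is a minimal sketch of how TeX-style patterns are usually applied (Liang's method): digits inside a pattern are weights for the gaps between letters, a dot marks a word boundary, the highest weight wins in each gap, and a break is allowed only where the final weight is odd. The two patterns used in the example are toy patterns invented for illustration, not taken from any real pattern file, and the function names are hypothetical.

```python
import re


def parse_pattern(pattern: str):
    """Split a pattern such as 'a1n' into its letters and gap weights."""
    letters = re.sub(r"\d", "", pattern)
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            weights[pos] = int(ch)  # weight of the gap before letters[pos]
        else:
            pos += 1
    return letters, weights


def hyphenate(word: str, patterns: list[str],
              left_min: int = 2, right_min: int = 2) -> str:
    """Return `word` with '-' inserted at the break points the patterns allow."""
    padded = "." + word.lower() + "."
    levels = [0] * (len(padded) + 1)  # levels[i] = weight of gap before padded[i]
    for pattern in patterns:
        letters, weights = parse_pattern(pattern)
        start = padded.find(letters)
        while start != -1:
            for k, weight in enumerate(weights):
                levels[start + k] = max(levels[start + k], weight)
            start = padded.find(letters, start + 1)
    pieces, last = [], 0
    for i in range(left_min, len(word) - right_min + 1):
        # The gap before word[i] is the gap before padded[i + 1].
        if levels[i + 1] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return "-".join(pieces)


if __name__ == "__main__":
    print(hyphenate("banana", ["a1n"]))          # toy pattern: break between a and n
    print(hyphenate("banana", ["a1n", "2na."]))  # even weight 2 forbids the last break
```

The second call shows why patterns are hard to edit by hand: a single higher even number silently overrides a break that another pattern allowed.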

Grammar checker

LanguageTool: rules in XML.

Basic XML knowledge needed.

Fix upstream, in LanguageTool.

Collaboration possible.

Packaging

Generation of the OXT extension can be scripted.

It is even possible to automatically generate an updated OXT for every committed change of a file.

Keep generated OXT files in the same repository.
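As an illustration of scripted packaging: an OXT file is a ZIP archive, so a commit hook or a nightly job could rebuild it from a directory in the repository. The sketch below assumes the directory already contains the extension metadata (`description.xml` and `META-INF/manifest.xml`) next to the dictionary files; the check for those two files, the function name `build_oxt` and the example paths are choices of this sketch, not requirements stated in the slides.

```python
import zipfile
from pathlib import Path

# Files this sketch expects in the extension source directory (an assumption).
REQUIRED = ("description.xml", "META-INF/manifest.xml")


def build_oxt(source_dir, target):
    """Zip every file under `source_dir` into the OXT archive `target`.

    Returns the list of archive member names that were written.
    """
    source = Path(source_dir)
    missing = [name for name in REQUIRED if not (source / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing from extension source: {missing}")
    # Collect the file list first, so the archive never includes itself.
    files = sorted(p for p in source.rglob("*") if p.is_file())
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as oxt:
        for path in files:
            oxt.write(path, path.relative_to(source).as_posix())
    return [p.relative_to(source).as_posix() for p in files]


if __name__ == "__main__":
    # Hypothetical paths: adjust to the layout of your own repository.
    print(build_oxt("dict-it", "dict-it.oxt"))
```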

Team Structure

All components are independent; collaboration is possible in every component.

A packaging manager (or a script!) to generate extensions.

A release manager to make stable versions available in OOo.

Community Involvement

Community Involvement

The Native-Lang community is the best group of people to improve writing aids.

Motivated users, who will benefit directly from their work.

Main issue: providing tools to manage contributions efficiently.

Web based interface

Allow quick and easy reporting of missing, erroneous and wrongly hyphenated words.

Easy to set up: a basic web form, or embed it in, e.g., a Drupal site.

Notifications: e-mail to maintainers group, suggestions stored in online database.

Web based interface

Expose web services

Allow direct usage of the web application, with no need to submit a form.

Parameters can be embedded in a URL, so users don't have to open the site explicitly.

Suitable for inclusion in applications or macros.
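A report embedded in a URL could look like the following sketch. The endpoint `https://example.org/suggest` and the parameter names `lang`, `action` and `word` are hypothetical; a real service would define its own. The three actions mirror the kinds of report mentioned earlier: missing words, erroneous words and wrong hyphenation.

```python
from urllib.parse import urlencode

BASE_URL = "https://example.org/suggest"  # hypothetical endpoint
ACTIONS = {"add", "remove", "hyphenation"}  # hypothetical action names


def report_url(word: str, action: str, lang: str = "it") -> str:
    """Build a URL that submits one report without filling in a form."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # urlencode percent-encodes non-ASCII characters as UTF-8.
    query = urlencode({"lang": lang, "action": action, "word": word})
    return f"{BASE_URL}?{query}"


if __name__ == "__main__":
    print(report_url("perché", "add"))
```

An application or macro only needs to open such a URL; no form, session or login page is involved on the client side.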

Web services in OXT

Ideally, embed a macro in the OXT dictionary package distributed with OOo.

Right-click on a word to show:

Nominate for inclusion in dictionary.

Nominate for removal from dictionary.

Report wrong hyphenation.

Thesaurus maintenance

Vithesaurus: Existing online tool for collaboratively creating and maintaining a thesaurus.

In use (German) at http://www.openthesaurus.de

Free software; can be installed on your own server.

http://vithesaurus.sf.net

Handling Duplicates

In a large community, the same suggestion is usually reported more than once by different users.

This is a plus: the web application detects duplicates and ranks suggestions by frequency, which makes processing them more efficient.
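Ranking by frequency needs very little code. The sketch below assumes each report is stored as an `(action, word)` pair; that representation and the function name `rank_suggestions` are assumptions made for illustration, not a description of the actual web application.

```python
from collections import Counter


def rank_suggestions(reports):
    """Merge duplicate reports and sort them, most frequent first.

    `reports` is an iterable of (action, word) pairs. Ties are broken
    alphabetically so the order is stable between runs.
    """
    counts = Counter((action, word.strip()) for action, word in reports)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    reports = [
        ("add", "ciao"),
        ("add", "grazie"),
        ("add", "ciao "),  # trailing space: still the same suggestion
        ("remove", "xyz"),
        ("add", "ciao"),
    ]
    for (action, word), count in rank_suggestions(reports):
        print(count, action, word)
```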

Handling Wrong Reports

Most annoying use case: users actually make some wrong suggestions and repeat them!

The web application helps by blacklisting with a stated reason: repeated wrong submissions are caught, and a message explaining the rejection can be shown to the user.
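One possible shape for such a blacklist is a lookup table from a rejected suggestion to the reason it was rejected. The data structure, the function name `check_submission` and the sample entry are illustrative assumptions, not the actual implementation.

```python
# Hypothetical blacklist: (action, word) -> reason shown to the user.
BLACKLIST = {
    ("add", "pò"): "The correct spelling is po' (with an apostrophe).",
}


def check_submission(action: str, word: str, blacklist=BLACKLIST):
    """Return (accepted, message) for a new submission."""
    reason = blacklist.get((action, word.strip()))
    if reason is None:
        return True, "Thank you, your suggestion has been recorded."
    # Known wrong suggestion: reject it and explain why.
    return False, f"This suggestion was already reviewed and rejected. {reason}"


if __name__ == "__main__":
    print(check_submission("add", "pò"))
    print(check_submission("add", "ciao"))
```

Storing the reason next to the blacklisted entry means maintainers answer a recurring wrong report once, and every later submitter sees the explanation immediately.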

Thanks for your attention

Andrea Pescetti

Italian N-L Project Lead

PLIO Board Member


Image credits: Flickr, PLIO Archives.
