OpenTMS Implementation

1. FOLT Entwickler Treffen (1st FOLT developer meeting), 16.12.2008, Böblingen; Dr. Klemens Waldhör

Upload: klemens-waldhoer

Post on 06-May-2015


DESCRIPTION

Slides by Klemens Waldhör from the FOLT OpenTMS meeting on 16.12.2008

TRANSCRIPT

Page 1: Folt Treffen 16122008

1. FOLT Entwickler Treffen 16.12.2008, Böblingen; Dr. Klemens Waldhör 1

OpenTMS Implementation

Dr. Klemens Waldhör

Page 2: Folt Treffen 16122008

Dr. Klemens Waldhör

• 1978 – 1986: University: Computer Science
• 1980 – 1986: Stud. Ass. and Ass. Prof., Inst. for Social Psychology, Linz (doctorate)
• 1986 – 2003: TA Triumph Adler AG
  - AI, GUI, language technology, electronic dictionaries
• 2003 – 1999: EP Electronic Publishing Partners GmbH
  - Euramis
  - Sirius – now Acolada
  - UniLex – now Acolada
  - UniTerm – now Acolada
• 1999 – 2001: Alpnet Technology GmbH
  - SunTrans (based on Euramis)
• 2004 – 2008: Krems Research
  - Operative and scientific manager

Page 3: Folt Treffen 16122008

Dr. Klemens Waldhör cont.

• Since 2002:
  - www.heartsome.de
  - Language technology consulting
  - eTourism consulting
  - Araya translation tools

[Figure labels: XLIFF Editor, TMX Editor, Term Extraction, Araya Server]

Page 4: Folt Treffen 16122008

Overview

• Recap: Architecture and Data Model
• OpenTMS XML RPC Server
• OpenTMS Package Structure
• Implementation packages

Page 5: Folt Treffen 16122008

OpenTMS Requirements

• Software
  - Web based application
  - Server / Client Architecture
  - Thin client
  - No installation
  - No proprietary run time components
  - Preferred open source software
  - Modular software approach
• Operating system independent
  - Windows, Linux, Mac …
• Standard hardware
• Interfaces
  - Integration into CMS
  - Workflow management should be supported
• Open source database
  - Basically all SQL databases should be supported
• Scalability
  - Single and multi user requirement

Page 6: Folt Treffen 16122008

Architecture based on Standards

• XLIFF
• TMX
• TBX
• SRX
• …

In general XML

Page 7: Folt Treffen 16122008

Basic Architecture

General Structure and Data Model

Page 8: Folt Treffen 16122008

Example Work Flow

[Diagram labels: Converter, Segmenter, Translation Memory, Terminology, Machine Translation, Editor, Translation, Back Converter, OpenTMS, XLIFF]

• Seamless integration of different tools in the translation / localisation workflow

Page 9: Folt Treffen 16122008

OpenTMS System Architecture

[Diagram labels: Application Model; User Model, Data Model, Document Model, Process Model; Security Model; GUI Model, Interface Model; OpenTMS Core Library]

Page 10: Folt Treffen 16122008

Database Aspects

• Databases represented as "data sources"
• Idea
  - Make the data access interface independent from the data itself
  - Not being restricted to SQL databases only
  - Also flat data or XML files
    (e.g. files or other XLIFF as a data source)
  - Spreadsheets
  - Object-oriented databases
  - DMS systems
  - "Web sites"
• Define a common interface for all access functions
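The idea of one common interface over different kinds of data sources can be illustrated with a minimal Java sketch. All names in it (`DataSource`, `InMemoryDataSource`, `add`, `find`) are hypothetical and chosen for illustration only; they are not taken from the OpenTMS code base.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical common interface: callers never see how the data is stored. */
interface DataSource {
    void add(String source, String target);

    List<String> find(String source);
}

/** One possible backend: a plain in-memory map, standing in for e.g. a flat file. */
class InMemoryDataSource implements DataSource {
    private final Map<String, List<String>> entries = new HashMap<>();

    @Override
    public void add(String source, String target) {
        entries.computeIfAbsent(source, k -> new ArrayList<>()).add(target);
    }

    @Override
    public List<String> find(String source) {
        return entries.getOrDefault(source, new ArrayList<>());
    }
}

/** Client code depends on the interface only, so backends are interchangeable. */
class DataSourceDemo {
    static List<String> lookup(DataSource ds, String source) {
        return ds.find(source);
    }

    public static void main(String[] args) {
        DataSource ds = new InMemoryDataSource();
        ds.add("Haus", "house");
        System.out.println(lookup(ds, "Haus"));
    }
}
```

An SQL-backed or XLIFF-file-backed class could implement the same interface without any change to `lookup`.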

Page 11: Folt Treffen 16122008

Data Interpretation

• Language related core
  - TMX
  - XLIFF
  - TBX
• Internally no distinction between translation memory and terminology
  - Only different interpretation when used = "usage on demand"
  - Thus "translations" can be used in different contexts and applications
• Attribute normalisation
• Advantages
  - Can be used for different application purposes
  - One core data structure
  - Maintenance tool easier to write

Page 12: Folt Treffen 16122008

Modelling Language

[Diagram labels: Monolingual Object, Multilingual Object, General Linguistic Object; relations: N:1, inherits; Database; Terminology, Translation Memory; mapping]
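One way to read this model is sketched below in Java: monolingual and multilingual objects share a common base class, and N monolingual objects belong to one multilingual object. The class names follow the diagram labels, but the fields, the `textFor` method and the `usage` attribute are assumptions made for this sketch, not the OpenTMS implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Assumed base class: attributes shared by all linguistic objects. */
abstract class LinguisticObject {
    final Map<String, String> attributes = new HashMap<>();
}

/** Text in exactly one language. */
class MonolingualObject extends LinguisticObject {
    final String language;
    final String text;

    MonolingualObject(String language, String text) {
        this.language = language;
        this.text = text;
    }
}

/** Groups N monolingual objects that are translations of each other (N:1). */
class MultilingualObject extends LinguisticObject {
    final List<MonolingualObject> members = new ArrayList<>();

    /** Returns the text stored for a language, or null if there is none. */
    String textFor(String language) {
        for (MonolingualObject m : members) {
            if (m.language.equals(language)) {
                return m.text;
            }
        }
        return null;
    }
}

class LinguisticModelDemo {
    public static void main(String[] args) {
        MultilingualObject entry = new MultilingualObject();
        entry.members.add(new MonolingualObject("de", "Haus"));
        entry.members.add(new MonolingualObject("en", "house"));
        // The same entry could be read as a TM unit or as a term entry;
        // here only an attribute (hypothetical name "usage") records the reading.
        entry.attributes.put("usage", "terminology");
        System.out.println(entry.textFor("en"));
    }
}
```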

Page 13: Folt Treffen 16122008

OpenTMS Objects

[Diagram labels: Documents, Databases, Processes; Create, Modify, Retrieve; Data Sources; Data Components; Integrating Models]

Page 14: Folt Treffen 16122008

Data Sources

[Diagram]
• OpenTMS software: access to data sources through standardised interface
• OpenTMS Data Source Layer: maps the OpenTMS access functions to the specific data component
• Data type specific access functions
• Various data components like files etc.

Page 15: Folt Treffen 16122008

Security Aspects

• Protection of parts of the document
  - Encrypt specific parts of the XML documents
• Additional security when transferring files
  - Even if a file gets into the wrong hands, the file cannot be read.
• Secure XLIFF
  - Source
  - Target
• Secure TBX
• Secure TMX
  - TU…
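Encrypting only selected parts of a document, such as the text of a source or target segment, could look roughly like the following Java sketch. The slides do not name an algorithm or an API; AES-GCM from the standard `javax.crypto` package and the class name `SegmentCrypto` are assumptions made here purely for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical helper: encrypts the text of a single segment, not the whole file. */
class SegmentCrypto {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;

    static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns Base64(IV + ciphertext), which can be stored as element text. */
    static String encrypt(String plain, SecretKey key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt(): splits off the IV, then decrypts the remainder. */
    static String decrypt(String encoded, SecretKey key) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String protectedText = encrypt("Confidential source segment", key);
        System.out.println(decrypt(protectedText, key));
    }
}
```

Because the result is plain Base64 text, the surrounding XML stays well-formed and the rest of the file remains readable by standard tools.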

Page 16: Folt Treffen 16122008

XML RPC Server

Methods and Configuration

Page 17: Folt Treffen 16122008

Current Prototype

[Diagram]
• OpenTMS Web GUI (php)
• OpenTMS XML RPC Server (java)
  - openTMS.jar
  - ConfigurationHandler
  - Messages
  - OpenTMS Properties File
• Araya Translation Tools (java)
  - arayaserver.jar, external.jar, …
  - Convert, Translate, Create DB, Delete DB, Repetition, …

Page 18: Folt Treffen 16122008

XML-RPC Server

    OpenTMS Listening on port 4050
    Execute class: de.folt.rpc.messages.TestMessage
    Test message sent: testString = "OpenTMS Test Message"
    Execute class: de.folt.rpc.messages.TestMessage
    Test message sent: testString = "My personal OpenTMS Test Message"

Page 19: Folt Treffen 16122008

OpenTMS Properties File

    #OpenTMS properties

    # OpenTMS base directory
    OpenTMS.dir=c:/Program Files/OpenTMS/

    # standard xml rpc error file
    OpenTMS.rpc.err.file=c:/Program Files/OpenTMS/log/service.err

    # standard xml rpc log file
    OpenTMS.rpc.log.file=c:/Program Files/OpenTMS/log/service.log

    # [RPC Service section]
    # default port for Open TMS XML RPC Server
    rpc.server.port=4050

    # Connection string for XML RPC Server
    rpc.server.connectstring=http://localhost:4050

    # name of service
    rpc.server.service.name=$default

    # Service names to be loaded
    rpc.translation.service.name=TranslationTools
    # [End RPC Service]

    # other properties

    # [Araya Service section]
    ArayaPropertiesFile=c:/Program Files/OpenTMS/eaglememex.Folt.properties

    # Message configuration section
    ConfigurationHandler=c:/Program Files/OpenTMS/araya.xml

Files in c:\Program Files\OpenTMS:

    eaglememex.Folt.properties
    OpenTMS.properties
    javaStartOpenTMSServer.bat

Jar files:

    arayaserver.jar
    db2jcc.jar
    db2jcc_license_c.jar
    derby.jar
    derbynet.jar
    external.jar
    h2.jar
    hsqldb.jar
    jtds-1.2.jar
    msbase.jar
    mssqlserver.jar
    msutil.jar
    mysql-connector-java-5.0.4-bin.jar
    openTMS.jar
    sqljdbc.jar
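A properties file of this shape can be read with the standard `java.util.Properties` class. The sketch below is not the `OpenTMSProperties` class used by the server; the class and method names (`OpenTmsPropertiesDemo`, `parse`, `rpcPort`) are hypothetical, and only the property keys come from the slide.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class OpenTmsPropertiesDemo {
    /** Parses properties text; a real server would read the file from disk instead. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    /** Reads the RPC port, falling back to the given default if the key is absent. */
    static int rpcPort(Properties props, int fallback) {
        String value = props.getProperty("rpc.server.port", String.valueOf(fallback));
        return Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        String text = "# default port for Open TMS XML RPC Server\n"
                + "rpc.server.port=4050\n"
                + "rpc.translation.service.name=TranslationTools\n";
        Properties props = parse(text);
        System.out.println(rpcPort(props, 8080));
        System.out.println(props.getProperty("rpc.translation.service.name"));
    }
}
```

Writing Windows paths with forward slashes, as the slide does, sidesteps the fact that `java.util.Properties` treats a backslash as an escape character.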

Page 20: Folt Treffen 16122008

XML RPC Server

    server = new WebServer(port);

    // name under which the server itself is registered, read from the properties file
    String serverService =
        OpenTMSProperties.getInstance().getOpenTMSProperty("rpc.server.service.name");
    server.addHandler(serverService, new OpenTMSServer());

    // register the translation tools service
    de.folt.rpc.services.TranslationToolsServices transService =
        new de.folt.rpc.services.TranslationToolsServices();
    transService.initTranslationToolsServices(propertiesFile, configurationHandler);
    server.addHandler(transToolsService, transService);

    server.start();

    // create an empty file named "running" in the OpenTMS directory
    String serverFile = openTMSdir + "running";
    File f = new File(serverFile);
    f.createNewFile();

Related entry in the properties file:

    # Service names to be loaded
    rpc.translation.service.name=TranslationTools

Page 21: Folt Treffen 16122008

Translet Initialisation

    transService.initTranslationToolsServices(propertiesFile, configurationHandler);
    confHandler = new ConfigurationHandler(configurationHandler);
    loadJarFiles(root);
    loadTranslets(root);
    addTranslet(translet);

    org.w3c.dom.NodeList translets = root.getElementsByTagName("translet");
    for (int i = 0; i < translets.getLength(); i++)
    {
        org.w3c.dom.Element translet = (org.w3c.dom.Element) translets.item(i);
        addTranslet(translet);
    }

    <?xml version="1.0" encoding="UTF-8"?>
    <OpenTMS-app app="Araya">
        <jar-files>
            <jar-file name="c:/Program Files/OpenTMS/lib/arayaserver.jar"/>
            <jar-file name="c:/Program Files/OpenTMS/lib/external.jar"/>
        </jar-files>

Page 22: Folt Treffen 16122008

Translet Initialisation

    String name = translet.getAttribute("name");
    org.w3c.dom.NodeList transletClassList =
        translet.getElementsByTagName("translet-class");
    org.w3c.dom.NodeList transletMethodList =
        translet.getElementsByTagName("translet-method");
    String fullTransletMethod = transletClass + "." + transletMethod;

    org.w3c.dom.NodeList transletparamsList =
        translet.getElementsByTagName("params");
    org.w3c.dom.NodeList transletparamList =
        transletparamsElement.getElementsByTagName("param");
    for (int i = 0; i < transletparamList.getLength(); i++)
    {
        // fetching the current <param> element is not shown on the slide
        org.w3c.dom.Element param = (org.w3c.dom.Element) transletparamList.item(i);
        String paraname = param.getAttribute("name");
        String mapTo = param.getAttribute("map-to");
        String type = param.getAttribute("type");
        String content = param.getTextContent();
        Param parameter = new Param(paraname, mapTo, type, content);
        hashParams.put(paraname, parameter);
    }

    Translet transletImpl =
        new Translet(transletClass, transletMethod, hashParams);
    transletTable.put(name, transletImpl);

Page 23: Folt Treffen 16122008

Translet XML File

    <translet name="CreateDatasource">
        <!-- the class to be called -->
        <translet-class>com.araya.OpenTMS.Interface</translet-class>
        <!-- requires a static method to be called -->
        <!-- e.g. com.araya.OpenTMS.Interface.runCreateDB (Hashtable message) -->
        <translet-method>runCreateDB</translet-method>
        <!-- this section describes the mapping of the OpenTMS parameters to the Araya parameters -->
        <!-- parameters are passed as hashtables to the specified translet-class -->
        <params>
            <param name="dataSourceName" map-to="name"/>
            <param name="dataSourceType" map-to="type"/>
            <param name="dataSourceServer" map-to="server"/>
            <param name="dataSourcePort" map-to="port"/>
            <param name="dataSourceUser" map-to="user"/>
            <param name="dataSourcePassword" map-to="password"/>
            <!-- A parameter which has no counterpart in OpenTMS; just copies into the message -->
            <param name="myparam">value</param>
            <!-- here a parameter is added where the value is taken from the currently loaded OpenTMSProperties file -->
            <!-- this example below adds the entry
                 "ArayaPropertiesFile=c:/Program Files/Araya/lib/eaglememex.properties" to the message -->
            <param name="ArayaPropertiesFile" map="ArayaPropertiesFile" type="OpenTMSProperties" />
        </params>
    </translet>

addTranslet(root);
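How such a parameter mapping might be read and applied can be sketched with the standard DOM parser. The class `TransletParamReader` and its methods are hypothetical, and the renaming behaviour in `applyMappings` is one reading of the comment "mapping of the OpenTMS parameters to the Araya parameters", not the actual `ConfigurationHandler` logic.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class TransletParamReader {
    /** Returns param name -> map-to value for the first translet in the XML. */
    static Map<String, String> readMappings(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element translet = (Element) doc.getElementsByTagName("translet").item(0);
            NodeList params = translet.getElementsByTagName("param");
            Map<String, String> mappings = new LinkedHashMap<>();
            for (int i = 0; i < params.getLength(); i++) {
                Element param = (Element) params.item(i);
                // getAttribute returns "" when the attribute is missing
                mappings.put(param.getAttribute("name"), param.getAttribute("map-to"));
            }
            return mappings;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse translet XML", e);
        }
    }

    /** Copies message values under their mapped names; unmapped keys are kept as-is. */
    static Map<String, String> applyMappings(Map<String, String> mappings,
                                             Map<String, String> message) {
        Map<String, String> mapped = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : message.entrySet()) {
            String target = mappings.get(e.getKey());
            boolean hasTarget = target != null && !target.isEmpty();
            mapped.put(hasTarget ? target : e.getKey(), e.getValue());
        }
        return mapped;
    }

    public static void main(String[] args) {
        String xml = "<translet name=\"CreateDatasource\"><params>"
                + "<param name=\"dataSourceName\" map-to=\"name\"/>"
                + "<param name=\"dataSourceType\" map-to=\"type\"/>"
                + "</params></translet>";
        Map<String, String> message = new LinkedHashMap<>();
        message.put("dataSourceName", "mytm");
        System.out.println(applyMappings(readMappings(xml), message));
    }
}
```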

Page 24: Folt Treffen 16122008

Translet Execution

• TranslationToolsService

    public Vector run(Hashtable hashtable)
    {
        Vector vec = null;
        String message = (String) hashtable.get("message");
        try
        {
            RPCMessage handler = null;
            // now we must search for the method in the configurations
            if (confHandler.bMethodSupported(message))
                vec = confHandler.executeTranslet(message, hashtable);
            else
            {
                // no translet configured: load a message class by name instead
                String classname = "de.folt.rpc.messages." + message;
                handler = (RPCMessage) Class.forName(classname).newInstance();
                vec = handler.execute(hashtable);
            }
        }
        catch (Exception ex)
        {
            // the exception is ignored here; vec then stays null
        }
        System.runFinalization();
        System.gc();
        return vec;
    }
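The translet XML comments state that a static method taking a Hashtable is called on the configured class. A minimal sketch of such a call via Java reflection follows. `EchoService`, `runEcho` and `ReflectiveDispatch` are hypothetical stand-ins; how `ConfigurationHandler.executeTranslet` is really implemented is not shown in the slides.

```java
import java.lang.reflect.Method;
import java.util.Hashtable;
import java.util.Vector;

/** Hypothetical target class with a static method taking the message hashtable. */
class EchoService {
    public static Vector<String> runEcho(Hashtable<String, String> message) {
        Vector<String> result = new Vector<>();
        result.add("echo:" + message.get("text"));
        return result;
    }
}

class ReflectiveDispatch {
    /** Looks up className.methodName(Hashtable) and invokes it as a static method. */
    @SuppressWarnings("unchecked")
    static Vector<String> invokeStatic(String className, String methodName,
                                       Hashtable<String, String> message) {
        try {
            Class<?> cls = Class.forName(className);
            Method method = cls.getMethod(methodName, Hashtable.class);
            // null receiver: the method is static
            return (Vector<String>) method.invoke(null, message);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "Cannot call " + className + "." + methodName, e);
        }
    }

    public static void main(String[] args) {
        Hashtable<String, String> message = new Hashtable<>();
        message.put("text", "hello");
        System.out.println(invokeStatic(EchoService.class.getName(), "runEcho", message));
    }
}
```

Unlike the empty catch block in the slide, this sketch rethrows failures so that a misconfigured class or method name becomes visible to the caller.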

Page 25: Folt Treffen 16122008

Example Method Invocation

com.araya.OpenTMS.Interface.runCreateDB(Hashtable hashParams)

    public static Vector runCreateDB(Hashtable message)
    {
        Vector vec = new Vector();
        String dataModel = (String) fillParam(message, "dataModel"); // folttm
        String propFile = (String) message.get("ArayaPropertiesFile");
        EMXProperties.getInstance(propFile);
        if (dataModel.equalsIgnoreCase("TMX"))
            vec = runCreateTMXDB(message);
        else if (dataModel.equalsIgnoreCase("TBX"))
            vec = runCreateTermDB(message);
        return vec;
    }

de.folt.rpc.services.TranslationToolsServices.run(Hashtable hashtable)

de.folt.rpc.webserver.OpenTMSServer

    de.folt.rpc.services.TranslationToolsServices transService =
        new de.folt.rpc.services.TranslationToolsServices();
    transService.initTranslationToolsServices(propertiesFile, configurationHandler);
    server.addHandler(transToolsService, transService);

de.folt.rpc.webserver.ConfigurationHandler.executeTranslet("CreateDatasource", Hashtable hashParams)

    <translet name="CreateDatasource">
        <translet-class>com.araya.OpenTMS.Interface</translet-class>
        <translet-method>runCreateDB</translet-method>
        <params>
            <param name="dataSourceName" map-to="name"/>
            <param name="dataSourceType" map-to="type"/>
            <param name="dataSourceServer" map-to="server"/>
            <param name="dataSourcePort" map-to="port"/>
            <param name="dataSourceUser" map-to="user"/>
            <param name="dataSourcePassword" map-to="password"/>
        </params>
    </translet>

Client call (batch file):

    call env.bat
    call java -Xmx1024m %OPENTMSJAVABASE% de.folt.rpc.client.OpenTMSClient "message=CreateOpenTMSDataSource" "dataSourceName=%OPENTMSTMX%" %EXAMPLEDBSERVER% dataModel=TMX

Page 26: Folt Treffen 16122008

Software

Packages and Structure

Page 27: Folt Treffen 16122008

Eclipse …

Page 28: Folt Treffen 16122008

Implementation

Work Packages

Page 29: Folt Treffen 16122008

Programming Language et al.

• Java
• Java Coding Standards
• Java Documentation Standard
• Delivered as jar files
• Eclipse

Page 30: Folt Treffen 16122008

General Work Packages

• Linguistic Model
• Core Data Source Model
  - XLIFF & TMX Handling
  - Importer / Exporter
• Security Model
  - TMX / XLIFF
• Server
• Converters
• Server
  - XML-RPC
  - SOAP
  - …
• GUI – Editor(s)

Page 31: Folt Treffen 16122008

Linguistic Model

• Monolingual / Multilingual Objects
• (Segment) / Sentence / Word Segmentation
• Replacement Classes
• Duplicate Detection

Page 32: Folt Treffen 16122008

Core Data Model

• SQL
  - Access optimisation
• Other open source databases …
  - OODBS
  - XML database systems
  - Xindice
  - Apache Lucene
• Other possible data sources
  - Plain text files
  - CSV files
  - TMX files / XLIFF files / TBX files
  - Spreadsheets
  - …

Page 33: Folt Treffen 16122008

XLIFF & TMX Handling

• Format definition
  - For supporting cross document matching
  - TMX levels
• XLIFF Library
  - Standardised read and write access functions
• TMX Library
  - Standardised read and write access functions
• Format Handling & Matching
  - Cross document format matching

Page 34: Folt Treffen 16122008

Converters

• Document Converters
  - XML
  - OpenOffice as central converter for txt, rtf, doc, xls, ppt…
  - MIF
  - …
• Data Model Converter
  - Trados
  - Star
  - Across
  - …

Page 35: Folt Treffen 16122008

GUI

• XLIFF Editor
  - Application dependent
• Database Editor
  - Directly editing database entries
• TMX Editor
  - Editing TMX files
• Web GUI
  - Maintenance
  - Editor
• Installation

Page 36: Folt Treffen 16122008

Test Environment

• Daily build
• Stable version
• Development version
• Sealed version
• Broken code
• Documentation

Page 37: Folt Treffen 16122008

Contact

Heartsome Europe GmbH

Friedrichstr. 17

D-90574 Roßtal

www.heartsome.de

Dr. Klemens Waldhör

T: +49 9127 579001

F: +49 9127 951178
