Bilingual terminology extraction
The bilingual terminology extractor is an easy-to-use tool that lets you
produce TBX files quickly and conveniently from the bilingual translation
pairs in a TMX file. The automatic terminology extraction produces term
pair suggestions, each weighted with a quality estimate. A very high
probability of 1.0 means that the pair of terms corresponds to a
translation. Both single-word and multi-word terms can be extracted.
The extraction tool is also the ideal add-on for the Araya translation
toolset. It is optimized for the TMX files produced by the Araya XLIFF or
TMX editor, and it also supports the TMX variants and formats of other
manufacturers.
ARAYA BILINGUAL TERMINOLOGY EXTRACTION
CONTACT:
Heartsome Europe GmbH
Friedrichstr. 17 D-90574 Rosstal T: +49 (0) 9127 579001 F: +49 (0) 9127/951178 [email protected] www.heartsome.de
FN 9098 Amtsgericht Fürth UID: DE225881142
Management
Dr. Klemens Waldhör, Managing Director, [email protected]
Highlights
• Simple-to-use editor interface.
• Each suggested pair of terms is weighted with a quality measure
  (quality criterion).
• Colored marking of the individual pairs of terms depending on quality.
• Export into different formats, e.g. TBX and csv.
• Pairs of terms can be marked as valid; optionally, only the validated
  entries are exported.
• Well-known terminology can be excluded from the extraction.
• TMX support.
• Different parameters to control the extraction, such as the frequency
  of the terms, the number of translations that can be extracted, the
  number of words in a term, and upper/lower case handling.
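The optional export of validated entries can be pictured with a short Python sketch. It is only an illustration: the record layout, the column names and the function `export_csv` are assumptions made for this example and do not describe the actual file layout or interface of the Araya tool.

```python
import csv
import io

# Hypothetical records: (source term, target term, quality score, validated flag).
term_pairs = [
    ("translation memory", "Übersetzungsspeicher", 1.0, True),
    ("file", "Datei", 0.8, True),
    ("open", "Fenster", 0.3, False),
]

def export_csv(pairs, only_validated=True):
    """Write term pairs as CSV text, optionally keeping validated entries only."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "target", "quality"])
    for source, target, quality, validated in pairs:
        # Keep every entry unless the caller asked for validated ones only.
        if validated or not only_validated:
            writer.writerow([source, target, quality])
    return buffer.getvalue()

csv_text = export_csv(term_pairs)
```

Calling `export_csv(term_pairs, only_validated=False)` would include the unvalidated third pair as well.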
TMX stands for Translation Memory eXchange. It is a vendor-independent,
open XML standard for storing and exchanging the translation memories
produced by CAT (computer-aided translation) tools. TMX supports the
exchange of translation memory data between programs and/or translators
without losing data in the process. TMX was developed on the initiative of
the OSCAR (Open Standards for Container/Content Allowing Re-use)
committee, a special interest group of LISA (Localization Industry
Standards Association).
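To make the format concrete, the following Python sketch reads bilingual segment pairs from a tiny hand-written TMX sample using only the standard library. The sample data and the helper name `read_segment_pairs` are assumptions for illustration. The element names (`tu`, `tuv`, `seg`) and the `xml:lang` attribute follow the TMX 1.4 structure, but real files may carry additional attributes and inline markup.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# A tiny hand-written TMX sample (illustrative data, not from a real memory).
SAMPLE_TMX = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0"
          segtype="sentence" o-tmf="none" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Open the file.</seg></tuv>
      <tuv xml:lang="de"><seg>Öffnen Sie die Datei.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="de"><seg>Speichern Sie die Datei.</seg></tuv>
    </tu>
  </body>
</tmx>
"""

def read_segment_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) segment pairs for the two given languages."""
    root = ET.fromstring(tmx_text.encode("utf-8"))
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline tags.
                segs[tuv.get(XML_LANG)] = "".join(seg.itertext())
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs

pairs = read_segment_pairs(SAMPLE_TMX, "en", "de")
```

Each translation unit (`tu`) holds one variant (`tuv`) per language, so a bilingual extraction only needs the two variants whose language codes match the requested pair.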
About Heartsome Europe
Heartsome Europe GmbH was founded in 2002. Its founder and director,
Dr. Klemens Waldhör, has long been familiar with translation technology
and CAT tools. His experience stems from his time in the research labs of
TA Triumph Adler and as founder and director of EP Electronic Publish
Partners GmbH. Under his guidance the translation support system EURAMIS
was developed for the translation service of the European Commission.
This development was later used by Sun Microsystems as SunTrans. Based on
this experience he developed the translation support system Araya.
The core competence of Heartsome is the customized adaptation of Araya to
customer needs. In an intensive consulting phase the customer
requirements are determined and optimized, and Araya is configured and
integrated into the customer's processes.
Terminology extraction as a service
Within translation projects it is very important to use consistent
terminology. This terminology must be maintained, corrected and enriched,
and in particular compared with new terminology. Our terminology
extraction service offers the automatic extraction of bilingual term
pairs from TMX files, based on statistical procedures.
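The statistical procedures used by the service are not specified here. As a rough illustration of the general idea, the Python sketch below scores word pairs with the Dice coefficient of their co-occurrence across aligned segments. The choice of Dice, the function name `dice_scores`, the `min_freq` parameter and the sample data are all assumptions for this example, and multi-word terms are not handled.

```python
from collections import Counter
from itertools import product

def dice_scores(segment_pairs, min_freq=2):
    """Score word pairs by the Dice coefficient of their segment co-occurrence."""
    src_count, tgt_count, joint_count = Counter(), Counter(), Counter()
    for src, tgt in segment_pairs:
        # Use sets so each word counts at most once per segment.
        src_words = set(src.lower().split())
        tgt_words = set(tgt.lower().split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        joint_count.update(product(src_words, tgt_words))
    scores = {}
    for (s, t), joint in joint_count.items():
        # Ignore pairs that co-occur too rarely to be meaningful.
        if joint >= min_freq:
            scores[(s, t)] = 2 * joint / (src_count[s] + tgt_count[t])
    return scores

# Illustrative aligned segments (already lower-cased and tokenized by spaces).
pairs = [
    ("open file", "datei öffnen"),
    ("save file", "datei speichern"),
    ("open window", "fenster öffnen"),
]
scores = dice_scores(pairs)
```

In this sample, "open"/"öffnen" and "file"/"datei" always appear together and therefore reach a score of 1.0, while all other word pairs co-occur only once and are dropped by the frequency threshold. A measure of this kind also becomes more reliable as the number of segment pairs grows.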
The quality of the translations found depends on the number of entries in
your TMX file: the more entries it contains, the more results are
obtained and the better they are.
You will receive the extraction result as a TBX or csv file that contains
the bilingual terms extracted from your TMX file. If required, we can
offer other formats too.
The terminology extraction works very fast, and in most cases we can
provide you with the extracted terms within a day. If necessary, we
remove from the lists any terms that are already stored in your
terminology system.
We offer you a fast and simple method to extract your terminology from
your translations. By using our service you optimize and accelerate your
terminology work and free yourself from routine tasks and time-consuming
manual scanning of your translations.
Translation memory (TM): a translation technology that reuses existing
translations of segments (sentences, paragraphs or phrases) from
previously translated documents, using fuzzy search to find matching
segments.
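The fuzzy search mentioned above can be sketched in Python as follows. This is a minimal illustration that uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. Actual CAT tools use their own matching algorithms, and the memory contents, the function name `best_match` and the threshold of 0.7 are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> stored translation.
memory = {
    "Open the file.": "Öffnen Sie die Datei.",
    "Save the file.": "Speichern Sie die Datei.",
}

def best_match(segment, memory, threshold=0.7):
    """Return (similarity, source, translation) for the closest entry, or None."""
    best = None
    for source, translation in memory.items():
        ratio = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        # Keep the entry only if it is similar enough and beats the current best.
        if ratio >= threshold and (best is None or ratio > best[0]):
            best = (ratio, source, translation)
    return best

match = best_match("Open the files.", memory)
```

Here the new segment differs from a stored one by a single character, so the stored translation is offered as a fuzzy match that the translator can adapt. A segment with no sufficiently similar entry returns `None`.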
XLIFF (XML Localization Interchange File Format) is an open, XML-based
standard that was developed to support the exchange of localization
information, in particular for document formats of different
manufacturers.
System requirements
• Java™-based application.
• Software requirements: Java >= 1.5.
• Operating systems: Windows™ | Linux | Solaris™ | Mac™.
Solaris, Java and all Java-based labels are trademarks or registered trademarks of Sun Microsystems, Inc. in the US, other countries, or both. UNIX is a registered trademark of The Open Group in the US and other countries. Windows and WinWord are registered trademarks of Microsoft. Mac is a registered trademark of Apple Computer, Inc. Oracle is a registered trademark of Oracle Corporation. MySQL is a registered trademark of MySQL AB. Other company, product or service names may be trademarks of others.
A simple table-oriented user interface with colored entries representing different extraction
qualities. The last column shows whether the entry has been validated.
Only a few mouse clicks are needed to extract terms and their translations from the TMX file.
Prices and licences
• Single user license: € 800,- + VAT.
• Multi user licenses: on request.
• Terminology extraction service: on request
Order Form
I hereby order __ Araya Bilingual Extraction Tool for the price
of € 800 + VAT per license (single user licenses).
Company:
Name:
Street:
City:
E-mail:
Signature:
Please send your order to:
• Fax: +49 9127 95 11 78
• or
Heartsome Europe GmbH
Hr. Dr. Klemens Waldhör
Friedrichstr. 17
D-90574 Roßtal