core focus


    Creating translation-oriented source documents

    Managing variable text in translation

    WRITING FOR THE WORLD

    CORE FOCUS

    October/November 2011


    www.multilingual.com October/November 2011 MultiLingual |

    Core Focus

    Creating translation-oriented source documents

    Nicole Keller

    Here are some tips for optimally preparing source documents for translation during their initial creation. A well-prepared and cleanly formatted source document can save a lot of time and money during translation with a translation memory (TM) system, since the recognition capabilities of the TM system only make sense if the segments to be translated are actually identical or similar. These rules are derived from practical experience. At first glance they may appear insignificant to the author of a text, but for the translator, some of these things present considerable problems.

    PDF files vs. original file formats

    Whenever possible, avoid using PDF files as the source document format for translation. Always try to provide the original file format that served as the basis for the creation of the PDF files, since PDF files cannot currently be edited in some programs and instead have to be transformed into another format (usually Word) before translation. The transformed documents must generally be edited again before translation, since the converted text usually contains too many formatting errors to be able to translate it sensibly with a TM system. This editing is always associated with additional time and cost and delays the start of the translation.

    Hard line breaks

    Avoid hard line breaks (paragraph marks) within sentences; otherwise, no sensible segments can be offered for translation. Line breaks should only be used if a new paragraph is actually started. TM systems use segment end delimiters to decide where a translation unit (normally a sentence) ends. These characters are generally sentence-final punctuation marks such as ., ! and ?. A line break is always detected as a segment end, so if a line break falls within a sentence, it subdivides the sentence into two segments and manual editing is required. Manual adaptation by the translator requires additional time, and the initial analysis will find fewer hits in the TM (matches), which will make the translation unnecessarily expensive. Frequently, hard line breaks in PowerPoint or in desktop publishing programs are put in the wrong place because translators do not have sufficient knowledge about how to work with these programs. Below is an example of a sentence that contains an incorrect line break (Figure 1) because the text was copied from a PDF file. The sentence is subdivided into two illogical segments as a result, and manual adaptation by the translator is required.
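    The effect of a hard line break on segmentation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the segmentation logic of Across or any other TM system: the function name `segment` and the delimiter set (., ! and ? followed by whitespace) are chosen for the example.

```python
import re

# Assumed segment end delimiters for this sketch: ., ! or ? followed by whitespace.
# Real TM systems use configurable and more elaborate rules.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def segment(text: str) -> list[str]:
    """Split text into segments.

    Every hard line break ends a segment; within a paragraph,
    sentence-final punctuation ends a segment.
    """
    segments = []
    for paragraph in text.split("\n"):  # a hard line break always ends a segment
        for part in SENTENCE_END.split(paragraph.strip()):
            if part:
                segments.append(part)
    return segments


clean = "In the middle of this first sentence there is no line break. It stays whole."
broken = "In the middle of this first sentence\nthere is an incorrect line break."

print(segment(clean))   # one segment per sentence
print(segment(broken))  # a single sentence split into two illogical segments
```

    Neither half of the broken sentence is likely to match a complete sentence stored in a TM, which is why such breaks lower the match rate in the initial analysis.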

    Soft line breaks

    Soft line breaks (Shift+Enter in Word) should also be avoided. TM systems do not interpret them as segment ends, which is why such units are not detected correctly and have to be reworked manually by the translator. Below is a text sample that contains a soft line break at the end of each bullet point (Figure 2), which causes the whole text to be offered as a single unit for translation. Soft line breaks are frequently inserted unintentionally by

    Nicole Keller, a university lecturer and freelance Across trainer, specializes in English > German translation and the training of translation memory tools as well as terminology databases.

    In the middle of this first sentence there is an incorrect line break

    Please check the following:

    - spell check

    - formatting

    - dates and numbers

    - terminology

    Figure 1 (top) and Figure 2 (bottom).


    copying texts from various applications into source documents. This happens quite frequently if the text to be translated is copied from an e-mail into a Word document, for example.
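    A simple pre-translation check for soft line breaks might look like the sketch below. It assumes that soft line breaks arrive in the extracted text as U+000B (vertical tab) or U+2028 (LINE SEPARATOR); the actual representation depends on the authoring tool and the export path, and the function names are hypothetical.

```python
# Characters assumed to represent soft line breaks in extracted text.
# This is an assumption for the sketch; check what your own tool chain produces.
SOFT_BREAKS = ("\u000b", "\u2028")


def find_soft_breaks(text: str) -> list[int]:
    """Return the character offsets of all assumed soft line breaks."""
    return [i for i, ch in enumerate(text) if ch in SOFT_BREAKS]


def soft_breaks_to_paragraphs(text: str) -> str:
    """Turn each soft line break into a paragraph break.

    Suitable for list items that should each become a segment;
    a soft break inside a sentence should be removed instead.
    """
    for ch in SOFT_BREAKS:
        text = text.replace(ch, "\n")
    return text


sample = "Please check the following:\u000b- spell check\u000b- formatting"

print(find_soft_breaks(sample))
print(soft_breaks_to_paragraphs(sample).split("\n"))
```

    Running a check like this before handoff lets the author decide, break by break, whether a real paragraph or no break at all was intended.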

    Manual page break

    Very often manual page breaks are inserted for formatting purposes, for example because a headline falls at the bottom of a page. To improve the layout and the readability of the text, the author inserts a manual page break at a specific location. However, during translation, texts usually grow or shrink in length depending on the language combination, so it is very unlikely that the manual page breaks belong in the same location in the target text as in the source text. Usually the manual page breaks are not translated, but are skipped during translation. They are inserted into the final version of the text after the translation is finished and the text is converted back into its original document format.

    Blank spaces and tabs

    Try to use tabs or indents to indent texts and do not use a series of blank spaces to do this. After reading the document into a TM system, these characters are all displayed. In 99% of cases, the translation with the same blank spaces will look different than it does in the source document. The text then has to be reworked after the translation in nearly every case. Compare a sample text as it might appear in Word (Figure 3) with similar sample text as it might appear in crossDesk (Figure 4). While working, a translator can only assess with difficulty whether the blank spaces actually have a particular function or whether they only serve formatting purposes. If we assume, for example, that the translator does not delete the blank spaces from the translation and tries to put them in the same place in the target text as in the source document, then after the export, the translation would look like the data in Figure 5.
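    Runs of blank spaces are easy to find before a document goes to translation. The following sketch flags every run of two or more consecutive blank spaces; the threshold of two and the function name `find_space_runs` are assumptions for illustration.

```python
import re

# Two or more consecutive blank spaces are treated as suspicious in this sketch.
SPACE_RUN = re.compile(r" {2,}")


def find_space_runs(line: str) -> list[tuple[int, int]]:
    """Return (offset, length) for every run of two or more blank spaces."""
    return [(m.start(), len(m.group())) for m in SPACE_RUN.finditer(line)]


lines = [
    "- This sentence is to demonstrate",
    " " * 6 + "what a text looks like if blank spaces are used to indent",
    "\twhat a text looks like if a tab is used to indent",
]

for number, line in enumerate(lines, start=1):
    for offset, length in find_space_runs(line):
        print(f"line {number}: {length} blank spaces at offset {offset}")
```

    Only the second line is reported; the tab-indented line passes, because the indentation there is a single character that layout tools can reinterpret in the target language.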

    Date, time and number formats

    For the detection of date, time and number formats, TM systems orient themselves according to specified rules. Thus, in the Across system settings, it is specified whether date formats in German have the format DD.MM.YYYY or DD.MM.YY. If there is a blank space between the numbers, the date is no longer recognized as a coherent number group, and it cannot be checked for correct usage in the translation. This happens frequently with dates and numbers in the thousands. See, as an example, 08. 10. 2011 vs. 08.10.2011 (Figure 6) and 5.000 vs. 5 000 (Figure 7). The red lines indicate to the translator that there is a number and which number areas have been detected as coherent units. If the red line is interrupted, several units were detected. In this case, no sensible checking can be done to ensure the correct transfer of the number formats. It is therefore recommended that for texts that contain a lot of date, time and number formats, you specify a uniform format and use it consistently.
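    Rule-based recognition of this kind can be approximated with regular expressions. The patterns below are assumptions made for the sketch (a German date as DD.MM.YYYY or DD.MM.YY without blank spaces, and a number group as digits joined by periods); they are not the rules Across actually applies.

```python
import re

# Assumed rule: a German date is DD.MM.YYYY or DD.MM.YY with no blank spaces.
GERMAN_DATE = re.compile(r"\d{2}\.\d{2}\.(?:\d{4}|\d{2})")

# Assumed rule: a coherent number group is digits joined by periods.
NUMBER_GROUP = re.compile(r"\d+(?:\.\d+)*")


def is_single_date(text: str) -> bool:
    """True if the whole string is one date according to the assumed rule."""
    return GERMAN_DATE.fullmatch(text.strip()) is not None


def number_groups(text: str) -> list[str]:
    """Return the coherent number groups found in the string."""
    return NUMBER_GROUP.findall(text)


for candidate in ("08.10.2011", "08. 10. 2011", "5.000", "5 000"):
    print(candidate, is_single_date(candidate), number_groups(candidate))
```

    With the blank spaces, the date breaks into three separate groups and the number into two, which mirrors the interrupted red line described above.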

    Uniform spelling of specialized terminology

    The uniform spelling of specialized terminology is essential for correct terminology detection. In German, for example, writing a term as one word

    Bulleted list with a series of blank spaces instead of tabs or indents

    - This sentence is to demonstrate what a text looks like if a series of white spaces is inserted instead of tabs or indents

    - This sentence is to demonstrate what a text looks like if a series of white spaces is inserted instead of tabs or indents

    Figure 3 (top) and Figure 4 (bottom).

    Aufzählung mit Leerzeichen statt Tabulatoren oder Einzügen

    - Dieser Satz soll zeigen, wie ein Text aussieht, wenn man mehrere Leerzeichen statt Tabulatoren oder Einzügen verwendet.

    Datumsangabe: 08. 10. 2011 vs. 08.10.2011

    Beispiele: 5.000 vs. 5 000

    Der erste Satz enthält die korrekte Schreibweise von Diabetesbehandlung.

    Im zweiten Satz werden zwei andere Schreibweisen verwendet: Diabetes Behandlung und Diabetes-Behandlung.

    From top, Figures 5, 6, 7, 8 and 9.


    (Diabetesbehandlung), as two words (Diabetes Behandlung) or connecting two words with a hyphen (Diabetes-Behandlung) is a frequent cause of inconsistent translation since the automatic terminology detection is not activated in these cases.

    Figure 8 shows an example of correct spelling. The red marking indicates that the term is present in the crossTerm terminology database. The stored translation is suggested to the translator. The translator can then take over this suggestion directly into the translation. Figure 9, however, shows two examples of alternative or incorrect spelling. The system does not recognize the specialized term, and the terminology window remains empty. Thus, the translator does not know that there is existing terminology information and may translate the term inconsistently or incorrectly.
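    Why variant spellings slip past terminology detection can be shown with a deliberately simple lookup. The term base, its English target and the function `detect_terms` are hypothetical; real terminology tools may apply stemming or fuzzy matching, so this sketch only illustrates the exact-match case.

```python
# Hypothetical term base: source term -> approved translation.
# The English target is an assumption made for this example.
TERM_BASE = {"Diabetesbehandlung": "diabetes treatment"}


def detect_terms(sentence: str) -> dict[str, str]:
    """Return term base entries whose source term occurs verbatim in the sentence."""
    return {term: target for term, target in TERM_BASE.items() if term in sentence}


print(detect_terms("Die Diabetesbehandlung beginnt sofort."))
print(detect_terms("Die Diabetes Behandlung beginnt sofort."))
print(detect_terms("Die Diabetes-Behandlung beginnt sofort."))
```

    Only the first sentence returns a suggestion; for the other two the result is empty, just as the terminology window stays empty in Figure 9.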

    Usage of correct abbreviations

    Always use the correct spelling of abbreviations. If you use different spellings for one and the same word, the segmentation of the sentences will be incorrect. For example, if you look up the word approximately, you will find app., approx. and apx. as abbreviations. In Across, app. and approx. are defined as abbreviations for approximately, but if you use apx. the sentence will be subdivided in two segments and manual adaptation by the translator is required.
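    The interaction between an abbreviation list and segmentation can be sketched as follows. The abbreviation set is taken from the example above; the splitting rule (end a segment after a word ending in ., ! or ? unless that word is a known abbreviation) is an assumption and far simpler than what a TM system does.

```python
# Abbreviations assumed to be known to the segmenter, following the example in the text.
KNOWN_ABBREVIATIONS = {"app.", "approx."}


def segment(text: str) -> list[str]:
    """Split after a word ending in ., ! or ? unless it is a known abbreviation."""
    segments, current = [], []
    for word in text.split():
        current.append(word)
        if word[-1] in ".!?" and word.lower() not in KNOWN_ABBREVIATIONS:
            segments.append(" ".join(current))
            current = []
    if current:  # keep trailing text that has no end delimiter
        segments.append(" ".join(current))
    return segments


print(segment("The cable is approx. 2 m long."))
print(segment("The cable is apx. 2 m long."))
```

    The first sentence stays in one piece, while the unlisted abbreviation in the second is treated as a sentence end and produces two segments.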

    Superfluous formatting

    If you work frequently with colored marking in the text in order to visually emphasize text passages, you should make sure that this has been removed completely before translation and that there are only line breaks and blank spaces remaining. Otherwise, this invisible formatting is offered to the translator as possible formatting and may result in the translator writing directly in the wrong font or color.

    Hyphenation

    If you want to use hyphenation in your text, make sure that you either use the automatic hyphenation function or insert an optional hyphen manually. Many people just insert a normal hyphen instead. In this case the translator and the TM system face the following problems: standard hyphens are recognized as normal characters and add an additional character to the word in which they are placed. This means that the translation unit stored in the TM will not be a 100% match even if exactly the same sentence appears again without hyphenation. Additionally, terminology recognition for this specific term will not work since the term is separated by an incorrect character. Compare the first sample sentence, with correct hyphenation in Word (Figure 10), with the second sample sentence, with incorrect hyphenation in Word (Figure 11).

    The first sentence contains an optional hyphen ation that was inserted manually.

    The second sentence just contains a hyphen in the middle of the word hyphen-ation.

    Figure 10 (top) and Figure 11 (bottom).
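    The difference between the two kinds of hyphen can be made concrete with a short comparison. The sketch assumes that the optional hyphen is represented as the Unicode SOFT HYPHEN (U+00AD), which may differ from how a given word processor stores it internally; the function `normalize` is hypothetical.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN, standing in for an optional hyphen


def normalize(text: str) -> str:
    """Drop optional hyphens, which carry no content, before comparing segments."""
    return text.replace(SOFT_HYPHEN, "")


stored = "This sentence contains the word hyphenation."
optional = "This sentence contains the word hyphen\u00adation."
hard = "This sentence contains the word hyphen-ation."

print(normalize(optional) == stored)  # the optional hyphen disappears on normalization
print(normalize(hard) == stored)      # the normal hyphen remains as an extra character
```

    A tool can safely ignore an optional hyphen because its meaning is unambiguous, whereas a normal hyphen might be part of the spelling and therefore has to be kept.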


    Core Focus Showcase: Writing for the World

    Technical Translations

    Since 1971, Teknotrans has delivered millions of words in many different technical areas such as automotive, IT, engineering and automation. Based in Sweden, we specialize in Scandinavian languages such as Swedish, Danish, Finnish and Norwegian. We are certified according to ISO 9001:2000 and 14001:2004. Please visit our website for more information.

    Teknotrans AB
    Gothenburg and Malmö, Sweden
    www.teknotrans.com

    High-quality MT for International Success

    SYSTRAN is the market-leading provider of machine translation (MT) solutions for the desktop, enterprise and internet covering 52+ language pairs and 20 domains. Powered by our new hybrid MT engine, SYSTRAN Enterprise Server 7 combines the strengths of rule-based and statistical MT. The self-learning techniques allow users to independently train the software to any domain to achieve publishable-quality translations. SYSTRAN solutions are used by Symantec, Cisco, Ford and other enterprises to support international business operations. For more information, visit www.systransoft.com.

    SYSTRAN
    San Diego, California USA; Paris, France
    www.systransoft.com

    Global Social Media Content Strategy

    The Content Wrangler exists to serve global organizations that value content as a business asset worthy of being managed efficiently and effectively. We help content-heavy organizations develop content strategies designed to optimize and streamline their content creation, management and delivery processes by adopting content manufacturing techniques, technologies, standards and best practices. Services include content analysis, strategy development, tools selection, and processing reengineering, as well as social media campaign planning, content creation and curation.

    The Content Wrangler, Inc.
    Palm Springs, California USA
    www.thecontentwrangler.com


    Managing variable text in translation

    Peter Argondizzo

    I recently took a summer vacation to Disney World in Orlando, Florida, and a specific scene in the Spaceship Earth ride in Epcot really caught my attention. The ride is centered on a voyage through time and space to explore how humans have communicated throughout the ages. It covers everything from ancient cave drawings to today's internet age. The scene that really stuck with me involved monks painstakingly copying manuscripts with quills in intricate penmanship. It took days or even weeks to create a single copy of a book. The ride quickly progresses to the modern age with the creation of the printing press and then the advent of the personal computer.

    I am sure those aforementioned monks would have loved to have had any type of shortcut to get through the creation of their books, which were undoubtedly tedious despite the original craftsmanship put into them. Fortunately, most popular authoring environments offer some shortcuts for technical writers creating user documentation, possibly the digital equivalent of copying manuscripts. This help comes in the form of conditional text, variables and cross-references.

    I believe the original intent of these tools was to promote consistency through multiple documents and to allow the author to change small bits of text once in order to update many instances of the same text. Sounds simple enough. The problem occurs when the text needs to be translated into multiple languages. Other languages have different grammatical and capitalization rules and generally give technical writers migraines.

    If technical writers and translation project managers keep a few suggestions in mind, however, I think both parties can peacefully coexist and still use all three forms of generated text successfully. My goal is to illustrate how a writer sees each of the three types of text and how a linguist sees the same text. It is important to keep in mind that most translation environments separate this type of text into a distinct file or distinct part of the file (Figure 1). Linguists typically do not get to see the completely built sentence in context from within their translation environment.

    The easiest way to plow through this discussion is to tackle each of the three text types individually.

    Variables

    Variables are typically numbers, product names, company names, dates, version numbers or anything that might change from release to release. The obvious advantage is that something as simple as a revision letter or software version can be changed once and quickly applied across an entire manual or help system. See Figure 2 for an example of a variable used for a company name. The translator sees the term XYZ Corporation as a standalone segment as in Figure 3, and the actual sentence appears as in Figure 4.

    Most customers use variables in a way that is easy enough to understand. Just as long as linguists can properly understand what text will be inserted into each sentence, they can typically make any appropriate adjustments in the copy to properly handle the inserted text. As you can see, clearly named variables are extremely helpful.
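    The way a variable separates a value from the sentences that use it can be mimicked with Python's `string.Template`. The variable names, the version numbers and the sample sentence are hypothetical; authoring tools store and resolve variables in their own formats.

```python
from string import Template

# Hypothetical variable definitions, maintained in one place by the author.
variables = {"company": "XYZ Corporation", "version": "2.1"}

# The sentence as stored: variables appear as placeholders, not as text.
source_segment = Template("Contact $company for support with version $version.")

print(source_segment.template)               # roughly what a linguist works with
print(source_segment.substitute(variables))  # what the reader of the manual sees

# Changing the value once updates every sentence that uses the variable.
variables["version"] = "2.2"
print(source_segment.substitute(variables))
```

    A placeholder named `$company` tells the linguist far more about the inserted text than an opaque tag would, which is the practical benefit of clearly named variables.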

    Peter Argondizzo has spent the last 16 years as the managing director of Argo Translation, Inc., in Chicago.

    Figure 1: Example of how variables, cross-references and numbering formats are segregated in a document in a translation environment.


    Cross-references

    Cross-references are intended to accurately cite other sections of a document or help system. These references are meant to assist users by pointing them to different sections that might be applicable to the information they need. Obviously, link-enabled PDFs and HTML-based help systems make this type of text useful, since users are just a click away from the information they need.

    This is where things start to get a little tricky for the linguist. Cross-references are typically used inside of sentences as in Figure 5. In the translation environment, the sentence appears as in Figure 6. It is important to note how the linguist has to change the order of the sentence and move the cross-references to the end of the sentence.

    The actual heading looks something like Figure 7 in the translation environment. The actual cross-reference structure is seen in Figure 8. As you can see, there is nothing to translate here other than confirming that you want the dash to appear between the page number and the chapter number. The linguist could choose to invert the codes or add something if required, but in this case we don't need to do anything.
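    A cross-reference that has to move within the sentence can be sketched with named placeholders. Everything here is an assumption for illustration: the format string, the placeholder names, and the German sentence, which is not a reviewed translation.

```python
# Hypothetical cross-reference structure: two codes joined by a dash.
XREF_FORMAT = "{chapter}-{page}"

# Source and target segments with the cross-reference as a movable placeholder.
# The German sentence is illustrative only, not a reviewed translation.
SEGMENTS = {
    "en": "See {xref} for installation instructions.",
    "de": "Installationsanweisungen finden Sie unter {xref}.",
}


def build(language: str, chapter: int, page: int) -> str:
    """Resolve the cross-reference codes and insert them into the segment."""
    xref = XREF_FORMAT.format(chapter=chapter, page=page)
    return SEGMENTS[language].format(xref=xref)


print(build("en", chapter=3, page=12))
print(build("de", chapter=3, page=12))
```

    Because the placeholder is named rather than positional, the linguist can move it to the end of the sentence without touching the codes inside the cross-reference structure.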

    In thinking of this type of text you can see a pattern emerging. We started with the smallest building block, variables, and moved to cross-references, which can be longer phrases or even sentences that are sometimes made of multiple variables. Now, we move on to what I would consider the largest of the three types of text, even though some people conditionalize tiny

    Left column, top down: Figure 2 (example of a variable used for a company name), Figure 3 (the company name as a standalone segment), Figure 4 (the sentence using the company name in the source and target languages) and Figure 5 (cross-references in a source language sentence). Right column, top down: Figure 6 (cross-references in the source and target languages), Figure 7 (translation environment heading), Figure 8 (cross-reference structure), Figure 9 and Figure 10 (incorrect handling of conditional text), and Figure 11 and Figure 12 (correct handling of conditional text).


    chunks of text, which I don't agree with.

    Conditional text

    Conditional text is typically used when an author needs to cover multiple requirements for a given block of content. An example would be a series of products that starts with a basic model with few features and moves up to a higher-end product with many features. As you can imagine, much of the documentation would be identical in all models. Typical product warnings, caution statements and some of the user instructions would be the same across all models.

    Conditional text, if handled properly, would be a handy thing for the author to use. Let's start with the incorrect way of handling conditional text by looking at an example (Figure 9).

    This seems easy enough: the Model 400 has the ability to store 50 sets of pre-configured settings, and the Model 450 can store 100 sets. However, in a translation environment that same sentence is a little murky (Figure 10). It is really difficult to tell which condition goes with which number. The risk here is an incorrect translation. Conditionalizing complete sentences or thoughts is a far better way to handle this, as shown in Figure 11. Now the linguist will have an easy understanding of the content as well. You can see in Figure 12 that the segments are now two distinct sentences.
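    The two approaches can be compared in a small model of conditional text. The condition tags, the chunking and the `translator_view` function are hypothetical; in particular, how a translation environment displays conditions varies, and the sketch assumes the worst case in which all chunks are shown without their tags.

```python
# Hypothetical condition tags: "all" applies to every model.

# Fine-grained conditions: tiny chunks inside one sentence.
inline = [
    ("all", "The"), ("Model400", "Model 400"), ("Model450", "Model 450"),
    ("all", "can store"), ("Model400", "50"), ("Model450", "100"),
    ("all", "sets of pre-configured settings."),
]

# Sentence-level conditions: one complete sentence per model.
whole = [
    ("Model400", "The Model 400 can store 50 sets of pre-configured settings."),
    ("Model450", "The Model 450 can store 100 sets of pre-configured settings."),
]


def build(chunks, model):
    """Output for one model: keep chunks tagged for all models or for this model."""
    return " ".join(text for tag, text in chunks if tag in ("all", model))


def translator_view(chunks):
    """Assumed worst case: every chunk is shown and the condition tags are not."""
    return " ".join(text for _, text in chunks)


print(build(inline, "Model400"))  # the published sentence is fine
print(translator_view(inline))    # but the segment offered for translation is murky
print(build(whole, "Model450"))   # sentence-level conditions stay readable
```

    Both approaches publish the same sentences, so the cost of the fine-grained version is carried entirely by the linguist, who has to guess which number belongs to which model.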

    Some might argue that this will inflate the word count and the cost of the final translation. I don't believe this would be a significant cost. Once the first sentence is translated and in the translation memory, subsequent versions of the sentence will come up as fuzzy matches, resulting in discounts. Also, this simple change in how you build your conditional text will prevent errors in the final translation.

    The examples I provided were somewhat simplistic in nature and didn't even come close to all of the uses of variables, cross-references and conditional text in your documentation. But this article should give you the ability to speak to your language service provider (LSP) about how to provide your content in a way that works with their translation environment and their linguists.

    I would suggest these key points to remember when working with this type of text and your translation projects:

    Keep your content as flexible as possible. Understand that your linguist may need to move the tags around in the structure of each segment to make the generated text work properly in a given language. Some languages may even need to add declensions to the end of some variables but not others. The linguist must have the ability to understand how the content will come out of your system. Post-formatting checks of the documents are essential, especially in the beginning of your relationship.

    Remember the one-to-many relationship. I can't tell you how many times I have heard "Well, there should be some sort of workaround to make the generated text work properly in translation." Yes, if a problem presents itself in step 1, the LSP can open the target files and make a manual adjustment. But let's say a specific issue occurs ten times per manual, and your project is in 28 languages. Do you really want to have your LSP make 280 manual adjustments? What does that do to your timeline and cost? It is far better to make things work properly in step 1.

    Don't include your development notebook. We have seen many clients include content in conditional text that is not meant for translation. In one instance we found conditional content that was in development for the next product release and not intended for translation at all. I would make it a practice to prepare your files for translation by excluding that type of content entirely so that it doesn't get included in the scope of your project.

    Communicate regularly about how projects are going. It is important to have an open line of communication between the LSP and the client. Rather than struggle through issues it is far better to discuss what types of changes can make the process smoother for either party.


    Core Focus: Localization

    The ever-growing, easy international access to information, services and goods underscores the importance of language and cultural awareness. What issues are involved in reaching an international audience? Are there technologies to help? Who provides services in this area? Where do I start?

    Savvy people in today's world use MultiLingual to answer these questions and to help them discover what other questions they should be asking.

    MultiLingual's eight issues a year are filled with news, technical developments and language information for people who are interested in the role of language, technology and translation in our twenty-first-century world. A ninth issue, the annual Resource Directory and Index, provides valuable resources: companies in the language industry who can help you go global. There is also a valuable index to the previous year's magazine editorial content.

    Two issues each year include a Core Focus such as this one, which are primers for moving into new territories both geographically and professionally.

    The magazine itself covers a multitude of topics including those below.

    Translation

    Translators are vital to the development of international and localized software. Those who specialize in technical documents, such as manuals for computer hardware and software, industrial equipment and medical products, use sophisticated tools along with professional expertise to translate complex text clearly and precisely. Translators and people who use translation services track new developments through articles and news items in MultiLingual.

    Localization

    How can you make your product look and feel as if it were built in another culture for local users? Will the pictures and colors you select for a user interface in France be suitable for users in Brazil? How do you choose what markets to enter? What sort of sales effort is appropriate for those markets? How do you choose a localization service vendor? How do you manage a localization project? Managers, developers and localizers offer their ideas and relate their experiences with practical advice that will save you time and money in your localization projects.

    Internationalization

    Making content ready for the international market requires more than just a good idea. How does an international developer prepare a product to be easily adaptable for multiple locales? You'll find sound ideas and practical help in every issue.

    Language technology

    From systems that recognize your handwriting or your speech in any language to automated translation on your phone, language technology is changing day by day. And this technology is also changing the way in which people communicate on a personal level, affecting the requirements for international products and changing how business is done all over the world.

    MultiLingual is your source for the best information and insight into these developments and how they affect you and your business.

    Global web

    Every website is a global website because it can be accessed from anywhere in the world. Experienced web professionals explain how to create a site that works for users everywhere, how to attract those users to your site and how to keep the site current. Whether you use the internet for purchasing services, for promoting your business or for conducting fully international e-commerce, you'll benefit from the information and ideas in each issue of MultiLingual.

    Managing content

    How do you track all the words and the changes that occur in your documents? How do you know who's modifying your online content and in what language? How do you respond to customers and vendors in a prompt manner and in their own languages? The growing and changing field of content management, customer relations management and other management disciplines is increasingly important as systems become more complex. Leaders in the development of these systems explain how they work and how they interface to control and streamline content management.

    And there's much more

    Authors with in-depth knowledge summarize changes in the language industry and explain its financial side, describe the challenges of communicating in various languages and cultures, detail case histories that are instructional and applicable to your situation, and evaluate technology products and new books. Other articles focus on particular countries or regions; specific languages; translation and localization training programs; the uses of language technology in specific industries; a wide array of current topics from the world of multilingual language, technology and business.

    If you are interested in reaching an international audience in the best way possible, you need to subscribe to MultiLingual.

    An invitation to subscribe to MultiLingual

    Subscribe to MultiLingual at www.multilingual.com/subscribe
