Transcript
Page 1: Virtual corpora as documentation resources: Translating

Virtual corpora as documentation resources: Translating travel insurance documents (English-Spanish)*

Gloria Corpas Pastor and Miriam Seghiri
Universidad de Málaga (Spain)

The inclusion of documentation as a core subject in the curriculum of Translation and Interpretation degrees clearly underlines its importance to translators. Training in this discipline is considered essential for a translator given that only sufficient and conscientious work on documentation will allow an adequate translation of a specialised text. The sources of information that may be utilised by the translator are extremely varied, ranging from an oral consultation with an expert to a search using specialised glossaries and dictionaries. However, in the field of translation perhaps the most relevant documentation activity today involves the use of the Internet and, closely related to this, the compilation and management of virtual corpora.

In this chapter, we present a systematic methodology for corpus compilation based on electronic resources available on the Internet. The methodology is illustrated through the creation of a virtual corpus of travel insurance in English and Spanish, whose representativeness is subsequently determined by using a computer programme called ReCor, specifically designed for this purpose. Finally, some specific examples of possible uses in direct and inverse translations of this type of document are given.

Keywords: Corpus compilation and representativeness, specialized corpora, legal translation.

* The research reported in this paper has been carried out in the framework of the R&D projects BFF2003-04616 (Spanish Ministry of Science and Technology/EU ERDF, 2003–2006) and HUM-892 (Andalusian Ministry of Education, Science and Technology, 2006–2009).

Page 2: Virtual corpora as documentation resources: Translating

1. Introduction

Since the tourist industry is one of the principal driving forces behind the Spanish economy,1 it is hardly surprising that there is a large demand for translations of insurance policies in the tourism sector both from Spanish into English and from English into Spanish (cf. ACT 2005). Although this economic reality could be transitory, the rights of European consumers to demand translations of this type of document under the auspices of European directives2 on insurance matters and their respective national transpositions3 should also be taken into account. These directives recognise the right of the party taking out insurance to receive a contract4 written not only in the official language of the member state where the agreement is made, but also in a language which they may specify. Subsequent directives, such as 2002/92/EC,5 have also increased demand for translations of all the formal documents that constitute the contract. In the following pages, we shall

1. Tourism is responsible for a huge volume of business in the international economy with Europe occupying a privileged position at the top of the world scale. In 2006 Europe generated $6,466.2 billion in this sector, equivalent to 10.3% of the world’s gross domestic product (GDP), forecast to rise to 11% by 2011, accounting for 8.7% of total employment (cf. WTTC 2006a). Also see studies by the WTTC concerning the United Kingdom (2006b), Ireland (2006c) and Spain (2006d) for a more detailed analysis of the figures for these countries in this sector.

2. We refer to the Third EC Directive on Non-Life Insurance (92/49/EEC) and the Third EC Directive on Life Assurance (92/96/EEC).

3. These transpositions, which are primarily aimed at consumer protection and fostering linguistic plurality in Europe, are given expression, in the case of Spain, in the Ley 18/1997, de 13 de mayo, de modificaciones del artículo 8 de la Ley de Contrato de Seguro, para garantizar la plena utilización de todas las lenguas oficiales en la redacción de los contratos (BOE, 14th May 1997); in the case of the United Kingdom, in Statutory Instrument 2004, n.º 353. Insurers (Reorganisation and Winding Up) Regulations 2004; and, finally, in the case of the Republic of Ireland, in the Insurance Act 2000.

4. The policy (póliza, in Spanish) is the document which gives physical form to the insurance contract. In addition, it is where the obligations and rights of both the insurer and the insured person are set out, where the persons or objects that are insured are defined and the guarantees and compensation in the case of damage are established. It also represents the formalisation and culmination of the whole process of contracting the insurance. As a result, in many cases the insurance policy may be referred to as the contrato (contract) (cf. Ley 50/1980; Insurance Act 2000; The Financial Services and Markets Act 2000).

5. We refer specifically to Directive 2002/92/EC of the European Parliament and of the Council of 9 December 2002 on insurance mediation. In Article 13 of this directive, under “Information conditions”, it is specified that “All information to be provided to customers in accordance with Article 12 shall be communicated: (a) on paper or on any other durable medium available and accessible to the customer; (b) in a clear and accurate manner, comprehensible to the customer;

present a systematic methodology for the creation of a virtual corpus of travel insurance in English and Spanish based on electronic resources available on the Internet. The representativeness of this corpus will subsequently be determined by using a computer programme specifically designed for this purpose.

2. Corpora in translation training

The advantages of using corpora in translation have been shown by various studies (cf. Laviosa 1998; Bowker 2002; Bowker and Pearson 2002; Zanettin et al. 2003, amongst others). Some of the principal advantages of using them are their objectivity, their reusability and the multiple uses of a single resource. In addition, they are user-friendly and allow access to and management of huge quantities of information in almost no time. Furthermore, we must consider that the development of our current information society has brought about a demand that did not exist previously for texts written in a variety of languages. Together with economic globalisation, this has resulted in a growing interest6 in the use of bilingual and multilingual corpora by researchers working in the fields of automatic and assisted translation, language teaching, terminology and specialised language, natural language processing and information retrieval as well as, more recently, in training and documentation as applied to translation.

On this last subject, despite the remit of the European project LETRAC7 (Language Engineering for Translators Curricula), the use of corpora has only really come to the attention of researchers working in the field of translation training relatively recently. Examples of studies that stand out are: Kenny (2001) on the subject of literary translation based on parallel corpora in German and English;

(c) in an official language of the Member State of the commitment or in any other language agreed by the parties.”

6. There has been such a flood of compilers in Europe that we are forced to list only some of the more important examples: ACL (Association for Computational Linguistics); ECI (European Corpus Initiative); LDC (Linguistic Data Consortium); ICAME (International Computer Archive of Modern and Medieval English); ACL/DCI (Association for Computational Linguistics Data Collection Initiative) and ELRA (European Language Resources Association).

7. See <http://www.iai.uni-sb.de/docs/D3.pdf>. In their final report, which was presented to the European Commission DG XIII, the LETRAC project stressed the importance of introducing the following elements to the curriculum of translation degrees: applied IT, terminology management programmes, CAT and AT systems, ICTs and linguistic engineering as well as leaving time for publishing programmes, the Internet, controlled languages, project management, translation memories and corpus linguistics.

Page 3: Virtual corpora as documentation resources: Translating

Corpas Pastor (2001, 2003b, 2004a, b and c) on legal and medical translations based on multilingual corpora compiled from the Internet; and Sánchez-Gijón (2003a: NP) on the subject of virtual ad hoc corpora for scientific translations in the English-Spanish language pair. Other examples of studies are: Bernardini and Zanettin (2000); Bowker and Pearson (2002); Zanettin, Bernardini and Stewart (2003) on the possibilities offered by corpora for specialised language teaching. Two studies that deal with the potential use of corpora in language teaching, natural language processing and translation are Aston (2001) and Granger and Petch-Tyson (2003). Finally, in the R&D project described in Corpas Pastor (2003a) the corpus was used as a fundamental documentation resource for the translation of legal texts; this new avenue of research was further developed some years later by Seghiri (2006).

Both researchers and teachers are in agreement over the importance of corpora in translation training and practice. Some authors have gone even further and specifically indicate virtual corpora (cf. Pearson 1998; Bernardini and Zanettin 2000; Corpas Pastor 2001 and 2004a; Zanettin 2002a and b; Sánchez-Gijón 2003a and b) as one of the translator’s most important aids when faced with a specialised text. By virtual corpus we refer to a corpus compiled from electronic sources exclusively in order to carry out a specific translation in any direction (direct, inverse or indirect8). Its principal objective is to construct a reliable resource quickly and at minimal cost, based on texts mined from the Internet, to satisfy the translator’s documentation needs.

Virtual corpora may also be referred to as ad hoc (Corpas Pastor 2001: 164; Sánchez-Gijón 2003a: 3), disposable (Zanettin 2002a), do-it-yourself/DIY (Zanettin 2002a), domain-specific (Corpas Pastor 2004a: 226), web (Fletcher 2004), electronic (Corpas Pastor 2001; Varantola 2003), ephemeral (Corpas Pastor 2004a: 226), precision (Varantola 1997); and special purpose (Pearson 1998; Sánchez-Gijón 2003a).

Translators turn to the Internet in search of solutions to information and documentation problems because they are not only translating between languages (for which a good dictionary, whether online or not, would suffice), but also between discourse communities or cultures. In this context, the compilation of corpora and the Internet appear to be two of the most important documentation resources in the practice and research of specialised translation. When facing this

8. A “direct translation” is a translation done directly from the original into the translator’s native language, without an intermediary text; an “inverse translation”, also called “other tongue translation (OTT)”, is a translation from the translator’s native language into another language; finally, an “indirect translation”, also denominated “mediated translation”, is a translation done via an intermediary translation in a third language, not directly from the original.

kind of assignment, the main problem that translators come up against is that a corpus for the particular speciality is not available for consultation on the Internet or, if one already exists, it often does not cover all the information requirements of the source text. In other words, “one problem with these typically small and domain specific corpora is the limited range of topics and text types for which they are available” (Zanettin 2002a: NP). Faced with this situation, translators have no alternative other than to compile their own virtual corpora for the specific translation that has been commissioned in each case.

It is also important to take into account that any set of texts does not, in and of itself, constitute a corpus. In order for a collection of texts to be considered a corpus in the strict sense of the term, it must meet a set of clear design criteria and abide by a specific compilation protocol so that the collection may be deemed representative of the field of specialisation or the particular type of document that is being translated.

3. Guidelines for corpus creation

In this section we will outline the design parameters that the creation of a virtual corpus demands. Following this we will propose a compilation protocol in the form of guidelines. This consists of four distinct phases: (1) locating and accessing resources, (2) downloading data, (3) text formatting and (4) data storage.
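
The four phases listed above can be read as a small pipeline. The following Python sketch is only an illustration under stated assumptions: the names `CorpusDocument`, `format_text` and `compile_corpus`, and the injected `fetch` callable, are hypothetical and do not come from the chapter. The formatting step simply strips HTML tags and collapses whitespace, which is one possible reading of "text formatting", and the storage layout (one folder per language) is likewise an assumption.

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


@dataclass
class CorpusDocument:
    url: str       # where the text was located (phase 1)
    language: str  # subcorpus label, e.g. "en" or "es"
    text: str      # formatted plain text (phase 3)


def format_text(raw_html: str) -> str:
    """Phase 3 (assumed): drop HTML tags and collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw_html)
    return re.sub(r"\s+", " ", no_tags).strip()


def compile_corpus(
    sources: Iterable[Tuple[str, str]],  # (url, language) pairs from phase 1
    fetch: Callable[[str], str],         # phase 2: returns the raw page content
    out_dir: Path,                       # phase 4: root folder for storage
) -> List[CorpusDocument]:
    docs: List[CorpusDocument] = []
    for index, (url, language) in enumerate(sources, start=1):
        raw = fetch(url)                                  # phase 2: download
        doc = CorpusDocument(url, language, format_text(raw))
        target = out_dir / language / f"{index:03d}.txt"  # phase 4: store
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(doc.text, encoding="utf-8")
        docs.append(doc)
    return docs
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular download tool; in practice it could wrap a browser export or an HTTP client.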

3.1 Design criteria

Before moving on to deal specifically with how the documentation resources necessary to create a virtual corpus are located, it is essential for the translator-compiler to first of all establish a set of clear design criteria. In this case, the objective is to create a corpus of travel insurance policies in Spanish and English compiled exclusively from tourism law resources available on the Internet. This bilingual corpus must be diatopically restricted due to the large number of countries in which English or Spanish is an official language. In order to illustrate the methodology put forward, the corpus will be restricted to legislation in force (whether it be Community, national or from autonomous authorities) and to the formal elements of the contract (principally insurance quotes, proposal forms, certificates of insurance and insurance policies9) that have been drawn

9. Another document is the duplicado de la póliza (a duplicate of the policy), which is drawn up in writing by the insurer if requested by the person who takes out the insurance, the insured

Page 4: Virtual corpora as documentation resources: Translating

up in Spain, the Republic of Ireland and the United Kingdom (Scotland, Wales, England and Northern Ireland). In addition, it will be necessary to compile a comparable corpus, made up of two subcorpora, one in Spanish and the other in English, which will include the original texts of the tourism contracts. This will be a textual corpus, i.e. a full-text corpus, since it will include complete texts, and a specialised corpus, in the sense that it includes specific text types dealing with communication between specialists and semi-specialists or laymen.

A travel insurance corpus compiled in accordance with these design criteria will be essentially unbalanced,10 since quality takes priority over quantity (Corpas Pastor 2004a: 236) in this type of virtual corpus which has been compiled ad hoc. It is, however, extremely homogeneous given that it has been created for a specific purpose.
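
The design criteria described in this section can be written down explicitly before any text is collected. The sketch below is a hypothetical illustration in Python: the class `DesignCriteria`, its field names, the country codes and the text-type labels are assumptions made for the example and are not part of the authors' method. It records the language, diatopic and text-type restrictions and checks a candidate document against them.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DesignCriteria:
    """Hypothetical record of the design criteria for a virtual corpus."""

    # Two subcorpora: Spanish and English.
    languages: FrozenSet[str] = frozenset({"es", "en"})
    # Diatopic restriction: Spain, Republic of Ireland, United Kingdom.
    countries: FrozenSet[str] = frozenset({"ES", "IE", "GB"})
    # Legislation in force plus the formal elements of the contract.
    text_types: FrozenSet[str] = frozenset({
        "legislation",
        "insurance quote",
        "proposal form",
        "certificate of insurance",
        "insurance policy",
    })
    # Textual (full-text) corpus: only complete texts are admitted.
    full_text_only: bool = True

    def accepts(self, language: str, country: str,
                text_type: str, is_complete: bool) -> bool:
        """Return True if a candidate document meets every criterion."""
        if self.full_text_only and not is_complete:
            return False
        return (language in self.languages
                and country in self.countries
                and text_type in self.text_types)
```

A candidate such as a complete Spanish policy drawn up in Spain would pass `accepts("es", "ES", "insurance policy", True)`, whereas an excerpt or a text from outside the three countries would be rejected.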

3.2 Compilation protocol

Once the preliminary design parameters have been established, the translator-compiler should follow a protocol for the creation of the corpus comprising four stages, which will now be described.

3.2.1 Locating and accessing resources

The first stage of the protocol consists of locating and accessing information available on the Internet. In order to do this the translator-compiler will have to develop and/or put his/her knowledge of electronic resources into practice. Once the type of electronic corpus has been designed the question of access to the relevant documents arises. Various possibilities exist for accessing these texts. According to Austermühl (2001: 52 et seq.), there are basically three types of searches that may be carried out on the Internet: institutional searches, carried out on the web sites of international organisations and institutions; thematic searches, normally carried out using directories and, lastly, keyword searches using a search engine.
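
Keyword searches of the kind mentioned above typically combine descriptors with Boolean operators and phrase quoting. The following Python sketch shows one way to generate such query strings from a list of descriptors. The functions `quote_phrase` and `boolean_queries` are hypothetical names introduced for this example, and the exact operator syntax accepted varies between search engines, so the output should be treated as illustrative rather than as the queries the authors ran.

```python
from typing import List


def quote_phrase(term: str) -> str:
    """Wrap a multi-word descriptor in double quotes for a phrase search."""
    return f'"{term}"' if " " in term else term


def boolean_queries(descriptors: List[str]) -> List[str]:
    """Build a narrow (AND), a broad (OR) and an exact-phrase query."""
    terms = [quote_phrase(d) for d in descriptors]
    return [
        " AND ".join(terms),               # all descriptors must appear
        " OR ".join(terms),                # any descriptor may appear
        '"' + " ".join(descriptors) + '"'  # descriptors as one exact phrase
    ]


if __name__ == "__main__":
    for query in boolean_queries(["directorio", "jurídico"]):
        print(query)
```

The AND form narrows the results, the OR form widens them, and the quoted form retrieves only pages where the descriptors occur together in that order.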

person or the beneficiary. The insurer is obliged to provide a duplicate or copy of the policy if the original is mislaid; the copy must be identical and have the same validity as the original. In addition, there is also a document known as the boletín de adhesión (a joining form), a document which gives proof of the insurance and has not been included here because it only applies to life insurance policies.

10. Unbalanced because of the distribution of languages on the Internet. According to the “Top Ten Languages Used in the Web (November 2007)” published by Internet World Stats (http://www.internetworldstats.com/stats7.htm), the Spanish language represents 9.0% of all the Internet users in the world, while English represents 30.1%.

We shall begin with an institutional search,11 one of the most productive types of search for constructing corpora. This is due not only to the great quantity of documents that these types of institutions, organisations or associations store on the Internet today, but also because they can be assumed to be of a high standard in terms of both quality and reliability because the writers are specialists in the field. This institutional search will be mainly, though not exclusively, carried out from institutional, regulatory and legislative sources. In order to locate legislation the web sites and web pages that follow may be used.

In terms of official bodies and institutions, legislative information can be taken from the headquarters of the ABI (Association of British Insurers),12 the ABTA (Association of British Travel Agents)13 or the FSA (Financial Services Authority)14 for the United Kingdom and Ireland. For Spain, information can be mined from the Mesa del Turismo,15 particularly the section called “legislación general” which includes regulatory laws and laws specifically related to the tourism sector.

Another outstanding web site is that of the WTO (World Tourism Organisation)16 which contains one of the principal documentation resources for legislative material, Lextour.17 This is the WTO’s database of tourism legislation which has links to web sites, databases, and external servers concerned with tourism legislation set up by parliaments, governmental organisations, universities and professional associations. We have also taken information from other databases to obtain Community legislation, such as the well respected Westlaw.18 However,

11. On numerous occasions, it may be necessary to perform a keyword search to find the names of more organisations to be used in the institutional search. This can usually be performed by introducing descriptors together with Boolean techniques in a search engine such as Google. For example, introducing organismo OR turismo, organismo AND turismo OR “organismo turístico” will increase the number of names of organisations connected with tourism, whose web sites can then be visited in order to extract information that may be suitable for inclusion in the travel insurance corpus.

12. Available at <http://www.abi.org.uk>.

13. Available at <http://www.abta.com>.

14. Available at <http://www.fsa.gov.uk/consumer>.

15. Available at <http://www.mesadelturismo.com>.

16. Available at <http://www.world-tourism.org>.

17. Available at <http://www.world-tourism.org/doc/S/lextour.htm>.

18. Available at <http://web2.westlaw.com/signon/default.wl?bhcp=1>.

Page 5: Virtual corpora as documentation resources: Translating

our most significant source has been EUR-Lex,19 the portal to European Union law, which is currently the best database for European Union law.

Practically all the documents involved in the process of making a contract for travel insurance may be found on the web sites of the big insurance companies. In addition, although less frequently, the web sites of numerous online travel agencies contain the texts of their policies, which they sell on from various insurance companies, for their customers’ information. Similarly rich sources of information are the web sites of international insurance companies such as Mondial Assistance20 or Europ Assistance,21 British and Irish insurance companies such as AT Bell Insurance Brokers Ltd,22 Royal and Sun Alliance23 or Lloyds of London;24 or Spanish insurance companies, such as Allianz,25 MAPFRE26 or Ocaso,27 to mention only a few of the most representative examples.

The next step is to move on to making thematic searches28 using well known directories. In this case, a problem with locating information may arise as a result of the structure of the directories themselves, which can even hinder the process of documentation extraction.

Specialist directories stand out as excellent resources for locating Community, national and autonomous legislation, especially when the resources they contain are also evaluated and commented upon. This is the case for the compilation of the Spanish subcorpus, using the section called “Dret” in the “Indices” of

19. Available at <http://eur-lex.europa.eu>.

20. Available at <http://www.mondial-assistance.com/en/aboutus/homepage.htm>.

21. Available at <http://www.europ-assistance.es/>.

22. Available at <http://www.atbell.co.uk>.

23. Available at <http://www.royalsunalliance.com/royalsun>.

24. Available at <http://www.lloyds.com>.

25. Available at <http://www.allianz.es>.

26. Available at <http://www.mapfre.com/pmapfre/es/index.html>.

27. Available at <http://www.ocaso.es>.

28. As with the institutional search, the thematic search may be complemented by a keyword search if it is necessary to augment the names of thematic directories connected to the particular specialisation that is being searched. For example, to locate legal directories we would normally go to Google and, by using descriptors combined with Boolean operators, introduce productive search equations such as “directorio jurídico” or directorio AND jurídico.

the Universitat de Barcelona29 and the Universitat Autònoma de Barcelona.30 The directories of The Argus Clearinghouse31 and Search the Law32 (particularly the section “Travel”) are similarly useful for the English subcorpus. In general, thematic searches based on indices or directories are the most productive for extracting legislation rather than insurance contracts. In order to do this it is necessary to take a further step and carry out a key word search. For this type of search a generic search engine such as Google may be used. According to a great number of analysts, Google is the best search engine in terms of the quality of search results (cf. Radev et al. 2005: 580).

Alongside visits to insurance companies’ web sites, key word searches have proved to be (cf. Seghiri 2006) the easiest and quickest way to recover the documents that make up insurance contracts. The best results will be obtained from search engines if knowledge of the facilities they offer is utilised. As well as defining the search appropriately, techniques such as using Boolean operators, truncation and phrase searches should be considered. On this point, it is clearly essential to establish descriptors. A practical example (cf. Tables 1 and 2)33 is given to illustrate how searches are made to locate the texts that will comprise the corpus. In order to do this, the text types and the field of insurance in which the desired information is to be found (travel insurance) are taken as descriptors and Boolean search techniques are applied using the user-friendly interface offered by, for instance, Google’s advanced search.34

29. Available at <http://www.bib.ub.es/bub/internet.htm>.

30. Available at <http://www.bib.uab.es/internet.htm>.

31. Available at <http://www.clearinghouse.net>.

32. Available at <http://www.search-the-law.com>.

33. In this table only the descriptors that have produced the greatest number of documents for the text type we required in the two specific languages (English and Spanish) are shown. However, it should be pointed out that in reality a vast number of search criteria were used and here we have only shown a sample by way of illustration.

34. In order to mine the Spanish contractual documents, the version of Google for Spain (<http://www.google.es>) was used. By selecting the option “páginas de España” it is possible to filter out any documents that come from other Spanish-speaking countries. The same procedure may be followed to search for information in English, i.e. the user goes to the version of Google for the United Kingdom (<http://www.google.co.uk>) and for Ireland (<http://www.google.ie>) and selects the options “pages from the UK” and “pages from Ireland” respectively in order to avoid the presence of documents that come from other countries. Occasionally, however, this filtering will not be sufficient, so that, in addition to searching by country, it may be necessary, in cases of doubt as to the origin of a document located by using Google, to refer to the domain in order to verify its source. The knowledge that the domains .es for Spain, .uk

Page 6: Virtual corpora as documentation resources: Translating


Table 1. Descriptors for the finding of the formal elements of travel insurance contracts (Spanish).

Text type | Descriptors | Search equation
Póliza | Póliza, seguro turístico, asistencia en viaje35 | póliza AND “seguro turístico”; póliza AND “asistencia en viaje”
Solicitud | Solicitud de póliza, seguro turístico, asistencia en viaje | solicitud AND póliza AND “seguro turístico”; Solicitud AND póliza AND “asistencia en viaje”
Propuesta | Propuesta, proposición, seguro turístico, asistencia en viaje | póliza AND propuesta OR proposición “seguro turístico”; póliza AND propuesta OR proposición “asistencia en viajes”
Carta de Garantía | Carta de garantía, seguro turístico, asistencia en viaje | “carta de garantía” AND “asistencia en viaje”; “carta de garantía” AND “seguro turístico”

Table 2. Descriptors for the finding of the formal elements of travel insurance contracts (English)

Text type | Descriptors | Search equation
Policy | Policy, travel insurance | policy AND “travel insurance”
Quote | Quote, travel insurance | Quote AND policy AND “travel insurance”
Proposal Form | Proposal Form, travel insurance | “proposal form” AND policy AND “travel insurance”
Certificate of Insurance | Certificate of Insurance, Insurance Certificate, travel insurance | “certificate of insurance” OR “insurance certificate” AND policy
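
The search equations in the two tables follow a regular pattern: a text-type descriptor joined by a Boolean operator to a field descriptor, with multi-word descriptors quoted as phrases. A minimal sketch of that pattern is given below; the function names are hypothetical, straight double quotes stand in for the typographic ones, and the exact operator syntax accepted by any given search engine is an assumption that should be checked against its own help pages.

```python
def quote_phrase(descriptor: str) -> str:
    """Wrap multi-word descriptors in double quotes so they are searched as phrases."""
    descriptor = descriptor.strip()
    return f'"{descriptor}"' if " " in descriptor else descriptor


def search_equation(*descriptors: str, operator: str = "AND") -> str:
    """Join descriptors with a Boolean operator, e.g. póliza AND "seguro turístico"."""
    return f" {operator} ".join(quote_phrase(d) for d in descriptors)


# One equation per field descriptor, as in the Póliza row of Table 1.
text_type = "póliza"
for field in ("seguro turístico", "asistencia en viaje"):
    print(search_equation(text_type, field))
```

Generating the equations from a list of descriptors keeps the Spanish and English searches parallel, which matters when the two subcorpora are meant to be comparable.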

for the United Kingdom and .ie for Ireland will therefore be of use. In addition, pages in Spanish with the domain .ar, indicating Argentina, or .mx, indicating Mexico, and pages in English with the domain .au, indicating Australia, or .us, indicating the United States, will be automatically ruled out because they are not appropriate for our corpus.
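
The domain check described in this footnote can be sketched as a small filter. The accepted country codes come from the text (.es for the Spanish subcorpus, .uk and .ie for the English one); the function names and the dictionary are illustrative assumptions, and pages on generic domains such as .com would still need to be checked by hand.

```python
from urllib.parse import urlparse

# Country-code domains accepted for each subcorpus (taken from the footnote above).
ACCEPTED_TLDS = {"es": {"es"}, "en": {"uk", "ie"}}


def top_level_domain(url: str) -> str:
    """Return the last label of the host name, lower-cased (e.g. 'es')."""
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower()


def belongs_to_subcorpus(url: str, language: str) -> bool:
    """True when the URL's country-code domain is accepted for that language."""
    return top_level_domain(url) in ACCEPTED_TLDS[language]
```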

35. We refer mainly to seguro turístico or travel insurance in accordance with the position taken by Aurioles (cf. Aurioles Martín (2005 [2002]) and Aurioles Martín et al. (2004)) because we believe it to be more accurate than the Spanish calque, asistencia en viaje, of the original English, since travel assistance is only one possible part of travel insurance, which may also include coverage for holiday cancellation or medical attention, to cite only some of the most common examples. For a wider perspective on this question see the trilingual (Spanish-English-Italian) classification of travel insurance policies in relation to coverage outlined by Seghiri (2006: 279–281).


The main difficulty with key word searches centres on the choice of the most precise descriptors for the intended search, given that without this a large amount of irrelevant information will be returned. It is up to the translator-compiler to filter out all this “noise” from each of the pages that will be included in the corpus.

3.2.2 Downloading data

When the documents have been located and accessed, the next stage is to download the data. Usually, this stage is performed manually, although occasionally it is possible to automate the task when dealing with a group of web pages which have been accessed using the programme GNU Wget,36 which allows downloading in batches. This downloading phase may be hampered by the inherent structure of the Internet itself. On the one hand, we are faced with a mark-up language or HTML, in other words, the information is organised in hypertext nodes which are often difficult to access. This is usually as a result of the content being inappropriately labelled or because the location of the information is difficult to see on the page. On the other hand, the wide variety of formats that the information may appear in should also now be considered.
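
As an illustration of what downloading in batches involves, the sketch below saves a list of already-located URLs into one folder using only the Python standard library. It is not how GNU Wget works internally, and the names `local_name`, `fetch` and `download_batch` are hypothetical; the `fetcher` parameter exists only so that the saving logic can be tried without network access.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def local_name(url: str) -> str:
    """Derive a flat file name from a URL (host plus path, slashes replaced)."""
    parts = urlparse(url)
    name = (parts.netloc + parts.path).strip("/").replace("/", "_")
    return name or "index"


def fetch(url: str) -> bytes:
    """Download one resource; network errors propagate to the caller."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_batch(urls, target_dir, fetcher=fetch):
    """Save every URL in `urls` under `target_dir` and return the saved paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        path = target / local_name(url)
        path.write_bytes(fetcher(url))
        saved.append(path)
    return saved
```

Keeping the original bytes untouched at this stage leaves format conversion to a separate step, so a failed conversion never requires a second download.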

3.2.3 Text formatting

In the cases of both legislation and contracts related to travel insurance a noticeable predilection for HTML (.html) and PDF (.pdf) exists. The first of these does not involve many problems in terms of conversion since the information may simply be copied and pasted into a text document. Google will also allow the majority of PDF documents to be seen in .html format, thereby permitting the same procedure to be carried out. When this is not possible, conversion programmes such as Solid Converter37 may be used. Hence, this third stage of downloading is completed by what might be called normalisation, since all the documents will be converted to an ASCII or plain text format. In other words, they are stripped

36. This free software together with its instruction manual may be downloaded from the following web site: <http://www.gnu.org/software/wget/>.

37. A trial version of Solid Converter may be downloaded free of charge from <http://www.solidpdf.com>. Given that it is a free trial version, it has a number of limitations: it only functions for a two-week period and permits conversion of a maximum of ten pages per document, although it is possible to convert a complete text over a number of operations by specifying a different set of pages each time. There are other free programs available online like Pdf to Word converter 3.0 (<http://www.geomundos.com/descargas/bajar-pdf-to-word-converter-30_233.html>), PDF Converter (<http://www.freepdfconvert.com/convert_pdf_to_source.asp>) or Easy PDF to Word Converter (<http://www.pdf-to-html-word.com/>), for instance.

Page 7: Virtual corpora as documentation resources: Translating


of the HTML or code of any other kind, in accordance with the clean-text policy described by Sinclair (1991: 21).
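
The normalisation step for HTML pages can be approximated with the standard-library HTML parser, as in the sketch below. This is one possible way of stripping mark-up, not the procedure the authors used (they describe copying and pasting); the class and function names are hypothetical, and the result is a Unicode string rather than strict ASCII so that Spanish accents survive.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html: str) -> str:
    """Return the text of an HTML page with mark-up removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())
```

Navigation menus and similar page furniture are still text to this parser, so the manual removal of “noise” described earlier remains necessary.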

3.2.4 Data storage

The last stage is to store the data. This consists of storing the documents that have been downloaded and correctly identifying and arranging them. One possible way of doing this is through the use of sub-files depending on whether the documents are in their original format or in ASCII format. These sub-files are then subdivided according to the language, text types and text formats of the corpus.

In this study, we have extracted two subcorpora from the multilingual Turicor corpus of travel and tourism law, which is described and fully documented at the website http://turicor.com. The two subcorpora form a bilingual comparable corpus which consists of a Spanish subcorpus with 259 texts38 (1,837,869 words) and an English subcorpus with 302 documents (3,202,118 words).
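
One way to realise the arrangement of sub-files described in this section is a fixed folder hierarchy: original versus plain-text version first, then language, text type and text format. The sketch below is only an assumption about how such a tree might be laid out; the folder names and both function names are hypothetical and are not taken from the Turicor project.

```python
from pathlib import Path


def storage_path(root, version, language, text_type, text_format, file_name):
    """Build root/<original|plain>/<language>/<text type>/<format>/<file name>."""
    if version not in {"original", "plain"}:
        raise ValueError("version must be 'original' or 'plain'")
    return Path(root) / version / language / text_type / text_format / file_name


def store(root, version, language, text_type, text_format, file_name, data: bytes) -> Path:
    """Write `data` to its place in the tree, creating folders as needed."""
    path = storage_path(root, version, language, text_type, text_format, file_name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path
```

For example, `storage_path("corpus", "plain", "es", "poliza", "txt", "doc001.txt")` yields `corpus/plain/es/poliza/txt/doc001.txt` on a POSIX system.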

4. Determining corpus representativeness

Despite repeated reference by the experts to the quality of being “representative”, constituting a “sample” and so forth as distinguishing features of corpora as opposed to other kinds of textual collections, there appears to be no consensus on this crucial issue. The size of the corpus is a decisive factor in determining whether the sample is representative in relation to the needs of the research project (cf. Lavid 2005).

38. On the subject of the legislative documents that form part of the corpus (17 texts in English and 2 texts in Spanish) it is important to point out that travel insurance is not regulated by substantive legislation. Instead it comes under the regulations that apply to all insurance other than life insurance through various Community directives such as 73/239/EEC, 73/240/EEC, 76/580/EEC, 78/473/EEC, 84/641/EEC, 87/343/EEC, 87/344/EEC, 88/357/EEC, 90/618/EEC, 92/49/EEC, 95/26/EEC, 2000/26/EC, 2000/64/EC and 2002/13/EC. In Spain, travel insurance contracts are also currently regulated by the Ley 50/1980, de 8 de octubre, de Contrato de Seguro [Act 50/1980, 8th October, Insurance Contracts] as well as the Ley 30/1995, de 8 de noviembre, de ordenación y supervisión de los Seguros Privados [Act 30/1995, 8th November, Planning and Supervision of Private Insurance]. In Ireland, insurance contracts are regulated by the Insurance Act, 2000, as well as the European Communities (Non-Life Insurance) Framework Regulations, 1994 (S.I. No. 359 of 1994). In the United Kingdom, they are regulated by the Financial Services and Markets Act 2000 (Statutory Instrument 2003 No. 1476), specifically Amendment, No. 2, Order 2003. In relation to policies, the central document in this type of agreement, it was possible to include 101 documents (1,000,067 words) in the Spanish policies component and 176 documents (1,903,661 words) in the policies component in English. The remainder of the formal elements of the contract are included in the rest of the corpus.


However, even today the concept of representativeness is still surprisingly imprecise considering its acceptance as a central characteristic that distinguishes a corpus from any other kind of collection.39 As Biber, who is one of the most prolific writers on the subject of corpus representativeness, emphasises, “a corpus is not simply a collection of texts. Rather, a corpus seeks to represent a language or some part of a language” (Biber et al. 1998: 246). Nevertheless, at the same time Biber remains conscious of the difficulties involved in compiling a corpus that could be defined as “representative” (cf. Biber et al. 1998: 246–247).

It is therefore commonplace to come up against questions over the minimum number of texts needed to guarantee that a sample is scientifically valid, as well as debates over how to specify a sufficient number of texts and number of words for a corpus (Sanahuja and Silva 2001).

There have been many attempts to set the size, or at least establish a minimum number of texts, from which a specialised corpus may be compiled. Some of the most important are those put forward by Heaps (1978),40 Young-Mi (1995) and Sánchez Pérez and Cantos Gómez (1997). However, subsequently, some of these authors, such as Cantos (Yang et al. 2000: 21), recognised some shortcomings in these works, suggesting that they might be attributed to the use of Zipf’s law.41 Zipf’s law42 can give us an idea of the breadth of vocabulary used, but it is not limited to a particular or approximate number because this will depend on how the constant is determined (Braun 2005 [1996] and Carrasco Jiménez 2003: 3).

39. There are a surprising number of research projects that, whilst endeavouring to compile a “representative” corpus, hardly seem to touch on this concept. Usually, it is noticeable that the availability of material in the particular field of study determines the final size of the corpus (Giouli and Piperidis 2002).

40. Indeed, out of this work came the rule known as Heaps’ law. Both Zipf’s and Heaps’ laws are used to grasp the variability of corpora: Heaps’ law is an empirical law which examines the relationship between vocabulary size, or in other words, the number of different words (types) and the total number of words in a text (tokens). In this way a sequential increase of vocabulary in relation to text type can be observed. The programme ReCor has been validated using this law (cf. Seghiri 2006: 399–403).
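
For readers who want the two laws in symbols, their usual textbook formulations are given below. The notation is an assumption added here for reference and is not taken from the works cited: V is the number of types, N the number of tokens, f(r) the frequency of the word of rank r, and K, β, C and α are constants fitted empirically to a given collection.

```latex
% Heaps' law: vocabulary size grows sublinearly with text length
V(N) = K \, N^{\beta}, \qquad 0 < \beta < 1

% Zipf's law: frequency is inversely related to frequency rank
f(r) \approx \frac{C}{r^{\alpha}}, \qquad \alpha \approx 1
```

Because β is below 1, the type/token ratio V(N)/N = K N^{β-1} falls as the corpus grows, which is the behaviour that the representativeness graphs discussed in this chapter rely on.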

41. Conscious of these deficiencies, Yang et al. (2000) attempted to overcome them by taking a new approach: a mathematical tool capable of predicting the relationship between linguistic elements in a text (types) and the size of the corpus (tokens). However, at the end of their study, the authors reflected on some of its limitations, “the critical problem is, however, how to determine the value of tolerance error for positive predictions” (Yang et al. 2000: 30).

42. For a historical perspective on how Zipf’s law was developed see Moreiro González (2002).

Page 8: Virtual corpora as documentation resources: Translating


Numerous studies have been based on the law, but the conclusions they reach do not specify, not even through the use of graphs, the number of texts that are necessary to compile a corpus for a particular specialised field (Almahano Güeto 2002: 281). A possible solution could be to analyse the lexical density of a corpus in relation to the increase in documentary material included. In other words, if the ratio between the actual number of different words in a text and the total number of words (types/tokens) is an indicator of lexical density or richness, it may be possible to create a formula that can represent lexical density as the corpus increases on a document-by-document basis: once a certain number of texts have been included, the number of types does not increase in proportion to the number of words the corpus contains.

This formula may make it possible to determine the minimum size that a corpus must reach for it to begin to be representative. With the help of graphs, it should be possible to establish whether the corpus is representative and approximately how many documents are necessary to achieve this. This theory has become a practical reality in the shape of a software application, ReCor,43 which enables accurate evaluation of corpus representativeness.
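
The idea of tracking lexical density document by document can be sketched as follows. This is an illustration of the principle only and makes no claim about how ReCor is implemented; the tokenisation rule (lower-cased runs of letters) and both function names are assumptions.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def cumulative_ratio(documents):
    """Yield (documents so far, tokens so far, types/tokens) after each document."""
    types, tokens = set(), 0
    for count, text in enumerate(documents, start=1):
        words = tokenize(text)
        types.update(words)
        tokens += len(words)
        yield count, tokens, (len(types) / tokens if tokens else 0.0)
```

Plotting the third value against the first or the second gives a curve that drops steeply while new vocabulary is still arriving and flattens once additional documents contribute few new types.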

It should be made clear that the method for evaluating the homogeneity of a very specialised corpus assumes that the target population is known and available to the researcher. This clearly involves careful design of the corpus in terms of components, text types to be included, diasystematic limits (diaphasic, diastratic, diachronic and diatopic), as well as type of corpus (comparable, parallel, etc.), number and status of languages, text documentation for DTDs and headers, inter alia. Once the question of quality is ensured in terms of corpus design and document selection, this programme can be used to determine a posteriori whether the size reached by a given corpus is sufficiently representative of this particular sector of the tourist industry. For further information, the technology and the theoretical presuppositions behind the ReCor Programme are explained in detail in Seghiri (2006), Corpas Pastor and Seghiri (2006a, 2006b, 2007a, 2007b and forthcoming).

4.1 The ReCor interface

ReCor’s interface is simple, intuitive and user-friendly (see Figure 1). Firstly, an input file may be selected; this could be anything from a particular clause in a policy

43. ReCor is an acronym derived from the function it was designed for: the representativeness of corpora.


to the entire corpus. There is also an option, “Filtro de entrada”, which filters out all those words that the user wants to exclude from the analysis, like addresses, proper names or even HTML tags, in the case that the corpus has not been “cleaned”.

Next, three output files are created. The first, “Análisis estadístico” or statistical analysis, collates the results from two distinct analyses: firstly, with the files ordered alphabetically by name and secondly with the files in random order. The document that appears is structured into five columns which show the number of types, the number of tokens, the ratio between the number of different words and the total number of words (types/tokens), the number of words that appear only once (V1) and the number of words that appear only twice (V2). The second output file, “Palabras ord. alfa.”, generates two columns; the first shows the words in alphabetical order with their corresponding number of occurrences appearing in the second column. The same information is shown in the third file, “Palabras ord. frec.”, but this time the words are ordered according to their frequency, or in other words, by their rank.

The application also allows the user to work with groups of up to ten words (n-grams)44 and phraseology, as well as allowing numbers to be filtered out.
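
The figures reported in the three output files can be reproduced for any list of tokens with a frequency counter. The sketch below mirrors the five columns and the two word lists described above; it is not ReCor’s code, the function names are hypothetical, and the alphabetical order is plain code-point order, which places accented Spanish letters after “z”.

```python
from collections import Counter


def frequency_profile(tokens):
    """Return the five figures described above for a list of word tokens."""
    counts = Counter(tokens)
    n_types, n_tokens = len(counts), len(tokens)
    return {
        "types": n_types,
        "tokens": n_tokens,
        "ratio": n_types / n_tokens if n_tokens else 0.0,
        "V1": sum(1 for c in counts.values() if c == 1),  # words occurring once
        "V2": sum(1 for c in counts.values() if c == 2),  # words occurring twice
    }


def word_lists(tokens):
    """Return (alphabetical list, frequency-ranked list) of (word, count) pairs."""
    counts = Counter(tokens)
    alphabetical = sorted(counts.items())
    by_frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return alphabetical, by_frequency
```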

44. In this study we used the 2.1 version of ReCor. We are currently working on a new version (ReCor 3.0) which has an improved capacity for working with multiple and very large files quickly and also allows phraseological units to be identified on the basis of analysis of n-grams (n ≥ 1 and n ≤ 10) of the corpus.
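
An n-gram in the sense used here is simply a run of n consecutive words. A minimal sketch of their extraction, with the bounds on n taken from the footnote and the function name chosen for illustration, might look like this:

```python
def ngrams(tokens, n):
    """Return all contiguous n-word sequences in `tokens` as tuples."""
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

Counting these tuples instead of single words turns a type/token analysis into a 2-gram (or longer) analysis without any other change.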

Figure 1. The ReCor interface

Page 9: Virtual corpora as documentation resources: Translating


4.2 Graphical representation of data

The programme illustrates the level of representativeness of a corpus in a simple graph form, which shows lines that fall exponentially at first and then stabilise as they approach zero.45

In the first presentation of the corpus generated by the programme in graph form – Estudio gráfico A – the number of files selected is shown on the horizontal axis, while the vertical axis shows the type/token ratio. The results of two different operations are shown, one with the files ordered alphabetically (the red line), and the other with the files introduced at random (the blue line). In this way the programme double-checks to verify that the order in which the texts are introduced does not have repercussions on the representativeness of the corpus. Both operations show an exponential decrease as the number of texts selected increases. However, at the point where both the red and blue lines stabilise, it is possible to state that the corpus is representative, and at precisely this point it is possible to see approximately how many texts will produce this result.

At the same time another graph is generated – Estudio gráfico B – in which the number of tokens is shown on the horizontal axis. This graph can be used to determine the total number of words that should be set for the minimum size of the collection.

Once these steps have been taken, it is possible to check whether the number of travel insurance documents that have been assembled in the two languages involved – English and Spanish – is sufficient to enable us to affirm that our corpus is representative. See Figures 2 and 3 below, which show the representativeness of the two languages involved. The results generated by ReCor allow us to conclude that the Spanish subcorpus of travel insurance (cf. Figure 2) can be considered representative from 140 documents and 1 million words onwards, whereas the English subcorpus needs almost double the number of documents (275) and words (2.5 million) in order to reach representativeness (cf. Figure 3). The results remain largely the same even when the analysis is performed on a two-word basis (2-grams). In other words, the English subcorpus of travel insurance (cf. Figure 5) must contain twice the total number of documents and tokens that are necessary for the Spanish subcorpus to be deemed representative (cf. Figure 4).
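
The double check described in this section, one pass with the files in alphabetical order and one in random order, followed by reading off the point where the curves flatten, can be sketched as below. The chapter reads the stabilisation point from the graph by eye; the numerical criterion used here (a run of `window` successive changes smaller than `tolerance`) is purely an illustrative assumption, as are all the function names.

```python
import random


def ratio_curve(token_lists):
    """Type/token ratio after each document, for documents given as token lists."""
    types, tokens, curve = set(), 0, []
    for words in token_lists:
        types.update(words)
        tokens += len(words)
        curve.append(len(types) / tokens if tokens else 0.0)
    return curve


def stabilisation_point(curve, tolerance=0.001, window=5):
    """First document count after which `window` successive changes stay below `tolerance`.

    Returns None when the curve never settles. Both thresholds are illustrative.
    """
    for start in range(len(curve) - window):
        steps = [abs(curve[i + 1] - curve[i]) for i in range(start, start + window)]
        if max(steps) < tolerance:
            return start + 1
    return None


def compare_orders(named_docs, seed=0):
    """Curves for files sorted by name and for the same files in random order."""
    by_name = [tokens for _, tokens in sorted(named_docs.items())]
    shuffled = by_name[:]
    random.Random(seed).shuffle(shuffled)
    return ratio_curve(by_name), ratio_curve(shuffled)
```

Both curves necessarily end at the same value, since the whole corpus has the same types and tokens whatever the order; what the comparison shows is whether they also flatten at roughly the same number of documents.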

45. It should be noted here that 0 (= zero) is unachievable because of the existence in the text of variables that are impossible to control such as addresses, proper names or numbers, to name only some of the more frequently encountered.


Furthermore, the quantitative data produced by ReCor permit us to conclude that, despite the absence of substantive legislation on insurance in the tourism industry in either of the legal systems involved, Spanish travel insurance documents tend to be more homogeneous than the English text forms. In other words, it is possible to infer that the Spanish documents present super-, macro- and microstructures that are very similar to each other, in addition to using a narrower terminological range.

Figure 2. Representativeness of the Spanish travel insurance subcorpus (1-gram)

Figure 3. Representativeness of the English travel insurance subcorpus (1-gram)

Page 10: Virtual corpora as documentation resources: Translating


5. Using the corpus to translate

A well-constructed virtual corpus facilitates diverse studies on translation as both product and process. Furthermore, one of the most promising uses of corpora is in translation teaching and learning to translate. Representative virtual corpora

Fig

ure

5.

Rep

rese

ntat

iven

ess

of t

he E

ngl

ish

trav

el in

sura

nce

sub

corp

us (

2-gr

ams)

Fig

ure

4.

Rep

rese

ntat

iven

ess

of t

he S

pan

ish

trav

el in

sura

nce

sub

corp

us (

2-gr

ams)

V

irtu

al c

orpo

ra a

s do

cum

enta

tion

res

ourc

es

93

prov

ide

tran

slat

ors

(tra

iner

s, t

rain

ees

and

prof

essi

onal

s) w

ith

a �

rst-

rate

doc

u-m

enta

tion

res

ourc

e fo

r re

nde

rin

g so

urce

text

s (S

Ts)

into

the

tar

get l

angu

age.

In a

ddit

ion

, the

com

pila

tion

of

a vi

rtua

l co

rpus

cal

ls f

or a

tho

roug

h u

nde

r-st

andi

ng

of e

lect

ron

ic r

esou

rces

, sea

rch

ski

lls a

nd

data

min

ing

tech

niq

ues

from

th

e In

tern

et,

ther

eby

prom

otin

g th

e de

velo

pmen

t of

the

tra

nsl

ator

-com

pile

r’s

heur

isti

c su

b-co

mpe

ten

ce.

Mor

eove

r, w

hen

a c

orpu

s ha

s be

en a

ppro

pria

tely

de

sign

ed a

nd

impl

emen

ted,

we

can

ass

ume

that

the

com

pile

r ha

s ca

rrie

d ou

t a

prel

imin

ary

eval

uati

on o

f in

form

atio

n r

esou

rces

, in

ord

er t

o en

sure

the

ove

rall

qual

ity

of t

he t

extu

al c

olle

ctio

n. E

valu

atio

n a

nd

sele

ctio

n o

f th

e do

cum

ents

to

be

incl

uded

in

a g

iven

cor

pus

will

usu

ally

spe

ed u

p th

e tr

ansl

atio

n a

nd/

or r

evis

ion

pr

oces

s. A

s a

resu

lt,

tran

slat

ors

can

dev

ote

extr

a ti

me

to d

ecis

ion

-mak

ing

and

prob

lem

-sol

vin

g an

d fo

cus

on t

hese

mor

e de

man

din

g ta

sks,

in

stea

d of

rep

eat-

edly

rev

iew

ing

the

refe

ren

ce m

ater

ial.

Hen

ce,

usin

g co

rpor

a as

an

aid

may

als

o en

han

ce p

oten

tial

use

rs’ o

vera

ll co

mpe

ten

ce a

s tr

ansl

ator

s.

5.1 Source text samples

Comparable corpora are particularly useful for meeting translators’ information needs. In the following subsections we will illustrate the value of corpora for finding information on terminology, phraseology, concepts and discourse for direct and inverse translation of an extract from a travel insurance policy. In order to do this, we have selected two extracts from travel insurance policies, one in English and the other in Spanish, as source text (ST) samples.

Extract 1 (ST):[46]

Important

This is your travel insurance policy. It contains details of cover, conditions and exclusions relating to each insured person and is the basis on which all claims will be settled.

46. The extract comes from a travel insurance policy from the British insurance company Direct Travel Insurance: <http://www.direct-travel.co.uk/FAQ/Wordings/policywording010506.pdf>.

Page 11: Virtual corpora as documentation resources: Translating


Extract 2 (ST):[47]

CONDICIONES GENERALES

Artículo Preliminar.- El Contrato de Seguro.- El presente Contrato de Seguro se rige por lo dispuesto en la Ley 50/1980, de 8 de octubre, de Contrato de seguro, en la Ley 30/1995, de 8 de Noviembre, de Ordenación y Supervisión de los Seguros Privados.

5.2 Documentation needs

Even two short ST fragments like those chosen in 5.1 offer abundant evidence to argue in favour of the use of comparable corpora in the actual translation process. We are mainly concerned with the terminological and phraseological needs of translators, the extraction of conceptual or domain information, and the comparison of textual and discourse features in the source and target languages.

5.2.1 Terminology and Phraseology

The first problem that a translator may come up against is how to translate the term travel insurance policy (cf. Extract 1). On this point it should be noted that the term seguro turístico has a long tradition in our legal system since the publication in 1964 of the Spanish Presidential Decree 3304/64 on insurance contracts for foreign tourists. However, this all changed when the text of the Council Directive 84/641/EEC of 10 December 1984 amending, particularly as regards tourist assistance, the First Directive (73/239/EEC) on the co-ordination of laws, regulations and administrative provisions relating to the taking-up and pursuit of the business of direct insurance other than life assurance was transposed to the Spanish legal system through the Ministerial Order of 27 January 1988, which describes coverage of assistance while travelling as part of private insurance. This ministerial order employed the term travel assistance, which was translated into Spanish with the officially accepted neological calque asistencia en viaje. Since then, this neological calque from international/Euro English has been incorporated into the Spanish legal system and has supplanted the original seguro turístico, although the latter is much more correct, given that travel assistance is only one possible part of travel insurance coverage. Other aspects which may be covered include cancellation of the holiday or medical assistance, to mention only some of the most frequent.

47. The extract comes from a travel insurance policy from Agrupación Astes, Seguro Turístico, published on the web site of the travel agents Condor Vacaciones S.A: <http://www.special-tours.com/ficheros/Seguro_Europa_ES.pdf>.

The Spanish corpus also contains two synonyms for the term travel insurance: seguro turístico and seguro de asistencia en viaje, although the frequency with which they appear varies.

As may be seen, seguro turístico (cf. Figure 6) produces only 15 concordances,[48] as compared with 26 for seguro de asistencia en viaje (cf. Figure 7). It should be pointed out that asistencia en viaje appears 107 times. This clearly demonstrates the preference in Spanish for the English calque when drawing up this type of document, as well as the influence of English as the lingua franca par excellence (often referred to as “international legal English”) and its impact on legislation in the field of travel insurance in peninsular Spanish.

Similar problems arise for translators when faced with translating El Contrato de Seguro (cf. Extract 2) into English, as there appear to be two possibilities: assurance contract or insurance contract. A search for contract in the corpus reveals a preference in English for contract of insurance (cf. Figure 8). In addition, when it appears in this particular position in the text, a fixed expression (This is your contract of insurance) can be identified which should be reproduced in translation.

Figure 6. Concordances for ‘seguro turístico’

48. The analysis of concordances was carried out using WordSmith Tools 4.0.
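The frequency comparison described above (counting the concordances of competing terms and inspecting their contexts) can be approximated outside a dedicated concordancer. The following is only a minimal sketch in Python, not a description of how WordSmith Tools works; the function name `kwic`, the `width` parameter and the sample sentences are invented for illustration and are not taken from the corpus.

```python
import re

def kwic(text, phrase, width=30, ignore_case=True):
    """Return keyword-in-context lines for every occurrence of `phrase`."""
    flags = re.IGNORECASE if ignore_case else 0
    # Word boundaries keep 'contract' from matching inside 'contractor'.
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", flags)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

# Toy stand-in for a Spanish subcorpus (invented sentences, not corpus data).
sample = (
    "Este seguro de asistencia en viaje cubre al Asegurado. "
    "El seguro turístico no incluye la cancelación. "
    "La póliza del seguro de asistencia en viaje detalla las garantías."
)

for term in ("seguro turístico", "seguro de asistencia en viaje"):
    hits = kwic(sample, term)
    print(term, len(hits))
    for line in hits:
        print("  " + line)
```

Comparing the number of lines returned for each candidate term gives the same kind of evidence as the concordance counts reported above, while the aligned contexts help to spot fixed expressions around the search word.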

Page 12: Virtual corpora as documentation resources: Translating


The next problem that could arise for the translator is how to translate the English cover, conditions and exclusions (cf. Extract 1) into Spanish. A search in the Spanish corpus for the literal translation condiciones, coberturas y exclusiones shows only one concordance. On this point it is important to remember that legal language is characterised not only by its precision, but also by its formulaic and extremely conservative style. The translator should be aware of the abundance of verbose and often redundant phraseological units and other fixed expressions and the archaic or conventional forms that these texts contain, often with the sole purpose of making them appear more grandiose. Finally, the Spanish corpus revealed that the term exclusiones is always found as part of the phraseological unit límites y exclusiones (or, else, as garantías, límites y exclusiones), as can be inferred from the results presented by the program when writing exclusiones (cf. Figure 9).

Figure 7. Concordances for ‘seguro de asistencia en viaje’

Figure 8. Concordances for ‘contract’

Figure 9. Concordances for ‘exclusiones’

Figure 10. Concordances for ‘conditions’

Page 13: Virtual corpora as documentation resources: Translating


A similar problem may be encountered by the translator when translating CONDICIONES GENERALES (cf. Extract 2) into English. A search in the corpus for conditions shows that in English the construction General Terms and Conditions (cf. Figure 10) with capital letters is preferred in most cases.

5.2.2 Conceptual information

In English the policies always refer to the insured person (cf. Extract 1), whereas the Spanish legal system recognises various figures. As a result, it may be beneficial to distinguish between the asegurado (the insured person), the tomador (the person who takes out the insurance) and the beneficiario (the beneficiary of the insurance). The asegurado is the person (either physical or legal) who is exposed to a particular risk, either to his person or his property or assets. In other words, the asegurado is the subject of the contract whether in his person (in the case of life insurance or pensions for example) or his property (in the case of house insurance or insurance against fire amongst others). The tomador is the person who takes out the insurance and pays the premiums, but may not necessarily be the beneficiary. The beneficiario is the person specified in the policy as the recipient of the assistance or compensation covered by the insurance.

The corpus may therefore also be used to clarify concepts and, as a result, identify which person is being referred to in Spanish. Hence, a search in the corpus based on the expression insured person (cf. Figure 11) shows definitions such as “Insured person, you, your – each person who an insurance premium has been paid for as shown on the policy schedule”.

It may, therefore, be concluded that the English term insured person should be translated as Asegurado with a capital letter, as illustrated by the information shown from the corpus (cf. Figure 12). The option persona asegurada, with 20 occurrences, may be ruled out in favour of Asegurado or Asegurados, with 5,692 and 646 occurrences respectively.

In the case of the Spanish fragment (cf. Extract 2), the main problem is rooted in the difficulties of rendering the legislation in translation: Ley 50/1980, de 8 de octubre, de Contrato de seguro, en la Ley 30/1995, de 8 de Noviembre, de Ordenación y Supervisión de los Seguros Privados. Here it may be helpful to remember that although there is no substantive Community legislation on the subject of travel insurance, the contract may be subject to the national regulations of the countries that the parties making the agreement come from. If the customer wants an adaptation of the translation to the British legal system, the translator can use the corpus to find the information necessary to perform this task. The results of a search in the English subcorpus (cf. Figure 13) for law (legislation was also searched, but produced no occurrences) show a substantial difference from the way that legislation is expressed in Spanish. Whereas in Spanish there is much more precision, in English a more generic means of expression is preferred, with reference made solely to English Law and no mention of the specific regulations that apply. In addition, on the subject of legislation, it may be seen that in English the opening formula, Law applicable, does not coincide with the Spanish Artículo preliminar. This question will be dealt with in the following section (cf. 5.2.3).

Figure 11. Definition of ‘insured person’

Figure 12. Concordances for ‘asegurado’
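The choice between persona asegurada, Asegurado and Asegurados rests on counts in which capitalisation and the singular/plural distinction both matter. A minimal sketch of such a case-sensitive, whole-word count is given below; the helper name `count_variants` and the sample text are hypothetical and serve only to illustrate the idea.

```python
import re
from collections import Counter

def count_variants(text, variants):
    """Count exact, case-sensitive whole-word matches of each variant."""
    counts = Counter()
    for v in variants:
        # The lookarounds stop 'Asegurado' from matching inside 'Asegurados'.
        pattern = re.compile(r"(?<!\w)" + re.escape(v) + r"(?!\w)")
        counts[v] = len(pattern.findall(text))
    return counts

# Invented sample text, not taken from the corpus.
sample = (
    "El Asegurado y los Asegurados deberán avisar. "
    "La persona asegurada firma. El Asegurado paga."
)

variants = ["Asegurado", "Asegurados", "persona asegurada", "asegurado"]
for variant, n in count_variants(sample, variants).items():
    print(variant, n)
```

Keeping the match case-sensitive is a deliberate choice here: a case-insensitive search would merge Asegurado with asegurado and hide the capitalisation convention that the concordances reveal.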

Page 14: Virtual corpora as documentation resources: Translating


5.2.3 Textual conventions

Finally, the preliminary documentation work involves carrying out searches focusing on the typology of the text to be translated. In this case our intention was to find typical opening formulas in the travel insurance policies in Spanish equivalent to the English Important (cf. Extract 1). We therefore searched for concordances in Spanish based on Importante. The results show that the typical opening formula for this section in Spanish is not Importante but MUY IMPORTANTE, with the whole sequence in capital letters (cf. Figure 14).

In the case of the Spanish text (cf. Extract 2), the typical opening formula consists of a preliminary article (Artículo Preliminar) which contains references to the relevant legislation. However, the corpus shows that the English convention has its own formula in travel insurance policies, Law applicable, which, furthermore, generally appears in the last paragraph of the policy and therefore constitutes a closing formula rather than the opening formula found in Spanish.

Figure 13. Concordances for ‘law’


5.3 Target text samples

Once all the necessary information has been gathered from the travel insurance corpus, the translator is in a position to offer a translation of both extracts. It is essential to take into account all the points that have been outlined so far, given their importance when it comes to segmenting and reorganising the information in the target text (TT). The following are suggested translations of Extracts 1 and 2.

Extract 1 (TT):

MUY IMPORTANTE

Esta es su póliza de asistencia en viaje. En ella se incluyen las garantías, límites y exclusiones de los Asegurados y a partir de las cuales podrá efectuarse cualquier reclamación.

Extract 2 (TT):

General Terms and Conditions

This is your travel insurance contract.

Law applicable: This policy is subject to Spanish law.

Figure 14. Concordances for ‘importante’

Page 15: Virtual corpora as documentation resources: Translating


6. Conclusion

We would like to begin our concluding remarks by quoting Zanettin (2002a: NP):

Recent research in translation studies has stressed the contribution which corpora of electronic texts can bring to translators. By using appropriate software translators can look up words in a matter of seconds, and highlight patterns by sorting contexts around search words. If a corpus is appropriately designed, it can provide reliable evidence of authentic linguistic behaviour and text-structuring conventions by highlighting recurrent patterns. Terminological and collocational information can be especially useful.

As we have seen, it is possible to meet a large part of the translator’s documentation needs through the compilation and/or management of comparable virtual corpora. As a result, translators gain a great deal through becoming both corpus compilers and users. The heuristic tasks necessary in selecting systems to be used for mining the information, as well as the parallel task of finding the information that will be taken from the Internet, are an authentic exercise in applied documentation. Simultaneously, this leads to the development of documentation competence and, as a result, linguistic-textual competence for the translator.

At the same time, a well planned virtual corpus that complies with appropriate design criteria and which is representative in terms of the type of target text that is required may contribute to the development of translators’ overall competence. The preparatory tasks involved in selecting and evaluating information sources lead to obvious savings in terms of time and effort that allow the translator to focus on other issues that require more attention, such as taking decisions or evaluating different translation options.

In this article we have focused on the use of virtual corpora as the documentation resource par excellence in specialist translation training. However, the methodology behind corpus compilation is not always very clear and all too often the availability of documents on the Internet is the crucial criterion which determines the size of the collection of texts. As a result, if the collection of texts is to qualify as a “corpus” and be considered as representative of a particular field, it is essential that it conforms to clear design parameters that are set out from the beginning, followed by a specific compilation protocol. This protocol is divided into four distinct phases: (a) locating and accessing resources; (b) downloading data; (c) text formatting; and (d) data storage.

Corpus representativeness may also be measured a posteriori using ReCor, a computer programme that calculates the minimum number of documents and words that should be included in specialised language corpora, in order that they may be considered representative. It should be pointed out that it is not possible to establish the minimum number of documents for a given corpus a priori, as the size will depend on the language and text types involved, as well as on the restrictions of a particular specialised field and any other diasystematic limitations.

Virtual comparable corpora, constructed in accordance with the protocol outlined in this study, are extremely useful for the study of discourse within the field of specialisation under examination, the way this discourse manifests itself in the respective documents as well as the forms these texts take in practice. This utility may be seen from a monolingual and monocultural perspective as well as from the point of view of translation, comparison and interlinguistic and intercultural mediation. As a result, the virtual corpus may be viewed as a highly effective tool in specialised translation training since it promotes autonomous processes of teaching-learning by establishing appropriate mechanisms for specialisation and diversification for the translator. In addition, it encourages the study of texts that students have translated with the objective of correcting and validating translation assignments, as well as many other possible uses that are still to be discovered.
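The idea of measuring representativeness a posteriori, by adding documents until the corpus stops yielding noticeably new vocabulary, can be illustrated with a short sketch. This is not ReCor and does not reproduce its algorithm; it assumes, purely for illustration, that the cumulative types/tokens ratio is tracked as each document is added and that stabilisation is declared when the ratio changes by less than an arbitrary tolerance. The function names, the tokeniser and the `tolerance` parameter are all hypothetical.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def ttr_curve(documents):
    """Cumulative (documents, tokens, types/tokens) after each added document."""
    seen = set()
    n_tokens = 0
    curve = []
    for i, doc in enumerate(documents, start=1):
        tokens = tokenize(doc)
        n_tokens += len(tokens)
        seen.update(tokens)
        ratio = len(seen) / n_tokens if n_tokens else 0.0
        curve.append((i, n_tokens, ratio))
    return curve

def stabilisation_point(curve, tolerance=0.01):
    """First point at which the ratio changes by less than `tolerance`."""
    for prev, cur in zip(curve, curve[1:]):
        if abs(prev[2] - cur[2]) < tolerance:
            return cur
    return None

# Invented toy documents; real input would be the downloaded policy texts.
docs = ["a b c", "a b c", "a b c"]
curve = ttr_curve(docs)
print(curve)
print(stabilisation_point(curve, tolerance=0.2))
```

Under these assumptions the point returned gives a rough estimate of how many documents and tokens were needed before the vocabulary growth levelled off; a highly repetitive text type reaches that point sooner than a more varied one.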

References

ACT. 2005. Primer estudio de mercado de los servicios de traducción profesional en España de la Asociación de Empresas de Traducción (ACT). Madrid: ACT.

Almahano Güeto, I. 2002. El contrato de viaje combinado en alemán y español: Las condiciones generales. Un estudio basado en corpus. PhD Thesis. Málaga: Universidad de Málaga.

Aston, G. (ed.). 2001. Learning with Corpora. Bologna: CLUEB.

Aurioles Martín, A. 2005 [2002]. Introducción al Derecho Turístico (Derecho Privado del Turismo). Madrid: Tecnos.

Aurioles Martín, A., Benavides Velasco, P. G. and González Fernández, M. B. 2004. Contratación Turística. Technical document BFF2003-04616 MCYT/TI-DT-2004-1. 1–12. <http://turicor.com/privada/documentos/TI-DT-2004-1.pdf>. [14/03/2007].

Austermühl, F. 2001. Electronic Tools for Translators. Manchester: St. Jerome.

Bernardini, S. and Zanettin, F. (eds). 2000. I corpora nella didattica della traduzione. Corpus Use and Learning to Translate. Bologna: CLUEB.

Biber, D., Conrad, S. and Reppen, R. 1998. Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.

Bowker, L. 2002. Computer-Aided Translation Technology: A Practical Introduction. Ottawa: University of Ottawa Press.

Bowker, L. and Pearson, J. 2002. Working with Specialized Language: A practical guide to using corpora. London: Routledge.

Braun, E. 2005 [1996]. “El caos ordena la lingüística. La ley de Zipf.” In Caos fractales y cosas raras, E. Braun (ed.). Mexico D.F.: Fondo de Cultura Económica. <http://omega.ilce.edu.mx:3000/sites/ciencia/volumen3/ciencia3/150/htm/caos.htm> [14/03/2007].

Page 16: Virtual corpora as documentation resources: Translating


Car

rasc

o Ji

mén

ez, R

. C. 2

003.

La

ley

de

Zip

f en

la B

ibli

otec

a M

igu

el d

e C

erva

nte

s. A

lican

te: U

ni-

vers

idad

de

Alic

ante

. <ht

tp:/

/ww

w.d

lsi.u

a.es

/asi

gnat

uras

/aa/

Zip

f.pdf

> [

14/0

3/20

07].

CO

RIS

/CO

DIS

. 20

06.

“Pro

gett

azio

ne

e co

stru

zion

e di

un

Cor

pus

di I

talia

no

Scri

tto.

” C

O-

RIS

/CO

DIS

. B

olog

na:

C

ILT

A.

<ht

tp:/

/cor

pus.

cilt

a.un

ibo.

it:8

080/

cori

s_it

aPro

gett

.htm

l>

[14/

03/2

007]

.C

orpa

s P

asto

r, G

. 200

1. “

Com

pila

ción

de

un c

orpu

s ad

hoc

par

a la

en

señ

anza

de

la t

radu

cció

n

inve

rsa

espe

cial

izad

a.” T

ran

s: R

evis

ta d

e T

rad

uct

olog

ía 5

: 155

–184

.C

orpa

s P

asto

r, G

. (ed

.) 2

003a

. Rec

urs

os d

ocu

men

tale

s y

técn

icos

par

a l

a t

rad

ucc

ión

del

dis

curs

o

jurí

dic

o (e

spañ

ol, a

lem

án, i

ngl

és, i

tali

ano,

ára

be).

Gra

nad

a: C

omar

es.

Cor

pas

Pas

tor,

G. 2

003b

. “D

iseñ

o de

un

tip

olog

izad

or p

ara

la t

radu

cció

n ju

rídi

ca: D

el c

orpu

s al

pro

toti

po t

extu

al.”

In R

ecu

rsos

doc

um

enta

les

y té

cnic

os p

ara

la

tra

du

cció

n d

el d

iscu

rso

jurí

dic

o (e

spañ

ol,

alem

án,

ingl

és,

ital

ian

o, á

rabe

), G

. Cor

pas

Pas

tor

(ed.

), 3

3–58

. Gra

nad

a:

Com

ares

.C

orpa

s P

asto

r, G

. 200

4a. “

Loc

aliz

ació

n d

e re

curs

os y

com

pila

ción

de

corp

us v

ía I

nter

net

: Apl

i-ca

cion

es p

ara

la d

idác

tica

de

la t

radu

cció

n m

édic

a es

peci

aliz

ada.”

In

Man

ual

de

doc

um

en-

taci

ón y

ter

min

olog

ía p

ara

la

tra

du

cció

n e

spec

iali

zad

a,

C.

Gon

zalo

Gar

cía

and

V.

Gar

cía

Yebr

a (e

ds),

223

–257

. Mad

rid:

Arc

o/L

ibro

s.

Cor

pas

Pas

tor,

G. 2

004b

. “�

e Tu

rico

r P

roje

ct: W

ork

in P

rogr

ess.”

Rev

ista

Eu

rope

a d

e D

erec

ho

de

la N

aveg

ació

n M

arít

ima

y A

ren

onáu

tica

xx

: 1–1

4. <

http

://t

uric

or.c

om/p

df/c

orpa

s200

4b.

pdf>

[14

/03/

2007

].C

orpa

s P

asto

r, G

. 200

4c. “

La

trad

ucci

ón d

e te

xtos

méd

icos

esp

ecia

lizad

os a

tra

vés

de r

ecur

sos

elec

trón

icos

y c

orpu

s vi

rtua

les.”

In

La

s pa

labr

as

del

tra

du

ctor

. A

cta

s d

el I

I C

ongr

eso

Inte

r-

nac

ion

al «

El e

spañ

ol, l

engu

a d

e tr

adu

cció

n»,

20

y 21

de

may

o, T

oled

o 20

04, L

. Gon

zále

z an

d P.

Her

núñ

ez (

eds)

, 137

–164

. Bru

ssel

s: C

omis

ión

Eur

opea

/ESL

ET

RA

. <ht

tp:/

/ww

w.tu

rico

r.co

m/p

df/c

orpa

s200

4c.p

df>

[14

/03/

2007

].C

orpa

s P

asto

r, G

. an

d Se

ghir

i, M

. 200

6a.

El

con

cept

o d

e re

pres

enta

tivi

dad

en

la

Lin

güís

tica

del

Cor

pus:

Apr

oxim

acio

nes

teó

rica

s y

met

odol

ógic

as.

Tec

hnic

al d

ocum

ent

BFF

2003

-046

16

MC

YT

/TI-

DT

-200

6-1.

C

orpa

s P

asto

r, G

. an

d Se

ghir

i, M

. 20

06b.

“R

ecur

sos

docu

men

tale

s pa

ra l

a tr

aduc

ción

de

se-

guro

s tu

ríst

icos

en

el

par

de l

engu

as i

ngl

és-e

spañ

ol.”

In I

nve

stig

ació

n y

tra

du

cció

n:

Un

a

mir

ada

al p

rese

nte

en

la la

bor

inve

stig

ador

a y

en

el e

jerc

icio

de

la p

rofe

sión

de

la li

cen

ciat

ura

Tra

du

cció

n e

In

terp

reta

ción

, E. P

osti

go P

inaz

o (e

d.).

Mál

aga:

Un

iver

sida

d de

Mál

aga.

Corpas Pastor, G. and Seghiri, M. 2007a. “Specialized Corpora for Translators: A Quantitative Method to Determine Representativeness.” Translation Journal 11 (3). <http://translation-journal.net/journal/41corpus.htm> [14/03/2007].

Corpas Pastor, G. and Seghiri, M. 2007b. “Determinación del umbral de representatividad de un corpus mediante el algoritmo N-Cor.” Procesamiento del Lenguaje Natural 39: 165–172. <http://www.sepln.org/revistaSEPLN/revista/39/20.pdf> [14/03/2007].

Corpas Pastor, G. and Seghiri, M. Forthcoming. El concepto de representatividad en lingüística de corpus: Aproximaciones teóricas y consecuencias para la traducción. Málaga: Servicio de Publicaciones de la Universidad.

Council Directive 73/240/EEC of 24 July 1973 abolishing restrictions on freedom of establishment in the business of direct insurance other than life assurance.

Council Directive 76/580/EEC of 29 June 1976 amending Directive 73/239/EEC on the coordination of laws, regulations and administrative provisions relating to the taking up and pursuit of the business of direct insurance other than life assurance.


Council Directive 78/473/EEC of 30 May 1978 on the coordination of laws, regulations and administrative provisions relating to Community co-insurance.

Council Directive 84/641/EEC of 10 December 1984 amending, particularly as regards tourist assistance, the First Directive (73/239/EEC) on the coordination of laws, regulations and administrative provisions relating to the taking-up and pursuit of the business of direct insurance other than life assurance.

Council Directive 87/343/EEC of 22 June 1987 amending, as regards credit insurance and suretyship insurance, First Directive 73/239/EEC on the coordination of laws, regulations and administrative provisions relating to the taking-up and pursuit of the business of direct insurance other than life assurance.

Council Directive 87/344/EEC of 22 June 1987 on the coordination of laws, regulations and administrative provisions relating to legal expenses insurance.

Council Directive 90/618/EEC of 8 November 1990, amending, particularly as regards motor vehicle liability insurance, first Council Directive 73/239/EEC and second Council Directive 88/357/EEC on the coordination of laws, regulations and administrative provisions relating to direct insurance other than life assurance.

Council Directive 92/49/EEC of 18 June 1992 on the coordination of laws, regulations and administrative provisions relating to direct insurance other than life assurance and amending Directives 73/239/EEC and 88/357/EEC (third non-life insurance Directive).

Council Directive 92/96/EEC of 10 November 1992 on the coordination of laws, regulations and administrative provisions relating to direct life assurance and amending Directives 79/267/EEC and 90/619/EEC (third life assurance Directive).

Directive 2000/26/EC of the European Parliament and of the Council of 16 May 2000 on the approximation of the laws of the Member States relating to insurance against civil liability in respect of the use of motor vehicles and amending Council Directives 73/239/EEC and 88/357/EEC.

Directive 2000/64/EC of the European Parliament and of the Council of 7 November 2000 amending Council Directives 85/611/EEC, 92/49/EEC, 92/96/EEC and 93/22/EEC as regards exchange of information with third countries.

Directive 2002/13/EC of the European Parliament and of the Council of 5 March 2002 amending Council Directive 73/239/EEC as regards the solvency margin requirements for non-life insurance undertakings.

Directive 2002/92/EC of the European Parliament and of the Council of 9 December 2002 on insurance mediation.

European Parliament and Council Directive 95/26/EC of 29 June 1995 amending Directives 77/780/EEC and 89/646/EEC in the field of credit institutions, Directives 73/239/EEC and 92/49/EEC in the field of non-life insurance, Directives 79/267/EEC and 92/96/EEC in the field of life assurance, Directive 93/22/EEC in the field of investment firms and Directive 85/611/EEC in the field of undertakings for collective investment in transferable securities (Ucits), with a view to reinforcing prudential supervision.

First Council Directive 73/239/EEC of 24 July 1973 on the coordination of laws, regulations and administrative provisions relating to the taking-up and pursuit of the business of direct insurance other than life assurance.

Fletcher, W. H. 2004. “Facilitating the Compilation and Dissemination of Ad-Hoc Web Corpora.” In The Fifth International Conference on Teaching and Language Corpora, G. Aston, S. Bernardini and D. Stewart (eds), 1–18. Amsterdam: Benjamins. <http://www.kwicfinder.com/Facilitating_Compilation_and_Dissemination_of_Ad-Hoc_Web_Corpora.pdf> [14/03/2007].

Page 17: Virtual corpora as documentation resources: Translating

Giouli, V. and Piperidis, S. 2002. Corpora and HLT. Current trends in corpus processing and annotation. Bulgaria: Institute for Language and Speech Processing. <http://www.larflast.bas.bg/balric/eng_files/corpora1.php> [14/03/2007].

Granger, S. and Petch-Tyson, S. (eds). 2003. Extending the Scope of Corpus-Based Research: New Applications, New Challenges. Amsterdam and Atlanta: Rodopi.

Heaps, H. S. 1978. Information Retrieval: Computational and Theoretical Aspects. New York: Academic Press.

Insurance Act 2000.

Kenny, D. 2001. Lexis and Creativity in Translation. A Corpus-based Study. Manchester: St. Jerome.

Lavid López, J. 2005. Lenguaje y nuevas tecnologías: nuevas perspectivas, métodos y herramientas para el lingüista del siglo XXI. Madrid: Cátedra.

Laviosa, S. (ed.). 1998. L’approche basée sur le corpus / The Corpus-based Approach, Meta 43 (4).

Ley 18/1997, de 13 de mayo, de modificaciones del artículo 8 de la Ley de Contrato de Seguro, para garantizar la plena utilización de todas las lenguas oficiales en la redacción de los contratos. BOE. 0115 de 14 de mayo de 1997.

Ley 30/1995, de 8 de noviembre, de ordenación y supervisión de los Seguros Privados.

Ley 50/1980, de 8 de octubre, del Contrato de Seguro.

Moreiro González, J. A. 2002. “Aplicaciones al análisis automático del contenido provenientes de la teoría matemática de la información.” Anales de documentación 5: 273–286. <http://www.um.es/fccd/anales/ad05/ad0515.pdf> [14/03/2007].

Orden Ministerial de 27 de enero de 1988 por la que se califica la cobertura de las prestaciones de asistencia en viaje como operación de seguro privado.

Pearson, J. 1998. Terms in Context, Studies in Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins Publishing.

Radev, D., Fan, W., Qi, H., Wu, H. and Grewal, A. 2005. “Probabilistic question answering on the web.” Journal of the American Society for Information Science and Technology (JASIST) 56 (6): 571–583. <http://filebox.vt.edu/users/wfan/paper/www/www.pdf> [14/03/2007].

Sanahuja, S. and Silva, A. 2001. “Muestreo teórico y estudios del discurso. Una propuesta teórico-metodológica para la generación de categorías significativas en el campo del Análisis del Discurso.” El Estudio del Discurso: Metodología Multidisciplinaria. II Coloquio Nacional de Investigadores en Estudios del Discurso. La Plata, 6 al 8 de septiembre de 2001. Buenos Aires: Asociación Latinoamericana de Estudios del Discurso and Universidad Nacional del Centro de la Provincia de Buenos Aires. <http://www.sai.com.ar/KUCORIA/discurso.html> [14/03/2007].

Sánchez-Gijón, P. 2003a. “És la web pública la nova biblioteca del traductor?” Tradumàtica: Traducció i tecnologies de la informació i la comunicació 2: 1–7. <http://www.bib.uab.es/pub/tradumatica/15787559n2a7.pdf> [14/03/2007].

Sánchez-Gijón, P. 2003b. Els documents digitals especialitzats: utilització de la lingüística de corpus com a font de recursos per a la traducció. PhD Thesis. Barcelona: Universidad Autónoma de Barcelona.

Sánchez Pérez, A. and Cantos Gómez, P. 1997. “Predictability of Word Forms (Types) and Lemmas in Linguistic Corpora. A Case Study Based on the Analysis of the CUMBRE Corpus: An 8-Million-Word Corpus of Contemporary Spanish.” International Journal of Corpus Linguistics 2 (2): 259–280.

Second Council Directive 88/357/EEC of 22 June 1988 on the coordination of laws, regulations and administrative provisions relating to direct insurance other than life assurance and laying down provisions to facilitate the effective exercise of freedom to provide services and amending Directive 73/239/EEC.

Seghiri, M. 2006. Compilación de un corpus trilingüe de seguros turísticos (español-inglés-italiano): aspectos de evaluación, catalogación, diseño y representatividad [Compilation of a trilingual corpus of travel insurance contracts (English-Italian-Spanish): evaluation, classification, design and representativeness]. PhD Thesis. Málaga: Universidad de Málaga.

Sinclair, J. M. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

The Financial Services and Markets Act 2000 (Regulated Activities).

The Insurers (Reorganisation and Winding Up) Regulations 2004.

Varantola, K. 1997. “Translators, dictionaries and text corpora.” In I corpora nella didattica della traduzione, S. Bernardini and F. Zanettin (eds), 117–133. Bologna: CLUEB.

WTTC. 2006a. World Travel and Tourism climbing to new heights. The 2006 Travel & Tourism Economic Research. London: World Travel & Tourism Council. <http://www.wttc.org/2006TSA/pdf/World.pdf> [14/03/2007].

WTTC. 2006b. United Kingdom Travel and Tourism climbing to new heights. The 2006 Travel & Tourism Economic Research. London: World Travel & Tourism Council. <http://www.wttc.org/2006TSA/pdf/United%20Kingdom.pdf> [14/03/2007].

WTTC. 2006c. Ireland Travel and Tourism climbing to new heights. The 2006 Travel & Tourism Economic Research. London: World Travel & Tourism Council. <http://www.wttc.org/2006TSA/pdf/Ireland.pdf> [14/03/2007].

WTTC. 2006d. Italy Travel and Tourism climbing to new heights. The 2006 Travel & Tourism Economic Research. London: World Travel & Tourism Council. <http://www.wttc.org/2006TSA/pdf/Italy.pdf> [14/03/2007].

WTTC. 2006e. Spain Travel and Tourism climbing to new heights. The 2006 Travel & Tourism Economic Research. London: World Travel & Tourism Council. <http://www.wttc.org/2006TSA/pdf/Spain.pdf> [14/03/2007].

Yang, D., Cantos Gómez, P. and Song, M. 2000. “An Algorithm for Predicting the Relationship between Lemmas and Corpus Size.” ETRI Journal 22 (2): 20–31. <http://etrij.etri.re.kr/Cyber/servlet/GetFile?fileid=SPF-1042453354988> [14/03/2007].

Young-Mi, Jeong. 1995. “Statistical Characteristics of Korean Vocabulary and Its Application.” Lexicographic Study 5 (6): 134–163.

Zanettin, F. 2002a. “DIY Corpora: The WWW and the Translator.” In Training the Language Services Provider for the New Millennium, B. Maia, J. Haller and M. Ulrych (eds). Porto: Faculdade de Letras, Universidade do Porto. <http://www.federicozanettin.net/DIYcorpora.htm> [14/03/2007].

Zanettin, F. 2002b. “CEXI. Designing an English Italian Translational Corpus.” In Teaching and Learning by Doing Corpus Analysis, B. Kettemann and G. Marko (eds), 329–343. Amsterdam: Rodopi.

Zanettin, F., Bernardini, S. and Stewart, D. (eds). 2003. Corpora in translator education. Manchester: St. Jerome.

