Teradata Interview Prep Questions

Upload: amitosh007

Post on 04-Jun-2018


  • 8/13/2019 Teradata Interview Prep Questions


    1. Explain Teradata Architecture

    Major Components of Teradata Architecture

    NODE: A node is made up of various hardware and software components. The components that make up a node are:

    1. Parsing Engine (PE)

    2. BYNET

    3. Access Module Processor (AMP)

    4. Disks

    Parsing Engine

    The Parsing Engine (PE) is a component that interprets SQL requests, receives input records, and passes data. To do that, it sends the messages through the BYNET to the AMPs.

    BYNET

    The BYNET is the message passing layer. It determines which AMP(s) (Access Module Processor) should receive a message.

    Access Module Processor (AMP)

    The AMP is a virtual processor designed for and dedicated to managing a portion of the entire database. It performs all database management functions such as sorting, aggregating, and formatting data. The AMP receives data from the PE, formats rows, and distributes them to the disk storage units it controls. The AMP also retrieves the rows requested by the Parsing Engine.

    Disks

    Disks are disk drives associated with an AMP that store the data rows. On current systems, they are implemented using a disk array.


    All applications run under UNIX, Windows NT or Windows 2000, and all Teradata software runs under PDE. All share the resources of CPU and memory on the node. AMPs and PEs are virtual processors running under control of the PDE. Their numbers are software configurable. In addition to user applications, gateway software and channel driver support may also be running.

    The Teradata RDBMS has a "shared-nothing" architecture, which means that the vprocs (which are the PEs and AMPs) do not share common components. For example, each AMP manages its own dedicated memory space (taken from the memory pool) and the data on its own vdisk; these are not shared with other AMPs. Each AMP uses system resources independently of the other AMPs, so they can all work in parallel for high system performance overall.

    Symmetric Multi-Processor (SMP): A single node is a Symmetric Multi-Processor (SMP).

    Massively Parallel Processing (MPP): When multiple SMP nodes are connected to form a larger configuration, we refer to this as a Massively Parallel Processing (MPP) system.

    2. Functionality of each component included in the Teradata architecture:

    Parsing Engine:

    A Parsing Engine (PE) is a virtual processor (vproc). It is made up of the following software components:

    1. Session Control

    2. Parser

    3. Optimizer

    4. Dispatcher

    Session Control

    The major functions performed by Session Control are logon and logoff. Logon takes a textual request for session authorization, verifies it, and returns a yes or no answer. Logoff terminates any ongoing activity and deletes the session

    The BYNET handles the internal communication of the Teradata RDBMS. All communication between PEs and AMPs is done via the BYNET.

    When the PE dispatches the steps for the AMPs to perform, they are dispatched onto the BYNET. The messages are routed to the appropriate AMP(s), where result sets and status information are generated. This response information is also routed back to the requesting PE via the BYNET.

    Depending on the nature of the dispatch request, the communication may be a:

    • Broadcast: the message is routed to all nodes in the system.
    • Point-to-point: the message is routed to one specific node in the system.

    Once the message is on a participating node, PDE handles the multicast (carries the message to just the AMPs that should get it). So, while a Teradata system does do multicast messaging, the BYNET hardware alone cannot do it; the BYNET can only do point-to-point and broadcast between nodes.

    FEATURES OF BYNET:

    The BYNET has several unique features:

    Fault tolerant: each network has multiple connection paths. If the BYNET detects an unusable path in either network, it will automatically reconfigure that network so all messages avoid the unusable path. Additionally, in the rare case that BYNET 0 cannot be reconfigured, hardware on BYNET 0 is disabled and messages are re-routed to BYNET 1 (or equally distributed if there are more than two BYNETs present), and vice versa.

    Load balanced: traffic is automatically and dynamically distributed between both BYNETs.

    Scalable: as you add nodes to the system, overall network bandwidth scales linearly, meaning an increase in system size without loss of performance.

    High Performance: an MPP system typically has two or more BYNET networks. Because all networks are active, the system benefits from the full aggregate bandwidth of all networks. Since the number of networks can be scaled, performance can also be scaled to meet the needs of demanding applications. The technology of the BYNET is what makes the Teradata parallelism possible.

    The Access Module Processor (AMP)

    The Access Module Processor (AMP) is the virtual processor. An AMP will control some portion of each table on the system. AMPs do the physical work associated with generating an answer set, including sorting, aggregating, formatting and converting. An AMP can control up to 64 physical disks. The AMPs perform all database management functions in the system. An AMP responds to Parser/Optimizer steps transmitted across the BYNET by selecting data from or storing data to its disks. For some requests, the AMPs may redistribute a copy of the data to other AMPs.

    The Database Manager subsystem resides on each AMP. The Database Manager:

    • Receives the steps from the Dispatcher and processes the steps. It has the ability to:
        Lock databases and tables.
        Create, modify, or delete definitions of tables.
        Insert, delete, or modify rows within the tables.
        Retrieve information from definitions and tables.
    • Collects accounting statistics, recording accesses by session so users can be billed appropriately.
    • Returns responses to the Dispatcher.

    The Database Manager provides a bridge between that logical organization and the physical organization of the data on disks. The Database Manager performs a space-management function that controls the use and allocation of space.


    A disk array is a configuration of disk drives that utilizes specialized controllers to manage and distribute data and parity across the disks while providing fast access and data integrity.

    Each AMP vproc must have access to an array controller that in turn accesses the physical disks. AMP vprocs are associated with one or more ranks of data. The total disk space associated with an AMP vproc is called a vdisk. A vdisk may have up to three ranks.

    Teradata supports several protection schemes:

    • RAID Level 5: data and parity protection striped across multiple disks.
    • RAID Level 1: each disk has a physical mirror replicating the data.
    • RAID Level S: data and parity protection similar to RAID 5, but used for EMC disk arrays.

    The disk array controllers are referred to as dual active array controllers, which means that both controllers are actively used in addition to serving as backup for each other.

    3. How is Teradata parallel?

    Teradata is parallel for the following reasons:

    • Each PE can support up to 120 user sessions in parallel.
    • Each session may handle multiple requests concurrently. While only one request at a time may be active on behalf of a session, the session itself can manage the activities of 16 requests and their associated answer sets.
    • The MPL is implemented differently for different platforms; this means that it will always be well within the needed bandwidth for each particular platform

    operation is performed on a VPROC

    Data retrieval:

    Retrieving data from the Teradata RDBMS simply reverses the storage model process. A request made for data is passed on to a Parsing Engine (PE). The PE optimizes the request for efficient processing and creates tasks for the AMPs to perform, which results in the request being satisfied. Tasks are then dispatched to the AMPs via the BYNET.

    Often, all AMPs must participate in creating the answer set, such as when returning all rows of a table to a client application. Other times, only one or a few AMPs need to participate. The PE will ensure that only the AMPs that need to will be assigned tasks.

    Once the AMPs have been given their assignments, they retrieve the desired rows from their respective disks. The AMPs will sort, aggregate, or format if needed. The rows are then returned to the requesting PE via the BYNET. The PE takes the returned answer set and returns it to the requesting client application.
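The storage and retrieval flow above can be sketched in a few lines of Python. This is a toy model and not Teradata's implementation: the MD5-based hash, the AMP count, and the names `amp_for`, `store`, `retrieve_by_pi` and `retrieve_all` are all assumptions made for illustration.

```python
import hashlib

NUM_AMPS = 4  # assumed AMP count for the toy system


def amp_for(pi_value):
    """Map a primary index value to an AMP number (stand-in for the real row hash)."""
    digest = hashlib.md5(str(pi_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS


def store(amps, row, pi_column):
    """PE side: hash the PI value and hand the row to the owning AMP."""
    amps[amp_for(row[pi_column])].append(row)


def retrieve_by_pi(amps, pi_column, value):
    """Single-AMP retrieval: only the AMP that owns the hash is given a task."""
    return [r for r in amps[amp_for(value)] if r[pi_column] == value]


def retrieve_all(amps, sort_key):
    """All-AMP retrieval: every AMP returns its rows and the PE merges them."""
    return sorted((r for amp in amps for r in amp), key=lambda r: r[sort_key])


amps = [[] for _ in range(NUM_AMPS)]
for emp in range(1, 11):
    store(amps, {"emp": emp, "dept": emp % 3}, "emp")
```

Calling `retrieve_by_pi(amps, "emp", 7)` touches one AMP only, while `retrieve_all(amps, "emp")` has every AMP contribute rows, which mirrors the "one or a few AMPs" versus "all AMPs" cases in the text.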

    5. If PI is not defined on a Teradata table, what will happen?

    Teradata tables must have a primary index. If none is specified while creating the table, Teradata supplies an automatically created one.

    6. What are the types of indexes in Teradata?

    Unique Primary Index (UPI)

    Unique Secondary Index (USI)

    Non-Unique Primary Index (NUPI)

    Non-Unique Secondary Index (NUSI)

    Join Index

    7. What is a secondary index? What are its uses?

    A secondary index is an alternate path to the data. Secondary indexes are used to improve performance by allowing the user to avoid scanning the entire table during a query. A secondary index is like a primary index in that it allows the user to locate rows. Unlike a primary index, it has no influence on the way rows are distributed among AMPs. Secondary indexes are optional and can be created and dropped dynamically. Secondary indexes require separate subtables, which require extra I/O to maintain the indexes.

    Compared to primary indexes, secondary indexes allow access to information in a table by alternate, less frequently used paths. Teradata automatically creates a Secondary Index Subtable. The subtable will contain:

    Secondary Index Value
    Secondary Index Row ID
    Primary Index Row ID

    When a user writes an SQL query that has an SI in the WHERE clause, the Parsing Engine will hash the Secondary Index Value. The output is the Row Hash of the SI. The PE creates a request containing the Row Hash and gives the request to the Message Passing Layer (which includes the BYNET software and network). The Message Passing Layer uses a portion of the Row Hash to point to a bucket in the Hash Map. That bucket contains an AMP number to which the PE's request will be sent.

    The AMP gets the request and accesses the Secondary Index Subtable pertaining to the requested SI information. The AMP will check to see if the Row Hash exists in the subtable and double check the subtable row with the actual secondary index value. Then, the AMP will create a request containing the Primary Index Row ID and send it back to the Message Passing Layer. This request is directed to the AMP with the base table row, and the AMP easily retrieves the data row.
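The two-step lookup described above can be sketched as follows. This is a simplified model for a unique secondary index only, under stated assumptions: MD5 stands in for the row hash, the subtable stores the primary index value rather than a real Row ID, and the names `amp_of`, `insert` and `lookup_by_usi` are hypothetical.

```python
import hashlib

NUM_AMPS = 4  # assumed AMP count


def amp_of(value):
    """Hash a value to an AMP number (stand-in for row hash plus hash map bucket)."""
    digest = hashlib.md5(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS


base = [dict() for _ in range(NUM_AMPS)]      # per-AMP base table: PI value -> row
subtable = [dict() for _ in range(NUM_AMPS)]  # per-AMP USI subtable: SI value -> PI value


def insert(row, pi_col, si_col):
    """Store the base row by PI hash and the index entry by SI hash."""
    base[amp_of(row[pi_col])][row[pi_col]] = row
    subtable[amp_of(row[si_col])][row[si_col]] = row[pi_col]


def lookup_by_usi(si_value):
    """Step 1: the subtable AMP resolves the SI value. Step 2: the base AMP returns the row."""
    si_amp = amp_of(si_value)
    pi_value = subtable[si_amp].get(si_value)
    if pi_value is None:
        return None, [si_amp]
    base_amp = amp_of(pi_value)
    return base[base_amp][pi_value], [si_amp, base_amp]


insert({"emp": 1, "phone": "555-0101"}, "emp", "phone")
insert({"emp": 2, "phone": "555-0102"}, "emp", "phone")
row, amps_touched = lookup_by_usi("555-0102")
```

`amps_touched` always lists two steps for a hit; the two entries may name the same AMP when both hashes happen to land there.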

    Secondary indexes can be useful for:

    Satisfying complex conditions

    Processing aggregates

    Value comparisons

    Matching character combinations

    Joining tables

    8. Why is a secondary index needed?

    Secondary indexes are used to improve performance by allowing the user to avoid scanning the entire table during a query. Secondary indexes are frequently used in the WHERE clause. The base table data are not redistributed when secondary indexes are defined.

    Secondary indexes can be useful for:

    Satisfying complex conditions

    Processing aggregates

    Value comparisons

    Matching character combinations

    Joining tables

    9. When is ACCESS lock used?

    Access locks are used for quick access to tables in a multi-user environment, even if other requests are updating the data. They also have minimal effect on locking out others: when you use an access lock, virtually all requests are compatible with yours.


    11. How to set default database?

    Setting the default database: The user name you log on with is normally your default database. For example, if you log on as

        .logon abc;
        password: xyz

    then abc is normally the default database.

    Queries you make that do not specify a database name will be made against your default database.

    Changing the default database: The DATABASE command is used to change the default database.

    For example:

        DATABASE birla;

    sets your default database to birla, and subsequent queries are made against the birla database.

    12. What is a cluster?

    A cluster is a group of AMPs that act as a single fallback unit. Clustering has no effect on primary row distribution of the table, but the fallback row copy will always go to another AMP in the same cluster.

    Should an AMP fail, the primary and fallback row copies stored on that AMP cannot be accessed. However, their alternate copies are available through the other AMPs in the same cluster.

    The loss of an AMP in one cluster has no effect upon other clusters. It is possible to lose one AMP in each cluster and still have full access to all fallback-protected table data. If there are two AMP failures in the same cluster, the entire Teradata system halts. While an AMP is down, the remaining AMPs in the cluster must do their own work plus the work of the down AMP.

    The example shows an 8-AMP system set up in two clusters of 4 AMPs each.
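The cluster rules above can be expressed as a small sketch. The 8-AMP, two-cluster layout is the one from the text; the rule used to pick the fallback AMP (`row_hash % len(others)`) and the function names are assumptions for illustration, not how Teradata actually assigns fallback rows.

```python
CLUSTERS = [[0, 1, 2, 3], [4, 5, 6, 7]]  # 8 AMPs in two clusters of 4


def cluster_of(amp):
    """Return the list of AMPs in the cluster that contains `amp`."""
    for members in CLUSTERS:
        if amp in members:
            return members
    raise ValueError(f"unknown AMP {amp}")


def fallback_amp(primary_amp, row_hash):
    """Pick a fallback AMP in the same cluster, never the primary itself."""
    others = [a for a in cluster_of(primary_amp) if a != primary_amp]
    return others[row_hash % len(others)]


def data_available(down_amps):
    """Fallback-protected data stays reachable while no cluster has lost two AMPs."""
    down = set(down_amps)
    return all(len(down & set(members)) <= 1 for members in CLUSTERS)
```

For instance, `data_available([0, 5])` holds because each cluster lost only one AMP, while `data_available([0, 1])` does not, matching the "two failures in the same cluster" case.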

    13. What are the connections involved in a channel-attached system?

    In channel-attached systems, there are three major software components, which play important roles in getting the requests to and from the Teradata RDBMS.

    The client application is either written by a programmer or is one of Teradata

    application programs use these routines to perform operations such as logging on and off, submitting SQL queries and receiving responses which contain the answer set. These routines are 98% the same in a network-attached environment as they are in a channel-attached one.

    The Teradata ODBC (Open Database Connectivity) driver uses an open standards-based ODBC interface to provide client applications access to Teradata across LAN-based environments. NCR has ODBC drivers for both UNIX and Windows-based applications.

    The Micro Teradata Director Program (MTDP) is a Teradata-supplied program that must be linked to any application that will be network-attached to the Teradata RDBMS. The MTDP performs many of the functions of the channel-based TDP, including session management. The MTDP does not control session balancing across PEs. Connect and Assign Servers that run on the Teradata system handle this activity.

    The Micro Operating System Interface (MOSI) is a library of routines providing operating system independence for clients accessing the RDBMS. By using MOSI, we only need one version of the MTDP to run on all network-attached platforms.

    15. How do you replace a null value with a default value while loading?

    Using the COALESCE function.

    Syntax: COALESCE(COL, 'DEFAULT')

    16. What is COMPRESS?

    Compress: By default it compresses the null values. To compress any other values, we need to list explicitly the characters or values to be compressed.

    17. How many values can we compress in Teradata?

    Any column can be compressed except the indexed column, and the table must be non-volatile.

    18. Difference between volatile and global temporary table?

    Global Temporary Tables (GTT):
    1. When they are created, the definition goes into the Data Dictionary.
    2. When materialized, the data goes into temp space.
    3. That is why the data is active until the session ends, while the definition remains until it is dropped with a DROP TABLE statement. If it is dropped from some other session, it should be DROP TABLE ALL;
    4. You can collect stats on a GTT.

    Volatile Temporary Tables (VTT):
    1. The table definition is stored in the system cache.
    2. Data is stored in spool space.
    3. That is why both the data and the table definition are active only until the session ends.
    4. You cannot collect stats on a VTT.

    19.

    Primary Key: Not required, unless referential integrity checks are to be performed. Defined by the CREATE TABLE statement. Unique. Identifies a row uniquely. Value cannot be changed. Cannot be null. Not related to the access path.

    Primary Index: Used to store rows on disk. Defined by the CREATE TABLE statement. Unique or non-unique. It is used to distribute rows. Values can be changed. Can be null. Related to the access path.

    20. What is multiple statement processing?

    Multiple statement processing increases performance when loading into large tables. All statements are sent to the parser simultaneously. All statements are executed in parallel.

    21. What is TDPID?

    TDPID is the IP address of the Teradata server machine.

    22. What is tenacity?

    Specifies the number of hours that Teradata FLOAD keeps trying to log on when the maximum number of load jobs is already running on the Teradata database.

    23. What is Sleep?

    Specifies the number of minutes that Teradata FLOAD pauses before retrying a logon operation.

    24. What is database skewing?

    Skew occurs when the primary index column selected is not a good candidate. That is, if the PI selected for a table has highly non-unique values, the skew factor will be high. By default (an even distribution) it will be zero; a skew factor greater than 2 is not a good sign.
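To make the idea of a skew factor concrete, here is a minimal sketch. The formula used, `(1 - average / maximum) * 100` over per-AMP row counts, is a commonly quoted definition and is an assumption here; the source does not state how the number is calculated, and `skew_factor` is a hypothetical helper name.

```python
def skew_factor(rows_per_amp):
    """Return 0 for a perfectly even spread, approaching 100 as one AMP dominates."""
    busiest = max(rows_per_amp)
    if busiest == 0:
        return 0.0  # empty table: nothing is skewed
    average = sum(rows_per_amp) / len(rows_per_amp)
    return (1 - average / busiest) * 100
```

With four AMPs holding 100 rows each the function returns 0.0, and with all 400 rows on a single AMP it returns 75.0.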

    25. What is Soft Referential Integrity and Batch Referential Integrity?

    Soft Referential Integrity:
    It provides a mechanism to allow user-specified Referential Integrity (RI) constraints that are not enforced by the database. It enables optimization techniques such as Join Elimination.

    Batch Referential Integrity:
    Tests an entire insert, delete, or update batch operation for referential integrity. If insertion, deletion, or update of any row in the batch violates referential integrity, then parsing engine software rolls back the entire batch and returns an abort message.

    26. Difference between MLOAD and FLOAD

    MLOAD:

    It does the loading in 5 phases:
    Phase 1: It gets the import file and checks the script.
    Phase 2: It reads the records from the base table and stores them in the work table.
    Phase 3: In this application phase it locks the table header.
    Phase 4: The operation is performed on the tables.
    Phase 5: The table locks are released and the work tables are dropped.

    MultiLoad allows non-unique secondary indexes and automatically rebuilds them after loading.

    MultiLoad can load at most 5 tables at a time and can also update and delete the data.


    FastLoad:

    FastLoad loads the data in 2 phases and needs no work table, so it is faster. It follows the steps below to load the data into the table:
    Phase 1: It moves all the records to all the AMPs first, without any hashing.
    Phase 2: After the end loading command is given, each AMP hashes its records and sends them to the appropriate AMPs.

    FastLoad is used to load empty tables and is very fast; it can load one table at a time.

    27. Advantages of PPI

    PPI: Partitioned Primary Index.

    When the column on which a table is partitioned is also given as the primary index, then:

    If there are more partitions, it will be faster to scan the table, and the PI value itself can be used to do so.

    28. Disadvantages of PPI

    If no partition is declared for the row to be inserted, then it is a waste to declare the primary index itself. It is better to use a secondary index on the partition column for better performance.

    29.

    domain. This may involve de-normalized tables.

    For instance, if the Employee table contained a column for the manager's employee number and the manager is an employee, these two columns have the same domain. By joining on these two columns in the Employee table, the managers can be joined to the employees.

    Example:

        SELECT Mgr.Last_name (Title 'Manager Name', format 'X(10)'),
               Department_name (Title 'For Department ')
        FROM Employee_table AS Emp
        INNER JOIN Employee_table AS Mgr
            ON Emp.Manager_Emp_ID = Mgr.Employee_Number
        INNER JOIN Department_table AS Dept
            ON Emp.Department_number = Dept.Department_number
        ORDER BY 2;

    INNER JOIN:

    The INNER JOIN keyword returns rows when there is at least one match in both tables.

    INNER JOIN Syntax:

        SELECT column_name(s) FROM table_name1 INNER JOIN table_name2
        ON table_name1.column_name = table_name2.column_name

    LEFT OUTER JOIN:

    The LEFT OUTER JOIN keyword returns all rows from the left table (table_name1), even if there are no matches in the right table (table_name2).

    LEFT OUTER JOIN Syntax:

        SELECT column_name(s) FROM table_name1 LEFT OUTER JOIN table_name2
        ON table_name1.column_name = table_name2.column_name

    RIGHT OUTER JOIN:

    The RIGHT OUTER JOIN keyword returns all rows from the right table (table_name2), even if there are no matches in the left table (table_name1).

    RIGHT OUTER JOIN Syntax:

        SELECT column_name(s) FROM table_name1 RIGHT OUTER JOIN table_name2
        ON table_name1.column_name = table_name2.column_name

    FULL OUTER JOIN:

    The FULL OUTER JOIN keyword returns rows when there is a match in one of the tables.

    FULL OUTER JOIN Syntax:

        SELECT column_name(s) FROM table_name1 FULL OUTER JOIN table_name2
        ON table_name1.column_name = table_name2.column_name

    A FULL OUTER JOIN uses both of the tables as outer tables. The exceptions are returned from both tables, and the missing column values from either table are extended with NULL.

    Product Join

    It is very important to use an equality condition in the WHERE clause. Otherwise you get a product join. This means that one row of a table is joined to multiple rows of another table. A mathematical product means that multiplication is used.
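The join types above differ only in which unmatched rows they keep. The sketch below models them on lists of dictionaries, with None standing in for SQL NULL. It is an illustration of the semantics, not of how Teradata executes joins; the `join` and `product_join` helpers and the sample data are made up for this example.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`; how is 'inner', 'left', 'right' or 'full'."""
    left_cols = {c for row in left for c in row}
    right_cols = {c for row in right for c in row}
    out = []
    matched_right = set()
    for lrow in left:
        hits = [i for i, rrow in enumerate(right) if rrow[key] == lrow[key]]
        for i in hits:
            matched_right.add(i)
            out.append({**right[i], **lrow})
        if not hits and how in ("left", "full"):
            # keep the unmatched left row, padding the right-side columns with None
            out.append({**{c: None for c in right_cols}, **lrow})
    if how in ("right", "full"):
        for i, rrow in enumerate(right):
            if i not in matched_right:
                # keep the unmatched right row, padding the left-side columns with None
                out.append({**{c: None for c in left_cols}, **rrow})
    return out


def product_join(left, right):
    """No join condition: every left row is paired with every right row."""
    return [(lrow, rrow) for lrow in left for rrow in right]


employees = [
    {"dept": 10, "name": "Ann"},
    {"dept": 20, "name": "Bob"},
    {"dept": 40, "name": "Cy"},
]
departments = [
    {"dept": 10, "dname": "Sales"},
    {"dept": 20, "dname": "Ops"},
    {"dept": 30, "dname": "HR"},
]
```

With this data the inner join yields 2 rows, the left and right outer joins 3 each, the full outer join 4, and the product join 3 x 3 = 9, which is why a missing join condition gets expensive quickly.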

    30. Difference between primary index and secondary index?

    1. A primary index cannot be created after table creation, whereas a secondary index can be created dynamically.
    2. A primary index access is a 1-AMP operation, a unique secondary index access is a 2-AMP operation, and a non-unique secondary index access is an all-AMP operation.

    31. What are Journals?

    Journaling is a data protection mechanism in Teradata. Journals are generated to maintain pre-images and post-images of a transaction starting/ending at/from a checkpoint. When a transaction fails, the table is restored back to the last available checkpoint using the journal images.

    There are two types of journals: (1) Permanent (2) Transient.

    The purpose of the permanent journal is to provide selective or full database recovery to a specified point in time. It permits recovery from unexpected hardware or software disasters. The permanent journal also reduces the need for full table backups, which can be costly in both time and resources.

    1. Permanent journals are explicitly created during database and/or table creation time. This journaling can be implemented depending upon the need and available disk space.

    PJ processing is a user-selectable option on a database which allows the user to select extra journaling for changes made to a table. There are more options, and the data can be rolled forward or backward (depending on whether you selected the correct options) at points of the customer's choosing. They are permanent because the changes are kept until the customer deletes them or unloads them to a backup tape. They are usually kept in conjunction with backups of the database and allow partial rollback or roll forward for some corrupted data or an operational error, like someone deleting a month's worth of data because they messed up the WHERE clause.

    2. Transient Journal

    The transient journal permits the successful rollback of a failed transaction (TXN). Transactions are not committed to the database until the AMPs have received an End Transaction request, either implicitly or explicitly. There is always the possibility that the transaction may fail. If so, the participating table(s) must be restored to their pre-transaction state.

    The transient journal maintains a copy of before images of all rows affected by the transaction. In the event of transaction failure, the before images are reapplied to the affected tables, then are deleted from the journal, and a rollback operation is completed. In the event of transaction success, the before images for the transaction are discarded from the journal at the point of transaction commit.

    Transient Journal activities are automatic and transparent to the user.
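The before-image mechanism can be shown with a toy in-memory table. This is only a sketch of the idea under simple assumptions: one transaction at a time, a dict as the table, and None as the marker for "row did not exist". The class and method names are hypothetical.

```python
class JournaledTable:
    """A dict-backed table that logs before images so a failed transaction can be undone."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.journal = []  # (key, before_image); before_image None means the row was absent

    def write(self, key, value):
        """Insert or update a row; passing value=None deletes it."""
        self.journal.append((key, self.rows.get(key)))
        if value is None:
            self.rows.pop(key, None)
        else:
            self.rows[key] = value

    def commit(self):
        """Transaction succeeded: the before images are simply discarded."""
        self.journal.clear()

    def rollback(self):
        """Transaction failed: reapply before images in reverse order, emptying the journal."""
        while self.journal:
            key, before = self.journal.pop()
            if before is None:
                self.rows.pop(key, None)
            else:
                self.rows[key] = before
```

After an update, an insert and a delete followed by `rollback()`, the table is back in its pre-transaction state and the journal is empty; after `commit()`, a later `rollback()` has nothing left to undo.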


    32. Teradata fast export script?

        .LOGTABLE RestartLog1_fxp;
        .RUN FILE logon;
        .BEGIN EXPORT SESSIONS 4;
        .LAYOUT Record_Layout;
        .FIELD in_City 1 CHAR(20);
        .FIELD in_Zip * CHAR(5);
        .IMPORT INFILE city_zip_infile LAYOUT Record_Layout;
        .EXPORT OUTFILE cust_acct_outfile2;
        SELECT A.Account_Number,
               C.Last_Name,
               C.First_Name,
               A.Balance_Current
        FROM Accounts A INNER JOIN
             Accounts_Customer AC INNER JOIN
             Customer C
             ON C.Customer_Number = AC.Customer_Number
             ON A.Account_Number = AC.Account_Number
        WHERE A.City = :in_City
          AND A.Zip_Code = :in_Zip
        ORDER BY 1;
        .END EXPORT;
        .LOGOFF;

    33. Teradata statistics.

    Statistics collection is essential for the optimal performance of the Teradata query optimizer. The query optimizer relies on statistics to help it determine the best way to access data. Statistics also help the optimizer ascertain how many rows exist in the tables being queried and predict how many rows will qualify for given conditions. Lack of statistics, or outdated statistics, might result in the optimizer choosing a less-than-optimal method for accessing data tables.

    Points:

    1: Once a collect stats is done on the table (on index or column), where is this information stored so that the optimizer can refer to it?

    Ans: Collected statistics are stored in DBC.TVFields or DBC.Indexes. However, you cannot query these two tables.

    2: How often does collect stats have to be run for a table that is frequently updated?

    Answer: You need to refresh stats when 5 to 10% of the table's rows have changed. Collect stats can be pretty resource-consuming for large tables, so it is always advisable to schedule the job at an off-peak period, normally after approximately 10% of the data changes.
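The refresh rule of thumb above is easy to turn into a check that a scheduled job could run. This is a minimal sketch: `stats_are_stale` is a hypothetical helper, and the row counts are assumed to come from whatever change tracking is available; it does not call any Teradata API.

```python
def stats_are_stale(rows_at_collect, rows_changed, threshold_pct=10):
    """Return True once the changed rows reach threshold_pct percent of the table."""
    if rows_at_collect == 0:
        # stats collected on an empty table are stale as soon as anything changes
        return rows_changed > 0
    # integer arithmetic avoids floating-point edge cases at the exact threshold
    return rows_changed * 100 >= threshold_pct * rows_at_collect
```

Passing `threshold_pct=5` gives the stricter end of the 5 to 10% range mentioned in the answer.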

    3: Once a collect stats has been done on the table, how can I be sure that the optimizer is considering it before execution? That is, until the next collect stats has been done, will the optimizer refer to it?

    Ans: Yes, the optimizer will use stats data for the query execution plan if available. That is why stale stats are dangerous: they may mislead the optimizer.

    What is a HOT AMP

    When the workload is not distributed across all the AMPs, only a few AMPs end up overburdened with the work. This is a hot AMP condition.
    This typically occurs when the volume of data you are dealing with is high and
    (a). You are trying to retrieve the data in a Teradata table which is not well distributed across the AMPs on the system (bad Primary Index)
    OR
    (b). You are trying to join on a column with highly non-unique values
    OR
    (c). You apply the DISTINCT operator on a column with highly non-unique values

    4: How can I know the tables for which the collect stats has been done?

    Ans: You run the Help Stats command on that table, e.g. HELP STATISTICS TABLE_NAME; this will give you the date and time when stats were last collected. You will also see stats for the columns (for which stats were defined) for the table. You can use Teradata Manager too.

    5: To what extent will there be performance issues when a collect stats is not done? Can a performance issue be related only to collect stats? Probably a HOT AMP could be the reason for lack of spool space, which is leading to performance degradation?

    Ans: 1st part: Teradata uses a cost-based optimizer, and cost estimates are done based on statistics. So if you don't have statistics collected, the optimizer will use a Dynamic AMP Sampling method to get the stats. If your table is big and the data was unevenly distributed, then dynamic sampling may not get the right information and your performance will suffer.

    2nd part: No, performance could be related to bad selection of indexes (most importantly PI) and the access path of a particular query.

    6: Also let me know what can lead to lack of spool space apart from HOT AMP?

    Ans: One reason comes to my mind: a product join on two big data sets may lead to the lack of spool space.

    34. Where will you define error tables in the script?

    In FLOAD and MLOAD we define them in the BEGIN LOADING statement.

    35. I have to load data daily. Which load utility will be good?

    TPUMP.

    36. What are the different SPACES available in Teradata?

    Perm Space
    Temp Space
    Spool Space

    Perm Space: All databases have a defined upper limit of permanent space. Permanent space is used for storing the data rows of tables. Perm space is not pre-allocated. It represents a maximum limit.

    Spool Space: All databases also have an upper limit of spool space. If there is no limit defined for a particular database or user, limits are inherited from parents. Theoretically, a user could use all unallocated space in the system for their query. Spool space is temporary space used to hold intermediate query results or formatted answer sets to queries. Once the query is complete, the spool space is released.

    Example: You have a database with total disk space of 100GB. You have 10GB of user data and an additional 10GB of overhead. What is the maximum amount of spool space available for queries?

    Answer: 80GB. All of the remaining space in the system is available for spool.

    Temp Space: The third type of space is temporary space. Temp space is used for Global and Volatile temporary tables, and these results remain available to the user until the session is terminated. Tables created in temp space will survive a restart.
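The spool space example above is plain subtraction, which the following sketch spells out. `max_spool_gb` is a hypothetical helper written for this note; it models only the arithmetic in the example, not per-user spool limits.

```python
def max_spool_gb(total_gb, user_data_gb, overhead_gb):
    """Space left for spool once user data and overhead are subtracted from the total."""
    spool = total_gb - user_data_gb - overhead_gb
    if spool < 0:
        raise ValueError("user data and overhead exceed the total disk space")
    return spool
```

With the numbers from the example, `max_spool_gb(100, 10, 10)` gives 80.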

    49.different options that 2e can specif& in C3EATE tale statement5 !here are two different ta$e t,pe phi$osophies so there are two different t,pe ta$es. !he, are 'E!and #0!-'E!. -t has een said+ 5A man with one watch knows the time+ ut a man with two watches is

  • 8/13/2019 Teradata Interview Prep Questions

    19/52

    never sure5. hen !eradata was origina$$, designed it did not a$$ow dup$icate rows in a ta$e. -f an, row inthe same ta$e had the same va$ues in ever, co$umn !eradata wou$d throw one of the rows out. !he,e$ieved a second row was a mistake. h, wou$d someone need two watches and wh, wou$d someoneneed two rows e8act$, the same !his is 'E! theor, and a 'E! ta$e kicks out dup$icate rows.!he A'- standard e$ieved in a different phi$osoph,. -f two rows are entered into a ta$e that are e8actdup$icates then this is accepta$e. -f a person wants to wear two watches then the, proa$, have agood reason. !his is a #0!-'E! ta$e and dup$icate rows are a$$owed. -f ,ou do not specif, 'E! or

    #0!-'E!+ one is used as a defau$t. Iere is the issue9 the defau$t in Teradata mode is (ETand thedefau$t in AN(0 mode is MI-T0(ET.

    !herefore+ to e$iminate confusion it is important to e8p$icit$, define which one is desired. /therwise+ ,oumust know in which mode the C4EA!E !ABE wi$$ e8ecute in so that the correct t,pe is used for eachta$e. !he imp$ication of using a 'E! or #0!-'E! ta$e is discussed further.

    (ET and MI-T0(ET Tales

    A SET table does not allow duplicate rows, so Teradata checks to ensure that no two rows in a table are exactly the same. This can be a burden. One way around the duplicate row check is to have a column in the table defined as UNIQUE. This could be a Unique Primary Index (UPI), Unique Secondary Index (USI) or even a column with a UNIQUE or PRIMARY KEY constraint. Since all must be unique, a duplicate row may never exist. Therefore, the check on either the index or constraint eliminates the need for the row to be examined for uniqueness. As a result, inserting new rows can be much faster by eliminating the duplicate row check.

    However, if the table is defined with a NUPI and the table uses SET as the table type, now a duplicate row check must be performed. Since SET tables do not allow duplicate rows, a check must be performed every time a NUPI DUP (duplicate of an existing row NUPI value) value is inserted or updated in the table. Do not be fooled! A duplicate row check can be a very expensive operation in terms of processing time. This is because every new row inserted must be checked to see if it is a duplicate of any existing row with the same NUPI Row Hash value. The number of checks per insert grows with the number of rows already stored under that row hash, so the total work grows roughly quadratically as rows are added to the table.

    What is the solution? There are two: either make the table a MULTISET table (only if you want duplicate rows to be possible) or define at least one column or composite columns as UNIQUE. If neither is an option then the SET table with no unique columns will work, but inserts and updates will take more time because of the mandatory duplicate row check.
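The cost of the duplicate row check described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `SetTable` is a hypothetical toy class, not Teradata code, and it models a SET table as a dictionary of rows grouped by primary index value. It counts how many full-row comparisons are needed when many rows share one NUPI value versus when the index is unique.

```python
from collections import defaultdict


class SetTable:
    """Toy model of a SET table (hypothetical, for illustration only)."""

    def __init__(self, unique_pi=False):
        self.unique_pi = unique_pi
        self.rows_by_pi = defaultdict(list)  # PI value -> rows sharing that value
        self.row_comparisons = 0             # full-row comparisons performed

    def insert(self, pi_value, row):
        bucket = self.rows_by_pi[pi_value]
        if self.unique_pi:
            # A unique index rejects the row on the index value alone,
            # so no full-row comparison is needed.
            if bucket:
                return False
        else:
            # NUPI: compare the new row with every stored row that has
            # the same PI value before accepting it.
            for existing in bucket:
                self.row_comparisons += 1
                if existing == row:
                    return False
        bucket.append(row)
        return True


# 1000 distinct rows that all share the same NUPI value (dept = 10).
nupi_table = SetTable(unique_pi=False)
for i in range(1000):
    nupi_table.insert(10, (10, f"name{i}"))

# 1000 rows with a unique primary index value each.
upi_table = SetTable(unique_pi=True)
for i in range(1000):
    upi_table.insert(i, (i, f"name{i}"))

print(nupi_table.row_comparisons, upi_table.row_comparisons)
```

In this toy model the NUPI table performs 0 + 1 + ... + 999 comparisons while the uniquely indexed table performs none, which is the effect the text attributes to defining at least one unique column.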

    Below is an example of creating a SET table:

    CREATE SET Table TomC.employee
    ( emp INTEGER
    ,dept INTEGER
    ,lname CHAR(20)
    ,fname VARCHAR(20)
    ,salary DECIMAL(10,2)
    ,hire_date DATE )
    UNIQUE PRIMARY INDEX(emp);

    Notice the UNIQUE PRIMARY INDEX on the column emp. Because this is a SET table it is much more efficient to have at least one unique key so the duplicate row check is eliminated.

    The following is an example of creating the same table as before, but this time as a MULTISET table:

    CREATE MULTISET TABLE employee
    ( emp INTEGER
    ,dept INTEGER
    ,lname CHAR(20)
    ,fname VARCHAR(20)
    ,salary DECIMAL(10,2)
    ,hire_date DATE )
    PRIMARY INDEX(emp);

    Notice also that the PI is now a NUPI because it does not use the word UNIQUE. This is important! As mentioned previously, if the UPI is requested, no duplicate rows can be inserted. Therefore, it acts more like a SET table. This MULTISET example allows duplicate rows. Inserts do not pay for a duplicate row check, because a MULTISET table does not perform one.

    38. What is a macro? Advantages of it.

    Macros: A macro is a predefined, stored set of one or more SQL commands and report-formatting commands. Macros are used to simplify the execution of frequently used SQL commands. Macros do not require permanent space.

    39. How Does Teradata Store Rows?

    • Teradata uses hash partitioning and distribution to randomly and evenly distribute data across all AMPs.
    • The rows of every table are distributed among all AMPs, and ideally will be evenly distributed among all AMPs.
    • Each AMP is responsible for a subset of the rows of each table.
    • Evenly distributed tables result in evenly distributed workloads.
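The points above can be sketched in a few lines. The example is an illustration only: `amp_for` and `distribute` are hypothetical helpers, and MD5 is used merely as a convenient stand-in hash, not as Teradata's actual row-hash algorithm. It shows that hashing a near-unique key spreads rows evenly over the AMPs, while a key with only two distinct values can reach at most two AMPs.

```python
import hashlib
from collections import Counter


def amp_for(pi_value, num_amps):
    """Map a primary index value to an AMP number (illustrative hash only)."""
    digest = hashlib.md5(str(pi_value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def distribute(pi_values, num_amps):
    """Return the number of rows that land on each AMP."""
    counts = Counter(amp_for(value, num_amps) for value in pi_values)
    return [counts.get(amp, 0) for amp in range(num_amps)]


# 10,000 rows with a unique key: every AMP gets a similar share.
unique_keys = distribute(range(10000), 4)

# 10,000 rows with only two distinct key values: at most two AMPs hold data.
skewed_keys = distribute([i % 2 for i in range(10000)], 4)

print(unique_keys)
print(skewed_keys)
```

Because the same key value always hashes to the same AMP, the choice of primary index column directly decides how evenly the workload is spread.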

    Fallback, Down AMP recovery journal

    Hi,

    When a Fallback-protected AMP goes down during a write operation, the update takes place in the Fallback AMP in the same cluster, so that the original AMP can be updated later when it recovers.

    When an AMP goes down, the updates are also recorded in the Down AMP Recovery journal, to be applied later when the AMP recovers.

    My doubt is: when an AMP goes down, are the updates made in both the Fallback AMP and the Down AMP recovery journal?

    Because if yes, it looks like a redundant recovery measure. Or is it that the Down AMP Recovery journal is used only for non-Fallback-protected AMPs, or for Fallback-protected AMPs when both the AMPs in the cluster are down?

    Regards,
    Annal T

    Hi Annal,

    According to my knowledge:

    1. The Down AMP recovery journal starts when an AMP goes down, in order to restore the data for the down AMP.
    2. Fallback is like having redundant data: if one AMP in the cluster goes down, it won't affect your queries. The query will use data from the fallback rows. The down AMP won't be updated; the data from fallback is used.

    For your doubt: when the AMP is down and you run the update, the fallback rows will be updated. If the AMP is still down and you run the query, the query will use the updated rows. Whenever the down AMP becomes active, it will use the Down AMP recovery journal and its data will be updated.

    Hope this helps.

    Regards,
    Syam Prasad K

    41. Which one will take care when an AMP goes down?

    1. The Down AMP recovery journal starts when an AMP goes down, in order to restore the data for the down AMP.
    2. Fallback is like having redundant data: if one AMP in the cluster goes down, it won't affect your queries. The query will use data from the fallback rows. The down AMP won't be updated; the data from fallback is used.

    When the AMP is down and you run an update, the fallback rows will be updated. If the AMP is still down and you run the query, the query will use the updated rows. Whenever the down AMP becomes active, it will use the Down AMP recovery journal and its data will be updated.

    42. Which one will take care when a NODE goes down?

    In the event of node failure, all virtual processors can migrate to another available node in the clique. All nodes in the clique must have access to the same disk arrays.

    43. What is the use of EXPLAIN plan?

    The EXPLAIN facility allows you to preview how Teradata will execute a requested query. It returns a summary of the steps the Teradata RDBMS would perform to execute the request.

    EXPLAIN also discloses the strategy and access method to be used, how many rows will be involved, and its cost in minutes and seconds. Use EXPLAIN to evaluate a query's performance and to develop an alternative processing strategy that may be more efficient. EXPLAIN works on any SQL request. The request is fully parsed and optimized, but not run. The complete plan is returned to the user in readable English statements.

    EXPLAIN provides information about locking, sorting, row selection criteria, join strategy and conditions, access method, and parallel step processing. EXPLAIN is useful for performance tuning, debugging, pre-validation of requests, and for technical training.

    44. Use of COALESCE function?

    The newer ANSI standard COALESCE can also convert a NULL to a zero. However, it can convert a NULL value to any data value as well. The COALESCE searches a value list, ranging from one to many values, and returns the first non-NULL value it finds. At the same time, it returns a NULL if all values in the list are NULL.

    To use the COALESCE, the SQL must pass the name of a column to the function. The data in the column is then compared for a NULL. Although one column name is all that is required, normally more than one column is passed to it. Additionally, a literal value, which is never NULL, can be returned to provide a default value if all of the previous column values are NULL.

    The syntax for the COALESCE follows:

    SELECT COALESCE (<column-list> [,<literal>] )
    ,<Aggregate>( COALESCE(<column-list>[,<literal>] ) )
    FROM <table-name>
    GROUP BY 1 ;

    In the above syntax the <column-list> is a list of columns. It is written as a series of column names separated by commas.

    SELECT COALESCE(NULL,0) AS Col1
    ,COALESCE(NULL,NULL,NULL) AS Col2
    ,COALESCE(3) AS Col3
    ,COALESCE('A',3) AS Col4 ;
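The "first non-NULL value, else NULL" rule can be checked with a small runnable sketch. It is an assumption-laden illustration: it uses Python's built-in sqlite3 module rather than Teradata, and the `contact` table and its columns are hypothetical. SQLite follows the same COALESCE semantics for two or more arguments, which is all the example relies on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (name TEXT, home_phone TEXT, work_phone TEXT)")
conn.executemany(
    "INSERT INTO contact VALUES (?, ?, ?)",
    [
        ("Ann", "555-0100", None),   # home phone present
        ("Bob", None, "555-0199"),   # only work phone present
        ("Cy", None, None),          # both NULL, so the literal default is used
    ],
)

# COALESCE returns the first non-NULL argument, reading left to right.
rows = conn.execute(
    "SELECT name, COALESCE(home_phone, work_phone, 'unknown') "
    "FROM contact ORDER BY name"
).fetchall()

# When every argument is NULL, the result is NULL (None in Python).
all_null = conn.execute("SELECT COALESCE(NULL, NULL, NULL)").fetchone()[0]

print(rows)
print(all_null)
```

Putting the literal last mirrors the advice in the text: a literal is never NULL, so it acts as the default when all the listed columns are NULL.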

    45. Difference between role, privilege and profile?

    A role can be assigned a collection of access rights in the same way a user can. You then grant the role to a set of users, rather than grant each user the same rights.

    This cuts down on maintenance, adds standardisation (hence reducing erroneous access to sensitive data) and reduces the size of the dbc.allrights table, which is very important in reducing DBC locking in a large environment.

    Profiles assign different characteristics on a User, such as spool space, perm space and account strings. Again this helps with standardisation. Note that spool assigned to a profile will overrule spool assigned on a create user statement. Check the online manuals for the full lists of properties.

    Data Control Language is used to restrict or permit a user's access. It can selectively limit a user's ability to retrieve, add, or modify data. It is used to grant and revoke access privileges on tables and views.

    46. Difference between database and user?

    Both may own objects such as tables, views, macros, procedures, and functions. Both users and databases may hold privileges. However, only users may log on, establish a session with the Teradata Database, and submit requests.

    A user performs actions whereas a database is passive. Users have passwords and startup strings; databases do not. Users can log on to the Teradata Database, establish sessions, and submit SQL statements; databases cannot.

    Creator privileges are associated only with a user because only a user can log on and submit a CREATE statement. Implicit privileges are associated with either a database or a user because each can hold an object and an object is owned by the named space in which it resides.

    47. How many mload scripts are required for the below scenario?

    First I want to load data from source to a volatile table. After that I want to load data from the volatile table to a permanent table.

    48. What are the types of CASE statements available in Teradata?

    The CASE function provides an additional level of data testing after a row is accepted by the WHERE clause. The additional test allows for multiple comparisons on multiple columns with multiple outcomes. It also incorporates logic to handle a situation in which none of the values compares equal.

    When using CASE, each row retrieved is evaluated once by every CASE function. Therefore, if two CASE operations are in the same SQL statement, each row has a column checked twice, or two different values each checked one time.

    The basic syntax of the CASE follows:

    CASE <column-name>
    WHEN <value1> THEN <true-result1>
    WHEN <value2> THEN <true-result2>
    WHEN <valueN> THEN <true-resultN>
    [ ELSE <false-result> ]
    END

    Types:

    1. Flexible Comparisons within CASE

    When it is necessary to compare more than just equal conditions within the CASE, the format is modified slightly to handle the comparison. Many people prefer to use the following format because it is more flexible and can compare inequalities as well as equalities.

    This is a more flexible form of the CASE syntax and allows for inequality tests:

    CASE
    WHEN <condition-test1> THEN <true-result1>
    WHEN <condition-test2> THEN <true-result2>
    WHEN <condition-testN> THEN <true-resultN>
    [ ELSE <false-result> ]
    END

    The above syntax shows that multiple tests can be made within each CASE. The value stored in the column continues to be tested until it finds a true condition. At that point, it does the THEN portion and exits the CASE logic by going directly to the END.
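Both CASE forms described so far can be tried side by side in a short sketch. The example runs on Python's built-in sqlite3 module as a stand-in for Teradata, and the `student` table with its sample rows is hypothetical. The first CASE is the valued form (equality against one column), the second is the searched form (independent conditions, evaluated top to bottom until one is true).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (last_name TEXT, class_code TEXT, grade_pt REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        ("Hanson", "FR", 1.5),
        ("Larkins", "JR", 3.0),
        ("Wilson", "SR", 4.0),
        ("Smith", None, None),
    ],
)

rows = conn.execute(
    """
    SELECT last_name,
           -- Valued CASE: compares class_code for equality only.
           CASE class_code WHEN 'FR' THEN 'Freshman'
                           WHEN 'JR' THEN 'Junior'
                           ELSE 'Other' END AS class_name,
           -- Searched CASE: the first true condition wins, then CASE exits.
           CASE WHEN grade_pt IS NULL THEN 'Unknown'
                WHEN grade_pt < 2     THEN 'Failing'
                WHEN grade_pt < 3.5   THEN 'Passing'
                ELSE 'Exceeding' END AS status
    FROM student
    ORDER BY last_name
    """
).fetchall()

for row in rows:
    print(row)
```

Note that the NULL class_code falls through to ELSE in the valued form, because a NULL never compares equal to a value; the searched form is needed to test for NULL explicitly.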

    2. Comparison Operators within CASE

    In this section, we will investigate adding more power to the CASE statement. In the above examples, a literal value was returned. In most cases, it is necessary to return data. The returned value can come from a column name just like any selected column or a mathematical operation.

    Additionally, the above examples used a literal "=" as the comparison operator. The CASE comparisons also allow the use of IN, BETWEEN, NULLIF and COALESCE. In reality, the BETWEEN is a compound comparison. It checks for values that are greater than or equal to the first number and less than or equal to the second number.

    The next example uses both formats of the CASE in a single SELECT with each one producing a column display. It also uses AS to establish an alias after the END:

    SELECT CASE WHEN Grade_pt IS NULL THEN 'Grade Point Unknown'
                WHEN Grade_pt IN (1,2,3) THEN 'Integer GPA'
                WHEN Grade_pt BETWEEN 1 AND 2 THEN 'Low Decimal value'
                WHEN Grade_pt [...]

    [...]
                WHEN 'FR' THEN Grade_pt
                ELSE NULL END) (format 'Z.ZZ') AS Freshman_GPA
    ,AVG(CASE Class_code
                WHEN 'SO' THEN Grade_pt
                ELSE NULL END) (format 'Z.ZZ') AS Sophomore_GPA
    ,AVG(CASE Class_code WHEN 'JR' THEN Grade_pt
                ELSE NULL END) (format 'Z.ZZ') AS Junior_GPA
    ,AVG(CASE Class_code
                WHEN 'SR' THEN Grade_pt
                ELSE NULL END) (format 'Z.ZZ') AS Senior_GPA
    FROM Student_Table
    WHERE Class_code IS NOT NULL ;

    4. Nested CASE Expressions

    After becoming comfortable with the previous examples of the CASE, it may become apparent that a single check on a column is not sufficient for more complicated requests. When that is the situation, one CASE can be embedded within another. This is called nested CASE statements.

    The CASE may be nested to check data in a second column in a second CASE before determining what value to return. It is common to have more than one CASE in a single SQL statement. However, it is powerful enough to have a CASE statement within a CASE statement.

    Example:

    SELECT Last_name
    ,CASE Class_code WHEN 'JR'
          THEN 'Junior ' || (CASE WHEN Grade_pt < 2 THEN 'Failing'
                                  WHEN Grade_pt < 3.5 THEN 'Passing'
                                  ELSE 'Exceeding' END)
          ELSE 'Senior ' || (CASE WHEN Grade_pt < 2 THEN 'Failing'
                                  WHEN Grade_pt < 3.5 THEN 'Passing'
                                  ELSE 'Exceeding' END)
     END AS Current_Status
    FROM Student_Table
    WHERE Class_code IN ('JR','SR')
    ORDER BY class_code, last_name;

    49. [...]

    ,DROP Pref;

    ALTER TABLE (Rename): Syntax

    ALTER TABLE employee RENAME Street TO StreetAddr;

    50. Mention the order of SQL execution.

    SELECT - WHERE - GROUP BY - HAVING - ORDER BY clause
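The clause order listed above can be observed in a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for Teradata, and the `emp` table and its values are hypothetical. The query filters rows with WHERE, groups what is left, filters the groups with HAVING, and sorts last.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (dept INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(10, 100), (10, 900), (20, 400), (20, 500), (30, 50)],
)

rows = conn.execute(
    """
    SELECT dept, SUM(salary) AS total
    FROM emp
    WHERE salary >= 100          -- row filter: removes the dept 30 row
    GROUP BY dept                -- groups the surviving rows
    HAVING SUM(salary) > 500     -- group filter: applied to the sums
    ORDER BY total DESC          -- sorting happens last
    """
).fetchall()

print(rows)
```

Department 30 never reaches the GROUP BY step because its only row is rejected by WHERE, whereas HAVING can only see the per-department totals.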

    51. What is the SQL to find the base AMP no. of records stored for a particular table?

    52. When a PI is not mentioned on a table, how will Teradata consider the PI for that table?

    If you don't specify a PI at table create time then Teradata must choose one. For instance, if the DDL is ported from another database that uses a Primary Key instead of a Primary Index, the CREATE TABLE contains a PRIMARY KEY (PK) constraint. Teradata is smart enough to know that Primary Keys must be unique and cannot be null. So, the first level of default is to use the PRIMARY KEY column(s) as a UPI.

    If the DDL defines no PRIMARY KEY, Teradata looks for a column defined as UNIQUE. As a second level default, Teradata uses the first column defined with a UNIQUE constraint as a UPI.

    If none of the above attributes are found, Teradata uses the first column defined in the table as a NON-UNIQUE PRIMARY INDEX (NUPI).
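The three levels of default described above amount to a simple decision rule, sketched here in Python. `default_primary_index` is a hypothetical helper written for illustration; it only restates the rule from the text and is not how Teradata is implemented.

```python
def default_primary_index(columns, primary_key=None, unique_columns=()):
    """Return (index type, index columns) following the defaults in the text.

    columns        -- column names in table definition order
    primary_key    -- columns of a PRIMARY KEY constraint, if any
    unique_columns -- names of columns declared UNIQUE, if any
    """
    if primary_key:
        # Level 1: PRIMARY KEY column(s) become a UPI.
        return ("UPI", tuple(primary_key))
    if unique_columns:
        # Level 2: the first column (in definition order) declared UNIQUE.
        first_unique = next(c for c in columns if c in unique_columns)
        return ("UPI", (first_unique,))
    # Level 3: the first column of the table becomes a NUPI.
    return ("NUPI", (columns[0],))


cols = ["emp", "dept", "ssn"]
print(default_primary_index(cols, primary_key=["emp"]))
print(default_primary_index(cols, unique_columns={"ssn", "dept"}))
print(default_primary_index(cols))
```

The last case is the risky one in practice: the first column of a table is rarely chosen with distribution in mind, which is one reason to state the primary index explicitly.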

    53. What is covered query in Teradata?

    If a SELECT query covers all the columns that are defined in the JOIN INDEX as join columns, such queries are called COVERED queries. Multi-column NUSI columns can also be used for a covered query.

    54. What is NUSI bit mapping?

    55. What are data demographics?

    Data demographics give us the information related to frequently updated columns. Data demographics are:
    • maximum rows per value
    • typical rows per value
    • distinct values
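The three demographics listed above can be computed for any column with a few lines of Python. This is a sketch under an assumption: "typical rows per value" is interpreted here as the median number of rows per distinct value, which is a common reading but is not defined in the text. The `demographics` function is a hypothetical helper.

```python
from collections import Counter
from statistics import median


def demographics(values):
    """Summarise how the rows of a column are spread over its distinct values."""
    rows_per_value = Counter(values)
    return {
        "distinct_values": len(rows_per_value),
        "max_rows_per_value": max(rows_per_value.values()),
        # Assumption: 'typical' is taken to be the median rows per value.
        "typical_rows_per_value": median(rows_per_value.values()),
    }


dept_column = ["a", "a", "a", "b", "c", "c"]
print(demographics(dept_column))
```

A large gap between the maximum and the typical rows per value suggests that a few values dominate the column, which matters when the column is a candidate for a non-unique index.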

    56. Difference between logical and physical data modeling?

    Logical Versus Physical Database Modeling

    After all business requirements have been gathered for a proposed database, they must be modeled. Models are created to visually represent the proposed database so that business requirements can easily be associated with database objects to ensure that all requirements have been completely and accurately gathered. Different types of diagrams are typically produced to illustrate the business processes, rules, entities, and organizational units that have been identified. These diagrams often include entity relationship diagrams, process flow diagrams, and server model diagrams. An entity relationship diagram (ERD) represents the entities, or groups of information, and their relationships maintained for a business. Process flow diagrams represent business processes and the flow of data between different processes and entities that have been defined. Server model diagrams represent a detailed picture of the database as being transformed from the business model into a relational database with tables, columns, and constraints. Basically, data modeling serves as a link between business needs and system requirements.

    Two types of data modeling are as follows:

    • Logical modeling
    • Physical modeling

    If you are going to be working with databases, then it is important to understand the difference between logical and physical modeling, and how they relate to one another. Logical and physical modeling are described in more detail in the following subsections.

    Logical Modeling

    Logical modeling deals with gathering business requirements and converting those requirements into a model. The logical model revolves around the needs of the business, not the database, although the needs of the business are used to establish the needs of the database. Logical modeling involves gathering information about business processes, business entities (categories of data), and organizational units. After this information is gathered, diagrams and reports are produced including entity relationship diagrams, business process diagrams, and eventually process flow diagrams. The diagrams produced should show the processes and data that exist, as well as the relationships between business processes and data. Logical modeling should accurately render a visual representation of the activities and data relevant to a particular business.

    The diagrams and documentation generated during logical modeling are used to determine whether the requirements of the business have been completely gathered. Management, developers, and end users alike review these diagrams and documentation to determine if more work is required before physical modeling commences.

    Typical deliverables of logical modeling include:

    • Entity relationship diagrams. An Entity Relationship Diagram is also referred to as an analysis ERD. The point of the initial ERD is to provide the development team with a picture of the different categories of data for the business, as well as how these categories of data are related to one another.

    • Business process diagrams. The process model illustrates all the parent and child processes that are performed by individuals within a company. The process model gives the development team an idea of how data moves within the organization. Because process models illustrate the activities of individuals in the company, the process model can be used to determine how a database application interface is designed.

    • User feedback documentation.

    Physical Modeling

    Physical modeling involves the actual design of a database according to the requirements that were established during logical modeling. Logical modeling mainly involves gathering the requirements of the business, with the latter part of logical modeling directed toward the goals and requirements of the database. Physical modeling deals with the conversion of the logical, or business model, into a relational database model. When physical modeling occurs, objects are being defined at the schema level. A schema is a group of related objects in a database. A database design effort is normally associated with one schema.

    During physical modeling, objects such as tables and columns are created based on entities and attributes that were defined during logical modeling. Constraints are also defined, including primary keys, foreign keys, other unique keys, and check constraints. Views can be created from database tables to summarize data or to simply provide the user with another perspective of certain data. Other objects such as indexes and snapshots can also be defined during physical modeling. Physical modeling is when all the pieces come together to complete the process of defining a database for a business.

    Physical modeling is database software specific, meaning that the objects defined during physical modeling can vary depending on the relational database software being used. For example, most relational database systems have variations with the way data types are represented and the way data is stored, although basic data types are conceptually the same among different implementations. Additionally, some database systems have objects that are not available in other database systems.

    57. What is derived table?

    Derived tables are always local to a single SQL request. They are built dynamically using an additional SELECT within the query. The rows of the derived table are stored in spool and discarded as soon as the query finishes. The DD has no knowledge of derived tables. Therefore, no extra privileges are necessary. Its space comes from the user's spool space.

    Following is a simple example using a derived table named DT with a column alias called avgsal, whose data value is obtained using the AVG aggregation:

    SELECT * FROM (SELECT AVG(salary) FROM Employee_table) DT(avgsal) ;
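A derived table becomes more useful once it is joined back to a real table. The sketch below shows that pattern with Python's built-in sqlite3 module standing in for Teradata; the table contents are hypothetical. One difference is an assumption worth flagging: SQLite does not accept the `DT(avgsal)` column-list form used above, so the alias is given inside the inner SELECT instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_table (last_name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee_table VALUES (?, ?)",
    [("Chambers", 48000.0), ("Coffing", 42000.0), ("Smith", 30000.0)],
)

# The inner SELECT is the derived table: it exists only for this one query.
rows = conn.execute(
    """
    SELECT e.last_name, e.salary, dt.avgsal
    FROM employee_table AS e,
         (SELECT AVG(salary) AS avgsal FROM employee_table) AS dt
    WHERE e.salary > dt.avgsal
    ORDER BY e.last_name
    """
).fetchall()

print(rows)
```

The query lists the employees who earn more than the average without creating any permanent or temporary object, which matches the point that a derived table needs no extra privileges.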

    58. What is the use of WITH CHECK OPTION in Teradata?

    In Teradata, the additional key phrase WITH CHECK OPTION indicates that the WHERE clause conditions of the view should be applied during the execution of an INSERT or UPDATE against the view, so that rows which the view itself could not return are rejected. This is not a concern if views are not used for maintenance activity due to restricted privileges.

    59. [...]

    This means that the user has ensured the following:

    • The PK of the parent table has unique, not null values.
    • The FK of the child table contains only values which are contained in the PK column of the parent table.

    Soft RI

    • Does not create or maintain reference indexes
    • Does not validate referencing constraints

    By allowing the optimizer to assume that RI constraints are implicitly in force (even though no formal RI is assigned to the table), you enable the optimizer to eliminate join steps in queries such as the one seen previously.

    Implementing Soft RI

    Soft RI is implemented using slightly different syntax than standard RI. The REFERENCES clause for the column definition will add the key words 'WITH NO CHECK OPTION'.

    Examples

    Create the employee table with a soft RI reference to the department table.

    CREATE TABLE employee
    ( employee_number INTEGER NOT NULL,
      manager_employee_number INTEGER,
      department_number INTEGER,
      job_code INTEGER,
      last_name CHAR(20) NOT NULL,
      first_name VARCHAR(30) NOT NULL,
      hire_date DATE NOT NULL,
      birthdate DATE NOT NULL,
      salary_amount DECIMAL(10,2) NOT NULL,
      FOREIGN KEY ( department_number ) REFERENCES WITH NO CHECK OPTION department( department_number ))
    UNIQUE PRIMARY INDEX (employee_number);

    The parent table must be created with a unique, not null referenced column. Either of the examples below may be used.

    CREATE TABLE department
    ( department_number INTEGER NOT NULL CONSTRAINT primary_1 PRIMARY KEY,
      department_name CHAR(30) UPPERCASE NOT NULL UNIQUE,
      budget_amount DECIMAL(10,2),
      manager_employee_number INTEGER);

    CREATE TABLE department
    ( department_number INTEGER NOT NULL,
      department_name CHAR(30) UPPERCASE NOT NULL UNIQUE,
      budget_amount DECIMAL(10,2),
      manager_employee_number INTEGER)
    UNIQUE PRIMARY INDEX (department_number);

    Executing the same query as before, notice the join elimination step takes place just as it did when standard RI was enforced.

    Find all employees in valid departments.

    EXPLAIN SELECT employee_number
    , department_number
    FROM employee e, department d
    WHERE e.department_number = d.department_number
    ORDER BY 2,1;

    An EXPLAIN of this query produces the following partial result:

    3) We do an all-AMPs RETRIEVE step from SQL[...].e by way of an all-rows scan with a condition of ("NOT (SQL[...].e.department_number IS NULL)") into Spool 1 (group_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 1 by the sort key in spool field1. The size of Spool 1 is estimated with no confidence to be [...] rows. The estimated time for this step is [...] seconds.

    Again, the department table does not need to participate in the join for the same reason as seen in the previous example.

    Soft RI Caution!

    Note that the responsibility for this query to produce accurate results lies with the user. If the table data violates the rules of RI, then the join elimination step can have consequences for the accuracy of the results. It is assumed that the validation of the data for referential integrity takes place external to Teradata, or is enforced on Teradata through other application methods.

    CHECKSUM:

    Problems in the disk drive and disk array can corrupt the data. This type of corrupted data can't be found easily, but queries against the corrupted data will give you wrong answers. We can find the corruption by means of scandisk and checktable. These errors reduce the availability of the DWH. This kind of error is called a Disk I/O error.

    In order to avoid this, TD has the Disk I/O Integrity Check. Checksum is used to perform the Disk I/O Integrity Check at table level. It is a kind of protection technique by which we can select various levels of corruption checking.

    These checks are done by integrity methods. This feature detects and logs the disk I/O errors.

    TD gives predefined data integrity check levels: default, low, end, medium, high, etc.

    This checksum can be enabled at table level using CREATE TABLE (DDL). For system level, use the DBSControl utility to set the parameter.

    If you want more hands-on checking, you have to use the scandisk and checktable utilities. You have to run the checktable utility at level 3 so that it will diagnose the entire rows, byte by byte.

    60. What is identity column?

    A table in Teradata V2R5.1 with one column (INTEGER data type) that is defined as an identity column. Here's the DDL:

    CREATE SET TABLE test_table ,NO FALLBACK ,
         NO BEFORE JOURNAL,
         NO AFTER JOURNAL,
         CHECKSUM = DEFAULT
         (
          PRIM_REGION_ID INTEGER GENERATED ALWAYS AS IDENTITY
               (START WITH 1
                INCREMENT BY 1
                MINVALUE -2147483647
                MAXVALUE 2147483647
                NO CYCLE),
          PRIM_REGION_CD CHAR(6) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL)
    PRIMARY INDEX ( PRIM_REGION_ID );

    Teradata has a concept of identity columns on their tables beginning around V2R6.x. These columns differ from Oracle's sequence concept in that the number assigned is not guaranteed to be sequential. The identity column in Teradata is simply used to guarantee row-uniqueness.

    Example:

    CREATE MULTISET TABLE MyTable
    ( ColA INTEGER GENERATED BY DEFAULT AS IDENTITY
           (START WITH 1
            INCREMENT BY [...]),
      ColB VARCHAR([...]) NOT NULL )
    UNIQUE PRIMARY INDEX pidx (ColA);

    Granted, ColA may not be the best primary index for data access or joins with other tables in the data model. It just shows that you could use it as the PI on the table.

    61. How to implement UPSERT logic in Teradata using SQL?

    We have the MERGE INTO option available in Teradata, which works as UPSERT logic.

    Example:

    MERGE into dept_table1 as Target
    USING (SELECT dept_no, dept_name, budget
    FROM dept_table where dept_no = 20) Source
    ON (Target.dept_no = 20)
    WHEN MATCHED then
    UPDATE set dept_name = 'Being Renamed [...]

    100. What will you do if you get low confidence in explain plan?

    In the EXPLAIN plan, when we get low confidence on a column, we define COLLECT STATISTICS for that particular column. From then onwards the PE prepares the plan with high confidence.

    101. What will you do if you get high confidence in explain plan?

    Then we will run the query without hesitation.

    102. I have one SQL query; when I ran explain plan it is showing a product join. What are the factors you will look into in the query to make it a merge join?

    Product joins are the only join type that can join two tables without a bind term. The only way to avoid a product join and get a merge join is to supply a connecting term between the tables where the operator of the term is =. (These terms are called BIND TERMS.) I.e. we can add another join condition like 1 = 1.

    110. How does indexing improve query performance?

    Indexing is a way to physically reorganize the records to enable some frequently used queries to run faster.

    The index can be used as a pointer to the large table. It helps to locate the required row quickly and then return it to the user.

    or

    The frequently used queries need not hit a large table for data. They can get what they want from the index itself (cover queries).

    An index comes with the overhead of maintenance. Teradata maintains its indexes by itself. Each time an insert/update/delete is done on the table, the indexes will also need to be updated and maintained.

    Indexes cannot be accessed directly by users. Only the optimizer has access to the index.

    111. Can we do collect stats on a table when the table is being updated?

    No.

    112. What is Join Index in TD and how does it work?

    ANS: JOIN INDEX

    A join index is nothing but pre-joining 2 or more tables or views which are commonly joined, in order to reduce the joining overhead. So Teradata uses the join index instead of resolving the joins in the participating base tables. They increase the efficiency and performance of join queries. They can have different primary indexes than the base tables and are also automatically updated as and when the base rows are updated. They can have repeating values.

    There are 3 types of join indexes:

    1) Single table join index - here the rows are distributed based on the foreign key hash value of the base table.
    2) Multi table join index - joining two tables.
    3) Aggregate join index - performing the aggregates, but only sum and count.

    113. I have two tables and one of the table indexes is defined as UPI or USI. The second table is having any of the indexes like UPI, NUPI, USI or NUSI. In this scenario what type of join strategy will the optimizer use?

    Merge Join Strategy

    114. I have two tables. Most of the time I am joining on the same columns. Which type of join index will improve the performance in this scenario?

    Multi table join index

    115. When will you create PPI and when will you create secondary indexes?

    Partitioned Primary Indexes are created so as to divide the table into partitions based on Range or Values as required. This is effective for larger tables partitioned on date and integer columns. There is no extra overhead on the system (no Spl tables created, etc.).

    Secondary Indexes are created on the table for an alternate way to access data. This is the second fastest method to retrieve data from a table, next to the primary index. Sub tables are created.

    PPI and secondary indexes do not perform full table scans; they access only a defined set of data in the AMPs.
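The idea that a partitioned table touches only "a defined set of data" can be sketched with a toy model. `PartitionedTable` is a hypothetical class written for illustration; it groups rows by (year, month) in a dictionary and is not a description of Teradata's storage format. A query for one month reads a single partition instead of every row.

```python
from collections import defaultdict
from datetime import date


class PartitionedTable:
    """Toy model of a table partitioned by month (hypothetical)."""

    def __init__(self):
        self.partitions = defaultdict(list)  # (year, month) -> list of rows

    def insert(self, sale_date, amount):
        key = (sale_date.year, sale_date.month)
        self.partitions[key].append((sale_date, amount))

    def total_for_month(self, year, month):
        """Return (total amount, rows scanned) for one month."""
        rows = self.partitions.get((year, month), [])  # only one partition is read
        return sum(amount for _, amount in rows), len(rows)


table = PartitionedTable()
for month in range(1, 13):
    for day in range(1, 11):
        table.insert(date(2018, month, day), 100)

total, rows_scanned = table.total_for_month(2018, 6)
print(total, rows_scanned)
```

In this model the table holds 120 rows but the June query scans only 10 of them, which is the kind of saving the answer above has in mind for large tables partitioned on a date column.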

    116. What is optimization and performance tuning, and how does it really work in practical projects? Can I get any example to better understand?

    117. Explain about Skew Factor?

    Skew occurs when the primary index column selected is not a good candidate. That is, if the PI selected for a table has highly non-unique values, the rows are unevenly distributed and the skew factor rises; for an evenly distributed table it will be zero. If the skew factor is greater than 25, it is not a good sign.
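A skew factor can be computed from the number of rows (or bytes) each AMP holds. The text gives no formula, so the sketch below assumes one commonly quoted definition, (1 - average / maximum) * 100, and `skew_factor` is a hypothetical helper name.

```python
def skew_factor(rows_per_amp):
    """Skew factor as a percentage, assuming (1 - average / maximum) * 100."""
    peak = max(rows_per_amp)
    if peak == 0:
        return 0.0  # an empty table has no skew
    average = sum(rows_per_amp) / len(rows_per_amp)
    return (1 - average / peak) * 100


even = skew_factor([250, 250, 250, 250])    # perfectly even distribution
uneven = skew_factor([700, 100, 100, 100])  # one AMP holds most of the rows

print(even, round(uneven, 1))
```

Under this definition an even table scores 0, and the second distribution scores well above the 25 mentioned in the answer, because the busiest AMP holds almost three times the average.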

    118. When will you choose primary index and when will you choose secondary index?

    The primary index is chosen at the time of table creation. It helps us in data distribution, data retrieval and join operations.

    Secondary indexes can be created and dropped at any time. They are used as an alternate path to access data other than the primary index.

    119. When will you go for hash index?

    a. A hash index organizes the search keys with their associated pointers into a hash file structure.
    b. We apply a hash function on a search key to identify a bucket, and store the key and its associated pointers in the bucket (or in overflow buckets).
    c. Strictly speaking, hash indices are only secondary index structures, since if a file itself is organized using hashing, there is no need for a separate hash index structure on it.

    Scenario-based questions

    121. In case of replacement loading, which utility do you prefer: MLoad or FLoad?

    FLoad.

    122. I have a scenario where I update one column in a table using a flat file as the source. At the same time, the same column is getting updated because of another flat file. Which utility will be more applicable in this case?

    TPump is better, as it locks at row level.


    The table got loaded with wrong data using FastLoad and it failed. The error message shown was: "RDBMS error 2652: Operation not allowed: _db_._table_ is being Loaded." How do I release the lock on this table?

    When the data has been loaded completely and the table is still locked, submit another FastLoad script with the BEGIN LOADING and END LOADING statements alone.

    I need to create a delimited file using FastExport. As FastExport does not support the delimited format, I have written the following select to get the delimited output:

    select trim(col1) || '|' || trim(col2) || '|' || trim(col3) || '|' || .......................................... trim(col50) from table

    But the above script prefixes each line with 2 junk characters. How do I get the data without the junk characters?

    When the FastLoad checkpoint value is <= 60 and > 60, how is that going to matter?

    When the checkpoint interval is <= 60, it indicates a time interval in minutes. If the value is more than 60, it is treated as a number of records, not as a time.

    123. I am loading a delimited flat file with a time format as follows:

    HH:MM PM/AM

    An example would be:

    9:25 PM

    There is no leading zero if the hour is a single-digit value.
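
    The source leaves this question unanswered. One possible approach, sketched here as an assumption rather than as the author's method, is to normalize the time field before the load. The function name and sample values are hypothetical; `%I` in Python's `strptime` accepts an hour with or without a leading zero.

```python
from datetime import datetime


def normalize_time(text):
    """Convert 'H:MM AM/PM' (hour may lack a leading zero) to 'HH:MM:SS'."""
    parsed = datetime.strptime(text.strip(), "%I:%M %p")
    return parsed.strftime("%H:%M:%S")


# Hypothetical field values taken from a delimited file.
for raw in ["9:25 PM", "12:05 AM", "11:59 AM"]:
    print(raw, "->", normalize_time(raw))
```

    The normalized 24-hour string can then be loaded into a time column without special handling for single-digit hours.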

    Is there any way that I can get the MLoad acquisition phase count in the MLoad script? The MLOAD support environment provides different variables (total ins, upd, del, etc.) for the application phase but not for the acquisition phase. Is there any way other than scanning the log file?

    There are various variables available for this: SYSAPLYCNT, SYSNOAPLYCNT, SYSRCDCNT, SYSRJCTCNT.

    124. I have this requirement: when an error table gets generated during the MLOAD, I want to send an email. How can I achieve this?

    After the MLoad, use BTEQ to query for the error table; if it is present, quit with some value, say '99', and use your OS to send the mail when the return code is 99.

    I am using the following syntax to log on to Teradata Demo through BTEQ/BTEQWin:

    .logon demotdat/dbc,dbc;

    and I get the following error:

    *** Error: Invalid logon!
    *** Total elapsed time was 1 second.
    Teradata BTEQ 08.02.00.00 for WIN32. Enter your logon or BTEQ command:

    The hosts file shows the following:

    127.0.0.1 localhost DemoTDAT DemoTDATcop1


    However, when I use

    .logon demotdat/dbc

    without specifying its password, it prompts for a password... and when I type in the password, I am able to log on.

    125. What is the reason?

    When we use BTEQ in interactive mode, we cannot give the ID and password directly. We first have to give the logon ID and then press Enter. Only after that do we enter the password.

    126. Can we make an MLOAD script fail when the error tables are created?

    Currently the MLoad script exits with a return code = 0, which means the load is reported as successful even though it is not. It has created some error tables, which indicates that some data has been rejected.

    There are various variables that can be used for this:

    .LOGOFF &SYSUVCNT + &SYSRJCTCNT + &SYSETCNT + &SYSRC;

    TROUBLESHOOTING

    1) Open batch session failed because of the following error.

    WRITER_1_*_1> WRT_8229 Database errors occurred:
    FnName: Execute -- [NCR][ODBC Teradata][Teradata Database] Duplicate unique prime key error in CFDW2_DEV_CNTL.CFDW_ECTL_CURRENT_BTCH.
    FnName: Execute -- [DataDirect][ODBC lib] Function sequence error

    Solution: Whenever you want to open a fresh batch ID, first close the existing batch ID and then open the fresh one.

    2) The source is a flat file and I am staging this flat file in Teradata. I found that the leading zeros are being truncated in Teradata. What could be the reason?

    Solution: The reason is that in Teradata you have defined the column data type as INTEGER. That is why the leading zeros are truncated. So, change the target table data type to VARCHAR. The VARCHAR data type will not truncate the leading zeros.
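
    The effect is not specific to Teradata. A minimal sketch with made-up values shows why a numeric type cannot keep leading zeros while a character type can:

```python
# Hypothetical account codes as they appear in the flat file.
raw_values = ["00123", "04567", "89"]

# Staging into an integer column: the number survives, the formatting does not.
as_integer = [int(value) for value in raw_values]

# Staging into a character column: the value is kept exactly as received.
as_text = [str(value) for value in raw_values]

print(as_integer)
print(as_text)
```

    If the zeros carry meaning (codes, identifiers), the column should be a character type from the staging layer onward.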

    3) Can't determine current batch ID for Data Source 47.

    Solution: For any fresh stage load you should open a batch ID for the current data source ID.

    4) Unique primary key violation on the CFDW_ECTL_CURRENT_BTCH table.

    Solution: In the CFDW_ECTL_CURRENT_BTCH table, the unique primary key is defined on the ECTL_DATA_SRC_ID, ECTL_DATA_SRC_INST_ID columns. At any point in time you should have only one record for a given ECTL_DATA_SRC_ID, ECTL_DATA_SRC_INST_ID combination.

    5) Can't insert a NULL value in a NOT NULL column.

    Solution: First find all the NOT NULL columns in the target table, cross-verify them with the corresponding source columns, identify the source column for which you are getting the NULL value, and take the necessary action.

    6) The source is a flat file and I am staging this flat file in Teradata. I found that the leading zeros are being truncated in Teradata. What could be the reason?

    Solution: The reason is that in Teradata you have defined the column data type as INTEGER. That is why the leading zeros are truncated. So, change the target table data type to VARCHAR. The VARCHAR data type will not truncate the leading zeros.

    7) I am passing one record to the target lookup, but the lookup is not returning the matching record. I know that the record is present in the lookup. What action will you take?

    Solution: Use LTRIM, RTRIM in the lookup SQL override. This removes the unwanted blank spaces, and the lookup will then find the matching record.
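
    The reason trimming helps can be shown outside the ETL tool. In this sketch (the function names and sample rows are hypothetical), a key padded with blanks fails an exact match until both sides are stripped, which is what LTRIM/RTRIM achieve in the SQL override:

```python
def build_lookup(rows, key_field):
    """Index rows by their key with surrounding blanks removed."""
    return {row[key_field].strip(): row for row in rows}


def find(lookup, key):
    """Look up a key after removing its surrounding blanks as well."""
    return lookup.get(key.strip())


# Hypothetical lookup rows; the first key carries trailing blanks (fixed-width CHAR).
rows = [{"cust_key": "A100   ", "name": "alice"}, {"cust_key": "B200", "name": "bob"}]

untrimmed = {row["cust_key"]: row for row in rows}
print(untrimmed.get("A100"))            # no match: "A100" != "A100   "
print(find(build_lookup(rows, "cust_key"), " A100"))
```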

    8) I am getting duplicate records for the natural key (ECTL_DATA_SRC_KEY). What will you do to eliminate duplicate records on the natural key?

    Solution: We will concatenate 2, 3 or more source columns and check for duplicate records. If there are no duplicates after concatenating, use those columns to populate the ECTL_DATA_SRC_KEY column in the target.
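
    The check described above can be prototyped on a sample of the source data. The helper below is a sketch under assumed names (the columns and the "|" separator are hypothetical); it reports which concatenated keys occur more than once:

```python
from collections import Counter


def duplicate_keys(rows, columns, separator="|"):
    """Return the concatenated key values that appear in more than one row."""
    counts = Counter(
        separator.join(str(row[column]) for column in columns) for row in rows
    )
    return sorted(key for key, count in counts.items() if count > 1)


# Hypothetical source rows.
rows = [
    {"acct": "1001", "branch": "N", "open_dt": "2013-01-05"},
    {"acct": "1001", "branch": "S", "open_dt": "2013-01-05"},
    {"acct": "1002", "branch": "N", "open_dt": "2013-02-11"},
]

print(duplicate_keys(rows, ["acct"]))             # acct alone is not unique
print(duplicate_keys(rows, ["acct", "branch"]))   # acct + branch is unique
```

    Columns are added to the candidate key until the returned list is empty.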

    9) ccti_id is a NOT NULL column in the MRN table. You are getting a NULL value from the CFDW_MRN_XREF lookup. What will you do to eliminate NULL records?

    Solution: After the stage load, I will populate the CFDW_MRN_XREF table (this table basically contains surrogate keys). Once you populate the XREF table, you won't get any NULL records for the ccti_id column.

    10) Unique primary key violation on the CFDW_ECTL_BTCH_HIST table.

    Solution: In the CFDW_ECTL_BTCH_HIST table, a unique primary index is defined on the ectl_btch_id column. So, there should be only one unique record for a given ectl_btch_id value.

    11) When will you use the ECTL_PM_ID column in the target lookup SQL override?

    Solution: When you are populating a single target table (MRN table) from multiple mappings in the same Informatica folder, we will use ECTL_PM_ID in the target lookup SQL override. This eliminates unnecessary record updates.

    12) You have defined the primary keys as per the ETL spec, but you are getting duplicate records. How will you handle it?

    Solution: Apart from the primary key columns in the spec, I will first add another column (one that is not among the primary key columns in the spec) to the primary key and check for duplicate records. If I don't get any duplicates, I will ask the modeller to add this column to the primary key.

    13) In Teradata the error is reported as: "no more room in database".

    Solution: I spoke with the DBA to add space for that database.

    14) Though the column is available in the target table, when I try to load using MLoad, it shows that the column is not available in the table. Why?

    Solution: The load was going through a view, and the view had not been refreshed to include the new column; hence the error message. So, refresh the view definition to add the new column.

    15) When deleting from the target table, I wanted to delete only some of the data, but by mistake all the data got deleted from the development table.

    Solution: Add ECTL_DATA_SRC_ID and PM_ID to the WHERE clause of the query.

    16) While updating the target table, it shows an error message saying multiple rows are trying to update a single row.

    Solution: There are duplicates in the table matching the WHERE condition of the update query. These duplicate records need to be eliminated.

    17) I have a file with a header, data records and a trailer. The data records are comma-delimited, and the header and trailer are fixed width. The header and trailer start with (HDR, TRA). I need to skip the header and trailer while loading the file with MultiLoad. Please help me in this case.

    Solution: Code the load utility to consider only the data records, excluding the header and trailer records.

    .APPLY label WHERE REC_TYPE NOT IN ('HDR','TRA')
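
    The same filtering rule can be tried out on a sample file before coding it in the load script. This is a sketch, not MultiLoad syntax; the function name and the sample records are hypothetical, and only the HDR/TRA prefixes come from the question.

```python
import csv


def data_records(lines, skip_prefixes=("HDR", "TRA")):
    """Drop header/trailer lines, then parse the rest as comma-delimited records."""
    kept = [line for line in lines if not line.startswith(skip_prefixes)]
    return list(csv.reader(kept))


# Hypothetical file content: fixed-width header and trailer around delimited data.
sample = [
    "HDR20190813    ",
    "1,alice,10",
    "2,bob,20",
    "TRA0000002     ",
]

for record in data_records(sample):
    print(record)
```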

    MORE ON USING INDEXES

    • Teradata decides for itself whether to use an index or not - if you are not careful, you spend time in table updates keeping up an index that is not used at all (one cannot give the query optimizer hints to use some index, though collecting statistics may affect the optimizer's strategy).

    • In the MP-RAS environment, look at the script "/etc/gsc/bin/perflook.sh". This will provide a system-wide snapshot in a series of files. The GSC uses this data for incident analysis.

    • When using an index, one must make sure that the index condition is met in the sub queries "using IN, nested queries, or derived tables".

    • Proper index use is indicated by the explain log entry "a ROW HASH MATCH SCAN across ALL-AMPS".


    • If the index is not used, the result of the analysis is a 'FULL TABLE SCAN', where the run time grows as the size of the history table grows.

    • Keeping up index information is a time- and space-consuming issue. Sometimes Teradata does much better when you "manually" imitate the index, just building it from scratch.

    • Keeping up a join index might help, but you cannot MultiLoad to a table which is part of a join index - loading with 'tpump' or pure 'SQL' is OK but does not perform as well. Dropping and re-creating a join index on a big table takes time and space.

    • when your Teradata "explain" gives 'J' steps from your query (even without the update of the results) and the actual query is a join of six or more tables

    Case e.g.

    We had already given up updating the secondary indexes - because we have not had much use for them.

    After some trial and error we ended up with a strategy where the actual "purchase frequency analysis" is never made "directly" against the history table.

    Instead:

    1) There is a "one-shot" run to build the initial "customer's previous purchase" from the "purchase history" - it takes time, but that time is saved later.

    2) The purchase frequency is calculated by joining the "latest purchase" with the "customer's previous purchase".

    3) When the "latest purchase" rows are inserted into the "purchase history", the "customer's previous purchase" table is dropped and recreated by merging the "customer's previous purchase" with the "latest purchase".

    4) By following these steps the performance is not too fast yet (about J minutes in our two-node system) for a bunch of almost 1.000.000 latest receipts - but it is tolerable now.

    (We also tested adding both the previous and the latest purchase to the same table, but because its size was in the average case much bigger than the pure "latest purchase", the self-join was slower in that case.)
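
    The steps above can be sketched with plain dictionaries standing in for the tables. The function names, customer IDs and dates are hypothetical; the point is only that the frequency calculation joins two small sets and never touches the full history.

```python
from datetime import date


def purchase_gaps(latest, previous):
    """Step 2: join "latest purchase" with "customer's previous purchase".

    Returns days since the previous purchase for customers present in both.
    """
    return {
        customer: (day - previous[customer]).days
        for customer, day in latest.items()
        if customer in previous
    }


def merge_previous(previous, latest):
    """Step 3: rebuild "customer's previous purchase" by merging in the latest rows."""
    merged = dict(previous)
    merged.update(latest)  # the latest purchase becomes the new "previous"
    return merged


# Hypothetical data: one previous purchase date per customer, plus today's batch.
previous = {"c1": date(2019, 1, 1), "c2": date(2019, 1, 10)}
latest = {"c1": date(2019, 1, 15), "c3": date(2019, 1, 20)}

print(purchase_gaps(latest, previous))
previous = merge_previous(previous, latest)
print(sorted(previous))
```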

    ;;;;;;;;;

    MANAGING CONCURRENT WORKLOADS

    Integrated e-commerce efforts present many warehouse challenges. Here's how Teradata can help.

    The word e-commerce means many things to many people. Although for some it connotes only the Web, the real value of e-commerce can only be realized when all channels of a business are integrated and have full access to all customer information and transactions. In fact, to me, e-commerce means using the rich technology available today to bring added value to the customer and additional value to the business through all customer interaction channels.

    Under this definition of e-commerce, an active warehouse is at the epicenter, providing the storage and access for decision making in the e-commerce world. As more and more companies adopt active warehousing for this purpose, data warehouse workloads are expanding and changing.

    If your warehouse relies on a Teradata DBMS, you'll find that handling the challenge of high-volume, widely varying, disparate service-level workloads is one of its core competencies. One of the biggest concerns I hear from customers is how to deal with the quickly rising number of concurrent queries and concurrent users that can result from active warehousing and e-commerce initiatives. Expected service levels vary widely among different groups of users, as do query types.


    And, of course, the entire workload must scale upward linearly as the demand increases, ideally with a minimum of effort required from users and systems staff. Here's a look at some of the most frequent questions I receive on the subject of mixed workloads and concurrency requirements.

    How do I balance the work coming in across all nodes of my Teradata configuration?

    You don't. Teradata automatically balances sessions across all nodes to evenly distribute work across the entire parallel configuration. Users connect to the system as a whole rather than a specific node, and the system uses a balancing algorithm to assign their sessions to a node. Balancing requires no effort from users or system administrators.

    Does Teradata balance the work queries cause?

    The even distribution of data is the key to parallelism and scalability in Teradata. Each query request is sent to all units of parallelism, each of which has an even portion of the data to process, resulting in even work distribution across the entire system.

    For short queries and update flow typical of Web interactions, the optimizer recognizes that only a single unit of parallelism is needed. A query coordinator routes the work to the unit of parallelism needed to process the request. The hashing algorithm does not cluster related data, but spreads it out across the entire system. For example, this month's data and even today's data is evenly distributed across all units of parallelism, which means the work to update or look at that data is evenly distributed.
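
    The idea of hashing rows to units of parallelism can be illustrated with a small sketch. MD5 is used here purely as a stand-in; Teradata's actual hashing algorithm is different, and the function names and the four-unit configuration are assumptions.

```python
import hashlib
from collections import Counter


def amp_for(primary_index_value, amp_count):
    """Map a primary index value to a unit of parallelism (stand-in hash)."""
    digest = hashlib.md5(str(primary_index_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % amp_count


def distribution(values, amp_count):
    """Count how many rows land on each unit."""
    return Counter(amp_for(value, amp_count) for value in values)


# 10,000 consecutive order numbers (for example, today's orders) on 4 units.
counts = distribution(range(10_000), 4)
for amp in sorted(counts):
    print(amp, counts[amp])
```

    Consecutive or related values end up scattered across all units, so no single unit holds "today's data". A single-row request, by contrast, hashes to exactly one unit.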

    Will many concurrent requests cause bottlenecks in query coordination?

    Query coordination is carried out by a fully parallel parsing engine (PE) component. Usually, one or more PEs are present on each node. Each PE handles the requests for a set of sessions, and sessions are spread evenly across all configured PEs. Each PE is multithreaded, so it can handle many requests concurrently. And each PE is independent of the others with no required cross-coordination. The number of users logged on and requests in flight are limited only by the number of PEs in the configuration.

    How do you avoid bottlenecks when the query coordinator must retrieve information from the data dictionary?

    In Teradata, the DBMS itself manages the data dictionary. Each dictionary table is simply a relational table, parallelized across all nodes. The same query engine that manages user workloads also manages the dictionary access, using all nodes for processing dictionary information to spread the load and avoid bottlenecks. The PE even caches recently used dictionary information in memory. Because each PE has its own cache, there is no coordination overhead. The cache for each PE learns the dictionary information most likely to be needed by the sessions assigned to it.

    With a large volume of work, how can all requests execute at once?

    As in any computer system, the total number of items that can execute at the same time is always limited to the number of CPUs available. Teradata uses the scheduling services Unix and NT provide to handle all the threads of execution running concurrently. Some requests might also exist on other queues inside the system, waiting for I/O from the disk or a message from the BYNET, for example. Each work item runs in a thread; each thread gets a turn at the CPU until it needs to wait for some external event or until it completes the current work. Teradata configures several units of parallelism in each SMP node. Each unit of parallelism contains many threads of execution that aren't restricted to a particular CPU; therefore, every thread gets to compete equally for the CPUs in the SMP node.

    There is a limit, of course, to the number of pieces of work that can actually have a thread allocated in a unit of parallelism. Once that limit is reached, Teradata queues work for the threads. Each thread is context free, which means that it is not assigned to any session, transaction, or request. Therefore, each thread is free to work on whatever is next on the queue. The unit of work on the queue is a processing step for a request. Combining the queuing of steps with context-free threads allows Teradata to share the processing service equally across all the concurrent requests in the system. From the users' point of view, all the requests in the system are running, receiving service, and sharing system resources.
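
    The combination of a step queue and context-free threads follows a familiar worker-pool pattern. The sketch below illustrates that general pattern only, not Teradata internals; the function name, the worker count and the sample steps are assumptions.

```python
import queue
import threading


def run_steps(steps, worker_count=3):
    """Run (request_id, callable) steps on a fixed pool of context-free workers."""
    work = queue.Queue()
    for step in steps:
        work.put(step)

    results = []
    lock = threading.Lock()

    def worker():
        # A worker is not tied to any request: it takes whatever step is next.
        while True:
            try:
                request_id, func = work.get_nowait()
            except queue.Empty:
                return
            outcome = func()
            with lock:
                results.append((request_id, outcome))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(results)


# Five hypothetical processing steps from different requests, three threads.
steps = [(f"q{n}", (lambda n=n: n * n)) for n in range(1, 6)]
print(run_steps(steps))
```

    With more steps than threads, the extra steps simply wait on the queue, yet every request makes progress because no thread is reserved for any one of them.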

    How does Teradata avoid resource contention and the resulting performance and management problems?

    Teradata algorithms are very resource efficient. Other DBMSs optimize for single-query performance by giving all resources to the single query. But Teradata optimizes for throughput of many concurrent queries by allocating resources sparingly and using them efficiently. This kind of optimization helps avoid wide performance variations that can occur depending on the number of concurrent queries.

    When faced with a workload that requires more system resources than are available, Teradata tunes itself to that workload. Thrashing, a common performance failure mode in computer systems, occurs when the system has fewer resources than the current workload requires and begins using more processing time to manage resources than to do the work. With most databases, a DBA would tune the system to avoid thrashing. However, Teradata adjusts automatically to workload changes by adjusting the amount of running work and internally pushing back incoming work. Each unit of parallelism manages this flow control mechanism independently.

    If all concurrent work shares resources evenly, how are different service levels provided to different users?

    The Priority Scheduler Facility (PSF) in Teradata manages service levels among different parts of the workload. PSF allows granular control of system resources. The system administrator can define up to five resource partitions; each partition contains four available priorities. Together, they provide 20 allocation groups (AGs) to which portions of the workload are assigned by an attribute of the logon ID for the user or application. The administrator assigns each AG a portion of the total system resources and a scheduling policy.

    For example, the administrator can assign short queries from the Web site a guaranteed 20 percent of system resources and a high priority. In contrast, the administrator might assign medium priority and 10 percent of system resources to more complex queries with lower response-time requirements. Similarly, the administrator might assign data mining queries a low priority and five percent of the total resources, effectively running them in the background. You can define policies so that the resources adjust to the work in the system. For example, you could allow data mining queries to take up all the resources in the system if nothing else is running.

    Unlike other scheduling utilities, PSF is fully integrated into the DBMS, not managed at the task or thread level, which makes it easier to use for parallel database workloads. Because PSF is an attribute of the session, it follows the work wherever it goes in the system. Whether that piece of work is executed by a single thread in a single unit of parallelism or in 2,000 threads in J00 units of parallelism, PSF manages it without system administrator involvement.

    CPU scheduling is a primary component of PSF, using all the normal techniques (such as quantum size, CPU queues by priority, and so on). However, PSF is endemic throughout the Teradata DBMS. There are many queues inside a DBMS handling a large-volume mixed workload. All of those queues are prioritized based on the priority of the work. Thus, a high-priority query entered after several lower-priority requests that are awaiting their turn to run will go to the head of the queue and will be executed first. I/O is managed by priority. Data warehouse workloads are heavy I/O users, so a large query performing a lot of I/O could hold up a short, high-priority request. PSF puts the high-priority request I/Os at the head of the queue, helping to deliver response time goals.
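
    The "go to the head of the queue" behaviour is that of a priority queue. The following sketch shows the general technique with Python's heapq; it is not PSF itself, and the class name, priority numbers and request labels are assumptions.

```python
import heapq
import itertools


class PriorityWorkQueue:
    """Queue that serves the highest-priority request first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()

    def submit(self, priority, request):
        # Lower number = higher priority. The arrival counter breaks ties so that
        # requests of equal priority keep their order of submission.
        heapq.heappush(self._heap, (priority, next(self._arrival), request))

    def next_request(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


work = PriorityWorkQueue()
work.submit(2, "monthly report A")   # lower-priority requests arrive first
work.submit(2, "monthly report B")
work.submit(0, "web lookup")         # high-priority request arrives last

while len(work):
    print(work.next_request())
```

    The late-arriving high-priority request is served before the two reports that were already waiting.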

    Data warehouse databases often set the system environment to allow for fast scans. Does Teradata performance suffer when the short work is mixed in?

    Because Teradata was designed to handle a high volume of concurrent queries, it doesn't count on sequential scans to produce high performance for queries. Although other DBMS products see a large fall in request performance when they go from a single large query to multiple queries or when a mixed workload is applied, Teradata sees no such performance change. Teradata never plans on sequential access in the first place. In fact, Teradata doesn't even store the data for sequential accesses. Therefore, random accesses from many concurrent requests are just business as usual.

    Sync scan algorithms provide additional optimization. When multiple concurrent requests are scanning or joining the same table, their I/O is piggybacked so that only a single I/O is performed to the disk. Multiple concurrent queries can run without increasing the physical I/O load, leaving the I/O bandwidth available for other parts of the workload.
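
    The piggybacking idea can be modelled in a few lines: each row is read once and handed to every request that is scanning the table. This is a conceptual sketch only; the function name, the predicates and the read counter are assumptions, and real sync scan works on disk blocks rather than Python rows.

```python
def shared_scan(table_rows, requests):
    """Scan the table once and feed each row to every concurrent request.

    `requests` maps a request name to a predicate; returns the matching rows
    per request and the number of physical reads performed.
    """
    results = {name: [] for name in requests}
    reads = 0
    for row in table_rows:
        reads += 1  # one physical read, regardless of how many requests share it
        for name, predicate in requests.items():
            if predicate(row):
                results[name].append(row)
    return results, reads


# Two hypothetical requests scanning the same ten-row table.
rows = list(range(10))
results, reads = shared_scan(
    rows,
    {"even_rows": lambda r: r % 2 == 0, "big_rows": lambda r: r >= 7},
)
print(results, reads)
```

    Two separate scans would cost twenty reads here; the shared scan costs ten.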

    What if work demand exceeds Teradata's capabilities?

    There are limits to how much work the engine can handle. A successful data warehouse will almost certainly create a demand for service that is greater than the total processing power available on the system. Ter