m_dump.doc

Upload: srikanthkumarreddy

Post on 03-Jun-2018


  • 8/12/2019 m_dump.doc

    1/4

    m_dump metadata [data] [action]

    m_dump produces a human-readable report that shows how input data is interpreted by Ab Initio metadata. It can print the metadata or a description of the metadata. It can also evaluate specified expressions within each record.

    metadata is one of:
      filename: Read metadata from file.
      -string string: Read metadata from string.

    data is one of:
      filename: Read data from file. Specify with a URL for a remote file or multifile.
      -string string: Read data from string.
      - [hyphen]: Read data from standard input.

    action is zero or one of:
      -print-metadata: Print metadata.
      -describe: Describe the structure of the metadata: the names, offsets, sizes, and types for every field.
      -print-data: Default. Print data to standard output.
      -no-print-data: Suppress printing of data.
      -print expression: Evaluate expression for each record displayed and print the result.
      -start recnum: Start data printing at record recnum.
      -end recnum: End data printing at record recnum.
      -record recnum: Print data only for record recnum.
      Note: the first record is record number 1, and start and end are inclusive.
      -partition: Print an individual partition of a multifile. This option must appear last on the command line. Partitions are numbered in the range 0-n, inclusive.
      -report: Produce monitor reports as specified in the variable XX_REPORT.
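The record-numbering rule for the start, end, and record options (first record is number 1, both bounds inclusive) is easy to get off by one. The following Python sketch illustrates that rule only; it is not how m_dump is implemented, and the function name select_records and its parameters are hypothetical.

```python
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def select_records(records: Iterable[T],
                   start: Optional[int] = None,
                   end: Optional[int] = None,
                   record: Optional[int] = None) -> Iterator[T]:
    """Yield the records that a start/end/record selection would keep.

    Record numbers are 1-based and both bounds are inclusive,
    as stated in the note on record numbering.
    """
    if record is not None:
        # Assumption: a single-record request behaves like a one-record range.
        start = end = record
    for recnum, rec in enumerate(records, start=1):
        if start is not None and recnum < start:
            continue  # before the first requested record
        if end is not None and recnum > end:
            break     # past the last requested record
        yield rec
```

For example, list(select_records(rows, start=2, end=4)) keeps the second, third, and fourth rows.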

    m_attach: Ab Initio provides this shell command to facilitate remote startup on large parallel systems.

    m_env: Displays the current settings of the Ab Initio environment variables. Invoke m_env with the option h for added help (m_env h).

    Environment Variables: Set these environment variables if we want a value different from the default.

    XX_TIMEOUT=seconds
      The time-out interval for certain operations, such as starting a remote process. Default is 30 seconds.

    XX_MAX_RECORD_BUFFER=bytes
      Maximum buffer size that certain parts of the system will use to hold a record. Default is 5 million bytes.

    XX_NICE=priority
      Run jobs on remote nodes at the specified priority.

    XX_SORT_MAX_CORE=megabytes
      The default value for the max-core argument to the local-sort component. Default is 10 megabytes.
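The "value from the environment, otherwise the documented default" behaviour can be sketched as follows. This is an illustration only: the helper effective_settings is hypothetical, the defaults are the ones stated above, and the sort variable is spelled with underscores on the assumption that it follows the same naming pattern as the other variables.

```python
import os
from typing import Mapping

# Defaults as stated in the text; units are seconds, bytes, and megabytes.
DEFAULTS = {
    "XX_TIMEOUT": 30,
    "XX_MAX_RECORD_BUFFER": 5_000_000,
    "XX_SORT_MAX_CORE": 10,  # assumed spelling, see the lead-in
}


def effective_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return each setting's value: the environment's if set, else the default."""
    settings = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        # An unset or empty variable falls back to the documented default.
        settings[name] = int(raw) if raw else default
    return settings
```

Calling effective_settings({"XX_TIMEOUT": "120"}) would report a 120-second time-out and leave the other two values at their defaults.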

    Special Ab Initio Facilities:

    HOST_ALIAS_FILE=path
      File containing hostname aliases.

    XX_CATALOG=path
      Location of user-created metadata catalogs.

    XX_REPORT=keyword
      Monitor the current job and produce reports.

    Debugging:

    IWAIT=true
      Enable debugging via interactive wait.

    XX_DEBUG=value
      Set debugging mode.

    DISPLAY=display_id
      The X Windows display. Used to pop up debuggers.

    TRACE_ALL_SOCS=path
      Trace all process SOC events to files named program-name.soc in the directory specified by path.

    LAUNCHER_TRACE
      Enable trace output from the low-level layer that handles remote process control and job recovery.

    An Ab Initio application is a set of mp commands, beginning with mp job and ending (usually) with mp run. In between are commands that identify the program components and indicate the flow of data from one to the next. Thus, the mp script usually both defines and runs the job.

    When a script is invoked, the mp job command executes. At this point, the system creates two files in the current working directory:

    jobname.job: As the rest of the script is read, a text representation of the application being defined is placed here.

    .abinitio-current-job: This file contains jobname, which lets the system know the name of the current job.
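As a rough illustration of how the current job name could be resolved, the Python sketch below prefers the AB_JOB environment variable (described in the next paragraph) and falls back to the .abinitio-current-job file. The function current_job_name is hypothetical, and the sketch assumes the file holds nothing but the job name.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

CURRENT_JOB_FILE = ".abinitio-current-job"


def current_job_name(workdir: str = ".",
                     env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Resolve the current job name for a working directory."""
    # AB_JOB, when set, overrides whatever the marker file says.
    ab_job = env.get("AB_JOB")
    if ab_job:
        return ab_job
    # Otherwise read the marker file written when the job was started.
    marker = Path(workdir) / CURRENT_JOB_FILE
    if marker.is_file():
        # Assumption: the file contains only the job name.
        return marker.read_text().strip() or None
    return None
```

Because AB_JOB is checked first, two jobs sharing a directory can each export their own AB_JOB and never depend on the shared marker file.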

    If two or more mp jobs are running in the same directory at the same time, one job will overwrite the other's .abinitio-current-job file. To avoid this problem, use the environment variable AB_JOB. When AB_JOB is set, all mp commands use its value as the name of the current mp job, ignoring the name stored in .abinitio-current-job.

    An Ab Initio application may be designed to execute in sequential phases, with or without checkpointing, which means saving state to disk between phases.

    Phased execution is enabled from within the application, if the script developer has inserted the command mp phase or mp checkpoint between one component and another.

    Phasing makes a difference in how the application uses the system resources, often trading off performance for safety. Phasing inhibits pipeline parallelism but guarantees that resource-intensive stages will not compete with each other.

    When a job does not complete normally, it leaves a file in the working directory on the host system with the name jobname.rec. This file contains a set of pointers to the log files on the host and on every node. The log files are placed in subdirectories that are created when the application starts and deleted when the application successfully completes.

    If the application encounters a software failure, all nodes and their respective files will be rolled back to their initial state, as if the application had not run at all. If the program contains checkpoint commands, the state restored is that of the most recent checkpoint.


    Specifically, the Ab Initio system will:
      Kill all processes running on all nodes, including control processes and processes that constitute the partitions of a parallel program.
      Cleanly shut down all data flows.
      Roll back the effects of all file changes.
      Report the state of the system.
      Exit.

    It is not always possible for the Co>Operating System to restore the system to an earlier state. For example, a failure could occur because a node or its native operating system crashed. In this case, it is not possible to cleanly shut down flow or file operations, nor to roll back file operations performed in the current phase. In fact, it is likely that stray files (intermediate temporaries) will be left lying around. To complete the cleanup and get the job running again, you must perform a manual rollback. For this, we use the command m_rollback.

    m_rollback [-d] [-i] [-h] recovery-file

    -d: Delete the job along with its recovery file and any log files it created.
    -i: Display the state of the job and prompt the user whether the job should be deleted.

    If the -i option is not used, jobs that have reached their first checkpoint will be rolled back to that checkpoint. Jobs that do not include checkpoints, or that did not reach their first checkpoint, will be deleted.
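The decision rules for m_rollback can be summarised as a small function. This Python sketch only restates the rules above; the name rollback_action and its return values are hypothetical, and the precedence of -d over -i when both are given is an assumption the text does not settle.

```python
def rollback_action(delete: bool, interactive: bool,
                    reached_checkpoint: bool) -> str:
    """Return what a manual rollback would do for the given options.

    delete             -- the -d option was given
    interactive        -- the -i option was given
    reached_checkpoint -- the job got as far as its first checkpoint
    """
    if delete:
        # Assumption: -d wins if both -d and -i are supplied.
        return "delete"
    if interactive:
        # The job state is shown and the user decides.
        return "prompt"
    # Without -i: roll back to the checkpoint if one was reached,
    # otherwise the job is deleted.
    return "rollback_to_checkpoint" if reached_checkpoint else "delete"
```

So a job that failed before its first checkpoint and is rolled back with no options is deleted, while the same job after a checkpoint is restored to that checkpoint.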

    Monitoring

    Monitoring is controlled in either (or both) of two ways:
      From the shell, set the configuration variable XX_REPORT before running the job.
      Within the script, supply arguments to the report option of the mp run command.

    The keywords are: Verbose-errors, Expanded-graph, Flows, Times, Skew, Skew=n, Scroll=mode, File=filename, Interval=n, Table-flows.

    export XX_REPORT="flows times interval=10"    (ksh; the quotes keep the keywords in one value)

    mp run report flows times interval=10    (in script)
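A report specification such as the one in the examples is a space-separated list of keywords, some of which carry a value after an equals sign. The Python sketch below shows one way to split such a string for inspection; parse_report_spec is a hypothetical helper, not part of Ab Initio, and it makes no claim about how the system itself reads the variable.

```python
def parse_report_spec(spec: str) -> dict:
    """Split a report specification into keyword -> value pairs.

    Bare keywords (e.g. "flows") map to True; keyword=value pairs
    (e.g. "interval=10") map to the value as a string.
    """
    options = {}
    for token in spec.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else True
    return options
```

Given the specification from the examples, the result has flows and times set to True and interval set to the string "10".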

    File Skew

    Skew is only of concern if it is large (say, over 25%) and if large amounts of data or CPU time are involved.

    Situations that might lead to skew are an overloaded node, unbalanced data, or different node speeds.

    An overloaded node: If a node is overloaded, then data flows will tend to show up as initially skewed, but the skew will go to zero at the end of the run.


    Unbalanced data: If different partitions of a data flow have different amounts of data, then both data and CPU time will be skewed at the end of the run.

    Different node speeds: If some nodes are faster than others, then skew is likely to result. In this case, CPU times will be skewed at the end of the run, but not data volumes.
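The three situations differ in which measurements are skewed and when, so they can be told apart with a simple rule of thumb. The Python sketch below encodes that reading of the text; diagnose_skew is a hypothetical helper, skew values are taken as percentages supplied by the caller, and the 25% cut-off is the "say, over 25%" figure from above rather than a fixed rule. Treating "goes to zero" as "falls below the threshold" is also an assumption.

```python
def diagnose_skew(initial_data_skew: float,
                  final_data_skew: float,
                  final_cpu_skew: float,
                  threshold: float = 25.0) -> str:
    """Suggest a likely cause of skew from monitor figures (in percent)."""
    data_skewed_at_start = abs(initial_data_skew) > threshold
    data_skewed_at_end = abs(final_data_skew) > threshold
    cpu_skewed_at_end = abs(final_cpu_skew) > threshold

    if not (data_skewed_at_start or data_skewed_at_end or cpu_skewed_at_end):
        return "no significant skew"
    if data_skewed_at_end and cpu_skewed_at_end:
        # Both data and CPU time skewed at the end of the run.
        return "unbalanced data"
    if cpu_skewed_at_end:
        # CPU time skewed at the end, data volumes not.
        return "different node speeds"
    if data_skewed_at_start and not data_skewed_at_end:
        # Data skewed early on, but evened out by the end.
        return "overloaded node"
    return "unclassified"
```

A flow that starts at 40% data skew and finishes with no data or CPU skew would be reported as a likely overloaded node.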

    Debugging

    The XX_DEBUG environment variable controls the tracing and debugging of processes.

    The IWAIT mechanism is a simple job-tracking system that lets us detect and handle processes that fail. We must set IWAIT in order to use any tracing or debugging features.

    Administration

    AB_SUPPRESS_HISTORY_CHECK: Permits changing parameters when restarting a checkpointed mp job.

    AB_CONNECTION, AB_CONNECTION_SCRIPT, AB_PASSWORD, AB_USER: Control aspects of remote connections.

    AB_NODES: Used for defining node aliases.

    Performance

    The m_attach utility accelerates job startup on IBM SP configurations of 9 or more nodes.