Functions and text files

Francesco Vespignani, DiSCoF, Università degli Studi di Trento

[email protected]

December 3, 2009

Today

Functions and Scripts

Latex

Strings

Text Files

Practice

Functions and Scripts

Finalizing a script

Once you have solved a problem, you can save the solution in a .m file and generalize it.

Where to save it?

In the current directory or in a directory on the MATLAB path.

The MATLAB path is simply a list of directories where MATLAB looks for user-defined scripts and functions.
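As a minimal sketch of how the path can be inspected and extended from the command line: path, addpath and rmpath are standard MATLAB commands, while the folder name my_matlab_tools is only a hypothetical example.

```matlab
% Hypothetical folder for your own scripts and functions (the name is an assumption).
myDir = fullfile(tempdir, 'my_matlab_tools');
if ~exist(myDir, 'dir')
    mkdir(myDir);
end

addpath(myDir);                                % put the folder on the search path
onPath = ~isempty(strfind(path, myDir));       % true if the folder is now listed

rmpath(myDir);                                 % take it off the path again
stillOnPath = ~isempty(strfind(path, myDir));  % false after rmpath

rmdir(myDir);                                  % remove the (empty) demo folder
```

Any .m file saved in a folder that is on the path can be called by name from any working directory.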

Types of scripts

MATLAB has three ways to create new commands:

- scripts
- functions
- mex-files

Scripts and functions

Scripts are just a sequence of commands that run in the caller’s workspace: all variables defined in the caller workspace may be changed, and every variable defined by the script is visible in the caller workspace.

Functions strictly define which variables of the caller are visible within the function (input arguments) and which ones are returned to the caller workspace (output arguments).

Moreover, functions are compiled the first time they are called, while scripts are interpreted line by line, as if they were typed at the terminal.

Advantage of functions: they guarantee information encapsulation.

Disadvantage of functions: when a large amount of data is passed as an argument and modified inside the function, a copy is created in memory, and this can reduce performance.

Function’s syntax

The difference between scripts and functions depends on the first line of the .m file.

Syntax of a function:

function [out1, out2, ...] = MyFunctionName(in1, in2, ...)
%MYFUNCTIONNAME makes something interesting
%   MYFUNCTIONNAME(a, b) returns some values ...
%   function, argument, output description ...
%
%   example ...

% Written by Francesco Vespignani, version 0.98, date January 20, 2012

% Function body ...
out1 = 'ciao';
% ...
end

A function must be stored in a file with the same name as the function (in case of disagreement, the file name takes precedence).

Global variables

The function workspace is distinct from the caller’s workspace.

If needed, you can create global variables, which are visible from within every function that contains the corresponding global declaration.

Globals are useful to avoid copying a large data set at every function call or to share values between functions.

Other programming languages (e.g. C) allow passing arguments by address.

Example from Mastering MATLAB:

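function myTic
% Store the current time in a variable shared between functions.
global myTicToc
myTicToc = clock;

function myToc
% Display the seconds elapsed since the last call to myTic.
global myTicToc
disp(etime(clock, myTicToc));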

Arguments

A function can have a variable number of input and output arguments.

Function behavior can be changed programmatically depending on the number and types of the arguments.

For these purposes there are special variables:

- nargin: number of input arguments in the actual call to the function
- nargout: number of output arguments in the actual call to the function
- varargin: cell array containing the input argument values passed beyond the declared ones
- varargout: cell array containing the output argument values returned beyond the declared ones

Arguments

Cell arrays are similar to arrays but can contain different types of data. You can access the elements by using braces instead of parentheses: varargout{12}

In order to use varargin and varargout, they must be explicitly declared in the function statement. Otherwise a function called with too many input or output arguments raises an error.

Good practice is to do argument checking to prevent unexpected results.

With the if ... else ... end construct you can check whether the arguments are of the right data type or whether numbers are within a specific range of values (e.g. positive, even ...).

A function typically exits at the end of its body. You can return to the caller earlier with the return command, or abort with the error command.
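The points above can be put together in a small sketch. The function describeValues, its arguments and its error identifiers are hypothetical names invented for this illustration; nargin, nargout, varargin, varargout and error are the standard MATLAB facilities described in the text. The code is meant to be saved as describeValues.m.

```matlab
function varargout = describeValues(x, scale, varargin)
%DESCRIBEVALUES Hypothetical example of optional input and output arguments.
%   m = DESCRIBEVALUES(x) returns the mean of x.
%   m = DESCRIBEVALUES(x, scale) multiplies x by scale first.
%   [m, s] = DESCRIBEVALUES(...) also returns the standard deviation.
%   Any further inputs are collected in the cell array varargin.

% Argument checking: stop early with a clear message.
if nargin < 1
    error('describeValues:noInput', 'At least one input is required.');
end
if ~isnumeric(x)
    error('describeValues:badType', 'x must be numeric.');
end
if nargin < 2
    scale = 1;               % default used when the caller omits scale
end
if ~isempty(varargin)
    fprintf('Ignoring %d extra argument(s).\n', numel(varargin));
end

x = x(:) * scale;
varargout{1} = mean(x);
if nargout > 1
    varargout{2} = std(x);   % computed only when the caller asks for it
end
end
```

Calling describeValues([1 2 3]) uses the default scale, while [m, s] = describeValues([1 2 3], 2) requests both outputs.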

Practice

We can try to write a function that loads a file containing a black-and-white Snodgrass BMP image, transforms it into an RGB or indexed image, and changes the foreground and background colors.

Let’s make some inputs and outputs optional (returning the image or writing it to a new file, changing only the foreground or both colors ...). Provide a help text.

Latex

A brief history of LaTeX

LaTeX is one of the evolutions of TeX, written by Donald Ervin Knuth at Stanford University starting in 1978 (see the Wikipedia entry: TeX).

TeX was primarily intended for typesetting mathematical formulas, given that the new printing technology of the time was very poor compared to classical typographical techniques.

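\begin{equation}
\sum_i \vec{A} \cdot \vec{B} = -P \int \mathbf{r} \cdot \hat{\mathbf{n}} \, dA = P \int \vec{\nabla} \cdot \mathbf{r} \, dV .
\end{equation}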

Further examples of basic mathematical typesetting by Harvey Gould at http://sip.clarku.edu/tutorials/TeX/

LaTeX repositories and documentation

Resources:

- The TeX users community web page: http://www.tug.org/
- The CTAN archive: http://www.ctan.org/
- A Windows distribution: http://www.miktex.org/
- A text editor with syntax highlighting and buttons for running the program may be useful (Emacs, WinEdt, TeXnic, ...)

Manuals and tutorials:

- There are many on the web ...
- Learning by example (many authors publish their source .tex files ...)
- The Not So Short Introduction to LaTeX

How does LaTeX work?

It is a markup language, similar to HTML and XML.

The source is a kind of script that mixes data with tags (preceded by a backslash). The tags are commands that tell the program how to render a particular type of text or how to use some data to create graphic objects in the output file.

The output file is a vector graphics data file:

- dvi (device independent): the typical LaTeX-generated format
- postscript: a language originally developed for interfacing in the same way with different printers and plotters (see Ghostscript, Ghostview and GSview)
- pdf: the file format created by Adobe for Acrobat Reader
- svg: scalable vector graphics, an XML format

Install and configure MiKTeX and TeXnic

Two problems:

- access to the network
- installing software (user privileges)

We are not directly connected to the web, so we need to manually configure the proxy in the applications that need a web connection: address proxy.unitn.it, port 3128.

Why we need that

Open-source languages and applications, but also MATLAB, sometimes need to download and install additional components (packages).

Packages are bundles of scripts that provide additional features, typically contributed by users and not only by the core development team of the application.

Packages differ from simple scripts in that they follow specific principles that minimize interference with the core functions and with other packages and scripts.

Packages may depend on other (user-contributed) packages; this gives rise to a somewhat complex hierarchy.

Packages need to be aligned, in that some packages need a specific version of the environment or of other packages.

Why not install all the packages? There are a lot of them (the more popular an application is, the more packages exist that you will probably never use).

How to install packages, manually

Typically packages can be installed either manually or automatically.

Manual installation requires downloading the package from some web site, in the form of a tarball or a zipped bunch of files. The tarball has to be expanded into a specific place within the directory tree of the application. After that, for most applications, some command has to be run in order to refresh an internal list of the available packages.

How to install packages, automatically

Usually applications such as MiKTeX have an automatic package download and installation system.

This is available only for popular packages that have been added to the software distribution. (Some new or experimental packages may not be available in the distribution, and this is the only situation where manual installation is mandatory.)

Distributions are usually stored on a large number of computers that are kept aligned with the main distribution source (mirror sites).

Automatic installation guarantees that the right package version for the installed distribution is installed.

Knowing the structure of the application

On Windows, installing and configuring an application usually involves many system directories. For unix-style applications (LaTeX, R, MATLAB), usually everything is installed in the application directory, which may have a rather complex structure. Typically, what we need to know about this structure is where the executable files are and where the add-on packages and documentation are placed.

Let’s see the MiKTeX hierarchy:

executables: MikTex X.Y/miktex/bin

latex packages: MikTex X.Y/tex/latex

packages documentation: MikTex X.Y/doc/latex

Hello world in LaTeX

Write the following in a file prova.tex:

\documentclass{article}
\begin{document}
Hello world!
\end{document}

Then run pdflatex prova.tex from a command line.

A note about MiKTeX installation on Windows: installing the full distribution with most of the packages takes time and disk space. On-the-fly package installation is possible (if allowed by the network firewall settings).

LaTeX document basic structure

The structure of a LaTeX document:

\documentclass{article}
% Preamble containing style packages, definitions, macros
\begin{document}
%
% Body of the document with sections, tables, figures etc.
%
\end{document}

Learn LaTeX

See simple examples from the web: download small2e.tex and sample2e.tex.

Read the HTML help (cs.wlu.edu/~necaise/refs/latex2e/ or a local one) to understand how to make paragraphs, sectioning, environments (equation, table, figure, tabular), and cross-references within the document.

Figure numbering, section page numbers and bibliographic references are managed automatically. A complete refresh (e.g. of the table of contents) may require running pdflatex multiple times, since the output is created in a strictly sequential way.

BibTeX is a companion program for TeX that prints references in different formats (apastyle ...).

Packages

Basic LaTeX allows you to do relatively few things. It evolves through packages provided by the developer community.

Some very useful, commonly used packages:

- geometry: to customize page margins, spacing etc.
- fancyhdr: full customization of page headers, alternating between left and right pages
- savetrees: basic LaTeX has an elegant page layout with large margins, however trees are important
- babel: support for many languages
- inputenc: defines the encoding of special characters (accents, umlauts ...)
- graphicx: to include vector and raster images

“Modern” LaTeX examples

Making slides or posters: Beamer!

beamer, Beamer User Guide, a beamer quickstart

beamerposter (or sciposter)

Drawings, complete control of arrows, overlays etc.: TikZ

TikZ and PGF manual, Examples web page

LaTeX for linguistics: Latex4Ling

If you want to write a real book in LaTeX, see the memoir class: memoir

LaTeX

I’ll provide some examples from my own work on the website.

In particular there are some basic document classes:

- letter
- article
- report
- book

LaTeX is not easy to learn (steep learning curve for beginners), but after some experience you can adapt many examples you find on the web to your purposes.

Unlike Microsoft Word or other similar applications, it is easy to integrate into a script and execute automatically, in order to extend the graphic capabilities of a program (such as MATLAB).

Practice

Configuring MiKTeX for network package installation.

Telling TeXnic (or other editors) where the LaTeX executables are.

Practice with on-the-fly package download and installation.

Hopefully the questionnaire (questionario) and the other LaTeX examples now work ...

Strings

Text processing in MATLAB

Strings in MATLAB are just arrays of characters.

Other scripting languages have more sophisticated text processing tools.

However, there are specialized functions for strings: let’s see doc strings.

For today’s practice see in particular strtok for segmenting a string.

More advanced string functions use regular expressions (see regexp).
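A short sketch of both approaches follows, assuming an arbitrary example sentence. strtok returns the first token and the remainder of the string, so it is called in a loop, whereas regexp with the 'match' option returns all the words at once.

```matlab
sentence = 'The quick brown fox';          % example text (an assumption)
nChars = numel(sentence);                  % a string is a 1-by-N character array

% strtok splits off the first token; the remainder keeps its leading delimiter.
[firstWord, rest] = strtok(sentence);

% Segment the whole sentence by calling strtok repeatedly.
words = {};
remainder = sentence;
while true
    [tok, remainder] = strtok(remainder);
    if isempty(tok)
        break;                             % nothing left to split
    end
    words{end+1} = tok;
end

% The same segmentation with a regular expression.
wordsRe = regexp(sentence, '\w+', 'match');
```

Both words and wordsRe are cell arrays of strings, so individual words are accessed with braces, e.g. words{2}.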

Characters

The internal representation of a character is numeric (an integer code).

The standard encoding for the most common characters is ASCII.

Special characters are encoded using different standards (for the different schemes see the Wikipedia entry Character Encoding).

Some codes correspond to non-printable characters (such as tab or new-line). These can be produced by using the code or as escape sequences: in MATLAB formatting functions such as sprintf and fprintf they are preceded by the special character backslash, for example the string \n corresponds to a new-line character.
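The relation between characters and codes can be checked directly, as in this small sketch. double and char convert between the two representations; the variable names are arbitrary.

```matlab
codeA  = double('A');            % numeric code of the character A
letter = char(97);               % character that has code 97
codes  = double('ciao');         % one code per character of the string

nl = sprintf('\n');              % sprintf expands the escape sequence
isLineFeed = (double(nl) == 10); % new-line is code 10

raw = '\n';                      % in a plain quoted string: backslash followed by n
rawLength = numel(raw);          % therefore two characters, not one
```

The last two lines show that the escape sequence is only translated by formatting functions, not by the quotes themselves.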

Text Files

Files

There are specific functions in MATLAB for reading from and writing to files (similar to C). Basic concepts about files:

- special commands open and close files. This guarantees that other applications do not change the file content during file processing.
- file access is typically sequential (reading or writing one piece after the other) or random (using special functions that move the reading or writing position to specific points in the file).
- files can contain text or binary data. The open instruction can specify the type of file. In text mode, end-of-line and other special characters are recognized automatically. When opening a text file you may specify the type of encoding (accented letters, Chinese characters ...).

Files

For reading and writing files MATLAB has a full set of functions (see help iofun) at different levels:

- dlmread, dlmwrite: read and write entire delimited files (such as .csv tables); they do not need open/close functions.
- fprintf, fscanf: write and read text files using C-style formatting
- fgetl, fgets: read a text file line by line
- fread, fwrite: low-level I/O functions for binary data

The same instructions can be used for virtual files, which can be pipeline communication with peripherals, processes, or serial and parallel ports.
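As an illustration of the high-level functions, here is a minimal round trip with dlmwrite and dlmread. The file name demo_table.csv and the matrix are made up for the example; note also that recent MATLAB releases recommend writematrix and readmatrix for the same job.

```matlab
fname = fullfile(tempdir, 'demo_table.csv');   % hypothetical file name
M = [1 2 3; 4 5 6];

dlmwrite(fname, M, ',');     % write the whole matrix, comma separated
M2 = dlmread(fname, ',');    % read it back; no fopen/fclose needed

delete(fname);               % clean up the demo file
```

After the round trip M2 has the same size and values as M.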

Files

When opening a file you can decide:

- whether to open it in read-only or read-write mode.
- whether to open it as a binary or text file (text files automatically manage end-of-line characters)
- which specific machine format to use (how numbers and characters are encoded as binary numbers); the encodings typically change between operating systems. If you don’t care, use native.
- which character encoding scheme to use (for text files)

See help fopen
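A sketch of the lower-level workflow follows: open, write with fprintf, close, then reopen and read line by line with fgetl. The file name and its content are assumptions chosen for the example.

```matlab
fname = fullfile(tempdir, 'demo_lines.txt');   % hypothetical file name

% Write two lines of text.
fid = fopen(fname, 'w');
if fid == -1
    error('Cannot open %s for writing.', fname);
end
fprintf(fid, '%s %d\n', 'apples', 3);
fprintf(fid, '%s %d\n', 'pears', 12);
fclose(fid);

% Read the file back, one line at a time.
fid = fopen(fname, 'r');
lines = {};
txt = fgetl(fid);            % fgetl drops the end-of-line character
while ischar(txt)            % fgetl returns -1 (not a string) at end of file
    lines{end+1} = txt;
    txt = fgetl(fid);
end
fclose(fid);

delete(fname);               % clean up the demo file
```

Checking the value returned by fopen is worthwhile because a failed open returns -1 instead of raising an error.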

Practice

Today and homework practice

Take a text from a text file (atext.txt) and format it using LaTeX.

After this, try to generate from MATLAB a number of PDF copies of the same text, replacing words at random with blank spaces for completion.

Completion texts were used in the fifties as a psycholinguistic tool to evaluate text readability (with applications to propaganda during the Cold War). The idea is to remove words and see whether people complete the text with the right words.

Our scripts should be able to produce many versions of the same text with different random words replaced by blanks.

Files Input output

This practice is rather ambitious. Let’s split it up:

Practice with file reading and writing (both numbers and strings).

Try first with the high-level functions (dlmwrite, dlmread) and then with the lower-level ones (fgets, fprintf).

Practice with formatting strings (sprintf).
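A few formatting strings that may be handy for this exercise are sketched below. The file name pattern and the numbers are invented for the illustration; the conversion specifiers follow the usual C conventions.

```matlab
fileName = sprintf('copy_%02d.tex', 3);                     % zero-padded integer
label    = sprintf('%s has %d words', 'atext.txt', 120);    % string and integer
rounded  = sprintf('%.2f', pi);                             % two decimal places
row      = sprintf('%d\t%d\n', 4, 5);                       % tab and new-line escapes
```

The same format strings work with fprintf, which writes the result to a file or to the screen instead of returning it.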

System invocation

Automatically execute an external program:

help system

Please use a local working directory (not the Windows Documents or Desktop folders, which are on a remote server ...)
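A possible sketch of calling external programs with system is shown here. The folder latex_demo is a hypothetical local working directory, and the second part only succeeds if pdflatex is installed and reachable from the shell, which is why the returned status is checked rather than assumed.

```matlab
% 1) A command available in both Windows and unix-like shells.
[status, out] = system('echo hello');
greeting = strtrim(out);                 % remove the trailing end-of-line

% 2) Write a tiny LaTeX file into a local working directory ...
workDir = fullfile(tempdir, 'latex_demo');   % hypothetical local folder
if ~exist(workDir, 'dir')
    mkdir(workDir);
end
texFile = fullfile(workDir, 'prova.tex');
fid = fopen(texFile, 'w');
% Passing the text through %s keeps the backslashes untouched.
fprintf(fid, '%s\n', '\documentclass{article}', '\begin{document}', ...
             'Hello world!', '\end{document}');
fclose(fid);

% ... and try to compile it without stopping for user input on errors.
cmd = sprintf('pdflatex -interaction=nonstopmode -output-directory="%s" "%s"', ...
              workDir, texFile);
[texStatus, texLog] = system(cmd);
if texStatus == 0
    disp('PDF created.');
else
    disp('pdflatex failed or is not installed; see texLog for details.');
end
```

A status of 0 conventionally means success; the second output holds whatever the program printed, which is useful for diagnosing LaTeX errors.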

LaTeX

Read the text file and copy it into a tex document.

Execute pdflatex from within MATLAB.

Substitute some (random) words with an underscore in a tex file (use \rule).

Do it automatically from MATLAB (saving a text file with the words that were deleted).

Advanced programmers: keep punctuation ...
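One possible way to approach the substitution step is sketched below, not as the intended solution but as a starting point. The sentence, the number of blanks, the rule size and the output file name are all assumptions, and punctuation is not handled.

```matlab
txt = 'the cat sat on the mat and looked at the dog';   % stand-in for the file content
nBlanks = 3;                                            % how many words to remove

words = strsplit(txt, ' ');
idx = sort(randperm(numel(words), nBlanks));   % random, distinct word positions
removed = words(idx);                          % remember the deleted words

blank = '\rule{2cm}{0.4pt}';                   % LaTeX line to be filled in by the reader
words(idx) = {blank};
clozeText = strjoin(words, ' ');

% Save the deleted words, one per line, for later scoring.
keyFile = fullfile(tempdir, 'removed_words.txt');   % hypothetical file name
fid = fopen(keyFile, 'w');
fprintf(fid, '%s\n', removed{:});
fclose(fid);
```

Running the block several times produces different versions of clozeText, each of which can be written into a tex document and compiled.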

A word about randomization

MATLAB has a function rand that produces a pseudo-random sequence of numbers uniformly distributed between 0 and 1.

Pay attention: every time you call rand after starting MATLAB, the exact same sequence is returned.

This is done so that the same sequence can be reproduced when testing an algorithm. To change the sequence you have to change the seed, that is, the initial value of the random algorithm that deterministically fixes the sequence.

To set a random seed as a function of the current date and time you can run: rand('twister',sum(100*clock)). See the help page for further information.
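The effect of the seed can be verified with a short sketch. It uses the same rand('twister', seed) syntax as above; the seed value 12345 is arbitrary. More recent MATLAB releases document the rng function (for example rng('shuffle')) as the preferred way to do this.

```matlab
seed = 12345;                      % arbitrary fixed seed (an assumption)

rand('twister', seed);
a = rand(1, 5);

rand('twister', seed);             % same seed again ...
b = rand(1, 5);
sameSequence = isequal(a, b);      % ... gives the same numbers

rand('twister', sum(100*clock));   % seed taken from the current date and time
c = rand(1, 5);                    % differs from run to run
```

A fixed seed is convenient while debugging; a time-based seed is what the completion exercise needs, so that every generated copy removes different words.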