Programming for Geographical Information Analysis: Core Skills. Lecture 7: Core Packages: File Input/Output

Upload: vidal

Post on 08-Jan-2016


Page 1: Programming for  Geographical Information Analysis: Core Skills

Programming for Geographical Information Analysis:

Core Skills

Lecture 7: Core Packages: File Input/Output

Page 2: Programming for  Geographical Information Analysis: Core Skills

This lecture

Files

Text files

Binary files

Page 3: Programming for  Geographical Information Analysis: Core Skills

Files

File types

Dealing with files starts with encapsulating the idea of a file in an object

Page 4: Programming for  Geographical Information Analysis: Core Skills

File locations

Captured in two classes:

java.io.File

Encapsulates a file on a drive.

java.net.URL

Encapsulates a Uniform Resource Locator (URL), which could include internet addresses.

Page 5: Programming for  Geographical Information Analysis: Core Skills

java.io.File

Before we can read or write files we need to capture them. The File class represents an external file.

File(String pathname);

File f = new File("e:/myFile.txt");

However, we must remember that different operating systems have different file systems. Note the use of a forward slash: Java copes with most separator differences, but a drive letter such as “e:” won’t work on *NIX, Mac, mobile devices, etc.
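One way to avoid a hard-wired drive letter is to build the path from a location the JVM reports for the current machine. The sketch below is only an illustration: the class name PortablePath, the helper inDirectory() and the file name myFile.txt are assumptions, not part of the lecture.

```java
import java.io.File;

class PortablePath {

    // Builds a File inside the given directory without hard-coding a separator.
    static File inDirectory(String directory, String fileName) {
        // The two-argument constructor joins parent and child using the
        // separator that suits the current operating system.
        return new File(directory, fileName);
    }

    public static void main(String[] args) {
        // "user.home" is a standard system property available on every platform.
        String home = System.getProperty("user.home");
        File f = inDirectory(home, "myFile.txt");
        System.out.println(f.getPath());
    }
}
```

The same code should give a sensible path on Windows, *NIX and Mac, because no drive letter or separator appears in the source.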

Page 6: Programming for  Geographical Information Analysis: Core Skills

Getting file locations

java.awt.FileDialog

Opens an “Open file” box with a directory tree in it. This stays open until the user chooses a file or cancels. Once a file is chosen, use FileDialog’s getDirectory() and getFile() methods to get the directory and filename.

Page 7: Programming for  Geographical Information Analysis: Core Skills

Getting file locations

import java.awt.*;

import java.io.*;

FileDialog fd = new FileDialog(new Frame());

fd.setVisible(true);

File f = null;

// The user may cancel, so only build the File if both parts were returned.
if ((fd.getDirectory() != null) && (fd.getFile() != null)) {

f = new File(fd.getDirectory() + fd.getFile());

}

Page 8: Programming for  Geographical Information Analysis: Core Skills

The application directory

Each object has a java.lang.Class object associated with it. This represents the class loaded into the JVM.

One use is to get resources local to the class, i.e. in the same directory as the .class file. We use a java.net.URL object to do this.

Class thisClass = getClass();

URL url = thisClass.getResource("myFile.txt");

We can then use URL’s getPath() to return the file path as a String for the File constructor.
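A possible alternative to getPath(), sketched here under some assumptions: converting the URL to a URI first decodes escapes such as %20 (used for spaces), which getPath() leaves in place. The class name ResourceToFile and the helper toFile() are hypothetical, and the approach only suits "file:" URLs; a resource packed inside a jar cannot be turned into a File this way.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;

class ResourceToFile {

    // Converts a "file:" URL, such as one returned by getResource(), to a File.
    // new File(URI) throws IllegalArgumentException for non-file URLs.
    static File toFile(URL url) throws URISyntaxException {
        return new File(url.toURI());
    }

    public static void main(String[] args) throws URISyntaxException {
        // Assumption: myFile.txt sits in the same directory as ResourceToFile.class.
        URL url = ResourceToFile.class.getResource("myFile.txt");
        if (url == null) {
            // getResource() returns null when the resource cannot be found.
            System.out.println("myFile.txt was not found next to the class.");
        } else {
            System.out.println(toFile(url).getAbsolutePath());
        }
    }
}
```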

Page 9: Programming for  Geographical Information Analysis: Core Skills

Useful File methods

exists(), canRead() and canWrite()
Test whether the file exists and can be read or written to.

createNewFile() and createTempFile()
Create a new file, and create a new file in “temp” or “tmp”.

delete() and deleteOnExit()
Delete the file (if permissions are correct). Delete when the JVM shuts down.

isDirectory() and listFiles()
Check whether the File is a directory, and return an array of Files representing the files in the directory. Can use a FilenameFilter object to limit the returned Files.
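As a rough sketch of isDirectory(), listFiles() and a FilenameFilter working together, the example below lists only the ".txt" files in a directory. The class name ListTextFiles and the method textFilesIn() are made up for illustration.

```java
import java.io.File;
import java.io.FilenameFilter;

class ListTextFiles {

    // Returns the ".txt" files in a directory, or an empty array if the
    // File is not a directory or cannot be listed.
    static File[] textFilesIn(File directory) {
        if (!directory.isDirectory()) {
            return new File[0];
        }
        FilenameFilter onlyText = new FilenameFilter() {
            @Override
            public boolean accept(File dir, String name) {
                return name.toLowerCase().endsWith(".txt");
            }
        };
        File[] found = directory.listFiles(onlyText);
        // listFiles() returns null if an I/O error occurs.
        return (found == null) ? new File[0] : found;
    }

    public static void main(String[] args) {
        // "." is the directory the program was started from.
        for (File f : textFilesIn(new File("."))) {
            System.out.println(f.getName() + " readable: " + f.canRead());
        }
    }
}
```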

Page 10: Programming for  Geographical Information Analysis: Core Skills

Files

File types

As we’ll see, the type of the file has a big effect on how we handle it.

Page 11: Programming for  Geographical Information Analysis: Core Skills

Binary vs. Text files

All files are really just binary 0 and 1 bits. In ‘binary’ files, data is stored in binary representations of the primitive types:

00000000 00000000 00000000 00000000 = int 0
00000000 00000000 00000000 00000001 = int 1
00000000 00000000 00000000 00000010 = int 2
00000000 00000000 00000000 00000100 = int 4
00000000 00000000 00000000 00110001 = int 49
00000000 00000000 00000000 01000001 = int 65
00000000 00000000 00000000 11111111 = int 255

8 bits = 1 byte

Page 12: Programming for  Geographical Information Analysis: Core Skills

Binary vs. Text files

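In text files, which can be read in Notepad++ etc., characters are stored by code number in smaller areas, shown here as 2 bytes each (the size of a Java char; many text encodings use a single byte for these characters):

00000000 01000001 = code 65 = char “A”
00000000 01100001 = code 97 = char “a”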

Page 13: Programming for  Geographical Information Analysis: Core Skills

Characters

All chars in Java are 16-bit codes from an international character set called Unicode.

Unicode extends the American Standard Code for Information Interchange (ASCII), whose characters are represented by the ints 0 to 127, and its superset, the 8-bit ISO-Latin 1 character set (0 to 255).

There are some invisible characters used for things like the end of lines.

char back = 8; // Try 7, as well!
System.out.println("hello" + back + "world");

The easiest way to use stuff like newline characters is to use escape characters.

System.out.println("hello\nworld");

Page 14: Programming for  Geographical Information Analysis: Core Skills

Binary vs. Text files

Note that:

00000000 00110001 = code 49 = char “1”

Seems much smaller – it only uses 2 bytes to store the character “1”, whereas storing the int 1 takes 4 bytes.

However each character takes this, so:

00000000 00110001 = code 49 = char “1”

00000000 00110001 00000000 00110010 = code 49, 50 = char “1” “2”

00000000 00110001 00000000 00110010 00000000 00110111 = code 49, 50, 55 = char “1” “2” “7”

Whereas:

00000000 00000000 00000000 01111111 = int 127
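The size comparison above can be checked with a short program. This is a sketch under the slides’ assumption that each character takes 2 bytes, which it models with the UTF-16 encoding; a single-byte encoding would halve the text sizes. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class BinaryVersusText {

    // Number of bytes needed to hold an int in binary form (always 4).
    static int binarySize(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array().length;
    }

    // Number of bytes needed to hold the same value as 2-byte characters.
    static int textSize(int value) {
        return String.valueOf(value).getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        int[] samples = {1, 127, 2000000000};
        for (int v : samples) {
            System.out.println(v + ": binary " + binarySize(v)
                    + " bytes, text " + textSize(v) + " bytes");
        }
    }
}
```

The int always needs 4 bytes, while the text form grows by 2 bytes per digit, so 127 needs 6 bytes and a ten-digit number needs 20.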

Page 15: Programming for  Geographical Information Analysis: Core Skills

Binary vs. Text files

In short, it is much more efficient to store anything with a lot of numbers as binary (not text).

However, as disk space is cheap and networks are fast, and it is useful to be able to read data in Notepad etc., people are increasingly using text formats like XML.

As we’ll see, the filetype determines how we deal with files.

Page 16: Programming for  Geographical Information Analysis: Core Skills

Review

File f = new File("e:/myFile.txt");

Three methods of getting file locations:
Hardwiring
FileDialog
Class getResource()

Need to decide the kind of file we want to deal with.

Page 17: Programming for  Geographical Information Analysis: Core Skills

This lecture

Files

Text files

Binary files

Page 18: Programming for  Geographical Information Analysis: Core Skills

Input and Output (I/O)

So, how do we deal with files (and other types of I/O)?

In Java we use address-encapsulating objects, and input and output “Streams”.

Streams are objects which represent the external resources which we can read from or write to. We don’t need to worry about “how”.

Input Streams are used to get stuff into the program. Output streams are used to output from the program.

Page 19: Programming for  Geographical Information Analysis: Core Skills

Streams

Streams are based on four abstract classes…

java.io.Reader and Writer
Work on character streams – that is, treat everything like it’s going to be a character.

java.io.InputStream and OutputStream
Work on byte streams – that is, treat everything like it’s binary data.

Page 20: Programming for  Geographical Information Analysis: Core Skills

Character based streams

Two abstract superclasses – Reader and Writer.

These are used for a variety of character streams.

Most important are:
FileReader: for reading files.
FileWriter: for writing files.

Page 21: Programming for  Geographical Information Analysis: Core Skills

Example

File f = new File("myFile.txt");
FileReader fr = null;
try {
    fr = new FileReader(f);
} catch (FileNotFoundException fnfe) {
    fnfe.printStackTrace();
}
try {
    // Read one character out of the file. read() returns an int:
    // the character code, or -1 at the end of the file.
    int code = fr.read();
    char char1 = (char) code;
    // Close the connection to the file so others can use it.
    fr.close();
} catch (IOException ioe) {
    ioe.printStackTrace();
}

Page 22: Programming for  Geographical Information Analysis: Core Skills

Example

File f = new File("myFile.txt");
FileWriter fw = null;
try {
    // Note this boolean is optional and sets whether to append to the
    // file (true) or overwrite it (false). Default is overwrite.
    fw = new FileWriter(f, true);
} catch (IOException ioe) {
    ioe.printStackTrace();
}
try {
    fw.write("A");
    fw.flush(); // Make sure everything in the stream is written out.
    fw.close();
} catch (IOException ioe) {
    ioe.printStackTrace();
}
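To see the append flag in action, here is a minimal sketch that writes to a temporary file three times and reads the result back. It assumes Java 7 or later for try-with-resources, which closes (and so flushes) the streams automatically; the class and method names are invented for the example.

```java
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

class AppendOrOverwrite {

    // Writes text to the file, either appending to it or replacing its contents.
    static void write(File f, String text, boolean append) throws IOException {
        try (FileWriter fw = new FileWriter(f, append)) {
            fw.write(text);
        }
    }

    // Reads the whole file back, one character at a time.
    static String readAll(File f) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (FileReader fr = new FileReader(f)) {
            int code;
            while ((code = fr.read()) != -1) {
                sb.append((char) code);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("append-demo", ".txt");
        write(f, "A", false); // overwrite: the file holds "A"
        write(f, "B", true);  // append: the file holds "AB"
        write(f, "C", false); // overwrite: the file holds "C"
        System.out.println(readAll(f));
        f.delete();
    }
}
```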

Page 23: Programming for  Geographical Information Analysis: Core Skills

Buffers

Plainly it is a pain to read a character at a time.

It is also possible that the filesystem may be slow or intermittent, which causes issues.

It is common to wrap streams in buffer streams to cope with these two issues.

BufferedReader br = new BufferedReader(fr);

BufferedWriter bw = new BufferedWriter(fw);

Page 24: Programming for  Geographical Information Analysis: Core Skills

Example

BufferedReader br = new BufferedReader(fr); // Remember fr is a FileReader, not a File.
int lines = -1;
String textIn = " ";
String[] file = null;
try {
    // Run through the file once to count the lines and make a String array the right size.
    while (textIn != null) {
        textIn = br.readLine();
        lines++;
    }
    file = new String[lines];
    // Close the buffer and remake both the FileReader and the buffer
    // to go back to the start of the file (f is the File fr was opened on).
    br.close();
    br = new BufferedReader(new FileReader(f));
    // Read the file into the array.
    for (int i = 0; i < lines; i++) {
        file[i] = br.readLine();
    }
    br.close();
} catch (IOException ioe) {
    ioe.printStackTrace();
}
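If reading the file twice is unwelcome, one alternative (not the approach shown on the slide) is to collect the lines in a java.util.ArrayList, which grows as needed, and convert it to an array at the end. A minimal sketch, with a hypothetical class ReadLines:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class ReadLines {

    // Reads every line of a text file in a single pass.
    static String[] readLines(File f) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new FileReader(f))) {
            String line;
            // readLine() returns null once the end of the file is reached.
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines.toArray(new String[0]);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the file to read is given as the first argument.
        if (args.length > 0) {
            String[] file = readLines(new File(args[0]));
            System.out.println(file.length + " lines read.");
        }
    }
}
```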

Page 25: Programming for  Geographical Information Analysis: Core Skills

Example

String[][] strData = getStringArray();
BufferedWriter bw = new BufferedWriter(fw); // Remember fw is a FileWriter, not a File.
try {
    for (int i = 0; i < strData.length; i++) {
        for (int j = 0; j < strData[i].length; j++) {
            bw.write(strData[i][j] + ", ");
        }
        bw.newLine();
    }
    bw.close();
} catch (IOException ioe) {
    ioe.printStackTrace();
}

Page 26: Programming for  Geographical Information Analysis: Core Skills

Processing data

This is fine for text, but what if we want values and we have text representations of the values?

There is a difference between 0.5 and “0.5”.

The computer understands the first as a number, but not the second.

First, parse (split and process) the file to get each individual String representing the numbers.

Second, turn the text in the file into real numbers.

Page 27: Programming for  Geographical Information Analysis: Core Skills

java.util.StringTokenizer

String line = "Call me Dave";
StringTokenizer st = new StringTokenizer(line);
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}

prints the following output:

Call
me
Dave

Default separators: space, tab, newline, carriage-return character, and form-feed.

Page 28: Programming for  Geographical Information Analysis: Core Skills

Processing data

There are wrapper classes for each primitive that will do the conversion:

double d = Double.parseDouble("0.5");

int i = Integer.parseInt("1");

boolean b = Boolean.parseBoolean("true");

On the other hand, for writing, String can convert most things to itself:

String str = String.valueOf(0.5);

String str = String.valueOf(data[i][j]);
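The parse methods throw a NumberFormatException when the text is not a valid number, which is common with real data files. The sketch below shows one way to cope, by falling back to a caller-supplied value; the class SafeParse, the method parseOrDefault() and the choice of fallback are all assumptions for illustration.

```java
class SafeParse {

    // Converts text to a double, returning the fallback if the text is not a number.
    static double parseOrDefault(String text, double fallback) {
        try {
            // parseDouble ignores leading and trailing whitespace.
            return Double.parseDouble(text);
        } catch (NumberFormatException nfe) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("0.5", -1.0));
        System.out.println(parseOrDefault("no data", -1.0));
    }
}
```

Whether a fallback value, a skipped record or a reported error is the right response depends on the data; the point is only that the exception should be expected.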

Page 29: Programming for  Geographical Information Analysis: Core Skills

Example

for (int i = 0; i < lines; i++) {
    file[i] = br.readLine();
}
br.close();

double[][] data = new double[lines][];
for (int i = 0; i < lines; i++) {
    // Comma or space separated data.
    StringTokenizer st = new StringTokenizer(file[i], ", ");
    data[i] = new double[st.countTokens()];
    int j = 0;
    while (st.hasMoreTokens()) {
        data[i][j] = Double.parseDouble(st.nextToken());
        j++;
    }
}

Page 30: Programming for  Geographical Information Analysis: Core Skills

Example

double[][] dataIn = getdata();
BufferedWriter bw = new BufferedWriter(fw);
String tempStr = "";
try {
    for (int i = 0; i < dataIn.length; i++) {
        for (int j = 0; j < dataIn[i].length; j++) {
            // Converts the double to a String.
            tempStr = String.valueOf(dataIn[i][j]);
            bw.write(tempStr + ", ");
        }
        bw.newLine();
    }
    bw.close();
} catch (IOException ioe) {
    ioe.printStackTrace();
}

Page 31: Programming for  Geographical Information Analysis: Core Skills

java.util.Scanner

Wraps around all this to make reading easy:

Scanner s = null;
try {
    s = new Scanner(new BufferedReader(new FileReader("myText.txt")));
    while (s.hasNext()) {
        System.out.println(s.next());
    }
} catch (FileNotFoundException fnfe) {
    fnfe.printStackTrace();
} finally {
    // Close the Scanner whether or not the read succeeded.
    if (s != null) {
        s.close();
    }
}

However, there is no token counter, so it is not great for reading into arrays.

Page 32: Programming for  Geographical Information Analysis: Core Skills

Scanners

By default looks for whitespace to tokenise on.
Can set up a regular expression to look for.
Comma followed by optional space:

s.useDelimiter(",\\s*");

Page 33: Programming for  Geographical Information Analysis: Core Skills

Data conversion

s.next() / s.hasNext()                  String
s.nextBoolean() / s.hasNextBoolean()    boolean
s.nextDouble() / s.hasNextDouble()      double
s.nextInt() / s.hasNextInt()            int
s.nextLine() / s.hasNextLine()          String

If the type doesn’t match, throws InputMismatchException.
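The hasNext…() checks can be used to avoid the exception. The following is a sketch only: it scans a String rather than a file to stay self-contained, collects values in a List because a Scanner cannot count tokens in advance, and fixes the locale to Locale.US so that “1.5” is read with a decimal point on any machine. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class ScanDoubles {

    // Pulls out every comma-separated token that is a valid double.
    static List<Double> readDoubles(String text) {
        List<Double> values = new ArrayList<>();
        try (Scanner s = new Scanner(text)) {
            s.useLocale(Locale.US);
            s.useDelimiter(",\\s*"); // comma followed by optional space
            while (s.hasNext()) {
                if (s.hasNextDouble()) {
                    values.add(s.nextDouble());
                } else {
                    s.next(); // not a number: skip the token
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(readDoubles("1.5, 2, x, 4"));
    }
}
```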

Page 34: Programming for  Geographical Information Analysis: Core Skills

Reading from keyboard

Scanner s = new Scanner(System.in);

int i = s.nextInt();

String str = s.nextLine();
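One thing to watch for with the code above: nextInt() stops just after the number, so the nextLine() that follows returns only the (usually empty) rest of that line. A minimal sketch of one way round this, with made-up names and prompts:

```java
import java.util.Scanner;

class KeyboardInput {

    // Reads an int, then the whole of the following line.
    static String readAgeAndName(Scanner s) {
        int age = s.nextInt();
        s.nextLine(); // use up the rest of the line the int was typed on
        String name = s.nextLine();
        return name + " is " + age;
    }

    public static void main(String[] args) {
        Scanner s = new Scanner(System.in);
        System.out.println("Type an age, press Enter, then type a name:");
        System.out.println(readAgeAndName(s));
    }
}
```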

Page 35: Programming for  Geographical Information Analysis: Core Skills

Parsing Strings

Usually with text we want to extract useful information.

Search and replace.

Page 36: Programming for  Geographical Information Analysis: Core Skills

String searches

startsWith(String prefix), endsWith(String suffix)
Returns a boolean.

indexOf(int ch), indexOf(int ch, int fromIndex)
Returns an int representing the position of the first instance of a given Unicode character (passed as its integer code), or -1 if it is not found.

indexOf(String str), indexOf(String str, int fromIndex)
Returns an int representing the position of the first instance of a given String to find, or -1 if it is not found.

lastIndexOf
Same as indexOf, but last rather than first.

Page 37: Programming for  Geographical Information Analysis: Core Skills

String manipulation

replace(char oldChar, char newChar)
Replaces every occurrence of one character with another.

substring(int beginIndex, int endIndex)
substring(int beginIndex)
Pulls out part of the String and returns it.

toLowerCase(), toUpperCase()
Return a copy of the String with the case changed.

trim()
Cuts white space off the front and back of a String.

Page 38: Programming for  Geographical Information Analysis: Core Skills

Example

String str = "old pond; frog leaping; splash";

int start = str.indexOf("leaping");

int end = str.indexOf(";", start);

String startStr = str.substring(0, start);

String endStr = str.substring(end);

str = startStr + "jumping" + endStr;

str is now “old pond; frog jumping; splash”

Page 39: Programming for  Geographical Information Analysis: Core Skills

Review

Use a java.util.Scanner where possible. Otherwise use a FileWriter/Reader.
But remember to buffer both.

Page 40: Programming for  Geographical Information Analysis: Core Skills

This lecture

Files

Text files

Binary files

Page 41: Programming for  Geographical Information Analysis: Core Skills

Byte streams

InputStream
Read methods return -1 at the end of the resource.

FileInputStream(File fileObject)
Allows us to read bytes from a file.

OutputStream
Used to write to resources.

FileOutputStream(File fileObject)
Used to write to a file if the user has permission. Overwrites old material in file.

FileOutputStream(File fileObject, boolean append)
Only overwrites if append is false.

Page 42: Programming for  Geographical Information Analysis: Core Skills

Example

FileInputStream ourStream = null;
File f = new File("e:/myFile.bin");
try {
    ourStream = new FileInputStream(f);
} catch (FileNotFoundException fnfe) {
    // Do something.
}

The Stream is then usually used in the following fashion:

int c = 0;
while ((c = ourStream.read()) >= 0) {
    // Add c to a byte array (more on this shortly).
}
ourStream.close();

Page 43: Programming for  Geographical Information Analysis: Core Skills

Byte streams II

There are cases where we want to read from and write to arrays using streams.

These are usually used as a convenient way of reading and writing a byte array from other streams and over the network.

ByteArrayInputStream
ByteArrayOutputStream

Page 44: Programming for  Geographical Information Analysis: Core Skills

Example

FileInputStream fin = new FileInputStream(file);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
int c;
while ((c = fin.read()) >= 0) {
    baos.write(c);
}
byte[] b = baos.toByteArray();

Saves us having to find out the size of the byte array, as ByteArrayOutputStream has a toByteArray() method.
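A variation on the same idea, offered as a sketch rather than the lecture’s method: reading a block of bytes at a time with read(byte[]) instead of a single byte per call. The class name ReadAllBytes and the 4096-byte block size are arbitrary choices.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadAllBytes {

    // Drains any InputStream into a byte array without knowing its size first.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        // read(byte[]) returns how many bytes it put in the array, or -1 at the end.
        while ((n = in.read(chunk)) != -1) {
            baos.write(chunk, 0, n);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A ByteArrayInputStream stands in for a file or network stream here.
        byte[] source = {10, 20, 30};
        byte[] copy = readAll(new ByteArrayInputStream(source));
        System.out.println(copy.length + " bytes read.");
    }
}
```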

Page 45: Programming for  Geographical Information Analysis: Core Skills

Buffering streams

As with the FileReader/FileWriter:

BufferedInputStream
BufferedOutputStream

You wrap the classes using the buffer’s constructors.
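As an illustration of the wrapping, the sketch below copies one file to another through buffered streams. The class BufferedCopy and its copy() method are hypothetical, and try-with-resources (Java 7 or later) is assumed for closing the streams.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class BufferedCopy {

    // Copies a file byte by byte and returns the number of bytes copied.
    static long copy(File from, File to) throws IOException {
        long count = 0;
        // Each buffer is built by passing the underlying stream to its constructor.
        try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(from));
             BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(to))) {
            int c;
            while ((c = in.read()) >= 0) {
                out.write(c);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: source and destination paths are given as arguments.
        if (args.length == 2) {
            System.out.println(copy(new File(args[0]), new File(args[1])) + " bytes copied.");
        }
    }
}
```

Although the loop still asks for one byte at a time, the buffers fetch and store larger blocks behind the scenes, so the file system is touched far less often.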

Page 46: Programming for  Geographical Information Analysis: Core Skills

Other byte streams

RandomAccessFile
Used for reading and writing to files when you need to write into the middle of files as opposed to the end.

PrintStream
Was used in Java 1.0 to write characters, but didn’t do a very good job of it. PrintWriter is now generally preferred for writing characters, although PrintStream is still in use: System.out is a static final object of this type.

Object Streams
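For RandomAccessFile, a minimal sketch of writing into the middle of a file: it opens the file in "rw" mode, moves to a byte position with seek(), and replaces a single byte there. The class name OverwriteMiddle and the helper writeByteAt() are assumptions for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class OverwriteMiddle {

    // Replaces one byte at the given position, leaving the rest of the file alone.
    static void writeByteAt(File f, long position, int value) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.seek(position); // positions are counted in bytes from 0
            raf.write(value);   // writes the low 8 bits of value
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("raf-demo", ".txt");
        Files.write(f.toPath(), "hello".getBytes(StandardCharsets.US_ASCII));
        writeByteAt(f, 1, 'a'); // the file now holds "hallo"
        System.out.println(new String(Files.readAllBytes(f.toPath()), StandardCharsets.US_ASCII));
        f.delete();
    }
}
```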

Page 47: Programming for  Geographical Information Analysis: Core Skills

Serialization

Given that we can read and write bytes to streams, there’s nothing to stop us writing objects themselves from the memory to a stream.

This lets us transmit objects across the network and save the state of objects in files.

This is known as object serialization. More details at: http://www.tutorialspoint.com/java/java_serialization.htm
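A small sketch of the idea, using ObjectOutputStream and ObjectInputStream over a byte array so that no file or network is needed. The Location class and its values are invented for the example; the only requirement shown is that the class implements java.io.Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializeDemo {

    // A hypothetical class; objects must implement Serializable to be written.
    static class Location implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final double x;
        final double y;

        Location(String name, double x, double y) {
            this.name = name;
            this.x = x;
            this.y = y;
        }
    }

    // Writes the object's state to a byte array.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(obj);
        }
        return baos.toByteArray();
    }

    // Rebuilds an object from bytes produced by toBytes().
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        Location original = new Location("Site A", 1.5, 2.5);
        Location copy = (Location) fromBytes(toBytes(original));
        System.out.println(copy.name + " at " + copy.x + ", " + copy.y);
    }
}
```

Swapping the byte array streams for a FileOutputStream and FileInputStream would save the object to a file instead.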

Page 48: Programming for  Geographical Information Analysis: Core Skills

Summary

We can represent and explore the files on a machine with the File class.

To save us having to understand how external info is produced, Java uses streams.

We can read and write bytes to files or arrays.
We can store or send objects using streams.
We can read and write characters to files or arrays.

We should always try to use buffers around our streams to ensure access.