Intro to Lab 1 and SimpleDB Overview
Labs
Use:
- Java for code
- GitHub for version control
Advantages:
- Java lets us focus on DB internals
- GitHub lets you submit as often as you like
Java Overview
- Everything runs in the Java Virtual Machine (JVM)
- Everything is an object or a scalar (primitive value)
- JUnit tests
- Use Javadoc to navigate codebase
GitHub
- Distributed version control
- Commit sets of changes locally
- Push changes to the server (github.com): important!
What is SimpleDB?
• A basic database system
• SQL front-end (provided for later labs)
– Heap files (Lab 1)
– Buffer Pool (Labs 1-4)
– Basic Operators (Labs 1 & 2)
  – Scan, Filter, JOIN, Aggregate
– B-Tree Indexes (Lab 3)
– Transactions (Lab 4)
• Javadoc is your friend!
Module Diagram
Database
Singleton Database:
- Catalog => list of DB tables
  Database.getCatalog()
- BufferPool => caches DB pages in memory
  Database.getBufferPool()
- LogFile (ignore for Lab 1)
  Database.getLogFile()
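The singleton access pattern on this slide can be sketched roughly as follows. This is a simplified stand-in, not the SimpleDB source: the class name `MiniDatabase` and the empty `Catalog` and `BufferPool` placeholders are assumptions made for illustration. Only the accessor names `getCatalog()` and `getBufferPool()` come from the slide.

```java
// Simplified sketch of a singleton "database" object.
// MiniDatabase and its nested placeholder classes are hypothetical names.
class MiniDatabase {
    // Placeholder components; the real classes do far more.
    static class Catalog {}
    static class BufferPool {}

    // The single shared instance, created once when the class is loaded.
    private static final MiniDatabase INSTANCE = new MiniDatabase();

    private final Catalog catalog = new Catalog();
    private final BufferPool bufferPool = new BufferPool();

    // Private constructor: no other code can create a second instance.
    private MiniDatabase() {}

    // Static accessors in the style of Database.getCatalog().
    static Catalog getCatalog() { return INSTANCE.catalog; }
    static BufferPool getBufferPool() { return INSTANCE.bufferPool; }

    public static void main(String[] args) {
        // Every caller sees the same Catalog and BufferPool objects.
        System.out.println(getCatalog() == getCatalog());       // prints "true"
        System.out.println(getBufferPool() == getBufferPool()); // prints "true"
    }
}
```

Because the accessors are static, any operator can reach the shared catalog or buffer pool without being handed a reference to it.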
Catalog
Catalog:
=> Stores a list of all tables in the database

| DbFile ID | Table  |
|-----------|--------|
| 001       | Table1 |
| 002       | Table2 |
| 003       | Table3 |

Table:
- DbFile file
- String name
- String primary_key
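As a rough illustration of the lookup table on this slide, the sketch below maps a file ID to a table entry. It is not the SimpleDB `Catalog` class: `MiniCatalog`, its method names, and the use of a `String` in place of a `DbFile` are all assumptions made to keep the example self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical, cut-down catalog: file ID -> table entry.
class MiniCatalog {
    // One entry with the three fields listed on the slide.
    // A String stands in for the DbFile here.
    static class Table {
        final String file;
        final String name;
        final String primaryKey;

        Table(String file, String name, String primaryKey) {
            this.file = file;
            this.name = name;
            this.primaryKey = primaryKey;
        }
    }

    private final Map<Integer, Table> tablesById = new LinkedHashMap<>();

    // Registers a table under the ID of its backing file.
    void addTable(int fileId, Table table) {
        tablesById.put(fileId, table);
    }

    // Looks a table up by file ID, failing loudly if it is unknown.
    Table getTable(int fileId) {
        Table t = tablesById.get(fileId);
        if (t == null) {
            throw new NoSuchElementException("no table with id " + fileId);
        }
        return t;
    }

    public static void main(String[] args) {
        MiniCatalog catalog = new MiniCatalog();
        catalog.addTable(1, new Table("table1.dat", "Table1", "id"));
        catalog.addTable(2, new Table("table2.dat", "Table2", "id"));
        catalog.addTable(3, new Table("table3.dat", "Table3", "id"));
        System.out.println(catalog.getTable(2).name); // prints "Table2"
    }
}
```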
BufferPool
Buffer Pool:
=> Caches recently accessed database pages in memory

| Page ID | Page  |
|---------|-------|
| 001     | Page1 |
| 003     | Page3 |
| 007     | Page7 |

Page:
- PageId id
- Tuple tuples[]
- Byte header[]
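The caching idea can be sketched as a map from page ID to page that falls back to disk only on a miss. This is a minimal illustration under stated assumptions, not the SimpleDB `BufferPool`: `MiniBufferPool`, `readFromDisk`, and the integer page IDs are hypothetical, and the sketch has no capacity limit or eviction.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical page cache: page ID -> in-memory page.
class MiniBufferPool {
    static class Page {
        final int id;
        Page(int id) { this.id = id; }
    }

    private final Map<Integer, Page> cache = new HashMap<>();
    private int diskReads = 0;

    // Stand-in for reading a page from its file on disk.
    private Page readFromDisk(int pageId) {
        diskReads++;
        return new Page(pageId);
    }

    // Returns the cached page if present; otherwise loads and caches it.
    Page getPage(int pageId) {
        Page p = cache.get(pageId);
        if (p == null) {
            p = readFromDisk(pageId);
            cache.put(pageId, p);
        }
        return p;
    }

    int diskReads() { return diskReads; }

    public static void main(String[] args) {
        MiniBufferPool pool = new MiniBufferPool();
        pool.getPage(1);
        pool.getPage(3);
        pool.getPage(1); // served from the cache, no disk read
        System.out.println(pool.diskReads()); // prints "2"
    }
}
```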
HeapFile (Implements DbFile)
Heap File:
- File file
- TupleDesc td
- DbFileIterator it
Tuple Descriptor:
- Field1 Type, Field1 Name
- Field2 Type, Field2 Name
- Field3 Type, Field3 Name
- …
File (on disk):
Page1 Page2 Page3 …
Iterate through Tuples in Heap Pages:
DbFileIterator it
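Iterating through the tuples of a heap file amounts to walking the pages in order and yielding each page's tuples in turn. The sketch below shows that flattening logic only. It is not the SimpleDB `HeapFile` or `DbFileIterator`: `MiniHeapFile` is a hypothetical name, each inner list stands in for a page, and each `String` stands in for a tuple.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical heap file: an ordered list of pages, each holding tuples.
class MiniHeapFile {
    private final List<List<String>> pages;

    MiniHeapFile(List<List<String>> pages) {
        this.pages = pages;
    }

    // Walks the pages in order and yields the tuples of each page in turn.
    Iterator<String> iterator() {
        return new Iterator<String>() {
            private int pageNo = 0;                  // next page to open
            private Iterator<String> current = null; // tuples of the open page

            @Override
            public boolean hasNext() {
                // Move on to later pages until one has tuples left.
                while (current == null || !current.hasNext()) {
                    if (pageNo >= pages.size()) {
                        return false;
                    }
                    current = pages.get(pageNo++).iterator();
                }
                return true;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        MiniHeapFile file = new MiniHeapFile(Arrays.asList(
                Arrays.asList("t1", "t2"),
                new ArrayList<String>(), // an empty page is skipped
                Arrays.asList("t3")));
        Iterator<String> it = file.iterator();
        while (it.hasNext()) {
            System.out.println(it.next()); // prints t1, t2, t3 on separate lines
        }
    }
}
```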
HeapPage (Implements Page)
Heap Page:
- HeapPageId pid
- TupleDesc td
- Byte header[]
- Tuple tuples[]
Tuple Descriptor:
- Field1 Type, Field1 Name
- Field2 Type, Field2 Name
- Field3 Type, Field3 Name
- …
Header:
01100110 11111111 11101101 …
Tuples:
Empty Tuple1 Tuple2 …
Tuple:
Field1 Field2 Field3 …
Fields and Tuples are Fixed Width!
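Because tuples are fixed width, the page layout reduces to arithmetic. The sketch below works through it under assumptions that are not stated on the slide: a 4096-byte page, one header bit per tuple slot, and slot i stored as bit (i % 8) of header byte (i / 8), counting from the least significant bit. That bit order is consistent with the slide's example (header byte 01100110 next to "Empty Tuple1 Tuple2"), but treat it as a guess and check the lab handout. `HeapPageLayout` and its method names are hypothetical.

```java
// Hypothetical helper for heap-page layout arithmetic.
class HeapPageLayout {
    // Each slot costs tupleBytes * 8 data bits plus 1 header bit, so
    // slots = floor(pageBytes * 8 / (tupleBytes * 8 + 1)).
    static int slotsPerPage(int pageBytes, int tupleBytes) {
        return (pageBytes * 8) / (tupleBytes * 8 + 1);
    }

    // One header bit per slot, rounded up to whole bytes.
    static int headerBytes(int slots) {
        return (slots + 7) / 8;
    }

    // Assumed bit order: least significant bit of each byte comes first.
    static boolean isSlotUsed(byte[] header, int slot) {
        return ((header[slot / 8] >> (slot % 8)) & 1) == 1;
    }

    // Fixed-width tuples make the byte offset of a slot a simple product.
    static int tupleOffset(int headerBytes, int tupleBytes, int slot) {
        return headerBytes + slot * tupleBytes;
    }

    public static void main(String[] args) {
        int tupleBytes = 3 * 4; // three 4-byte int fields
        int slots = slotsPerPage(4096, tupleBytes);
        int hdr = headerBytes(slots);
        System.out.println(slots + " slots, " + hdr + " header bytes");
        // prints "337 slots, 43 header bytes"

        byte[] header = { (byte) 0b01100110 };
        System.out.println(isSlotUsed(header, 0)); // prints "false" (empty)
        System.out.println(isSlotUsed(header, 1)); // prints "true"
        System.out.println(tupleOffset(hdr, tupleBytes, 2)); // prints "67"
    }
}
```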
SeqScan (Implements DbIterator)
• DbIterator interface implemented by all operators
– open()
– close()
– getTupleDesc()
– hasNext()
– next()
– rewind()
• Iterator model: chain iterators together
– Use DbFileIterator from HeapFile
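To make "chain iterators together" concrete, here is a small sketch in which a filter operator pulls tuples from a scan operator through a shared interface. It is not the SimpleDB API: `TupleIterator` is a hypothetical, cut-down version of the method list above (it omits `getTupleDesc()`), tuples are plain `int[]` arrays, and `Scan` reads from an in-memory list rather than a `DbFileIterator`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

class IteratorChainDemo {
    // Hypothetical, simplified operator interface.
    interface TupleIterator {
        void open();
        boolean hasNext();
        int[] next();
        void rewind();
        void close();
    }

    // Leaf operator: yields every row of an in-memory table.
    static class Scan implements TupleIterator {
        private final List<int[]> rows;
        private int pos = -1; // -1 means "not open"

        Scan(List<int[]> rows) { this.rows = rows; }

        public void open() { pos = 0; }
        public boolean hasNext() { return pos >= 0 && pos < rows.size(); }
        public int[] next() {
            if (!hasNext()) throw new NoSuchElementException();
            return rows.get(pos++);
        }
        public void rewind() { pos = 0; }
        public void close() { pos = -1; }
    }

    // Operator that pulls from a child and passes on matching tuples only.
    static class Filter implements TupleIterator {
        private final TupleIterator child;
        private final Predicate<int[]> keep;
        private int[] lookahead; // next matching tuple, if already fetched

        Filter(TupleIterator child, Predicate<int[]> keep) {
            this.child = child;
            this.keep = keep;
        }

        public void open() { child.open(); lookahead = null; }
        public boolean hasNext() {
            // Pull from the child until a tuple passes the predicate.
            while (lookahead == null && child.hasNext()) {
                int[] t = child.next();
                if (keep.test(t)) lookahead = t;
            }
            return lookahead != null;
        }
        public int[] next() {
            if (!hasNext()) throw new NoSuchElementException();
            int[] t = lookahead;
            lookahead = null;
            return t;
        }
        public void rewind() { child.rewind(); lookahead = null; }
        public void close() { child.close(); lookahead = null; }
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(
                new int[]{1, 10}, new int[]{2, 20}, new int[]{3, 30});
        // Chain: Filter pulls from Scan; the caller pulls from Filter.
        TupleIterator plan = new Filter(new Scan(rows), t -> t[1] >= 20);
        plan.open();
        while (plan.hasNext()) {
            System.out.println(Arrays.toString(plan.next()));
        }
        plan.close();
        // prints "[2, 20]" then "[3, 30]"
    }
}
```

The caller only ever talks to the top of the chain, which is why any operator can be stacked on any other as long as both implement the same interface.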
// construct a 3-column table schema
Type types[] = new Type[]{ Type.INT_TYPE, Type.INT_TYPE, Type.INT_TYPE };
String names[] = new String[]{ "field0", "field1", "field2" };
TupleDesc descriptor = new TupleDesc(types, names);
// create the table, associate it with some_data_file.dat
// and tell the catalog about the schema of this table.
HeapFile table1 = new HeapFile(new File("some_data_file.dat"), descriptor);
Database.getCatalog().addTable(table1);
// construct the query: we use a simple SeqScan, which spoonfeeds
// tuples via its iterator.
TransactionId tid = new TransactionId();
SeqScan f = new SeqScan(tid, table1.id());
// and run it
f.open();
while (f.hasNext()) {
    Tuple tup = f.next();
    System.out.println(tup);
}
f.close();
Database.getBufferPool().transactionComplete(tid);
HeapFileEncoder.java
• Because you haven’t implemented insertTuple, you have no way to create data files
• HeapFileEncoder converts CSV files to HeapFiles
• Usage:
– java -jar dist/simpledb.jar convert csv-file.txt numFields
• Produces a file, csv-file.dat, that can be passed to the HeapFile constructor.
Compiling, Testing, and Running
• Demo on running tests and debugging with Eclipse