
Web Application Engineering: Data Modeling

Matthew Dailey

Information and Communication Technologies, Asian Institute of Technology

Matthew Dailey (ICT-AIT) Web Eng 1 / 54


Readings

Readings for these lecture notes:

- Greenspun, SQL For Web Nerds.
- Fowler, Patterns of Enterprise Application Architecture, Addison-Wesley, 2003.
- Ruby, Copeland, and Thomas, Agile Web Development with Rails, 6th edition, 2020.

These notes contain material © Greenspun, 2006; Fowler, 2003; Ruby, Copeland, and Thomas, 2020.


Outline

1 Introduction

2 SQL basics

3 Useful PostgreSQL features

4 Database normalization

5 Object-relational mapping

6 NoSQL (Mongo)


Introduction

To this day, the RDBMS is the king of data storage.

NoSQL databases have important use cases (very large datasets, semi-structured or unstructured data, document-oriented processing), but these aren’t relevant for small and medium-sized applications.

We will thus learn (or review for some of you) how to use the RDBMS as an effective means of persistence for our Web applications.

Later, we will take a look at NoSQL databases such as MongoDB.

For all practical purposes, a “relational database is a big spreadsheet that several people can update simultaneously” (Greenspun).


SQL basics: Tables

Each table in the database is a spreadsheet with fixed columns, each having a name and a data type. The rows are unordered. Example:

    create table mailing_list (
      email varchar(100) not null primary key,
      name varchar(100)
    );

The primary key constraint means the values in this column must be unique (and not null), and in PostgreSQL it causes an index to be created on the column.

Indices allow efficient search of one or more columns in a table.


SQL basics: Populating and modifying tables

We use SQL’s insert command to add data to a table:

    insert into mailing_list ( name, email )
      values ('Philip Greenspun', '[email protected]');

We can add and drop columns (in PostgreSQL, adding a not null column without a default fails if the table already contains rows):

    alter table mailing_list add phone_number varchar(20) not null;
    alter table mailing_list drop phone_number;

For queries we use select:

    select * from mailing_list;


SQL basics: Many-to-one relationships

Most folks have more than one phone number. Should we put a list in the phone number column? It might work, but our data would not be in “normal” form (more on normalization later).

For many-to-one relationships we normally use a separate table:

    create table phone_numbers (
      email varchar(100) references mailing_list,
      phone_type char(1) check ( phone_type in ( 'W', 'H', 'M', 'F' )),
      phone_number varchar(20)
    );

The keyword references creates a consistency constraint between the two tables. Try adding phone numbers for email addresses that are not in the mailing_list table; the insert is rejected.

OK, insert some data into the table.


SQL basics: Joins

A join combines information from more than one table:

    select * from mailing_list, phone_numbers;

But we don’t get what we want: we get the cross product of the rows in the two tables. We have to be more selective:

    select * from mailing_list, phone_numbers
      where mailing_list.email = phone_numbers.email;

Other useful commands: delete from mailing_list and update mailing_list.
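The difference between the two queries above can be sketched in plain Ruby, using arrays of hashes as stand-in tables. This is only an illustration of the idea, not how an RDBMS evaluates a join; the sample rows and the helper names cross_product and join_on_email are assumptions made for the sketch.

```ruby
# Stand-in "tables": arrays of hashes. The sample rows are hypothetical.
MAILING_LIST = [
  { email: "alice@example.com", name: "Alice" },
  { email: "bob@example.com",   name: "Bob" }
].freeze

PHONE_NUMBERS = [
  { email: "alice@example.com", phone_type: "W", phone_number: "555-0100" },
  { email: "alice@example.com", phone_type: "M", phone_number: "555-0101" },
  { email: "bob@example.com",   phone_type: "H", phone_number: "555-0102" }
].freeze

# select * from mailing_list, phone_numbers;
# Pairs every row of one table with every row of the other.
def cross_product(left, right)
  left.product(right)
end

# select * from mailing_list, phone_numbers
#   where mailing_list.email = phone_numbers.email;
# Keeps only the pairs whose email columns match.
def join_on_email(left, right)
  cross_product(left, right).select { |l, r| l[:email] == r[:email] }
end

puts "cross product rows: #{cross_product(MAILING_LIST, PHONE_NUMBERS).size}"
join_on_email(MAILING_LIST, PHONE_NUMBERS).each do |person, phone|
  puts "#{person[:name]} #{phone[:phone_type]} #{phone[:phone_number]}"
end
```

With two people and three phone numbers, the cross product has six rows, while the join keeps only the three rows in which the email addresses agree.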


SQL basics: Data types

We saw a few of SQL’s data types already. Here is a more complete but still partial list, for PostgreSQL:

- Fixed-length strings (char(len))
- Variable-length strings (varchar(len))
- Variable-length strings, no limit on length (text)
- Variable-length binary data (bytea)
- Dates and times (date, time, timestamp)
- Numbers (integer, numeric, real, double precision, serial, others)
- Other more complex, less-used types


SQL basics: Constraints

Values can also be constrained:

- not null
- unique
- primary key
- check
- references

That’s all you need for some simple data modeling!


SQL basics: Keys: natural or surrogate?

A key is an attribute or group of attributes that uniquely identifies a row of a table.

Composite keys are made up of more than one attribute.

Natural keys are attributes in the real world: citizen ID number, etc.

Surrogate keys are artificial keys introduced into the data model that have no relationship to the real-world entities being modeled.

Many analysts prefer natural keys because surrogate keys are artificial and unrelated to the business logic.

But natural keys may be coupled to the business logic and might therefore change when requirements change.

Most Web application frameworks are easiest to work with when you allow them to define their own surrogate key for every table.


Useful PostgreSQL features: User-defined functions

PostgreSQL provides the PL/pgSQL language for specification of user-defined functions. As a simple example consider f(x) = 2x:

    create or replace function doubleint( x integer )
    returns integer as $$
    declare y integer;
    begin
      y := 2 * x;
      return y;
    end;
    $$ language plpgsql;

PL/pgSQL is installed by default in PostgreSQL 9.0 and later. On older versions, before creating a first PL/pgSQL function in your database, you must use the shell command createlang plpgsql apache (use your database’s name instead of apache).

Now queries like select doubleint( 10 ); should work.


Useful PostgreSQL features: Triggers

PL/pgSQL functions returning trigger can be set to execute automatically when a table is changed.

Example: automatically create a change log entry every time a student changes projects:

    create table project_changes (
      studentid integer references students,
      oldproj integer references projects,
      newproj integer references projects,
      update_timestamp timestamp
    );

    create or replace function proj_log() returns trigger as $PROC$
    begin
      if ( NEW.studentid = OLD.studentid and
           NEW.projectid <> OLD.projectid ) then
        insert into project_changes (
          studentid, oldproj, newproj, update_timestamp
        ) values (
          NEW.studentid, OLD.projectid, NEW.projectid,
          current_timestamp
        );
      end if;
      return NEW;
    end;
    $PROC$ language plpgsql;

    -- OLD has no value for inserts, so the trigger fires on update only.
    drop trigger if exists proj_log_post on students;
    create trigger proj_log_post after update on students
      for each row execute procedure proj_log();


Database normalization: Introduction

A normalized database only stores atomic data in a non-redundant form.

The concept of normal form for relational databases was proposed by E. F. Codd in 1970.

Normalizing a database means ensuring that all data in every table is atomic and depends only on the primary key for that table.

Normalization means all dependencies are explicit in the data model. This makes it easier to maintain the database in a consistent state.

There are many levels of normalization. The most important are first, second, and third normal form.


Database normalization: First normal form

Criteria for first normal form:

- All columns in every table are atomic (nondecomposable).
- Every row of every table has a unique primary key.

Example: conference program committee website:

- Papers are submitted by potential authors.
- Papers are reviewed by committee members (who can also be authors).
- The program chair makes acceptance and rejection decisions based on the reviews.
- Papers have an author list, a title, a list of keywords, a link to the PDF submission, a set of reviews, and a decision.
- Reviews have a single author, a paper being reviewed, comments, and ratings from 1–5 for technical quality, originality, and presentation.


Database normalization: First normal form

1NF procedure:

- Consider each relation and break non-atomic attributes into separate tables.
- Add the relationships between the tables.
- Determine the primary keys.


Database normalization: First normal form

For atomicity, we need separate tables for (at least):

- papers
- people
- keywords
- reviews

Relationships:

- Papers to authors: many to many. Requires a new table, papers_authors, relating the two.
- Papers to keywords: many to many. Requires a new table, papers_keywords, relating the two.
- Papers to reviews: one to many. Requires a foreign key reference in reviews.
- People to reviews: one to many. Requires a foreign key reference in reviews.


Database normalization: First normal form

Keys:

- papers: no natural key. Introduce surrogate paper_id.
- people: no natural key. Introduce surrogate person_id.
- keywords: the keyword itself must be unique, so it is a natural key.
- reviews: the paper, reviewer pair is unique. It is a natural (composite) key.

With a unique key for all tables, and only atomic data, our database is in first normal form.
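As a rough illustration of the 1NF procedure, the following Ruby sketch splits records with a non-atomic keywords attribute into papers, keywords, and papers_keywords stand-in tables. The sample papers and the helper name to_first_normal_form are hypothetical; the table and key names follow the slides.

```ruby
# Non-1NF records: the keywords attribute holds a list. Sample data is hypothetical.
RAW_PAPERS = [
  { title: "Fast Joins",   keywords: ["databases", "indexing"] },
  { title: "Lazy Loading", keywords: ["databases", "orm"] }
].freeze

# Split the raw records into three stand-in tables with only atomic columns.
def to_first_normal_form(raw_papers)
  papers = []          # rows: { paper_id:, title: }
  keywords = []        # rows: { keyword: }, the keyword is its own natural key
  papers_keywords = [] # rows: { paper_id:, keyword: }, the join table

  raw_papers.each_with_index do |raw, index|
    paper_id = index + 1 # surrogate key
    papers << { paper_id: paper_id, title: raw[:title] }
    raw[:keywords].each do |word|
      keywords << { keyword: word } unless keywords.any? { |k| k[:keyword] == word }
      papers_keywords << { paper_id: paper_id, keyword: word }
    end
  end

  { papers: papers, keywords: keywords, papers_keywords: papers_keywords }
end

to_first_normal_form(RAW_PAPERS).each do |table_name, rows|
  puts "#{table_name}: #{rows.inspect}"
end
```

Each keyword is stored once in keywords, and the many-to-many relationship lives entirely in papers_keywords.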


Database normalization: Second normal form

Criteria for second normal form:

- The database is in 1NF.
- There should be no columns dependent on only part of a composite key.

Example: suppose we had a column reviewer_home_page in the reviews table. This would be atomic but redundant, because it depends only on the reviewer part of the (paper, reviewer) key, and should be moved to the people table.


Database normalization: Third normal form

Criteria for third normal form:

- The database is in 2NF.
- There should be no columns dependent on non-key columns.

Example: suppose for each review, we have a field originality (an integer between 1 and 5) and originality_desc (“Groundbreaking”, “Novel”, “Somewhat new”, “Minor variation of existing work”, and “Complete ripoff”) describing what the rating means.

We can see that originality_desc depends directly on originality, which is not a key for reviews.

To achieve 3NF we should move originality_desc into a new table and make originality be a foreign key reference.
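The 3NF step for originality_desc might look like the following Ruby sketch, with hashes standing in for tables. The review rows, the helper names, and the assumption that 5 is the best rating are all hypothetical; the slides do not say which end of the scale is which.

```ruby
# Before 3NF: originality_desc is repeated in every review row and depends on
# originality, a non-key column. Rows are hypothetical; 5 is assumed to be best.
REVIEWS_BEFORE = [
  { paper_id: 1, reviewer_id: 7, originality: 5, originality_desc: "Groundbreaking" },
  { paper_id: 1, reviewer_id: 8, originality: 4, originality_desc: "Novel" },
  { paper_id: 2, reviewer_id: 7, originality: 5, originality_desc: "Groundbreaking" }
].freeze

# Move the descriptions into their own table keyed by the originality rating.
def to_third_normal_form(reviews)
  originality_levels = {}
  normalized = reviews.map do |row|
    originality_levels[row[:originality]] = row[:originality_desc]
    row.reject { |column, _value| column == :originality_desc }
  end
  { reviews: normalized, originality_levels: originality_levels }
end

# Recover the description with a lookup, the equivalent of a join.
def describe_originality(review, originality_levels)
  originality_levels.fetch(review[:originality])
end

result = to_third_normal_form(REVIEWS_BEFORE)
result[:reviews].each do |review|
  puts "#{review.inspect} -> #{describe_originality(review, result[:originality_levels])}"
end
```

After the split, changing the wording of a rating description means updating one row in the lookup table rather than every review that used that rating.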


Database normalization: Denormalization

Normalization simplifies data updates and changes to the data model.

Normalization leads to more complex queries with many joins. This has implications for performance.

Databases that are primarily transactional should emphasize normalization.

Databases that are primarily read-only might use denormalization to improve performance and simplify the queries sent to the RDBMS.

The preferred denormalization technique is to use indexed views (materialized views in PostgreSQL).

If denormalization is done at the data model level, constraints should be used to ensure consistency of the redundant data.


Object-relational mapping: Introduction

Most SQL APIs return an array of hashes or a similar structure in response to queries.

Ruby example: the Sequel database API provides a row abstraction for database rows.

Next page: Sequel example.

On Ubuntu, you need the pg and sequel gems.

You’ll also need a username with password for the database, as the connection is through a network socket. In psql run

    alter user username with password 'password';

(note that you don’t put quotes around the username).


Object-relational mapping: Introduction

Ruby Sequel example (put in a text file such as db_access.rb and run from the command line using ruby db_access.rb):

    require "sequel"

    dbh = Sequel.connect(
      "postgres://mdailey:password@localhost/wae_students_development")
    dbh[:students].each do |row|
      row.keys.each do |key|
        printf "%s: %s ", key, row[key]
      end
      print "\n"
    end


Object-relational mapping: Introduction

In object-oriented analysis and design (OOAD) we normally construct a domain model containing the entities in the business domain.

If we are using OOAD and an object-oriented programming language like Ruby or Java, we want to work with objects, not database rows.

But what if we are stuck with an RDBMS? The simplest thing to do is to map database rows directly to domain model objects.

The Active Record pattern for enterprise applications is one of the simplest approaches to so-called object-relational mapping.


Object-relational mapping: Active Record

Active Record

An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data.

Fowler (2003), Fig. 3.3


Object-relational mapping: Active Record

Some popular Active Record implementations:

- Ruby ActiveRecord (decoupled from Rails way back in version 3.0)
- CakePHP
- .NET Castle

There are many others.


Object-relational mapping: Active Record

What Active Record implementations can do for us:

- Automatically construct an instance of Active Record from a SQL result row.
- Automatically construct a SQL insert from a given instance of Active Record.
- Provide static finder methods via reflection that return Active Record instances.
- Map getters and setters to SQL selects and updates, transforming SQL data types to reasonable native types.
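To make these four points concrete, here is a minimal sketch of the pattern in plain Ruby. It is not Rails’ ActiveRecord: an in-memory hash stands in for the SQL table, and ToyRecord, TABLES, column_names, and finder_column are hypothetical names chosen for the illustration.

```ruby
# A toy Active Record base class backed by an in-memory hash instead of SQL.
# ToyRecord, TABLES, and column_names are hypothetical names for this sketch.
class ToyRecord
  TABLES = Hash.new { |hash, table_name| hash[table_name] = {} } # name => { id => row }

  class << self
    # Declare the columns and generate a getter and a setter for each one.
    def column_names(*names)
      @columns = names
      names.each do |column|
        define_method(column) { @row[column] }
        define_method("#{column}=") { |value| @row[column] = value }
      end
    end

    def columns
      @columns || []
    end

    def table
      TABLES[name]
    end

    # Build an instance from a stored row (stands in for a SQL result row).
    def find(id)
      row = table[id]
      row && new(row.merge(id: id))
    end

    # Dynamic finders such as find_by_name, resolved when they are called.
    def method_missing(method_name, *args)
      column = finder_column(method_name)
      return super unless column
      id, _row = table.find { |_id, row| row[column] == args.first }
      id && find(id)
    end

    def respond_to_missing?(method_name, include_private = false)
      !finder_column(method_name).nil? || super
    end

    private

    # Returns the column symbol for names like find_by_name, or nil.
    def finder_column(method_name)
      column = method_name.to_s[/\Afind_by_(\w+)\z/, 1]
      column.to_sym if column && columns.include?(column.to_sym)
    end
  end

  attr_reader :id

  def initialize(attributes = {})
    @id = attributes[:id]
    @row = attributes.reject { |key, _value| key == :id }
  end

  # Insert when the object is new, update otherwise (stands in for SQL
  # insert and update statements).
  def save
    @id ||= self.class.table.size + 1
    self.class.table[@id] = @row.dup
    true
  end
end

class Student < ToyRecord
  column_names :name, :studentid
end

student = Student.new(name: "Matthew Dailey", studentid: 123456)
student.save
puts Student.find_by_name("Matthew Dailey").studentid
```

A real implementation would discover the columns from the database schema rather than from a column_names declaration, and would translate find and save into SQL select, insert, and update statements.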


Object-relational mapping: Active Record

Some advantages of Active Record:

- It works very well when the domain model and business logic are simple.

Some disadvantages:

- It cannot handle complex mappings from objects to relations.
- It couples the domain logic to the database schema.


Object-relational mapping: Active Record in Rails

Some key features of Rails ActiveRecord:

- Object schema is constructed on the fly from the database schema.
- Transparent lazy fetching.
- Transparent optimistic locking via row versioning.
- Simple support for associations between classes.
- Transaction support.
- Validations.
- Value objects.
- Single table inheritance.


Object-relational mapping: Active Record in Rails

Conventions:

- Each database table has a surrogate primary key, id.
- The model class name is singular and UpperCamelCase (e.g. Student); the table name is the plural form of the object name (e.g. students).
- Foreign key reference names are written classname_id.
- Join tables for many-to-many associations are named for the two tables they join, e.g., projects_students.

Default behavior can be changed as necessary (e.g. invoke the class method set_table_name to use a non-standard table name; Rails 3.2 and later assign self.table_name instead).


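Object-relational mapping: Active Record in Rails: one-to-many associations

Example: students and their projects. Domain model:

[Figure: UML class diagram. Student (+studentid: integer, +name: string) and Project (+name: string, +url: string) are linked by an association with roles +students (multiplicity *) and +project (multiplicity 1).]

Corresponding database schema:

[Figure: database schema diagram.]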


Object-relational mapping: Active Record in Rails: one-to-many associations

After creating the database tables (through a direct admin tool or via Rails migrations), we create the model classes:

app/models/project.rb:

    class Project < ActiveRecord::Base
      has_many :students
    end

app/models/student.rb:

    class Student < ActiveRecord::Base
      belongs_to :project
    end

The method calls belongs_to and has_many set up the one-to-many relationship between projects and students.

Other methods for associations include has_one and has_and_belongs_to_many.


Object-relational mapping: Active Record in Rails: one-to-many associations

To thoroughly test your ActiveRecord classes, it’s easiest to work from the console. Try the following for an example. (The session uses Rails 2 syntax; in current Rails the console is started with rails console, and Project.find(:all) is written Project.all.)

    % script/console
    >> s=Student.create
    >> s.project = Project.create :name => "Soi Cats and Dogs",
                                  :url => "web13.cs.ait.ac.th"
    >> s.name = "Matthew Dailey"
    >> s.studentid = 123456
    >> s.save
    >> s = Student.find(1)
    >> Project.find(:all)
    >> s = Student.find_by_name("Matthew Dailey")
    >> s = Student.find_by_name_and_studentid( "Matthew Dailey", 123456 )
    >> Student.find_by_sql( "select * from students where students.name like 'Matt%'" )

You might want to tail -f log/development.log.

Note that new creates an instance in memory only, but create creates an instance and commits it to the database.


Object-relational mapping: Active Record in Rails: many-to-many associations

Now for a many-to-many relationship.

Suppose I need to record information about peer evaluations of your projects.

We need to set up a many-to-many relationship between students and projects.

Since the association has an attribute (the score) we have to create an ActiveRecord model for the join table [1]:

    % script/generate model ProjectEvaluation score:integer project:references \
        student:references

[1] There is an ActiveRecord method has_and_belongs_to_many that may be more convenient if you don’t need any attributes on the association.


Object-relational mapping: Active Record in Rails: many-to-many associations

In app/models/project_evaluation.rb, add:

    class ProjectEvaluation < ActiveRecord::Base
      belongs_to :project
      belongs_to :student
    end

To the Student and Project models, add the method call

    has_many :project_evaluations

That’s it! From the console, try

    s = Student.find(1)
    p = Project.find(2)
    pe = ProjectEvaluation.create :student => s, :project => p, :score => 3
    s.project_evaluations

Lastly, try adding

    has_many :evaluators, :through => :project_evaluations, :source => :student

to the Project model (what is the purpose of this?).


Object-relational mapping: Active Record in Rails: transactions

Oftentimes it will be important to group multiple database operations into a single atomic transaction.

For example:

    % script/generate model Student name:string account_balance:float
    % rake db:migrate
    % script/console
    >> bill = Student.create :name => 'Bill G', :account_balance => 10000000.0
    >> matt = Student.create :name => 'Matt D', :account_balance => 100.0
    >> bill.account_balance -= 10000
    >> matt.account_balance += 10000
    >> bill.save
    >> matt.save

If an exception occurs while saving the second updated student, Bill G. loses 10,000 baht.


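Object-relational mapping: Active Record in Rails: transactions

It would be safer to encapsulate both operations in a transaction:

    >> Student.transaction do
    ?>   bill.save
    >>   matt.save
    >> end

If any exception occurs during the transaction, it is rolled back. Note that save returns false rather than raising when a validation fails; use save! if a failed validation should also roll back the transaction.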
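The all-or-nothing behavior can be sketched without a database. The toy class below is only an analogy for what Student.transaction does: it snapshots in-memory state and restores it when the block raises, whereas a real RDBMS uses begin, commit, and rollback. ToyStore, its methods, and the misspelled account name are hypothetical.

```ruby
# A toy in-memory "database" with all-or-nothing transactions.
# ToyStore and its methods are hypothetical, not part of Rails.
class ToyStore
  attr_reader :balances

  def initialize(balances)
    @balances = balances.dup
  end

  # Run the block; if it raises, restore the state from before the block
  # and re-raise the exception.
  def transaction
    snapshot = @balances.dup
    yield self
  rescue StandardError
    @balances = snapshot
    raise
  end

  def update(name, amount)
    raise ArgumentError, "unknown account #{name}" unless @balances.key?(name)
    @balances[name] += amount
  end
end

store = ToyStore.new("Bill G" => 10_000_000.0, "Matt D" => 100.0)

begin
  store.transaction do |db|
    db.update("Bill G", -10_000)
    db.update("Matt X", 10_000) # misspelled name: raises, so both updates are undone
  end
rescue ArgumentError => e
  puts "rolled back: #{e.message}"
end

puts store.balances.inspect
```

Because the second update fails, the first one is undone as well, and Bill G. keeps his 10,000 baht.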


Object-relational mapping: Active Record in Rails: optimistic locking

Transactions, except with strict serializable isolation (the highest level of isolation provided in the SQL standard, which locks data read by any transaction), don’t help with the problem of lost updates.

Consider the following code executing concurrently in two threads:

Thread 1:

    s = Student.find_by_name "Bill G"
    s.account_balance += 1000000
    s.save

Thread 2:

    s = Student.find_by_name "Bill G"
    s.account_balance += 1000000
    s.save

What should happen, and what actually happens, with snapshot isolation and serializable isolation?

Note that PostgreSQL has supported true serializable isolation only since version 9.1; in earlier versions, the serializable level actually provided snapshot isolation.


Object-relational mapping: Active Record in Rails: optimistic locking

Optimistic locking means we allow concurrent users to perform any action they like but track updates to the database.

When one user attempts to update an old version of a record, an exception (in Rails, ActiveRecord::StaleObjectError) and transaction rollback should occur.

In Rails, optimistic locking can be enabled on any ActiveRecord class by adding a version column named lock_version to the database table:

    alter table students add column lock_version int default 0;

The versions are transparently updated and checked by the ActiveRecord base class.

Try the concurrent access scenario again with this change.
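The version check itself is simple enough to sketch in plain Ruby. The following is a sequential simulation of the two-thread scenario, not Rails’ implementation: VersionedStore and StaleRecordError are hypothetical names, and an in-memory hash stands in for the students table.

```ruby
# Toy optimistic locking: every row carries a lock_version, and a save only
# succeeds if the version read earlier is still the current one.
# VersionedStore and StaleRecordError are hypothetical names for this sketch.
class StaleRecordError < StandardError; end

class VersionedStore
  def initialize
    @rows = {}
  end

  def insert(key, balance)
    @rows[key] = { balance: balance, lock_version: 0 }
  end

  # Return a private copy, as each thread would get from its own query.
  def find(key)
    @rows.fetch(key).dup
  end

  # Equivalent in spirit to an update whose where clause also checks the
  # lock_version that was read, and which increments it on success.
  def save(key, copy)
    current = @rows.fetch(key)
    if current[:lock_version] != copy[:lock_version]
      raise StaleRecordError, "#{key} was changed by someone else"
    end
    @rows[key] = { balance: copy[:balance], lock_version: copy[:lock_version] + 1 }
  end
end

store = VersionedStore.new
store.insert("Bill G", 0)

first  = store.find("Bill G") # "thread 1" reads version 0
second = store.find("Bill G") # "thread 2" reads version 0 as well

first[:balance] += 1_000_000
store.save("Bill G", first)   # succeeds, version becomes 1

second[:balance] += 1_000_000
begin
  store.save("Bill G", second) # stale: still carries version 0
rescue StaleRecordError => e
  puts "lost update prevented: #{e.message}"
end
```

The second writer has to re-read the row and retry, so neither deposit is silently overwritten.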


Object-relational mapping: Active Record in Rails

We’ve covered many of the features of Rails’ implementation of Active Record. There are a few others of note:

- Value objects
- Single-table inheritance
- Polymorphic associations


Object-relational mapping: Data Mapper

Active Record maps directly between database tables and domain objects.

Data Mapper is an alternative pattern that decouples the domain model from the database schema.

Data Mapper

A layer of mappers that moves data between objects and a database while keeping them independent of each other and the mapper itself.

Fowler (2003), Fig. 3.4


Object-relational mapping: Data Mapper

Data Mapper is widely implemented:

- Hibernate for Java
- MassiveJS and many others for JavaScript
- SQLAlchemy for Python
- DataMapper for Ruby

Even if there is no existing implementation for your preferred environment, it is easy to roll your own, starting small and gradually improving the implementation over time.
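A very small roll-your-own starting point might look like the sketch below. It only illustrates the separation the pattern calls for: the domain object knows nothing about storage, and the mapper is the only code that knows the storage layout. The Project struct, ProjectMapper, and the column names project_title and homepage are hypothetical, and an in-memory array stands in for a database table.

```ruby
# Domain object: knows nothing about storage. All names are hypothetical.
Project = Struct.new(:name, :url)

# Mapper: the only place that knows how a Project corresponds to a stored row.
# The "table" is an in-memory array of hashes standing in for a real database,
# and its column names deliberately differ from the object's attributes.
class ProjectMapper
  def initialize(rows = [])
    @rows = rows
  end

  # Store the object and return its position as a surrogate id.
  def insert(project)
    @rows << { "project_title" => project.name, "homepage" => project.url }
    @rows.size
  end

  # Rebuild a domain object from the stored row, or return nil if absent.
  def find(id)
    return nil unless id.between?(1, @rows.size)
    row = @rows[id - 1]
    Project.new(row["project_title"], row["homepage"])
  end
end

mapper = ProjectMapper.new
id = mapper.insert(Project.new("Soi Cats and Dogs", "http://scad.org"))
puts mapper.find(id).name
```

Renaming a column would change only ProjectMapper, whereas with Active Record the domain class itself would be affected.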


NoSQL (Mongo): Introduction

Applications dealing with “big” data:

- High volume: we need to store millions or billions of records.
- High velocity: the data are arriving and need to be processed at a very high rate such as thousands of records per minute.
- High variety: we have potentially many sources providing data that are structured, unstructured, and semi-structured.

Under these conditions, designing schemas, migrating every time we add a new data source or data format, ensuring consistency, and guaranteeing isolated transactions may all be bottlenecks.

A possible solution: throw away your schemas, your consistency rules, and/or your isolated transactions!

[Think about where SQL and NoSQL would be best used: a banking application and a Facebook post analysis engine.]


NoSQL (Mongo): Types of NoSQL databases

There are several types of NoSQL databases:

- Key-value: dictionaries wherein values are indexed by a single key
- Document: key-value databases in which the value is a document represented in JSON, XML, etc.
- Wide column: row-oriented tables with dynamic columns
- Graph: data are nodes with edges

MongoDB is probably the most popular NoSQL database. It is document oriented.


NoSQL (Mongo): MongoDB features

Besides simple key-value storage and retrieval, MongoDB adds:

- Sharding: distributing the data across multiple machines for high throughput
- Replication, duplication, load balancing for high availability at scale
- Document validations: imposing consistency rules where necessary
- Fine-grained locking: reader and writer locks at the global, database, or collection level to deal with concurrency issues


NoSQL (Mongo): Quick MongoDB tutorial

To get a feel for MongoDB, first install it:

    $ sudo apt install mongodb

Start a shell:

    $ mongo
    MongoDB shell version v3.6.8
    connecting to: mongodb://127.0.0.1:27017
    Implicit session: session { "id" : UUID("fca1c52f-ca00-4819-9821-7f9576077b33") }
    MongoDB server version: 3.6.8
    Server has startup warnings:
    ...
    >

Figure out what db we’re connected to:

    > db
    test


NoSQL (Mongo): Quick MongoDB tutorial

Switch to the studentdb database:

    > use studentdb

Insert some data into a new collection:

    > db.projects.insertMany([
    ... { name: "Soi Cats and Dogs", url: "http://scad.org" },
    ... { name: "ICT Infosystem", url: "http://ict-info.ait.ac.th" }
    ... ])
    {
      "acknowledged" : true,
      "insertedIds" : [
        ObjectId("612eba9d84616b17e76630a4"),
        ObjectId("612eba9d84616b17e76630a5")
      ]
    }
    >

Search the collection for a document:

    > db.projects.find({name: "Soi Cats and Dogs"})
    { "_id" : ObjectId("612eba9d84616b17e76630a4"), "name" : "Soi Cats and Dogs", "url" : "http://scad.org" }
    >


NoSQL (Mongo): Quick MongoDB tutorial

Generally we should avoid references where possible, but when we need one document to refer to another, we can use the _id field:

    > var project = db.projects.find({name: "Soi Cats and Dogs"}).next();
    > project
    {
      "_id" : ObjectId("612eba9d84616b17e76630a4"),
      "name" : "Soi Cats and Dogs",
      "url" : "http://scad.org"
    }
    > db.students.insertMany([
    ... { name: "Matt Dailey", studentid: "123456", project_id: project._id },
    ... { name: "Bishal Khanal", studentid: "123457", project_id: project._id }
    ... ]);
    {
      "acknowledged" : true,
      "insertedIds" : [
        ObjectId("612ecf3a84616b17e76630a6"),
        ObjectId("612ecf3a84616b17e76630a7")
      ]
    }


NoSQL (Mongo): Quick MongoDB tutorial

    > db.students.find()
    { "_id" : ObjectId("612ecf3a84616b17e76630a6"), "name" : "Matt Dailey",
      "studentid" : "123456", "project_id" : ObjectId("612eba9d84616b17e76630a4") }
    { "_id" : ObjectId("612ecf3a84616b17e76630a7"), "name" : "Bishal Khanal",
      "studentid" : "123457", "project_id" : ObjectId("612eba9d84616b17e76630a4") }

Things to note here:

- The shell interprets our input as JavaScript.
- The find() method returns a cursor, i.e., an object that has to be iterated to extract its data.
- The cursor’s next() method returns the next document in the query’s result set. Use code such as

      while (cursor.hasNext()) {
        var record = cursor.next();
        printjson(record);
      }

  to iterate over the query’s results.