CHAPTER 2

THEORETICAL FOUNDATION

2.1 General Theories

2.1.1 XYZ Clinic

A clinic is defined as a healthcare facility intended for patients’ routine medical
needs (such as immunizations or routine consultations) or for outpatient
treatment (diagnosis or treatment of illness) [8]. Clinics typically offer fewer
services and have a smaller scope of work than a full-fledged hospital (for
example, clinics do not provide inpatient care). Clinics are therefore meant to
handle simpler cases, whereas patients whose conditions are too complex for the
clinic are referred to a hospital for more complete treatment.

XYZ Clinic is a clinic in Indonesia and the client of this project, to which the
final product, a clinic management system, will be delivered. As a measure of
its size, the clinic has four rooms dedicated to doctors’ consultations, an
average of 1 to 2 active doctors per day, and an average of 25 visiting patients
per day.

2.1.2 Clinic Management System

A clinic management system is a software system intended to help handle the
major operational areas of a typical clinic [4]. The need for such a system
arises from the problems that clinics face. In record management alone, heavy
reliance on paper makes it difficult to search for a specific record (as
computational search cannot be done on paper data). Writing records on paper
and planning how to file them properly is also inefficient, not to mention the
risk of data loss caused by improper filing or by an accident that damages the
entire record repository. Furthermore, reports based on manual records are
likely to be limited, as the computational work that could provide additional
results is unavailable with a manual approach [9].

The benefits of implementing a clinic management system are represented in the
figure below [10].

Figure 2.1 Benefits of Clinic Management System

By using a clinic management system, clinics can manage their records without
paper, so they no longer need to devise filing schemes and they reduce the
risks of paper-based management, such as physical damage to paper records. A
clinic management system also allows computational work to deliver quicker
access to specific data, which cannot be achieved with paper-based data. The
integrated digital database used by the management system allows for better
coordination between departments: all user roles retrieve data from the same
database, to which every update is applied, so all users see the same version
of the data (improving data integrity). The computational work done by a clinic
management system improves overall working efficiency in the clinic, as
processes that can be automated are handled by the system instead of being done
manually by employees. Patients also benefit from the system, for example
through the ability to make appointments (if the system supports it) and
through automation that reduces their waiting time (such as in the billing of
prescriptions), thus increasing their satisfaction. Exchanging data digitally
instead of on paper is also likely to increase the accuracy of the clinic’s
data, for example through better tracking of patients’ health records and by
avoiding illegible handwriting.

The major operational sectors of a clinic that a clinic management system
handles include (but are not limited to) the management of doctors’
appointments, patients’ medical records, prescription inventory, and accounting
matters such as revenue records.

2.1.3 Web Application

A web application is defined as software running on the Internet that is based
on the implementation of Hypertext Transfer Protocol (HTTP), Uniform
Resource Identifiers (URI), and a web architectural style such as REST.
Because it is typically deployed to the Internet, a web application must also
support concurrent use by an undetermined number of users [11].

Web applications are typically based on a client-server architecture: the
client side handles the display of the application (also called the Graphical
User Interface), while the server side handles the business logic or data
processing of the application [12]. A web application also typically
incorporates a database where the data is stored and to which any changes to
the data are applied. A visualization of the architecture and its general flow
of work is shown in the figure below [13].

Figure 2.2 Web Application's Architecture and Process Flow

This architectural model allows for the separation of work and the independent
development of both sides. In addition, this separation enables the server
side, which holds the business logic, to provide its services not only to a
single application but also to other applications, using methods such as a REST
API. This increases the reusability of the back end’s services, a concern of
growing importance in the development of web applications. Examples of
languages and technologies used on the server side are PHP, C++, Java, and JSP,
whereas those used on the client side are HTML, CSS, and JavaScript, together
with JavaScript-based techniques and libraries such as AJAX and jQuery.

Although the distinction is not fully definitive, web applications differ from
websites in that a web application is considered a platform for data
processing: it takes the user’s input and produces the proper output (putting
major emphasis on the user’s interactions), whereas a website is considered
more a platform for showcasing information (such as a company profile website)
[14].

2.2 Development Technologies

2.2.1 MERN Stack Architecture

The term “MERN” is short for MongoDB, Express.js, React.js, and Node.js,
while “Stack” refers to the multiple layers of technologies implemented. In its
entirety, the MERN Stack is an architecture that uses all four technologies for
web application development, each with its own specific task or purpose [15].
The development and use of such a stack stems from users’ increasing demand for
web applications with more features, offering an experience similar to that of
desktop and mobile applications, while developers need tools that help them
produce applications faster and that make modification and scaling easier [16].
JavaScript, once known mainly as a tool for producing animations on Web pages,
has become a development tool that can handle both front-end and back-end
processes, as it does in stacks such as MERN and MEAN (which replaces React.js
with Angular.js).

The MERN Stack divides the purposes of its technologies as follows:

MongoDB: A database that implements the document-based approach from NoSQL,
accessed by the back-end application through JSON (JavaScript Object
Notation).

Express.js: The Node.js web application framework that handles the routing of
requests and can expose a RESTful API that the front end uses to get the
responses or data it needs.

React.js: The front-end JavaScript library that renders the user interface on
the client side, although it can also render on the server side.

Node.js: An asynchronous and event-driven JavaScript runtime environment that
runs the back-end processes of the web application.

Thus, the overall working flow of the MERN Stack can be visualized as shown

in the figure below.

Figure 2.3 MERN Stack Architecture

Adopting a stack such as this helps to standardize the set of technologies used
and, because the combination is common, makes it easier to find support. In
addition, the use of RESTful APIs in the stack not only separates the work of
the front end and the back end but also extends the reach of the web
application by allowing it to consume external, public APIs [16], for example
to let users register or log in to the web application through a social media
account, or to access external processing services such as chatbot services.

The benefits of using the MERN stack are as follows [15]:

The use of JavaScript across the stack means that a single language is used
throughout the web application, even for database operations in MongoDB. This
brings benefits such as a single learning path for developers (JavaScript
itself) and similar concepts in both front-end and back-end programming.
Although these benefits also apply to other stacks built on similar
environments, such as the MEAN stack, React.js additionally frees developers
from having to define templates in order to build pages. Instead, the HTML is
generated using JavaScript. This allows for more flexibility and could lead to
faster development than having to define the needed templates beforehand. In
addition, React is considered to perform better than Angular, as it was
developed by Facebook to solve a performance problem. Angular uses two-way data
binding, which can be costly when large amounts of data must be updated
dynamically. React uses a one-directional data flow that makes checking for
state changes more efficient, which suits larger data better.

The use of JSON as the format for transferring data across the stack makes data
transformations easier, as a model does not have to be fitted into a specific
table structure. Using JSON also enables developers to think of the data more
naturally, as objects and their properties.

The asynchronous, non-blocking I/O approach of Node.js is likely to result in
faster web server processing, as the server does not have to sit idle while an
asynchronous operation is in progress. This becomes increasingly important as
the web application receives more traffic. The npm ecosystem available in the
Node.js environment may also help significantly by providing modules for
specific features of the web application.

2.2.2 Node.js

Node.js is a JavaScript runtime environment based on Chrome’s V8 JavaScript
engine that does processing on the server side using JavaScript, implementing
an event-driven, asynchronous (non-blocking I/O) approach [17]. Traditionally,
JavaScript needed a host platform in order to combine multiple JavaScript
files, such as an HTML file, where JavaScript was commonly used to make
animations for Web pages. Node.js instead uses its own module system, so that a
JavaScript file can stand on its own and import other JavaScript modules, with
CommonJS as the basis of the module system [15]. This also promotes separating
the tasks done by each module (which can also be called a library) and
importing them later by using the require() function. The module approach also
enables Node applications to use third-party libraries that are available for
Node, such as the ones provided by npm.

npm is the package manager for Node, used for installing external libraries or
modules. npm also handles the installation of any dependencies that those
libraries need. Furthermore, npm hosts more than 250,000 packages, the highest
number of any package registry, surpassing Maven, which previously held the
highest number. This makes npm a valuable resource for developers building an
application’s back end, as there is a high chance that a needed feature or
problem has already been handled by an existing module.

The typical approach to concurrent processing relies on the operating system’s
threads, switching to another available thread when a thread is blocked by an
event such as reading a file. The issue with this approach is that
multithreaded applications are difficult to develop, not to mention the
possibility of deadlocks if the application is not designed carefully
beforehand [18]. Node.js does not use this form of concurrency. Instead, it
implements asynchronous processing that does not require multiple threads. It
relies on callbacks to signal that a specific operation has finished, and it
can carry out other work that does not depend on that operation while the
operation is still in progress. Node achieves this by implementing an event
loop that takes requests from events and passes them to the appropriate
handlers, hence its event-driven character. Some operations, such as executing
a database query, may take some time. Node can multitask in that case by
executing the query asynchronously and then returning to the event loop to
handle other requests while waiting for the query result. The overall working
flow of Node.js in the back end is visualized in the figure below [19].

Figure 2.4 Asynchronous Working Flow of Node.js
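
The callback-based flow described in this section can be sketched as follows. This is only an illustration under stated assumptions: runQuery is a hypothetical stand-in for a database call, and setTimeout merely simulates the time spent waiting for the result.

```javascript
// Record the order in which things happen.
const events = [];

// Stand-in for a slow database query. setTimeout is used here only to
// simulate waiting on I/O; a real driver would call back when data arrives.
function runQuery(query, callback) {
  events.push(`query started: ${query}`);
  setTimeout(() => callback(null, [{ id: 1, name: 'Example Patient' }]), 50);
}

// The callback runs later, once the event loop picks up the finished task.
runQuery('find patient 1', (err, rows) => {
  if (err) throw err;
  events.push(`query finished with ${rows.length} row(s)`);
  console.log(events);
});

// This line runs immediately, while the query is still "in progress".
events.push('handled another request');
```

When the final array is printed, "handled another request" appears before the "query finished" entry, which shows that the program did not block while waiting for the query.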

2.2.3 Express.js

Express.js is a web framework built on top of Node.js that simplifies back-end
development through the common patterns of developing with Express and through
features such as route definitions. Express lets developers specify what to do
when an HTTP request with a given method (such as GET or POST) and a matching
predefined route is received [15]. Responses can be sent back as the result of
receiving and processing the request, for example to send processed data or
custom headers. Express middleware can also be used to add processing steps to
a request’s path, so that several routes share the same processing (such as
logging in and defining sessions). In addition, third-party middleware can be
installed from npm, so that developers do not need to implement common tasks
from scratch.
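
To make the ideas of route matching and middleware concrete, the sketch below implements a tiny dispatcher in plain JavaScript. It is not Express's API or implementation; createApp, route, handle, and the /patients paths are hypothetical names chosen only to illustrate how a chain of middleware runs before the handler that matches the request's method and route.

```javascript
// A tiny dispatcher that imitates the idea of routes and middleware.
// This is NOT Express's API or implementation, only an illustration.
function createApp() {
  const middleware = [];
  const routes = [];

  return {
    use(fn) { middleware.push(fn); },
    route(method, path, handler) { routes.push({ method, path, handler }); },

    // Run every middleware in order, then the first matching route handler.
    handle(req) {
      const res = { status: 200, headers: {}, body: null };
      const match = routes.find(r => r.method === req.method && r.path === req.path);
      const chain = [...middleware, match ? match.handler : notFound];
      let index = 0;
      const next = () => {
        const fn = chain[index++];
        if (fn) fn(req, res, next);
      };
      next();
      return res;
    },
  };
}

// Fallback used when no route matches the request.
function notFound(req, res) {
  res.status = 404;
  res.body = { error: 'Not found' };
}

const app = createApp();
const log = [];

// Middleware shared by every request, here a simple request log.
app.use((req, res, next) => { log.push(`${req.method} ${req.path}`); next(); });

// Handlers bound to an HTTP method and a predefined route.
app.route('GET', '/patients', (req, res) => { res.body = [{ id: 1 }]; });
app.route('POST', '/patients', (req, res) => { res.status = 201; res.body = req.body; });

console.log(app.handle({ method: 'GET', path: '/patients' }));
console.log(app.handle({ method: 'DELETE', path: '/patients' }));
```

A real application would rely on the framework's own routing and middleware rather than a hand-written dispatcher like this one.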

Although Express is a framework, it describes itself as unopinionated. This
means that it places fewer restrictions on the way code must be organized, so
developers retain flexibility in implementing and structuring code with Express
(they can even put all the code into a single file, although splitting it
across multiple files is recommended) [20]. This differs from opinionated
frameworks, where the code structure is more rigid and well documented, making
it more standardized for multiple programmers working together. The
disadvantage of opinionated frameworks is that, because of such restrictions,
developers may find the framework limiting for certain specific functions of an
application. This is why opinionated frameworks are typically aimed at specific
purposes or domains.

2.2.4 MongoDB

MongoDB is a database created and developed by MongoDB, Inc. that is
document-oriented and based on the NoSQL concept of a non-relational database
[21]. MongoDB differs from a Relational Database Management System (such as
MySQL) in that it uses collections rather than tables, documents rather than
rows, fields rather than columns, and embedded documents rather than joins. The
more flexible document model enables related data that would typically be
joined in an RDBMS to be embedded directly in a single document. This allows
even complex data relations and hierarchies to be defined in a single document.
Combined with the JSON format used to define documents, this promotes a more
natural way of structuring data as an object containing multiple properties and
even “sub-properties”. A document structure for a specific purpose can even be
defined without any schema; a document may have a different set of fields from
others in the same collection. This may result in faster application
development, as developers only need to think about the data definition on the
application side and do not need to apply definition changes in MongoDB (such
as changing a column name to match the application’s changes, as an RDBMS would
require) [15]. Such flexibility can be used to experiment with various data
structures, to work quickly in the prototyping phase, or to adapt to data that
changes often. If a schema definition is needed, for example to standardize a
specific document format for production or to reach a shared understanding
among all developers, Mongoose ODM (Object Document Mapper) can help with that
requirement.
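
The document model described above can be illustrated with plain JavaScript objects. The sketch below does not connect to MongoDB; the collection is just an array, and every field name and value is hypothetical.

```javascript
// One patient stored as a single document: the visits that a relational
// design would keep in a separate table are embedded directly.
// All field names and values here are hypothetical.
const patient = {
  _id: 'p-001',
  name: 'Example Patient',
  contact: { phone: '000-0000', city: 'Jakarta' }, // nested "sub-properties"
  visits: [
    { date: '2019-01-10', doctor: 'dr. A', diagnosis: 'flu' },
    { date: '2019-02-02', doctor: 'dr. B', diagnosis: 'check-up' },
  ],
};

// Without an enforced schema, documents in the same collection
// may carry different sets of fields.
const patientsCollection = [
  patient,
  { _id: 'p-002', name: 'Another Patient', allergies: ['penicillin'] },
];

// Reading one document is enough to get the related visit data.
const lastVisit = patient.visits[patient.visits.length - 1];
console.log(lastVisit.diagnosis);

// List the set of fields that each document in the collection uses.
const fieldSets = patientsCollection.map(doc => Object.keys(doc).sort().join(','));
console.log(fieldSets);
```

The two entries printed at the end differ, which is the schema flexibility the paragraph describes; a relational table would require both rows to share the same columns.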

Another difference between MongoDB and an RDBMS is MongoDB’s non-relational
approach, meaning that it does not promote data joins by default as an RDBMS
does. Instead, related data can be embedded directly in the document it relates
to, so the application only needs to fetch a single document to retrieve the
related data as well (also referred to as denormalization). This approach
raises some concerns, such as data integrity, which must always be checked when
changing a value that should also change in related documents (as this cannot
be done automatically when no relation is defined). However, avoiding relations
can help improve application performance. Joins across related tables can be
costly in performance, which is why denormalization is a common way to increase
performance [21]. By designing the document structure well and using MongoDB in
suitable use cases (such as applications with no complex transactions), the use
of MongoDB can result in better database performance for the application.

MongoDB is also designed for big data, as can be seen from its support for
horizontal scaling. MongoDB is intended to scale horizontally, by adding more
servers to the cluster rather than upgrading the current server(s). MongoDB’s
document model makes it easier to distribute data across multiple servers.
MongoDB can also automatically perform functions needed in a multi-server
setup, such as load balancing, which further helps this scaling method [21].

2.2.5 Mongoose ODM

Mongoose ODM (Object Document Mapper) is one of the ODMs available for MongoDB
and is designed to work in an asynchronous environment. An Object Document
Mapper in general is intended to map the data in the application to the
JavaScript Object Notation (JSON) format that is used throughout the
application and the database [22].

Mongoose ODM enables the definition of schemas for MongoDB documents,
specifying the structure of a document along with the data types of its fields.
Such a schema helps to standardize and clarify document structures, which suits
cases such as the production stage. In addition, Mongoose ODM enables data
validation and business logic to be defined from the schema, which further
helps developers use MongoDB with the application [23].
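
The idea of a schema with data types and validation can be sketched without any library. The hand-rolled check below only mirrors the concept; it does not use or reproduce Mongoose's API, and patientSchema, validate, and the field rules are hypothetical.

```javascript
// A hand-rolled schema check that mirrors the idea of an ODM schema.
// It is an illustration only and does not use or reproduce Mongoose's API.
const patientSchema = {
  name: { type: 'string', required: true },
  age: { type: 'number', required: false, min: 0 },
};

// Return a list of validation errors; an empty list means the document is valid.
function validate(doc, schema) {
  const errors = [];
  for (const [field, rule] of Object.entries(schema)) {
    const value = doc[field];
    if (value === undefined) {
      if (rule.required) errors.push(`${field} is required`);
      continue;
    }
    if (typeof value !== rule.type) {
      errors.push(`${field} must be of type ${rule.type}`);
    } else if (rule.min !== undefined && value < rule.min) {
      errors.push(`${field} must be at least ${rule.min}`);
    }
  }
  return errors;
}

console.log(validate({ name: 'Example Patient', age: 30 }, patientSchema));
console.log(validate({ age: '30' }, patientSchema));
```

The first call reports no errors, while the second reports a missing name and a wrongly typed age, which is the kind of check a schema makes possible before a document reaches the database.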

2.2.6 JSON

JSON, short for JavaScript Object Notation, is a data interchange format that
is used to exchange data between programming languages or systems [24].
Although it carries the name “JavaScript” and is based on JavaScript, JSON is
independent of any programming language: it is a data format intended mainly
for data exchange, so it must be accepted by many programming languages and
systems in order to be useful. Putting data into an object-based format such as
JSON enables a more natural representation of data, in which data is processed
as objects that have specific properties.

JSON uses name-value pairs as its main formatting concept. The name must be a
string, while the value can be a string, a number, a JSON object (thus making a
nested object), an array, a Boolean, or null. The name-value pairs are then
enclosed in curly brackets, forming an object. An example of JSON can be seen
in the figure below [25].

Figure 2.5 JSON Example

The example above is a JSON object consisting of name-value pairs, such as a
pair consisting of the name “Age” (a string) and the value 999 (a number).
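
The value types listed above can be tried out with JavaScript's built-in JSON.parse and JSON.stringify. In this sketch only the "Age": 999 pair comes from the description above; the other names and values are made up to show each value type.

```javascript
// JSON text as it might travel between systems. Only the "Age": 999 pair
// is taken from the description above; the other pairs are illustrative.
const text = '{"Name": "Example", "Age": 999, "Active": true, ' +
  '"Tags": ["a", "b"], "Address": {"City": "Jakarta"}, "Nickname": null}';

// Parsing turns the text into an object with typed properties.
const obj = JSON.parse(text);
console.log(typeof obj.Name);   // string value
console.log(typeof obj.Age);    // number value
console.log(obj.Address.City);  // nested object
console.log(Array.isArray(obj.Tags));

// Serializing turns the object back into text for exchange.
const back = JSON.stringify(obj);
console.log(back);
```

Because the format is plain text with a small set of value types, the same string could be parsed by a system written in any other language.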

2.2.7 RESTful APIs

An API, short for Application Programming Interface, is a tool that enables a
system to share its services with other systems by exposing only the
appropriate data and access. Developers can further define the access allowed
through specific API services, from read-only access to adding or editing
specific data. This access control allows an API to share processing services
without compromising the system’s own security [26]. Examples of APIs are those
provided by Facebook and Twitter, such as the Twitter API that enables external
applications to get a user’s feed in the form of JSON for further processing.

REST stands for Representational State Transfer, a web architectural style
introduced by Roy Fielding that is used for designing and managing distributed
systems [27]. The overall concept of REST is to use the HTTP protocol as the
way for a computer to access resources on another system, which affects how
URLs are formatted so that they represent the data model (such as getting the
list of users from /users). REST is intended to increase the performance,
scalability, simplicity, modifiability, portability, reliability, and
visibility of distributed systems that follow the style. To achieve these
benefits, REST defines constraints that must be fulfilled:

Client-Server

The client-server architecture separates the tasks of the front end, which
focuses on displaying data, from those of the back end, which focuses on
processing data. This separation also enables each side of the application to
be developed independently of the other.

Stateless

In addition to the previous constraint, communication between the client and
the server must be stateless, meaning that each request sent from the client
must contain all the data the server needs to process it. This brings benefits
such as better monitoring, as all the information is contained in the request
and the server does not need to store session state. The disadvantage of this
approach is that network performance may suffer because complete data must be
sent with every request, especially if some of the data is repeated.

Cacheable

This constraint states that the response to a request should be cacheable.
Caching can benefit both the client side and the server side, for example by
improving performance, as requested data that has been cached can be reused
without processing the request again. The disadvantage of this approach is the
possibility of problems, such as stale data being served, caused by poor
caching rules.

Uniform Interface

A uniform interface must be applied to the system’s interactions so that
various client types can interact with the system. By providing a clear
standard, different client types can follow the uniform rules set by the system
in order to connect to it. The downside of this constraint is that a uniform
approach that accepts various client types may perform worse than approaches
optimized for a specific client type.

Layered System

The concept of a layered system separates the system’s concerns (such as the
business logic and storage) into layers. This enables a clear separation of
concerns for better monitoring and suits complex systems, especially ones that
are growing rapidly, as scalability is easier to monitor in this layered form.
The approach may be a disadvantage for smaller systems, as layering can add
latency to a system that does not yet need it.

Code-On-Demand

This is an optional constraint of REST, meaning that systems may or may not use
it, depending on their function or needs. Code-on-demand means that the system
can provide code to the client on request, for the client to execute. This may
be unsuitable for applications implementing RESTful APIs, as clients that use
an API typically only need the data resulting from the processing, rather than
the processing code itself.

A visualization of a system implementing all the constraints of REST is shown
in the figure below [27].

Figure 2.6 System Implementing REST Constraints

A RESTful API is an API that uses HTTP requests to create, read, update, and
delete data (CRUD) [26]. A REST API accepts common HTTP verbs together with an
endpoint URL, and performs the processing if the HTTP verb and the URL match a
definition set beforehand. A request to a REST API usually results in the
processed data being sent back as the response. Examples of HTTP verbs used in
REST APIs are:

GET for data retrieval.

POST for data insertion.

PUT for updating data.

DELETE for deleting data.

The actions performed by a RESTful API must correspond to the verbs defined for
that API; for example, data should not be updated through an endpoint that
implements the GET method.
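
The mapping between HTTP verbs and CRUD actions can be sketched with a small dispatcher over an in-memory collection. This is an illustration only: handleRequest, the /patients URLs, and the returned status codes are assumptions made for the example, and no real HTTP server is started.

```javascript
// In-memory stand-in for a database collection.
const patients = new Map();
let nextId = 1;

// Map an HTTP verb and URL to a CRUD action. The /patients paths are
// hypothetical and chosen only for this sketch.
function handleRequest(method, url, body) {
  const match = /^\/patients(?:\/(\d+))?$/.exec(url);
  if (!match) return { status: 404 };
  const id = match[1] === undefined ? null : Number(match[1]);

  if (id === null) {
    if (method === 'GET') return { status: 200, body: [...patients.values()] };
    if (method === 'POST') {
      const created = { ...body, id: nextId++ };   // create
      patients.set(created.id, created);
      return { status: 201, body: created };
    }
    return { status: 405 }; // verb not defined for this URL
  }

  if (!patients.has(id)) return { status: 404 };
  if (method === 'GET') return { status: 200, body: patients.get(id) }; // read
  if (method === 'PUT') {
    const updated = { ...body, id };               // update
    patients.set(id, updated);
    return { status: 200, body: updated };
  }
  if (method === 'DELETE') {
    patients.delete(id);                           // delete
    return { status: 204 };
  }
  return { status: 405 };
}

console.log(handleRequest('POST', '/patients', { name: 'Example Patient' }));
console.log(handleRequest('GET', '/patients/1'));
console.log(handleRequest('PUT', '/patients/1', { name: 'Renamed Patient' }));
console.log(handleRequest('DELETE', '/patients/1'));
console.log(handleRequest('GET', '/patients/1'));
```

Note that the GET branches only read from the collection; every change goes through POST, PUT, or DELETE, in line with the rule stated above.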

2.3 Security Concerns and Technologies

2.3.1 JSON Web Token

JSON Web Token is an open industry standard (RFC 7519) used for securely
transferring claims or information between two parties in the form of a JSON
object [28]. A JSON Web Token can be signed using a secret or a public/private
key pair. JSON Web Token (JWT for short) is often used for authentication and
information exchange. In authentication, a JWT is generated when the user
successfully logs in, and the client then uses the token to access routes that
require special permission in the web application. JWT is also used to achieve
Single Sign-On (being able to access multiple applications with a single
login), because a JWT is easy to use across different servers that hold the
correct secret. JWT can also be used for secure information exchange: because
the token is signed with a specific secret or key, the receiver can verify its
origin and detect any tampering with the data being transferred.

A JSON Web Token contains three parts separated by periods:

1. Header

The header, when decoded, contains the type of the token (“JWT” for a JSON Web
Token) and the signing algorithm (such as HMAC SHA-256, written as HS256).

2. Payload

The payload contains the information or data that is embedded in the token when
it is generated. Sensitive data should not be embedded in the token, as the
payload of a signed token is only encoded, not encrypted.

3. Signature

Lastly, the signature part contains the signature computed from the header, the
payload, and the defined secret using the chosen algorithm. The signature is
used to verify that the data in the token is valid and that it has not been
tampered with in transit.
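
The three-part structure can be reproduced with Node's built-in crypto module. The sketch below builds and checks an HMAC SHA-256 signed token by hand, purely to show how the parts fit together; it omits expiry and header checks, the secret and claims are made up, and a production system would normally rely on a maintained JWT library instead.

```javascript
const crypto = require('crypto');

// Encode a JavaScript object as base64url JSON, as used by the first two parts.
function encodePart(obj) {
  return Buffer.from(JSON.stringify(obj)).toString('base64url');
}

// Sign "header.payload" with HMAC SHA-256 using the shared secret.
function sign(data, secret) {
  return crypto.createHmac('sha256', secret).update(data).digest('base64url');
}

function createToken(payload, secret) {
  const header = { alg: 'HS256', typ: 'JWT' };
  const data = `${encodePart(header)}.${encodePart(payload)}`;
  return `${data}.${sign(data, secret)}`;
}

// Return the payload if the signature matches, otherwise null.
function verifyToken(token, secret) {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  const expected = Buffer.from(sign(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  if (expected.length !== actual.length || !crypto.timingSafeEqual(expected, actual)) {
    return null;
  }
  return JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
}

const secret = 'demo-secret-change-me'; // hypothetical secret for the demo
const token = createToken({ sub: 'user-1', role: 'doctor' }, secret);
console.log(token.split('.').length); // header, payload, signature
console.log(verifyToken(token, secret));

// Changing the payload without re-signing invalidates the token.
const [h, , s] = token.split('.');
const forged = `${h}.${encodePart({ sub: 'user-1', role: 'admin' })}.${s}`;
console.log(verifyToken(forged, secret));
```

The forged token is rejected because its signature no longer matches the modified payload, which is how the signature part protects against tampering.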

In authentication, there are typically two ways of handling a login:
cookie-based authentication and token-based authentication. In cookie-based
authentication, a session is typically created on the server and a cookie is
returned for the client to use in all later requests. A database access is
needed to validate the session and to access the user data. In token-based
authentication, a signed token is passed to the application’s API, which
validates it before allowing entry to protected sections of the application.
With the token-based approach, the token can be used in various ways, which
makes the approach more flexible. The use of JSON Web Tokens can also help
performance, as the token can be self-contained (carrying the needed user data
itself), so there is no need for a database access on every request. Most
importantly, the use of tokens for authentication provides a clear separation
of roles between the client and the server: the server generates the token and
gives it to the client, and afterwards only needs to wait for the client to
send a request with its token for processing. This clear separation lets the
server act as the provider and the client as the requester.

2.3.2 Encryption

Encryption is the process of converting data or information that was originally
readable or understandable into an encrypted form that is unreadable [29]. In
its encrypted form, the original meaning of the data cannot be obtained.
Decryption is the opposite of encryption: the process of converting encrypted
data back into its original form to obtain the actual meaning. Encryption is
considered two-way because encrypted data can be converted back into its
original form through decryption. Encryption is applied to data that needs to
be converted back to recover its actual values, such as in a text messaging
application. In such an application, text messages are encrypted to keep them
private, but they must be decryptable so that the original meaning can be given
to the recipient.

There are two types of encryption methods [30]:

1. Symmetric encryption, where the same key (the secret key) is used for both
the encryption and decryption processes.

2. Asymmetric encryption, where separate keys are assigned for encryption and
decryption. The public key is shared for encryption, but the secret key (for
decryption) is held only by the receiving side. Asymmetric encryption has been
found to be almost a thousand times slower than symmetric encryption, as
public-key operations require intensive computation. This affects the use of
the technique on smaller devices.

There are also various encryption algorithms. Some examples are Triple DES
(Triple Data Encryption Algorithm), RSA (Rivest-Shamir-Adleman), Blowfish,
Twofish, and AES (Advanced Encryption Standard).
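
Both kinds of encryption can be tried with Node's built-in crypto module. The sketch below is a minimal illustration, not a recommendation for a particular setup: AES-256-GCM and 2048-bit RSA are choices made for the example, and the message text and function names are hypothetical.

```javascript
const crypto = require('crypto');

// Symmetric: the same 256-bit key encrypts and decrypts (AES-256-GCM here).
function encryptSymmetric(plaintext, key) {
  const iv = crypto.randomBytes(12); // fresh nonce for every message
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return { iv, data, tag: cipher.getAuthTag() };
}

function decryptSymmetric({ iv, data, tag }, key) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag); // lets decryption detect modified ciphertext
  return Buffer.concat([decipher.update(data), decipher.final()]).toString('utf8');
}

const key = crypto.randomBytes(32);
const sealed = encryptSymmetric('hello, clinic', key);
console.log(sealed.data.toString('hex'));   // unreadable encrypted form
console.log(decryptSymmetric(sealed, key)); // original text recovered

// Asymmetric: encrypt with the public key, decrypt with the private key (RSA).
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
});
const encrypted = crypto.publicEncrypt(publicKey, Buffer.from('hello, clinic'));
const decrypted = crypto.privateDecrypt(privateKey, encrypted).toString('utf8');
console.log(decrypted);
```

In practice the two are often combined, with the slower asymmetric step used only to exchange a symmetric key.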

2.3.3 Hashing and Salting

Hashing is a method of taking a variable-length input and generating a fixed-length output (typically a fixed-length integer value) from which the original meaning of the data cannot be read [31]. Hashing is used for security purposes, such as protecting sensitive data by not storing its actual value in the database. The value produced by a hash function is called the hash value. Although encryption and hashing share the purpose of securing data by altering its value, hashing differs in that it is one-way: data can be encoded into a hash value, but the hash value cannot be decoded back, so the original value cannot be retrieved from hashed data. This one-way characteristic makes attacks harder, because recovering the value directly from the stored data is not feasible. Instead, the attacker would have to guess either the secret used in the hash or the actual value of the data itself (both approaches being exhaustive brute-force searches) [32].
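The fixed-length and repeatable nature of a hash value can be sketched with the Node.js crypto module. SHA-256 is used here only as an assumed example of a hash function to show these properties; the helper name sha256Hex is hypothetical, and a plain SHA-256 digest on its own is not presented as a recommended way to store passwords.

```javascript
const crypto = require("crypto");

// Returns the SHA-256 hash value of the input as a hexadecimal string.
function sha256Hex(input) {
  return crypto.createHash("sha256").update(input, "utf8").digest("hex");
}

// Inputs of very different lengths produce hash values of the same length.
const shortHash = sha256Hex("abc");
const longHash = sha256Hex("a much longer input ".repeat(100));

console.log(shortHash.length, longHash.length); // both are 64 hex characters

// The same input always gives the same hash value, and there is no
// function in the module that turns shortHash back into "abc".
console.log(sha256Hex("abc") === shortHash);
```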

The problem with the hashing approach is that if hashing is the only method used to secure the data, the same data always produces the same hash value. This allows attackers to guess the actual value of a password by brute force until a matching hash value is found. One example of this approach is the dictionary attack: an attempt to find the actual value behind hashed data by hashing words from a dictionary, or words commonly used as passwords, and comparing the results with the hashed data. If the stored password is common enough, such an attack would succeed.

To overcome this weakness, salting is used to complement hashing. A salt is random data (typically of significant length) that is used as an additional input to the hash function. It is implemented to ensure the uniqueness of any hash value generated [33]. With both methods in place, even data with exactly the same value produce different hash values, which overcomes the problem described above and further secures the data against attacks such as dictionary attacks. It is important to note that the salt should always be randomly generated and never reused, as a reused salt would again produce the same hash value for the same data.
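The effect of a salt can be sketched as follows. The word list, the helper name hashPassword, and the use of SHA-256 are assumptions for illustration only. The sketch shows that a table of hash values computed in advance from common passwords matches an unsalted hash value, but not a salted one. An attacker who also obtains the salt can still try guesses against one record at a time, so the sketch illustrates uniqueness rather than complete protection.

```javascript
const crypto = require("crypto");

// Hashes the password with an optional salt placed in front of it.
function hashPassword(password, salt = "") {
  return crypto.createHash("sha256").update(salt + password, "utf8").digest("hex");
}

// Hypothetical list of common passwords, hashed in advance by an attacker.
const dictionary = ["123456", "password", "qwerty", "letmein"];
const precomputed = new Map(dictionary.map((word) => [hashPassword(word), word]));

// Two users who chose the same common password, stored without a salt.
const unsaltedA = hashPassword("password");
const unsaltedB = hashPassword("password");

// The same two users, each stored with a fresh random salt.
const saltA = crypto.randomBytes(16).toString("hex");
const saltB = crypto.randomBytes(16).toString("hex");
const saltedA = hashPassword("password", saltA);
const saltedB = hashPassword("password", saltB);

console.log(unsaltedA === unsaltedB); // identical hash values without salt
console.log(precomputed.get(unsaltedA)); // the table reveals the password
console.log(saltedA === saltedB); // hash values differ with per-user salts
console.log(precomputed.has(saltedA)); // the table no longer matches
```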

Some examples of hashing functions suited to passwords are bcrypt, PBKDF2, and scrypt. In particular, PBKDF2 (Password-Based Key Derivation Function 2) derives a secret key from the given password (the value to be stored) together with a salt, repeating the underlying process many times. The repetition makes each guess more expensive for an attacker, so password attacks take longer to find the actual value. PBKDF2 has benefits such as having been developed from an academic background and having been tested and researched over the years. It is also available in the built-in crypto module of Node.js, which makes it easier to implement in that environment.
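A minimal sketch of storing and verifying a password with PBKDF2 in Node.js is given below. The iteration count, key length, digest, and the shape of the stored record are assumptions chosen for this example and would need to be tuned for a real deployment. The function names hashPassword and verifyPassword are hypothetical.

```javascript
const crypto = require("crypto");

// Assumed parameters for illustration only.
const ITERATIONS = 100000;
const KEY_LENGTH = 64; // bytes
const DIGEST = "sha512";

// Derives a key from the password with a fresh random salt and returns
// both values so that they can be stored together.
function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, DIGEST);
  return { salt: salt.toString("hex"), hash: hash.toString("hex") };
}

// Repeats the derivation with the stored salt and compares the results
// in constant time.
function verifyPassword(password, stored) {
  const salt = Buffer.from(stored.salt, "hex");
  const expected = Buffer.from(stored.hash, "hex");
  const actual = crypto.pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, DIGEST);
  return crypto.timingSafeEqual(actual, expected);
}

const record = hashPassword("s3cret-passphrase");
console.log(verifyPassword("s3cret-passphrase", record));
console.log(verifyPassword("wrong-guess", record));
```

Only the salt and the derived hash are kept in the record, so the password itself never needs to be stored.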

2.3.4 SQL Injection

SQL injection is a hacking technique that attempts to access the data in a database through manipulation of SQL syntax. Databases based on SQL (Structured Query Language), such as MySQL and PostgreSQL, receive queries as strings of structured statements [34]. This creates a problem, as attackers can manipulate the queries given to the database and extract information from them (for example through blind SQL injection, in which the attacker repeatedly tries different parameters and infers information from how the database responds) [35]. For example, a query for accepting a user's login would look like:

SELECT * FROM userTable WHERE uName = '$uName' AND pass = '$pass'

If the input is not handled properly, attackers can manipulate the query in order to bypass the login, for example by supplying a partial string so that the query becomes:

SELECT * FROM userTable WHERE uName = '' OR 1=1--' AND pass = ''


This manipulation exploits the way the database interprets the quote and dash symbols. The injected quote closes the user name string early, and the double dash starts a comment, so the password condition is ignored. Because the remaining condition 1=1 is always true, the query always succeeds, thus allowing the attacker to bypass the login.
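The way such a query comes into existence can be sketched without a database. In the sketch below, the first function pastes user input directly into the SQL text and yields a query equivalent to the manipulated one above. The second function keeps the input apart from the SQL text, which is the general idea behind parameterized queries. Both function names, the ? placeholder style, and the returned object shape are assumptions for illustration and do not belong to any particular database driver.

```javascript
// Unsafe: user input becomes part of the SQL text itself.
function buildLoginQueryUnsafe(uName, pass) {
  return (
    "SELECT * FROM userTable WHERE uName = '" + uName + "' AND pass = '" + pass + "'"
  );
}

// Safer shape: the SQL text is fixed and the values travel separately,
// so quotes or dashes in the input cannot change the statement.
function buildLoginQueryParameterized(uName, pass) {
  return {
    text: "SELECT * FROM userTable WHERE uName = ? AND pass = ?",
    values: [uName, pass],
  };
}

const attackInput = "' OR 1=1--";

const unsafeQuery = buildLoginQueryUnsafe(attackInput, "");
console.log(unsafeQuery);
// SELECT * FROM userTable WHERE uName = '' OR 1=1--' AND pass = ''

const safeQuery = buildLoginQueryParameterized(attackInput, "");
console.log(safeQuery.text); // unchanged by the attack input
```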

In contrast to SQL, MongoDB does not send queries as strings. Instead, each operation has its own dedicated method, and a method cannot be used to perform any operation other than its own [36]. For example, the login query mentioned before would look like the query shown below in MongoDB:

db.users.find({uName: uName, pass: pass});

The find method is only intended for finding data, and nothing else. With this approach, MongoDB is considered less exposed to this kind of injection by default than SQL-based databases.

To make the implementation of MongoDB more secure, careful sanitization of data can be implemented in the application so that each incoming request has the expected format, thus protecting the system from request injection attacks (requests that contain system-reserved words). In addition, MongoDB queries can be written with specific parameters so that each action only affects its corresponding data, further reducing the scope available to attackers [36].
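One possible form of such sanitization is sketched below, assuming that the login data arrives as a parsed JSON request body. If a field is passed to the query unchecked, an attacker could submit an object containing a reserved operator such as $ne in place of a string. The helper names requireString and buildLoginFilter are hypothetical, and the field names follow the login example above.

```javascript
// Accepts only plain strings, so that objects carrying reserved
// operators (for example { $ne: null }) are rejected.
function requireString(value, fieldName) {
  if (typeof value !== "string") {
    throw new TypeError(fieldName + " must be a string");
  }
  return value;
}

// Builds the filter for the login lookup from an untrusted request body.
function buildLoginFilter(body) {
  return {
    uName: requireString(body.uName, "uName"),
    pass: requireString(body.pass, "pass"),
  };
}

// A well-formed request passes through unchanged.
console.log(buildLoginFilter({ uName: "alice", pass: "secret" }));

// A request that tries to inject an operator is rejected.
try {
  buildLoginFilter({ uName: "alice", pass: { $ne: null } });
} catch (err) {
  console.log(err.message);
}
```

The resulting object would then be passed to the find method in place of raw request data.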


2.4 Agile Software Development

Agile software development is a software development philosophy based on the values and principles of the Agile Manifesto, which act as the base concept of its methods and practices. The values from the Agile Manifesto are shown in the figure below [37].

Figure 2.7 Values from Agile Manifesto


The principles from the Agile Manifesto are shown in the figure below.

Figure 2.8 Principles from Agile Manifesto

From the values and principles in the Agile Manifesto, it can be seen that agile software development emphasizes self-organization, team interaction, openness and adaptation to change, the pursuit of a working product at each step rather than running straight to the final product, and collaboration with the client to achieve valuable software. This philosophy comes from the realization that changes in technology, in how businesses are run, and even in how humans behave are happening quickly and show no sign of slowing down. Such quick changes in technology play an important role in keeping up with changes in business and human behavior, as technologies such as software are a key component in those sectors, addressing anything from business problems to specific human needs and wants [38]. Although they bring benefits, such quick changes also create problems for developers: the need for new software and software updates increases, while the time available for development decreases in order to keep up with the continuous changes. This means that developers can no longer work the same way as before, and that improving individual skills alone may not be the solution either. Developers have to change the way they think about and work on software development in order to cope with rapid changes that have not been experienced before.

By implementing agile software development, the quality of the resulting software can be enhanced because the client is involved early in the development process. With the client working together with the developers on the product, the developers are able to get feedback directly from the actual client who is going to use the product, and can work toward features that fit what the client really wants.

By enabling the client to interact with developers regularly, developers are able to learn exactly how the client feels about a given piece of progress, such as whether they like it, whether it needs revisions, or whether the feature is not needed. This benefits the team's commitment, as the developers now have a concrete goal close to them, which is to fulfill the client's actual needs and eventually earn positive feedback. This is achieved by enabling both parties to interact during the development phase, rather than tasking the developers only with delivering a final product for which they might not receive any feedback from the client at all. In addition, involving the client in the development is also likely to improve the customer relationship, as the connection between the developer and the client is no longer focused only on the contract. Agile software development also builds a connection that focuses on satisfying the customer with an accurate final product, efficiently developed.

As its name suggests, being agile (moving swiftly) also allows for reduced waste, which can be seen in its approaches, such as producing a working product as soon as possible to obtain early feedback. With early feedback, developers can steer the product's development onto the right path, thus decreasing the probability of rework at a later stage caused by inaccurate requirements. Another example is the agile approach of writing only as much documentation as is sufficient to serve the concerned actors [39].

One of the key differences between agile software development and other development concepts, such as the waterfall model, is that agile emphasizes testing and gathering feedback during the development phase (even early in the development). Waterfall software development is likely to separate the development and testing phases (typically conducting testing only after the development phase), thus involving the client only at the end of the development. Agile software development involves the client in the development of the product, and repeats the cycle of development and testing with the client until the product is acceptable to the client. The agile software development work cycle is shown in the figure below [40].

Figure 2.9 Agile Software Development Cycle

2.4.1 Scrum Software Development Methodology

As agile software development is only a base concept for developing products, companies need a practical method that they can actually implement. Scrum is one of the development methodologies based on the concept of agile software development, designed to turn complex problems into a usable product through the iterative characteristic of agile.


There are three roles in the Scrum methodology [41], each having its own

purposes in the execution of Scrum:

1. Product owner

The product owner acts as the representative of the stakeholders or the client to the team, with the main purpose of steering the development team so that it works toward bringing value to the business. Aside from funding management (receiving and managing funds for the project), the product owner is also responsible for prioritizing the given initial requirements for the team, so that the development team works on finishing the more valuable features first.

2. Scrum Master

The Scrum Master is the person responsible for making sure that the team works according to the Scrum software development method. This includes teaching the method to the team, making sure that the method's implementation fits the team, and dealing with any obstacles that keep the team from implementing the Scrum method.

3. Development team

The development team is a self-managing and cross-functional team that is responsible for fulfilling the list of requirements through the iterative process of Scrum, working toward not only the success of the final process, but also the success of each iteration.

Scrum works as follows. The product owner first receives the goal and the initial requirements from the client. The product owner then creates a Product Backlog, which is a list of requirements that, when fulfilled, achieves the goal given by the client. The Product Backlog is ordered by priority (a higher priority means higher value to the business). After that, the product owner holds the initial Sprint planning meeting. In Scrum, a Sprint is an iteration period in which a certain list of tasks is done (it can be set in weeks or even months). In a Sprint planning meeting, the product owner presents the list of requirements that should be developed into functionality in the product during the Sprint period (highest priority first), and the team is able to discuss or question their purpose. The development team then confirms to the product owner which requirements will be the target for the Sprint period, and starts to plan how to develop them (the Sprint period starts here).

In addition to the Sprint itself, a daily meeting called the Daily Scrum must also be held by the development team. In the Daily Scrum, each team member shares their progress, their plan for the next step, and any obstacle they may face. This gives the team members the same understanding of the team's current progress, and also fosters self-management in the team.

A Sprint review is then held at the end of a Sprint period, where the development team presents its progress to the product owner and any stakeholders who may want to attend. At that meeting, the next tasks and any revisions for the development team are also discussed. In addition, a Sprint retrospective is held between the development team and the Scrum Master to discuss any ideas or changes needed to further improve the implementation of the Scrum method in the team.

The whole Scrum workflow described above, as also shown in the figure below, is repeated iteratively until the product is accepted by the client (ready for production).

Figure 2.10 Scrum Software Development Method Workflow
