This work is published under Attribution-NonCommercial-ShareAlike 4.0 International License
International Journal of Informative & Futuristic Research ISSN: 2347-1697
Volume 5 Issue 8 April 2018 www.ijifr.com
TRANSITION FROM TRADITIONAL DATABASES TO NOSQL DATABASES
Paper ID: IJIFR/V5/E8/010 Page No.: 9215-9223 Subject Area: Computer Engineering
Key Words: Big Data, RDBMS, NoSQL, Cloud Computing
Subita Kumari
Research Scholar, Department of Computer Sci. & Engineering, University Institute of Engineering & Technology, Maharishi Dayanand University, Rohtak-Haryana
Abstract
Relational databases such as SQL Server, Oracle and MySQL have almost
forty-five years of use in real-world production environments. For the last
few decades, these databases have been used successfully by large banks
and other institutions and organizations throughout the world for
transaction processing, for handling structured data, and for building and
managing intelligent, mission-critical applications. Today, however,
heterogeneous and unstructured data is growing remarkably because of the
availability and speed of the internet and the connectivity of devices
through the IoT. Companies are therefore increasingly considering
alternatives to relational infrastructure to deal with Big Data. NoSQL
databases have positioned themselves as alternative solutions. This paper
explains the need for a transition from traditional databases to NoSQL
databases.
1. INTRODUCTION
Relational database management systems follow the relational data model. A database is
composed of relations, or tables. A table is a collection of rows and columns: each row
represents a record and each column represents a field. Tables are linked to each other
through defined relationships such as foreign keys. These relationships enable users to
retrieve and join data from one or several tables using a single query. Abstractly, the
tables and the relationships between them represent real-world entities, which are used in
designing the database schema. Relational databases such as SQL Server, Oracle and
MySQL have almost forty-five years of use in real-world production environments. For the
last few decades, these databases have been used successfully by large banks and other
institutions and organizations throughout the world for transaction processing, for
handling structured data, and for building and managing intelligent, mission-critical
applications [1]. Skilled and experienced programmers who can work on relational systems
are also easy to find. That is why most organizations are not going to move their
transactional systems from relational databases to NoSQL databases. Today, however,
heterogeneous and unstructured data is growing remarkably because of the availability and
speed of the internet and the connectivity of devices through the IoT. With the explosive
increase in global data, the term big data is mainly used to describe enormous datasets,
generated by widely distributed data sources, that require newer technologies and
architectures to store, process and manage. Companies are therefore increasingly
considering alternatives to relational infrastructure to deal with Big Data. NoSQL
databases have positioned themselves as alternative solutions [3].
2. RELATIONAL DATABASE MANAGEMENT SYSTEM
Relational database management systems follow the relational data model. A database is
composed of relations, or tables. A table is a collection of rows and columns: each row
represents a record and each column represents a field. Tables are linked to each other
through defined relationships such as foreign keys. These relationships enable users to
retrieve and join data from one or several tables using a single query. Abstractly, the
tables and the relationships between them represent real-world entities, which are used in
designing the database schema.
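The description above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables, their columns and the sample rows are assumptions made only for this example and do not come from any system discussed in the paper. The sketch links two tables through a foreign key and reads from both with a single join query.

```python
import sqlite3

# In-memory database; the schema below is hypothetical and purely illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL
    );
""")

# Each row is a record; each column is a field.
conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ravi")])
conn.executemany("INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                 [(1, 250.0), (1, 100.0), (2, 75.5)])

# A single query joins both tables through the foreign key relationship.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# The relationship is enforced: an order for an unknown customer is rejected.
try:
    conn.execute("INSERT INTO orders (customer_id, amount) VALUES (?, ?)", (99, 10.0))
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(rows, fk_rejected)
```

The join returns one summary row per customer, and the foreign key prevents an order from referring to a customer that does not exist.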
A. ACID properties of Relational Databases
A transaction is a set of logically related operations performed on a database to carry out
a unit of work. The four main features of a relational database transaction that guarantee
its integrity are referred to as the ACID (Atomicity, Consistency, Isolation, and
Durability) properties. Conventional RDBMS applications have focused on ACID transactions.
Atomicity
Atomicity means that either all operations of a transaction are executed or none of
them is. The recovery management component of the RDBMS ensures atomicity.
Consistency
Consistency means the database must be in a consistent state both before and
after a transaction executes. The concurrency control mechanism of the RDBMS,
together with the integrity constraints declared in the schema, ensures
consistency.
Isolation
If two or more transactions execute concurrently, isolation guarantees that each
running transaction is isolated from the others, even when they perform similar
tasks. In other words, transactions operating on the same data do not interfere with
each other. The concurrency control mechanism of the RDBMS ensures isolation.
Durability
Once a transaction has committed, its changes must survive any subsequent failure.
The recovery management component of the RDBMS ensures durability. This property
is very important when systems fail.
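Atomicity, the first of these properties, can be sketched with Python's built-in sqlite3 module. The accounts table, the transfer function and the balances are hypothetical and chosen only for illustration. The transfer credits the destination first, so a failing debit shows that the earlier credit is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK constraint forbids negative balances.
conn.execute("""
    CREATE TABLE accounts (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # debit violates the CHECK, credit is undone
final = balances(conn)
print(ok, failed, final)
```

After the failed transfer the balances are exactly those left by the first, successful one: no partial update remains visible.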
B. Advantages of Relational Databases
For the last few decades, relational database management systems have offered software
developers and businesses fairly robust information management tools. The
following are some of the advantages of the relational database model:
Data Structure
The data of a relational database is in tabular form, which is easy for users to
comprehend and use. Structured database queries can efficiently search for
matching entries in the columns of the tables.
Data Independence
Users of the database can access data without knowing its physical details. The
levels of the database shown in figure 1 follow abstraction: each lower-level layer
hides its details from the layers above it. This is called data independence.
Indexing
RDBMS allow various kinds of indexes to reduce I/O cost and to increase the speed of
data access.
Multiple User Access
RDBMS allow multiple users to access the database concurrently. This is made
possible by the concurrency control mechanism of the RDBMS, which prevents
users' operations from accessing partly updated records.
Authentications and Privileges
RDBMS provide an authentication feature that allows the database administrator to
limit database access to authorized users only. RDBMS also provide a privilege
control feature that allows the administrator to grant access on the basis of the
task the user needs to perform.
Maintenance
RDBMS have built-in maintenance tools that allow database administrators to
easily test, repair and maintain the database.
Network Access
With an RDBMS, users can access and use the database without logging into the physical
computer system. RDBMS use server daemon programs that listen for requests on a
network and connect clients to the database.
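The indexing advantage listed above can be observed in a small sketch with Python's built-in sqlite3 module. The customers table, the index name idx_customers_city and the generated rows are assumptions for illustration only. The sketch asks SQLite for its query plan before and after an index is created on the searched column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [(f"user{i}", "Rohtak" if i % 10 == 0 else "Delhi") for i in range(1000)],
)


def plan(conn, sql, params=()):
    """Return the textual detail column of SQLite's query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT name FROM customers WHERE city = ?"

before = plan(conn, query, ("Rohtak",))  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_customers_city ON customers (city)")
after = plan(conn, query, ("Rohtak",))   # the index narrows the search

print("before:", before)
print("after: ", after)
```

The exact wording of the plan differs between SQLite versions, but the first plan reports a scan of the table while the second names the index, which is how an index reduces the I/O needed to answer the query.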
Relational database management systems such as Microsoft SQL Server, Oracle, MySQL,
and Sybase are the key database management systems and have been widely used for
the last few decades by individuals and organizations for managing structured data.
However, horizontal scaling is a big challenge for them in the contemporary era of web
technologies. Recently, with the increase in web applications and the diversity of data,
there is a need to explore non-relational options which can provide a schema-less data
structure, horizontal
scaling, high availability and simple replication. These newly explored options are called
NoSQL databases.
3. NOSQL DATABASES
NoSQL is the term used for data stores that do not follow the relational model and
do not rely on SQL (Structured Query Language) as their data query language. NoSQL is a
class of databases that allows better application development through the use of flexible
schemas. These databases scale horizontally and dynamically to support a large number of
users and a large amount of data. They allow complex, distributed processing of data, so
they provide improved performance for highly responsive applications [2].
They are categorized into several classes based on how they store data.
Key-Value (KV) Store
Key-value databases are based on the concept of the distributed hash table and on Amazon's
Dynamo [4]. Amazon uses its Dynamo key-value store for its shopping carts. Key-value
databases store data as values and pair each value with an alphanumeric identifier (the
key) in simple standalone tables called hash tables [5]. Examples of key-value databases
are Dynamo, Tokyo Cabinet, Redis, Riak, Voldemort, and MemcacheDB.
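The key-value model can be sketched in a few lines of Python. The KeyValueStore class and the cart keys below are hypothetical and are not the interface of Dynamo, Redis or any other product named above. The point of the sketch is that values are opaque to the store and can be reached only through their key.

```python
class KeyValueStore:
    """Toy in-memory key-value store: values are opaque and reachable only by key."""

    def __init__(self):
        self._table = {}  # a Python dict is itself a hash table

    def put(self, key, value):
        """Store or overwrite the value held under key."""
        self._table[key] = value

    def get(self, key, default=None):
        """Return the value for key, or default if the key is absent."""
        return self._table.get(key, default)

    def delete(self, key):
        """Remove key; return True if it was present."""
        existed = key in self._table
        self._table.pop(key, None)
        return existed


# Hypothetical shopping-cart usage: one key per cart, the whole cart as the value.
store = KeyValueStore()
store.put("cart:1001", ["book-42", "pen-7"])
store.put("cart:1002", ["lamp-3"])

cart = store.get("cart:1001")
missing = store.get("cart:9999")
removed = store.delete("cart:1002")
print(cart, missing, removed)
```

The store offers no way to ask which carts contain a given item; that trade-off is what keeps lookups by key simple and fast.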
Column-Oriented Databases
Column-oriented data stores use a column-oriented data structure that accommodates
multiple attributes per key [5]. They are also easier to extend, because new columns can
be added to the database later without supplying values for those columns in the rows
that already exist. Examples of column-oriented databases are Hypertable, HBase, and
Cassandra.
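The idea of several attributes per key, with columns that can be added later, can be sketched as follows. The ColumnFamily class is a hypothetical in-memory model and does not reflect how HBase or Cassandra lay out data on disk. Each row key maps to its own set of columns, so a new column needs no value in older rows.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy sparse table: each row key maps to only the columns it actually has."""

    def __init__(self):
        self._rows = defaultdict(dict)  # row key -> {column name: value}

    def put(self, row_key, column, value):
        """Set one column of one row, creating the row if needed."""
        self._rows[row_key][column] = value

    def get_row(self, row_key):
        """Return a copy of all columns stored for a row."""
        return dict(self._rows.get(row_key, {}))

    def get_column(self, column):
        """Return {row key: value} for every row that has this column."""
        return {key: cols[column] for key, cols in self._rows.items() if column in cols}


users = ColumnFamily()
users.put("u1", "name", "Asha")
users.put("u2", "name", "Ravi")

# A new column appears later; the existing row "u1" needs no value for it.
users.put("u2", "twitter", "@ravi")

row_u1 = users.get_row("u1")
twitter = users.get_column("twitter")
print(row_u1, twitter)
```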
Graph Store
Graph databases handle highly interconnected data, stored as nodes and the relationships
(edges) between them. They are useful when the relationships between data items are more
important than the data itself. They replace relational tables with structured relational
graphs of interconnected key-value pairs. They are similar to object-oriented databases in
that the graphs are represented as an object-oriented network of nodes, edges, and
properties [5]. Examples of graph databases are Neo4J, InfiniteGraph, Sones GraphDB,
InfoGrid, AllegroGraph, and
FlockDB.
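A network of nodes, edges and properties can be sketched in plain Python. The PropertyGraph class, the FOLLOWS label and the sample people are assumptions for illustration and are not the API of any graph database listed above. The traversal is a breadth-first search that answers a relationship question (who is reachable within a given number of hops) without any join.

```python
from collections import deque


class PropertyGraph:
    """Toy property graph: nodes and directed, labelled edges carry properties."""

    def __init__(self):
        self.nodes = {}  # node id -> properties
        self.edges = {}  # node id -> list of (label, target id, properties)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props
        self.edges.setdefault(node_id, [])

    def add_edge(self, src, label, dst, **props):
        self.edges[src].append((label, dst, props))

    def neighbors(self, node_id, label):
        """Return the targets of edges with the given label leaving node_id."""
        return [dst for (lbl, dst, _) in self.edges[node_id] if lbl == label]

    def within_hops(self, start, label, max_hops):
        """Breadth-first search: nodes reachable from start in at most max_hops edges."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, dist = queue.popleft()
            if dist == max_hops:
                continue  # do not expand beyond the hop limit
            for nxt in self.neighbors(node, label):
                if nxt not in seen:
                    seen.add(nxt)
                    found.append(nxt)
                    queue.append((nxt, dist + 1))
        return found


g = PropertyGraph()
for person in ("a", "b", "c", "d"):
    g.add_node(person, kind="person")
g.add_edge("a", "FOLLOWS", "b", since=2016)
g.add_edge("b", "FOLLOWS", "c", since=2017)
g.add_edge("c", "FOLLOWS", "d", since=2018)

reachable = g.within_hops("a", "FOLLOWS", 2)
print(reachable)
```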
Document Oriented Databases
Document-oriented databases store data in the form of object-like documents. They are
good for storing and managing large collections of documents such as text documents,
email messages, and product or customer details [5]. They use JSON (JavaScript Object
Notation), BSON (Binary JSON) or XML (Extensible Markup
Language) as data exchange formats. MongoDB and CouchDB are well-known open source
document-oriented databases. SimpleDB is a proprietary document-oriented database of
Amazon.
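The document model can be sketched with Python's built-in json module. The DocumentCollection class and the sample documents are hypothetical and do not reproduce the MongoDB or CouchDB interfaces. Documents with different fields are stored side by side as JSON text, and a query matches on the fields a document happens to have.

```python
import json


class DocumentCollection:
    """Toy document store: each document is kept as JSON text under a generated id."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, doc):
        """Store a document and return its generated id."""
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = json.dumps(doc)
        return doc_id

    def find(self, **criteria):
        """Return documents whose top-level fields equal all given criteria."""
        results = []
        for doc_id, raw in self._docs.items():
            doc = json.loads(raw)
            if all(doc.get(field) == value for field, value in criteria.items()):
                results.append({"_id": doc_id, **doc})
        return results


catalog = DocumentCollection()
# The two documents below deliberately have different shapes.
catalog.insert({"type": "customer", "name": "Asha",
                "address": {"city": "Rohtak", "state": "Haryana"}})
catalog.insert({"type": "product", "title": "Lamp", "price": 499, "tags": ["home"]})

customers = catalog.find(type="customer")
products = catalog.find(type="product", price=499)
print(customers, products)
```

Unlike the key-value model, the store can look inside each value, which is what allows queries on individual fields.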
4.1 The Benefits of NoSQL
NoSQL databases are generally more scalable than relational databases. They support agile
development and quick iteration. They work on large volumes of structured, semi-structured and
unstructured data. They fit object-oriented programming, which is easy to use and flexible. They
offer an efficient scale-out architecture instead of an expensive, monolithic relational
architecture. Some of the benefits of NoSQL databases are explained below:
Dynamic Schema
Relational databases need a defined schema, or structure, before data can be added
to the database. NoSQL databases are designed to permit the adding of data without
a predefined schema.
Auto-sharding
Relational databases scale vertically, which means a single server has to take care
of the entire database to ensure availability and consistency of data. This single
server becomes expensive and places restrictions on scalability. The solution to
this problem is to scale horizontally, which means adding more servers instead of
adding capacity to a single server. NoSQL databases support an auto-sharding
mechanism, which means they automatically spread data across an arbitrary number
of servers.
Replication
Most NoSQL databases support automatic replication, which means they remain
highly available and can recover from disasters without involving separate
applications.
Integrated Caching
Many NoSQL databases have excellent integrated caching capabilities, keeping
frequently used data in system memory as much as possible without needing a
separate caching layer.
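Auto-sharding and replication, as described above, can be sketched together in plain Python. The ShardedStore class, the node names and the simple hash-modulo placement are assumptions for illustration; production systems typically use more elaborate schemes such as consistent hashing or range partitioning, and they rebalance data when servers are added. The sketch spreads keys over several servers and keeps each key on more than one of them, so a read still succeeds when one server is down.

```python
import hashlib


class ShardedStore:
    """Toy sharded store: each key lives on `replicas` consecutive nodes."""

    def __init__(self, node_names, replicas=2):
        self.node_names = list(node_names)
        if not 1 <= replicas <= len(self.node_names):
            raise ValueError("replicas must be between 1 and the number of nodes")
        self.replicas = replicas
        self.nodes = {name: {} for name in self.node_names}  # node -> local data

    def owners(self, key):
        """Pick the nodes for a key from a stable hash (not Python's salted hash())."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        first = int.from_bytes(digest[:8], "big") % len(self.node_names)
        return [self.node_names[(first + i) % len(self.node_names)]
                for i in range(self.replicas)]

    def put(self, key, value):
        """Write the value to every replica of the key."""
        for name in self.owners(key):
            self.nodes[name][key] = value

    def get(self, key, down=()):
        """Read from the first replica that is not listed as down."""
        for name in self.owners(key):
            if name not in down:
                return self.nodes[name].get(key)
        raise RuntimeError("all replicas for this key are unavailable")


store = ShardedStore(["node-a", "node-b", "node-c"], replicas=2)
for i in range(100):
    store.put(f"sensor:{i}", i * 10)

copies = sum(len(data) for data in store.nodes.values())  # 100 keys x 2 replicas
primary = store.owners("sensor:7")[0]
value_with_failure = store.get("sensor:7", down=(primary,))
print(copies, primary, value_with_failure)
```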
4.2 What's causing transition from traditional databases to NoSQL databases?
There is no single motive or technology behind the move to NoSQL technologies.
Four interrelated megatrends are driving their adoption: Big Users, Big Data,
Cloud Computing and the Internet of Things.
A. Big Users
The easy availability, low cost and high speed of the internet throughout the world have
created big users. Today, a global online population of more than 5 billion people, using
computers, laptops and smartphones, spends more than 40 billion hours online daily. A
recently launched app can go viral, growing from zero to a million users overnight. The
number of users and the hours they spend online surge during festivals such as Christmas
or Diwali. So the technologies that serve this varying population have to be scalable and
flexible.
B. Big Data
Big Data is defined as a huge amount of heterogeneous data that is generated at high
speed and whose analysis requires new technologies and architectures. Figure 4.2.1
shows the various dimensions of big data.
Figure 4.2.1: Various Dimensions of Big Data
Volume
The word "big" in Big Data itself refers to volume. The volume of data has grown
from gigabytes to zettabytes. Figure 4.2.2 shows the trend of growth of data in the
last two decades.
Figure 4.2.2: Trend of Growth of Big Data (vertical axis: data in zettabytes, i.e. trillions of gigabytes; horizontal axis: year; series: un/semi-structured data and structured data)
Variety
Today, the data being generated does not belong to a single category. It may be raw,
structured, semi-structured (web pages, web log files, e-mails, social media sites
etc.) or even unstructured (audio files, video files etc.). Figure 4.2.2 shows that
over the last 15 years the growth of structured data has been linear, while the
growth of unstructured data has been exponential.
Velocity
Velocity, in the context of big data, means the speed at which data arrives from
various sources [6]. It also means that data collection and analysis must be
conducted rapidly and in a timely manner so as to maximize the commercial value of
big data.
Variability
Variability refers to uneven data flow [7]. In this era of the internet, data loads
become challenging to manage during the peak hours of specific events.
4.3 Cloud Computing
Cloud computing is defined as the delivery of computing services such as storage, servers,
software, databases, networking and analytics over the Internet. Applications today are
cloud-based and developed using a three-tier internet architecture, as shown in figure
4.3.1. They need to support the combined needs of millions of customers. Database
management systems have also changed considerably since the advent of cloud computing.
The need for scalable databases has increased, and this need is being met by NoSQL
databases with their high availability, scalability and easily programmable
models [8].
Figure 4.3.1: Conventional 2-tier versus New 3-tier Cloud-Based architecture
4.4 Internet of Things (IoT)
The Internet of Things is a world in which physical smart devices and things are
connected to each other through the internet and share information, as shown in figure
4.4.1. These devices come with a variety of new sensors. New sensors create new data,
and with it arises the need for new functionality. Relational databases make it hard to
incorporate new data. Some 40 billion sensors generate huge volumes of data, and
relational databases were not designed to handle that volume. In IoT, one needs to analyze
rapidly changing, multi-structured data in real time. The lengthy ETL (Extract, Transform,
Load) processes that relational databases use to cleanse data for reporting will not work [9].
Figure 4.4.1: Internet of Things
5. CONCLUSION
With the explosive increase in global data, the term big data is mainly used to describe
enormous datasets, generated by widely distributed data sources, that require newer technologies
and architectures to store, process and manage. Companies are therefore increasingly
considering alternatives to relational infrastructure to deal with Big Data. NoSQL databases have
positioned themselves as alternative solutions. There are three motivations for considering
alternatives. The first is technical: there is a need to scale or perform beyond the capabilities
of existing systems. The second is the desire to identify possible alternatives to expensive
proprietary software. The third is agility, or speed of development, as today's market
increasingly embraces agile development methodologies. These reasons suggest the need for a
transition from traditional relational databases to NoSQL databases, as they can ship new
functionality without redesigning the existing database and can also scale out as sensor
data grows.
6. REFERENCES
[1] Kumari, S., & Gupta, P. (2015). Document store NoSQL Databases. International Journal of
Artificial Intelligence and Knowledge Discovery 5(3).
[2] Kumari, S., & Gupta, P. (2017). Proposed Architecture of MongoDB-Hive
Integration. International Journal of Applied Engineering Research, 12(15), 5000-5004.
[3] Kumari, S., & Gupta, P. (2018). Implementation of CouchDBViews. In Big Data
Analytics (pp. 241-251). Springer, Singapore.
[4] Burtica, R., Mocanu, E. M., Andreica, M. I., & Ţăpuş, N. (2012, March). Practical application
and evaluation of no-SQL databases in Cloud Computing. In Systems Conference (SysCon),
2012 IEEE International (pp. 1-6). IEEE.
[5] Moniruzzaman, A. B. M., & Hossain, S. A. (2013). NoSQL database: New era of databases for
big data analytics - classification, characteristics and comparison. International Journal of
Database Theory and Application, 6(4).
[6] Chen, M., Mao, S., & Liu, Y. (2014). Big data: A survey. Mobile Networks and
Applications, 19(2), 171-209.
[7] Katal, A., Wazid, M., & Goudar, R. H. (2013, August). Big data: issues, challenges, tools and
good practices. In Contemporary Computing (IC3), 2013 Sixth International Conference
on (pp. 404-409). IEEE.
[8] Gulia, P., & Hemlata (2017). Novel Algorithm for PPDM of Vertically Partitioned
Data. International Journal of Applied Engineering Research, 12(12), 3090-3096.
[9] Hemlata, & Gulia, P. (2018). DCI3 Model for Privacy Preserving in Big Data. In Big Data
Analytics (pp. 351-362). Springer, Singapore.
TO CITE THIS PAPER
Kumari, S. :: “Transition from Traditional Databases to NoSQL Databases”
International Journal of Informative & Futuristic Research (ISSN: 2347-1697), Vol. (5)
No. (8), April 2018, pp. 9215-9223, Paper ID: IJIFR/V5/E8/010. Available online
through- http://www.ijifr.com/searchjournal.aspx