database_management.pdf

Upload: swaroop666

Post on 04-Nov-2015


TRANSCRIPT

  • Managing Data to Improve Business Performance

  • Learning Objectives

    Recognize the importance of data, managerial issues and life cycle

    Describe sources of data, collection, and quality

    Describe database management systems (DBMS)

    Describe Data Warehousing and Analytical Processing

  • Data

    What is Data?

    Importance of Data?

    Data Quality?

    Result of Dirty Data?

  • Difficulties of Managing Data

    Amount of data increases exponentially.

    Data are scattered and collected by many individuals using various methods and devices.

    Data come from many sources, including internal, personal, and external sources.

    Data security, quality and integrity are critical.

  • Types of Data in Organizations

    Transaction Records

    Documents

  • Document Management System (DMS)

    Consists of hardware and software that manage and archive electronic documents and also convert paper documents into e-documents.

    Besides capturing and storing documents, a DMS indexes them, which makes it easier to search for documents in the repository.
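
    As a rough illustration of how indexing supports search, here is a minimal sketch of an inverted index in Python. The document ids, the sample text, and the function names are all made up for this example; a real DMS would use far more elaborate indexing.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, word):
    """Return the sorted ids of documents that contain the word."""
    return sorted(index.get(word.lower(), set()))

# Hypothetical repository of captured documents (ids and text are invented).
documents = {
    "invoice-001": "Invoice for office supplies",
    "memo-007": "Memo about office relocation",
}
index = build_index(documents)
print(search(index, "office"))  # ['invoice-001', 'memo-007']
```

    The index is built once when documents are captured, so a search is a dictionary lookup and does not need to scan every document.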

  • Data (Transaction Records) Management

    Conventional file system vs DBMS:

    Disadvantages of Conventional file system

    Redundancy, inconsistency, security issues, etc.
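
    A small sketch of how redundancy can lead to inconsistency when each department keeps its own file. The two "files" below are hypothetical in-memory dictionaries, not anything from the slides.

```python
# Two hypothetical departmental files that each keep their own copy of the
# same employee's department number (redundancy).
payroll_file = {10: {"name": "ARUN", "deptno": 10}}
hr_file = {10: {"name": "ARUN", "deptno": 10}}

# The employee moves to department 20, but only one file is updated.
hr_file[10]["deptno"] = 20

def inconsistent_employees(file_a, file_b):
    """Return the employee numbers whose records differ between the two files."""
    return [empno for empno in file_a if file_a[empno] != file_b.get(empno)]

print(inconsistent_employees(payroll_file, hr_file))  # [10]
```

    With a DBMS the department number would typically be stored once, so there is no second copy that can drift out of date.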

  • Database Management System (DBMS)

    DBMS: storage of data plus the program to manage it

    Examples of DBMS software: Oracle, Microsoft SQL Server, DB2, MySQL, PostgreSQL

  • Database

    Collection of storage objects like tables.

    EMPNO  NAME   SALARY  DEPTNO
    10     ARUN   30000   10
    20     KIRAN  40000   20

    Diagram labels: TABLE (the whole grid), RECORD (one row), FIELD (one column)
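
    The employee table on this slide can be recreated with Python's built-in sqlite3 module. This is only a sketch: the table name emp and the column types are assumptions, since the slide gives only the column headings and two rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
# Table name "emp" and the column types are assumptions for this sketch.
conn.execute(
    "CREATE TABLE emp (empno INTEGER, name TEXT, salary INTEGER, deptno INTEGER)"
)
# Each tuple is one record (row); each position in it is one field (column).
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(10, "ARUN", 30000, 10), (20, "KIRAN", 40000, 20)],
)
records = conn.execute("SELECT * FROM emp ORDER BY empno").fetchall()
# PRAGMA table_info returns one row per column; index 1 holds the column name.
fields = [row[1] for row in conn.execute("PRAGMA table_info(emp)")]
print(fields)
print(records)
```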

  • DBMS Key Concepts

    Data models: hierarchical, network, relational, etc.

    Data Views

    Keys: primary, foreign, candidate, alternate

    Indexes

    SQL (Structured Query Language)
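
    Several of these concepts can be seen together in a short SQL schema, run here through Python's sqlite3 module. The dept table, the department names, the index and view names, and the employee RAVI are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE dept (
        deptno INTEGER PRIMARY KEY,              -- primary key
        dname  TEXT UNIQUE                       -- candidate key left as an alternate key
    );
    CREATE TABLE emp (
        empno  INTEGER PRIMARY KEY,
        name   TEXT,
        salary INTEGER,
        deptno INTEGER REFERENCES dept(deptno)   -- foreign key
    );
    CREATE INDEX idx_emp_deptno ON emp(deptno);  -- index to speed up lookups by department
    CREATE VIEW emp_public AS                    -- view that hides the salary field
        SELECT empno, name, deptno FROM emp;
    INSERT INTO dept VALUES (10, 'SALES'), (20, 'FINANCE');
    INSERT INTO emp VALUES (10, 'ARUN', 30000, 10), (20, 'KIRAN', 40000, 20);
""")

# Department 99 does not exist, so the foreign key should reject this record.
try:
    conn.execute("INSERT INTO emp VALUES (30, 'RAVI', 35000, 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

view_rows = conn.execute("SELECT * FROM emp_public ORDER BY empno").fetchall()
print(fk_rejected)
print(view_rows)
```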

  • Data Life Cycle

  • Transactional vs. Analytical Data Processing

    Transactional processing takes place in operational systems (TPS) that provide the organization with the capability to perform business transactions and produce transaction reports. The data are organized mainly in a hierarchical structure and are centrally processed. This is done primarily for fast and efficient processing of routine, repetitive data.

    Analytical processing involves the analysis of accumulated data. Analytical processing, sometimes referred to as business intelligence, includes data mining, decision support systems (DSS), querying, and other analysis activities. These analyses place strategic information in the hands of decision makers to enhance productivity and make better decisions, leading to greater competitive advantage.
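
    The contrast can be sketched with one small table: transactional processing writes individual records, while analytical processing summarizes what has accumulated. The sales table, its columns, and the sample figures are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
)

def record_sale(region, amount):
    """Transactional processing: one small, routine write, committed atomically."""
    with conn:  # commits on success, rolls back if the insert raises
        conn.execute(
            "INSERT INTO sales (region, amount) VALUES (?, ?)", (region, amount)
        )

def sales_by_region():
    """Analytical processing: summarize the accumulated records."""
    query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    return conn.execute(query).fetchall()

# Hypothetical day of business: three routine transactions.
for region, amount in [("NORTH", 100), ("SOUTH", 250), ("NORTH", 50)]:
    record_sale(region, amount)

print(sales_by_region())  # [('NORTH', 150), ('SOUTH', 250)]
```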

  • Data Warehousing

    A data warehouse is a repository of historical data, organized by subject to support decision makers in the organization. It includes:

    Online analytical processing (OLAP), which involves the analysis of accumulated data by end users.

    A multidimensional data structure, which allows data to be represented in a three-dimensional matrix (or data cube).

    Unlike the data tables in a database, which are designed to optimize storage, the data tables in a warehouse are designed to respond to analysis queries.
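
    A data cube can be pictured as values addressed by three coordinates. The sketch below uses a plain Python dictionary; the dimensions (product, region, quarter), the figures, and the function names are hypothetical, and real OLAP engines store and aggregate cubes very differently.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, quarter) -> units sold.
cube = {
    ("laptop", "north", "Q1"): 5,
    ("laptop", "south", "Q1"): 3,
    ("phone", "north", "Q1"): 8,
    ("phone", "north", "Q2"): 6,
}

DIMENSIONS = ("product", "region", "quarter")

def roll_up(cube, keep):
    """Aggregate over every dimension except the one named in `keep`."""
    axis = DIMENSIONS.index(keep)
    totals = defaultdict(int)
    for coords, units in cube.items():
        totals[coords[axis]] += units
    return dict(totals)

def slice_cube(cube, dimension, value):
    """Return only the cells whose coordinate on `dimension` equals `value`."""
    axis = DIMENSIONS.index(dimension)
    return {coords: units for coords, units in cube.items() if coords[axis] == value}

print(roll_up(cube, "product"))           # {'laptop': 8, 'phone': 14}
print(slice_cube(cube, "quarter", "Q2"))  # {('phone', 'north', 'Q2'): 6}
```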

    Data warehousing entails an ETL process:

    Extracting data from various sources

    Transforming it to fit operational needs

    Loading it into the end target (Data mart)
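
    The three ETL steps can be sketched as three small Python functions. The CSV text, the cleaning rules, and the target table name emp_mart are assumptions for illustration; production ETL tools handle many more sources and error cases.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would come from files or
# operational systems.
SOURCE_CSV = "empno,name,salary\n10, arun ,30000\n20,KIRAN,40000\n"

def extract(text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and uppercase names, convert numbers to integers."""
    return [
        (int(r["empno"]), r["name"].strip().upper(), int(r["salary"]))
        for r in rows
    ]

def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS emp_mart (empno INTEGER, name TEXT, salary INTEGER)"
    )
    with conn:  # commit all rows together
        conn.executemany("INSERT INTO emp_mart VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute("SELECT * FROM emp_mart ORDER BY empno").fetchall()
print(loaded)  # [(10, 'ARUN', 30000), (20, 'KIRAN', 40000)]
```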

  • BIG DATA

  • Background:

    For decades, companies have been making business decisions based on transactional data stored in relational databases.

    In recent years, companies have realized that non-traditional, less structured data in the form of weblogs, social media, email, and sensor data is a potential treasure trove, as it can be mined for useful insights.

  • How big is big data?

    Misconception about big data: if it is data and it is big, it is big data.

    What is big today may not be big tomorrow.

    Big data has attributes that challenge the constraints of a system or of business needs.

  • 4 Vs of big data

    Volume

    Velocity

    Variety

    Value

  • Volume

    Machine-generated data is produced in much larger quantities than non-traditional data.

    For example, a single jet engine can generate 10 TB of data in 30 minutes.
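
    To put the jet engine figure in perspective, a quick calculation gives the average data rate it implies. The slide does not say whether TB is decimal or binary; decimal (10^12 bytes) is assumed here.

```python
TB = 10**12  # decimal terabyte in bytes (assumption; the slide does not specify)

def bytes_per_second(total_bytes, minutes):
    """Average data rate for a given volume produced over a given time."""
    return total_bytes / (minutes * 60)

rate = bytes_per_second(10 * TB, 30)
print(f"{rate / 10**9:.2f} GB/s")             # 5.56 GB/s
print(f"{rate * 3600 / TB:.0f} TB per hour")  # 20 TB per hour
```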

  • Velocity

    Social media data streams, while not as massive as machine-generated data, produce a large influx of opinions and relationships valuable to customer relationship management. Even at 140 characters per tweet, the high velocity (or frequency) of Twitter data ensures large volumes.

  • Variety

    Traditional data formats tend to be relatively well described and change slowly. In contrast, non-traditional data formats exhibit a dazzling rate of change.

  • Value

    The economic value of different data varies significantly. Typically there is good information hidden amongst a larger body of non-traditional data; the challenge is identifying what is valuable and then transforming and extracting that data for analysis.

  • Why Big Data?

    When big data is distilled and analyzed in combination with traditional enterprise data, enterprises can develop a more thorough and insightful understanding of their business.

    It leads to enhanced productivity, a stronger competitive position and greater innovation.

  • Examples

    The proliferation of smartphones and other GPS devices offers advertisers an opportunity to target consumers when they are in close proximity to a store, a coffee shop or a restaurant. This opens up new revenue for service providers and offers many businesses a chance to target new customers.

    Retailers usually know who buys their products. Use of social media and web log files from their e-commerce sites can help them understand who didn't buy and why they chose not to. This can enable much more effective micro customer segmentation and targeted marketing campaigns, as well as improve supply chain efficiencies.

    Social media sites like Facebook and LinkedIn simply wouldn't exist without big data. Their business model requires a personalized experience on the web, which can only be delivered by capturing and using all the available data about a user or member.