(deemed to be university), · 2018-10-04 · 20 •network-attached storage (nas) •dedicated file...

Assistant Professor, Deptt. of CSE, Jamia Hamdard (Deemed to be University), New Delhi, India. http://www.jamiahamdard.edu https://Syedimtiyazhassan.org [email protected]


Post on 14-Jul-2020




1. Types of Data

2. Data Sources

3. Data Collection

4. API

5. Data Storage

6. Data Storage Management

7. Storage Security


• Unstructured

• Structured

• Semi-structured


Unstructured


Structured

Semi-structured

• JSON (JavaScript Object Notation)

• BibTeX

• .csv

• tab-delimited text

• XML

• etc.
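To make the formats above concrete, here is a minimal sketch that reads the same small record from JSON, CSV, and XML using only the Python standard library. The sample record and its field names are hypothetical and exist only for illustration.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample record, written three ways for illustration only.
JSON_TEXT = '{"id": 1, "name": "Asha", "tags": ["nas", "san"]}'
CSV_TEXT = "id,name\n1,Asha\n"
XML_TEXT = '<record id="1"><name>Asha</name></record>'


def parse_json(text):
    """JSON carries its own field names, types, and nesting."""
    return json.loads(text)


def parse_csv(text):
    """CSV is flat: the header row supplies the field names, values stay strings."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_xml(text):
    """XML mixes attributes and child elements, so the reader picks what it needs."""
    root = ET.fromstring(text)
    return {"id": root.get("id"), "name": root.findtext("name")}
```

Note that only JSON preserves the numeric type of `id` and the nested `tags` list; the CSV and XML readers return strings unless the caller converts them.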


Structured


• Numerical

• Categorical

Numerical

• Continuous

• Discrete

Numerical

• Continuous

  • Interval Data

  • Ratio Data


Categorical


• Nominal Data

• Ordinal Data
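The practical difference between nominal and ordinal data shows up when values are encoded as numbers. The sketch below is one common approach, not the only one: ordinal values are mapped to their rank, while nominal values are one-hot encoded so that no ordering is implied. The category lists are hypothetical examples.

```python
# Hypothetical category values, chosen only for illustration.
SIZE_ORDER = ["small", "medium", "large"]   # ordinal: the order matters
COLOURS = ["red", "green", "blue"]          # nominal: no inherent order


def encode_ordinal(value, order):
    """Map an ordinal value to its rank so comparisons stay meaningful."""
    return order.index(value)


def encode_nominal(value, categories):
    """One-hot encode a nominal value so no false ordering is implied."""
    return [1 if value == c else 0 for c in categories]
```

With these helpers, `encode_ordinal("small", SIZE_ORDER) < encode_ordinal("large", SIZE_ORDER)` holds, whereas the one-hot vectors for two colours cannot be ranked against each other.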

Social Networks

• Twitter and Facebook

• Blogs and comments

• Instagram, Flickr, Picasa, etc.

• YouTube

• Internet searches

• Mobile data content (text messages)

• User-generated maps

• etc.

Traditional Business Systems

• Commercial transactions

• Banking/stock records

• E-commerce

• Credit cards

• Medical records

• etc.

Internet of Things

• Sensors: traffic, weather, mobile phone location, etc.

• Security, surveillance videos, and images

• Satellite images

• Data from computer systems (logs, web logs, etc.)

• etc.


Variables of interest

• Which features should be included?

  • All vs. Targeted

• How can we obtain ground truth for the target variable?

  • Manually

  • Crowdsourcing

  • Controlled Experiments

• How much data is required?

• Is the data set representative enough?

• Web API

  • Twitter REST API

  • Facebook Graph API

  • Amazon S3 REST API

• OS-based API

  • Cocoa

  • Carbon

  • WinAPI

• Database API

  • Drupal Database API

  • Django API

• Hardware

  • Google PowerMeter

  • CubeSensors
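As a rough illustration of the database API category, the sketch below uses Python's built-in `sqlite3` module as a stand-in; it is not one of the APIs named above. The table name, column names, and the `collect_readings` helper are hypothetical.

```python
import sqlite3


def collect_readings(rows):
    """Store (sensor, value) pairs in an in-memory database and return
    the average value per sensor, ordered by sensor name."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
        # Parameterised inserts keep the data separate from the SQL text.
        conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT sensor, AVG(value) FROM readings "
            "GROUP BY sensor ORDER BY sensor"
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

The same pattern (connect, execute, fetch, close) applies to most database APIs; only the connection details and SQL dialect change.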

Evolution of Storage Technology and Architecture

• Redundant Array of Independent Disks (RAID)

  • Way of storing the same data in different places on multiple hard disks.

• Direct-attached storage (DAS)

  • Connects directly to a server (host) or a group of servers in a cluster.

• Storage area network (SAN)

  • A separate network of storage devices for block-level communication between servers and storage; not accessible through the local area network (LAN) by other devices.
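The redundancy idea behind RAID can be sketched in a few lines. The example below is a toy model operating on in-memory byte blocks, not on real disks: `mirror` follows the "same data in different places" description (RAID 1 style), and `parity` and `rebuild` show the XOR parity scheme used by RAID 5 style layouts, which is an addition beyond what the slide states. All function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def mirror(block, copies=2):
    """RAID 1 style: keep identical copies of the block on several disks."""
    return [bytes(block) for _ in range(copies)]


def parity(data_blocks):
    """RAID 5 style: one parity block protects a stripe of data blocks."""
    return xor_blocks(data_blocks)


def rebuild(surviving_blocks, parity_block):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])
```

Because XOR is its own inverse, XOR-ing the parity block with every surviving block leaves exactly the contents of the lost one. This only works when a single block in the stripe is missing.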

Evolution of Storage Technology and Architecture

• Network-attached storage (NAS)

  • Dedicated file storage that enables multiple users and heterogeneous client devices to retrieve data from centralized disk capacity.

• Internet Protocol SAN (IP-SAN)

  • One of the latest evolutions in storage architecture.

  • Convergence of technologies used in SAN and NAS.

  • Provides block-level communication across a local or wide area network (LAN or WAN), resulting in greater consolidation and availability of data.


Generic cloud storage architecture

Cloud storage characteristics

| Characteristic | Description |
| --- | --- |
| Manageability | The ability to manage a system with minimal resources |
| Access method | Protocol through which cloud storage is exposed |
| Performance | Performance as measured by bandwidth and latency |
| Multi-tenancy | Support for multiple users (or tenants) |
| Scalability | Ability to scale to meet higher demands or load in a graceful manner |
| Data availability | Measure of a system's uptime |
| Control | Ability to control a system, in particular to configure for cost, performance, or other characteristics |
| Storage efficiency | Measure of how efficiently the raw storage is used |
| Cost | Measure of the cost of the storage (commonly in dollars per gigabyte) |
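Three of the characteristics above (data availability, storage efficiency, and cost) are simple ratios. The sketch below shows one plausible way to compute them; the function names and the input figures in the usage note are hypothetical, and real providers may define these measures differently.

```python
def availability(uptime_hours, total_hours):
    """Fraction of the period during which the service was reachable."""
    return uptime_hours / total_hours


def storage_efficiency(usable_gb, raw_gb):
    """Share of raw capacity that ends up usable for data."""
    return usable_gb / raw_gb


def cost_per_gb(monthly_cost, stored_gb):
    """Cost in currency units per gigabyte stored."""
    return monthly_cost / stored_gb
```

For instance, a system that was down for one hour in a 720-hour month has `availability(719, 720)`, roughly 99.86%, and an array exposing 800 GB out of 1000 GB raw has a storage efficiency of 0.8.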

Steps to take to choose the right data storage solution(s)

• Know your data

• Don't neglect unstructured data

• Understand your compliance needs

• Establish a data retention policy

• Look for a solution that fits your data

• Use a tiered storage approach

• Know your clouds

• Carefully choose storage providers

• Make sure your data is secure

• Leverage technologies that use deduplication, snapshotting and cloning

• Have a disaster recovery plan
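Deduplication, mentioned in the list above, can be illustrated with a minimal content-addressed store: each distinct chunk is kept once and identified by its hash. This is a toy sketch under the assumption that data arrives already split into chunks; the `deduplicate` and `restore` names are hypothetical and do not correspond to any particular product.

```python
import hashlib


def deduplicate(chunks):
    """Store each distinct chunk once, keyed by its SHA-256 digest.

    Returns (store, recipe): store maps digest -> chunk, and recipe is the
    ordered list of digests needed to rebuild the original sequence.
    """
    store = {}
    recipe = []
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # keep only the first copy
        recipe.append(digest)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original byte stream by following the recipe."""
    return b"".join(store[digest] for digest in recipe)
```

The saving comes from the store holding fewer chunks than the recipe references; the recipe itself is small because it contains only digests.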

Key management activities

• Monitoring

  • Continuous

  • Security, performance, accessibility, and capacity

• Reporting

  • Periodic

  • Resource performance, capacity, and utilization

• Provisioning

  • Hardware, software, and other resources needed to run a data center

  • Includes capacity and resource planning

Key management activities

• Capacity planning

  • Ensures that the user’s and the application’s future needs will be addressed in the most cost-effective and controlled manner.

• Resource planning

  • Is the process of evaluating and identifying required resources, such as personnel, the facility (site), and the technology.

  • Ensures that adequate resources are available to meet user and application requirements.
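Capacity planning often starts from a growth projection. The sketch below assumes a constant compound annual growth rate, which is a simplification; the function names and the figures in the usage note are hypothetical.

```python
def project_capacity(current_tb, annual_growth_rate, years):
    """Projected demand after compounding growth for the given number of years."""
    return current_tb * (1 + annual_growth_rate) ** years


def years_until_full(current_tb, installed_tb, annual_growth_rate):
    """Number of whole years for which projected demand still fits
    within the installed capacity."""
    if annual_growth_rate <= 0:
        raise ValueError("growth rate must be positive")
    years = 0
    # Advance one year at a time while next year's demand still fits.
    while project_capacity(current_tb, annual_growth_rate, years + 1) <= installed_tb:
        years += 1
    return years
```

For example, 100 TB of data growing at 50% per year reaches 225 TB after two years, so 250 TB of installed capacity lasts two full years before an upgrade is needed.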

Key Challenges

• Exploding digital universe

• Increasing dependency on information

• Changing value of information

Information Lifecycle Management (ILM)

• The information lifecycle is the “change in the value of information” over time.

• ILM is a proactive strategy that enables an IT organization to effectively manage the data throughout its lifecycle, based on predefined business policies.

• This allows an IT organization to optimize the storage infrastructure for maximum return on investment.

ILM strategy characteristics

• Business-centric

• Centrally managed

• Policy-based

• Heterogeneous

• Optimized

ILM implementation activities

• Classifying data and applications to enable differentiated treatment of information.

• Implementing policies by using information management tools.

• Managing the environment by using integrated tools.

• Organizing storage resources in tiers.
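A policy-based tiering rule of the kind listed above might look like the following sketch. It assumes the value of information is approximated by how recently it was accessed; the thresholds, tier names, and the `assign_tier` helper are all hypothetical.

```python
from datetime import date

# Hypothetical policy: (maximum age in days, tier name), checked in order.
TIER_POLICY = [(30, "tier 1 (fast disk)"), (365, "tier 2 (capacity disk)")]
ARCHIVE_TIER = "tier 3 (archive)"


def assign_tier(last_accessed, today, policy=TIER_POLICY):
    """Pick a storage tier from the number of days since the last access."""
    age_days = (today - last_accessed).days
    for max_age, tier in policy:
        if age_days <= max_age:
            return tier
    # Anything older than every threshold falls through to the archive tier.
    return ARCHIVE_TIER
```

Keeping the policy in a data structure rather than in code mirrors the "policy-based" and "centrally managed" characteristics: the thresholds can change without touching the classification logic.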

Primary services of security

• Accountability

  • Accounting for all the events and operations

• Confidentiality

  • Provides the required secrecy of information

  • Ensures that only authorized users have access to data.

• Integrity

  • Ensures that the information is unaltered.

• Availability

  • Ensures that authorized users have reliable and timely access to data.
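The integrity service can be illustrated with a digest check: a fingerprint is recorded when data is written and compared when it is read back. This is a minimal sketch using Python's standard `hashlib` and `hmac` modules; the function names are hypothetical, and the keyed variant assumes the key is stored somewhere an attacker cannot reach.

```python
import hashlib
import hmac


def fingerprint(data):
    """SHA-256 digest recorded when the data is written."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data, recorded_digest):
    """True if the data still matches the digest recorded earlier."""
    return hmac.compare_digest(fingerprint(data), recorded_digest)


def sign(data, key):
    """Keyed digest (HMAC-SHA256): cannot be recomputed without the key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()
```

A plain digest detects accidental corruption; if an attacker could also replace the stored digest, the keyed `sign` variant is the safer choice because a matching value cannot be forged without the key.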

Primary services of security

• Controlling User Access to Data

• Protecting the Storage Infrastructure

• Data Encryption

• Securing Backup, Recovery, and Archive

• Firewall

• Access Control Switch

• Know the Vulnerability


THANK YOU!