WHAT BIG DATA CAN DO FOR AN ENTIRE NATION

Robson Serafin
Technical Support Engineer III
Dell EMC
robson.serafin@dell.com

2016 EMC Proven Professional Knowledge Sharing 2

“The great impact is taking place in finding ways to do things differently, where it was never possible

before, giving to these organizations a new potential of innovation driven by Software - BIG DATA.”

(Rodrigo C Gazzaneo)1

Like a business, a nation’s government has to deal with an “avalanche” of data to make well-informed decisions. Putting these decisions into practice faster could lead to the right moves and to solutions that benefit an entire nation at several levels: economic, educational, cultural and social.

The purpose of this article is to share, in a simple way, what new Computer Science technologies such as Big Data and Cloud Computing are and how they can assist politicians and leaders in managing issues that are predictable and likely to arise during their administration. These tasks involve listening to and watching what is happening, correlating it with historical data through computational analysis, and using the results to choose the right course of action and proactively prevent future collapses. As a Brazilian, I base this article on what I see every day in my country. However, the topics described here are closely aligned with events in other societies in general. The solutions presented and discussed in this article can be applied to any Nation with minimal customization.

Many aspects of Big Data and Cloud Computing are left out of this document, as they are huge topics to explore. Nevertheless, the article explains how Big Data and Cloud can assist a Nation in performing routine analysis on a daily basis and discusses the results. It does not go deep into the processing mechanisms or solutions available.

Disclaimer: The product design examples published in this document do not necessarily reflect real government infrastructure.


Table of Contents

Explaining Big Data and Cloud Computing
Tools to handle Big Data
The way Data is processed with Hadoop
A little info about the Hadoop Projects
Cloud Computing
Differences Between Traditional Datacenter and Cloud Datacenters
What Big Data can do for a Nation
Computer Science for the society
Nation Macro Statistics
Solutions Overview
Solutions: Overall view
Economy
A cup of joy means higher productivity
Education
Big Data and Cloud working as a service for Education
The value of Big Data in the Education market
Adaptive Learning
Society
What is the gain with Big Data for the Society?
Creating an Intelligent Nation, Community By Community
Culture
Security and National Defense
Health Services
Precision Medicine
Electronic patients medical records
Internet Of Things
Large Amount of Collected Data ≠ Right Collected Data
Science and Researches


Big data in biomedicine
Politics Administration
Infrastructure
Smart Cities
Giant Infrastructure behind a Nation Territory that needs powerful data analysis
Disaster Management
Transparency vs. Corruption
Open Data means more government transparency
Big Data and the Challenges for the Future
First steps to start with Big Data Solution
Few products Available and Architecture examples
Big Data Processing Architecture Design
Storage Products
Conclusion

Disclaimer: The views, processes or methodologies published in this article are those of the

authors. They do not necessarily reflect Dell EMC’s views, processes or methodologies.

Dell, EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries.


Introduction

According to Joseph Stalin’s definition, a Nation is a historically constituted, stable community of people, formed on the basis of a common language, territory, economic life, and psychological make-up manifested in a common culture.2

A Nation forms its own public political and administrative system, which can be called a State3: an organized political community living under a single system of government.

This definition is the basis on which this document was written.

Big Data and Cloud Computing become relevant when one analyzes the following points:

Large historical content
A large number of people sharing all the content in a common system
A vast territory of information that in some cases can extend abroad
Tremendous economic factors that power the economy and the exchange of goods with other nations
Cultural content so vast that it is almost impossible to measure, and that can dictate several aspects of the Nation's formation, some not even known or not well documented
A highly complex social organization
Massive environmental decisions that affect everybody directly and indirectly
An education system that is responsible for passing knowledge to the next generation and can be considered a pillar of current society
Military and security mechanisms to defend the Nation in all sectors (politically, economically and socially)
A demanding health system that must cover all its people
An incredibly complex infrastructure to sustain its livelihood
Interrelationships between communities

The points above are not static information but live data, generated every day like a living “organism” in constant motion. Managing it all has become a huge challenge. Government employees responsible for different levels of society MUST have accurate data ready and available at all times to generate the reports that enable the right decisions to be made and, furthermore, to cross-reference them with past and present reports to predict future actions.

There is no better place to apply Computer Science and Big Data technology than a Nation. Data analysis can provide more detail about its resources and administration in a variety of ways, from macro to micro views, with information either isolated or combined and cross-referenced.

Big Data and other Computer Science strategies cannot replace politics and governance. They should be seen as a powerful framework that can provide greater transparency and assist in common decisions for the development and improvement of society.

With great Power comes great responsibility!

Politicians are responsible for administering their Nation. The use of power by some people affects the behavior of others in several different ways, some of them unwanted. This is one of the reasons why well-organized ideas and sound approaches, with as few mistakes as possible in the analysis of information, are essential. The right information, available and clearly presented after careful analysis, can assist members of government in their daily tasks. The result will be a great place to live and to interact with others in a shared sense of equilibrium.

Explaining Big Data and Cloud Computing

Before we look at where Big Data Analytics and Cloud Computing can help a Nation, let’s first give some background on Big Data and Cloud and on how data is processed.

Big Data4 refers to the processing of data sets so large and complex that traditional systems cannot handle them because of limitations on several layers. The main focus of Big Data is on:

Analysis of content in real time, cross-referenced with historical information or other data
Data captured in a variety of ways (sensors, the internet, personal devices, etc.)
Accurate searches
Content shared among several sectors and organizations (education, economic, social, etc.)
Modern storage systems that can process the amount of data and make it highly available anytime, anywhere
Fast transfer of large volumes of data across sectors for processing and analysis
Privacy of content, so that only the right people have access

This document shows applications of Big Data and Cloud Computing, some real and some illustrative, focusing on the public sector and research institutions. A real example of a large amount of data comes from CERN, the European nuclear research facility, which generates 40 TB of data per second during a research phase, captured from several sensors installed on its equipment.

Another example, much closer to our daily activities, is an Airbus plane, whose turbines can generate 10 TB every 30 minutes, or around 640 TB of logs from internal sensors in a single long flight. These two examples give an idea of how much data is generated on a daily basis and must be handled.
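The two Airbus figures can be reconciled with a little arithmetic. The sketch below is only a back-of-the-envelope check: it assumes that the 10 TB per 30 minutes applies to each engine, that the aircraft has four engines, and that a "long flight" lasts eight hours. None of these three assumptions is stated in the text; they are simply one combination under which the numbers agree.

```python
# Back-of-the-envelope check of the in-flight data figure quoted in the text.
TB_PER_ENGINE_PER_HALF_HOUR = 10  # figure from the text, assumed to be per engine
ENGINES = 4                       # assumption: four-engine aircraft
FLIGHT_HOURS = 8                  # assumption: a "long flight" lasts 8 hours


def flight_data_tb(tb_per_half_hour, engines, hours):
    """Total sensor data (in TB) produced by all engines over one flight."""
    half_hours = hours * 2
    return tb_per_half_hour * half_hours * engines


total_tb = flight_data_tb(TB_PER_ENGINE_PER_HALF_HOUR, ENGINES, FLIGHT_HOURS)
print(f"Estimated data for one flight: {total_tb} TB")
```

With a different engine count or flight duration the total changes proportionally, so the 640 TB figure should be read as an order of magnitude rather than a measurement.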


How big is your data? – David Hellman @ Myriad Genetics

Byte of data: one grain of rice

Kilobyte: cup of rice

Megabyte: 8 bags of rice

Gigabyte: 3 container Lorries

Terabyte: 2 container ships

Petabyte: covers Manhattan

Exabyte: covers the UK (3 times)

Zettabyte: fills the Pacific Ocean
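To connect this scale with the CERN figure quoted earlier, the following sketch converts between the units in the list and estimates how long a 40 TB per second stream takes to fill one petabyte. It assumes decimal (SI) prefixes, where each unit is 1000 times the previous one; the function names are illustrative only.

```python
# Units from the list above, smallest to largest.
# Assumption: decimal (SI) prefixes, i.e. 1 kilobyte = 1000 bytes.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte",
         "terabyte", "petabyte", "exabyte", "zettabyte"]


def to_bytes(value, unit):
    """Convert a quantity expressed in one of UNITS to bytes."""
    return value * 1000 ** UNITS.index(unit)


def seconds_to_fill(target_value, target_unit, rate_value, rate_unit):
    """Seconds needed to accumulate the target volume at a per-second rate."""
    return to_bytes(target_value, target_unit) / to_bytes(rate_value, rate_unit)


# 40 TB per second is the CERN figure given in the text.
seconds_per_petabyte = seconds_to_fill(1, "petabyte", 40, "terabyte")
print(f"One petabyte accumulates in {seconds_per_petabyte:.0f} seconds")
```

In terms of the rice analogy, a stream at that rate would "cover Manhattan" in well under a minute.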

Big Data challenge points also known as 5 Vs

Big Data is based on 5 V’s: Velocity | Volume | Veracity | Variety | Value. These describe the challenge of collecting, storing and computing on a huge volume of data to develop knowledge and point toward possible results.

Data Science or Data Analytics platforms can be broken down into three distinct parts: Acquisition, Computation and Serving. In other words, Collection > Processing > Results.
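The three stages can be pictured as three functions chained together. The sketch below is a minimal illustration, not a real platform: the sensor names and moisture readings are hypothetical values invented for the example, and each stage is reduced to a few lines.

```python
from collections import defaultdict


def acquire():
    """Acquisition: collect raw records (hypothetical soil-sensor readings)."""
    return [
        {"sensor": "soil-1", "moisture": 0.31},
        {"sensor": "soil-2", "moisture": 0.27},
        {"sensor": "soil-1", "moisture": 0.35},
    ]


def compute(records):
    """Computation: average the moisture readings per sensor."""
    readings = defaultdict(list)
    for record in records:
        readings[record["sensor"]].append(record["moisture"])
    return {sensor: sum(vals) / len(vals) for sensor, vals in readings.items()}


def serve(results):
    """Serving: turn the computed values into report lines for a reader."""
    return [f"{sensor}: {avg:.2f}" for sensor, avg in sorted(results.items())]


# Collection > Processing > Results
report = serve(compute(acquire()))
for line in report:
    print(line)
```

In a real deployment each stage would be a separate system (for example sensors and log collectors, a processing cluster, and a dashboard), but the flow of data between them follows the same shape.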

Tools to handle Big Data5

Nowadays, data is generated everywhere: individuals using smartphones and uploading photos and videos to social networks, sensor systems spread around the globe (the Internet of Things) sending precious log information about almost everything, surveillance cameras creating hours and hours of video, audio recording systems, etc. Beyond the sheer amount of content being generated, much of the data is unstructured, that is, not created under a relational database structure with tables and columns, which adds a layer of complexity to processing. Even if it were all structured data, the amount of information would be so large that no current database application on supercomputers could process it. To address this, a new range of tools was created to handle these large datasets: Hadoop and its Projects. Hadoop is open source and is offered in numerous distributions such as Hortonworks, Cloudera, MapR and Pivotal. Storage companies such as Dell EMC offer products that operate with Hadoop, with a focus on scalability to handle the data efficiently. I will present some case scenarios with the Big Data tools in place.


How Data is processed with Hadoop

Current supercomputers and infrastructure designs have limitations, as they do not scale to such large amounts of data. To overcome this limitation, Google introduced a new processing model called MapReduce. Later, inspired by Google's MapReduce and Google File System papers, an open source toolset called Hadoop emerged, developed largely at Yahoo. It includes a special file system called HDFS, in which a dataset is spread across worker computers so that MapReduce jobs can process it in an almost limitlessly scalable way. The idea behind it is to work not on super-powerful computers but on numerous low-cost x86 computers, also known as commodity hardware.
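The MapReduce model itself is easy to illustrate on a small scale. The sketch below is a single-process imitation in plain Python, not Hadoop code and not the Hadoop API: the input chunks stand in for blocks of a dataset stored on different workers, and the function names are chosen for this example only.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(chunks):
    # On a cluster, each chunk would be mapped on a different worker.
    # Here the "workers" simply run one after another in one process.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


chunks = ["big data for a nation", "data for society", "big nation big data"]
counts = map_reduce(chunks)
print(counts)
```

Because each chunk is mapped independently and each key is reduced independently, both phases can be spread over as many commodity machines as there are chunks and keys, which is where the scalability comes from.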

A basic understanding of how data is processed in Hadoop:

Hadoop is just the framework in which Data Analytics takes place: it takes care of the infrastructure layer needed to process a large amount of data. Previously, one would need a huge team to support the infrastructure while constantly evaluating the need to increase performance. Fighting the limits of the appliances while staying within the IT budget was another source of headaches for IT.

With the Cloud and Big Data framework tools, the infrastructure is simplified and the IT budget can be directed to the Data Analytics team, also known as the Data Scientists, who will use predictive and prescriptive analytics to create value in the areas requested.

Leverage "BY" Analysis

This is an exploratory technique of examining a strategic entity by its data attributes. The analysis

takes place with:

Additional data sources

Additional dimensional entity characteristics

Additional areas for analytics exploration
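One way to read "BY" analysis is as repeated grouping: the same metric of an entity is examined by one attribute after another. The sketch below follows that reading, which is an interpretation rather than a definition taken from the text. The entity (schools), its attributes and every value in the records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records for one strategic entity ("school").
# All attribute names and values are illustrative only.
schools = [
    {"region": "North", "type": "urban", "pass_rate_pct": 70},
    {"region": "North", "type": "rural", "pass_rate_pct": 50},
    {"region": "South", "type": "urban", "pass_rate_pct": 80},
    {"region": "South", "type": "rural", "pass_rate_pct": 60},
]


def analyze_by(records, attribute, metric):
    """Average the metric of the entity BY the given attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[record[attribute]].append(record[metric])
    return {value: mean(metrics) for value, metrics in groups.items()}


# The same metric, examined BY two different attributes.
by_region = analyze_by(schools, "region", "pass_rate_pct")
by_type = analyze_by(schools, "type", "pass_rate_pct")
print("By region:", by_region)
print("By type:", by_type)
```

Adding a data source or a new attribute to the records only requires another call to `analyze_by`, which matches the exploratory nature of the technique.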


Analytics into Action

Deliver analytics-driven scores and recommendations to the key units of teams involved.

A little info about the Hadoop Projects

Alongside its core components, MapReduce and HDFS, Hadoop has a set of additional tools for Big Data applications that operate on multiple levels, called Projects.

The most popular Projects, with a brief description of each:

Hive: Data warehouse infrastructure providing data summarization, query, and analysis.
HBase: Open source, non-relational, distributed database.
Mahout: Distributed and scalable machine learning algorithms, used for example to make recommendations based on users’ tastes.
Pig: A high-level platform for creating MapReduce programs using a language called Pig Latin.
Oozie: Workflow scheduler system to manage Hadoop jobs.
Flume: Distributed service for collecting, aggregating and moving large amounts of log data for online analytic applications.
Sqoop: Tool designed to efficiently transfer bulk data between Hadoop and structured data stores such as relational databases.

Cloud Computing6

Cloud computing is a new IT model based on on-demand infrastructure, also known as converged infrastructure, and shared services. In this model, shared resources, data and information are provided to computers and other devices on demand. The main focus is sharing resources to achieve high performance and economies of scale, similar to the electricity grid. The difference is that the service is delivered over the network.

This model provides convenient, on-demand network access to a shared pool of configurable computing resources, such as networks, servers, storage, applications and services, that can be rapidly provisioned and released with minimal management effort.

The National Institute of Standards and Technology defines Cloud Computing by the following essential characteristics:

On-demand self-service: Resources such as server time and network storage can be provisioned automatically as needed, without requiring human interaction.
Broad network access: The system can be accessed by heterogeneous thin or thick client platforms (e.g. mobile phones, tablets, laptops and workstations).
Resource pooling: System resources are pooled to serve multiple consumers through a multi-tenant model and are assigned on demand.
Rapid elasticity: Resources can be elastically provisioned and released to scale rapidly outward and inward, all on demand.
Measured service: Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service.
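Two of these characteristics, rapid elasticity and measured service, can be illustrated with a few lines of code. The sketch below is a toy model under stated assumptions: the load figures, the capacity of 200 requests per second per instance, and the one-sample-per-hour metering are all hypothetical, and no real cloud provider API is involved.

```python
import math


def instances_needed(load, capacity_per_instance, minimum=1):
    """Rapid elasticity: size the pool to the current load, never below minimum."""
    return max(minimum, math.ceil(load / capacity_per_instance))


def metered_usage(pool_sizes):
    """Measured service: total instance-hours, assuming one sample per hour."""
    return sum(pool_sizes)


# Hypothetical load in requests per second, sampled once per hour.
hourly_load = [120, 450, 980, 300]

# The pool scales outward as load rises and inward as it falls.
pool = [instances_needed(load, capacity_per_instance=200) for load in hourly_load]
usage = metered_usage(pool)

print("Instances per hour:", pool)
print("Instance-hours to bill:", usage)
```

The consumer pays for the instance-hours actually used rather than for a fixed pool sized for the peak, which is the economic argument behind the electricity-grid comparison.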

Differences Between Traditional Data Centers and Cloud Data Centers

Traditional data centers:
Aging infrastructure
High maintenance cost
High risk
No strategic focus
Siloed infrastructure
End-of-service systems
Complex support

Cloud data centers:
Support business demand
Roll out applications faster
Improve performance
Deliver a better end-user experience
Minimize risk
Highly available
Ready to go
Fit strict project timelines


What Big Data can do for a Nation?7

After this basic presentation of Big Data, Cloud Computing and the ways data can be processed, let’s see how they can help a Nation.

For any leader, data accuracy leads to more confident decision-making, and better decisions mean greater efficiency, the right investments and proactive action against future problems.

Analysis of information can reveal correlations. For instance, in the health system it can be used to prevent diseases; in the security sector, to combat crime; and in education, to provide detailed information on the disciplines taught and reviews that help prepare the next generation of leaders, who will already have had the education needed to perform their activities with a sense of public welfare.

Computer Science for the society

Computer Science can act as an engine for eradicating poverty and improving quality of life in terms of better homes, strong education outcomes, and quality health care. The only requirement is to make good use of information and to build on it with projections and simulations through Big Data analytics. A list of fields, with examples from other nations, where Big Data can help with analysis and produce more accurate reports for better administration is shown below.

This document focuses on Brazilian national macro statistics and geographic information.

Before we apply Computer Science and Big Data analysis, let’s look at some macro information about Brazil's territorial division and geography. Going through the article, you will find areas where Big Data is already in use and many more where it could be deployed. The interesting part is that they all correlate in one way or another and all interact.

Administrative division8:
1 Federation Unit, the institution that rules over the state units
26 states managed by the Federal Unit, plus the Federal District
5,570 cities

Geographic Information
Total area: 8,515,767.049 km²
7,491 km of marine coast
200+ million inhabitants
4 time zones
6 climate types (equatorial, subtropical, tropical, semi-arid, tropical Atlantic and tropical of altitude)
8 main vegetation types, all linked to the climate of their area
4 main soil types


Solutions Overview9

Economy

Advanced Data Analysis can help make decisions in real time with efficiency, performance, and scalability. It makes it possible to achieve compliance, reduce penalties, become more efficient, understand markets better, predict future performance and, most importantly, be ready to handle data diversity and increase loyalty and investment.

Education

Big Data and Cloud Computing services can bring Education to a completely new model, combining curricula built on the main aspects of each student's profile with better classroom planning, studies and discipline. Furthermore, the information needed to review results and provide high-quality education will be readily available to everyone.

Security

Big Data makes it possible to analyze real-time camera images and combine them with police records, enabling an efficient police force that is ready to act against crime and catch criminals. Cloud Computing can provide a single, scalable, high-performance service platform to every police department across the country. It also offers analysis to the army, air force and navy for the security of the whole territory, helping to prevent unwanted situations.

Health Services

In this area, Big Data can manage the increasing growth of patient data. This allows massive amounts of data to be processed in real time to detect fraud in the public health service, maintain compliance, speed up consultations, produce reports on several sub-sectors, develop cost-reduction models, improve service quality and gain insight for medical analysis. Also, Cloud Computing models can offer applications as a service that could be used by all hospitals and clinics administered by the medical community.

Government

The application of Big Data analysis helps to create a flexible infrastructure that can manage, protect and analyze large amounts of data. This helps to successfully address compliance, budgetary limitations and economic crises, even when times are tough and social and economic problems or natural disasters require quick and effective solutions. Manual processes consume a lot of resources and are prone to failure. With Big Data, however, it is possible to improve services, empower small towns and reduce costs.


Infrastructure

Here, Big Data will help to understand and analyze the actual needs of public transport, electricity generation, disaster management, etc. Learning from problems will help develop a new strategy for the future, with the fastest possible response to expected and unexpected circumstances.

Solutions: Overall view

Economy

Brazil is the largest economy in Latin America and the 7th largest in the world. It is a mixed economy with huge natural resources. The active economic sectors are agriculture, livestock, mining, manufacturing and services. The main export products per area are:

Agriculture: coffee, orange juice, soybeans, ethanol from sugar cane
Livestock: beef, chicken and pork
Manufacturing: aircraft, electrical equipment, automobiles, textiles, footwear
Mining: iron ore and steel

Agriculture10

Agricultural production is one of Brazil’s main financial bases. Five years ago, an agriculture company called UTEVA was harvesting corn, soy, beans and wheat, but each season its technicians started to collect more than just crops. They also began collecting data on the company-owned farms, which total over 3,250 hectares.

UTEVA’s operational manager reported that they have over 30 gigabytes of reports on soil mineral analysis, harvest mapping, rainfall indexes, and physical and chemical ground analysis, ingested from time to time.

The overall profitability after data analysis changed the way the business is run. Today, the company confirms that its harvest and business decisions depend on collected data. All the data evaluated together provided answers on how to increase productivity and efficiency in the field.

Another important use is to have the information necessary to master irrigation techniques, fertilization and genetic engineering. This will allow humankind to increase production in the fields and reduce dependency on rainy periods and natural soil characteristics.

Example: aerial mapping for soil quality analysis


In agribusiness, Big Data is key to achieving great results, as the information collected can guide any business toward the right products to grow and help estimate growing areas for future investment.

Plenty of sensors are spread across the fields collecting data on air, soil, wind, plant development, etc., and the level of detail grows exponentially.

There seems to be no end to data analysis in agriculture. Precision Planting, from Monsanto Corporation, sells and supports a product that, based on collected data, can determine the correct spacing between seeds and the best planting depth with 99% accuracy for each area analyzed.

Another agriculture company, Stara, is producing an appliance that scans corn plantations. The scan reads the leaves and can immediately evaluate whether the crop is ready. If it is not, the appliance can also provide a detailed analysis of soil nutrients, helping to correct them with fertilizers where needed and, in a short period, increase productivity to achieve the top harvesting results for the year.

In a nation whose main economic source is agriculture, data on the weather and on the market for its products is vital for growth. Farmers and breeders must optimize yield, reduce waste, maintain food safety, and understand environmental impact, supplier interaction, and product delivery.

Livestock11

Brazil has an estimated herd of over 205 million head, and it is constantly growing. Production per hectare increased by 25% over the last 10 years. Beef production increased by 38% and exports by 731%. Unfortunately, even with high technology combined with the integration of livestock, agriculture, and forestry, pasture area decreased by 2%. This resulted in some missing data on field distribution.

In 2010, 80% of the nearly 40 million head slaughtered went straight to the domestic market, giving a per capita consumption of 37.4 kg, and 20% was exported to over 180 countries.

The beef production chain comprises a wide variety of participants, from highly capitalized farmers and small producers to meatpacking plants with a high technological standard.

They are fully able to meet all external demands, as the slaughterhouses need to meet the requirements of health legislation.

Types of beef: lean beef, grain-fed, certified, grass-fed, and marbled beef.

Processing plants in several countries: Brazil, Argentina, Uruguay, Paraguay, Chile, United States,

Australia, United Kingdom, France, Netherlands, Italy, and China.

The modern Brazilian beef industry is responsible for over US$ 5 billion in exports and one million jobs.

The amount of data that requires processing in this field is beyond any reference. Areas where Big Data analysis can improve efficiency:

> In supply chains, items can be individually tracked.

> Broadcasters can analyze how viewers react to shows on social media.

> Retailers can build customer demographics by collecting details and looking at them on a large scale.

> Animals can be monitored using collars that track cows' activity and detect changes in their behavior (remember, we are talking about 205 million head if all get registered in the system). The information collected from the collars could be used as a trigger to alert farmers when cows will yield the most milk or when they need veterinary attention.
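As an illustration of how collar data could trigger an alert, the sketch below flags days on which a cow's activity departs sharply from its own recent baseline, using a rolling z-score. The function name, the 7-day window, the threshold, and the step counts are all assumptions made for this example; they do not describe any specific commercial collar system.

```python
from statistics import mean, stdev

def activity_alerts(daily_steps, window=7, threshold=2.0):
    """Return indices of days whose activity deviates strongly from the
    preceding `window` days (a rolling z-score check)."""
    alerts = []
    for day in range(window, len(daily_steps)):
        history = daily_steps[day - window:day]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # flat baseline: no spread to compare against
        z = (daily_steps[day] - mu) / sigma
        if abs(z) >= threshold:
            alerts.append(day)
    return alerts

# Hypothetical daily step counts for one cow; the last day shows a sharp drop.
cow_steps = [4100, 3950, 4200, 4050, 4000, 4150, 3900, 4050, 1800]
print(activity_alerts(cow_steps))  # [8]
```

In practice the same check would run per animal, and an alert would only prompt a farmer or vet to take a closer look; it does not diagnose anything by itself.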

The analysis of these data can lead to an incredible boost in farm productivity and completely change this traditional industry. Planning and data analysis are key factors that contribute to profit, as they are present in every step of production, thus contributing to efficiency and consequently to overall growth.

A cup of joy means higher productivity!12

According to Ron Shani, CEO of AKOL (Agricultural Knowledge Online), their Big Data platform allows farmers to know exactly what to do to take care of the harvest, and when and how to do it, to extract the best out of their fields, even when all they need to do is drink a cup of coffee in the morning.

From data analysis performed with farmers in Serbia, they noticed a clear relationship between drinking coffee and rural production. The farmers who did not drink coffee in the morning were not as productive as those who drank a cup of coffee before starting their daily chores.

Chinese authorities signed a contract with AKOL to start using a technology called “Cloud Agriculture” for farmed fish. Using several sensors spread over the lakes, AKOL’s system can advise fish farmers on when to clean the lakes, when to feed the fish, and how much food to give.

AKOL systems are already operating in vineyards, crops, fish farming, the chicken industry, livestock, apiculture, milk production, and numerous kibbutz and moshav agricultural platforms.

The sensors are distributed and installed in trees, vineyards, fields, cows’ collars, milk extraction

systems, automated food systems and so on. All data is registered including environment

temperature, humidity, animal food consumption, soil, and plantation including “Integrated Smart

Pest Control”.

In one example that applies to dairy producers, the analysis can recommend giving the cows more liquids on a very hot day to guarantee high milk production, or offering special food supplements to prevent a pest or bacterial disease identified in the region. All the products are based on Microsoft Azure and offered on a SaaS (Software as a Service) model.

Big Data not only offers a massive raw productivity boost but also touches on cultural questions, analyzing workers’ behavior in each area. After this analysis, it can offer a recommendation based on the workers' satisfaction level, such as giving them a cup of coffee before they go to work in the field.

Previously, analysis at that level of detail, with results based on cultural information, was not possible. Only Big Data algorithms can process the cultural information as well.

The idea behind AKOL is simplification wherever possible and a fixed price as a service, so it is directed not only to big companies but to all producers. The application was created for smartphones, which simplifies access. All data collected is processed in cloud systems, and the results are then forwarded to the farmers.

Access to accurate information is incredibly useful not only for the producers but for the entire export mechanism, especially for exports to the European Union. As agro-business has become a global business, there are several questions and concerns about the source of commodities sent from remote locations. All the regulations on the amount of pesticides used in production, the kinds of fertilizers, the kinds of food and plants given to animals, the milking model used, and the storage of eggs for chicken products, including country laws on slave labor, need to be available within a few clicks.

The system also includes full compliance with Good Agricultural Practices (GAP), specific methods which, when applied to agriculture, create food for consumers or further processing that is safe and wholesome. Each product receives an ID card that contains a detailed history of production, from genetic details to delivery, with documentation at each stage of the process.

Large agro-business companies state that Big Data will certainly be responsible for the next agricultural revolution. Assistance with processing all the information collected will be the ground where this revolution takes place.

Other National Production

Brazil has also emerged as a potential future player in the Oil & Gas industry. As natural Oil & Gas reserves worldwide have their countdown running, the news that Brazil's undersea Oil & Gas reserves could be bigger than Russia’s paints a new picture for the future. However, there is a big problem, and Big Data is the new technology available to solve it. Several local universities are dedicating their analytical scientists to developing programs that analyze tons of soil data in 3D models.

Economic growth is not based only on product extraction, sales, and profits. It is much more complex than that, and it also requires constant research and efficient production with the lowest waste, delivering the right product to the right place at the right time. Conventional analytic calculation cannot correctly analyze all the items involved in this formula, nor combine them with historical information from the last decade. The data involves other sets of complex information, such as rainy periods and rainfall amounts, mineral and fertilizer supply, the best temperature and period for growth, logistics, and even the prediction of potential natural disasters.

Education13

Big Data and Cloud working as a service for Education

Do you want to know how a student studies at home? What are their preferences? How does he or she learn? In the 2014 World Cup, Germany won the championship and everybody asked what the formula for success was. The answer was simple: …combine long-term planning, discipline, good players and lots of information about what happens on the field.

The same formula applies to Education. The only thing needed is to combine good teachers,

dedicated students, great classroom planning and studies, discipline and above all, lots of

information to achieve the results as expected with high quality. The question is: how to collect this

information?

The new customized learning platforms built on cloud design allow teachers to have all their material organized. For example, there is no longer a need for flash drives, since most of the materials used are uploaded to the cloud. Homework is corrected automatically online. From student interaction with online material, teachers can evaluate student behavior online and measure which content and material were accessed, whether questions received answers, and which instructions were given to get them answered.

Moreover, evaluation of teachers is possible as well, assessing if the content presented meets the

criteria and if the dialogue among the students meets expectation. All concerns about what teachers

offer and how this can affect the learning process and the activities performed by the students can

all be subject of Big Data. This is Big Data serving Education.

Big Data makes it possible to understand students’ aspirations. The hybrid teaching service where

students can have a mix of an in-seat classroom with online remote sessions is growing

exponentially in Brazil and worldwide. It allows a collaborative learning scenario where

customization is possible and it is open to receive feeds in a number of new ways.

Based on the information collected, we can learn and answer several questions such as: Did

students learn? Did the material presented interest the students? What are the areas that need

more explanation? Analyzing data stored previously will make it easy to see all the differences in

the learning process and create action plans. All this opens a wide range of opportunities that with

the proper tools will help to develop new education standards and also customize learning

processes.

The value of Big Data in the Education market

Research shows that the correct use of data analysis can provide a huge benefit to teachers, government, and students. One university adopted a system, called an Early Warning System, to work proactively rather than reactively. It collected several variables about the students, such as academic history, grade achievement, class attendance, homework, time taken for correction, consumption of internal electronic academic material, etc.

To their surprise, they found a pattern of behavior that warned them of student success or failure in the course taken. They came up with an action plan that included an advice program aimed at showing students different ways to study in order to succeed. Not only did this benefit the students, who succeeded in their courses, but it also increased the number of students per teacher and gave the university a financial boost.
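A minimal sketch of how such an early warning check could combine a few of the variables listed above into a single risk score. The field names, the 0-10 grade scale, the weights, and the cutoff are all hypothetical choices for illustration; the source does not describe the university's actual model.

```python
from dataclasses import dataclass

@dataclass
class StudentRecord:
    name: str
    average_grade: float    # assumed 0-10 scale
    attendance_rate: float  # fraction of classes attended, 0-1
    homework_rate: float    # fraction of homework delivered, 0-1

def risk_score(s: StudentRecord) -> float:
    """Weighted 0-1 risk score; the weights are illustrative only."""
    grade_risk = 1 - s.average_grade / 10
    score = (0.5 * grade_risk
             + 0.3 * (1 - s.attendance_rate)
             + 0.2 * (1 - s.homework_rate))
    return round(score, 3)

def flag_at_risk(students, cutoff=0.4):
    """Names of students whose score reaches the (assumed) cutoff."""
    return [s.name for s in students if risk_score(s) >= cutoff]

students = [
    StudentRecord("A", 8.5, 0.95, 0.90),
    StudentRecord("B", 4.0, 0.60, 0.50),
]
print(flag_at_risk(students))  # ['B']
```

A real system would fit the weights to historical outcomes rather than fix them by hand, but the flow is the same: score every student early in the term and route the flagged ones to the advice program.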

In another study, students with low grades were also the students with the weakest relationships with classmates, and the students with the worst grades mostly related to other low-grade students. Meanwhile, students with high marks had much stronger relationships with classmates. From these results, teachers could work to promote new interactions between high-mark students and low-grade students.

Not all research has reached a conclusion on behavior or root causes, but it has brought some particular situations to attention. For example, another school that had online e-books collected data about how many times the e-books were accessed, how many pages were read, whether pages were skipped randomly, how many times students went back and forth, and whether there were marks in the text.

This is interesting. For some unknown reason still under investigation, all the students who marked texts in the e-books got the worst grades. From our own life experience, we have always assumed that the more someone marked a text, the smarter this person was. However, that was not the case here. So, the questions one may ask are: is the material not good enough? Or, is the order of the chapters affecting the understanding?

Big Data’s new way of analyzing education is building a new educational model and breaking down old-fashioned models. This is a completely new market for education, not only in Brazil but worldwide.

EXPECTED EDUCATIONAL CHANGES TILL 2020 WORLDWIDE!

Adaptive Learning14

Some US companies specialized in data analysis have created an auto-adaptive method of education based on the student's profile. The engine behind it analyzes the student and provides content in different ways as the student advances in the discipline. This model is called Adaptive Learning and it applies to almost all disciplines.

Adaptive Learning and Data Mining are converging on a common point where it will be possible to explore the correlation between learning and content. To make adaptation easier for users, content will be generated in pieces in different formats such as text, video, and audio.

To understand what is happening, a US company, Knewton Software, is following individuals as they improve and comparing them to other students of the same discipline, semester after semester. The patterns in the compared results can measure levels of difficulty, format, interaction type, teachers involved, classroom type, etc., and support specific conclusions.

An example of the result could be described like this: student X analysis could show that he can

obtain higher productivity when the subject studied regardless of the difficulty is presented during

morning classes through videos with teachers who had

recently graduated. The point that can be applied is that

content will be automatically adapted to the student’s

profile. All this information came from the analysis of the

data gathered from the study conducted on the students.

In another example, student Y had difficulties with discipline Z during his third year and got bad grades on final exam W. That analysis allows teachers to focus their effort on the specific subject before the third year. In practice, the software will generate reports with alerts and warnings for every calculated pattern.
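The student X example can be sketched as a very small piece of code: given a student's past results per content format, pick the format with the best average score. This is only an illustration under assumed data; the function name and the scores are hypothetical, and it is not a description of how Knewton or any other vendor actually adapts content.

```python
from collections import defaultdict

def best_format(history):
    """history: list of (content_format, score) pairs for one student.
    Returns the format with the highest average score."""
    totals = defaultdict(lambda: [0.0, 0])  # format -> [sum of scores, count]
    for fmt, score in history:
        totals[fmt][0] += score
        totals[fmt][1] += 1
    return max(totals, key=lambda f: totals[f][0] / totals[f][1])

# Hypothetical results for student X across three content formats.
student_x = [("video", 8.0), ("text", 6.0), ("video", 9.0),
             ("audio", 7.0), ("text", 6.5)]
print(best_format(student_x))  # video
```

The same grouping could be extended with further keys from the text, such as time of day or teacher profile, at the cost of needing far more observations per student.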

Society (Social livelihood)15

Big Data for improved diagnosis of social conditions

In any nation, the distribution of wealth gives a macro view of poverty. However, it is almost impossible to assess and analyze the real distribution. The following case is based on an example from Brazilian society. The government presented a distribution of wealth in the country that showed 10% of the people were rich and 90% lived in different levels of poverty. These reports are created using nationally representative household surveys, which require labor and time and are conducted at long intervals, sometimes every 4 years. To fully understand the distribution of wealth, the first step is to have current, real poverty distribution maps.

To analyze wealth distribution in society, we can make use of some known information such as:

> A set of income thresholds that can vary by family size and composition

> An income-based method created by some government programs to help people with low

incomes

> A consumption-based method to measure what households actually spend

Moreover, there is also a new model that helps to evaluate society based on mobile phone accessibility and use, which can create a large volume of data on social interactions, mobility, and more. By correlating that information with the other models, governments can deploy more accurate actions based on real social conditions.

Mobile phone call data records (CDRs)16 are an immeasurable source of information. CDRs allow a view of the communication and mobility patterns of people at an unprecedented scale. Maps built from them can improve the diagnosis of poverty and assist public planners in choosing appropriate interventions, specifically at the decentralized level, to eradicate poverty and, as a consequence, ensure a higher quality of life. Big Data analysis can also include gender, the urban/rural gap, or ethnic/social divisions. Accurate poverty maps can assist in policy planning for inclusive and sustained growth of all sections of society. As mentioned earlier, the analysis becomes much more accurate as more socio-economic indicators are added. Mapping call data records, mobility, and economic activity is just the beginning of the data collection.

What is the gain with Big Data for Society?17

The high deployment speed of Big Data solutions and all its implications are directly related to the benefits it produces for a modern society that uses data analysis. Research conducted by the Economist Intelligence Unit in 2012 with executives of businesses that had annual profits above US$ 500 million confirmed that 70% of these executives reached their objectives based on data analysis, while 45% believed that they could get even greater results if they had more access to Big Data.

If the analytic applications used in corporate business have shown extraordinary results, the public sector at all levels of government can achieve excellent results too. According to a US report, the public health system estimates a gain of over US$ 300 million annually. A governmental initiative worth mentioning is the Massachusetts Big Data Initiative. The mission of this Massachusetts State project is to become the world leader in innovative Big Data solutions. Two years after its creation, the report presented is impressive and inspiring. Within 2 years, with strong public incentives, the initiative involved 500+ companies at all stages, from startups to corporations, working in several areas such as applications, analytical tools, data system management, etc., and also in vertical business solutions (e.g. manufacturing, energy, retail, etc.).

Over US$ 200 million was invested in education for research and Big Data training courses, and as a result over 5600 professionals are being educated to implement analytical solutions that improve productivity and the quality of decisions in companies and public institutions. The scope of the State project has stimulated private investment in companies in the State, helping to create a virtuous circle and achieve the project mission. In Brazil, the gold rush (a.k.a. the data rush) to analyze the past, predict the future, and make better decisions to improve business processes or increase the return on public investment is just starting. The results obtained so far clearly show Big Data’s value for society.

Creating an Intelligent Nation, Community By Community (From I-Canada)18

The Intelligent Community Development Plan includes:

> A collaboration and process framework and governance model

> Core i-community platforms

> Benchmarks

> Sustainability planning

> Measurement models to track progress and returns on investment.

Meeting Community Goals

In every community, different priorities can come up while the transformation is an ongoing process. To become an Intelligent Community, i-Community proposes a few programs that focus on the community’s inherent skills and advantages. Some communities, like Singapore, have a strong manufacturing sector, and logistics is important to them. Others, such as Stratford in Canada, are building new digital media capacity based on their heritage as a creative theatrical community. Each should design and apply an Intelligent Community plan that is flexible but has a defined target.

Culture19

The cultural sector can join Big Data as well. At the moment in the UK, there is the audiencefinder.org web page, a free national audience data and development tool that helps cultural organizations understand, compare, and apply audience insight. This web page and a few others take an approach based on cultural policy to aggregate information about cultural consumer behavior as well as the allocation of public funding and the measurement of impact. The value lies in a two-way exchange, regular and honest, between funder and funded.

The tools used in the analysis compare behavior like demographic characteristics. The idea is to

use some element of big data-type approaches to rethinking components of traditional decision-

making and data-driven approaches to drive insight and change behavior.

Big Data comes into play by analyzing social network commentary about a performance to show the connections made about the subject, the audience reached, etc. Cross-referencing the analysis of a show requires bringing data scientists from other, non-cultural fields into the sector to explore the needs and build the capacity.

That analysis benefits not only performances but also artists such as painters or writers who need to release their work and get feedback on it.

Few people in the sector would say that data analysis and art can’t or even shouldn't be mixed.

Around the world, artists are working with data in amazing ways and people are using dashboards

to control their social media feeds or popularity decrease. Assuming that the foundation of the raw

materials is strong and the analytics robust, data-driven decision-making could be a key element to

increasing artistic impact and commercial resilience both for individual organizations and the sector

as a whole.

Experience, creativity, and necessity are a powerful combination. Looking for new ways, such as

sentiment or semantic analysis, to measure aspects of artistic impact could also be an important

new tool for the cultural sector. The measurability of everyday life is growing at an amazing rate.

The developing expectations of audiences for personalization and the levels of service provided by

digital companies such as Amazon demonstrate that data is already a key tool for the cultural

sector.

For the Rio2016 Olympic Games, an online portal is available to register popular artists and literature projects, giving exposure to new writers and poets from the periphery.

The registration will allow the organization team, called “Celebra”, to map and distribute the artists and traditional festivals, with typical Brazilian foods from various regions, during the Games, so that visitors can interact much more with the mixture of cultures that forms this nation.

Security and National Defense20

The public security sector worldwide is seeking new, efficient technologies to support its operations. All institutions that work in this area have one or more complex structures dedicated to intelligence agencies.

Currently, investigation schemes are divided into activities to receive, store, and process data from all sorts of sources (structured and unstructured), such as individual national registration records, social network profiles, vehicle records, and audio files, and to combine them to offer accurate results.

Adding to the complexity of this huge Data Lake, it is necessary to understand the links between the data and find how they are geographically distributed. To perform all this data analysis and obtain efficient results, intelligence agencies are working with powerful Big Data tools that make all the information processing possible.

From the fiction movie “Minority Report” (2002) to the reality of security and defense, Big Data is the new technology helping police departments, from small to large cities, states, and countries, to match individual behavior and prevent, or at least provide proactive analysis that police can use against, crimes that are most likely to happen in a given area.

At the moment, there is a trial solution in place, used by the London Police Department, where a large volume of data available on the Internet is the target for public security investigation. The solution is so real that it is raising several concerns about limits on data collection and the ethical use of monitoring systems by authorities, to avoid misuse of the information.

Accenture developed the solution used by the police department. It takes information from individuals' Facebook pages and other data from various applications to match information and generate a risk assessment report.

According to the public security chief from Accenture, the software does not predict actions. It just directs the analysis and presents results for people with a high-risk combination. It is a tool designed to help the police work more efficiently.

The tests used data collected during 20 weeks and related to 5 years of historical information. In this test, the software combined information on 32 London criminal groups over 4 years and generated the probability that those people would commit new crimes. The results were compared to the incidence of crimes during the 5th year to evaluate the software's precision. It proved to be efficient.
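The evaluation described above is a holdout test: predictions built from the first four years are scored against what actually happened in the fifth. A minimal sketch of that comparison follows, using precision and recall; the identifiers and the sample sets are hypothetical, and the source does not state which metric was actually used in the trial.

```python
def precision_recall(predicted, actual):
    """Compare the set of people flagged as high risk (from years 1-4)
    with the set actually involved in incidents in the holdout year."""
    predicted, actual = set(predicted), set(actual)
    hits = predicted & actual
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical anonymized identifiers.
flagged_from_years_1_to_4 = {"p1", "p2", "p3", "p4"}
observed_in_year_5 = {"p2", "p3", "p5"}

precision, recall = precision_recall(flagged_from_years_1_to_4, observed_in_year_5)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.50 recall=0.67
```

Precision shows how many of the flagged individuals were correctly flagged, which matters most for the misuse concerns raised in the text: a low value means many people were singled out wrongly.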

Big Data analysis cannot reduce crime by itself, and it should be used carefully to prevent misuse of power by the authorities. For example, information can be misused to unfairly target a given group of people, classifying them as potential criminals, says the director of the NGO Big Brother Watch, which campaigns for the protection of civil liberties.

Image from Web site: http://www.thewrap.com/steven-spielberg-hiring-godzilla-writer-for-minority-report-tv-series-exclusive/

In Brazil, an agreement signed between the Secretary of Public Security of Sao Paulo state and Microsoft promised that, over the coming year, they will use data analysis to help police by warning them of criminal pattern matches. Similar solutions are offered in Spain and Singapore; the Spanish police force is using applications to identify areas with potential crimes, while in Singapore the software is generating reports based on video monitoring of crowded places, street traffic, and regional commemorative events.

The company PredPol developed Intelligent Crime Mapping software, which is in use in 12 US cities, the UK, and Uruguay. Drawing on a large volume of database information, it estimates the days and hours when crimes are most likely to happen in given areas. According to an FBI report, the experiments are successful. For example, in Santa Cruz the system helped local police reduce 19% of assaults and 24 in-the-act crimes in the areas indicated on the map.
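To make the idea of estimating "days and hours" per area concrete, here is a deliberately simple sketch that counts historical incidents by area, weekday, and hour and returns the most frequent combinations. It is a plain frequency count over hypothetical records, not PredPol's actual model, which the source does not describe.

```python
from collections import Counter
from datetime import datetime

def top_hotspots(incidents, k=3):
    """incidents: list of (area, timestamp) pairs.
    Returns the k most frequent (area, weekday, hour) slots,
    with weekday as 0=Monday ... 6=Sunday."""
    counts = Counter((area, ts.weekday(), ts.hour) for area, ts in incidents)
    return counts.most_common(k)

# Hypothetical incident log.
incidents = [
    ("Downtown", datetime(2015, 3, 6, 22, 15)),   # Friday, 22h
    ("Downtown", datetime(2015, 3, 13, 22, 40)),  # Friday, 22h
    ("Harbor", datetime(2015, 3, 7, 2, 5)),       # Saturday, 02h
]
print(top_hotspots(incidents, k=1))  # [(('Downtown', 4, 22), 2)]
```

A count like this only restates the past, which is exactly the limitation of traditional crime mapping; predictive systems add models that weigh recent events and new information on top of such history.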

Until now, traditional crime mapping could only point out areas where issues occurred in the past, while PredPol can use historical data and the analysis of new information to build a map of potential future criminal actions, according to researcher Jeff Brantingham of the University of California, Los Angeles, co-founder of the tool. The PredPol system can predict around 200% more crimes than any current method used by police, according to reports from Los Angeles and Kent, England.

Karin Breitman, from EMC R&D in Rio de Janeiro, says that data prediction is not something new. It has been in use by the private sector for a few years, but now, due to price reductions and much greater capacity for data collection and processing, data analysis is expanding exponentially in several public sectors, such as security.

Processing and combining massive amounts of data through mathematical algorithms can reveal patterns in people's movements and point out potential risk zones.

According to Carlos Tunes, IBM Brazil Big Data Executive, the use of this new technology in the

security field is helping authorities work with data and dynamically generate real-time reports based

on video camera images, social network and several sensors from IoT devices. The solution is able

to look at someone and evaluate not that specific circumstance but a whole set of information in a

context like the 2016 Olympic Games where data analysis showed its power on preventive and

reactive measures.

At the beginning of this article, one can read “With Great Power Comes Great Responsibility”. This sentence comes from a manager at the Technology Society Center – FGV-Rio University. Brazil is facing growing privatization of the public security sector and a lack of basic principles on the limits of citizen surveillance, which is a bad mixture when it comes to Big Data analysis. According to this manager, a Big Data solution can bring some surveillance assistance that benefits society, but for that to happen the monitoring must have limits proportional to what is required in real cases where it is necessary. It doesn’t seem correct to monitor the entire society to catch a group of individuals.

In Sao Paulo city, a Domain Awareness System called locally as “Detecta”, will integrate with

several databases from several Public Security sectors to index them and classify on predefined

models. As the security surveillance cameras will also be integrated into the system, it will allow

sending warning alerts based on the criminal patterns identified. That will allow a better distribution

of police among the city on areas where criminal situations are most likely to happen, says Glauco

Carvalho, commander from Sao Paulo Military Police.

At the moment, the system contains data from the Transit Department and the Military Police Department. The commander gives an example in which they run a basic search for a person in the “Detecta” system and it returns all vehicles registered to that person and any criminal record that person has, including a photo if one is already registered by the police. Another interesting alert, already in place, relates to abnormal movement: for example, if a large number of people run in the same direction, the system immediately sends an alert to the Operational Center of the Police Department, which can check the cameras around the area to validate whether police are required.
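The abnormal-movement alert can be illustrated with a minimal sketch. The source does not describe how Detecta computes it, so everything here is an assumption: the function name, the thresholds, and the idea that each tracked person is reduced to a heading (in degrees) and a speed. The sketch raises an alert when enough people are moving fast and their headings are closely aligned.

```python
import math

def crowd_alert(headings_deg, speeds_mps, min_people=20,
                min_speed=2.5, min_alignment=0.8):
    """Return True when many people are running in roughly the same direction.

    All thresholds are illustrative assumptions, not values used by Detecta.
    """
    # Keep only the people moving faster than a brisk walk.
    runners = [h for h, s in zip(headings_deg, speeds_mps) if s >= min_speed]
    if len(runners) < min_people:
        return False
    # Mean resultant length: 1.0 means identical headings, near 0 means scattered.
    x = sum(math.cos(math.radians(h)) for h in runners) / len(runners)
    y = sum(math.sin(math.radians(h)) for h in runners) / len(runners)
    return math.hypot(x, y) >= min_alignment

# 30 people running within a few degrees of one another...
fleeing = crowd_alert([85 + (i % 10) for i in range(30)], [4.0] * 30)
# ...versus 30 people running in evenly scattered directions.
scattered = crowd_alert([i * 12 for i in range(30)], [4.0] * 30)
print(fleeing, scattered)  # True False
```

An operator would still confirm the alert on camera, as the commander describes; the calculation only decides where to look first.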

Security in the Air21

There is huge development of systems and devices for the aerospace sector based on IoT and Big Data. This allows monitoring and data analysis of satellites, airplanes and any other flying device, producing reports on potential failures or attacks based on software and hardware data collected both historically and in real time.

Several countries are already studying and building automated airplanes, which will fly predefined routes with no pilot, managed entirely by a computer system. It sounds crazy, but it is reported that in the last decades the large majority of civilian airplane accidents were caused by a wrong human reaction to the circumstances. These wrong reactions occur at all levels, from maintenance, where sensor alerts to replace parts were neglected by the mechanical teams, to wrong manual command changes by pilots.

How to guarantee that security information about individuals is not stolen or misused?

Continuous auditing mechanisms are needed for those who can access government data. With full activity monitoring, the auditing reports can be analyzed and related to potentially malicious context found in external information. Non-authorized travel, credit fluctuations, classification changes, investments in new startup companies, etc.: all these data collections could feed a big data auditing analysis that identifies risks among the individuals who handle other people's personal information.

The Splunk22 system can monitor all these kinds of IT data, matching abnormal behavior and correlating it with other on-demand external sources of data inside or outside an office. It can not only point out a potentially malicious action by an individual but also help to differentiate it from an accidental policy violation.


Health Services23

Taking care of people is highly intensive and involves a huge number of variables. Imagine monitoring hospitals, doctors, clinics, etc. Sao Paulo city alone has 12 million inhabitants, 42 hospitals and more than 120 thousand registered doctors. If we look at the entire state of Sao Paulo, the numbers get bigger: 645 cities, more than 42 million inhabitants and 881 hospitals (data extracted from the 2011 census carried out by the Secretary of Health Services of Sao Paulo state).

The health system in Brazil is divided into two main streams: the Public Service and the Private Service. Let’s concentrate on the Public Health Service24 to show a possible application of Big Data and Cloud. Controlling all the expenses of the Public Health Service and applying analysis will first require collecting and registering all the medical institutions in a single portal. With this registration, each patient who uses the Public Health system will automatically appear in the system with the date and hour of use, and any procedure needed or medical product used can be linked to the patient record. This will be a huge relational database and, from a macro perspective, this information can clearly indicate real product consumption. Furthermore, if it is compared with the hospitals' purchase invoices, it will work as an anti-fraud tool in the health service.
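The anti-fraud comparison above can be sketched in a few lines. This is only an illustration under stated assumptions: the product names, the quantities and the 10% tolerance are hypothetical, and a real system would read both sides from the unified portal rather than from in-memory dictionaries.

```python
def flag_discrepancies(invoiced, consumed, tolerance=0.10):
    """Return products whose invoiced quantity exceeds the consumption linked
    to patient records by more than `tolerance` (a fraction of the invoiced
    quantity), mapped to the number of unexplained units."""
    flagged = {}
    for product, bought in invoiced.items():
        used = consumed.get(product, 0)
        if bought > 0 and (bought - used) / bought > tolerance:
            flagged[product] = bought - used
    return flagged

# Hypothetical monthly totals for one hospital (units of each product).
invoiced = {"saline_bag": 1000, "syringe": 5000, "stent": 40}
consumed = {"saline_bag": 960, "syringe": 4900, "stent": 22}

suspicious = flag_discrepancies(invoiced, consumed)
print(suspicious)  # {'stent': 18}
```

A flagged product is not proof of fraud (stock may simply be sitting on a shelf), but it tells auditors where to look first.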

The data collected is not limited to patients and resource consumption; it also covers the people involved, such as nurses and doctors, the cases attended, the specializations required, the times when high volumes of people request assistance, the areas where more doctors should be allocated, etc. The possible combinations are enormous.

The use of Big Data analytics is bringing a full IT transformation by improving patient care, eliminating waste and coordinating treatment and care plans. These changes can bring new information about the administration of the Health System, the transparency and responsibilities of health service providers, as well as important patient data for researchers studying diseases.

The 3 main areas of the Health Service where Big Data can work are:

Precision Medicine

Electronic patient medical records

Internet of Things


Precision Medicine

Most scientific knowledge is still based on large averages. For example, a recent analysis of stroke cases showed that the use of new oral anticoagulants reduces the risk of strokes and systemic embolic events by 19%. The average doesn’t say that each person's risk was lowered by 19%; it says that some people had their risk reduced by 100% (they did not have the stroke) while others had it reduced by 0% (they had the stroke).

That means the new oral anticoagulants lowered the chance of strokes in the population as a whole, but the result does not show for whom they worked, nor whether something else in combination produced the results. In the tests carried out, in a group of nearly 30 thousand patients almost a thousand had a stroke even with the medication.
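To make the "large averages" point concrete, the figures quoted above can be turned into a small worked calculation. This is a rough sketch, not a reanalysis of the study: it assumes the 19% relative reduction applies to this same group of patients, and it uses the rounded numbers from the text.

```python
treated_patients = 30_000    # "nearly 30 thousand" in the text
treated_strokes = 1_000      # "almost a thousand" in the text
relative_reduction = 0.19    # the 19% average quoted in the text

# Observed risk of a stroke while taking the medication.
treated_risk = treated_strokes / treated_patients

# Assumption: the 19% reduction applies to this group, so without the
# medication the risk would have been treated_risk / (1 - 0.19).
implied_untreated_risk = treated_risk / (1 - relative_reduction)

# Absolute reduction, and how many patients must be treated to avoid one stroke.
absolute_reduction = implied_untreated_risk - treated_risk
number_needed_to_treat = 1 / absolute_reduction

print(f"risk with medication:     {treated_risk:.2%}")
print(f"implied risk without it:  {implied_untreated_risk:.2%}")
print(f"absolute reduction:       {absolute_reduction:.2%}")
print(f"patients per stroke avoided: {number_needed_to_treat:.0f}")
```

Under these assumptions roughly 128 patients take the medicine for each stroke avoided, and the average says nothing about which of them benefits. That is the gap precision medicine tries to close.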

Who are the people for whom it did not work? Which other cause could be related? Maybe they were women above 60, or had a particular ethnic background, or had smoked their whole life, or lived in an area where the air could be contaminated by an industrial chemical product, etc. The reality is that they don’t know.

The objective of precision medicine is to have much more accurate information, with a full registration of the patient who will take the medicine, and to compare it with other databases, such as doctors' records, in an attempt to find patterns. That will not bring 100% precision, but it could double the efficiency of the analysis and greatly increase the number of lives saved.

Electronic patient medical records

It is critically important that clinicians, staff, and patients have information, tools, and resources at their fingertips at all points of care. Unfortunately, patient records in several countries, like Brazil, are still far from the ideal model. At the moment, each unit keeps its own patient records, completely isolated from the others, and in some cases they are still on paper. As with any paper registration, these records are very hard to update, transfer or easily comprehend. Furthermore, they are badly stored due to a lack of proper management.

There is a strong movement to unify the patient record and scan all the current paper records to make them all electronically searchable. In big cities, health units are already using online applications that can be accessed by other health services, which makes consultations much faster because a precise data history is at hand.

Having the record unified and used all over the country will result in a phenomenal improvement in the entire health system. It should reduce the time taken to fill in forms and to queue for consultations, and it will also help in disease research and preventive diagnoses. The online unified patient record is already in use in the UK and confirms the real benefits mentioned here.

Internet Of Things (IoT)

This is probably the most anticipated item in the Big Data world, as information is collected from almost everything, allowing precise combined analysis. In the health system, the possibilities are incredible. For a medical caregiver, a patient's vitals and behaviors can be constantly monitored, which increases the effectiveness and efficiency of treatment. Another example is wearable devices that can report real-time heart analysis and, in the case of a potential incident, alert the person and direct them to the nearest hospital.

In the epidemiological field, the analysis can identify diseases at early stages based on patterns, or identify potential virus vectors and affected areas, within a very short period, alerting local authorities to take immediate action.

Machine learning

In the data analytics area there is already a growing segment creating algorithms that can automatically process data and, based on the result, take the next actions without human intervention.

A current example of Machine Learning is the decision tree, which takes the form of a classification tree or a regression tree depending on the type of variable being predicted. In Brazil, an analysis was performed to predict why a town could have an Infant Mortality Rate (IMR) below the national average (14.7 deaths per 1,000 live births).

The two explanatory variables added were the proportion of women with more than 7 prenatal consults and the illiteracy rate, both from the year 2010. The period selected was 2008 to 2012.

For this tree analysis they used rpart from R. The source code used is shown below.

# Load the example data set (one row per town).
ML <- read.csv("https://sites.google.com/site/alexandrechiave/mlexemplo/mlexemplo.csv")

# Recode the outcome: 0 = IMR below the national average, 1 = IMR above it.
ML$IMR <- factor(ML$IMR, levels = c(0, 1), labels = c("IMR below", "IMR above"))

# Install the tree packages only if they are missing, then load them.
for (pkg in c("rpart", "rpart.plot")) {
  if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
}
library("rpart")
library("rpart.plot")

# Fit a classification tree using the two explanatory variables.
model.rpart <- rpart(IMR ~ prenatal + illiteracy, data = ML, method = "class")

# Draw the tree on screen, then save the same plot to IMR.png.
rpart.plot(model.rpart, type = 0, extra = 2, varlen = 10)
png("IMR.png")
rpart.plot(model.rpart, type = 0, extra = 2, varlen = 10)
graphics.off()

Without human intervention, the algorithm identified 2 predictive points, also known as the nodes of the tree:

Proportion of women with more than 7 prenatal consults above 67%

Illiteracy rate lower than 8.1%

The graphic below shows that the algorithm, making use of the two variables, correctly identified the position of towns relative to the national average in 64.9% of cases (3,610 of 5,565).

Even the most popular machine learning methodologies still show some limitations, due to over-fitting and the possible increase in the number of spurious associations. Scientists nevertheless expect to have these solved in the near future.

Large Amount of Collected Data ≠ Right Collected Data

Determining the relative importance of the quantity and the quality of data is not an easy task. On this question, people can be divided into 3 groups:

Group of individuals without statistics knowledge

Group of individuals with low statistics knowledge

Group of individuals that work with statistics

The first group believes that the solution to research problems is to increase the amount of data collected (for example, they usually believe that mistakes in election polls were caused by the low number of people surveyed).


The second group believes the opposite of the Big Data approach. They think that a large amount of data causes incorrect analysis results due to sampling problems.

The third group deals with biased samples, which have always occupied a good part of scientists’ time.

It is true that Big Data results may not represent the reality of a population, because the data sampled may not come from all layers of the population. For example, data from smartphones and wearable devices will come largely from individuals in the medium to high economic classes. The same may occur with medical records, as not all health professionals will have the knowledge to use them.

Some traditional methodologies are being incorporated into big data in an attempt to solve the sampling issues. One of them is to assign each individual a weight according to how well that individual's group is represented in the population evaluated.
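The weighting idea can be sketched as follows. This is a minimal illustration of post-stratification under invented numbers: the groups, the sample records and the population shares are all hypothetical, and each group is assumed to have at least one record in the sample.

```python
def weighted_rate(sample, population_share):
    """Post-stratification: average each group separately, then weight the
    group averages by the group's share of the population rather than by its
    share of the sample."""
    total = 0.0
    for group, share in population_share.items():
        values = [v for g, v in sample if g == group]
        total += share * (sum(values) / len(values))
    return total

# Hypothetical wearable data: 1 = reports regular exercise, 0 = does not.
# High-income users are over-represented (8 of the 10 records).
sample = ([("high", 1)] * 6 + [("high", 0)] * 2
          + [("low", 1)] * 1 + [("low", 0)] * 1)
population_share = {"high": 0.3, "low": 0.7}   # assumed census shares

raw = sum(v for _, v in sample) / len(sample)
adjusted = weighted_rate(sample, population_share)
print(f"raw estimate: {raw:.3f}, weighted estimate: {adjusted:.3f}")
```

The raw figure (0.700) mostly reflects the over-represented group; the weighted figure (0.575) is closer to what the whole population would report, provided the few records from the under-represented group are themselves typical.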

Science and researchers

“If the bee disappeared off the surface of the globe then man would only have four years of life left. No more bees, no more pollination, no more plants, no more animals, no more man. If the bee disappears from the surface of the earth, man would have no more than four years to live” - Albert Einstein

Big Data visualization reveals several worldwide changes in animal and insect behavior. Scientists are aware of climate change and know there are several changes in the environment, but some of them are so dramatic, like changes in bird migration, that they are raising questions such as: Is migration increasing? Is temperature the root cause? Or rain? Are there correlations that point to climate change?

The power of Big Data and the new models of analysis will allow scientists and researchers to provide accurate information on global warming, the spread of diseases, deforestation and plague occurrences at a level that everyone can understand. Anyone will be able to join the fight to prevent bad things from happening and make a difference.

Current isolated research models are inefficient: while some things are highly analyzed, others are left behind, and they just might all be related. The idea behind the new technologies is to provide government with tools to build an online catalog where all universities that run research programs could upload information. Authorities can study this information thoroughly and, as a result, apply investment in the area faster.

Suppose there are four education centers studying a particular disease that is spreading all over the country. They could combine their research teams and share the investment for that research among them. There is a scenario like that in Brazil: at the end of 2015, Brazilian authorities reported a high dissemination of the Zika virus all over the country. However, the lack of integration between the health system and the research centers resulted in a massive delay in containing the epidemic.


In the example mentioned above, let us simulate the use of Big Data and new IT technologies such as Cloud Computing. Using Cloud technology, a single system managed by the government health system could allow hospitals all over the country to register patients with their symptoms, and the data collected could trigger alerts to local authorities about a high volume of cases matching specific symptoms. Having that information immediately available can allow authorities to take decisions quickly and to apply resources where they are needed for immediate prevention, while other resources could be directed to universities with matching studies in the area.

It sounds like basic and simple information, but if other information could be ingested into the system, such as the size of the affected cities, quality of life, economic information and mapped travel flows, it could provide a rich and detailed report from which the preventive system could cover several points, like increasing garbage collection in the areas where most of the cases were identified, or immediately advising public services to deliver leaflets containing a mosquito combat action plan, etc. While mosquito growth takes place within a few days, Big Data analytic reports would take hours, resulting in a fully proactive action plan to combat the epidemic at a very early stage.
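The alerting step of such a system can be sketched as follows. Everything named here is an assumption made for illustration: the symptom set, the city figures, the baselines and the thresholds are hypothetical, and a production system would use proper epidemiological surveillance methods rather than a fixed multiplier.

```python
from collections import Counter

# Illustrative symptom set; a real system would take it from health authorities.
ALERT_SYMPTOMS = {"fever", "rash", "joint pain"}

def outbreak_alerts(reports, baseline, factor=3.0, min_cases=10):
    """Return the cities whose number of matching cases is at least
    `min_cases` and at least `factor` times their usual weekly baseline."""
    counts = Counter(city for city, symptoms in reports
                     if ALERT_SYMPTOMS <= set(symptoms))
    return sorted(city for city, n in counts.items()
                  if n >= min_cases and n >= factor * baseline.get(city, 1.0))

# Hypothetical registrations for one week: (city, symptoms reported).
reports = (
    [("Recife", ["fever", "rash", "joint pain"])] * 25
    + [("Curitiba", ["fever", "cough"])] * 40
    + [("Natal", ["fever", "rash", "joint pain", "headache"])] * 4
)
baseline = {"Recife": 5, "Curitiba": 30, "Natal": 1}  # assumed usual weekly counts

alerts = outbreak_alerts(reports, baseline)
print(alerts)  # ['Recife']
```

Only the city with both a matching symptom pattern and an unusual volume is flagged; the busy city with unrelated symptoms and the city with a handful of cases are not.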

Big data in biomedicine

Genomic data translated into treatment. Nowadays scientists are putting together a massive volume of information coming from genome sequencing projects, patient records and laboratory research. This data brings a new era of alignment between technology and medicine, called “precision medicine”, that can deliver treatment based on individual needs.

The challenges faced in the past are no longer a problem thanks to the new data analytics tools, which can process millions of gigabytes of data and bring clear answers to medical questions.

In parallel with genomic and biomedical analysis, smartphones and other wearable devices are generating continuous flows of health data from a large number of people.

Analysis of this data allows a much more detailed understanding of a disease. Numerous research organizations are assembling cloud-based 'information commons' to standardize, store and share the data.

Politics Administration25

According to McAfee and Brynjolfsson (MCAFEE and BRYNJOLFSSON, 2012)26, decisions should be based on data analysis, and data-driven decisions are better decisions.

By making use of Big Data, administrators' decisions will be based on evidence rather than intuition. It has been shown that the more an organization bases its decisions on data analysis, the better its operational and financial results. The same applies to any government that adopts Big Data in its administration. The results should show a substantial improvement in the overall management of the country, providing solutions now and simulating reactions in the future.


Infrastructure – Smart Cities

Environment

Google Earth is a powerful tool that is helping Brazilian scientists track deforestation of the Amazon Forest. Unfortunately, the researchers' thinking doesn’t seem to be wide enough, as that is just one dot in the picture of what it can do.

Let’s improve this picture: not only animated thin blue lines and rivulets showing which way the wind is blowing around the globe. On a closer look, orange dots point out fires, while a thick haze of red boxes can highlight poor air quality. Global Forest Watch, an NGO, designed a tool in the late 90s to map and help track down illegal forest fires and provide up-to-date information on where deforestation occurred. The map uses satellite data to track forest change and provide information on forest fires around the world. It also tracks the total amount of forest cover on earth.

New research on medical treatments using compounds extracted from the Amazon Forest is also a target for data analytics, where the concern is not only investment or profit but sustainable extraction and production overall, with no further forest destruction. Analysis of soil, water and plant distribution, together with the mapping of areas, is the key to large-scale production with minimum impact on nature.

Smart Cities27

These are cities or communities that make use of information and communication technologies to improve their public services, reduce costs and improve contact between citizens and the government.

How is that possible? Big Data Analytics is the answer

The Smart City concept has gained attention over the last few years with global urbanization. In 2014, 54% of the world population lived in urban areas, with a projected growth rate of 1.84% per year until 2020; this automatically triggers a greater need for social services.

To achieve those objectives, it is necessary to build mechanisms that collect data and process it correctly, at a scale that matches the amount of data created by the exponential growth of the population.


Expertise in public management, engineering, architecture and urbanism is needed to make sense of the data generated by society. This is where Big Data Analytics comes in! It is the technological way for governments to understand, classify and make correct use of the big sets of data generated from digital social media.

Examples of Cities already using Big Data

Barcelona

This is an international reference of a smart city using Big Data in several ways. Using smartphone apps used by tourists, the city management can control the flow of people by organizing police patrols in the area, for daily routine or special occasions. Streets have light/metal sensors that detect available parking spaces and direct drivers to them. This also helps urban mobility teams understand the patterns of vehicle flows and parking. There are also sensors measuring air temperature, humidity, pollution and noise.

Singapore

In 2014, Singapore created the Smart Nation Plan, which uses Big Data to build an efficient transport system; through sensors they can detect traffic congestion and map car positions, offering better routes that avoid the affected areas through GPS. They are currently studying a new GPS-based system to charge tolls when vehicles use restricted zones. The idea is to obtain the exact car location, work out the distance traveled in the high-density traffic area and charge per usage. Approximately a million vehicles would be sending position data and getting charged on a daily basis. The system can also learn a vehicle's daily route and estimate charges or suggest alternate routes with different prices and travel times.
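The distance-based charging idea can be sketched like this. It is not Singapore's actual algorithm, which the source does not describe: the rectangular zone, the GPS fixes and the rate per kilometre are all hypothetical, and only segments whose two end points both fall inside the zone are counted.

```python
import math

def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def zone_charge(track, zone, rate_per_km):
    """Return (km driven inside the zone, charge) for a list of GPS fixes.

    `zone` is a rectangle (lat_min, lat_max, lon_min, lon_max); a segment is
    counted only when both of its end points lie inside the rectangle.
    """
    def inside(p):
        return zone[0] <= p[0] <= zone[1] and zone[2] <= p[1] <= zone[3]

    km = sum(haversine_km(a, b) for a, b in zip(track, track[1:])
             if inside(a) and inside(b))
    return km, km * rate_per_km

# Hypothetical trip crossing a hypothetical restricted zone from west to east.
track = [(1.300, 103.800), (1.300, 103.840), (1.300, 103.850),
         (1.300, 103.860), (1.300, 103.900)]
zone = (1.28, 1.32, 103.83, 103.87)

km_in_zone, charge = zone_charge(track, zone, rate_per_km=0.50)
print(f"{km_in_zone:.2f} km inside the zone, charge {charge:.2f}")
```

Segments that cross the zone boundary are ignored here, so the sketch slightly undercharges; more frequent GPS fixes, or clipping each segment at the boundary, would reduce that error.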

London

The capital of the UK is also investing in Big Data solutions to improve the public transport system. This includes transport card data collection combined with underground utilization, routine maintenance schedules and people's habits to evaluate routes and estimate usage.

Another very interesting use of Big Data is 3D maps showing cabling distribution, used to schedule maintenance with estimated times and precise interventions.

The Smart City concept is only beginning; the idea is to have it in every city in the world. The examples of intelligent cities are still isolated, so the concept is not yet part of politicians' administration plans and sometimes appears only as an add-on to government campaigns several years later.


Giant Infrastructure behind a Nation Territory that needs powerful data analysis28

Electrical power generation is a main discussion point in any society. In Europe, many countries base their power generation on natural gas that comes from Russia through a long-distance pipeline system. Information on gas flow, pressure, temperature changes, failure monitoring, etc., is already a huge volume of data to be taken care of. Other European countries make use of nuclear power plants, which can produce several terabytes of data just for monitoring.

Using Brazil as an example, the main electrical power sources are hydropower stations (water dams), whose analysis requires combining weather information, as rain is the main source of water for the dams. In 2014/2015, due to climate change and the “El Nina” natural phenomenon, the water level in the dams went so low that it caused a high alert, with potential power cuts all over the country.

What to do with Big Data in this scenario?

Having data collected about electrical distribution, city population growth, rain periods and water volume changes in the dams could provide a powerful report on how the government regulatory organization could work proactively to prepare the electrical system to combine more efficient electrical power sources.

Today not only Brazil but also other countries can see that they need to develop new ways to generate electricity and next-generation power source systems. At the moment, several power stations are being built on farms to make use of animal waste, producing gas (biogas) that was previously unused. Houses with solar power systems installed on the roof are also helping to increase electricity production.

The analysis made in several geographic locations, based on sunlight combined with weather, showed that several cities can produce a huge amount of electricity to power industries and businesses, which are the main consumers of electricity during the day, while overnight they could make use of the hydropower stations. That method could proactively reduce the use of water dams during periods for which low rainfall is forecast. Analysis of home, industrial and business electrical consumption during the day and night is made possible by sensors installed in buildings that send real-time data to the electrical agency that controls power distribution.

Another category of power source is wind farms, distributed in locations where the wind flow has been measured and can efficiently produce a large amount of electricity.

Every minute, a wind turbine sensor records the wind speed and the turbine's own power output. Every five minutes the information is dispatched to high-performance computers that could be 100 miles away, such as the one at the National Center for Atmospheric Research (NCAR). Artificial intelligence software crunches the data from the turbine, along with data from weather satellites, weather stations and other wind farms. As a result, it provides wind power forecasts of unprecedented accuracy, lowering the cost of energy.

Smart wind and solar power, Big Data and artificial intelligence are producing ultra-accurate forecasts that will make it feasible to integrate much more renewable energy into the grid.

Developing efficient power generation systems from renewable sources, combined with correct distribution and innovative electrical systems, can reduce the need for nuclear power plants, potentially allowing them to be deactivated entirely, and reduce the risk of contamination.

Disaster Management29

Big Data analysis of weather is also key to catastrophe prevention and provides important information for proactive measures. Sensors spread across the country and worldwide, plus satellite data collected every minute, can produce an accurate map of possible catastrophes caused by tornados, floods or earthquakes. Combined with pre-mapped risk areas, this could prepare rescue teams in advance to attend those areas.

Analyzing Disaster Big Data to Support Disaster Prevention with Timely and Accurate Forecasts

Modern sensors installed on several hardware appliances are constantly measuring seismic movements, and the reports are analyzed in real time by geological centers with highly accurate computer calculations. Government agencies in charge of disaster prevention used to strive to issue evacuation plans in a timely manner based on the information analyzed. Nowadays they can make decisions, especially during the early stages of a disaster, that help countries evacuate areas at risk of tsunamis with advance warning.

Natural disasters have been increasing in intensity in the past few years. Rainfall has increased in some places around the globe, and heavy rain and local torrential downpours are causing concern among local authorities about future disasters, as they face serious damage, such as floods and landslides, every year during rainy periods.

Example of real application of Big Data Analysis in Disaster Prevention:

Fujitsu has created a solution that can predict and estimate conditions in areas where no sensors are installed, providing an advanced, next-generation disaster prevention solution.

The technology enables one-dimensional sensor information to be expanded into two-dimensional

information through simulations using big data on past rainfall during floods along with topographical

information.

Technology for Estimating the Occurrence of Disasters from Social Media Data

The technology uses a natural language processing technique to gather comments that include keywords related to disasters, such as "flooding" and "inundation". Using a hearsay elimination technique based on a probability model and machine learning, hearsay is eliminated from the comments collected by categorizing them into information based on sightings and observation, direct hearsay information, and indirect hearsay information. Fujitsu analyzes mentions of train stations, crossroads, landmarks and other elements in the comments in order to estimate the specific location of the disaster.

System for Estimating the Occurrence of Disasters


The tests carried out in Japan using real social media data from a flood in August 2012 showed that the disaster could be detected with 80% accuracy.

Mathematical Optimization for Simulating Floods

Flood forecasting simulation technology developed by the Public Works Research Institute predicts changes in river flow during rainfall.

The forecast program divides the country into a grid of 500-by-500-meter cells and goes beyond rainfall to also include, by soil type, the infiltration of rainwater and the discharges into rivers.

The technology applies mathematical optimization algorithms to the flood-forecasting simulator, automatically adjusting and optimizing parameters to minimize the error between simulated and measured discharges.

Mathematical optimization finds the best combination of parameters with a small number of calculations. It is essential to use the optimization algorithm that best fits the simulation model.
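The calibration idea can be illustrated with a toy example. It is not the Public Works Research Institute simulator or Fujitsu's optimizer, neither of which is described in the source: the one-parameter "linear reservoir" model, the rainfall series and the use of golden-section search are all stand-ins chosen to keep the sketch short.

```python
import math

def simulate(rain, k):
    """Toy linear-reservoir model: at each step a fraction k of the stored
    water is discharged into the river."""
    storage, discharge = 0.0, []
    for r in rain:
        storage += r
        q = k * storage
        storage -= q
        discharge.append(q)
    return discharge

def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize f on [lo, hi], assuming f has a single minimum there."""
    inv_phi = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2

rain = [0, 10, 30, 20, 5, 0, 0, 0]    # hypothetical hourly rainfall
measured = simulate(rain, 0.3)        # stand-in for real gauge readings

def error(k):
    """Sum of squared differences between simulated and measured discharge."""
    return sum((s - m) ** 2 for s, m in zip(simulate(rain, k), measured))

# A coarse scan brackets the best region; golden-section search then refines it.
coarse = min((i / 20 for i in range(1, 20)), key=error)
best_k = golden_section_min(error, coarse - 0.05, coarse + 0.05)
print(round(best_k, 3))  # 0.3
```

The search recovers the parameter that generated the "measured" series after a few dozen simulator runs instead of an exhaustive scan. A real flood model has many interacting parameters per grid cell, which is why the choice of optimization algorithm matters.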

Mathematical Optimization


Rainfall and Comparison between Actual and Simulated Discharge

Transparency vs. Corruption30

Corruption is a big problem worldwide that affects almost every nation at all layers of society and business.

Big Data cannot eliminate corruption, but it can bring much more transparency to the political-administrative system.

How to achieve this

As described above, corruption can occur through the acts of an office-holder or government employee. Big Data analysis can be applied, in a simple design, by performing real-time monitoring of public sector individuals involved with internal and external contracts, dynamically reviewing and analyzing their local exchange and bank account operations or purchases. It can also be integrated with banking institutions in other countries to build a full diagram of their business operations.


This approach needs not only an analytical schema but also new legal mechanisms to support the technology and monitoring systems used.

As part of a Big Data design, the system can operate at several levels where corruption occurs, collecting data for much more accurate results and auditing. For example, it can be deployed at the city level to capture all the contracts signed by the local public sector and to follow the execution of each contract, such as a road construction.

Here is an example:

1 - A town needs a road built to connect two hospitals that exchange patients because of their specialties, avoiding the roads that are currently congested during peak hours.

2 - The city politicians open a bidding process announcing the requirements for the road, such as its dimensions, the construction deadline, and the estimated budget available.

3 - Construction companies with the infrastructure skills for that kind of operation apply, attempting to win the bid for the new road with the lowest price.

4 - The deal involves not only the lowest price but also the materials used, warranty, time to complete the construction, local jobs created, reports, etc.

5 - Currently, in several countries, the construction companies that win the contract are monitored only at the start and at a few stages pre-scheduled in the contract.

6 - From the start to the end of the construction, the public team following the work lacks accurate reports.

Here, a Big Data system could be used by the construction companies together with local city hall officials to report daily activities, which could be recorded by public or private CCTV cameras and made available online within a few “clicks”. Local citizens could also watch and monitor the work performed, documenting it with a mobile App linked directly to a Data Lake system that updates the delivery schedule. On top of that, accurate reports and analyses of weather conditions or major changes can also be recorded to provide an up-to-date roadmap for the construction schedule.

If more accurate information is required, all the invoices detailing material purchased from third-party companies can be made available online for full matching control. It may sound crazy, but computer systems could do that through analytical queries in Big Data. They can calculate the kind and amount of cement used in the mixture with sand and rubble to validate that the amount bought was delivered and used to pave the estimated area. All this data could be available to any person in the community to evaluate and confirm whether the procedure was correct. This is full and clear transparency for the members of society.
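The invoice matching described above can be sketched as a simple analytical check. This is only an illustration: the `Invoice` record, the cement dosage per cubic meter, the pavement dimensions, and the 10% tolerance are hypothetical values, not figures from any real contract or engineering standard.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    supplier: str
    material: str
    quantity_kg: float


def expected_cement_kg(area_m2, thickness_m, cement_kg_per_m3):
    """Cement needed for a slab: volume times an assumed dosage."""
    return area_m2 * thickness_m * cement_kg_per_m3


def audit_cement(invoices, area_m2, thickness_m, cement_kg_per_m3, tolerance=0.10):
    """Compare invoiced cement with the estimate and flag large deviations."""
    purchased = sum(i.quantity_kg for i in invoices if i.material == "cement")
    expected = expected_cement_kg(area_m2, thickness_m, cement_kg_per_m3)
    deviation = (purchased - expected) / expected
    return {
        "purchased_kg": purchased,
        "expected_kg": expected,
        "deviation": deviation,
        "flagged": abs(deviation) > tolerance,
    }


if __name__ == "__main__":
    invoices = [
        Invoice("Supplier 1", "cement", 40000),
        Invoice("Supplier 2", "cement", 30000),
        Invoice("Supplier 1", "sand", 90000),
    ]
    # Hypothetical road section: 1000 m2 of pavement, 0.2 m thick, 300 kg/m3.
    print(audit_cement(invoices, 1000, 0.2, 300))
```

A flagged result does not prove wrongdoing; it only marks a contract whose purchases differ enough from the estimate to deserve a closer look by auditors or citizens.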

Unfortunately, current models are far from this new analytical model based on Big Data. The leaders of the nation depend fully on several layers of administrators, who can be corrupted into accepting money for paperwork changes when no proper auditing solution exists. Even invoices get lost due to a lack of proper archiving and accurate indexing of data.

It is impossible to stop corruption entirely, but new application models and Big Data analysis systems can make it harder to commit.

Transparency improves a Nation’s standing and, consequently, attracts more economic investment for the growth of its domestic industry.

Open Data means more government transparency!

There are several “Open Data” projects based on a series of worldwide government initiatives to meet society’s demands for transparency and efficiency in the control of public spending. One of the best-known initiatives is the Open Foundation Protocol, which aims to deploy an open standard protocol that allows efficient access to useful data for analysis applications related to public policy actions, community actions, and operational insight and inspection, all performed by Big Data analytics systems.

One example of this new initiative comes from the Brazilian government, which is deploying several projects to centralize public data in government data centers and offer query access through web portals.

To achieve this new model of transparency and security, the public sector requires an innovative computer system to ingest, process and archive data at high speed and keep it available at all times. Given the amount of data to be ingested and managed, the infrastructure must be highly scalable to store the data and make it available for analytical queries. The solutions that can meet these requirements are Cloud Computing and Data Lake systems for Big Data analytics.

The results for society and government are fully positive, as the time needed to look up information is reduced drastically. Furthermore, the solution is scalable, so new systems can be added as expansion requires. Big Data analytical tools can work in parallel, generating reports within minutes, giving society real transparency and reducing the waste of public money. Having that analysis on hand helps public agents make decisions with high accuracy, fewer failures and fewer unwanted expenses, justify legitimate expenses with documentation, find breaches that can lead to potential corruption schemes, audit every sector properly, etc.

Big Data and the Challenges for the Future

Several sectors in private and public institutions have already raised a big challenge for Big Data in the near future: privacy. The risk of confidential information being stolen and published will become more and more real. Awareness among scientists and stricter security measures are so far the best approaches to this challenge. Other methods, such as data encryption and access restricted according to the requester’s level, are also being studied by researchers.

Of course, we will see scandals about leaks of private data due to negligence, failures in operating procedures, or hacked systems, but scientists should always evaluate the risks and apply corrections to these situations. The population should be informed of the huge benefits in time, money and lives saved that Big Data analytics brings to society.

Big Data seems to have arrived at the right moment due to two main factors: 1) pressure from society to have public results published faster; 2) affordable computer technology available for statistical analysis.

First steps to start with Big Data Solution31

1) Identify the opportunities to apply the Big Data

a. Understand objectives

b. Show what is possible

c. Identify the right goals

d. Prioritize and choose the beginning project

2) Clear Proof of Value

a. Deliver project result in analytics

b. Show measurable valuable points

A few available products and architecture examples

Traditional IT data centers are inefficient, hard to scale, and very hard to manage and support, creating a terrible layer of complexity when working with Big Data. They also have several limitations and performance problems.

As mentioned throughout this document, Big Data analytics alone cannot help a Nation much. It requires infrastructure and a complementary solution like Cloud Computing to ease implementation and to scale data processing and management as they grow.

Basic steps to start with

Identify which kind of data will be stored

Data classification and retention levels to be applied

Data Storage model with high availability, fault tolerance and easy scalability

Archiving Solutions to work in parallel with the Storage models

Processing Scalable Systems to offer historical and real-time processing of Big Data

Application types to be used for high volume processing

Development Team

Development and Quality Assurance System

Production System

Support Team

Data Scientists and Analytical Teams

For the Public Sector, for security reasons, the ideal platform is a private Cloud with centralized, highly secure controls that is at the same time flexible in content management, as different levels of the organization will have different access to the information.

For example, “top security” classified content about yearly agricultural production results should be processed so that only top-level leaders are aware of the results, which should be stored in separate areas with extremely limited, controlled and audited access. Meanwhile, information about weather, education and culture is classified as “informative” content and should be available to the public in real time.
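The two content levels in this example can be sketched as a small access check with an audit trail. The sketch is illustrative only: the numeric ordering of the levels, the `read_dataset` helper, and the in-memory `audit_log` are assumptions, and a real platform would rely on its own identity and auditing services.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Content levels named in the text; the numeric ordering is an assumption."""
    INFORMATIVE = 0    # e.g. weather, education, cultural content
    TOP_SECURITY = 1   # e.g. yearly agricultural production results


# Hypothetical in-memory audit trail; a real system would persist this.
audit_log = []


def read_dataset(user, clearance, dataset, classification):
    """Return the dataset contents if the user's clearance covers its
    classification, recording every attempt (allowed or not) for auditing."""
    allowed = clearance >= classification
    audit_log.append({"user": user, "dataset": dataset, "allowed": allowed})
    if not allowed:
        raise PermissionError(f"{user} may not read {dataset}")
    return f"<contents of {dataset}>"


if __name__ == "__main__":
    print(read_dataset("citizen", Classification.INFORMATIVE,
                       "weather", Classification.INFORMATIVE))
    try:
        read_dataset("citizen", Classification.INFORMATIVE,
                     "agriculture_results", Classification.TOP_SECURITY)
    except PermissionError as err:
        print("denied:", err)
    print(audit_log)
```

Logging denied attempts as well as successful reads is a deliberate choice here, since the text asks for access that is both controlled and audited.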

Below is an illustrative Private Cloud model of a Data Center with its Big Data infrastructure

Based on the political administration model used in this article, that of the Brazilian government (Federal > state > city), the architecture design could be:

Federal Unit

1) The owner of the Data Center facilities where Storage and Cloud servers will reside.

2) Responsible for the infrastructure

a. Facilities Distribution

b. Electrical Power

c. Cooling System

d. Communication

e. Scalability

f. Hardware provisioning for Storage and Processing

g. Maintenance

h. Security

3) Deployment of Private Cloud Solution

4) Management of Cloud Services

Cloud Data Center Site Facility Topology distribution

The example topology design can be used to distribute Cloud Converged Infrastructure equipment across data centers. The workload is managed separately, as each data center can serve different users based on their physical location in the territory: users in the north use data center A, while users in the south more often use data center D. All data centers are interconnected, forming a single solution, and each site can act as a Disaster Recovery site for the others or perform offline data processing, depending on its utilization.
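The region-based routing with Disaster Recovery fallback can be sketched as follows. Only the north-to-A and south-to-D assignments come from the text; the other regions, the pairing of sites as Disaster Recovery partners, and the `route` helper are hypothetical.

```python
# Primary data center per region. "north" -> A and "south" -> D follow the
# text; the remaining assignments are illustrative assumptions.
PRIMARY_SITE = {"north": "A", "northeast": "B", "southeast": "C", "south": "D"}

# Hypothetical Disaster Recovery pairing between sites.
DR_PARTNER = {"A": "B", "B": "A", "C": "D", "D": "C"}


def route(region, healthy_sites):
    """Send a user to the regional data center, or to its Disaster Recovery
    partner when the primary site is unavailable."""
    primary = PRIMARY_SITE[region]
    if primary in healthy_sites:
        return primary
    partner = DR_PARTNER[primary]
    if partner in healthy_sites:
        return partner
    raise RuntimeError(f"no site available for region {region!r}")


if __name__ == "__main__":
    print(route("north", {"A", "B", "C", "D"}))  # all sites up
    print(route("north", {"B", "C", "D"}))       # site A down, falls back
```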

Each data center has a full set of Cloud, Storage, and Big Data appliances installed and running independently of the others, providing high security and resilience for the stored data.

Here is a summary of DELL EMC, VMware and Cisco products that cover all Cloud data center infrastructure:

> DELL EMC Storage - Enterprise-class storage products that meet Big Data application requirements for performance, scalability, availability, data protection and security.

> VMware Virtualization - Virtualization and platform orchestration for multiple environments without the need for additional specific hardware.

> Pivotal HD - Open source enterprise Hadoop distribution that combines Apache Hadoop and Hadoop-related project features with Pivotal’s value-added extensions and analytics support.

> Pivotal Big Data Suite - Full suite of integrated technologies for easily creating data-driven applications that meet any data processing and advanced analytics requirement at scale.

Full Overview

Private Cloud Model

Private cloud model infrastructure provides exclusive services to a single organization comprising

multiple units of the business. In this case, each division of government administration is considered

a business unit.

The services offered include self-service and multi-tenancy for the states and their cities, with the ability to provision virtual machines and platforms as they are required and to change computing resources on demand. The model also offers Big Data products as a Service to the business units. The whole service can be controlled through chargeback tools that track computing usage and charge only for the resources used. That may sound strange, as the Federal organization has already used taxpayers’ money to build the architecture, but from another perspective each unit of the organization pays back charges that can be applied to the development of the system and the qualification of personnel.
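The chargeback idea can be sketched as a small metering calculation. The metric names, the rates, and the usage records below are hypothetical; real chargeback tools define their own units and pricing.

```python
# Hypothetical price per unit of each metered resource.
RATES = {"cpu_hours": 0.05, "storage_gb_month": 0.02}


def chargeback(usage_records):
    """Total the charges per government unit from (unit, metric, amount)
    records, so each unit pays only for what it actually used."""
    totals = {}
    for unit, metric, amount in usage_records:
        totals[unit] = totals.get(unit, 0.0) + amount * RATES[metric]
    return totals


if __name__ == "__main__":
    usage = [
        ("State A", "cpu_hours", 1000),
        ("State A", "storage_gb_month", 5000),
        ("City X", "cpu_hours", 200),
    ]
    for unit, total in chargeback(usage).items():
        print(f"{unit}: {total:.2f}")
```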

The main idea behind the Cloud system is a centralized operational design offering data protection and services across all government units.

There are two variations of the private cloud model:

1 - On-premise private cloud, hosted by an organization within its own data centers.

2 - Externally hosted private cloud, hosted outside the organization and managed by a third-party organization.

DELL EMC and VMware offer the VBlock solution on the Cloud Architecture design

VCE VBlock

VBlock is a full range of pre-integrated servers with shared storage, network devices, virtualization, and management, all tied together for easy scalability. It has been created to offer extreme efficiency in Hadoop deployments.

This solution is easily deployed, meeting immediate needs for on-demand architecture scalability.

Big Data Processing Architecture Design

DELL EMC offers the Greenplum architecture, in which data is automatically partitioned across multiple 'segment' servers, and each 'segment' owns and manages a distinct portion of the overall data. All communication goes through the network interconnect, with no disk-level sharing or contention to be concerned with; this is also known as a 'shared-nothing' architecture.
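The shared-nothing idea can be sketched in a few lines. This is a conceptual illustration, not Greenplum code: CRC32 stands in for whatever hash function the database uses internally, and the row layout and helper names are hypothetical.

```python
import zlib


def segment_for(distribution_key, num_segments):
    """Pick the segment that owns a row by hashing its distribution key.
    CRC32 is an illustrative stand-in for the database's own hash."""
    return zlib.crc32(str(distribution_key).encode("utf-8")) % num_segments


def distribute(rows, key, num_segments):
    """Partition rows so that each segment holds a distinct portion of the data."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        segments[segment_for(row[key], num_segments)].append(row)
    return segments


def parallel_sum(segments, column):
    """Each segment aggregates only its own rows; only the small partial
    results travel over the interconnect to be combined."""
    partials = [sum(row[column] for row in segment) for segment in segments]
    return sum(partials)


if __name__ == "__main__":
    rows = [{"city_id": i, "value": i * 10} for i in range(100)]
    segments = distribute(rows, "city_id", 4)
    print([len(s) for s in segments])
    print(parallel_sum(segments, "value"))
```

Because a given key always hashes to the same segment, no two segments ever need to touch the same rows, which is what removes disk-level contention.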

Here MapReduce integration allows developers and DBAs to run both MapReduce and SQL in

Greenplum’s parallel data flow engine. MapReduce enables analytics to be run on petabytes of

data.
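For readers new to the model, the map, shuffle, and reduce phases can be sketched in plain Python. This runs in a single process and uses hypothetical contract records; it shows the shape of a MapReduce job, not the Greenplum or Hadoop API.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (key, value) pair for each input record."""
    yield record["city"], record["value"]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def run(records):
    """Run the three phases in sequence; a real cluster would run the map
    and reduce calls in parallel on many machines."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    contracts = [
        {"city": "City A", "value": 100},
        {"city": "City B", "value": 80},
        {"city": "City A", "value": 250},
    ]
    print(run(contracts))
```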

Pivotal HD for Big Data processing

The architecture in the diagram below shows Pivotal HD providing real-time, interactive, and batch processing in a single Hadoop platform.

Storage Products and Data Lake capability for high volume of data ingestion and future growth

SAN and NAS storage array architectures were not designed to store or protect data at large multi-petabyte capacity levels.

This new era of massive content storage requires a storage architecture based on an object storage model, which not only stores object data at these capacity levels but also does so at a manageable cost point.

DELL EMC Isilon

A pillar among Big Data appliances, the Isilon appliance is a NAS storage platform with multi-protocol support. It includes Hadoop support, which eliminates inefficient storage silos, and provides a first-class security system and very high speeds.

Elastic Cloud Storage Appliance (Storage as a Service)

ECS is a solution for large, geographically distributed storage. A powerful hyper-scale, geo-distributed object and HDFS storage platform, ECS can efficiently store billions of objects while delivering data anywhere, to any device. It also allows geo-scale analytics and offers multi-cloud APIs to connect seamlessly to public clouds. It operates from petabyte to exabyte scale and beyond.

Conclusions

The goal of this article is to highlight several areas where new technologies like Big Data and Cloud Computing can help leaders and politicians improve the daily tasks of their administration.

With the Big Data examples shown, leaders can build their own strategy, following paths already in use by other nations, to evaluate the issues present in today’s society and, from there, develop plans to address them one by one with efficiency.

While Big Data is a data analysis methodology enabled by recent advances in technologies and architecture, Cloud Computing offers the fastest route to a secure and robust Big Data implementation.

Big Data analytic tools can bring much more efficiency and results-driven action plans at low cost. The benefits are not limited to cost reduction in the sectors evaluated by the tools; they also include preventive measures that can be applied in the shortest time. A better Nation means a better future for the generations that will come after us. It also means a better world for everybody to live in, with better approaches to sustainability that have the lowest impact on the planet’s natural resources while still generating goods for trade.

This article has covered not all, but the main topics of a Nation’s daily routine: a growing population, expanding use of natural resources, improving productivity, and sharing knowledge and efficiency. The focus of the IT field is no longer on local infrastructure, such as servers or known local storage performance issues; it is on a whole new layer of IT infrastructure where resources are granted on demand and the processing power for huge volumes of data analytics is fully distributed across hundreds or thousands of machines.

Nations or sectors of society that have not yet adopted a Big Data strategy can start by engaging data scientist teams and making the tools mentioned here readily available to them through Cloud Services. As you may have noticed, the solutions available today can be ordered and ready to use within a few weeks. Data Scientists will be responsible for designing the applications and analytic systems that analyze all sorts of data that could be ingested.

A great deal of the information needed to start the analysis is already available from sensors, mobile devices, etc., and it is essential for a clear picture of a Nation’s actual statistics.

Big Data and Cloud Computing form a whole new area in the IT industry, and it is just at the beginning. New ideas and areas of application will come soon.

From the results of these statistics, not only leaders but also other members of society will be able to contribute to action plans that benefit society as a whole.

Appendix – List of Abbreviations

App – Application
BDE – Big Data Extension
BI – Business Intelligence
CCTV – Closed Circuit TV
CDR – Call Data Record
CRM – Customer Relationship Management
DB – Database
DCA – Data Computing Appliance, AKA “Greenplum”
DCN – Data Center and Cloud Networking
EDW – Enterprise Data Warehouse
ERP – Enterprise Resource Planning
FGV-Rio – Fundação Getulio Vargas, Rio de Janeiro (Getulio Vargas Foundation, Rio de Janeiro)
GemFire – In-memory distributed data grid
GPS – Global Positioning System
HAWQ – Parallel SQL query engine from Pivotal
HDFS – Hadoop Distributed File System
HR – Human Resources
IMR – Infant Mortality Rate
IoT – Internet of Things
IT – Information Technology
JDBC – Java Database Connectivity
MADlib – Big Data Machine Learning in SQL for Data Scientists
MapReduce – Programming model for parallel processing of large data sets using map and reduce operations
MPI – Message Passing Interface
MPP – Massively Parallel Processing
NAS – Network Attached Storage
ODBC – Open Database Connectivity
OLAP – Online Analytical Processing
OLTP – Online Transactional Processing
OS – Operating System
PC – Personal Computer
RFID – Radio Frequency Identification tag
RDBMS – Relational Database Management System
Rpart – Recursive Partitioning and Regression Trees
SaaS – Software-as-a-Service
SAS – Statistical Analysis System
SAN – Storage Area Network

Footnotes - References

1 Gazzaneo, Rodrigo C. – “TCC – BigData”, From PDF copy. March 2015.
2 Stalin, Joseph. Statement Part “Nation”, From Wikipedia, Web. https://en.wikipedia.org/wiki/Nation.
3 “State (polity)”, From Wikipedia, Web. https://en.wikipedia.org/wiki/State_(polity).
4 “Big Data”, From Wikipedia, Web. https://en.wikipedia.org/wiki/Big_data.
5 http://www.sas.com/en_us/insights/big-data/hadoop.html.
6 “Cloud Computing”, From Wikipedia, Web. https://en.wikipedia.org/wiki/Cloud_computing
7 “Sustainable Development”, From Wikipedia, Web. https://en.wikipedia.org/wiki/Sustainable_development.
8 Statistics Data about Brazilian Nation. From Wikipedia, Web. https://en.wikipedia.org/wiki/Brazil
9 http://www8.hp.com/br/pt/industries/public-sector.html?compURI=1087532#.VqwSpFMrKuU
10 “A utilização do Big Data na agropecuária”, Web. https://www.scotconsultoria.com.br/noticias/artigos/35032/A-utiliza%C3%A7%C3%A3o-do-Big-Data-na-agropecu%C3%A1ria. 4th June 2014.
- “Good Agricultural Practice”, From Wikipedia, Web. https://en.wikipedia.org/wiki/Good_agricultural_practice.
- http://www.usp.br/portalbiossistemas/?p=6510
- Mariz, Cristiano. “Da terra brotam os dados”. Web. http://exame.abril.com.br/revista-exame/edicoes/1074/noticias/da-terra-brotam-os-dados. 10 Feb 2014.
11 http://www.brazilianbeef.org.br/texto.asp?id=18.
- http://business-reporter.co.uk/2013/11/05/day-6-big-data-beef/
12 “‘Big data’ israelense ensina produtores que uma xícara de alegria significa maior produtividade”, Ministry of Economy, “State Of Israel”. Web. http://itrade.gov.il/brazil/?p=4918. 23 June 2015
13 http://blog.qmagico.com.br/educacao/big-data-servico-da-educacao/
- https://www.linkedin.com/pulse/20140719195519-471910-o-valor-do-big-data-no-mercado-educacional
- Campos, Newton. “Ensino Adaptativo: O Big Data na Educação”. Web. http://educacao.estadao.com.br/blogs/a-educacao-no-seculo-21/ensino-adaptativo-o-big-data-na-educacao/ 26 April 2014
14 Green-Lerman, Hillary. “Visualizing Personalized Learning”, “The Knewton Blog”, Web. https://www.knewton.com/blog/adaptive-learning/. 10th September 2015.
15 http://www.kdnuggets.com/2015/03/how-big-data-can-improve-lives-poor.html
16 http://www.brookings.edu/blogs/africa-in-focus/posts/2015/06/02-big-data-poverty-senegal
17 http://cio.com.br/opiniao/2015/09/01/o-big-data-a-servico-da-sociedade/
18 “i-Canada”, Web. http://www.icanadanetwork.ca/about-i-canada/. 2011
19 https://www.theaudienceagency.org/insight/using-the-evidence-to-reveal-opportunities-for-engagement
- http://www.rio2016.com/culture/
20 http://exame.abril.com.br/tecnologia/noticias/policia-de-sp-usara-sistema-baseado-em-big-data-para-combater-crime
- Jansen, Thiago and Matsuura, Sergio. Web. http://oglobo.globo.com/economia/tecnologia/autoridades-recorrem-controverso-cruzamento-de-dados-na-prevencao-de-crimes-14453408. 4 Nov 2014
- http://bigdatabusiness.com.br/estrategia-politica-saiba-como-ela-pode-se-beneficiar-com-a-mineracao-de-dados-2/
21 http://embrapii.org.br/aeroespacial-e-defesa-2/
22 http://www.splunk.com/pt_br/solutions/industries/public-sector/defense-and-intelligence-agencies.html
23 http://sistema4.saude.sp.gov.br/sahe/documento/leitosredeHospitalar.pdf
24 http://www.scielo.br/scielo.php?script=sci_arttext&pid=S2237-96222015000200325
- http://www.nature.com/nature/journal/v527/n7576_supp/full/527S1a.html
- https://www.coursera.org/course/bigdatabrasil
- Infinit Healthcare. Web. http://www.infinithealthcare.com/resource-center/whats-up-in-healthcare-nov-16-22-2014/. 26 November 2014
25 http://bigdatabusiness.com.br/category/politica/
26 Info extracted from - MCAFEE and BRYNJOLFSSON, 2012 - http://www.admin-magazine.com/HPC/content/download/5604/49345/file/IDC_Big%20Data_whitepaper_final.pdf
27 http://bigdatabusiness.com.br/como-smart-cities-usam-big-data/
28 https://www.technologyreview.com/s/526541/smart-wind-and-solar-power/
29 http://www.forbes.com/sites/bernardmarr/2015/04/28/nepal-earthquake-using-big-data-in-a-crisis/#6076da81532f
- “Analyzing Disaster Big Data to Support Disaster Prevention with Timely and Accurate Forecasts”, Web. http://journal.jp.fujitsu.com/en/2015/05/29/02/. 29 May 2015.
30 “Corruption”, From Wikipedia, Web. https://en.wikipedia.org/wiki/Corruption
- http://blog.opovo.com.br/bigdata/2014/08/11/dados-abertos-mais-transparencia-para-acoes-governo/
31 http://www.emc.com/big-data/expertise.htm
- http://www.emc.com/big-data/solutions.htm

Dell EMC believes the information in this publication is accurate as of its publication date. The

information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” DELL EMC MAKES NO

REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE

INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED

WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying and distribution of any Dell EMC software described in this publication requires an

applicable software license.

Dell, EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries.