BIG DATA & BIG DATA ANALYTICS IN THE MAURITIAN CONTEXT

Upload: donhi

Post on 23-Apr-2018


Business Analytics Consultant

Specialities

Design and development of OLTP,Decision Support and Business Intelligence Systems

Techniques

Web Front end : HTML, CSS3, JavaScript
Web Back end : PHP, ASP.NET
Business Intelligence : OBIEE, Microsoft BI, Tableau
Data Mining : R, SAS
Frameworks : .NET, Cocoa, ADF

+5 years experience in the sector
Industries: Insurance, Health, Telecom and Retail

OTHER INTERESTS : IOT, RealTime OS, Human Machine Interfaces, Bioengineering, Distributed computing.

Email : [email protected]

SAS

A bit about me

❑ Why all the fuss about Big Data ?
❑ The 4 V’s of Big Data
❑ Big Data abroad, investment by industry and technology landscape
❑ Mauritius and Big Data
❑ Big Data analytics, architectural comparisons
❑ The Future for Big Data
❑ Big Data, the dangers

Agenda

❑ Big Data refers to massive, often unstructured data that is beyond the processing capabilities of traditional data management tools.

❑  Big Data can take up terabytes and petabytes of storage space in diverse formats including text, video, sound, images etc.

No single standard definition…

“Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, paradigm shifts and analytics to manage it and extract value or hidden knowledge from it.

What’s all the fuss about BIG DATA ?

The 4 V’s of Big Data.

Big Data Abroad - Investment by Industry

Big Data - Landscape

FACTS
-> Small country, 1.3 million people
-> Small number of large companies who use Decision Support Systems
-> Spread of the word is the best marketing tool used !

But ?

• Business stakeholders can already begin to aggressively build their organizations’ big data capabilities, be it in fisheries, manufacturing, banking, insurance, retail, telecom, marketing and health.

• This effort will take time, but the impact of developing a superior capacity to take advantage of big data will bring enhanced competitive advantage over the long term.

• Therefore it is well worth the investment to create this capability to be strategically positioned to deliver services to companies or governments in Africa and even beyond in terms of technology consulting and outsourcing offerings from Mauritius.

Mauritius and Big Data ?

-> Lack of clear data policies and legislation. As an ever larger amount of data is digitized and travels across organizational boundaries, there is a set of policy issues that will become increasingly important, including, but not limited to, privacy, security, intellectual property, and liability. Laws will have to be enacted accordingly.

-> Lack of investment in public and private infrastructure. Adoption will require the deployment of new technologies (e.g., storage, computing, and analytical software) and techniques (i.e., new types of analyses). Data capturing technologies and the Internet of Things (IoT) need to be encouraged. Smart Cities can act as a test lab for this kind of adoption, but it is better to start with existing towns.

-> Lack of proper research and scientific facilities

-> Unavailability of significant talent pool.

Big Data – Challenges for technology adoption in Mauritius

-> Organizational change and talent. Organizational leaders often lack the understanding of the value in big data as well as how to unlock this value and hence are reluctant to invest in both infrastructure and talent. In addition, many organizations in Mauritius today do not structure workflows and incentives in ways that optimize the use of big data to make better decisions and take more informed action.

-> Access to data

• To enable transformative opportunities, companies and government will increasingly need to integrate information from multiple data sources.

• In some cases, organizations will be able to purchase access to the data.

• In other cases, however, gaining access to third-party data is often not straightforward. The sources of third-party data might not have considered sharing it. Sometimes, economic incentives are not aligned to encourage stakeholders to share data.

A stakeholder that holds a certain dataset, for example, telecommunication data, might consider it to be the source of a key competitive advantage and thus would be reluctant to share it with other stakeholders. Other stakeholders must find ways to offer compelling value propositions to holders of valuable data.

Big Data adoption in Mauritius - What needs to be rethought ?

-> Industry structure and public sector management

Sectors with a relative lack of competitive intensity and performance transparency, along with industries where profit pools are highly concentrated, are likely to be slow to fully leverage the benefits of big data. For example, in the public sector, there tends to be a lack of competitive pressure that limits efficiency and productivity; as a result, the sector faces more difficult barriers than other sectors in the way of capturing the potential value from using big data.

The Mauritian health care system is another example of how the structure of an industry affects how easy it will be to extract value from big data. This is a sector that lacks transparency on cost and quality performance.

As these examples suggest, organization leaders and policy makers will have to consider how industry structures and public sector performance management framework could and should evolve in a big data world if they are to determine how to optimize value creation at the level of individual firms, sectors, and the Mauritian economy as a whole.

Big Data adoption in Mauritius - What needs to be rethought ?

No to Monopolies or Oligopolies

-> New technologies to increase connectivity and processing power (e.g., storage, computing, and analytical software). Data capture, cleansing and evaluation at all levels.

-> New, innovative business flows (i.e., new types of analyses). The range of technology challenges and the priorities set for tackling them will differ depending on the data maturity of the public or private institution. Legacy systems and incompatible standards and formats too often prevent the integration of data and the more sophisticated analytics that create value from big data.

-> Investment in better training and skill set

-> Better visibility abroad (which can drive investment locally)

Big Data adoption in Mauritius - What needs to be rethought ?

Create more transparency
Making relevant data more readily accessible across otherwise separated departments can sharply reduce search and processing time. Integrating data from R&D, engineering, and other business/manufacturing units to enable concurrent engineering can significantly cut time to market and improve quality.

Segment populations for targeted action
Big data helps organizations create highly specific segmentations and tailor products and services precisely to each segment's needs. This approach is well known in marketing and risk management but can be game changing elsewhere, for example in the public sector, where an ethos of treating all citizens in the same way is commonplace.

Enhancing human decisions with automated algorithms
Sophisticated analytics can substantially improve decision making, minimize risks, and unearth valuable insights that would otherwise remain hidden.

Innovating new business models, products, and services
Big data can help Mauritian companies create new products and services, enhance existing ones, and invent entirely new business models. Access to data from the African continent can help local manufacturers improve the development of the next generation of products and create innovative after-sales service offerings.

How can Big Data help Mauritius ?

Characteristics | OLTP | Analytics / BI / DSS
Source of Data | Original Operational Data | Data comes from OLTP or/and other sources
Purpose of Data | Business Tasks | Planning, Decision Making
Amount of Data per Transaction | Usually Small (compared to DSS) | Can be very Large
Type of Data | Detailed | Summarised
Timeliness of Data | Must be current | Current and historical
Updates to Data | Frequently | Less Frequently, Updates only
Database Design | Normalized - Lots of Tables | De-normalized
Number of transactions/users | Many (100's to 1000's) | Few
Response Time | Quick | Reasonable but quick is better !
Queries | Standard/Simple Queries | Complex/Aggregations
Database Operations | Insert, Modify, Delete, Update, Read | Read
Type of Processing | Well-Defined | Ad hoc

Stepping into Big Data Analytics - Comparison between an OLTP and a traditional Analytics Platform
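To make the comparison above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns and the figures are made up for illustration; a real OLTP system and a real analytics platform would normally be separate databases with different designs.

```python
import sqlite3

# Hypothetical in-memory database standing in for an operational system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL, sold_on TEXT)"
)

# OLTP-style work: many small writes, one row per business transaction.
rows = [
    ("North", 120.0, "2015-06-01"),
    ("South", 80.0, "2015-06-01"),
    ("North", 50.0, "2015-06-02"),
]
with conn:  # commits on success, rolls back on error
    conn.executemany(
        "INSERT INTO sales (region, amount, sold_on) VALUES (?, ?, ?)", rows
    )

# OLTP-style read: fetch one detailed, current record by its key.
one_sale = conn.execute(
    "SELECT region, amount FROM sales WHERE id = ?", (1,)
).fetchone()

# Analytics-style read: scan many rows and summarise them.
totals_by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)

print(one_sale)
print(totals_by_region)
```

The first query touches a single row and must be quick; the second reads the whole table and returns a summary, which is the kind of workload a DSS is designed for.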

Challenge -> Real-time / Big Data analytics requires rethinking infrastructure, integration, reporting and presentation in terms of the 3 V’s.

Traditional Business Intelligence – Architecture Example

Pull and push
A pull operation uses fixed time intervals to check for new data, while in a push operation data is loaded into the target once a change appears.
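The difference between pull and push can be sketched in a few lines of Python. The `Source` and `PullLoader` classes below are hypothetical and exist only for this illustration; in practice the pull loader's `poll` method would be called by a scheduler at a fixed interval.

```python
from typing import Callable, List


class Source:
    """Hypothetical data source that supports both loading styles."""

    def __init__(self) -> None:
        self.rows: List[dict] = []
        self._subscribers: List[Callable[[dict], None]] = []

    def subscribe(self, callback: Callable[[dict], None]) -> None:
        self._subscribers.append(callback)

    def add_row(self, row: dict) -> None:
        self.rows.append(row)
        # Push: every subscriber receives the change as soon as it appears.
        for callback in self._subscribers:
            callback(row)

    def rows_since(self, offset: int) -> List[dict]:
        return self.rows[offset:]


class PullLoader:
    """Checks the source for new rows each time poll() is called."""

    def __init__(self, source: Source) -> None:
        self.source = source
        self.offset = 0
        self.target: List[dict] = []

    def poll(self) -> int:
        new_rows = self.source.rows_since(self.offset)
        self.target.extend(new_rows)
        self.offset += len(new_rows)
        return len(new_rows)


source = Source()
push_target: List[dict] = []
source.subscribe(push_target.append)
puller = PullLoader(source)

source.add_row({"id": 1})
source.add_row({"id": 2})

# The push target is already up to date; the pull target waits for its next poll.
pending_before_poll = len(puller.target)
loaded = puller.poll()
```

The trade-off is latency against load: push delivers changes immediately, while pull lets the target decide when to pay the cost of checking.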

A staging area, or landing zone, is an intermediate storage area used for data processing during the extract, transform and load (ETL) process. The data staging area sits between the data source(s) and the data target(s), which are often data warehouses, data marts or other data repositories. Data staging areas are often transient in nature, with their contents being erased prior to running an ETL process or immediately following successful completion of an ETL process. There are staging area architectures, however, which are designed to hold data for extended periods of time for archival or troubleshooting purposes.
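A transient staging area of the kind described above could look roughly like this with Python's built-in sqlite3 module. The table names (`staging_customers`, `dim_customers`), the cleaning rules and the sample rows are assumptions made for the sketch, not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE staging_customers (raw_name TEXT, raw_country TEXT);
    CREATE TABLE dim_customers (name TEXT, country TEXT);
    """
)


def run_etl(conn: sqlite3.Connection, extracted_rows: list) -> int:
    """Load extracted rows via a transient staging table; return the target row count."""
    with conn:  # one transaction for the whole run
        # Erase leftovers from any previous run before starting.
        conn.execute("DELETE FROM staging_customers")
        # Extract: land the raw rows in the staging area untouched.
        conn.executemany(
            "INSERT INTO staging_customers VALUES (?, ?)", extracted_rows
        )
        # Transform and load: clean the values and drop rows without a name.
        conn.execute(
            """
            INSERT INTO dim_customers (name, country)
            SELECT TRIM(raw_name), UPPER(TRIM(raw_country))
            FROM staging_customers
            WHERE TRIM(raw_name) <> ''
            """
        )
        loaded = conn.execute("SELECT COUNT(*) FROM dim_customers").fetchone()[0]
        # Erase the staging area again after successful completion.
        conn.execute("DELETE FROM staging_customers")
    return loaded


extracted = [("  Asha ", "mu"), ("", "mu"), ("Ravi", " za ")]
loaded = run_etl(conn, extracted)
```

An archival-style staging area would skip the final delete and add a load timestamp to each staged row instead.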

Traditional Business Intelligence – Visualization

❑ Descriptive

❑ Diagnostic

❑ Predictive

❑ Prescriptive

Big Data Analytics – Types
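As a rough illustration of the four types, the sketch below runs each one on the same made-up monthly sales figures. All numbers, the regional breakdown and the reorder rule are assumptions chosen for the example; real diagnostic and prescriptive analytics are far richer than a `max` and an `if`.

```python
from statistics import mean

months = [1, 2, 3, 4]
monthly_sales = [100.0, 110.0, 120.0, 130.0]  # made-up figures

# Descriptive: what happened?
average_sales = mean(monthly_sales)

# Diagnostic: why did it happen? Here, which region drove the growth
# between the first and the last month (made-up breakdown).
growth_by_region = {"North": 25.0, "South": 5.0}
main_driver = max(growth_by_region, key=growth_by_region.get)


# Predictive: what is likely to happen? Fit a least-squares trend line.
def fit_line(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(months, monthly_sales)
forecast_month_5 = slope * 5 + intercept

# Prescriptive: what should we do about it? A simple decision rule.
stock_on_hand = 125.0  # hypothetical stock level
action = "reorder" if forecast_month_5 > stock_on_hand else "hold"
```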

Big Data Analytics - Architectural Example of a customer analytics platform

Decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm.

Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages.

Classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.

Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).

Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be apportioned to different sources of uncertainty in its inputs.
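Two of the techniques defined above, clustering and classification, can be sketched on a toy customer dataset. The code below is a minimal one-dimensional k-means followed by a nearest-centroid classifier; the spend values and starting centroids are made up, and a production platform would use a tested library rather than this hand-written version.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Group values around the nearest centroid, then move each centroid to its group mean."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(value - centroids[i])
            )
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


def classify(value, centroids):
    """Assign a new observation to the category with the closest centroid."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


# Hypothetical monthly spend per customer.
spend = [10, 12, 11, 95, 100, 105]
centroids, clusters = kmeans_1d(spend, centroids=[0.0, 50.0])

# A new customer spending 90 falls into the high-spend segment (index 1).
segment = classify(90, centroids)
```

Clustering discovers the two segments without labels; classification then reuses those segments as known categories for a new observation.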

Big Data Analytics and Traditional BI together

Hadoop -> An open source (free) software framework for processing huge datasets on certain kinds of problems on a distributed system. Its development was inspired by Google’s MapReduce and Google File System. It was originally developed at Yahoo! and is now managed as a project of the Apache Software Foundation.

HBase -> An open source (free), distributed, non-relational database modelled on Google’s Bigtable. It was originally developed by Powerset and is now managed as a project of the Apache Software Foundation as part of the Hadoop project.

MapReduce -> A software framework introduced by Google for processing huge datasets on certain kinds of problems on a distributed system. Also implemented in Hadoop.

Hive is a data warehouse infrastructure built on top of Hadoop for providing data summarization, query, and analysis.

Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms focused primarily in the areas of collaborative filtering, clustering and classification.

Pig is a high-level platform for creating MapReduce programs used with Hadoop.
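The MapReduce idea mentioned above can be shown with the classic word count. This is a single-process simulation in plain Python, not the Hadoop API: the function names `map_phase`, `shuffle` and `reduce_phase` are chosen for this sketch, and on a real cluster each phase would run in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


documents = ["big data in Mauritius", "Big Data analytics"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

print(counts)
```

Because each map call sees only one document and each reduce call sees only one key, the work can be split across machines without the pieces needing to talk to each other.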

Big Data Analytics – Common visuals

Big Data Analytics Landscape

The future for Big Data – The Internet of Things !


When Big Data meets Big Brother - The Dangers

BIG BROTHER CAN BE AN AUTHORITARIAN OR INTRUSIVE GOVERNMENT OR A PRIVATE ENTITY SELLING YOUR INFORMATION

Next !

Business Intelligence and Data Mining Group in MSCC

Once a month meet-up

Everybody who’s interested in statistics or maths and software development

University students

Business People

The Mauritius Software Craftsmanship Community http://www.meetup.com/fr/MauritiusSoftwareCraftsmanshipCommunity/events/223170974/

Ceridian Learning Center http://www.ceridian.mu/about_ceridian_academy.aspx

THANKS