Mapping presentation THAG big data from space

Upload: bartosz-szkudlarek

Post on 21-Mar-2017

Category:

Engineering


TRANSCRIPT

Page 1: Mapping presentation THAG big data from space

Outsourcing Partner Big Data from the Space | 21st of February 2017 | Slide 1

Big Data From the Space

2017 Cycle 1st Mapping Meetings

Outsourcing Partner Sp. z o.o.
Bartosz Szkudlarek
Piotr Zaborowski

Page 2: Mapping presentation THAG big data from space

We are Outsourcing Partner, a technology company that specializes in custom software development and Big Data.

Outsourcing Partner capabilities on Big Data

Page 3: Mapping presentation THAG big data from space

What can we bring?

Proven technology experience with common Big Data technologies.

Outsourcing Partner capabilities on Big Data

Page 4: Mapping presentation THAG big data from space

Outsourcing Partner capabilities on Big Data

Our experience

Six projects in the Big Data domain, using Hadoop, Apache Spark and other technologies. Two of them were projects for ESA in which the goal was to integrate and visualize massive data.

Page 5: Mapping presentation THAG big data from space

Outsourcing Partner capabilities on Big Data

Project name: European Space Agency, GEOSS Web Portal
Project subject: Data hub portal with search functionality. The objective of this project was to integrate two different data sources on one visualisation platform.
Technologies: HTML5, maps, microservices
Numbers: More than 1 mln results; two different data sources.

Project name: European Space Agency, The EO Web (the new website)
Project subject: Proof of concept for a new content architecture for the new Earth Observation website, which collects all information from domain services. The primary purpose of this project was to identify and unify content elements from all EO websites and to provide an efficient mechanism for harvesting, indexing, categorising and searching content.
Technologies: HTML5, Elasticsearch, Kibana, Google Analytics
Numbers: More than 50 websites with technical documentation about missions, instruments and other information connected with the area; over 500k resources identified.

Status legend: Operational, constant dev; Proof of concept; Operational, complete

Page 6: Mapping presentation THAG big data from space

Outsourcing Partner capabilities on Big Data

Project name: Telecommunication sector, T-Mobile, Messaging broker
Project subject: Communication exchange between operator and customer is crucial. We implemented a communication broker for text messages (SMS, push notifications, etc.) which allows monitoring of:
• message efficiency (how many reminders are needed to make a user settle delayed payments, which message makes a user buy an additional internet limit),
• message rules (the system must not send information about an available internet package if the user has already ordered that package through any channel).
Technologies: Cassandra, Apache Hadoop
Numbers: The system handled 15 mln customers and 3 mln messages per day.

Project name: Telecommunication sector, T-Mobile, Customer self-service system
Project subject: To provide services for customers, a telecommunication company needs many backend systems to support its operations. The aim of this project was to implement a mechanism for collecting information about user activities in one repository. Apart from the massive amount of data, the challenge was to unify information from many domain systems.
Technologies: ELK stack (Elasticsearch, Logstash, Kibana)
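The message rule and the efficiency measure described for the messaging broker can be illustrated with a minimal Python sketch. This is not the code of the T-Mobile system; the names `Customer`, `may_send_offer` and `reminders_per_payment` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Customer:
    customer_id: str
    # Packages the customer has already ordered, through any channel (assumed model).
    ordered_packages: Set[str] = field(default_factory=set)


def may_send_offer(customer: Customer, package: str) -> bool:
    """Message rule: do not advertise a package the customer already ordered."""
    return package not in customer.ordered_packages


def reminders_per_payment(reminders_sent: int, payments_received: int) -> Optional[float]:
    """Message efficiency: average number of reminders sent per received payment."""
    if payments_received == 0:
        return None  # no payment yet, so efficiency is undefined
    return reminders_sent / payments_received
```

A broker built along these lines would call `may_send_offer` before queuing a promotional message and aggregate `reminders_per_payment` per message template to compare wording variants.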

Page 7: Mapping presentation THAG big data from space

Outsourcing Partner capabilities on Big Data

Project name: Betterware (retail company), Sales support prediction mechanism
Project subject: Together with Betterware, we analyzed the sales data and singled out the sets of products which are frequently bought together by consumers.
Technologies: Apache Spark, Apache Hadoop, Tableau Software
Numbers: 8,500 customers, 1k orders daily, machine learning algorithms trained on 1 mln operations (5 years of historical data).

Project name: Insurance company, Integration of customer databases
Project subject: The aim of the project was to integrate data about customers and their operations stored and managed by four different domain systems. The scope of the project covers:
- data analysis and provision of an integrated domain model,
- programming of ETL transformations,
- visualization of data based on Tableau Software.
Technologies: Tableau Software, Amazon AWS
Numbers: 4 domain systems, more than 30 unified domain objects.
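The idea behind the Betterware analysis, finding products that are frequently bought together, can be sketched as simple pair counting. The slides do not say which algorithm was used, and the project itself ran on Apache Spark, so the plain-Python function below (with the hypothetical name `frequent_pairs` and made-up sample orders) is only an illustration of the concept.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(orders, min_support):
    """Return product pairs that occur together in at least min_support orders."""
    pair_counts = Counter()
    for order in orders:
        # Deduplicate and sort so each pair is counted once per order
        # and always appears in the same (a, b) form.
        for pair in combinations(sorted(set(order)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Made-up sample data, not taken from the project.
orders = [
    ["mop", "bucket", "sponge"],
    ["mop", "bucket"],
    ["sponge", "cloth"],
    ["mop", "bucket", "cloth"],
]
print(frequent_pairs(orders, min_support=3))  # {('bucket', 'mop'): 3}
```

Counting pairs directly is quadratic in the size of each order; at larger scale a dedicated frequent-itemset algorithm on a distributed engine would be the more usual choice.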

Page 8: Mapping presentation THAG big data from space

Outsourcing Partner capabilities on Big Data

Project name: Electoral Committee of a Candidate for President of the Republic, Media monitoring
Project subject: During the 2015 presidential election in Poland, we monitored social media (Facebook, Twitter, YouTube) and digital newspapers. From the data fetched from social media we prepared reports on the popularity of particular candidates, the sentiment of comments connected with candidates, and the leaders of communities (blog authors, influencers). We also built an algorithm that estimates trending phrases for the political domain.
Technologies: Apache Hadoop, Apache Spark, HTML5 reports
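The slides do not describe how trending phrases were estimated. As a rough illustration only, one naive approach compares how often a term appears in the current time window against the previous one. The function name `trending_terms`, the whitespace tokenisation and the add-one smoothing are all assumptions made for this sketch.

```python
from collections import Counter


def trending_terms(previous_posts, current_posts, min_count=2):
    """Rank terms by current frequency relative to the previous window."""
    prev = Counter(w for post in previous_posts for w in post.lower().split())
    curr = Counter(w for post in current_posts for w in post.lower().split())
    scores = {}
    for term, n in curr.items():
        if n >= min_count:
            # Add-one smoothing so terms unseen in the previous window
            # do not cause a division by zero.
            scores[term] = n / (prev[term] + 1)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))
```

A production system would also need proper tokenisation for Polish, stop-word removal and multi-word phrases, none of which are covered here.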

Page 9: Mapping presentation THAG big data from space

Comments on Big Data from Space (OSP)

• Security and legal recommendations should be defined if applicable
  • 4.4 Services and data location with legal consequences: policy is not referenced. Harmonisation should clarify the strategy and policy towards data localisation and the promoted licensing models and technologies.

• Services reliability
  • 4.5.4.6 Suitable services reliability or reproducibility for industrial development. An availability model should be applied (as in the Ground Segment) to platforms exposed to crowdsourcing/industry, to secure their business models.

• Openness to other data sources
  • 4.5.4.1 Some proven decision support solutions are based on combining satellite data and other data sinks, so an architecture supporting data integration should be considered.

Page 10: Mapping presentation THAG big data from space

Comments on Big Data from Space (OSP)

• Consider the exchangeability aspect
  • 4.5.X.1 Interoperability and exchangeability can be one of the strategy dimensions in cross-domain data flow.

• Consider the architectural influence of data organisational spread on usability (technical)
  • 4.5.2.1 For data organisation (like CDM), the sharding policy should be aligned to current and potential requirements. The solution should enable generic interfaces to be built with awareness of the underlying data distribution but not of the infrastructure.

• Openness vs predictability on provided platforms
  • 4.5.3.1 Orchestration and prioritisation: in a shared environment, extensive experiments may coexist with operational periodic/stream analytics, which should not be degraded.

Page 11: Mapping presentation THAG big data from space

OSP suggestions for Big Data from Space Roadmap

Apart from precise needs and solutions mapping, we suggest considering the following:

• A standardisation advisory body constituted for new/ongoing initiatives would enable natural alignment to process and consideration of new approaches.
• A services and technologies catalogue covering state-of-the-art, recommended and applied setups, for members and industry review.
• A layered architecture of systems should be proposed and adopted, with common interfaces to enable interoperability, relocation and development of third-party value-added services, with respect for blurred borders and dependencies.
• Federalisation tactics should be consolidated.
• Industry-related, legal and security policies and strategies should be defined.

Page 12: Mapping presentation THAG big data from space

Conclusions on Big Data from Space from OSP

The most valuable Big Data projects come from interdisciplinary teams that can juggle data from many different sources.

Page 13: Mapping presentation THAG big data from space

Conclusions on Big Data from Space (OSP)

Data scientists are mostly mathematicians and physicists. A significant part of them start their experiments with sample datasets such as Iris or Lena. Why can't they use the Agency's resources?

Page 14: Mapping presentation THAG big data from space

Conclusions (OSP)

As an SME with long experience in the software and Big Data domain, we recognise the following challenges in unlocking data potential according to 5.2 European Strategic Interests:

• High entry threshold: data is closed to non-domain industry companies and research units.

• Current ESA big data exploitation projects are siloed: there is no collaboration or competition, and no place for a processing workflow.

• There is (possibly) an evaluation gap: resources managed by the Agency are valuable but unevaluated, and there are no (or not many) mechanisms for collecting community feedback and evolving.

• Great data and services are of undefined reliability and partly unpredictable.

Page 15: Mapping presentation THAG big data from space

Conclusions (OSP)

Useful tools to deal with pitfalls of Big Data exploitation:

• Focusing on potential customers, the Agency should put effort into promoting and exposing the value of the data.

• The data platform should be as open and simple as possible (the Open Data principle).

• Implement mechanisms of collaboration: define subsets, rate and evaluate, share ideas, experiments and results, extend, and finally create a processing chain.

• Deliver reliable services meeting industry needs, or enable commercial federalisation/transition to business of value-added services.