Big Data in Malaysia - Emerging Sector Profile

Upload: sandra-hanchard

Post on 31-Oct-2014

DESCRIPTION

Big Data is a buzz term with global traction. But while interest and awareness are high, is that buzz being converted effectively into significant economic activity in Malaysia? What are the inhibitors to driving Big Data solutions? And where are the opportunities we should nurture? In this presentation, Big Data Malaysia shares insights from a new survey of a range of stakeholders in this emerging industry.

TRANSCRIPT

  • 1. Converting Buzz into Activity in the Big Data Ecosystem
    Sandra Hanchard & Tirath Ramdas, Big Data Malaysia
    Big Data World Show, JW Marriott, Kuala Lumpur, November 2013

  • 2. 10 events since May 2012 Plus 9 during
    A networking group for people passionate about data.
    We talk about applications in social media analytics, financial data, consumer insight, telecommunications, etc.
    Participants from end-users, vendors, academia: engineers, analysts, managers, professors, entrepreneurs.
    Wide technical breadth (Hadoop, R, Greenplum, Postgres, Cassandra, MongoDB, 0MQ, Prudsys, Storm, Acunu Analytics, Google BigQuery, Dremel, Oracle, Datasift, Tableau, GPUs, NetApp, Hive, HBase, AWS, MySQL, Teradata).
    www.BigDataMalaysia.org

  • 3. Rationale
    Opportunities and inhibitors to Big Data activity in Malaysia?
    Who's interested vs. involved?
    What is the current and future capacity for big data skills?
    Where are the critical gaps in skills?
    What are the soft inhibitors, including data access, regulation and perception?

  • 4. About the survey..
    Methodology:
        • Collection over October 2013
        • Distribution via Big Data Malaysia email & social media channels and partners
        • 108 respondents over 90 organizations
        • Not intended to be representative. Illustrative of Big Data Malaysia network
    Content: About you Enabling Your Organization, End-Uses, Skills, Capabilities, Data Sources
    This deck contains preliminary findings: we are generating hypotheses for further analysis.

  • 5. Lucky draw sponsors

  • 6. Technical assistance

  • 7. Half respondents in ICT, even spread across other sectors
    [Chart] Respondents by sector (n = 108): Information and communications technology (48%); Marketing services; Professional, scientific, and technical services; Educational services; Media and Classifieds; Finance and Insurance; Government; Manufacturing; Not applicable e.g. I am a student; Other.
    Preliminary findings - November 2013

  • 8. 48% management, 40% practitioner
    [Chart] No. respondents by role: CEO, or equivalent; CIO/CTO, or equivalent; Senior/middle manager; C-level Other; Software engineer; Analyst; Data scientist; Technical consultant (e.g. pre-sales); Academic or Scientist; Full time student; Other.
    C-level: 24%. Senior/middle manager: 24%. Practitioner: 40%.
    Emerging number of Data scientists (6%). Who are they?
    Practitioners include: Software engineers, Analysts, Data scientists, Technical consultants and Academic/Scientists.

  • 9. Top management concentrated in SMEs, other roles spread
    [Chart] Three panels (C-level, Senior/middle manager, Practitioner) showing No. respondents by organization size, ordered Enterprise > Boutique: Employees exceeding 1000; Employees from 200 to not exceeding 1000; Employees from 75 to not exceeding 200; Employees from 5 to less than 75; Employees of less than 5; Self-employed.

  • 10. Some complacency in how data leveraged, bullish anticipated spend
    [Chart] "My organization is effectively deriving tangible benefits from our organizational data assets" (% respondents, n=108). Scale: Strongly agree, Agree, Neutral, Disagree, Strongly disagree. Chart values, in transcript order: 21%, 39%, 29%, 4%, 7%.
    [Chart] "How do you expect your spend on Big Data to change in 2014 compared to 2013?" (No. Managers). Categories: Increasing by more than 25%; Increasing by between 10% and 25%; Increasing by between 5% and 10%; Increasing by less than 5%; No change. Don't know/prefer not to say = 16.
    Managers (n=75) defined as respondents who selected yes to having managerial responsibility in their role.

  • 11. ..but actual and planned headcount remain low
    [Chart] Big Data Headcount (HC): Current vs. Next 12 months (No. Managers). Headcount bands: None yet, 1-3, 4-10, 11-20, 21-50, > 50, Don't know/Prefer not to say. Quadrants: Low HC, High growth; High HC, High growth; Low HC, Low growth; High HC, Low growth.
    Analysis based on Managers; excluding those who selected Don't know/Prefer not to say for Current headcount. n=68.

  • 12. ICT high outsourcing intent suggests technical fragmentation
    [Chart] "How willing would you be to outsource high-skill tasks in your Big Data initiatives to external consultants?" (% respondents, ICT vs. Non-ICT). Scale: Quite/Extremely willing, Moderately willing, Slightly willing, Not at all willing.
    [Chart] "How do you expect your spend on Big Data to change in 2014 compared to 2013?" (ICT vs. Non-ICT). Categories: Increasing by more than 25%; Increasing by between 10% and 25%; Increasing by less than 10%.
    Group sizes shown: n=29, n=33.
    Boutique opportunities amongst ICT, but Non-ICT also priority targets given matching expected spend.

  • 13. Whatever Big Data offers, we're all focused on the customer
    End-uses ranked by priority of relevance (All, n=108):
        • Customer behavioural profiling
        • Customer service and/or experience
        • Competitive intelligence
        • Customer retention
        • Social trends monitoring
        • Customer acquisition
        • Customer cross-sell and/or up-sell
        • Forecasting supply and demand
        • Brand monitoring
        • Product and service innovation
        • Operational cost management
        • Risk management
        • Supply-chain monitoring
        • Infrastructure and assets monitoring
        • Compliance and regulatory issues
    [Chart] Relevance scale: Very relevant, Moderately relevant, Slightly relevant, Not at all relevant. Industry swing by All: Non-ICT, ICT (group sizes shown: n=56, n=52).
    ICT: greater production focus as well as Forecasting.
    Non-ICT: greater external focus e.g. social trends & brand monitoring.
    Where are your blind spots? Are you too internally or externally focused with aspirations? Big data providers: Specialise or provide holistic solutions?
    Swing is an indexed number based on relevance score.

  • 14. Managers have higher aspirations for End-uses vs. practitioners
    End-uses ranked by priority of relevance (All*, n=95; All* excludes Students and Others):
        • Customer behavioural profiling
        • Customer service and/or experience
        • Customer retention
        • Customer cross-sell and/or up-sell
        • Customer acquisition
        • Competitive intelligence
        • Social trends monitoring
        • Forecasting supply and demand
        • Product and service innovation
        • Brand monitoring
        • Operational cost management
        • Risk management
        • Supply-chain monitoring
        • Infrastructure and assets monitoring
        • Compliance and regulatory issues
    [Chart] Relevance scale: Very relevant, Moderately relevant, Slightly relevant, Not at all relevant. Role function swing by All: Practitioner, Manager (group sizes shown: n=44, n=51).
    Practitioners have stronger internal org. focus.
    Managers need to align perception of Big Data's value throughout organization with business objectives. What internal end-uses are being overlooked by Managers?
    Managers' key priorities: Profiling, Retention, Acquisition.
    Swing is an indexed number based on relevance score.

  • 15. Desired skills: distributed data analysis, nod to fundamentals
    Capabilities ranked by priority of need (All, n=108):
        • Specialised data analysis, modeling, simulation (op. research, machine learning)
        • Distributed systems (e.g. Hadoop) deployment and/or administration
        • Fundamental computer science and/or software engineering
        • Industry-specific/domain knowledge
        • Applied math and/or statistics
        • Web/mobile development and/or visualization
        • Research experience from any quantitative discipline
        • Business (strategy, marketing, product development, etc.)
        • Hardware/sensor design
    [Chart] Need scale: Critical need, High need, Little need, No need. Industry swing by All: ICT, Non-ICT (group sizes shown: n=56, n=52). Skill-level swing by All: Intermediate/advanced, Entry (group sizes shown: n=47, n=61).
    Those with Intermediate/Advanced skills prioritize distributed systems.
    Those with Basic tech skills prioritize domain knowledge.
    Soft skills undervalued (Strategy, marketing etc.).
    No love for Internet of Things (hardware/sensor design skills).
    Swing is an indexed number based on need score.

  • 16. Specific skills: ICT demands Hadoop, Non-ICT wants algorithms
    [Chart] Normalized Priority Score for ICT (n=35), Combined (n=61) and Non-ICT (n=26).
    Key priority areas:
        1. Big and Distributed Data (Hadoop, MapReduce)
        2. Algorithms (computational complexity, CS theory)
        3. Machine Learning (decision trees, neural nets, SVM, clustering)
        4. Back-End Programming (e.g. Java/C++/Python/Rails/Objective C)

  • 17. Desired capabilities: uncovering and visualizing patterns in real-time
    Capabilities ranked by priority of relevance (All, n=108):
        • Real-time insights from real-time data streams
        • Uncovering patterns (e.g. segments, correlations) from multi-structured data sets
        • Visualizing/presenting insights
        • Data discovery and exploration across many data sources
        • Statistical analysis on big working data sets (>100GB)
        • Automated decision making
        • Machine-generated data (e.g. log files, periodic diagnostics)
        • Content and sentiment from online media (e.g. social media)
        • Efficiently and safely storing large data sets on infrastructure controlled by my org.
        • Image, video, and audio data
        • Physical sensor networks (e.g. "Internet of Things")
    [Chart] Relevance scale: Very relevant, Moderately relevant, Slightly relevant, Not at all relevant. Industry swing vs. All: Non-ICT, ICT (group sizes shown: n=56, n=52).
    Clear desire to derive meaning from Big Data (i.e. insights).
    Those in a non-ICT role more likely to prioritize content; social and media-rich data (i.e. very unstructured data).
    Swing is an indexed number based on relevance score.

  • 18. Strong willingness to outsource bodes well for service providers
    Capabilities ranked by priority of relevance (All Managers, n=75):
        • Visualizing/presenting insights
        • Uncovering patterns (e.g. segments, correlations) from multi-structured data sets
        • Real-time insights from real-time data streams
        • Data discovery and exploration across many data sources
        • Statistical analysis on big working data sets (>100GB)
        • Automated decision making
        • Efficiently and safely storing large data sets on infrastructure controlled by my org.
        • Machine-generated data (e.g. log files, periodic diagnostics)
        • Content and sentiment from online media (e.g. social media)
        • Image, video, and audio data
        • Physical sensor networks (e.g. "Internet of Things")
    [Chart] Relevance scale: Very relevant, Moderately relevant, Slightly relevant, Not at all relevant. Willingness to Outsource swing vs. All Managers: Unwilling, Willing (group sizes shown: n=38, n=37).
    Top priority by Managers to communicate meaning from Big Data (visualization & insights).
    Desire for discovery: leverage through better information management.
    Swing is an indexed number based on relevance score.

  • 19. High-commitment managers had higher prioritization of capabilities
    Capabilities ranked by priority of relevance (All Managers): same capability list and order as slide 18.
    [Chart] Relevance scale: Very relevant, Moderately relevant, Slightly relevant, Not at all relevant. Forward-capacity swing vs. All Managers: High-FC, Cautious-FC (group sizes shown: n=46, n=31, n=15).
    Only exception was media-rich data.
    Forward capacity measures resource commitment (headcount, expected headcount, expected spend, willingness to outsource), which will increase likelihood of delivering desired Big Data outcomes.
    Swing is an indexed number based on relevance score.

  • 20. Organizations are sourcing data both internally and externally
    [Chart] "My organization uses sources for Big Data initiatives primarily from the following:" Categories: None yet; Internal data; Open-access third-party data (incl. government); Proprietary third-party data; Combination.
    ICT (n=45), chart values in transcript order: 7%, 27%, 20%, 11%, 36%.
    Non-ICT (n=47), chart values in transcript order: 17%, 28%, 6%, 6%, 43%.
    [Chart] Respondents who selected "Very high" for relevance of Customer behavioural profiling (n=45). Categories, sorted by highest %: Combination; Internal data; Open-access third-party data; Proprietary third-party data; None yet. Bars illustrate swing against remaining sample; values in transcript order: 104, 59, 146, 313, 87.
    ICTs more open sources; Non-ICTs should prioritize content opportunities e.g. data journalism.
    Respondents focused on profiling customers more dependent on third-party data e.g. social media.

  • 21. Open data from Government needed to support ecosystem
    [Chart] "Having access to some government data does/will create valuable Big Data opportunities for me" (% of respondents, n=108). Scale: Strongly agree, Agree, Neutral, Disagree, Strongly disagree. Chart values, in transcript order: 21%, 19%, 49%, 4%, 6%.
    What kinds of government data will assist you?
        • Demographic; socioeconomic; behaviour
        • Population (online & offline), migratory
        • Crime (by ethnic group); Border Security
        • Public & community services (utilities, health, education)
        • Location (by utility); GIS
        • Financial; credit
        • Weather

  • 22. Government data needed for benchmarking & consolidation
    "In order for us to understand the needs of Malaysians, statistical data from the population census is important to identify correlations with internal behavioural data"
    - Head of Decision Science, MNC bank
    "Government has many different sets of survey data, collected from various sources. For instance, Ministry of Health data can be sourced from private and general hospitals, clinics or consultancies. Big data offers a mechanism to speed up consolidation of all this information, without any processing delays to configure each and every source."
    - Jin Chuan Tai, Director, ChrysaSys Consulting Sdn Bhd

  • 23. General wariness of red-tape, PDPA identified as biggest concern
    [Chart] "Do you believe your local legal/regulatory environment a hindrance to your planned Big Data initiatives?" (% of respondents, n=108). Scale: Not at all a hindrance, Not a hindrance, Neutral, Moderate hindrance, Severe hindrance, Don't know. Chart values, in transcript order: 2%, 13%, 47%, 17%, 6%, 15%.
    What hindrances in particular?
        • Personal Data Protection Act 2010 (PDPA)
        • Red-tape
        • Bureaucracy and organizational structure
        • Data compliance and data risk
        • Loss of data
    15% Don't know: greater education around PDPA needed?
    "Regulation does prevent some of our products or product features being deployed in some markets."
    - Head of Development, global marketing analytics firm

  • 24. Less than a third willing to upload internal data to a Cloud service
    [Chart] "I am willing to upload my internal data to third-party infrastructure (e.g. a public cloud)" (% of respondents, n=108). Scale: Strongly agree, Agree, Neutral, Disagree, Strongly disagree. Chart values, in transcript order: 20%, 6%, 37%, 23%, 14%.
    [Chart] High-forward Capacity respondents (n=31), same scale. Bars illustrate swing against Cautious-forward Capacity; values in transcript order: 175, 113, 41, 210, 153.
    Vendors need to identify concerns (privacy, migration cost, perceptions) specific to Malaysian professionals.
    Opinion is divided starkly amongst High-forward capacity respondents.

  • 25. Some conclusions..
        • Strong Grass-roots and Mid-tier support for Big Data in Malaysia. Unknown at local Enterprise, C-level.
        • Aspirations high but Human Resource commitment a concern.
        • Immediate skilling priorities include R and Hadoop.
        • Opportunities for boutique firms in Malaysia to meet specialist technical needs with global punch.

  • 26. ..further research
        • How are Big Data budgets being split between infrastructure / personnel / data?
        • Who qualifies as a Data scientist: what skills do they have, and what value do they add?
        • How can Big Data activity contribute to Malaysia's push to become a high-income nation by 2020?

  • 27. 5 Questions to ask yourself today
        • Where are your aspirational blind-spots? Internal vs. External.
        • Are your aspirations unrealistic?
        • Are you committing resources aggressively enough?
        • Are you prioritizing the right blend of skills?
        • Are you driving/participating in cultural change for Big Data advocacy?

  • 28. Get in touch
    Tirath Ramdas, Founder: [email protected]
    Sandra Hanchard, Researcher: [email protected]
    You can find us on..
    Feedback / questions / comments
    What do you need to know?
    How can Big Data Malaysia serve your organization?
    www.bigdatamalaysia.org