SUPPORTED BY

GUIDE TO BIG DATA

CONTENTS:

INTRODUCTION

SOFTWARE FRAMEWORKS FOR BIG DATA

RECOMMENDATION ENGINES

CHOOSING THE RIGHT BUSINESS INTELLIGENCE TOOL

REAL-TIME DATA CAPTURE FOR BIG DATA

BIG DATA FOR SMB RETAIL

BIG DATA IN TRAVEL

OUTSOURCING YOUR BIG DATA REQUIREMENTS

COMPANY PROFILES

We were really excited to collaborate with MDeC to write and publish this guide to Big Data.

Having the chance to work with multiple smaller technology companies from Malaysia has been a real education and has enabled us to produce a guide that is more distinctive and varied in content than when we collaborate with larger MNCs.

Big Data is an evolving discipline with many emerging technologies.

It’s a time of technical evolution. Typically at such a stage in the IT cycle, the really exciting innovation happens in smaller independent technical companies. Such companies sometimes get consumed by larger MNCs and if they hit something “hot” they may become larger MNCs themselves.

By working with specialist local companies, we have been able to explore and explain the key areas of Big Data, and do so with expert guidance and input from technologists developing cutting-edge technologies that push the boundaries of areas like Neuronal Network Databases, Recommendation Engines and Data Capture.

Our thanks go to MDeC for supporting this guide and linking us with MSC status companies to create a truly collaborative and informative guide.

Yours in Data & Storage,

Allan Guiam, Editor, Data & Storage ASEAN

INTRODUCTION FROM THE EDITOR

NOTE FROM MDEC CEO – DATUK BADLISHAM GHAZALI

Big Data Analytics is not simply a technology proposition; it is also about how organisations can structure new business models, drive into new markets and target growth more effectively. The sooner businesses realise this, and align their IT and business teams accordingly, the faster they will see the benefits and gain a competitive advantage from Big Data Analytics. Recognising this, MDeC is pleased to host Big Data Week in Kuala Lumpur for the second time. I am also pleased that we embarked on a collaboration with Data & Storage Asean to produce this informative guide and feature some of our local players who are harnessing opportunities arising from Big Data Analytics. This is important because our responsibility at MDeC does not end with supporting MSC status companies; we are also tasked with helping the wider IT community grasp important technical concepts so that they continue to invest in and benefit from them.

If you have been researching Big Data, you will no doubt have heard of Hadoop - the popular software framework for storing and processing Big Data.

But what is Hadoop? Is it the only emerging platform for processing Big Data? Are there alternative technologies that may be better suited to Big Data tasks?

Hadoop, or more formally Apache Hadoop, was originally developed out of papers published by Google that described its distributed file system and a processing model called MapReduce. Without getting too deep into the technology, MapReduce splits a large job, such as indexing millions or billions of data items, into small tasks that many machines work on in parallel before the partial results are combined.

Hadoop is built on a clustered computing environment where data is stored across multiple nodes running commodity hardware linked through a network. Because processing is shared, MapReduce can keep analysing huge data sets quickly as more processors and storage are added to the cluster.
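To make the map and reduce steps concrete, here is a minimal single-machine sketch in Python that builds a small search index. It is not Hadoop code: the function names and the two sample documents are illustrative assumptions, and on a real cluster each node would run the map step over its own share of the data.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Map step: emit a (word, doc_id) pair for every word in one document."""
    return [(word.lower(), doc_id) for word in text.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(word, doc_ids):
    """Reduce step: collapse all values for one key into a sorted list of unique ids."""
    return word, sorted(set(doc_ids))

def build_index(documents):
    pairs = []
    for doc_id, text in documents.items():
        # On a cluster, each node would map only the documents it stores.
        pairs.extend(map_phase(doc_id, text))
    return dict(reduce_phase(word, ids) for word, ids in shuffle(pairs).items())

# Hypothetical sample documents, invented for illustration.
docs = {1: "big data needs storage", 2: "Hadoop stores big data"}
index = build_index(docs)
print(index["big"])      # documents that contain the word "big"
```

Because every map call looks at one document only, and every reduce call at one word only, the work can be spread over as many nodes as are available.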

Whilst Hadoop is arguably the most famous framework, there are viable alternatives, such as Disco (originally developed by Nokia), which is also based on MapReduce, and NoSQL databases such as MongoDB and Oracle NoSQL.

The challenge of Big Data is not only in its scale, but also in how we might want to analyse it. We need to be able to capture the data and store it in a way that allows flexibility in how we can access and utilise it. Consequently, the software frameworks used to handle Big Data will be varied and continue to grow in number.

Our company, Neuramatix, developed a patented technology called NeuraBASE which creates a datastore that emulates the way, we believe, the human brain works. Taking this neuronal networking approach allows us to process enormous amounts of data at high speed without the need for distributed clustered computing.

The NeuraBASE datastore is built on a parent-child dependency architecture. As a simplified example, the parent data or nodes might be the letters C, A and T. The child data or nodes created from these may be the words CAT, ACT and AT. These words are not stored as separate entities; they are created from pointers and dependencies to the parent nodes.
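The C, A, T example can be sketched in a few lines of Python. This is only a toy illustration of storing children as pointers to shared parents; the class and method names are hypothetical, and it makes no claim about how NeuraBASE itself is implemented.

```python
class PointerStore:
    """Toy store: each word is kept as a tuple of references to shared letter nodes."""

    def __init__(self):
        self.parent_ids = {}   # letter -> parent node id
        self.parents = []      # parent node id -> letter
        self.children = []     # child node id -> tuple of parent node ids

    def _parent_id(self, letter):
        # Create the parent node only the first time a letter is seen.
        if letter not in self.parent_ids:
            self.parent_ids[letter] = len(self.parents)
            self.parents.append(letter)
        return self.parent_ids[letter]

    def add_word(self, word):
        """Store a word as pointers to its parent letters and return its child id."""
        self.children.append(tuple(self._parent_id(ch) for ch in word))
        return len(self.children) - 1

    def recall(self, child_id):
        """Rebuild the word by following the pointers back to the parent nodes."""
        return "".join(self.parents[i] for i in self.children[child_id])

store = PointerStore()
ids = [store.add_word(w) for w in ("CAT", "ACT", "AT")]
print(len(store.parents))     # number of parent nodes shared by all three words
print(store.recall(ids[1]))   # word rebuilt from pointers
```

Adding a fourth word made of the same letters would create one more child entry but no new parent nodes, which is the sense in which growth slows as more data is ingested.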

This approach has two distinctive properties that suit Big Data well. Firstly, the more data we ingest, the smaller the rate of growth will be in NeuraBASE. This keeps the data processes manageable without having to move to distributed computing models. Secondly, as we ingest more data, more dependencies are learned, allowing the data model to perform learned tasks by recalling relevant dependencies at high speed. For example, the more bilingual text we input into our machine translation system, the more it improves its translation accuracy, with no further programming required.

The one source of Big Data that exceeds all others is genomics. In genomics research, the NeuraBASE approach allows us to search across an entire human genome significantly faster than any other method that employs a clustered computing approach to Big Data.

A detailed description of the Neuronal Networking Approach to Big Data is available here - http://www.neuramatix.com/ANeuronalnetworkapproachofex-pressingnetworkmotifs.pdf

The NeuraBASE approach to Big Data allows us to push the boundaries of artificial intelligence by using a neuronal model that learns and adapts, as opposed to rigidly following pre-determined programming logic and decision trees. For example, NeuraBASE can be used to enable a robot to learn to self-balance and walk on different terrains. Minimal programming rules are required as the robot learns from both its successes and failures. Two robots using NeuraBASE in different environments may learn at different rates and in different ways. As they continue to learn and adapt to their respective environments, the eventual style and length of their strides may end up being different, much like how we humans are able to learn and adapt to our environments.

We are only scratching the surface of what can be achieved with Big Data. Innovation is happening everywhere. Big Data should not only be about standardised platforms and architecture for problems we face today. It’s about creating technologies that will solve problems we will face tomorrow.

Whilst technologies like Hadoop may be grabbing the headlines today, companies and organisations looking to achieve extraordinary things should look beyond the large global players. If anything, Big Data teaches us about openness to possibilities.

SOFTWARE FRAMEWORKS - THE CORE OF BIG DATA BY ROBERT HERCUS

ROBERT HERCUS

Robert has over 40 years’ experience in Information Science, specialising in large-scale computing infrastructure and computationally intensive projects. This includes hardware and software development, military systems development, overseeing implementation of the IT infrastructure and development of the Touch ‘n Go prepaid card in Malaysia.

Robert is the co-founder of the Neuramatix Group of Companies and the inventor of NeuraBASE, a patented concept for the construction of neuronal networks using temporal or spatial association of neurons. He is also the Managing Director and co-founder of Malaysian Genomics Resource Centre Berhad (MGRC), one of Asia’s leading providers of genome sequencing and analysis, and genetic screening services.

If you are an Amazon customer it’s more than likely that you have been the beneficiary of one of the most successful and advanced recommendation engines around today. When you buy books on Amazon you will have noticed the recommendation that “people who purchased this book also purchased that book”. Reportedly 35 percent of Amazon’s business is generated from recommendations.

More and more business is conducted online. Consumers’ online behaviour and footprints are getting bigger and more detailed. With Big Data technology this behaviour can be collected and analysed in order to make customised and individual predictive recommendations for complementary purchases.

In the early days of online retailing, a one-size-fits-all approach, largely based on “tags” or “hard linking” product dependencies, was considered effective. To this day these simplistic types of recommendation engine still have a part to play for ecommerce sites selling a simple array of goods.

However, today we are seeing more businesses, including ecommerce sites like Amazon, employ more advanced ways to customise the consumer recommendation experience, an approach that is increasingly not just important but vital.

Today, building a recommendation engine is a true exercise in Big Data collection and predictive analytics. Significant real-time analysis needs to occur without impacting website performance.

For each person who visits a site, every action they perform on that site needs to be collected, along with personal (if available), geographic and product information. Other factors such as seasonal variations and special opening hours also need to be included in the data collection process. All of the data collected then needs to be matched and analysed.

As recommendation engines evolve to pull in and assess all of this data, two main models have stood out from the rest:

Content filtering, which is based on linking keywords and values in user profiles and product descriptions; and

Collaborative filtering, which is based on analysing large amounts of user behaviour, actions, responses and preferences in order to predict what a user will like by matching them to similar users.
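As a rough illustration of the collaborative model, the Python sketch below scores items a user has not yet rated by weighting other users' ratings with how similar those users are. The ratings, user names and item names are invented for the example, and cosine similarity is just one common choice of similarity measure, not necessarily what any production engine uses.

```python
from math import sqrt

# Hypothetical ratings data (user -> item -> rating), invented for illustration.
ratings = {
    "ana":  {"book_a": 5, "book_b": 4, "book_c": 1},
    "ben":  {"book_a": 4, "book_b": 5, "book_d": 5},
    "cara": {"book_c": 5, "book_e": 4},
}

def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated, weighted by the similarity of other users."""
    scores = {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], their_ratings)
        for item, rating in their_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("ana", ratings))
```

Here "ana" and "ben" rate the same two books highly, so the item only "ben" has rated ranks ahead of the item rated by the less similar "cara". A content filter would instead compare item descriptions with the user's profile and needs no data from other users.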

More recently, we have also seen the development of Hybrid Recommender Engines, where both content and collaborative filtering are deployed in a single solution. Research shows that for more complex buying processes a hybrid approach can result in more accurate and effective recommendations.

At Predictry my team is working on developing recommendations even further. We are incorporating a third area into our hybrid recommendation engine which we term social sentiment. This pulls data from social media, specifically data which expresses a sentiment such as “comments” or “likes”. We combine this with content and collaborative analysis to create very highly targeted, customised and individualised recommendations.
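One simple way to picture a hybrid of three signals is a weighted blend. The sketch below is an assumption made for illustration only: the weights, the 0-to-1 score scale and the product names are hypothetical, and it does not describe how Predictry actually combines its signals.

```python
def hybrid_score(content, collaborative, sentiment, weights=(0.4, 0.4, 0.2)):
    """Blend three signals, each assumed to be scaled to the range 0..1.

    The default weights are arbitrary placeholders, not tuned values.
    """
    w_content, w_collab, w_sentiment = weights
    return w_content * content + w_collab * collaborative + w_sentiment * sentiment

# Hypothetical per-product signals for one shopper.
candidates = {
    "tent":    {"content": 0.9, "collaborative": 0.2, "sentiment": 0.5},
    "stove":   {"content": 0.6, "collaborative": 0.8, "sentiment": 0.9},
    "lantern": {"content": 0.3, "collaborative": 0.4, "sentiment": 0.1},
}

ranked = sorted(candidates, key=lambda name: hybrid_score(**candidates[name]), reverse=True)
print(ranked)
```

In practice the weights would be chosen by testing against real purchase data rather than fixed by hand.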

Typically this technology is being used for upselling and cross-selling as part of the ecommerce process. However the uses could extend far beyond that, potentially giving you a totally individualised web experience no matter what you are searching for.

In conclusion, Big Data and the technologies that support it are enabling us to create recommendation engines which truly filter the “noise” on the internet and bring you directly to the items and content that you want to see. At Predictry we believe we are at the forefront of individualising customisation. However, traditional and established methods may still be appropriate for some applications.

When looking at adding recommendation to your e-business or e-service, the most important thing is to understand the complexity of your offering as well as the profile of your user base. Once you have a strong understanding of those factors, only then can you make the right choice of recommendation engine and provider.

RECOMMENDATION ENGINES BY ST CHUA

ST CHUA, PREDICTRY CEO

ST enjoys turning ideas into reality. His experience includes being the Global Business Developer of Rebate Networks, a German-based venture capital firm with interests in the daily-deal space spanning 30 countries. Before that, he was an integral part of Maxis’ CEO office, where he was heavily involved in new business and overseas expansion, including M&A deals worth USD1.3 billion. His role was expanded to become the retail project manager for Maxis, managing 29 retail outlets nationwide in Malaysia. Besides being a Director in Verve Technologies Sdn Bhd, he is also a co-founder of a health spa, an FMCG distribution company, a women’s fashion e-commerce site, a professional photo studio and a market entry consultancy. In his free time, he also mentors European and Russian-based start-ups. ST holds an MBA from INSEAD, France.

Business Intelligence (BI) has been around for many years, since before the buzz started around the term Big Data. BI is Data Analytics, and for years BI tools have enabled businesses to pull data from various sources into one analytical engine, permitting data to be sliced and diced in ways not previously possible with traditional relational databases.

In essence this is a large part of what Big Data is all about – taking huge amounts of raw data and turning that into something useful for business. The challenges of doing this have evolved with massive data sets needing to be collected from increasingly varied sources. Whilst BI is a mature market with established players, the technology around Big Data Analytics is still evolving.

As with all new and evolving technologies, we caution customers not to get caught up in the hype when selecting their BI tool. The most important thing is that you choose a solution and a provider that solves YOUR problems.

Before choosing your BI tool, it is important to think about the data you need to ingest and how easy or difficult that task will be with different tools. You also need to consider how regimented or flexible your plan will be, based on the variety and type of data you expect to analyse. These factors will affect the technology you use.

Today there are a number of preferred models for BI engines.

One of the most mature models is Relational Online Analytical Processing (ROLAP). ROLAP uses data that is already indexed in the RDBMS from which it is pulling data. This usually means that load times can be fast. But as it relies on “pre-indexed” data, ROLAP is not always as flexible in how it can analyse data. If you have tightly defined analysis for which SQL-type queries are well suited, then ROLAP can be the correct choice. ROLAP is best for analysing non-aggregatable data such as textual strings. However, when it comes to Big Data, ROLAP-based BI tools may often be too reliant on RDBMS indexing to deal easily and effectively with unstructured data. Also, because ROLAP is based on SQL-type querying, it can suffer from relatively slow query performance.

Another often used model is Multidimensional Online Analytical Processing (MOLAP). MOLAP tends to be built on a proprietary engine and uses a concept called pre-aggregation, which means the calculations are pre-generated as the datastore is created. The advantage is that highly complex queries can be run very quickly. The flexibility and speed makes it excellent for dealing with many of the demands of unstructured Big Data. However, there are limitations. Mainly due to the pre-aggregation process, loading data can be complex and may also need extra investment in skilled resources. In addition, this method of building the datastore can place limitations on scalability. MOLAP is more suited for analysing summary data from larger data sets.
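The trade-off between the two models can be shown with a small Python sketch. It is a simplification under stated assumptions: the sales rows and function names are invented, the "ROLAP-style" function simply scans rows at query time where a real tool would issue SQL, and the "MOLAP-style" cube is an in-memory dictionary rather than a proprietary engine.

```python
from collections import defaultdict

# Hypothetical sales rows, invented for illustration.
sales = [
    {"region": "north", "month": "jan", "amount": 100},
    {"region": "north", "month": "feb", "amount": 150},
    {"region": "south", "month": "jan", "amount": 80},
    {"region": "south", "month": "feb", "amount": 120},
]

def query_time_total(rows, **filters):
    """ROLAP-style: nothing is prepared at load; every query scans the rows."""
    return sum(r["amount"] for r in rows
               if all(r[key] == value for key, value in filters.items()))

def build_cube(rows, dims=("region", "month")):
    """MOLAP-style: pre-compute totals for every combination of dimension values.

    "*" stands for "all values" of a dimension. Loading is slower and the cube
    grows with each extra dimension, but a query becomes a single lookup.
    """
    cube = defaultdict(int)
    for row in rows:
        for mask in range(2 ** len(dims)):
            key = tuple(row[d] if mask & (1 << i) else "*"
                        for i, d in enumerate(dims))
            cube[key] += row["amount"]
    return cube

cube = build_cube(sales)
print(query_time_total(sales, region="north"))   # computed by scanning all rows
print(cube[("north", "*")])                      # same answer from a lookup
```

Each row is written into four cube cells here; with more dimensions that number doubles each time, which is one reason pre-aggregation makes loading more complex and can limit scalability.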

Big Data puts new demands on BI and Data Analytics. Raw data now comes from so many sources and there is an increasing need to perform BI in real-time, so we have to advance BI methodology to keep pace with evolving demands. We are seeing Agile Business Intelligence in response to the ever evolving questions we may want to ask based on expanding data sources.

At Speedminer we have developed what we call a columnar approach to BI. We use an underlying object-oriented database to import data from nearly any source, whether structured or unstructured. This approach enables us to break the data down into a format that is both scalable and flexible, in many cases giving us the benefit of both MOLAP and ROLAP. Our approach also means that applications can be implemented on our database, making data available to our BI engine the instant it is created. We believe this is unique and enables true real-time BI analytics. Our columnar approach is highly flexible, and adding new data sources into the BI engine is simple and fast.
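For readers unfamiliar with column-oriented storage in general, the following Python sketch shows the basic idea of turning records into one list per field. It is a generic illustration with invented records and names, and is not a description of Speedminer's database.

```python
def to_columns(rows):
    """Convert a list of records into one list per field (a column layout).

    Records may have different fields; missing values are stored as None,
    so a new source with extra fields just adds new columns.
    """
    columns = {}
    for i, row in enumerate(rows):
        for name, value in row.items():
            # A column seen for the first time is back-filled for earlier rows.
            columns.setdefault(name, [None] * i).append(value)
        for values in columns.values():
            if len(values) == i:          # this row had no value for the column
                values.append(None)
    return columns

# Hypothetical records from two sources with slightly different fields.
rows = [
    {"dept": "finance", "amount": 120.0},
    {"dept": "ops", "amount": 80.0, "region": "north"},
    {"dept": "finance", "amount": 50.0},
]

columns = to_columns(rows)
total = sum(v for v in columns["amount"] if v is not None)
print(total)
```

Summing "amount" reads a single list and never touches the other fields, which is the usual argument for column layouts in analytical workloads.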

BI tools deployed correctly will save time on existing business analytical tasks. For instance, our columnar approach has enabled a government department to reduce a monthly number-crunching exercise from 13 days to 1½ days.

To be clear, there is no obvious right choice. You need to understand your data, set your criteria and make decisions based on your objectives. Is it speed of load, ease of load, speed of query, or flexibility of query? Once you know your priority you can start to review your options based on an understanding of which methods are best suited to your needs.

CHOOSING THE RIGHT BUSINESS INTELLIGENCE TOOL BY THOMAS HOW

THOMAS HOW

Thomas How is the founder of Speedminer Sdn. Bhd. He has more than 20 years of IT experience, especially in consultation, development and implementation of Data Warehouse and Business Intelligence (BI) solutions. He has in-depth experience utilising efficient, scalable techniques dealing with large-scale data warehouses across a variety of industries, including customers such as the Department of Statistics Malaysia (DOSM), MATRADE and Celcom (M) Berhad. His experience with different customers’ requirements, architectures, and methodologies has enabled him to evolve a unique approach to data warehousing utilising best of breed components and methods. He actively works on data warehouse implementation and consults internationally.

There was a time, decades ago, when all you needed to start a new business was good products and friendly neighbourhood customer service. These days the hyper-competitiveness of local business environments requires more than a great product or superb customer service. It requires business intelligence that is dynamic and reflects the target market in real time. Companies that have access to the right information and are able to act upon it have the upper hand.

Market intelligence has been around for years and while the fundamentals have remained unchanged, the demand for greater scale and the ability to amalgamate the data, and act on the information has certainly challenged marketers’ abilities to deliver.

Core capabilities and differentiator

Raydar Research is a marketing information services and research agency. Our core competency is data collection. We developed software and mobile solutions to allow for large-scale data collection, the results of which can be used to help an organisation bridge the gap between consumer expectations and business deliverables, resulting in improved performance, satisfaction and loyalty. We have partnered with local carriers to identify target demographics and to engage the audience through their mobile devices, developing real-time market intelligence.

Our key differentiators include the ability to identify or define a very targeted group in terms of demographics, the ability to collect the data at the point of engagement as it happens, and the ability to provide instantaneous data analysis, which in turn allows organisations to make business decisions faster and in real time. Many of our clients find the ability to close a market intelligence project faster to be a strategic enabler, allowing them to execute targeted campaigns faster and more efficiently.

Most of our clients are from overseas markets like Singapore and other markets around the region. By partnering with local telco carriers, we are able to conduct location-based surveys using users’ mobile devices to home in on specific groups, for example travellers at Changi Airport in Singapore or shoppers on Orchard Road.
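A location-based survey ultimately needs a rule for deciding which devices count as being "at" a place. The Python sketch below uses a simple radius check with the haversine great-circle formula. The coordinates are approximate, the device records are invented, and the 2 km radius is an arbitrary assumption; it does not describe how Raydar Research or any carrier actually selects respondents.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = radians(lat2 - lat1)
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Approximate coordinates, used here only as illustrative assumptions.
CHANGI_AIRPORT = (1.3644, 103.9915)
ORCHARD_ROAD = (1.3048, 103.8318)

def devices_within(devices, centre, radius_km):
    """Return the ids of devices whose last known position lies inside the radius."""
    return [d["id"] for d in devices
            if haversine_km(d["lat"], d["lon"], *centre) <= radius_km]

# Hypothetical device positions.
devices = [
    {"id": "u1", "lat": 1.3650, "lon": 103.9900},   # close to the airport
    {"id": "u2", "lat": 1.3048, "lon": 103.8318},   # on Orchard Road
]

print(devices_within(devices, CHANGI_AIRPORT, radius_km=2.0))
```

Only the devices inside the chosen radius would then be invited to take the survey.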

Learning Experience

Having worked with so many organisations over the years, providing advice as well as developing innovative processes and technologies, it is interesting to discover the differences and similarities between our engagements. Businesses looking to harness market research for the first time, particularly startups or organisations strapped for resources, must take a moment to understand their customers’ profile. Starting small allows the business to focus its energy and resources, and hopefully helps establish a beachhead market that it can build from. Use highly targeted research, identify the addressable market, and develop a persona for the organisation. One thing is certain: you cannot establish a position in your target market sitting at a desk. Market research is as much a discipline as it is a practice. We suggest organisations pick a reliable data collection company to partner with for field work. Ask for credentials and use cases. Take time to question the service provider to make sure they understand your business and your industry.

The Future

Market research as a business practice is beginning to draw interest among Malaysian businesses. Unfortunately, many do not understand how market research can help them build strategy. Coupled with digital and social media, there is an opportunity for local companies to draw upon the success and experience of businesses in other countries to hone and harness the potential.

What is also needed is for industry associations and relevant government bodies to rally around these innovations and educate members and the general public. Together, relevant bodies need to dig deeper into how the technology can serve the interest of their members so that the industry, as a whole, is able to benefit from the potential of market research. With overseas businesses coming into Asia and regional players expanding into Malaysia, there is ample drive and opportunity to harness the potential of market research.

REAL-TIME DATA CAPTURE FOR BIG DATA BY KYM WONG

KYM WONG Kym Wong is the founder of Raydar Research, the creator of multiple mobile platforms to collect Big Data. He has a knack for translating real business needs into mobile applications that deliver value to customers and businesses such as banking, healthcare, manufacturing and consumer goods. www.raydarresearch.com

Simplifying Growth

Ranked 13th in the 2013 Global Retail Development Index report by management consultancy A.T. Kearney, Malaysia is forecast to experience solid retail growth over the coming years. Retail sales are expected to pick up following government initiatives to improve the retail sector, despite concerns over higher operating costs.

We believe these challenges can easily be overcome by better understanding customer buying behaviour. Web Bytes is one of the early pioneers of online retail management solutions: essentially hybrid point of sale (POS) systems that combine advanced technology, proprietary software and cloud computing to capture and analyse customer buying patterns.

Retailers have a micro view of their business, i.e., they know what products sell well in a specific store or mall. But beyond this, they do not understand why certain products sell at specific periods in the year. What we do at Web Bytes is aggregate the retail sales data to draw a picture of customer buying behaviors over any period of time at any area in Greater Kuala Lumpur.

Analytics is still at a very nascent stage within Malaysia’s retail sector. Many SMB retailers make decisions using only their gut feeling. For more than three years, we’ve been serving SMB retailers, most of whom are not very analytics-savvy. We give them the tools to analyse their business data in much the same way larger retail chains perform business intelligence to discover what products sell and the margins they make selling goods.

What is missing here is the ability to see the whole industry: for example, which products sell well in Suria KLCC in January versus Bukit Bintang, or why people flock to the Bangsar district. Imagine if a retailer had access to data showing which products move during January in MidValley. This information would alter the way retailers stock products.
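The kind of industry-wide view described here amounts to pooling point-of-sale records from many retailers and grouping them by place and period. The Python sketch below illustrates that idea with invented transactions; the field names, products and quantities are hypothetical and do not reflect Web Bytes' actual data model.

```python
from collections import defaultdict

# Hypothetical pooled POS records from several retailers, invented for illustration.
transactions = [
    {"retailer": "r1", "location": "Suria KLCC", "month": "2014-01", "product": "umbrella", "qty": 3},
    {"retailer": "r2", "location": "Suria KLCC", "month": "2014-01", "product": "sandals", "qty": 7},
    {"retailer": "r3", "location": "Suria KLCC", "month": "2014-01", "product": "umbrella", "qty": 6},
    {"retailer": "r4", "location": "Bukit Bintang", "month": "2014-01", "product": "sandals", "qty": 4},
    {"retailer": "r4", "location": "Bukit Bintang", "month": "2014-02", "product": "umbrella", "qty": 9},
]

def units_by_location(transactions, month):
    """Total units sold per product in each location for one month.

    The retailer id is deliberately dropped, so the result describes the
    area as a whole rather than any individual store.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for t in transactions:
        if t["month"] == month:
            totals[t["location"]][t["product"]] += t["qty"]
    return {location: dict(products) for location, products in totals.items()}

def top_product(totals, location):
    """Best-selling product in a location, by units."""
    products = totals[location]
    return max(products, key=products.get)

january = units_by_location(transactions, "2014-01")
print(top_product(january, "Suria KLCC"))
print(top_product(january, "Bukit Bintang"))
```

No single retailer in this example could see that umbrellas outsell sandals across the whole mall; that only appears once the records are combined.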

Like other SMBs, retailers are very conservative when it comes to investing in technology. One of the best things about our solution is that we are fully on the cloud. We fully support a pay per use model. Many of our SMB customers often start with simple offline POS systems. Over time as the business grows, we can scale our backend to support their business.

We do so by ensuring that we support and connect with standards-compliant POS infrastructure. You only need to install a client program to handle the peripheral. We developed an in-house application that is installed at retailer POS.

Today our customers include fruit stalls, apparel, footwear and grocery stores. Come 2015, when the government introduces a goods and services tax (GST), our customers will be comforted to know that our systems will comply with the new tax system on day one. That cannot be said of many of today’s systems.

Does having a business analytics or business intelligence (BI) solution constitute having a Big Data strategy?

That is a misconception. People think that just because they have deployed a BI solution they now have Big Data. In our view, true Big Data is having access to information beyond your business. A single retailer can only see what is happening at their store. If we are able to extend that knowledge to encompass the entire industry – that is Big Data!

Today we process a billion Ringgit of business annually. Does that give us Big Data? Absolutely not! We are working to rebuild our systems to accommodate Big Data. As a vendor and an industry, we face a number of challenges including adopting standards for how data is created, collected, analyzed, managed and destroyed. We are not there yet, but we are certainly getting there.

As with all young technologies, another key challenge is educating our retail customers about data security and privacy. Two years ago we achieved ISO certification and this has helped customers understand that the service we provide is secure: we do not access the details of the data, but merely aggregate the data to create a picture of the industry. This gives our customers the confidence that their data remains private and secure. But for Big Data to become widely accepted and used, we need industry and government support to qualify and certify technologies and standards.

BIG DATA FOR SMB RETAIL BY BOON-SHENG OOI

OOI BOON SHENG

Ooi Boon Sheng is the founder of Web Bytes Sdn. Bhd. From the age of 18, Boon Sheng has shown entrepreneurship, starting as a freelance programmer. Later he obtained his Bachelor Degree of Computer Science from Universiti Sains Malaysia (USM) and achieved the USM Gold Medal Award by Harvard Foundation for academic excellence. Whilst at university, Boon Sheng designed and engineered Resources Management Tools (RMT), which won several awards including the Gold Medal award at ITEX07 and Gold Medal at the British Innovation and Technology Show, UK 2007.

More recently Boon Sheng formed Web Bytes, a software technology company that delivers cloud-based solutions for the retail industry in the region. Web Bytes’s anchor product, Xilnex, currently powers thousands of retail users in over a thousand outlets. The solution, powered by Microsoft Azure, processes over RM 1 billion in live retail transactions annually for retail businesses across Malaysia, Singapore, Australia and Cambodia.

According to the 2013/2014 ITB World Travel Report, over 5.3 million Malaysians travelled overseas, spending US$7.2 billion in 2012. In the same year Malaysia welcomed 25 million tourists, contributing RM60.6 billion to the economy, according to the UNWTO. The Travel & Tourism industry contributed US$50.3 billion to the local economy (over 16 percent of total GDP), a figure forecast to reach US$54 billion in 2014.

The challenge for everyone in the industry is to figure out which parts of the country locals and foreign tourists will want to visit, and the goods and services they are willing to spend on. For many businesses, speculation is rife that Big Data may hold the key to unlocking the puzzle within the industry.

The operative word is ‘speculation’, for whilst many businesses have heard of Big Data and are interested in using it, many don’t know how to deploy and use it to their advantage. For most, knowledge of Big Data is limited to reading materials and snippets of information shared behind seminar walls. Indeed, when meeting interested parties, the most common remark we encounter is “I read about it. My company has strong direction to harness Big Data but we don’t know how to do it.”

As an early proponent of Big Data, we help them explore the potential of the technology and at the same time help them recognise that Big Data is a transformative journey that requires making big, sometimes disruptive, decisions if they are to achieve their Big Data objectives.

The reality of Big Data is that it forces you to wrestle with three key strategic and operational challenges: information strategy, data analytics and enterprise information management. Information strategy revolves around how you harness the information at your fingertips. Data analytics is about garnering insights from your data so you can predict future customer behaviour, trends and outcomes. Finally, because of the volume, variety and velocity of data coming into your systems, you need enterprise information management to drive innovation.

Experienced data analytics people will tell you that Big Data is complex technology that requires complex solutions. It is complex because a single system must take in large volumes of different types of data that arrive in rapid succession and change just as quickly. Handling this complexity therefore requires complex but powerful and highly scalable platforms like Hadoop.

At Fusionex, we live by the principle of simplicity. The many failed ERP and CRM systems have taught us that users will shy away from tools that are too difficult to use. Conversely, we saw from the success of the Apple iOS platform that users will flock towards technology that is easy to use. We recognised the capability of Hadoop to allow us to work with Big Data with minimal programming. We developed GIANT, Big Data analytics software that shields end-users from the complexities surrounding Big Data: low-level plumbing, hard-core Apache Hadoop and MapReduce programming.

We are able to do this because we work with some of the largest retailers, hypermarkets and malls to connect their transactional data and provide them with insight into customer behaviour. We also work with some of the largest hotels to help them identify which customers are buying or not buying their promotions, and to develop insights that drive future campaigns by clustering customer behaviour patterns.
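To give a feel for what clustering customer behaviour patterns can mean in practice, here is a minimal sketch of k-means, one common clustering technique. It is an illustration only and not a description of how GIANT works; the guest figures, the `kmeans` function and the choice of starting centroids are all assumptions made for the example.

```python
from math import dist

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignments = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster where it was
        if new_centroids == centroids:
            break  # nothing moved, so the clusters are stable
        centroids = new_centroids
    return assignments, centroids

# Hypothetical hotel guests: (stays per year, average spend per stay in RM)
guests = [(1, 200), (2, 250), (1, 220), (8, 900), (9, 950), (10, 1000)]

# Start from two of the guests as initial centroids (an arbitrary choice).
labels, centres = kmeans(guests, centroids=[guests[0], guests[3]])
print(labels)
print(centres)
```

In this toy data the occasional, lower-spending guests and the frequent, higher-spending guests fall into separate groups, which is the kind of segment a campaign could then target. With real data the features would normally be scaled first so that spend does not dominate the distance calculation.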

Many large enterprises are unfazed by complex technologies. In the travel industry, for instance, organizations already have structured data culled from CRM systems. They also have unstructured data captured from customer engagements within their call centres. The challenge is how to combine all this data to develop a 360-degree view of the customer. Many within the travel industry continue to struggle to understand customers proactively. They often operate in reactive mode, waiting for customers to approach them. Most marketing campaigns run in broadcast mode, designed to cast a wide net in the hope of making a few catches. They don’t have a clear view of individual customers, or even of adequately defined clusters of customers.
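As a rough illustration of combining structured and unstructured data into a single customer view, the sketch below joins CRM records with call-centre notes by customer ID. Everything in it is hypothetical: the customer records, the notes, the `TOPICS` keyword map and the `build_profiles` function are invented for the example, and simple keyword matching stands in for the text analytics a real deployment would need.

```python
from collections import defaultdict

# Hypothetical structured CRM records keyed by customer ID.
crm = {
    "C001": {"name": "Aisha", "bookings": 4},
    "C002": {"name": "Ben", "bookings": 1},
}

# Hypothetical unstructured call-centre notes as (customer ID, free text).
call_notes = [
    ("C001", "Asked about family beach packages in Langkawi"),
    ("C002", "Complaint about late refund"),
    ("C001", "Wants a refund policy for the beach trip"),
]

# Illustrative keyword-to-topic map (an assumption, not a real taxonomy).
TOPICS = {"beach": "beach holidays", "refund": "refunds", "family": "family travel"}

def build_profiles(crm, call_notes):
    """Attach the topics mentioned in each customer's notes to their CRM record."""
    topics_by_customer = defaultdict(set)
    for customer_id, note in call_notes:
        text = note.lower()
        for keyword, topic in TOPICS.items():
            if keyword in text:
                topics_by_customer[customer_id].add(topic)
    # Merge: every CRM record gains a sorted list of topics (empty if none found).
    return {
        cid: {**record, "topics": sorted(topics_by_customer[cid])}
        for cid, record in crm.items()
    }

profiles = build_profiles(crm, call_notes)
for cid, profile in profiles.items():
    print(cid, profile)
```

The point of the sketch is the join itself: once both kinds of data hang off the same customer ID, a marketer can see bookings and expressed interests side by side instead of in separate systems.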

The reality is that because Big Data is still in its infancy, many businesses lack the skillset, exposure and technical expertise to deploy Big Data. Many are also turned off by the upfront costs that come with building and deploying in-house Big Data systems.

Marketers of Big Data often speak of the three Vs of Big Data: volume, velocity and variety. We believe that for businesses to truly reap the benefits of Big Data, they need to aim for a fourth V: value.

For decades companies have been building repositories of data that let them see what happened in the past. But looking at historical data does not allow them to predict the future. What they want is to be able to predict what the customer will buy in the future. That for them is a powerful value. Big Data promises to make that happen.

BIG DATA FOR TRAVEL BY ISAAC JACOB

ISAAC JACOB As Vice President of Business Consulting, Isaac Jacob has more than a decade’s experience in functional consultancy and enterprise software project implementations.

Specialising in process re-engineering and change management, he has implemented Business Intelligence and analytics projects in various industries, involving large volumes of data and different data sources, to provide in-depth analysis of data and trends.

Jacob’s technical background, coupled with his strong domain knowledge across manufacturing, market research, financial and asset management, has led him to spearhead and successfully execute enterprise projects in countries including the U.S., Singapore, Malaysia, Holland, France, Hong Kong and the U.K.


The hype around Big Data has got many businesses believing that Big Data is the “holy grail” in their pursuit of things such as higher revenues, improved customer loyalty and business expansion. Most businesses want to get on the bandwagon and see what Big Data can do for them.

The problem is that Big Data is more than buying a Business Intelligence (BI) tool and running some reports. In truth the investment required to do Big Data properly is significant. In the same way the thought process and planning required for any Big Data initiative are also significant.

Outsourcing Big Data is a very viable option that almost any company planning to work in this area should consider. There are some specific situations where outsourcing is not possible, usually because of privacy and confidential information, but beyond these an outsource partner may very well be the best option for deriving value from Big Data.

The main benefit of outsourcing to a Big Data specialist is expertise. As an example, in “Pulsate”, our Big Data practice at Pulse Group, we have a combined 10 years of experience in this field, which becomes an immediate head start for companies new to Big Data that employ our services.

The starting point with an outsource partner is also beneficial in its own right. Before we can start, we need a tightly defined OBJECTIVE of what you want to achieve from your Big Data project. We have seen in-house Big Data projects fail simply because the aims and objectives were too loose from the outset. A reputable outsource partner will help you set those objectives. Just as importantly, they should be able to help you assess whether those objectives are achievable. Big Data promises so much that it takes experience to know in advance if it can deliver.

A major consideration in getting “into Big Data” is the investment required. Big Data needs to be done properly or not at all, and it’s not just about hiring an analyst to crunch some numbers. There is investment in new hardware and software, middleware, networking and bandwidth. In addition, the technology that you need to invest in, for example Hadoop or object-based databases, may be new to your IT people. So in addition to capital expenditure there may also be additional human resources required. Companies like Pulse that have years of Big Data experience already have the skills and the technology in place, so our customers benefit from the economies of scale that we already enjoy. For a short-term, temporary Big Data project, outsourcing may be the only viable way to achieve ROI. When establishing a permanent Big Data practice, this significant investment needs to be factored in, and when it is, outsourcing may still be the best option.

At Pulse we see one other huge value that we bring to the success of Big Data initiatives, and in our view it is a major factor in deciding whether to insource or outsource. This is “filtering noise”. Big Data is not just big: the data can come from multiple sources, and much of the content can be a distraction rather than a help. Cleaning and selecting the right data to analyse is a crucial but non-trivial task. Getting it wrong almost certainly sets you up to fail. So in some respects it is ironic that to work with Big Data you need to know Big Data to start with. In our experience you either need to buy in the experience and expertise or outsource it.
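To make “filtering noise” a little more concrete, here is a minimal sketch of two of the simplest cleaning steps: dropping records that are missing required fields and dropping exact duplicates. It is not a description of Pulse’s process; the `clean` function, the field names and the sample records are assumptions made for illustration, and real projects involve far more judgement about which data is worth keeping.

```python
def clean(records, required=("customer_id", "amount")):
    """Drop incomplete rows and exact duplicates, preserving the original order."""
    seen = set()
    kept = []
    for row in records:
        # Skip rows where any required field is missing or empty.
        if any(row.get(field) in (None, "") for field in required):
            continue
        # Use the full row content as the duplicate key.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        kept.append(row)
    return kept

# Hypothetical raw feed merged from a web shop and point-of-sale terminals.
raw = [
    {"customer_id": "C001", "amount": 120.0, "source": "web"},
    {"customer_id": "C001", "amount": 120.0, "source": "web"},  # exact duplicate
    {"customer_id": "", "amount": 75.0, "source": "pos"},       # missing customer ID
    {"customer_id": "C002", "amount": None, "source": "web"},   # missing amount
    {"customer_id": "C003", "amount": 40.0, "source": "pos"},
]

cleaned = clean(raw)
print(len(raw), "raw records ->", len(cleaned), "clean records")
```

Even in this tiny example more than half of the incoming records are discarded before any analysis starts, which is why deciding what counts as noise deserves as much attention as the analysis itself.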

Outsourcing Big Data initiatives is not for everyone! Outsource companies will generally have broad experience. However, a fully resourced in-house team with focused vertical experience is likely to be more attuned to the very specific needs of a particular vertical.

The point to note is that a Big Data project is a serious undertaking that will present difficult challenges, steep learning curves and an investment of time and money. Outsourcing should always be on the table as an option for any new Big Data project.

OUTSOURCING BIG DATA BY BOB CHUA

MR. BOB CHUA As CEO and Executive Chairman, Pulse Group PLC, Bob Chua is a Malaysian entrepreneur running Asia’s premier Marketing Analytics and Big Data Solutions Provider. He is the proud winner of many entrepreneurial awards, including the prestigious Ernst & Young Entrepreneur Award in 2008.

Bob advises companies on various public and private boards, and enjoys mentoring budding entrepreneurs in his spare time.

[email protected] Skype ID: bobchua


COMPANY PROFILES

The NeuraBASE Toolbox is a commercial software library for developing intelligent systems and devices based on a neuronal network model. The NeuraBASE Toolbox functions enable users to create intelligent applications by understanding the sequences and frequency of events that occur within a dataset. Once NeuraBASE has been trained using a set of data, systems developers can perform analysis and make predictions by detecting similar patterns in other sets of data. A dataset that has been trained using NeuraBASE can be searched and recalled to better understand the conditions or sequences of events which led to the successful achievement of a task. This can help developers to easily recreate these conditions in their environment. http://www.neuramatix.com Tel +603 2283 3860

Speedminer Sdn. Bhd. (Speedminer) is an MSC (Multimedia Super Corridor) Company. Its flagship product, Speedminer System, has been successfully implemented at various sites throughout the world. Speedminer works with various Sales Channel/OEM Partners who distribute our software in selected geographies and help our customers tailor and deploy solutions. At the moment, Speedminer’s clientele includes organizations from Asia, Europe, America, Australasia, the Middle East and Africa. Speedminer System has received many awards, including the MSC-APICTA Award 2006 for Best of Applications and Infrastructure Tool, and the PIKOM Awards 2006/2007 for Emerging Company of the Year and Best Product of the Year. Speedminer Sdn Bhd is a recipient of MSC Malaysia’s Innovation Voucher. http://[email protected]

Web Bytes’s anchor product, Xilnex, is a new-age Cloud-Based Retail Management Solution which helps retailers grow without having to deal with the complexity and price of a conventional solution. Xilnex is one of the few retail management solutions in the world that is EAL1 (Common Criteria/ISO 15408) certified, which means a world-class security framework is in place so that retailers can operate without worrying about data security and reliability. Xilnex currently powers thousands of live Point-Of-Sales terminals which process over RM 1 billion of sales transactions a year. Web Bytes Sdn Bhd is a recipient of MSC Malaysia’s Innovation Voucher. http://[email protected]

Pulse Group Plc is a research process outsourcing group, providing a range of online and offline solutions to the market research industry. We provide a range of services, including online market research panels, call centre solutions, translation services, and qualitative online and offline solutions. Our complete research solutions integrate seamlessly with your internal systems and processes, allowing you to stay focused on delivering value to your clients. As a virtual extension to your business, our services provide you with the confidence that allows your business to focus on:

With an experienced global team, we are able to provide our clients with timely, high quality data and, importantly, value for money. With leading technology in call-centre support, Pulse Group Plc offers a one-stop solution. http://[email protected]

Predictry is the big data arm of Verve Technologies Sdn Bhd and is focused on the research and development of Predictive Analytics solutions. Its core product is a robust and customisable recommendation engine to meet the latest business needs of e-commerce, marketplace, web-listing or content sites. With Data Scientists and Business Intelligence experts in the team, Predictry is able to provide customised solutions with the goal of helping clients increase relevancy, user engagement and click-through rates. Predictry is supported by MDeC and is a recipient of the Product Development and Commercialisation Fund (PCF) for Big Data. Verve Technologies Sdn Bhd is a recipient of MSC Malaysia’s PCF (Product Development & Commercialisation) Grant. http://[email protected]

We help companies increase their profitability by providing them with data-driven insights to create innovative customer experience solutions that truly deliver value to their customers and result in lasting loyalty. We achieve this by providing our clients with technologies and systems that capture “in-the-moment” experience and stream that data in support of our clients’ efforts to build lasting relationships with their customers. http://[email protected]

Travel & Hospitality is a core focus for Fusionex, where we have built various solid platforms to help industry players improve sales revenue, reduce operating costs and raise the bar for customer satisfaction. For early adopters of IT, we have created an online presence through services such as online booking of products and services, as well as integration with social media channels for an optimised consumer experience. Just a few of our key solution offerings are online reservation systems and travel agent booking management systems. Fusionex is a recipient of the MGS (MSC Malaysia R&D Grant Scheme) Grant and MSC Malaysia’s PCF (Product Development & Commercialisation) Grant. http://[email protected]

Simplifying Growth


Big Data is a key transformative technology that is being pushed for adoption under the Digital Malaysia Plan, the national initiative to advance the country towards a vibrant digital economy by 2020.

Digital Malaysia will achieve this by creating an ecosystem that promotes the pervasive use of Digital Technology in all aspects of the economy. This will include connecting communities globally and in real-time, in order to increase the nation’s Gross National Income (GNI), enhance productivity and improve standards of living.

Collectively, Digital Malaysia aims to achieve the following: create 160,000 high-value jobs, increase Malaysia’s ICT contribution from 9.8% to 17%, provide an additional 1% SME contribution to Gross Domestic Product and create an additional RM7,000 of digital income per annum for 350,000 citizens.

NOTE FROM MDeC


PUBLISHED BY ASIA ONLINE PUBLISHING GROUP DATASTORAGEASEAN.COM
