
Analytics as Intended
Bypass Batch. Rethink Real-Time. Prepare Once, Query Infinitely.

A powerful NoSQL, in-memory search and compute platform from Finch Computing.


The algorithmic economy is coming. The same algorithms that changed finance are coming to healthcare, retail and manufacturing … Analytics algorithms are the connectors between big data and decision making.

Gartner, 2015


… definitely has a new, category-defining quality to it …

… [I see] so many use cases that could leverage this type of capability …

… an interesting and fascinating application …

… making hard things easy is a key driver of buying decisions right now …

… I definitely see the value in something like this …

… what’s missing in in-memory computing is true differentiation and innovation. This product is definitely on the right track …


Table of Contents

Introduction
In-Memory & Why It Matters
About FinchDB
Bypass Batch
Rethink Real-Time
Prepare Once, Query Infinitely
Our IP
Use Cases
Benefits of FinchDB
FinchDB In Action
About Finch Computing


Introduction

Though often short-handed as just “databases,” database management systems (DBMS) are critical to organizations’ ability to store, retrieve and maintain their information assets. While most of us envision rows and columns when we think of databases, the reality is that data is stored in many different forms – as text, spreadsheets, presentations, research reports. And database management systems are complex software solutions that have, for years, allowed some of the world’s most successful companies to understand and leverage these data resources.

However, as modern data needs have evolved in terms of both scale and complexity, modern database management systems have struggled to keep pace.

The reaction from vendors has been to bolt new technologies or new capabilities onto their older, firmly entrenched database products. Or, in the case of some enterprises, the reaction has been to avoid adopting new technologies altogether.

Both of these approaches are short-sighted. And both exacerbate the problem.

And this is before an organization even encounters critical questions about enterprise search and analytics platforms – and whether those platforms meet its needs for speed, for complexity, or both.

It’s time for something new.

It’s time to change the game.


The biggest barriers are that – even with all the new capabilities that in-memory or NoSQL databases can provide – they still can’t do all of the things that modern retail, or finance, or e-commerce, or cybersecurity, or automated industrial, or Internet of Things use cases demand. Things like…

Serving up embedded, per-transaction analytics

Embedding dynamic models in the query

Linking entities (or data points) on the fly

Scaling to handle petabytes of data … or more

Compressing data to make in-memory feasible at scale

Decompressing data (down to a single field) in true real-time

Returning scored and ranked results

Allowing a user to do fuzzy searching

Enabling knowledge discovery over time

Running in the cloud or on commodity hardware

FinchDB can do all this and more…

For certain, there have been incremental changes in the database management system realm in recent years. The rise of the not-only SQL (NoSQL) vendors and the falling cost of memory have been the most impactful and the most potentially disruptive.

But neither has gained broad, mainstream adoption.

Here’s why:

A SQL database management system, with amped-up speed enabled by an in-memory architecture, will still deliver the same kind of results that a disk-based SQL DBMS can deliver. It will just do it faster. It will deliver lightning-fast answers from static data, stored in rows and columns and with absolute values.

“How many items did I sell last Tuesday?” “How many items are in my warehouse?”

A NoSQL database management system is better suited for finding data stored in ways other than rows and columns. Think graphs, clusters, or documents, for example. These types of databases offer incredible flexibility. And, when placed in memory, they offer this flexibility at the breakneck speeds associated with in-memory technologies. But questions persist about NoSQL databases – important questions of scalability, interoperability and maturity of the platform.

Incredibly though, these are not the biggest barriers to adoption.


Introducing...

Analytics as Intended


in-mem·o·ry com·pu·ting: The storage of information in the main random access memory (RAM) of dedicated servers rather than in complicated relational databases operating on comparatively slow disk drives.

Techopedia, accessed 10/6/15

In-Memory and Why It Matters

In-memory computing enables a wide variety of critical business functions to be performed faster. Better. More effectively.

It helps organizations glean meaningful insight – patterns, co-occurrence, outliers, or trends – from within huge volumes of data. And it’s possible today in ways it never was before, thanks to the falling price of memory. This makes in-memory both relevant and economical across a wide variety of use cases.

The fundamental benefit of in-memory technologies is, without question, speed. But what can you do with all that speed?

Personalized Customer Experience
Understand your customers in entirely new ways. Merge data about them from myriad sources to gain a single, dynamic view of your customer at the moment you need to access it. Deliver better experiences, driving better customer relationships and trust, and, ultimately, more revenue. Think call centers. Service desks.

Cost Savings
Run business processes faster. With greater efficiency. Identify risks or redundancies. Correct them in real-time. Perform analytics you didn’t know you could. Make decisions about what you learn. Streamline. Simplify. Save. Think enterprise IT managers.

Risk Management
Identify patterns that indicate whether a failure (or a breach, or a malfunction or fraud) has occurred or whether one is about to occur. Instantly. Don’t just see the risk, stop it. Quickly. Get smarter and better at it each time. Think safety and security officers. Risk and compliance teams. Enforcement and investigations.

Enhanced Competitiveness
Differentiate from others in your industry by leveraging data to become smarter, faster, better at what you do. Develop entirely new products or services on the basis of this increased awareness. Think product development. Business strategists. Information service providers. Consultants. Analysts.

Predictive Analytics
Leverage sophisticated algorithms to see around corners. Predict what ad would best resonate with a visitor to your site. Serve up discounts or new offers based on anticipating customer needs or movements. Get ahead of performance risks. Identify fraud as it’s happening. Think marketers. Investigative teams. Strategic planners. M&A teams.

FinchDB marries database management system functionality with search engine capabilities and embedded, advanced analytics. And it’s all in-memory, built for in-memory applications. We began by imagining all of the possibilities of a true, all in-memory platform ... and then we built one that delivers analytics as they’re intended to be done. Without baggage. Without constraints. Reflective of today’s data landscape.


… current architectures are already straining under the load generated by today’s applications and services; the enterprise will need an entirely new approach to enable real-time performance of big data loads.

IT Business Edge: The Need for Real-Time Big Data Across the Enterprise, August 28, 2015

About FinchDB

Bypass Batch. Rethink Real-Time. Prepare Once, Query Infinitely.

FinchDB is a new kind of computing and analytics platform for finding better, faster insights from within massive amounts of data. It’s a database management system (specifically, a JSON document oriented database), an analytic engine and a search tool – in one. All three. All together. All in memory. And in each area, patented technologies enable radically different data experiences.

With FinchDB, you can bypass batch. Rethink real-time. Prepare your data once and query it infinitely. Apply predictive models on the fly, per transaction. Run it on commodity hardware, or in the cloud, and scale it infinitely.

FinchDB picks up where current, pieced together solutions fall short. It’s the kind of big data platform that today’s environments demand.

And it’s a completely new way of interacting with information.

FinchDB Adds Value in Environments Where...

There are high volumes of changing data. (From 1M documents to petabyte scale; streaming, constantly changing data, or more of the same type of data.)

The questions are also changing. (Questions are unique to users; analytics are driven by the information that comes through on the query.)

The answers depend on accessing the whole set (or a large subset) of the data. (Looking for the “best” answer, not a definitive one; consider how, if and to what extent the data changes.)

The question or analytics model is changing, or you have lots of different questions. (Need flexibility in query formation and fuzzy search; the DBMS must perform like a search engine as well as a database.)

Compression – enabling in-memory at scale – is important. (FinchDB compresses data down to 16% of its original size.)

Speed is important: inline analytics, which avoids latency, enables true real-time analytics. (Need sub-second response times, enabling analytics per transaction – as in HTAP environments.)

Storage footprint is important, as are hardware costs. (Need storage costs reduced; must run on commodity hardware.)

There is a need to embed predictive analytics in models. (Need embedded models.)


Bypass Batch

Batch processing, commonly short-handed as just “batch,” has served numerous enterprise functions incredibly well. If you want to look up answers to certain questions that you know you want to know (think: answering a single question, at a single point in time, about each customer in your database), batch processing can help you do it. It’s a means of storing pre-calculated answers to predetermined questions in such a way that you can look them up on-demand. But for all its benefits, batch processing has one fatal flaw: It requires a user, first, to know and use predetermined inputs and, second, to know the question they want to answer. But what if you don’t know either of those things? And aren’t the things you don’t know you don’t know the most potentially impactful to your business?

FinchDB features a few important innovations that overcome the shortcomings of batch.

Fuzzy Searches

FinchDB allows a user to perform fuzzy searches – meaning you can use it as much like a search engine as a database. Say you want to find everything in your database about a person, but you only remember that person’s first name and some other piece of qualifying information, like their employer. FinchDB allows you to search based on those two pieces of information alone and will return every potential result, every candidate answer for your query. It also accounts for misspellings. If you’re looking for “John,” then “John,” “Jon” and “Jonathan” will all be returned.

Scored and Ranked Results

FinchDB will also return scored and ranked results. It shows users a confidence score for every query response (0-100%) and ranks the answers from most to least likely to be correct. This allows a user to see the best answers from among all possible answers and determine the one that is closest to what they’re looking for. Most database management system technology would return a null answer in this scenario.
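To make the idea concrete, here is a minimal, illustrative sketch of fuzzy, scored-and-ranked matching of the kind described above. The candidate records, field names and scoring formula are assumptions made for the example; this is not FinchDB’s actual query API.

```python
from difflib import SequenceMatcher

# A tiny, hypothetical document collection (FinchDB stores JSON documents).
people = [
    {"first_name": "John",     "employer": "Acme Corp"},
    {"first_name": "Jon",      "employer": "Acme Corp"},
    {"first_name": "Jonathan", "employer": "Acme Corp"},
    {"first_name": "Joan",     "employer": "Globex"},
]

def fuzzy_score(query: str, value: str) -> float:
    """Similarity between a query term and a field value, scaled to 0-100."""
    return 100.0 * SequenceMatcher(None, query.lower(), value.lower()).ratio()

def search(first_name: str, employer: str):
    """Score every candidate on both fields and rank the results best-first."""
    scored = []
    for doc in people:
        confidence = 0.5 * fuzzy_score(first_name, doc["first_name"]) \
                   + 0.5 * fuzzy_score(employer, doc["employer"])
        scored.append((round(confidence, 1), doc))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Two pieces of information are enough: every candidate comes back,
# ranked by confidence rather than filtered to exact matches only.
for confidence, doc in search("John", "Acme Corp"):
    print(f"{confidence:5.1f}%  {doc['first_name']:9s} {doc['employer']}")
```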

Embed Models in Each Query

FinchDB also does something else important – something that can dramatically reduce organizations’ dependency on batch processing: It affords a user the ability to embed models in each query. Why is that important? It means that the computations necessary to return an accurate, valid query response are happening at the moment you query the database. Nothing is pre-computed. Your answers are based on the most accurate, up-to-date picture of your data and as you conceive new questions, you can receive new answers without a lengthy data preparation and pre-processing effort. You can perform analytics per transaction, or in the aggregate.
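As a rough illustration of the “model travels with the query” idea: in the sketch below, the query carries its own filter and risk model, which are evaluated against whatever data exists at the moment the query runs. The data shape, field names and risk formula are invented for the example; FinchDB’s real query syntax is not shown in this document.

```python
from datetime import datetime, timedelta

# A hypothetical, constantly changing stream of transaction documents.
transactions = [
    {"amount": 120.0,  "country_mismatch": 0, "ts": datetime.now()},
    {"amount": 4999.0, "country_mismatch": 1, "ts": datetime.now() - timedelta(minutes=2)},
    {"amount": 87.5,   "country_mismatch": 0, "ts": datetime.now() - timedelta(days=2)},
]

# The model is part of the query itself, so nothing has to be pre-computed:
# change the model (or the question) and simply run the query again.
query = {
    "filter": lambda t: t["ts"] > datetime.now() - timedelta(hours=1),
    "model":  lambda t: 0.6 * min(t["amount"] / 5000.0, 1.0) + 0.4 * t["country_mismatch"],
}

def run(query, data):
    """Evaluate the embedded model against the data as it exists right now."""
    hits = [t for t in data if query["filter"](t)]
    return sorted(((query["model"](t), t) for t in hits), key=lambda pair: pair[0], reverse=True)

for risk, t in run(query, transactions):
    print(f"risk={risk:.2f}  amount={t['amount']:.2f}")
```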

Embedded Analytics: Why it Matters

[Diagram: Shortcomings of Analytics Outside the Database. Batch processing (look up known, precomputed info): batch-processed data is run through Hadoop for analysis, then put into a DBMS to be queried; a known query against static data returns a pre-computed answer. Alternatively, document database technology, like MongoDB, is coupled with custom code to perform the analytics the user wants.]

There’s no one platform out there collecting and analyzing siloed data, and this leads to a less comprehensive understanding of one’s connections.

Tech Crunch: The Future of the Web Is All About Context, August 18, 2015

[Diagram labels: query, static data, static models, static queries, no content, too slow, difficult to manage.]


On-demand, real-time reporting allows the perspective of the most recent knowledge to be acquired … being able to rephrase the question in respect to the answers to the first question is at the heart of interrogative inquiry. Information is flexible and you can reexamine it in many different ways with the right tool set at your disposal.

Smart Data Collective: 10 Pains Businesses Feel When Working with Data, July 15, 2015

A few important features of FinchDB enable it to deliver true real-time performance:

All-In-Memory, Built for In-Memory

FinchDB is entirely in-memory. It doesn’t just have in-memory components or features – it was built to perform in-memory. Unlike other solutions that were simply put in memory to achieve faster performance, every element of FinchDB was conceived with in-memory computing use cases in mind.

Compression

FinchDB’s proprietary compression approach can shrink data to just 16 percent of its original size – while also preserving the ability to decompress a single record or a single field. This innovation is precisely what makes in-memory feasible at scale.
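FinchDB’s compression technique itself is proprietary and not described here, but the property that matters – accessing one field without decompressing the whole record – can be sketched generically. The per-field encoding below is purely illustrative, not FinchDB’s method.

```python
import json
import zlib

def compress_record(record: dict) -> dict:
    """Compress each field independently, so any one field stays individually accessible."""
    return {key: zlib.compress(json.dumps(value).encode()) for key, value in record.items()}

def read_field(compressed: dict, field: str):
    """Decompress only the requested field, leaving the rest of the record untouched."""
    return json.loads(zlib.decompress(compressed[field]).decode())

doc = {
    "name": "Jane Roe",
    "employer": "Acme Corp",
    "bio": "A long block of free-form biographical text ... " * 50,
}
packed = compress_record(doc)
print(read_field(packed, "employer"))  # only the "employer" field is decompressed
```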

Access All Relevant Data

FinchDB derives its query responses by examining all of the relevant surrounding, contextual data – in sub-millisecond response times. This is enabled by our in-memory approach, as described above, and also by the complex, proprietary algorithms we have developed, and the distributed nature of our architecture.

Analytics Inside the Database

FinchDB’s embedded analytics – down to each query, as discussed elsewhere in this document, or in the aggregate – also mean that the analytics it delivers are faster than those done via separate analytics engines. This is because there’s inherent latency any time data has to go from one platform to another. FinchDB’s in-memory architecture, combined with the feature of embedded analytics, means it’s much, much faster than any other solution on the market.

Rethink Real-Time

When it comes to analytics, faster is better. Faster than the other guys is the ballgame. But “real-time” means different things to different vendors. Are analytics really real-time if you have to precompute results using static data? Are the results really real-time if the data changed while you were processing it? What if the data changes again?

Predictive analytics. Real-time recommendation engine use cases. High-volume, high frequency transactions. These are the types of use cases that demand not just speed – but incredible flexibility and fidelity. FinchDB delivers.

At least 73 percent of businesses have or plan to invest in big data within the next 24 months, according to a recent Gartner survey. The big data machine is clearly here to stay, with today’s organizations searching for new ways to derive actionable business value from their troves of information. While Hadoop-powered data lakes are a popular choice for analyzing data ‘after-the-fact,’ there is often missed opportunity in analyzing real-time data streams the moment the information is received …

Information Management, Real-Time Fast Data Streams Can Drive Business Value, August 11, 2015


With FinchDB, you can prepare your data once and query it infinitely because the technology features:

Embedded Models in the Query

One of the most distinctive features of FinchDB is that analytical models are embedded in each query – and this is why we can say it allows you to “prepare once, query infinitely.” Once the data is prepped and once models are built, a user doesn’t have to rebuild and reprocess their data. The analytical models are dynamic and flexible. The data can be queried and analyzed any number of ways for any number of users – each time returning results that are based on the most accurate, most up-to-date picture of the data.

On-the-Fly Linking

Another proprietary feature of FinchDB is its ability to perform on-the-fly entity linking in true real time. Its in-memory architecture, coupled with its ability to process all relevant, contextual data around an entity, makes it capable of instantly finding connections in the data to perform real-time knowledge discovery and machine learning – meaning FinchDB gets smarter and better the more data it ingests.

Embedded, Per-Transaction Analytics

Not only does FinchDB combine search, database and analytics technologies, it allows a user to perform analytics on specific transactions, at the exact moment of the transaction. From there, a user can make immediate, impactful decisions based on the result.

Data by itself is meaningless. Its true value is in the insights we extract from it. Currently available platforms are simply not getting the job done.

Tech Crunch: The Future of the Web Is All About Context, August 18, 2015

Prepare Once, Query Infinitely

It’s been estimated that as much as 80 percent of a typical data scientist’s time is consumed with so-called “janitorial work.” Cleaning data. Standardizing it. Processing and preparing it for specific uses. And doing it again if the use case, the user group or the data itself changes. With FinchDB, that’s not the case.

Not only can FinchDB handle dirty data; after the data is prepared once, it can be used over and over again for different purposes. A user doesn’t have to rebuild their data every time. FinchDB takes in true, real-time streaming data and lets a user analyze the stream as well as his or her stored data – making connections in the data, discovering new relationships or finding new patterns all the time.

Every FinchDB query includes a search specification and a scoring/ranking specification. We look at both to return a candidate set.

In an entity disambiguation use case, we calculate a disambiguation score based on a Name Score, a Topic Vector Score, a Context Vector Score and a Prominence Score.

And we do that in less than a millisecond around every event. In this use case, an “event” is a new document coming into the system.

The same would be true in other use cases. In a cybersecurity use case, an “event” would be an attack. In this scenario, you could take what’s happening in your environment and include that data as part of the query.
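The document does not specify how the four component scores are combined, so the weighted sum below (and the weights themselves) are assumptions, intended only to illustrate how a single 0-100 disambiguation score could rank two same-named candidates.

```python
def disambiguation_score(name: float, topic: float, context: float, prominence: float,
                         weights=(0.4, 0.25, 0.25, 0.10)) -> float:
    """Combine four component scores (each 0-100) into one 0-100 confidence score."""
    return sum(w * s for w, s in zip(weights, (name, topic, context, prominence)))

# Hypothetical component scores for a "John Roberts" mention in an incoming news document:
candidates = {
    "John Roberts (Supreme Court Justice)": disambiguation_score(100, 85, 90, 80),
    "John Roberts (news anchor)":           disambiguation_score(100, 35, 30, 70),
}

# The candidate set is ranked and the best answer is returned.
for entity, score in sorted(candidates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{score:5.1f}  {entity}")
```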

Anatomy of a FinchDB Query

[Diagram: Query → Candidate Set → Best Answer]


Our IP

Finch Computing technologies are built on 25 pending patents that address transformative new inventions in the areas of in-memory computing, knowledge discovery, entity disambiguation, data compression and on-the-fly record linking and scoring – all critical components of the types of end-to-end, big data analysis solutions that modern workplaces and consumers of data will demand.

The patent portfolio that makes possible the new kinds of data experiences that FinchDB enables includes:

Search (7 patents): Improving search capability within in-memory databases

The search capabilities we’ve built into FinchDB enable it to work like a search engine as well as an analytics platform and DBMS. Users see scored and ranked results – just like we’re accustomed to seeing from commercial search engines. A user can perform fuzzy searches – meaning they can construct imperfect queries and still get meaningful responses. And, returned search results draw on all relevant, contextual data – with FinchDB computing, in an instant, the closest-matched answers to your query.

Entity Disambiguation & Knowledge Discovery (7 patents): Correctly identifying same-named or similarly-named entities and understanding their relationships

The proprietary, topic based models in FinchDB are complex and sophisticated, allowing it to understand huge volumes of context and derive accurate query responses. We put these computational algorithms in-memory, and married them with other text analysis innovations, to enable radically different data experiences. Structured or unstructured. Streaming or static. Internal or external.

In-Memory Database Technology (6 patents): A massively powerful, all-in-memory computing platform built from the ground up

Not just a database. Not just an analytics platform. Not just search technology. All three. All together. All in-memory. Conceived to be a complete paradigm shift and to allow a whole new set of users to understand and interact with information like never before. To expect more than current technologies afford. To rethink and redefine. To change the information analytics game completely.

Text Analysis (4 patents): Automatically reading free-form text as a human would

Not just scanning text, really reading it. Understanding it. Applying logic, learning and reason. Understanding specific industry lexicons. Or specific, user-defined expressions. Being smart enough to define an entity’s type, its identity and its relationships. To change the stored picture of an entity as the entity itself changes over time.

Compression (1 patent): Reducing the size of a dataset to as little as 16% of its original size

It’s what makes in-memory computing feasible at scale. To make massive amounts of data infinitely smaller. But preserving the ability to decompress a single record, or even a single field in an instant. Accessing the data you need – all of the data you need – at the very moment you need it. In less than a fraction of a millisecond.


Use Cases

FinchDB is relevant across a number of diverse but industry-agnostic use cases. One of them is entity disambiguation – the ability to, in an instant, correctly determine the identity of an entity (a name, a place, an organization, etc.) mentioned in text – even if that identity shares a name with other entities of the same type. For example, is the text referring to John Roberts the Supreme Court Justice, or John Roberts the Fox News anchor? Is it referring to New York, the city, or New York, the state?

Entity Disambiguation in News

For demonstration purposes, we often use a streaming feed of news – content from 15,000 English-language sources to the tune of about 800,000 articles per day – to show how FinchDB performs super-fast, super-accurate entity disambiguation. The screens below show sample performance metrics on this feed – running on just a four-node cluster in AWS.

Again, FinchDB does all of this using just 4% of its CPU capacity. And, it can perform on streaming data or static data. Structured or unstructured. We find that this simple demonstration spurs all sorts of customer-driven use cases – and we can develop the necessary tools and support to enable them all.

Sample Performance Metrics

Other Use Cases

Fraud Detection
Monitoring financial transactions to identify patterns that could indicate fraud

Internet of Things
Collecting high-volume, high-velocity sensor and telemetry data to improve performance, meet customer needs or support new product development

Digital Communication/Message Traffic
Monitoring streaming feeds of message traffic to identify patterns, risks, trends

CRM/Customer Service Engagement
Aggregating customer information from multiple sources with different data models to improve the customer experience

Personalization
Ingesting clickstream data at high throughput rates to create and refine visitor profiles, serving up relevant content at each site visit

Cyber Security
Protecting data from breaches, theft or misuse

Legal Intelligence
Mining legal documents (docket data, filings, etc.) to identify and disambiguate entities and understand the relationships and trends within that data

Running on our streaming feed of real-time, English-language news from around the world, FinchDB handled 70,000 disambiguation queries every five minutes. That’s 233 queries per second, and an average of 10 unique entities correctly identified per incoming news document.

More than 230 Disambiguation Queries Per Second

PROD Search – Volume (5 minute interval) (Search Controller) (2h)

SearchController Successful Search Count Value [avg]   last     min     avg      max
MemDB-Prod-4                                           10.85k   6.55k   11.38k   17.89k
MemDB-Prod-3                                           10.97k   5.6k    11.45k   17.83k
MemDB-Prod-2                                           9.9k     6.3k    11.34k   17.85k
MemDB-Prod-1                                           11.23k   6.19k   11.41k   17.81k

FinchDB processes this massive feed of incoming, unstructured text with lightning fast speed. In an average of less than one-thousandth of one second, it is able to produce accurate disambiguation results by analyzing huge amounts of context – reading and understanding it as a human would.

Performing at Sub-Millisecond Response Times

PROD Search – Average Response Time (milliseconds) (Search Controller) (2h)

SearchController Successful Search Count Value [avg]   last     min      avg      max
MemDB-Prod-1                                           1.07     0.551    1.02     2.28
MemDB-Prod-2                                           0.8953   0.5056   0.9493   2.23
MemDB-Prod-3                                           1.05     0.5299   0.9493   2.54
MemDB-Prod-4                                           0.8297   0.5067   1.02     3.21

Even at its peak, in this example, FinchDB was using just 12% of its capacity (on one of the four nodes in this AWS cluster). On average, total utilization hovered at around 2 percent – an indicator of FinchDB’s massive scalability and its ability to run, essentially, on commodity hardware.

Using a Fraction of Total Capacity

PROD MemDB – CPU Utilization (2h)

CPU Utilization (Average Over 1 Minute) [avg]   last    min     avg     max
MemDB-Prod-1                                    0.92%   0.32%   2.17%   6.39%
MemDB-Prod-2                                    3.44%   0.32%   1.97%   6.98%
MemDB-Prod-3                                    4.37%   0.23%   2.04%   5.36%
MemDB-Prod-4                                    1.55%   0.3%    2.49%   11.87%


Who Uses FinchDB?

The end-user value of a solution like FinchDB falls into three broad but not mutually exclusive categories: Increasing Revenue, Identifying Risk and Fulfilling the Organizational Mission.

The page at right shows user stories across a variety of enterprise functions within each of these three categories. If these challenges sound like ones you’re facing, we’d love to hear from you.

Increase Revenue

Finance
An accounting company is moving its corporate headquarters. It wants to understand which of the two options it is considering would be most disruptive to its headquarters-based employees and their productivity. It wants to assess everything from commutes to home prices, to schools and accessible retail and services.

Operations & Risk

A software company wants to understand whether there are any opportunities to generate revenue by simply correcting errors in its poorly maintained contracts database. It wants to search this database for certain people, places and organizations of importance to the company, with the goal of identifying opportunities to reestablish or enhance relationships.

Supply Chain
A retailer wants to make its supply chain more efficient. The company wants to examine the geographic areas around its direct and secondary suppliers and their competitors. This will help determine whether shorter, more direct shipping routes, or choosing new vendors based on their proximity to others, could generate significant cost savings for the company.

Human Resources

A professional services company is pursuing a multi-million dollar contract. The prospect has shared the names of those hearing the pitches and making the selection. The prospect has indicated that chemistry with the pitch team will be an important factor in its decision. HR must quickly cross-reference the bios of those hearing the pitches with the company’s talent pool to determine which resources have similar experiences, connections and associations.

Sales
An inside sales resource wants to quickly identify customer prospects by matching the profile of current customers, or a set of defined characteristics, to the profiles of people mentioned in various public content sources (news, SEC EDGAR data, Pacer data, SBA loan data, etc.).

Comms & Marketing

An online retailer wants to send ultra-targeted mobile advertising to its customers, informed by an understanding of the locations and organizations that matter most to them. The content and cadence of the advertisements should change as information about the customers changes.


Identify Risk

Finance
A company discovers that one of its employees is defrauding the company. It is concerned that there may be multiple employees involved and wants to examine the relationships between the individual in question and other employees.

Operations & Risk

A multi-family housing developer with a non-union labor force is under near constant scrutiny from local unions and lawmakers. The company wants to understand whether its employees are connected to any of the most aggressive activists and whether upticks in union activity correlate to any perceived employee dissatisfaction.

Supply Chain
A sporting goods company wants to understand potential weaknesses in its supply chain that could disrupt its operations or produce quality issues that could result in reputational damage. It wants to examine direct and secondary suppliers, the leadership teams in place within those organizations and recent news coverage of the companies and their products.

Human Resources

A hiring manager at a defense and aerospace systems company wants to know if a candidate has relationships with any of the company’s competitors or suppliers. The job requires accessing sensitive, proprietary information that, if in the wrong hands, could expose the company to significant risk.

Sales
A salesperson wants to be aware of M&A activity in a crowded, niche market to avoid unintentionally contacting or sharing information with competitors. Access to easily searchable real-time information, gleaned from news and regulatory sources, is key to enabling this level of awareness.

Comms & Marketing

A restaurant chain wants to assess how news coverage on issues like portion sizes, kids menus and trans-fats has changed over time. It wants to see how the share of coverage has shifted among its competitors and how it differs from region to region and from season to season. The intelligence will allow it to create specialized menu offerings and marketing campaigns around them.


Fulfill the Mission

Finance
A professional services company wants to develop more accurate revenue projections, based on the real-time input of its project managers. It wants to augment its projections with predictive analytics based on historical performance around events (weather, late holidays, school calendars, etc.) and the anticipated occurrence of similar events during an accounting cycle.

Operations & Risk

A company wants to ensure its in-house legal personnel are instantly made aware of changes in local, state and federal law (or potential changes to those laws so that it can advocate for or against them via the trade industry PAC). It wants to monitor news coverage about these potential changes in the specific geographies material to its operation.

Supply Chain
A produce supplier is notified of a salmonella outbreak that could impact its wholesale customers and their retail customers. It must quickly identify all of the primary and secondary purchasers of its products and the growers in its network that could be the source.

Human Resources

Research shows that a person’s manager has an enormous impact on productivity, satisfaction and professional development. With this in mind, a healthcare organization wants to assess its mid-level managers and the tenure, turnover and performance of the individuals on their teams. Once the organization has identified the most effective and ineffective managers on its staff, it can socialize the behaviors of the effective ones and take corrective action with regard to the others.

Sales
An e-commerce company wants to immediately offer relevant coupons or product offers to visitors to its site, taking in demographic data from lists it owns, current events data, clickstream data and market reports. It must make these assessments and delivery decisions in an instant, or lose a sale forever.

Comms & Marketing

A company wants to understand the rate at which it is being covered in the media, on what topics versus its largest competitor, in what outlets and by which reporters. This intelligence will influence its positioning strategy and its overall corporate and product narrative.



Benefits of FinchDB

FinchDB affords users a number of important benefits. Recent research on predictive analytics in particular validates what we’ve seen with our customers: The benefits of FinchDB as an enabling platform technology are broad and deep – and important.

Competitive Advantage (57%)
According to recent research, more than half of predictive analytics projects help companies differentiate and gain market share.

FinchDB helped a legal information service provider stand out based on the quality, accuracy, timeliness and comprehensiveness of its content.

New Revenue Opportunities (50%)
Half of these projects led to revenue-generating opportunities, including dimensionalized customer relationships or entirely new products or services.

FinchDB is working with an ad-delivery company to fulfill the promise of true, real-time, 1:1 marketing.

Increased Profitability (46%)
Nearly half of organizations launching predictive analytics projects say they’re more profitable, more efficient.

FinchDB is helping an energy customer make operational decisions with better insights that show opportunities to cut and manage costs.

Customer Service Improvements (39%)
39 percent of organizations using predictive analytics say it’s made it easier for them to attract and retain customers.

FinchDB is helping a consumer electronics company better meet its customers’ mobile needs.

Cost Savings (36%)
Mitigate risk. Eliminate redundancy. Identify organizational efficiencies. Organizations using predictive analytics say they know more, do more, save more.

FinchDB – because it’s an analytics platform, a search tool and a DBMS in one – lowers total software costs, in addition to the savings it delivers via efficiencies and profitability gains.

Ventana Research, sponsored by IBM and KDNuggets, “Next Generation Predictive Analytics Benchmark,” July 2015


FinchDB in Action: Two Examples

Faster, Better Content Processing

Large, global information service providers distinguish themselves by offering customers access to comprehensive, domain-specific intelligence about the industries in which their customers operate. Customers use this information to make all sorts of business critical decisions: how to differentiate, how to tailor a pitch, how to reach their target customers. These information service companies build their business on the ability to offer the best-available quality, accuracy and precision. And to do this, they must regularly reprocess massive amounts of textual information – in the case of legal information service providers, for example, this includes things like new docket information, judges’ profiles, case histories, decisions or other legal documents.

For most organizations, the only way to do this has been to put all of their data into existing document-oriented database management systems, query it and then pull it all back out to do analytics. This is time-consuming and unnecessarily complicated.

Input – Process – Analyze – Repeat

FinchDB enables us to offer something different.

We can take in streaming data, process it quickly and perform analytics in real-time, at the moment a user queries the data. We leverage all the relevant data to get high quality results and do so with incredible speed. We enable real-time knowledge discovery, updating knowledge bases as information about the entities within them changes. And we do it in big data environments with high transaction rates.

In-Memory makes this whole process faster.

On-the-Fly linking makes it more accurate.

Search capabilities make it easier.

Knowledge discovery makes it better.

And with FinchDB, our customers can deliver a superior service to their customers, and differentiate from competitors on the quality and comprehensiveness of their information.

Fulfilling the Promise of Real-Time, One-on-One Marketing

We’ve all visited websites and been greeted by an ad for an item we’ve recently viewed online. Concert tickets. Shoes. Kids toys. What happens less often is that we visit a site and are served up – immediately – an ad or a coupon or an offer for an item tailored just for us. Without having to have searched for it before. The content is generated based on a wide variety of data – lists that marketers cultivate, information from our purchase and browsing history, demographic data, real-time click stream data and more.

The promise of this type of highly customized, and ostensibly highly effective, form of marketing has eluded brands for years. FinchDB makes it possible.

Currently, we’re working with a major, data-driven content delivery company to do precisely this: to enable them to perform real-time analytics on multiple data sources and, in less than a millisecond, serve up the right content to that user while they’re on an e-commerce site – not the next time they come back, or once they leave and are surfing the web.

Our capability will enable this company, our customer, to differentiate from its competitors and to offer a never-before-enabled capability to its customer base. The potential revenue impact is enormous. All because FinchDB was designed to add value in today’s complex information environments.



Organizations need actionable insights faster than ever before to stay competitive, reduce risks, meet customer expectations, and capitalize on time-sensitive opportunities.

InformationWeek: Real-Time Analytics: 10 Ways to Get it Right, August 20, 2015

About Finch Computing

Finch Computing, formerly Synthos Technologies, is a division of Qbase, LLC. Together, we build and support new ways of interacting with information – via three innovative products that address complex and never-before-addressable big data needs at various points in the software stack.

FinchDB is an in-memory computing platform with embedded analytics that is changing the expectations of database technology. Part database, part search engine, it enables radically different data experiences.

Finch for Text is an entity extraction and disambiguation engine. It reads free-form text as a human would, extracting eight distinct entity types and disambiguating them against massive knowledge bases. It turns documents into data points and is the foundation for an effective unstructured text strategy in the enterprise.

Finch Analyst is an end-to-end data discovery solution that enables customers across a variety of industries and use cases to find greater meaning and insight from data. Whether it’s streaming or static. Internal or external. Words or numbers.

We believe the search tools of today are insufficient. We understand that the rate at which the world creates information will never be this slow again. And we know that analytics, including predictive analytics, are going to become a larger and larger part of every professional’s job.

Finch Computing enables dramatically different data experiences. And meets an intensifying market need for a better, faster, more accurate picture of the environments in which our customers operate.


Contact Us

[email protected]

Washington, DC

12018 Sunrise Valley Drive Suite 300 Reston, VA 20191 +1 888 458 0345 toll free

San Francisco, CA

28 Second Street Floor 3 San Francisco, CA 94105 +1 415 314 7110

Beavercreek, OH

3800 Pentagon Boulevard Suite 110 Beavercreek, OH 45431 +1 937 521 4200