tuplejump


tuplejump: The data engineering platform

A startup with a vision to simplify data engineering and empower the next generation of data-powered miracles!

tuplejump

Rohit, Founder and CEO

Satya, Founder and CTO

What we do

• The Tuplejump Platform provides ready-to-use, out-of-the-box, fully integrated end-to-end data pipeline components to bring your idea to life fast!

• Most startups spend a lot of time studying and integrating various OSS. We have done this for you and assembled a system incorporating best-of-breed systems.

• Our service engineers can assist you or develop anything from your PoCs to entire solutions in record time.

The Data Pipeline

COLLECT

TRANSFORM

PREDICT

STORE

EXPLORE

VISUALIZE

OpsCenter

The Tuplejump Platform | COLLECT

Hydra

The tentacled framework for gathering high-volume, high-velocity data from push sources (devices, page alerts, forms, etc.) and pull sources (web scraping, blogs, social networks, etc.). It is powered by Akka, reacts to events on demand, and streams to Spark for batch processing.

The Tuplejump Platform | TRANSFORM

Spark + Calliope

Use the friendly Spark API, with added features to easily read data from and load data into Cassandra-powered storage.

Transform structured and unstructured data and join it with other data sets using simple drag and drop.

Join delta transformations on real-time feeds with existing data using Spark Streaming.

The Tuplejump Platform | STORE

DStore - Cassandra++

Cassandra, enriched with our custom components to provide a single storage mechanism for files, (un)structured data, generic data formats like XML and JSON, etc.

Stargate

Stargate is a Lucene-powered indexing mechanism built right into C* that allows advanced indexing and searching of data.

SnackFS

SnackFS provides an HDFS-compatible, fat-driver distributed file system over Cassandra.

The Tuplejump Platform | EXPLORE

Shark + Calliope

The Shark analytical engine shines at exploring large structured and unstructured data sets.

With Calliope, you can have the most comprehensive reporting on data from Cassandra in seconds or minutes, not hours.

Using Stargate indexes, you can filter a lot of data in Cassandra, saving those agonizing hours of batch jobs.

UberCube

Our patent-pending UberCube™ technology is a distributed OLAP cube engine designed from the ground up for interactive exploration over very large datasets.
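To make the idea of an OLAP cube concrete, here is a minimal sketch in plain Python. It pre-aggregates one measure over every combination of dimensions, so a roll-up query becomes a single dictionary lookup. This is not UberCube's implementation, which the deck does not describe; the names `build_cube` and `query` and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

ALL = "*"  # marker meaning "aggregated over this dimension"


def build_cube(rows, dimensions, measure):
    """Pre-aggregate `measure` for every subset of `dimensions` (hypothetical helper)."""
    cube = defaultdict(float)
    for row in rows:
        # Keep each possible subset of dimensions and roll the rest up to ALL.
        for size in range(len(dimensions) + 1):
            for kept in combinations(dimensions, size):
                key = tuple(row[d] if d in kept else ALL for d in dimensions)
                cube[key] += row[measure]
    return dict(cube)


def query(cube, dimensions, **filters):
    """Look up a pre-computed total; unspecified dimensions are rolled up."""
    key = tuple(filters.get(d, ALL) for d in dimensions)
    return cube.get(key, 0.0)


# Illustrative sample data, not taken from the deck.
DIMS = ("region", "device")
rows = [
    {"region": "EU", "device": "mobile", "clicks": 3},
    {"region": "EU", "device": "desktop", "clicks": 5},
    {"region": "US", "device": "mobile", "clicks": 7},
]
cube = build_cube(rows, DIMS, "clicks")
print(query(cube, DIMS, region="EU"))      # total clicks for EU across devices
print(query(cube, DIMS, device="mobile"))  # total mobile clicks across regions
```

The trade-off in this sketch is the usual one for cubes: storage grows with the number of dimension combinations, in exchange for constant-time lookups at query time.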

The Tuplejump Platform | PREDICT

MinerBot

Building on Spark's ML framework.

EA and ANN/DL frameworks to take ML to the next level.

Drag-and-drop machine learning coming soon!

The Tuplejump Platform | VISUALIZE

Pissaro

A modern, game-changing data frontend providing highly interactive and reactive visualization.

Not just reports!

The Tuplejump Platform | OpsCenter

OpsCenter

A deployment, monitoring and management framework built specifically for deploying, maintaining and scaling our platform without touching your server.

Click to cluster

One-click deployment to take your application from development to cluster.

BigData PaaS

Coming soon: a PaaS, so you can focus on your idea and let us worry about the rest.

Tuplejump Advantage

• All the advantages of Spark + all the advantages of Cassandra + much more!

• Over 500x (much more in case of filtered data) faster than traditional Hadoop solutions

• Shark + C* provide for superfast ad hoc querying.

• UberCube empowers sub-millisecond responses on very large cubes

• MinerBot provides ready-to-use ML algos, plus the possibility of much more complex algos and mechanisms than just MapReduce.

• Ready to use, no integration required

• Easy to develop, deploy, monitor and scale

Case Study I - IoT

• Hydra was designed for IoT in the first place. It supports MQTT for messaging from and to devices/sensors and communication between devices.

• Use message processing to raise alerts

• Use batch processing for advanced data analytics

• DStore provides highly scalable, write-optimized distributed storage for events and messages.

• MinerBot powers anomaly detection and automation on event analysis and patterns

• Build multidimensional analytics cubes on the event features with UberCube

• Visualize and understand the events in charts with Pissaro
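As an illustration of the "message processing to raise alerts" step, the following is a minimal sketch in plain Python. It does not use Hydra or MQTT, whose APIs the deck does not show; the `Reading` type, the `raise_alerts` function, the temperature field and the threshold value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Reading:
    """A hypothetical sensor message; field names are assumptions."""
    device_id: str
    temperature_c: float


def raise_alerts(readings: Iterable[Reading], threshold_c: float = 80.0) -> Iterator[str]:
    """Yield one alert per reading that is strictly above the threshold."""
    for reading in readings:
        if reading.temperature_c > threshold_c:
            yield (
                f"ALERT {reading.device_id}: "
                f"{reading.temperature_c:.1f} C exceeds {threshold_c:.1f} C"
            )


# Illustrative messages, standing in for events arriving from devices.
stream = [
    Reading("sensor-1", 42.0),
    Reading("sensor-2", 91.5),
    Reading("sensor-1", 80.0),
]
alerts = list(raise_alerts(stream))
for alert in alerts:
    print(alert)
```

Because `raise_alerts` is a generator, it handles messages one at a time as they arrive, while the same readings could separately be stored for batch analytics.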

Case Study II - Advertising

• Hydra empowers high-volume, high-velocity data collection to gather page clicks, user events, user behaviour, etc.

• Event processing to trigger/handle RTB

• MinerBot to optimize ad-user matching based on previous success/failure records

• Pissaro to empower the advertiser dashboard and reports

Let's talk!

tuplejump
