Open Source BI Overview
DESCRIPTION
Proof that an entire data-driven Business Intelligence stack can be implemented successfully with open source software.
TRANSCRIPT
Open Source Business Intelligence Overview
From Data Source to Analytics and Beyond
Agenda
● Open Source and BI
● Data sources
● Data Integration
● Reporting/Frontend
● Analytics
● Data Quality
● Data Governance
Source: https://www.informs.org/ORMS-Today/Public-Articles/October-Volume-37-Number-5/Back-in-Business
Data Sources
Traditional
○ PostgreSQL - http://www.postgresql.org/
■ Pivotal Greenplum - http://gopivotal.com/
○ MySQL - http://www.mysql.com/
■ Percona - http://www.percona.com/
■ MariaDB - https://mariadb.org/
Columnar
○ MySQL Derivatives
■ InfiniDB - http://infinidb.org/
■ Infobright - https://www.infobright.com/
○ MonetDB - http://www.monetdb.org/Home
Relational vs Columnar
Source: http://www.calpont.com/images/column-oriented-database.jpg
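The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration with made-up sample data (the `order_id`, `region`, and `amount` fields and the `to_columns` helper are hypothetical), not how any of the databases listed above store data internally.

```python
# Hypothetical sample data; field names are illustrative only.
rows = [
    {"order_id": 1, "region": "East", "amount": 120.0},
    {"order_id": 2, "region": "West", "amount": 75.5},
    {"order_id": 3, "region": "East", "amount": 30.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: an aggregate has to visit every complete record.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: the same aggregate reads a single list of values.
total_from_columns = sum(columns["amount"])
```

Both totals are the same; the point is that the columnar form touches only the `amount` values, which is why column stores tend to suit analytic queries that scan a few columns across many rows.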
Data Sources
NoSQL
○ Cassandra - http://cassandra.apache.org/
○ MongoDB - http://www.mongodb.org/
○ CouchDB - http://couchdb.apache.org/
○ Infinispan - http://www.jboss.org/infinispan/
○ Hadoop - http://hadoop.apache.org/
■ HBase - http://hbase.apache.org/
■ Hive - http://hive.apache.org/
OLAP
○ Mondrian - http://mondrian.pentaho.com/
Source: http://gerardnico.com/wiki/database/oracle/oracle_olap
The Next Wave of Data Sources
Virtualization
○ Teiid - http://www.jboss.org/teiid/
Semantic Web/Graph
○ Sesame - http://www.openrdf.org/
○ Neo4j - http://www.neo4j.org/
○ OrientDB - http://www.orientdb.org/
○ Infogrid - http://infogrid.org/trac/
Source: http://www.ebizq.net/blogs/guest_session/2009/12/putting-data-to-work-for-cloud-bpm-mdm-and-soa-projects.php
Graph Database
Source: http://en.wikipedia.org/wiki/Graph_database
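As a rough sketch of the graph model, the Python below stores nodes and labeled edges in plain data structures and walks one relationship type. The node names, the `KNOWS` and `MEMBER_OF` labels, and the `neighbors` and `reachable` helpers are all hypothetical; real graph databases such as those listed above expose their own query languages and APIs.

```python
from collections import deque

# Hypothetical property graph: nodes carry properties, edges carry a label.
nodes = {
    "alice": {"type": "Person"},
    "bob": {"type": "Person"},
    "carol": {"type": "Person"},
    "chess_club": {"type": "Group"},
}
edges = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("alice", "MEMBER_OF", "chess_club"),
    ("carol", "MEMBER_OF", "chess_club"),
]


def neighbors(node, label):
    """Return the targets of edges leaving `node` with the given label."""
    return [dst for src, lbl, dst in edges if src == node and lbl == label]


def reachable(start, label):
    """Breadth-first walk along edges of one label, in discovery order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        for nxt in neighbors(current, label):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


friends_of_friends = reachable("alice", "KNOWS")
```

Following relationships this way, rather than joining tables, is the kind of query a graph store is designed around.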
Data Integration
Kettle - http://kettle.pentaho.com/
Talend - http://www.talend.com/
CloverETL - http://www.cloveretl.com/
Reporting
BIRT (Actuate) - http://www.eclipse.org/birt/phoenix/
Pentaho - http://reporting.pentaho.com/
Jaspersoft - http://community.jaspersoft.com/
Saiku - http://meteorite.bi/saiku
Full Stacks
SpagoBI - http://www.spagoworld.org/xwiki/bin/view/SpagoBI/#
Pentaho - http://www.pentaho.com/
Jaspersoft - http://www.jaspersoft.com/
Analytics
R - http://www.r-project.org/
Weka - http://www.cs.waikato.ac.nz/ml/weka/
RapidMiner - http://rapid-i.com/content/view/181/
Data Quality
Profiling
○ DataCleaner - http://datacleaner.org/
○ DQGuru - http://www.sqlpower.ca/page/dqguru
Suites
○ Talend - http://www.talend.com/products/data-quality
Testing
○ SQLUnit - http://sqlunit.sourceforge.net/
○ dbFit - http://benilovj.github.io/dbfit/
○ etlUnit - https://github.com/dbaAlex/etlUnit (shameless plug :p)
Data Governance
MDM
○ Talend - http://www.talend.com/resource/data-governance.html
Business Rules Engine
○ JBoss Drools - http://www.jboss.org/drools/
○ Open Rules - http://openrules.com/