Intern Presentation - Amit Jaspal

Free Text Search within Hive - Amit Jaspal - Software Engineering Intern, Cloudera Search - Graduate Student at University of Illinois Urbana-Champaign - Worked for D. E. Shaw & Co. before joining UIUC - Undergrad from Indian Institute of Information Technology

Upload: amit-jaspal

Post on 22-Aug-2015



TRANSCRIPT

Page 1: Intern Presentation - Amit Jaspal

Free Text Search within Hive

- Amit Jaspal
- Software Engineering Intern, Cloudera Search
- Graduate Student at University of Illinois Urbana-Champaign
- Worked for D. E. Shaw & Co. before joining UIUC
- Undergrad from Indian Institute of Information Technology

Page 2: Intern Presentation - Amit Jaspal

Motivation: Enabling Analysis of Unstructured Data

Page 3: Intern Presentation - Amit Jaspal

Integrating Solr with Hive

Page 4: Intern Presentation - Amit Jaspal

SolrStorageHandler - Integration Framework between Hive and Solr

Page 5: Intern Presentation - Amit Jaspal

How to use SolrStorageHandler in Hive

● CREATE EXTERNAL TABLE sales_contracts_solr ( id int, title string ... )
  STORED BY 'org.apache.hadoop.hive.solr.SolrStorageHandler'
  TBLPROPERTIES (
    'solr.zookeeper.service.ensemble' = '127.0.0.1:2181/solr',
    'solr.collection.name' = 'sales_contracts',
    'solr.query' = 'termsandconditions:*Sections 19 U.S.C. 1304*');

● SELECT *
  FROM sales_contracts_hive
  JOIN sales_contracts_solr
    ON sales_contracts_hive.id = sales_contracts_solr.id
  WHERE sales_contracts_hive.interest_rate > 10; -- i.e. interest rate above 10%
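
To make the intent of the two statements on this slide concrete, here is a minimal Python sketch of what the join computes. It is only a conceptual model and not how SolrStorageHandler executes the query: the substring match stands in for the Solr query, the helper names `solr_matches` and `join_on_id` are hypothetical, and all rows are made-up illustrative data.

```python
# Toy model of the Hive/Solr join on this slide.
# Assumptions: substring matching stands in for the 'solr.query' filter,
# and every row below is invented purely for illustration.

def solr_matches(docs, field, phrase):
    """Keep only documents whose text field contains the phrase (case-insensitive)."""
    return [d for d in docs if phrase.lower() in d[field].lower()]


def join_on_id(hive_rows, solr_rows, min_rate):
    """Inner join on id, keeping Hive rows whose interest_rate exceeds min_rate."""
    solr_by_id = {d["id"]: d for d in solr_rows}
    return [
        {**h, **solr_by_id[h["id"]]}
        for h in hive_rows
        if h["id"] in solr_by_id and h["interest_rate"] > min_rate
    ]


# Stand-in for the Solr collection 'sales_contracts' (free-text side).
solr_docs = [
    {"id": 1, "title": "Contract A",
     "termsandconditions": "Subject to Sections 19 U.S.C. 1304 marking rules."},
    {"id": 2, "title": "Contract B",
     "termsandconditions": "Standard warranty terms."},
    {"id": 3, "title": "Contract C",
     "termsandconditions": "See Sections 19 U.S.C. 1304."},
]

# Stand-in for the structured Hive table 'sales_contracts_hive'.
hive_rows = [
    {"id": 1, "interest_rate": 12.5},
    {"id": 2, "interest_rate": 15.0},
    {"id": 3, "interest_rate": 8.0},
]

matched = solr_matches(solr_docs, "termsandconditions", "Sections 19 U.S.C. 1304")
result = join_on_id(hive_rows, matched, 10)
print(result)
```

In this toy data only contract 1 survives: contract 2 fails the text filter and contract 3 fails the interest-rate predicate, which mirrors how the external table narrows the rows by free text before the structured condition is applied.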

Page 6: Intern Presentation - Amit Jaspal

Thanks

- Patrick Hunt (Manager and Mentor)
- Search Team
- Hive Team