intern presentation - amit jaspal
Free Text Search within Hive
- Amit Jaspal
- Software Engineering Intern, Cloudera Search
- Graduate Student at University of Illinois Urbana-Champaign
- Worked for D. E. Shaw & Co. before joining UIUC
- Undergrad from Indian Institute of Information Technology
Motivation: Enabling Analysis of Unstructured Data
Integrating Solr with Hive
SolrStorageHandler - Integration Framework between Hive and Solr
How to use SolrStorageHandler in Hive
● CREATE EXTERNAL TABLE sales_contracts_solr ( id int, title string ... )
  STORED BY 'org.apache.hadoop.hive.solr.SolrStorageHandler'
  TBLPROPERTIES (
    'solr.zookeeper.service.ensemble' = '127.0.0.1:2181/solr',
    'solr.collection.name' = 'sales_contracts',
    'solr.query' = 'termsandconditions:*Sections 19 U.S.C. 1304*');
● SELECT * FROM sales_contracts_hive JOIN sales_contracts_solr
  ON sales_contracts_hive.id = sales_contracts_solr.id
  WHERE sales_contracts_hive.interest_rate > 10;
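
To make the intent of the two statements concrete, here is a rough Python sketch of what the query computes, not of how SolrStorageHandler does it. The rows are made-up toy data, the helper names are hypothetical, and a plain substring test stands in for real Solr matching, which is far richer.

```python
# Toy stand-ins for the two tables. All values are invented for illustration.
hive_rows = [
    {"id": 1, "interest_rate": 12.0},
    {"id": 2, "interest_rate": 8.5},
    {"id": 3, "interest_rate": 15.0},
]
solr_docs = [
    {"id": 1, "title": "Contract A",
     "termsandconditions": "Goods marked per Sections 19 U.S.C. 1304 as required."},
    {"id": 2, "title": "Contract B",
     "termsandconditions": "Subject to Sections 19 U.S.C. 1304 marking rules."},
    {"id": 3, "title": "Contract C",
     "termsandconditions": "Standard delivery terms apply."},
]


def solr_filter(docs, field, phrase):
    """Crude stand-in for the 'solr.query' property: keep documents whose
    field contains the phrase (case-insensitive substring match)."""
    needle = phrase.lower()
    return [doc for doc in docs if needle in doc[field].lower()]


def join_on_id(left, right):
    """Inner join of two lists of dicts on the 'id' key."""
    right_by_id = {row["id"]: row for row in right}
    return [{**row, **right_by_id[row["id"]]}
            for row in left if row["id"] in right_by_id]


def high_interest_matches(rows, docs, phrase, min_rate):
    """Free-text filter on the Solr side, join on id, then the WHERE clause."""
    matched = solr_filter(docs, "termsandconditions", phrase)
    joined = join_on_id(rows, matched)
    return [row for row in joined if row["interest_rate"] > min_rate]


result = high_interest_matches(hive_rows, solr_docs, "19 U.S.C. 1304", 10)
print([row["id"] for row in result])  # prints [1]
```

Only contract 1 survives: contract 2 matches the text but has a lower rate, and contract 3 has a high rate but does not mention the statute. The point of the integration is that the text filter runs in Solr while the structured filter and join stay in Hive.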
Thanks
- Patrick Hunt (Manager and Mentor)
- Search Team
- Hive Team