@ApachePirk
Presented by: Ellison Anne Williams, Apache Pirk PPMC Member; Founder, EN|VEIL
Outline
What is Apache Pirk?
What is PIR?
Why Apache Pirk?
Pirk Basics
Roadmap
Get Involved
Appendix: Wideskies
What is Apache Pirk?
Framework for Scalable Private Information Retrieval (PIR)
Beautiful Blend of Mathematics & Computer Science
Developed at the National Security Agency
Donated to the Apache Software Foundation in July 2016
Undergoing Incubation within the Apache Incubator
Two ASF Releases To-Date - 0.3.0 Release Coming Soon
What is PIR?
PIR - Private Information Retrieval
Field of Theoretical Mathematics and Computer Science - ~20 years
Ability to Privately Retrieve Information from a Dataset
Without Revealing Any Information Regarding the Questions Asked OR the Results Obtained to the Dataset Owner or an Observer
Powered by Homomorphic Encryption
[Diagram: Querier and Responder exchange, Without PIR vs. With PIR]
With PIR
  Querier: I have a private question Q. I’m going to use PIR…
  Querier: I form E(Q)
  Querier to Responder: E(Q)
  Responder (holds the Data): Ask E(Q)
  Responder: Produce E(A)
  Responder to Querier: E(A)
  Querier: Answer A = D(E(A)). PIR is awesome!
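The exchange above can be sketched end to end with a toy additively homomorphic scheme. What follows is a minimal illustration only: it assumes a textbook Paillier cryptosystem with tiny hard-coded primes and a plain selection-vector query. It is not the Wideskies algorithm that Pirk implements, and the function names (`keygen`, `make_query`, `respond`) are hypothetical.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key from two small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """E(m) = (n+1)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """D(c) = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) / n."""
    lam, mu = priv
    n2 = n * n
    return ((pow(c, lam, n2) - 1) // n * mu) % n


def make_query(n, index, size):
    """Querier: E(Q) is an encrypted 0/1 vector with a single 1 at `index`."""
    return [encrypt(n, 1 if j == index else 0) for j in range(size)]


def respond(n, query, data):
    """Responder: E(A) = prod(E(b_j)^d_j), which decrypts to sum(b_j * d_j)."""
    n2 = n * n
    acc = 1
    for c, d in zip(query, data):
        acc = (acc * pow(c, d, n2)) % n2
    return acc


data = [17, 42, 99, 7]                 # responder's dataset (values < n)
n, priv = keygen()                     # querier keeps `priv` secret
query = make_query(n, 2, len(data))    # querier forms E(Q)
encrypted_answer = respond(n, query, data)   # responder produces E(A)
answer = decrypt(n, priv, encrypted_answer)  # querier computes A = D(E(A))
print(answer)  # 99
```

The responder only ever handles ciphertexts, so it learns neither the selected index nor the returned value. Real use needs primes of cryptographic size.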
Why Apache Pirk?
PIR Historically Largely Theoretical
Need for
  Practical PIR
  Robust and Deployable PIR Implementations
Apache Pirk
  Provides a Landing Place for Robust, Scalable PIR
  Fosters a Community Around Scalable PIR
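Pirk Basics
Querier
  Generates Encrypted Query Vectors
  Generates Necessary Decryption Items for Each Query Vector
  Decrypts Encrypted Results
Responder
  Performs Encrypted Queries
  Forms Encrypted Query Results
[Diagram: Querier and Responder exchange]
  Querier: I have a private question Q. I’m going to use PIRK…
  Querier: I form E(Q)
  Querier to Responder: E(Q)
  Responder: Ask E(Q)
  Responder: Produce E(A)
  Responder to Querier: E(A)
  Querier: Answer A = D(E(A)). PIRK is awesome!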
Beyond the Querier and Responder
Encryption Library
  Paillier Cryptosystem Currently Implemented
Data Schema Framework
Query Schema Framework
Generic Data Filter
Testing - Distributed and In-Memory Test Suites
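For readers unfamiliar with the Paillier cryptosystem named above, the property that makes it useful for encrypted queries is its additive homomorphism. In standard textbook notation (the symbols $n$, $m_1$, $m_2$, $k$ are not taken from the slides), with public modulus $n$:

```latex
\begin{align}
D\bigl(E(m_1) \cdot E(m_2) \bmod n^2\bigr) &= (m_1 + m_2) \bmod n \\
D\bigl(E(m)^{k} \bmod n^2\bigr) &= (k \cdot m) \bmod n
\end{align}
```

Multiplying ciphertexts adds the underlying plaintexts, and raising a ciphertext to a known power scales its plaintext, so a party without the private key can still combine encrypted values.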
Data Schema
{"date":"2016-02-20T23:29:05.000Z",
 "src_ip":"55.55.55.55",
 "event_type":"dns-hostname-query",
 "query_id":"9cef5344-3dee-41f9aa32da72d9f74778",
 "qtype":[1,0],
 "dest_ip":"1.2.3.6",
 "ip":["10.20.30.40","10.20.30.60"],
 "qname":"a.b.c.com",
 "rcode":0}
<schema>
  <schemaName>name of the schema</schemaName>
  <element>
    <name>element name</name>
    <type>class name or type name (if Java primitive type) of the element</type>
    <isArray>true or false -- whether or not the schema element is an array within the data</isArray>
    <partitioner>optional - Partitioner class for the element; defaults to primitive Java type partitioner</partitioner>
  </element>
</schema>
Data Schema
<schema>
  <schemaName>awesomeDataSchema</schemaName>
  <element>
    <name>date</name>
    <type>string</type>
    <isArray>false</isArray>
    <partitioner>org.apache.pirk.schema.data.partitioner.PrimitiveTypePartitioner</partitioner>
  </element>
  <!-- …. Lots more elements …. -->
</schema>
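To make the role of a data schema concrete, here is a rough sketch of reading a schema of the shape shown above and checking a record against it. It uses only Python's standard library, not Pirk's own Java loader. The `ip` and `rcode` elements, the `int` type name, the `PY_TYPES` mapping, and the functions `load_data_schema` and `conforms` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A shortened schema in the slide's format (partitioner omitted for brevity).
SCHEMA_XML = """
<schema>
  <schemaName>awesomeDataSchema</schemaName>
  <element><name>date</name><type>string</type><isArray>false</isArray></element>
  <element><name>ip</name><type>string</type><isArray>true</isArray></element>
  <element><name>rcode</name><type>int</type><isArray>false</isArray></element>
</schema>
"""

# Hypothetical mapping from schema type names to Python types.
PY_TYPES = {"string": str, "int": int}


def load_data_schema(xml_text):
    """Return (schema name, {element name: (type name, is_array)})."""
    root = ET.fromstring(xml_text)
    elements = {}
    for el in root.findall("element"):
        name = el.findtext("name").strip()
        type_name = el.findtext("type").strip()
        # Assumption: a missing isArray is treated as false.
        is_array = (el.findtext("isArray") or "false").strip().lower() == "true"
        elements[name] = (type_name, is_array)
    return root.findtext("schemaName").strip(), elements


def conforms(record, elements):
    """True if every schema element is present with the declared shape and type."""
    for name, (type_name, is_array) in elements.items():
        if name not in record:
            return False
        value = record[name]
        if is_array != isinstance(value, list):
            return False
        values = value if is_array else [value]
        if not all(isinstance(v, PY_TYPES[type_name]) for v in values):
            return False
    return True


schema_name, elements = load_data_schema(SCHEMA_XML)
record = {
    "date": "2016-02-20T23:29:05.000Z",
    "ip": ["10.20.30.40", "10.20.30.60"],
    "rcode": 0,
}
print(schema_name, conforms(record, elements))
```

The `isArray` flag is what separates a multi-valued field such as `ip` from a scalar field such as `rcode`.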
Query Schema
<schema>
  <schemaName>myAwesomeQuerySchema</schemaName>
  <dataSchemaName>superAwesomeDataSchema</dataSchemaName>
  <selectorName>name of the element in the data schema that will be the selector</selectorName>
  <elements>
    <name>element name</name>
  </elements>
  <filterNames>
    <name>(optional) element name of element in the data schema to apply pre-processing filters</name>
  </filterNames>
  <additional>
    (optional) additional fields for the query schema, in key/value pairs
    <field>
      <key>key corresponding to the field</key>
      <value>value corresponding to the field</value>
    </field>
  </additional>
</schema>
Query Schema
<schema>
  <schemaName>myAwesomeQuerySchema</schemaName>
  <dataSchemaName>superAwesomeDataSchema</dataSchemaName>
  <selectorName>qname</selectorName>
  <elements>
    <name>src_ip</name>
    <name>dest_ip</name>
  </elements>
  <filterNames>
    <name>google.com</name>
  </filterNames>
</schema>
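As a rough picture of what this query schema asks for, the sketch below reads the selector and the requested elements, then pulls the matching fields from the sample DNS record. It works entirely in the clear and skips the filter and encryption steps, so it shows only the shape of the data involved and not how Pirk's responder processes a query. The helper names `load_query_schema` and `project` are hypothetical.

```python
import xml.etree.ElementTree as ET

QUERY_SCHEMA_XML = """
<schema>
  <schemaName>myAwesomeQuerySchema</schemaName>
  <dataSchemaName>superAwesomeDataSchema</dataSchemaName>
  <selectorName>qname</selectorName>
  <elements>
    <name>src_ip</name>
    <name>dest_ip</name>
  </elements>
</schema>
"""


def load_query_schema(xml_text):
    """Return (selector element name, list of element names to return)."""
    root = ET.fromstring(xml_text)
    selector = root.findtext("selectorName").strip()
    names = [n.text.strip() for n in root.findall("elements/name")]
    return selector, names


def project(record, selector, names):
    """Return (selector value, requested fields) for one record."""
    return record[selector], {name: record[name] for name in names}


record = {
    "date": "2016-02-20T23:29:05.000Z",
    "src_ip": "55.55.55.55",
    "event_type": "dns-hostname-query",
    "qtype": [1, 0],
    "dest_ip": "1.2.3.6",
    "ip": ["10.20.30.40", "10.20.30.60"],
    "qname": "a.b.c.com",
    "rcode": 0,
}

selector, names = load_query_schema(QUERY_SCHEMA_XML)
key, fields = project(record, selector, names)
print(key, fields)
```

Here the selector value is the record's `qname`, and only `src_ip` and `dest_ip` would be carried back in a response.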
Algorithms & Implementations
Algorithms
  Wideskies with Paillier
Querier
  Standalone, Multi-threaded
Responder
  Standalone, Multi-threaded
  Distributed Batch: MapReduce, Spark; Data from HDFS, Elasticsearch
  Distributed Streaming: Storm, Spark Streaming; Data from Kafka
Roadmap
Implementation Roadmap
  Input Adaptors - NoSQL Databases: HBase, Accumulo; Kafka, NiFi
  Streaming - Storm and Heron, Spark Streaming, Flink
  Batch - Flink, Beam
Algorithmic Roadmap
  Secure Multiparty Computation, Private Set Intersection
  Fully Homomorphic Encryption
Always on the Roadmap
  Improvements/Optimizations to Existing Code
  Benchmarking
Get Involved
We ❤ Mathematicians and Computer Scientists
You don’t have to code to contribute!
Apache Pirk Website: http://pirk.incubator.apache.org
Mailing Lists - Submit and Discuss Ideas/Issues
  Dev: [email protected]
  [email protected]
@ApachePirk
Thanks!
Wideskies Appendix