Mastering Your Universe with P4 Search
DESCRIPTION
P4 Search is a tool built internally and open-sourced in the Perforce Workshop. It creates and uses an external search index to allow users to search the content of a Perforce Server. This talk will explain the inner workings of P4 Search, its setup and applications, and explore ideas on how to extend this great and essential tool.
TRANSCRIPT
![Page 1: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/1.jpg)
# Mastering Your Universe
P4Search
Sven Erik Knop, Technical Marketing Manager
Ralf Gronkowski, Principal Product Consultant
![Page 2: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/2.jpg)
Sven Erik Knop, Perforce Software
Ralf Gronkowski, Perforce Software
![Page 3: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/3.jpg)
# Overview
• Why P4Search?
• What is P4Search?
• Implementation Details and Demonstration
![Page 4: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/4.jpg)
# Why P4Search?
![Page 5: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/5.jpg)
# What is Search?
• File names, Changes ...: p4 files / p4 fstat / ...
• File content? (C#, .h, JAVA, PPTX): ???
![Page 6: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/6.jpg)
# p4 grep
• Built-in command, since Perforce 2010.1
• Search files stored in P4D based on content
  – Case sensitive and insensitive searches
  – Can use regular expressions
  – Can search through all revisions
  – Provide context search
• Returns depot paths
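The search options listed on this slide correspond to `p4 grep` flags. The following is a minimal sketch, assuming a small Python wrapper; the helper name `build_p4_grep_args` is hypothetical, and it only assembles the argument list without running anything.

```python
from typing import List, Optional


def build_p4_grep_args(
    pattern: str,
    path: str,
    ignore_case: bool = False,
    all_revisions: bool = False,
    context: Optional[int] = None,
) -> List[str]:
    """Assemble a `p4 grep` argument list (hypothetical helper, nothing is executed)."""
    args = ["p4", "grep"]
    if ignore_case:
        args.append("-i")  # case-insensitive search
    if all_revisions:
        args.append("-a")  # search all revisions, not only the head revision
    if context is not None:
        args += ["-C", str(context)]  # lines of context around each match
    args += ["-e", pattern, path]  # -e introduces the pattern to search for
    return args


print(" ".join(build_p4_grep_args("EBolt", "//depot/...", ignore_case=True, context=2)))
```

Passing the result to `subprocess.run` would execute the search against whatever server the local `p4` client is configured for.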
![Page 7: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/7.jpg)
# What’s Not to Like?
• A few drawbacks:
  – Text search only, limited to lines of up to 4K characters
  – No search for Metadata such as attributes
• Performance concerns:
  – Limited to 10,000 revisions by default
  – Memory and CPU consumption
  – But: lockless with peeking since 2013.3
![Page 8: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/8.jpg)
# Solution: External Index
• Search engine indexes content and stores it in its own database
• Users search the index first; the index returns a depot path
• Index and Perforce Server can live on separate hosts
(Diagram labels: p4 files / p4 fstat, index, store, search)
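The index/search split on this slide can be illustrated with a toy inverted index. This is only a sketch of the idea, not how Solr or Lucene are implemented; the class name `ToyIndex` and the depot paths are made up for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, Set


class ToyIndex:
    """In-memory stand-in for an external search index (not Solr/Lucene)."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, depot_path: str, content: str) -> None:
        # Index step: tokenize the content and record which path contains each token.
        for token in re.findall(r"\w+", content.lower()):
            self._postings[token].add(depot_path)

    def search(self, term: str) -> Set[str]:
        # Search step: the index answers with depot paths, without asking the server.
        return set(self._postings.get(term.lower(), set()))


index = ToyIndex()
index.add("//depot/main/Widget.java", "public class Widget extends Bolt")
index.add("//depot/main/README.txt", "Widget usage notes")
print(sorted(index.search("widget")))
```

Because a query is answered entirely from the stored postings, the index can live on a different host from the Perforce Server; the server is only needed when content is indexed or when a returned depot path is fetched.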
![Page 9: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/9.jpg)
# Apache Lucene, Solr and Tika
• Lucene
  – Scalable, high performance indexing
  – Search Algorithms
• Solr
  – Stand-alone enterprise search server
  – HTML Administration interface
  – Extensible
• Tika
  – Content analysis tool
![Page 10: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/10.jpg)
# Additional Components Required
• P4Search
  – Index queue (processing indexing requests)
  – Search controller (security)
  – RESTful API (integration into other tools)
  – UI (simple searches)
• Runs in Jetty
![Page 11: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/11.jpg)
# What We Want to Search For
//depot/Talkhouse/rel1.0/com/walkerbros/common/widget/EBolt.java#10
![Page 12: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/12.jpg)
# What We Don’t Want to Search For
• Changes/Changelists
• Branches
• Jobs
• Users
• Workspaces
• Depots
![Page 13: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/13.jpg)
# What We Search By
• Content
• Metadata (whatever that might be)
![Page 14: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/14.jpg)
# There is Content …
![Page 15: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/15.jpg)
# And There is P4 Metadata
• Accessible through p4 files / p4 fstat ...
![Page 16: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/16.jpg)
# And There is Common Metadata
![Page 17: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/17.jpg)
# There is Even Custom P4 Metadata
• For ordinary folks
  – p4 edit file
  – p4 attribute -n tags -v cool file
  – p4 submit -d "just defined a cool tag on file rev"
• For admins
  – p4 attribute -f -n tags -v cool file#rev
• Find them with
  – p4 fstat -Oa -F "attr-tags=cool" //depot/...
![Page 18: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/18.jpg)
# P4Search Will Index ...
• File content
• P4 Metadata
• P4 attributes
• And the common Metadata if desired
![Page 19: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/19.jpg)
# Details
![Page 20: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/20.jpg)
# What We Store in Solr
+ other fields
![Page 21: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/21.jpg)
# Solr Search Does Know A Lot But…
No ACLs, no permissions
![Page 22: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/22.jpg)
# So A Search Controller
• Is query endpoint for users
• Has simplified API
• Provides P4 authentication (password|ticket)
• Filters query results honoring the existing P4 protections
![Page 23: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/23.jpg)
# Accessing the Index
• P4Search: Search controller
• Solr: Search index
![Page 24: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/24.jpg)
# Security Concerns
• External index and protection table?
• Solution:
  – Use a programmable search engine
  – Use Perforce protections to filter results
Users need read access to files to be able to search
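Filtering index hits by protections can be sketched as a post-processing step over the raw results. This is a simplified illustration, not P4Search's actual code: the function names, the prefix table, and the users are all hypothetical, and a real controller would ask the Perforce Server whether the authenticated user has read access.

```python
from typing import Callable, Iterable, List


def filter_readable(
    hits: Iterable[str],
    user: str,
    can_read: Callable[[str, str], bool],
) -> List[str]:
    """Keep only the hits the user may read; `can_read` stands in for a protections check."""
    return [path for path in hits if can_read(user, path)]


# Hypothetical stand-in for the protections table: user -> readable depot path prefixes.
READABLE_PREFIXES = {
    "alice": ["//depot/"],
    "bob": ["//depot/public/"],
}


def prefix_check(user: str, depot_path: str) -> bool:
    # Unknown users get no prefixes and therefore see no results.
    return any(depot_path.startswith(p) for p in READABLE_PREFIXES.get(user, []))


raw_hits = ["//depot/public/README.txt", "//depot/secret/plan.txt"]
print(filter_readable(raw_hits, "bob", prefix_check))
```

Keeping the check behind a callback means the index itself never needs to store ACLs; the same raw hits yield different visible results per user.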
![Page 25: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/25.jpg)
# Implementation
• Jetty
  – Solr
    • Lucene
• Jetty
  – P4Search
    • Search queue/Indexer
    • Search controller
    • RESTful API
    • UI
![Page 26: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/26.jpg)
# Open source – Where To Find
• swarm.workshop.perforce.com/projects/perforce-software-p4search/files/main
![Page 27: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/27.jpg)
# Installation
• Download from the Workshop
• Follow the provided instructions to install
• Run two services
  – p4search-solr
  – p4search-jetty
![Page 28: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/28.jpg)
# Ways to Populate the Index
• On first run index your entire depot
  – You probably don’t want to do this
• On submit index new file revs
  – change-commit trigger on depot location
• At any time any given change
  – curl -X POST --data commit,change# http://p4search:8080/api/queue/{token}
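The queue request on this slide can also be built programmatically. The sketch below assumes that `change#` in the payload stands for a changelist number and that `{token}` is a value issued by the P4Search installation; the function name and the sample values are hypothetical, and the request is constructed but deliberately not sent.

```python
from urllib.request import Request


def build_queue_request(base_url: str, token: str, change: int) -> Request:
    """Build (but do not send) the POST that queues one change for indexing."""
    body = f"commit,{change}".encode("ascii")  # payload format as shown on the slide
    return Request(f"{base_url}/api/queue/{token}", data=body, method="POST")


# Hypothetical host, token, and change number, for illustration only.
req = build_queue_request("http://p4search:8080", "TOKEN", 12345)
print(req.get_method(), req.full_url, req.data)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` against a running P4Search instance.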
![Page 29: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/29.jpg)
# Who Uses P4Search Today
• Indexing
  – P4D via the trigger, so ultimately any given client and user
• Searching
  – P4Search UI
  – Piper
  – Commons
  – Custom through P4Search API
![Page 30: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/30.jpg)
# Tweaking P4Search
• Deep dive after learning Lucene/Solr
• Starting point: p4search/solr/example/solr/collection1/conf
  – schema.xml
  – solrconfig.xml
![Page 31: Mastering Your Universe with P4 Search](https://reader033.vdocument.in/reader033/viewer/2022061118/5469b9b1af7959653c8b4cd4/html5/thumbnails/31.jpg)
# DEMO