The future of search: How to stay relevant in Sourcing
Greg Lindahl, CTO, blekko
October 13, 2011 - SourceCon
A little about me
• Technologist, not a sourcer
• I did get sourced once, in 1995
• I’m proud to have the ugliest slides at the conference!
A little about you
• You guys are heavy, sophisticated users of Google & specialized engines like Topsy
• You’re eager to learn about new things
• You quickly form opinions about what’s useful
Challenges in Sourcing
• The order of search results is based on incoming links & PageRank
• You guys are heavy users of advanced features such as boolean search
• Social networks are “walled gardens”
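For readers who have not seen how link-based ranking works, here is a minimal sketch of the textbook PageRank power iteration. It is a simplification for illustration, not what Google runs today; the four-page link graph, the damping value of 0.85, and the iteration count are all assumptions made up for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> list of pages it links to.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its current rank evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web, invented for this example.
LINKS = {
    "resume": ["employer"],
    "blog": ["employer", "resume"],
    "forum": ["employer"],
    "employer": ["blog"],
}

RANKS = pagerank(LINKS)
for page, score in sorted(RANKS.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

The page with the most incoming links ("employer") comes out on top, and the page nobody links to ("forum") comes out last. That is the sourcing problem in miniature: a candidate's personal page with few inbound links sinks in the results.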
PageRank Today
Advanced interfaces
• No one uses the advanced search interface of Google
• You’re lucky that it hasn’t disappeared!
• New Google algorithms trying to guess your intent are probably a net minus for Sourcers
Social Networks
• Facebook is mostly a “walled garden”: many users don’t want their personal info to be public
• Facebook mixes work and play
• LinkedIn is a pure play, but younger people don’t use it
• Twitter is open, but a mixture of work and play
Search implications of social
• General search engines won’t be good enough, now or in the future
• It will remain hard to try to match up candidates with several social accounts plus a web presence
Things you should know: Bigrams
• Google counts:
  – java programmer OR developer: 80 million
  – “java programmer” OR “java developer”: 23 million
• Why?
  – Everyone indexes pairs and triples of words which are thought to be related
    • Names, job titles, common word pairs
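To make the bigram idea concrete, here is a minimal sketch of a toy index that stores both single words and adjacent word pairs. It is an illustration only, not how Google or blekko actually build their indexes; the four documents and the function names (`build_index`, `loose_hits`, `phrase_hits`) are invented for this example.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a real engine does far more."""
    return text.lower().split()


def build_index(docs):
    """Map each word and each adjacent word pair (bigram) to the set of doc ids."""
    words = defaultdict(set)
    bigrams = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        for tok in tokens:
            words[tok].add(doc_id)
        for first, second in zip(tokens, tokens[1:]):
            bigrams[(first, second)].add(doc_id)
    return words, bigrams


def loose_hits(words):
    """Unquoted query: 'java' plus either job word, anywhere in the document."""
    return words["java"] & (words["programmer"] | words["developer"])


def phrase_hits(bigrams):
    """Quoted query: the two words must sit next to each other."""
    return bigrams[("java", "programmer")] | bigrams[("java", "developer")]


# Hypothetical documents, invented for this example.
DOCS = {
    1: "senior java developer in boston",
    2: "java programmer wanted",
    3: "developer who drinks java every morning",
    4: "python programmer",
}

WORDS, BIGRAMS = build_index(DOCS)
print(sorted(loose_hits(WORDS)))     # documents 1, 2 and 3
print(sorted(phrase_hits(BIGRAMS)))  # documents 1 and 2 only
```

Document 3 matches the loose query but not the quoted one, which is the same effect as the 80 million versus 23 million counts above: the quoted form is a single bigram lookup and drops pages where the words merely co-occur.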
Future search directions
• “semantic search”
• 2 parts:
  – understanding the source documents better
    • bigrams of names are just a start
  – understanding your query intent better
• This will hurt advanced search and boolean queries!
Future search directions
• “real-time” search: Twitter and non-Twitter
• Non-Twitter real-time incorporated directly into the major search engines
• Twitter search best in specialized engines
New market entrants
• blekko
• yandex
• duckduckgo
slash the web
Silicon Valley handshake
• blekko was founded in 2007
• $55mm in financing, 29 employees
• Backers: USVP, CMEA Ventures, Yandex (strategic), Ron Conway, Marc Andreessen, …, Ashton Kutcher
Curated Search – No Spam, High Quality
✓ Wikipedia model: Users identify top sites for every category
✓ Technology: blekko uses social data plus algorithms to make more relevant, spam-free search results
Algorithms + People = Better Search
Slashtag basics
• Both algorithmic and human-curated
• Curated slashtags developed in conjunction with outside partners, such as Stack Overflow
• Type ‘em directly into the search box: Greg Lindahl /date
Slashtags
• Sort order: /date, /relevance
• Narrow your search
  – Algorithmic: /forum /blog /gov /edu
  – Site: /foxnews.com
  – Curated list of websites: /health
• Human-edited by groups (think DMOZ or Wikipedia)
• Every user has their own namespace, plus there’s a namespace for /blekko/ tags
Tips:
• Use /findslashtags to find tags: python /findslashtags
• Use /web to get rid of any unwanted autoslashtags
• Watch for slashtag suggestions as you type
• We have some API outcalls that might save you time: /twitter = Twitter search API, /video = YouTube, /images = Bing, /bing, /google
Advanced Slashtags
• Can be negated: -/foxnews.com
• Implicit -/spam on every search
  – and you can add any result to it with 1 click
  – use this to get rid of all those “people finders”
• Can use multiple tags to intersect: /linux /blogs /date
• Can include tags in other tags to “OR” them
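Negation, intersection, and inclusion behave like set algebra over lists of sites. Here is a minimal sketch of that idea, assuming a slashtag is nothing more than a named set of hostnames. The tag contents, the hostnames, and the `filter_results` helper are hypothetical and say nothing about how blekko implements slashtags internally.

```python
# Hypothetical slashtags: each maps to a set of hostnames (invented data).
SLASHTAGS = {
    "linux": {"kernel.org", "lwn.net", "linuxblog.example"},
    "blogs": {"linuxblog.example", "cooking.example"},
    "spam": {"peoplefinder.example"},
}

# Including tags inside another tag ORs them: the new tag is the union.
SLASHTAGS["linux-or-blogs"] = SLASHTAGS["linux"] | SLASHTAGS["blogs"]


def filter_results(hosts, include=(), exclude=("spam",)):
    """Keep hosts that are in every included tag and in no excluded tag.

    The default `exclude` mirrors the implicit -/spam on every search.
    """
    kept = []
    for host in hosts:
        in_all = all(host in SLASHTAGS[tag] for tag in include)
        in_none = not any(host in SLASHTAGS[tag] for tag in exclude)
        if in_all and in_none:
            kept.append(host)
    return kept


# Hypothetical result hosts for some query, in ranked order.
RESULTS = [
    "kernel.org",
    "linuxblog.example",
    "cooking.example",
    "peoplefinder.example",
]

# Intersect two tags, like: /linux /blogs
print(filter_results(RESULTS, include=["linux", "blogs"]))
# Negate a tag, like: -/blogs (on top of the implicit -/spam)
print(filter_results(RESULTS, exclude=["spam", "blogs"]))
# OR two tags by searching the tag that includes both
print(filter_results(RESULTS, include=["linux-or-blogs"]))
```

The people-finder host never appears in any of the three outputs, because the spam exclusion applies unless you override it.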
Sourcer Slashtags
• /people -- algorithmic attempt to find resumes, bios, etc.
• /blogs and /forums, -/blogs and -/forums
• Develop a /spam slashtag, or maybe even several of them that you manually add to various searches
Programming slashtags
• /open-source -- aliases /oss /foss /floss
• /linux, /lkml, /bsd, /windows
• /repo -- open source public repositories
• /apache, /fsf -- umbrella organizations
• /perl, /cpan, /php, /javascript, /python, /django, /ruby, /rails, /java, /erlang, /scheme -- languages
• /hpc, /make, /hacker, /hackerspaces
A few examples
• Java programmers with high performance computing experience:
  java hpc /people
• Follow up on a candidate:
  “marcus watts” java /blogs
  “marcus watts” java -/blogs
  “marcus watts” java -/blogs /date
  “marcus watts” java -/blogs /date=2008-2009
• Ok, more:
  “marcus watts” /twitter
  “marcus watts” /youtube -- oops, basketball guy
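The follow-up queries above share one pattern: a quoted name, a few keywords, then included and negated slashtags. If you run many variations per candidate, a small helper can assemble the strings for you. This is a sketch only: `build_query` is a hypothetical function, not a blekko API, and it just produces text to paste into the search box.

```python
def build_query(name, keywords=(), include=(), exclude=(), date_range=None):
    """Assemble a blekko-style query string.

    name        exact-match phrase, wrapped in double quotes
    keywords    extra bare search terms
    include     slashtags to require, given without the leading slash
    exclude     slashtags to negate, written out as -/tag
    date_range  (start_year, end_year), written out as /date=START-END
    """
    parts = [f'"{name}"']
    parts.extend(keywords)
    parts.extend(f"/{tag}" for tag in include)
    parts.extend(f"-/{tag}" for tag in exclude)
    if date_range is not None:
        start, end = date_range
        parts.append(f"/date={start}-{end}")
    return " ".join(parts)


print(build_query("marcus watts", ["java"], include=["blogs"]))
print(build_query("marcus watts", ["java"], exclude=["blogs"], date_range=(2008, 2009)))
print(build_query("marcus watts", include=["twitter"]))
```

Keeping the name quoted in every variation matters for the same reason as the bigram counts earlier: the quoted form matches the name as a phrase rather than two loose words.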
The bad stuff
• blekko’s crawl is fairly small, 2 billion pages today – increasing this fall
• Some of the programming slashtags aren’t as good as others – this will improve over time
For more info
• help.blekko.com
  – Add blekko to the list on the upper right search box
• blekko toolbar
To Sum Up
• “Let me explain. No, there is too much. Let me sum up.”
• Search is evolving in good and bad ways
• New tools pop up every year
• I would love to hear feedback: [email protected]