Automating Tinder with Eigenfaces and StanfordNLP
Hours and hours of swiping
Not everyone is interested/attracted
Who wants to sink endless hours with these odds?
What’s the Problem, dude?
1+ hr average per day
4+ matches average per week
Infinite time wasted
A Problem for Technology
Performing both daily engineering and research-based projects at 3 Tier Logic, I’ve learned how to quickly hack things in Scala and the JVM. I figured a “Tinder Bot” was do-able.
Resources. Spend less time developing, more time automating.
Scoping. What specific problems can be solved with technology?
Research. Use secondary research to figure out what has already been done in this field.
A Problem of Problems
Swiping.
• Others have used swipe-all strategies
• Creates a bigger problem => more matches to filter
• You might not think everyone is attractive
Interest.
• Even after a match, not everyone is interested
• Matches suddenly “go dark”
• You simply don’t get along
Spammers.
• Self-explanatory.
A Problem of Problems, solved.
Swiping.
• Use some sort of machine learning/A.I. technique that can be “taught” who I find attractive
Interest.
• Develop a chat bot that will hold a generic conversation for a couple messages to filter the uninterested
Spammers.
• Set up detection rules to filter spammers
What tools are needed?
Scala
• Type-safe, easy to hack, and a huge advantage with Java interoperability.
Eigenfaces
• Battle-tested facial recognition since the 1980s.
• Best algorithmic performance and easy to use on the Java Virtual Machine
StanfordNLP
• Well-developed JVM-based NLP library compatible with Scala
What is Eigenfaces?
Need to do quick and dirty object or facial recognition? Eigenfaces may be for you.
The essence of Eigenfaces
Eigenfaces is the name given to a set of eigenvectors when they are used for facial recognition.
A typical Eigenfaces calculation works as follows:
1. Obtain a training set of faces and convert to a pixel matrix
2. Compute the mean image (which is an average of pixel intensity across each image).
3. Compute a differential matrix by subtracting the mean from each training image, pixel by pixel
4. Compute covariance matrix of the differential matrix
Your “average face” may look a little uncanny…
5. Calculate eigenvectors from covariance matrix
6. Compute Eigenfaces by multiplying the eigenvectors by the differential matrix, then normalizing each result
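Steps 2 to 4 above can be sketched in plain Scala as follows. This is only an illustration, not the project's `MatrixHelpers`: the `EigenfaceSteps` object, its method names, and the pixel-by-image array layout are assumptions. The covariance here is the small, unscaled image-by-image product of the differential matrix with itself, which is the usual shortcut when there are far more pixels than images; the eigen-decomposition in step 5 is left to a linear algebra library.

```scala
object EigenfaceSteps {
  // Assumed layout: pixelMatrix(j)(k) is the intensity of pixel j in training image k.
  type Matrix = Array[Array[Double]]

  /** Step 2: mean intensity of each pixel across all training images. */
  def meanColumn(pixelMatrix: Matrix): Array[Double] =
    pixelMatrix.map(row => row.sum / row.length)

  /** Step 3: subtract the mean from every image, pixel by pixel. */
  def differenceMatrix(pixelMatrix: Matrix, mean: Array[Double]): Matrix =
    pixelMatrix.zip(mean).map { case (row, m) => row.map(_ - m) }

  /** Step 4: small covariance matrix (imageCount x imageCount), left unscaled. */
  def covarianceMatrix(diff: Matrix): Matrix = {
    val imageCount = diff.head.length
    Array.tabulate(imageCount, imageCount) { (a, b) =>
      diff.map(row => row(a) * row(b)).sum
    }
  }
}
```

Because the eigenvectors come from the small matrix, they have one component per image rather than per pixel, which is why step 6 multiplies them by the differential matrix to get back to pixel space.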
Putting it together
def computeEigenFaces(pixelMatrix: Array[Array[Double]], meanColumn: Array[Double]): DoubleMatrix2D = {
  val diffMatrix = MatrixHelpers.computeDifferenceMatrixPixels(pixelMatrix, meanColumn)
  val covarianceMatrix = MatrixHelpers.computeCovarianceMatrix(pixelMatrix, diffMatrix)
  val eigenVectors = MatrixHelpers.computeEigenVectors(covarianceMatrix)
  computeEigenFaces(eigenVectors, diffMatrix)
}
Multiplying eigenvectors/differential
(0 to (rank-1)).foreach { i =>
  var sumSquare = 0.0
  // Project eigenvector i back into pixel space through the differential matrix
  (0 to (pixelCount-1)).foreach { j =>
    (0 to (imageCount-1)).foreach { k =>
      eigenFaces(j)(i) += diffMatrix(j)(k) * eigenVectors.get(i,k)
    }
    sumSquare += eigenFaces(j)(i) * eigenFaces(j)(i)
  }
  // Normalize eigenface i to unit length
  val norm = Math.sqrt(sumSquare)
  (0 to (pixelCount-1)).foreach { j =>
    eigenFaces(j)(i) /= norm
  }
}
Preprocessing is key
You need to preprocess your images!
Grayscale. Important for calculating pixel intensity values.
Normalization. Not all lighting conditions are equal.
Cropping. Very important to focus only on facial features.
Without preprocessing, you’re gonna have a bad time.
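A minimal sketch of the three preprocessing steps, using only `java.awt.image.BufferedImage`. The `Preprocess` object and its method names are hypothetical, the luminance weights are the common 0.299/0.587/0.114 convention rather than anything stated in the talk, and the crop rectangle is assumed to come from some separate face detector.

```scala
import java.awt.image.BufferedImage

object Preprocess {
  /** Grayscale: convert an RGB image to row-major intensities in the 0-255 range. */
  def toGrayscale(img: BufferedImage): Array[Double] =
    (for {
      y <- 0 until img.getHeight
      x <- 0 until img.getWidth
    } yield {
      val rgb = img.getRGB(x, y)
      val r = (rgb >> 16) & 0xff
      val g = (rgb >> 8) & 0xff
      val b = rgb & 0xff
      0.299 * r + 0.587 * g + 0.114 * b
    }).toArray

  /** Normalization: stretch intensities to 0-1 to reduce lighting differences.
    * Expects a non-empty array. */
  def normalize(pixels: Array[Double]): Array[Double] = {
    val lo = pixels.min
    val hi = pixels.max
    if (hi == lo) pixels.map(_ => 0.0)
    else pixels.map(p => (p - lo) / (hi - lo))
  }

  /** Cropping: keep only the face rectangle (coordinates supplied by the caller). */
  def crop(img: BufferedImage, x: Int, y: Int, w: Int, h: Int): BufferedImage =
    img.getSubimage(x, y, w, h)
}
```

Min-max stretching is only one way to even out lighting; histogram equalization is a common alternative.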
Scala Advantages
Interoperability is a win. Compatibility with Java means we can use useful classes like `BufferedImage` while keeping Scala’s simplicity*.
val meanImage = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY)
val raster = meanImage.getRaster()
*Scala is simpler for this particular situation, IMO
Uncanny Results
Averaging my selections proved interesting.
People I disliked smiled less and had rounder faces, while the opposite was true for those I found attractive.
Potential for Eigenfaces?
What else can we do with these great faces?
• Subjects that can be read 2-dimensionally, from same angle
• Optical Character Recognition (OCR)
• Image segmentation
• http://www.cs.huji.ac.il/~yweiss/iccv99.pdf
It isn’t a Google Deep Dream, but it has potential…
Play 2. Rebuilt the entire Tinder interface in Play Scala MVC framework for desktop.
Chat Bot. Bot in background is semi-intelligent and looks for uninitiated conversations.
Notifications. Desktop browser notifications alert for new chats.
A Non-Typical Conversation
Before Natural Language Processing could be used to analyze replies of conversations, a structure was needed to map progress of conversations.
• Analyze reply depth
• Provide a path to next reply
• Determine if notification was necessary
Scala Tree Structures
Trees track progress and replies of conversations.
Trees Codified
case class MessageTree(
  val value: String,
  val left: Option[MessageTree] = None,
  val right: Option[MessageTree] = None
) {

  /** Walk to the child node in the given direction. */
  def walk(direction: Direction): Option[MessageTree] = {
    direction match {
      case Right => this.right
      case Left => this.left
    }
  }
}
Message trees are simple binary trees.
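To show how such a tree can be followed over several replies, here is a self-contained sketch. The `walkTree` helper below is a guess at the idea, not the project's implementation, and the `Direction` objects are redefined locally (assuming `Left` selects the left child and `Right` the right child) so the example stands alone.

```scala
object MessageTreeSketch {
  sealed trait Direction
  case object Left extends Direction
  case object Right extends Direction

  case class MessageTree(
    value: String,
    left: Option[MessageTree] = None,
    right: Option[MessageTree] = None
  ) {
    /** Step to the child in the given direction, if there is one. */
    def walk(direction: Direction): Option[MessageTree] = direction match {
      case Left  => left
      case Right => right
    }
  }

  /** Follow one direction per reply; None means the scripted conversation ran out. */
  def walkTree(root: MessageTree, directions: List[Direction]): Option[MessageTree] =
    directions.foldLeft(Option(root)) { (node, dir) => node.flatMap(_.walk(dir)) }
}
```

Folding over `Option` means a missing branch anywhere along the path simply yields `None`, which is a natural point to hand the conversation back to a human.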
Walking the Tree
FunMessages.messages.find(_.value == theTreeRoot) match {
  case None => createStopGap(m, true)
  case Some(tree) =>
    val sentiments = MessageUtil
      .assignSentimentDirection(MessageUtil.filterSenderMessages(userId, m.messages))
      .map(_._2)
    MessageTree.walkTree(tree, sentiments) match {
      case None => createStopGap(m, true)
      case Some(branch) =>
        new TinderApi(Some(xAuthToken))
          .sendMessage(m._id, branch.value).map { result => …
Note: pattern matching isn’t the only way to do this.
Sentiment analysis was the easy part.
• Library already had trained models for sentiment
• Split each match’s reply into sentences and score sentiment
• Use score to determine reply direction
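The last bullet can be sketched as a pure function. Stanford's sentiment model assigns each sentence a class from 0 (very negative) to 4 (very positive), with 2 as neutral. The `SentimentDirection` object, its method names, and the rule that neutral-or-better replies go right are assumptions made for illustration.

```scala
object SentimentDirection {
  sealed trait Direction
  case object Left extends Direction
  case object Right extends Direction

  // Sentiment classes run from 0 (very negative) to 4 (very positive).
  val Neutral = 2.0

  /** Average the per-sentence classes; an empty reply counts as neutral. */
  def average(sentenceClasses: Seq[Int]): Double =
    if (sentenceClasses.isEmpty) Neutral
    else sentenceClasses.sum.toDouble / sentenceClasses.size

  /** Assumption: neutral-or-better replies continue down the right branch. */
  def direction(sentenceClasses: Seq[Int]): Direction =
    if (average(sentenceClasses) >= Neutral) Right else Left
}
```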
Ready for StanfordNLP
Sentiment of reply determined direction of tree.
val pipeline = new StanfordCoreNLP(nlpProps)
val annotation = pipeline.process(message)
var sentiments: ListBuffer[Double] = ListBuffer()
// Looping over the Java list of sentences assumes Scala's Java collection conversions are in scope
for (sentence <- annotation.get(classOf[CoreAnnotations.SentencesAnnotation])) {
  val tree = sentence.get(classOf[SentimentCoreAnnotations.AnnotatedTree])
  val sentiment = RNNCoreAnnotations.getPredictedClass(tree)
  val partText = sentence.toString
  sentiments += sentiment.toDouble
}
// Classes run from 0 (very negative) to 4 (very positive); default to neutral (2) when there are no sentences
val averageSentiment: Double = {
  if (sentiments.size > 0) sentiments.sum / sentiments.size
  else 2
}
Create Reply Trees
object FunMessages {

  def messages = List(
    MessageTree(
      value = "{name} are you a fan of avocados?",
      right = Some(MessageTree(
        value = "So if I asked you to have a guacamole party with me you'd do it?",
        right = …,
        left = …
      )) …
Now we have a list of generic replies to open conversations.
General Rules of Selection
Number of photos.
Applicable for both spammers and matching, a profile with one or zero photos was not worth the time.
Length of bio.
An empty or short bio was a strong indicator of spammer presence.
Activity.
If they haven’t been active for a while, they probably won’t respond soon anyways ;)
Integrate with Selection
if(rec.photos.size==2 && rec.bio=="") dislikeUser("sparse photos, no bio")
else if (rec.photos.size==1) dislikeUser("sparse photos")
else if (lastSeenAgo > (day*3)) dislikeUser("hasn't been active for %s days".format((lastSeenAgo/day)))
else if (!photoCriteria(rec.photos)) dislikeUser("failed photo criteria")
else if (rec.bio.matches("no.{0,15}hookups")) dislikeUser("claiming friendship only")
else if (autoLike) likeUser("auto-liked")
else {
  recommendation.FacialRecommendation.makeComparison(user.user._id, rec._id, rec.photos) match {
    case Some(true) => likeUser("face matched positive recommendation criteria") …
Implementing the rules in code in SwipeTask.scala.
Notes about Actors
Use Actors for State, Futures for Concurrency
• If you need concurrency for basic computational performance, use Futures
• If you’re setting up a router firing to multiple workers, you can use Actors
• If you need something to track state from outside messages, such as counting, use Actors
• And futures are composable!
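As a small illustration of the "futures are composable" point, the sketch below starts two independent computations and combines their results with a for-comprehension. The scoring functions are hypothetical stand-ins, not part of the bot, and the blocking `Await` is there only so the demo returns a plain value.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureComposition {
  // Hypothetical stand-ins for two independent checks on a profile.
  def facialScore(photoCount: Int): Future[Double] = Future(photoCount * 0.25)
  def bioScore(bio: String): Future[Double] = Future(if (bio.isEmpty) 0.0 else 1.0)

  /** Start both computations, then combine their results without blocking. */
  def combinedScore(photoCount: Int, bio: String): Future[Double] = {
    val face = facialScore(photoCount) // both futures are created here,
    val text = bioScore(bio)           // so they can run concurrently
    for (f <- face; t <- text) yield f + t
  }

  // Blocking is only for the demo; real code would keep composing futures.
  def demo(): Double = Await.result(combinedScore(2, "hello"), 5.seconds)
}
```

Creating the futures before the for-comprehension matters: if they were created inside it, the second would not start until the first had finished.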
Queue System with Actors
Concurrent and queued calculations were a must.
1. Top-level bot service iterates through data, looks for tasks.
2. Tasks are spawned in their appropriate actors:
   1. MessageReplyTask
   2. SwipeTask
   3. FacialCheckTask
3. Tasks are then placed in a timed queue.
I found an advantage in following this anti-pattern: I was able to throttle the amount of computation (and messaging) without overwhelming my local CPU and the Tinder API. In hindsight, it may have been better to make each Actor a worker.
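The throttling idea can be shown without Akka as a queue that releases a fixed number of tasks per tick. This is a simplified sketch, not the bot's throttle: `TimedQueue`, `perTick`, and `tick` are hypothetical names, and a real scheduler would call `tick` on a timer.

```scala
import scala.collection.immutable.Queue

object ThrottleSketch {
  /** Tasks waiting to run, plus how many may be released on each tick. */
  case class TimedQueue[A](pending: Queue[A], perTick: Int) {
    /** Add a task to the back of the queue. */
    def enqueue(task: A): TimedQueue[A] = copy(pending = pending :+ task)

    /** Release at most `perTick` tasks; the rest wait for the next tick. */
    def tick: (List[A], TimedQueue[A]) = {
      val (ready, rest) = pending.splitAt(perTick)
      (ready.toList, copy(pending = rest))
    }
  }
}
```

Keeping the queue immutable makes the release rate easy to reason about and test; the length of `pending` is the kind of value a supervisor could watch to decide when to slow down or sleep.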
The Bot Service
class TinderBot(taskWarningThreshold: Int, taskSleepThreshold: Int) extends Actor {
  // Throttler and supervisor watch all of the work
  val botThrottle = context.actorOf(Props(new BotThrottle(1 msgsPer (2 seconds), Some(self))), "BotThrottle")
  val botSupervisor = context.actorOf(Props(new BotSupervisor(self)), "BotSupervisor")
  def receive = {
    // send commands to the bot
    case BotCommand(command) => …
    // logic for handling queue state
    case QueueState(queueLength) => …
  }
}
Admittedly, a little heavyweight…
Easy Scala Services
Scala made it somewhat easy to create micro-services.
private class UpdatesTask extends Actor {
  def receive = {
    case "tick" =>
      TinderService.activeSessions.foreach { s => syncUpdates(s) }
  }
}
private val updateActor = Akka.system.actorOf(Props[UpdatesTask], name = "UpdatesTask")
private val updateService = {
  Akka.system.scheduler.schedule(0 seconds, 40 seconds, updateActor, "tick")
}
• One key mistake above is that I wasn’t storing state in the UpdatesTask actor; I was storing it elsewhere!
• Akka is especially useful for creating timed micro-services like the above
• There are other ways to do this, too…
Links
GitHub Project • https://github.com/crockpotveggies/tinderbox
Scala • http://www.scala-lang.org
Eigenfaces • https://en.wikipedia.org/wiki/Eigenface
StanfordNLP • http://nlp.stanford.edu/
Akka • http://akka.io/