Disco Rectangles

:: art, quil

I was playing with quil recently (got a project planned which I will speak about later) and managed to throw this together in a short while:

Clojure source available here.

Augmenting enlive

:: clojure, enlive, htmlcleaner, scraping

When extracting features from HTML documents, I find myself needing the same operations all the time: removing script tags, comments and the like. HtmlCleaner already provides this feature set, so I combined it with enlive to produce enlive-helper.

Now you can do:

 (java.io.StringReader. "<html><body><a>hi</a></body></html>") 
 :prune-tags "a")

And as a result the a tag is not picked up:

({:tag :html,
  :attrs nil,
  :content
  ({:tag :head, :attrs nil, :content nil}
   {:tag :body, :attrs nil, :content nil})})

The options you can pass mirror those of HtmlCleaner. Full docs available in this github repo.

Also, the code is something I threw together from my research so it is released under Matt Might’s CRAPL license.

Diagnosis by Google Doesn’t Work

:: information-retrieval, SIGIR, healthcare, symptoms

I have often Googled for symptoms, visited WebMD (and concluded that I have a deadly disease). At SIGIR 2013, Ryen White’s paper, Beliefs and Biases in Web Search, provided empirical evidence for the poor success rate of diagnosis-by-google.

The study mined medical yes/no questions (for example: can salmonella cause belly-ache?), had physicians answer them, and then measured user bias post-search: after perusing the results, users answered their original question with a yes or a no. The paper describes the experiments in much more detail.

The accuracy of the final answer was the most interesting part of this paper - only about half of the questions were accurately answered. That is as good as flipping a (fair) coin for each question. The rest of the paper was a fairly interesting read (and it won the SIGIR 2013 best paper award).

Consistent Hashing in Clojure

:: clojure, consistent-hashing, hotspots

I wrote this post to teach myself consistent hashing - a simple hash family that Akamai’s founders came up with. This was originally done to prepare for a talk in my grad algorithms class (I made a horlicks of the talk but whatever). I am going to provide intuition, analysis and a Clojure implementation.
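To give a flavour of the idea before the full post, here is a minimal sketch of a hash ring using only clojure.core. It is not the implementation from the post. The names (point, add-node, remove-node, lookup), the node names and the replica count of 100 are all assumptions made for illustration, and Clojure's built-in hash stands in for whatever hash function a real system would pick.

```clojure
;; Hypothetical sketch: the ring is a sorted map from a point on the ring
;; to the node that owns that point.

(defn point
  "Position of x on the ring. Uses clojure.core/hash purely for brevity."
  [x]
  (hash x))

(defn add-node
  "Places `replicas` virtual points for `node` on the ring.
  If two points collide, the later one silently wins."
  [ring node replicas]
  (reduce (fn [r i] (assoc r (point (str node "#" i)) node))
          ring
          (range replicas)))

(defn remove-node
  "Drops every point owned by `node`."
  [ring node]
  (reduce-kv (fn [r k v] (if (= v node) (dissoc r k) r))
             ring
             ring))

(defn lookup
  "Owner of key k: the first point at or after k's position,
  wrapping around to the smallest point if there is none."
  [ring k]
  (when (seq ring)
    (val (or (first (subseq ring >= (point k)))
             (first ring)))))

;; Three made-up cache nodes with 100 virtual points each.
(def ring
  (reduce (fn [r node] (add-node r node 100))
          (sorted-map)
          ["cache-a" "cache-b" "cache-c"]))

(lookup ring "some-url")                          ; one of the three nodes
(lookup (remove-node ring "cache-b") "some-url")  ; never "cache-b"
```

The property worth noticing is that removing a node only reassigns the keys that node owned; every other key keeps its owner, because its nearest point on the ring is untouched. The virtual points are there to spread each node around the ring so that no single node ends up owning one large arc.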
