Posts tagged 'paper'

Indic Language Identification

2020-11-19

paper, machine-learning, ml, nlp, natural-language

We discuss language identification of noisy, romanized text - an un-addressed but critical problem in Indic text mining, and release a language-identification utility. We then measure geographic extents of language use in India. Summary of a WNUT 2020 paper.

The TW-IDF Model

2014-03-17

cikm, paper,

I found this recent paper on adhoc-IR from CIKM 2013. I haven’t worked in this area but the results seem promising. Essentially, instead of TF, you store an indegree count. The graph in question is the term co-occurrence graph within a window with the direction indicating word-order. The paper won hon. mention at the CIKM so it is clearly very cool.

Link to paper

The Percolator Paper

2013-07-24

reading, paper, percolator, google-research, map-reduce

In the IR reading group this week I decided to read the Percolator paper from Google[1]. It caused quite a stir on several news-reading sites after a Google Research blog-post on the topic. Since I’ve never had the chance to read it, this is as good a time as any. This is not a comprehensive summary at all and lots of results here are hand-wavy. If you want to instruct yourself, please read the paper.