Tuesday, February 17, 2009

[week 6] Muddiest Points

This week's lecture was about Evaluation of IR Systems. In addition to the weekly book reading, we read two papers:
  • Karen Sparck Jones. "What's the value of TREC: is there a gap to jump or a chasm to bridge?" ACM SIGIR Forum, Volume 40, Issue 1, June 2006.
  • Kalervo Järvelin, Jaana Kekäläinen. "Cumulated gain-based evaluation of IR techniques." ACM Transactions on Information Systems (TOIS), Volume 20, Issue 4 (October 2002), Pages 422–446.
TREC
Through the IIR book chapter and the first paper, I was introduced to TREC (Text REtrieval Conference):
1. I was a little surprised, when reading the paper about TREC, by the author's claim: "...this progress... has not so far enabled the research community it represents to say: 'if your retrieval case is like this, do this' as opposed to 'well, with tuning, this sort of thing could serve you alright'." Despite Google's success, is this still the prevailing feeling in the IR community?
2. What sources are used to generate the content of the biology and law tracks of TREC?
3. How does the TREC consortium decide to stop a track or to create a new one?

Interpolation on P-R graphs
Regarding precision-recall graphs, I wonder whether the interpolation process may, in some cases, lead to a misleading interpretation of an IR system's performance; the sketch below illustrates what I mean.
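To make the concern concrete, here is a minimal sketch (assuming the standard interpolation rule from the book: interpolated precision at a recall level is the maximum precision observed at that level or any higher one). The ranked list of relevance judgments is made up purely for illustration; on it, the interpolated curve reports a higher precision around recall 0.7–0.8 than the raw points actually reached, which is the kind of smoothing that could be misread.

```python
# Sketch of 11-point interpolated precision on a hypothetical ranked list.
# The interpolation rule: interp_P(r) = max precision over all points with recall >= r.

def precision_recall_points(relevance, total_relevant):
    """Raw (recall, precision) points after each retrieved document.

    relevance      -- list of 0/1 judgments in ranked order
    total_relevant -- number of relevant documents in the collection
    """
    points, hits = [], 0
    for k, rel in enumerate(relevance, start=1):
        hits += rel
        points.append((hits / total_relevant, hits / k))
    return points

def interpolated_precision(points, levels):
    """Interpolated precision at each recall level:
    the maximum precision over all points with recall >= level (0 if none)."""
    return [max((p for r, p in points if r >= level), default=0.0)
            for level in levels]

if __name__ == "__main__":
    # Hypothetical ranking: 1 = relevant, 0 = non-relevant; 5 relevant docs in total.
    ranking = [1, 0, 1, 0, 0, 1, 0, 0, 1, 1]
    raw = precision_recall_points(ranking, total_relevant=5)
    levels = [i / 10 for i in range(11)]   # the standard 11 recall levels
    interp = interpolated_precision(raw, levels)
    print("raw (recall, precision):", [(round(r, 2), round(p, 2)) for r, p in raw])
    print("11-point interpolated  :", [round(p, 2) for p in interp])
```

Running this, the raw precision at the point where recall first reaches 0.8 is about 0.44, but the interpolated value at recall levels 0.7 and 0.8 is 0.5, borrowed from the later point at recall 1.0. That is exactly the smoothing effect my question is about.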
