Tuesday, February 10, 2009

[week 6] Reading Notes

After reading Chapter 8 of IIR and the paper about the TREC collection, I started to look more deeply into the different domains covered by this project. It was especially interesting to find a TREC Genomics track and also a TREC Legal track.

About the first one, I read some papers and found out that the track ran from 2003 to 2007. In a paper written by the leader of the project, William Hersh, he states that the project was a success but did not advance much beyond the state of the art in IR. In his words:

<<...As with all TREC activity, the short cycle of experimentation and reporting of results has prevented more detailed investigation of different approaches. However, there emerged some evidence that some resources from the genomics/bioinformatics could contribute to improving retrieval, especially controlled lists of terminology used in query expansion, although their improvement over standard state-of-the-art IR was not substantial. ...>>

What is the real gain in creating these test datasets? Is there, for example, any new algorithm that came out of experiments using TREC data?

Another question I raised was whether there is any research on automatically creating queries, using NLP, from the description of the user's need.
