Blogging About Information Retrieval: [week 5] Reading Notes

Two questions came up this week:

1. The binary independence model assumes that terms occur in documents independently. The authors say that although this assumption is not right, in practice the model performs satisfactorily on some occasions. Is there any explanation for this result? Is this practical evidence limited to English, or does it also hold for Chinese and Arabic?

2. In chapter 12, the authors say that most of the time the Stop and (1 - Stop) probabilities are omitted from the language model. If omitting them means the model no longer defines a well-formed language model (according to Equation 12.1), why do the authors do this?
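
To make question 2 concrete, here is a small sketch of what I think is going on. It is not taken from the book. The two toy unigram models, the value `p_stop = 0.2`, and the function names are all my own assumptions, and I assume a stop/continue decision is made before each token. With the Stop terms, the probabilities over all strings add up towards 1, as I read Equation 12.1 to require. Without them, every string length contributes a total of 1, so the sum keeps growing. For a fixed query, though, the Stop terms multiply every document's score by the same factor, so the ranking does not seem to change.

```python
from itertools import product
from math import prod

# Hypothetical toy unigram models for two documents (token probabilities sum to 1).
doc_a = {"frog": 0.7, "toad": 0.3}
doc_b = {"frog": 0.4, "toad": 0.6}

def string_prob(tokens, unigram, p_stop=None):
    """Probability of a token sequence under a unigram model.

    If p_stop is given, assume a continue/stop decision before each token:
    continue with probability (1 - p_stop), then stop after the last token.
    If p_stop is None, the Stop terms are simply omitted.
    """
    p = prod(unigram[t] for t in tokens)
    if p_stop is not None:
        p *= (1 - p_stop) ** len(tokens) * p_stop
    return p

def total_mass(unigram, max_len, p_stop=None):
    """Sum of string probabilities over all strings of length 0..max_len."""
    return sum(
        string_prob(s, unigram, p_stop)
        for n in range(max_len + 1)
        for s in product(unigram, repeat=n)
    )

def rank(query, models, p_stop=None):
    """Model names ordered by query likelihood, best first."""
    return sorted(models, key=lambda name: string_prob(query, models[name], p_stop),
                  reverse=True)

if __name__ == "__main__":
    print(total_mass(doc_a, 10, p_stop=0.2))  # stays below 1, approaches it as max_len grows
    print(total_mass(doc_a, 10))              # grows by 1 for every extra length
    models = {"a": doc_a, "b": doc_b}
    query = ("frog", "toad", "frog")
    print(rank(query, models, p_stop=0.2), rank(query, models))
```

If this reading is right, dropping the Stop terms breaks the distribution over strings but leaves the ordering of documents for a given query untouched, which may be why the authors feel free to omit them.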

See you on Thursday in class!

Tuesday, February 3, 2009
