Friday, March 20, 2009

[week 11] Muddiest Points

This week the topic was "Web Search", and these are my two cents:

1. When did search engines (before Google) start to treat the link structure of the web as something important to incorporate into their algorithms? I know the HITS algorithm performs link analysis, but when did commercial search engines start using it seriously? I want to find out whether this was what tipped the scales in favor of Google.

2. In the model of the Web as a core with incoming and outgoing links, how is the 22% of disconnected pages calculated? If they are disconnected, how can we be sure the real share is not lower or higher?

3. How does a search engine decide where to start crawling? What are the most common heuristics for making this decision?
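
A tentative note on point 2: as far as I understand, the percentages in this kind of model are fractions of the pages in one crawl snapshot, not of the whole Web, so a "disconnected" page is one the crawler already knows about but that has no link path to or from the core. The sketch below is only my own toy illustration of that idea in Python. The function names (`reachable`, `bow_tie`), the tiny graph, and the choice of page `a` as a known core page are all assumptions for the example, and the `rest` bucket lumps together everything that is neither core, IN, nor OUT.

```python
from collections import deque


def reachable(start, adjacency):
    """Return every node reachable from `start` by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(nodes, edges, core_seed):
    """Split crawled pages into core / in / out / rest relative to `core_seed`.

    `core_seed` is assumed to be a page already known to sit in the core.
    """
    forward, backward = {}, {}
    for src, dst in edges:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)

    downstream = reachable(core_seed, forward)   # pages the seed can reach
    upstream = reachable(core_seed, backward)    # pages that can reach the seed

    core = downstream & upstream                 # mutually reachable pages
    return {
        "core": core,
        "in": upstream - core,
        "out": downstream - core,
        "rest": set(nodes) - upstream - downstream,
    }


# Hypothetical crawl snapshot of eight pages.
nodes = ["a", "b", "c", "d", "e", "f", "g", "h"]
edges = [
    ("a", "b"), ("b", "c"), ("c", "a"),  # a cycle, which forms the core
    ("d", "a"),                          # d links into the core
    ("c", "e"),                          # the core links out to e
    ("f", "g"),                          # a small island with no path to the core
]

parts = bow_tie(nodes, edges, "a")
shares = {name: len(members) / len(nodes) for name, members in parts.items()}
print(shares)
```

If this reading is right, the reported share depends on which pages ended up in the crawl, so a different snapshot could give a lower or higher number.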
