This week the topic was "Web Search", and here are my two cents:
1. When did search engines (before Google) start to consider the link structure of the web as an important signal to incorporate into their ranking algorithms? I know the HITS algorithm performs link analysis, but when did commercial search engines start using it seriously? I want to find out whether this was the drop that tipped the scale in favor of Google. (A sketch of what HITS actually computes follows these questions.)
2. In the bow-tie model of the Web (a strongly connected core plus incoming and outgoing components), how is the 22% of disconnected pages calculated? If those pages are disconnected from the rest, how can we be sure the real figure is not higher or lower? (A component-counting sketch follows below.)
3. How does a search engine decide where to start crawling? What are the most common heuristics for making this decision? (A crawl-ordering sketch follows below.)
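
On question 1, to make "link analysis" concrete: here is a minimal power-iteration sketch of Kleinberg's HITS on a toy link graph. The graph and the iteration count are illustrative assumptions, not how any commercial engine actually implemented it.

```python
import numpy as np

def hits(adjacency: np.ndarray, iterations: int = 50):
    """Power-iteration sketch of Kleinberg's HITS.

    adjacency[i, j] == 1 means page i links to page j.
    Returns (hub_scores, authority_scores), each L2-normalized.
    """
    n = adjacency.shape[0]
    hubs = np.ones(n)
    auths = np.ones(n)
    for _ in range(iterations):
        # A page is a good authority if good hubs point to it...
        auths = adjacency.T @ hubs
        # ...and a good hub if it points to good authorities.
        hubs = adjacency @ auths
        auths /= np.linalg.norm(auths)
        hubs /= np.linalg.norm(hubs)
    return hubs, auths

# Toy graph: pages 0 and 1 both link to page 2, so page 2
# should come out as the strongest authority.
A = np.array([[0, 0, 1],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)
hubs, auths = hits(A)
print("hubs:", hubs.round(3), "authorities:", auths.round(3))
```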
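
On question 2, my understanding is that figures like that 22% come from taking a large crawl snapshot, treating every hyperlink as undirected, computing weakly connected components, and counting the pages that fall outside the giant component; the answer is only as good as the crawl's coverage, which is exactly why the figure could be off in either direction. A rough union-find sketch of that computation (the toy graph is made up, and this is not the actual pipeline from the bow-tie study):

```python
from collections import defaultdict

def disconnected_fraction(n: int, edges: list[tuple[int, int]]) -> float:
    """Share of pages outside the giant weakly connected component.

    n     -- number of pages in the crawl sample
    edges -- (src, dst) hyperlinks between those pages
    """
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, dst in edges:
        root_a, root_b = find(src), find(dst)
        if root_a != root_b:
            parent[root_a] = root_b

    # Tally component sizes; the largest is the "giant" component.
    sizes = defaultdict(int)
    for node in range(n):
        sizes[find(node)] += 1
    return 1.0 - max(sizes.values()) / n

# Toy sample: pages 0-3 form one component, 4-5 another, 6 is isolated,
# so 3 of the 7 pages lie outside the giant component.
print(disconnected_fraction(7, [(0, 1), (1, 2), (2, 3), (4, 5)]))
```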
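
On question 3, one cheap heuristic I have seen described is to seed the frontier with hand-picked hubs (directories, popular sites, submitted URLs) and then prioritize URLs by how many in-links the crawl has seen for them so far; real crawlers also weigh things like PageRank estimates and per-domain politeness. A sketch under those assumptions, where `fetch_links` is a hypothetical stub that downloads a page and returns its outlinks:

```python
import heapq

def crawl(seeds, fetch_links, max_pages=100):
    """Seed-and-prioritize crawl ordering (in-link-count heuristic)."""
    in_links = {url: 1 for url in seeds}     # in-link counts seen so far
    frontier = [(-1, url) for url in seeds]  # max-heap via negated counts
    heapq.heapify(frontier)
    visited = set()

    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)
        if url in visited:
            continue  # duplicate heap entry; URL already crawled
        visited.add(url)
        for out in fetch_links(url):
            in_links[out] = in_links.get(out, 0) + 1
            # Re-push with the updated count instead of editing the heap.
            heapq.heappush(frontier, (-in_links[out], out))
    return visited

# Toy web: a directory-style seed page links out to two sites.
toy_web = {
    "dmoz.org": ["a.com", "b.com"],
    "a.com": ["b.com"],
    "b.com": [],
}
print(crawl(["dmoz.org"], lambda url: toy_web.get(url, [])))
```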