Sunday, January 25, 2009

[week 3] Muddiest Points

My first muddiest point concerns the data structures used for access methods. Are hash tables, binary trees, and B-trees the only options? Are other structures, such as linked lists or doubly linked lists, not commonly used for this purpose?

Regarding the compression methods, I wonder how indexers deal with character encodings, for example Unicode encodings such as UTF-8 and UTF-16, or ISO-8859. Should these encodings change the way IR systems index or compress the data? If so, in what way?
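
To make the encoding question concrete, here is a minimal Python sketch showing that the same term occupies a different number of bytes under each encoding. The function name `encoded_sizes` and the sample word are my own choices for illustration, not something from the course material. It does not show how any real IR system handles this. It only illustrates why the choice of encoding could matter for index size.

```python
# Compare how many bytes one term needs under different character encodings.
# "utf-16-le" is used instead of "utf-16" so that no byte-order mark is added.
ENCODINGS = ["utf-8", "utf-16-le", "iso-8859-1"]


def encoded_sizes(term):
    """Return a dict mapping each encoding name to the byte length of `term`."""
    return {enc: len(term.encode(enc)) for enc in ENCODINGS}


if __name__ == "__main__":
    term = "caf\u00e9"  # the word "café", with one non-ASCII character
    for enc, size in encoded_sizes(term).items():
        print(enc, size)
```

For this word the byte sequences differ between encodings, so an indexer that compared raw bytes would presumably need to convert everything to a single encoding before building its dictionary.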
