INFORMATION RETRIEVAL
Stefano Mizzaro
OBJECTIVES
* Knowledge and comprehension skills: know both basic topics and advanced research trends of the field
* Practical skills: apply basic principles to design, analyse and evaluate IR systems
* Independent judgment skills: judge the quality of different design choices
* Communication skills: describe how IR systems work
* Learning skills: learn new indexing and retrieval techniques
CONTENTS
Detailed contents:
* Classical IR:
– formal IR models (Boolean, vector space, probabilistic and variants such as BM25, language models);
– structure of the inverted index (basics, compression);
– user interfaces for IR (classification, survey);
– classification (definition, naive Bayes classifiers);
– clustering (hierarchical and approximate algorithms);
– evaluation (foundations, methodologies, metrics; research topics).
* Web IR:
– Web graph (size and shape: small world and scale-free networks, bow-tie shape);
– link analysis for ranking and other applications (PageRank, HITS, variants);
– crawling (concepts and architecture);
– spam (short account);
– search engine architecture (short account).
* Case studies and specific issues.
TEXTS
* C. D. Manning, P. Raghavan, H. Schütze. Introduction to Information Retrieval, Cambridge University Press, 2008. http://nlp.stanford.edu/IR-book/
* B. Croft, D. Metzler, T. Strohman. Search Engines: Information Retrieval in Practice, Addison Wesley, 2009.
* Other books and papers as detailed during lectures.