More and more information is available in unstructured or poorly structured form, such as text documents, web pages, videos, photos, music, and blogs. The goal of this course is to give students the foundations of managing unstructured or poorly structured information.
The course aims to help students understand techniques for indexing, retrieving, filtering, clustering, and presenting textual and multimedia information held in digital archives, on the web, and in multimedia systems. In this respect, the course complements the earlier Data Management course, which deals only with structured information.
The course consists of theoretical lectures and practical sessions. The practical sessions cover the design, implementation, and evaluation of an information retrieval system for a small to medium-sized collection of documents.
The examination consists of three theoretical tests and one individual course project carried out during the semester (there is no final exam).
- Required: C. Zhai and S. Massung. Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining. ACM Books, 2016.
- Suggested: W.B. Croft, D. Metzler, and T. Strohman. Search Engines: Information Retrieval in Practice. Pearson, 2009.