| This course presents an overview of the theory and design of information retrieval systems for unformatted data, with an emphasis on text. The focus is on internet search engines, which are discussed in the context of earlier fixed-collection systems. We explore both the Boolean and ranked vector space models of retrieval, as well as variants used in research and commercial systems. We also discuss a variety of text processing techniques and algorithms, such as parsing, stemming, compression, and string searching. Information retrieval is a great case study for broader issues in building systems that scale and perform, so we cover associated issues in data structures, algorithms, computational complexity, and measurement. Students will build a mini search engine. Prerequisite: SEIS 610; recommended: SEIS 630 |