Information Retrieval and Web Search

| Instructor: | Rada Mihalcea |
| Office: | Research Park, F228, tel: 940-369-7630 |
| Email: | rada at cs unt edu |
| Class hours: | TTh 12:30-01:50pm |
| Office hours: | TTh 03:00-04:00pm or by appointment. Anytime electronically. |
| Teaching assistant: | Satya Mudunuru |
| Email: | chandu at unt edu |
| Office hours: | MW 12-2pm, F205 |
| Course description: | This course will cover traditional material, as well as recent advances in Information Retrieval (IR), the study of indexing, processing, and querying textual data. Basic retrieval models, algorithms, and IR system implementations will be covered. The course will also address more advanced topics in "intelligent" IR, including Natural Language Processing techniques, and "smart" Web agents. |

| Date | Lecture | Reading material | NB | |
| 01/15/2008 | Course overview (ppt) | - | - | |
| 01/17/2008 | Introduction to IR models and methods [ppt] | - | - | |
| 01/22/2008 | Perl tutorial (ppt) | - | - | |
| 01/24/2008 | Perl tutorial (ppt) | - | - | |
| 01/29/2008 | Text processing [ppt] | Porter stemmer; [CM] Chap.2: The term vocabulary & postings lists | - | |
| 01/31/2008 | Text properties [ppt]; Web Spidering [ppt] | [CM] Chap.5: Index compression, sect.5.1; [CM] Chap.20: Web crawling and indexes; Optional reading: [BY] chapter 6.3 | Lecturer: Andras Csomai | |
| 02/05/2008 | Practical problems in web spidering [ppt] | - | Lecturer: Andras Csomai; Assignment 1 issued | |
| 02/07/2008 | Vector space model [ppt] | [CM] Chap.6: Scoring, term weighting and the vector space model | Lecturer: Andras Csomai | |
| 02/12/2008 | Boolean model and extensions [ppt] | [CM] Chap.1: Boolean retrieval | Lecturer: Hakan Ceylan | |
| 02/14/2008 | Alternative IR models. [ppt] | [CM] Chap.11: Probabilistic IR; [CM] Chap.18: LSA | Lecturer: Hakan Ceylan | |
| 02/19/2008 | Review IR models; IR evaluation and IR test collections. [ppt] | [CM] Chap.8: Evaluation in information retrieval | - | |
| 02/21/2008 | Term weighting schemes | [CM] Chap.6: Scoring, term weighting and the vector space model; [KSJ] Term weighting approaches, pg. 323 | Assignment 1 due. Assignment 2 issued | |
| 02/26/2008 | Relevance feedback. [ppt] | [CM] Chap.9. | - | |
| 02/28/2008 | Query expansion [ppt]; Text classification [ppt] | [CM] Chapter 13. | - | |
| 03/04/2008 | Text classification [ppt]; See also: Intro Machine Learning [ppt] | [CM] Chapter 13. | - | |
| 03/07/2008 | Learning language from its perceptual context. Invited speaker: Ray Mooney, University of Texas at Austin. F228, 11am | - | Note the unusual day/time/room | |
| 03/11/2008 | Review: exam I preparation | All the material studied so far | - | |
| 03/13/2008 | Exam I | - | Assignment 2 due on 03/14 | |
| 03/18/2008 | Spring break | - | - | |
| 03/20/2008 | Spring break | - | - | |
| 03/25/2008 | Link analysis. HITS. PageRank. | [CM] Chapter 21; Page, L. et al., "The PageRank Citation Ranking: Bringing Order to the Web". Also check this page. | Assignment 3 issued on 03/20. | |
| 03/28/2008 | Keyword Extraction and Back-of-the-Book Indexing. Andras Csomai. F228, 10am | - | Note the unusual day/time/place | |
| 04/01/2008 | Question Answering [ppt] | Check the TREC Q&A site | - | |
| 04/03/2008 | Topic Sensitive PageRank | Haveliwala. "Topic-Sensitive PageRank" [pdf] | - | |
| 04/08/2008 | Special Topics: Topic Sensitive PageRank; Special Topics: Introduction to Information Extraction. (ppt) | - | - | |
| 04/10/2008 | Special Topics: Cross language Information Retrieval (ppt) | Check the Cross Language Evaluation Forum CLEF | Assignment 3 due. | |
| 04/15/2008 | Special Topics: Web 2.0: Wikis and Blogs | - | - | |
| 04/17/2008 | Special topics: Recommender Systems | - | - | |
| 04/22/2008 | Search Engine Technologies; Exam II preparation | - | All material studied so far (papers included) | - |
| 04/24/2008 | Exam II | - | All material studied so far (papers included) | - |
| 04/29/2008 | Project presentations I. | - | - | |
| 05/01/2008 | Project presentations II. | - | - | |


| Readings in Information Retrieval | K. Sparck Jones and P. Willett |
| Modern Information Retrieval | Ricardo Baeza-Yates and Berthier Ribeiro-Neto |
| Information Retrieval: Data Structures and Algorithms | W. Frakes and R. Baeza-Yates |
