Search Engines for Web and Enterprise Data

The Hong Kong University of Science and Technology via Coursera

Overview

This course introduces the technologies behind web and enterprise search engines, including document indexing, searching, and ranking. You will also learn performance metrics for evaluating search quality, methods for understanding user intent and document semantics, and advanced applications such as recommendation systems and summarization. Real-life examples and case studies reinforce the understanding of search algorithms.

Syllabus

  • Introduction to Search Engines for Web and Enterprise Data
    • Welcome to the first module of this course! In this module, you will learn: (1) The major tasks involved in web search. (2) The history, evolution, impact, and challenges of web search engines.
  • Search Engine Business Model
    • In this module, you will learn: (1) Different business models of web search engines.
  • TFxIDF
    • In this module, you will learn: (1) Different information retrieval models, including Boolean and statistical models. (2) How to determine important words in a document using TFxIDF.
  • Vector Space Model
    • In this module, you will learn: (1) How to represent a document or query as a vector of keywords. (2) How to determine the degree of similarity between a pair of vectors using different similarity measures, including Inner Product, Cosine Similarity, Jaccard Coefficient, and Dice Coefficient.
  • Inverted Files
    • In this module, you will learn: (1) How to index documents using inverted files. (2) How to perform updates and deletions on inverted files.
  • Extended Boolean Model
    • In this module, you will learn: (1) How to use the Extended Boolean Model to rank documents. (2) How to evaluate conjunctive and disjunctive queries using the Extended Boolean Model.
  • PageRank
    • In this module, you will learn: (1) The history and evolution of link-based ranking methods. (2) How to determine query/document similarities using HyPursuit, WISE, and PageRank. (3) Possible extensions that can be applied to PageRank.
  • HITS Algorithm
    • In this module, you will learn: (1) How to calculate hub and authority scores of web documents using the HITS algorithm. (2) The re-ranking process involved in the HITS algorithm.
  • Performance Evaluation of Information Retrieval System
    • In this module, you will learn: (1) How to evaluate the retrieval effectiveness of an information retrieval system using Precision, Recall, F-Measure, Average Precision, DCG, and NDCG. (2) Which subjective relevance measures can be used on an information retrieval system.
  • Benchmarking
    • In this module, you will learn: (1) How to use the TREC collection for benchmarking. (2) The characteristics of the TREC collection.
  • Stopword removal and Stemming
    • In this module, you will learn: (1) What stemming is. (2) Different context-sensitive and context-free stemming algorithms. (3) How to calculate Successor Variety and Entropy for stemming.
  • Relevance Feedback
    • In this module, you will learn: (1) How to perform document space modification using relevance feedback. (2) How to perform query modification using relevance feedback.
  • Personalized Web Search
    • In this module, you will learn: (1) Why relative preference is more useful than absolute preference in personalization. (2) The importance of eye-tracking user studies in personalized web search. (3) How to model preferences as a weighted vector.
  • Index Term Selection
    • In this module, you will learn: (1) How to calculate the discrimination value for index term selection. (2) The importance of word usage in documents in search engine design.
  • Discovering Phrases and Correlated Terms
    • In this module, you will learn: (1) How to use collocated terms in lieu of strict phrases in search. (2) How to identify collocated terms using Pointwise Mutual Information (PMI). (3) How to utilize N-grams for search.
  • Enterprise Search Engine
    • In this module, you will learn: (1) The challenges of enterprise search. (2) The differences between web search and enterprise search.
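
To give a feel for the TFxIDF and Vector Space Model modules, here is a minimal Python sketch of term weighting and cosine similarity. It is an illustration only, not material from the course: it assumes the common weighting tf × log2(N / df), which may differ from the variant taught, and the function names and the three toy documents are made up for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: raw term frequency times log2(N / document frequency).
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log2(n / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy collection, tokenized by whitespace.
docs = [
    "web search engine ranking".split(),
    "enterprise search engine".split(),
    "hub and authority scores".split(),
]
vecs = tfidf_vectors(docs)
```

With this toy collection, `cosine(vecs[0], vecs[1])` is positive because the first two documents share "search" and "engine", while `cosine(vecs[0], vecs[2])` is zero because they share no terms.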

Taught by

Kenneth W T Leung and Dik Lun LEE
