Date on Master's Thesis/Doctoral Dissertation


Document Type

Master's Thesis

Degree Name

M. Eng.


Department

Computer Engineering and Computer Science

Committee Chair

Badia, Antonio Emilio


Subject

World Wide Web; Information organization


Abstract

This thesis proposes a new method of automatic taxonomy generation that uses the link structure of Web pages. A taxonomy is a hierarchy of concepts in which each child concept is encompassed by its parent concept. Techniques have previously been developed to extract taxonomies from a traditional text corpus; this thesis instead relies exclusively on the links between documents in the corpus, not on the text of the documents themselves. A series of algorithms was designed and implemented to realize this objective. The resulting programs perform comparably to other techniques that use the text of the documents, showing that the link structure of Web pages carries information useful for building concept taxonomies.