Date of Graduation


Document Type


Degree Name

Master of Science in Computer Engineering (MSCmpE)

Degree Level



Computer Science & Computer Engineering


Craig Thompson

Committee Member

Gordon Beavers

Second Committee Member

Brajendra Panda


Knowledge Representation, Ontologies, Search


This thesis describes a method for quickly populating very large product ontologies. We discuss a deep search architecture that text-mines online e-commerce marketplaces and builds a taxonomy of products together with their descriptions and parent categories. The goal is to automatically construct an open database of products aggregated from different online retailers. The database contains extensive metadata on each product, which can be queried and analyzed. No such public database currently exists; instead, the information is siloed within individual organizations. We describe the tools, data structures, and software architectures that allowed us to aggregate, structure, store, and search several gigabytes of product ontologies and their associated metadata. We also describe solutions to some of the computational puzzles that arise when mining data at large scale. We implemented the product capture architecture and used it to build product ontologies for two major retailers: Wal-Mart and Target. We analyze the ontology data to explore its structural complexity and the similarities and differences between the retailers. A broad product ontology has several uses, from the comparison shopping applications that already exist to the situation-aware computing of tomorrow, in which computers are aware of the objects in their surroundings and these objects interact to help humans with everyday tasks.