Date of Graduation

12-2014

Document Type

Thesis

Degree Name

Master of Science in Computer Engineering (MSCmpE)

Degree Level

Graduate

Department

Computer Science & Computer Engineering

Advisor/Mentor

Craig Thompson

Committee Member

Gordon Beavers

Second Committee Member

Brajendra Panda

Keywords

Knowledge Representation, Ontologies, Search

Abstract

This thesis describes a method to populate very large product ontologies quickly. We discuss a deep search architecture that text-mines online e-commerce marketplaces and builds a taxonomy of products with their corresponding descriptions and parent categories. The goal is to automatically construct an open database of products aggregated from different online retailers. The database contains extensive metadata on each object, which can be queried and analyzed. No such public database currently exists; instead, the information is siloed within various organizations. In this thesis, we describe the tools, data structures, and software architectures that allowed us to aggregate, structure, store, and search several gigabytes of product ontologies and their associated metadata. We also describe solutions to some computational puzzles that arise when mining data at large scale. We implemented the product capture architecture and, using this implementation, built product ontologies corresponding to two major retailers: Wal-Mart and Target. We analyze the ontology data to explore structural complexity and the similarities and differences between the retailers. A broad product ontology has several uses, from the comparison shopping applications that already exist to the situation-aware computing of tomorrow, in which computers are aware of the objects in their surroundings and these objects interact to help humans with everyday tasks.
