Date of Graduation

5-2017

Document Type

Thesis

Degree Name

Bachelor of Science

Degree Level

Undergraduate

Department

Computer Science and Computer Engineering

Advisor

Li, Wing

Reader

Patitz, Matthew

Third Reader

Beavers, Gordon

Abstract

Data file layout inference is the task of reconstructing the structure of a text file and determining its metadata. The text files dealt with in this research are personal information records that have a consistent structure. Traditionally, if the layout structure of a text file is unknown, a human user must identify the metadata manually, which is inefficient and prone to error. Content-based oracles, the current state-of-the-art automation technology, attempt to solve the layout inference problem by using databases of known metadata. This paper builds upon the existing documentation of content-based oracles and improves the oracles' databases through experimentation.
