Date of Graduation
8-2008
Document Type
Thesis
Degree Name
Bachelor of Science in Computer Engineering
Degree Level
Undergraduate
Department
Computer Science and Computer Engineering
Advisor/Mentor
Not available
Abstract
The goal of this project is a service-based solution that uses parallel and distributed processing algorithms to solve the transitive closure problem for a large dataset. A dataset may be viewed conceptually as a table in a database, and physically as a file containing a sequence of records, each made up of fields. Two records are said to be transitively related if and only if they are directly related because they share one or more specific fields, or a sequence can be formed from one record to the other in which every intermediate record is directly related to the records immediately before and after it. The transitive closure problem is to cluster the records in a dataset into groups such that all transitively related records are in one group. One approach to this problem is to divide the task into two subproblems. The first is to process the dataset and generate a set of pairs. Each pair contains two record identifiers, and a pair exists if and only if the two records are directly related. The second is to use the record pairs to cluster the records into transitive closures. The current software solution solves this second subproblem by reading record pairs produced by a different software solution and writing the completed results of the transitive closure problem to a file. This thesis studies how to enhance the current software solution so that it becomes a "service". The study includes designing, implementing, testing, and evaluating the enhanced solution. The service model identifies an aspect that would potentially benefit from restructuring or additional functionality. A current issue is that the transitive closures cannot be fetched from within the solution once a job completes, which limits its direct use with other processes or applications.
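The second subproblem described in the abstract, clustering records from a list of directly related pairs, can be illustrated with a minimal sequential sketch. It uses a union-find (disjoint-set) structure, which is a standard technique chosen here for illustration and is not taken from the thesis; the thesis itself concerns parallel and distributed algorithms. The function name `cluster_pairs` and the use of integer record identifiers are assumptions.

```python
from collections import defaultdict


def cluster_pairs(pairs):
    """Group record identifiers into transitive closures.

    `pairs` is an iterable of (id_a, id_b) tuples, each meaning the two
    records are directly related. Hypothetical helper, not the thesis code.
    """
    parent = {}  # maps each record id to its parent in the union-find forest

    def find(x):
        # Register unseen ids as their own root.
        parent.setdefault(x, x)
        # Walk up to the root of x's tree.
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    # Union step: merge the groups of each directly related pair.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect records by their final root; each set is one closure.
    groups = defaultdict(set)
    for record in parent:
        groups[find(record)].add(record)
    return list(groups.values())
```

For example, the pairs `(1, 2)`, `(2, 3)`, and `(4, 5)` place records 1, 2, and 3 in one group and records 4 and 5 in another. Records that appear in no pair are not part of the input, so this sketch does not report them as single-record groups.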
Citation
Baran, J. (2008). Service oriented transitive closure solution. Computer Science and Computer Engineering Undergraduate Honors Theses. Retrieved from https://scholarworks.uark.edu/csceuht/21