Date of Graduation

5-2020

Document Type

Thesis

Degree Name

Bachelor of Science

Degree Level

Undergraduate

Department

Computer Science and Computer Engineering

Advisor/Mentor

Gauch, John

Committee Member/Reader

Gauch, Susan

Committee Member/Second Reader

Luu, Khoa

Abstract

Recognizing human activity, and air-writing in particular, is an interesting challenge because the same approach could in principle identify letters from many languages. I investigate the problem of air-writing with the added twist of including the following letters from the Spanish alphabet: Á, É, Í, Ó, Ú, Ü, and Ñ. With this extended alphabet, I set out to see which classifiers work best and on which kinds of data, since letters can be represented in multiple ways.

My tracking system consists of a regular camera and a subject who draws with a brightly colored marker (green in my experiments). The tracker follows the marker by converting each frame to the hue, saturation, and intensity (HSI) color space, thresholding the HSI image on a certain hue range, identifying the edges in the resulting threshold (mask) image, and computing the minimum enclosing circle of that set of edges. With this, the subject can draw letters, pressing a key to draw one letter at a time. I used the Python programming language, together with the OpenCV library, to implement my design.

The classifiers I employed are dynamic time warping, k-nearest neighbors, nearest centroid, and support vector machine. Dynamic time warping classifies letters based on their time series representations. k-nearest neighbors and nearest centroid classify letters based on the means of each x and y component time series, while the support vector machine classifies letters based on their 28x28 image representations. My total dataset size was 3,630 samples, of which 2,640 were used for training and 990 for testing. After testing, dynamic time warping achieved 58.69% accuracy, k-nearest neighbors had 48.79% accuracy, nearest centroid had 47.98% accuracy, and the support vector machine had 97.17% accuracy. When only the English letters were considered, the accuracies improved by about 2%. Although I believe more data and analysis are needed for a firmer conclusion, the image representation seems like a good feature to consider when classifying a large set of letters and potentially other kinds of characters.
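To make the time-series approach concrete, here is a minimal sketch of dynamic time warping on (x, y) point sequences. The abstract does not say how the warping distance is turned into a label, so the nearest-template rule in `classify`, the Euclidean point cost, and both function names are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two points being aligned.
            d = math.hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # Extend the best of the three allowed predecessor alignments.
            cost[i][j] = d + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def classify(stroke, templates):
    """Assumed rule: return the label of the template closest under DTW.

    templates is a list of (label, sequence) pairs.
    """
    return min(templates, key=lambda t: dtw_distance(stroke, t[1]))[0]
```

Because the warping path may repeat points, a letter drawn slowly and the same letter drawn quickly can still align with zero cost, which is why this distance suits strokes of varying length.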

Keywords

Spanish, Classification, Air-Written Letters
