Date of Graduation
5-2025
Document Type
Thesis
Degree Name
Bachelor of Science in Computer Science
Degree Level
Undergraduate
Department
Computer Science and Computer Engineering
Advisor/Mentor
Luu, Khoa
Committee Member
Gauch, John
Second Committee Member
Dobbs, Page Daniel
Abstract
The study of tobacco imagery and marketing is a complex challenge that involves extremely large datasets. It also demands a detailed analysis of the social context and the specific types of tobacco being marketed. Despite major recent advances in computer vision and foundation models, this remains a substantial challenge. With the DEFEND model, we aim to address these obstacles by integrating multimodal learning, hierarchical understanding, and feature extraction into a foundation model designed for the unique challenges of tobacco image analysis. One of the core elements of DEFEND is the Tobacco 1M dataset, a 1M dataset that is substantially larger than those used for previous models. It organizes tobacco products into specific categories, with labels for tobacco types, brands, and lines, allowing the model to achieve greater nuance and accuracy in its analysis. DEFEND outperforms many leading computer vision models on a variety of tobacco analysis benchmarks and demonstrates the potential of these techniques for public health analysis and for working with large datasets.
Keywords
Computer Vision; Foundation Models; Self-Supervised Learning; Tobacco Content Analysis; Feature Enhancement Modules; Student-Teacher Networks
Citation
Shepard, M. J. (2025). DEFEND: A 1M Dataset Foundation Model for Tobacco Analysis. Electrical Engineering and Computer Science Undergraduate Honors Theses. Retrieved from https://scholarworks.uark.edu/elcsuht/16