Date of Graduation

5-2025

Document Type

Thesis

Degree Name

Bachelor of Science in Computer Science

Degree Level

Undergraduate

Department

Computer Science and Computer Engineering

Advisor/Mentor

Luu, Khoa

Committee Member

Gauch, John

Second Committee Member

Dobbs, Page Daniel

Abstract

The study of tobacco imagery and marketing is a complex challenge that involves extremely large datasets. It also demands a detailed analysis of the social context and the specific types of tobacco being marketed. Despite major recent advances in computer vision and foundation model technology, this remains a substantial challenge. With the DEFEND model, we aim to address these obstacles by integrating multimodal learning, hierarchical understanding, and feature extraction into a foundation model designed for the unique challenges of tobacco image analysis. One of the core elements of DEFEND is the Tobacco 1M dataset, a 1M-scale dataset that is substantially larger than those used by previous models. It sorts tobacco products into specific categories, with labels for tobacco types, brands, and lines, giving the model greater nuance and accuracy in its analysis. DEFEND outperforms many leading computer vision models on a variety of tobacco analysis benchmarks, and demonstrates the potential of these techniques for public health analysis and for working with large datasets.

Keywords

Computer Vision; Foundation Models; Self-Supervised Learning; Tobacco Content Analysis; Feature Enhancement Modules; Student-Teacher Networks
