Date of Graduation

12-2024

Document Type

Thesis

Degree Name

Bachelor of Science in Computer Science

Degree Level

Undergraduate

Department

Computer Science and Computer Engineering

Advisor/Mentor

Luu, Khoa

Committee Member

Gauch, John

Second Committee Member

Gauch, Susan

Abstract

Video Question Answering (VideoQA) focuses on developing models capable of engaging in natural language conversations about video content. Current state-of-the-art models typically analyze videos frame by frame, a process that is both computationally expensive and memory-intensive. Integrating the Atkinson-Shiffrin memory model with Video Language Models has demonstrated potential for enhancing video understanding capabilities. Reducing the number of frames processed by the model is a crucial operation in this approach, and it is achieved by a memory consolidation algorithm. This algorithm condenses a video sequence into a small set of representative frames that capture the essence of the video content. However, due to the complexity of events in videos, selecting keyframes efficiently and effectively remains a challenge. This work aims to address this challenge by comparing video understanding capabilities across different memory consolidation algorithms. Specifically, we present experiments evaluating simple but effective memory consolidation algorithms on the ActivityNet-QA dataset. Through this analysis, we aim to construct an optimal memory consolidation algorithm to improve model performance in VideoQA tasks.
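
To make the idea of memory consolidation concrete, the following is a minimal sketch of one simple strategy: greedily merging the most similar pair of adjacent frames until a fixed frame budget is reached. It is an illustration only and is not taken from the thesis; the function names `cosine` and `consolidate`, the `budget` parameter, the use of cosine similarity, and the weighted-mean merge are all assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consolidate(frames, budget):
    """Greedily merge the most similar adjacent pair until `budget` frames remain.

    `frames` is a list of per-frame feature vectors (lists of floats).
    Each merged frame is the weighted mean of the original frames it absorbed.
    This is a hypothetical helper, not the algorithm evaluated in the thesis.
    """
    if budget < 1:
        raise ValueError("budget must be at least 1")
    memory = [list(f) for f in frames]
    counts = [1] * len(memory)  # number of original frames each slot represents
    while len(memory) > budget:
        # Similarity of every adjacent pair; merge the most redundant one.
        sims = [cosine(memory[i], memory[i + 1]) for i in range(len(memory) - 1)]
        i = max(range(len(sims)), key=sims.__getitem__)
        total = counts[i] + counts[i + 1]
        memory[i] = [
            (x * counts[i] + y * counts[i + 1]) / total
            for x, y in zip(memory[i], memory[i + 1])
        ]
        counts[i] = total
        del memory[i + 1], counts[i + 1]
    return memory
```

Under these assumptions, a sequence of four frame features in which the first two and the last two are identical would be reduced to two representative frames by `consolidate(frames, 2)`. A real system would operate on visual encoder tokens rather than toy vectors and might use a different similarity measure or merge rule.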

Keywords

Video Understanding; Multimodal Large Language Models; Video Question Answering; Atkinson-Shiffrin Memory Model

Citation

Couts, M. (2024). Reducing Token Redundancy in Video-Language Models via Memory Consolidation Algorithm. Electrical Engineering and Computer Science Undergraduate Honors Theses. Retrieved from https://scholarworks.uark.edu/elcsuht/17

Included in

Artificial Intelligence and Robotics Commons

Electrical Engineering and Computer Science Undergraduate Honors Theses

Reducing Token Redundancy in Video-Language Models via Memory Consolidation Algorithm
