Term of Award

Spring 2025

Degree Name

Master of Science, Information Technology

Document Type and Release Option

Thesis (open access)

Copyright Statement / License for Reuse

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.

Department

Department of Information Technology

Committee Chair

Hayden Wimmer

Committee Member 1

Jongyeop Kim

Committee Member 2

Atef Mohamed

Abstract

In an era of rapid technological advancement, efficient content generation, application development, and data management are crucial for meeting the demands of dynamic digital environments. This thesis uses state-of-the-art models to explore three core areas: AI-driven video content creation, text-to-image-to-text consistency, and automatic text summarization. The first study investigates the potential of AI-powered text-to-video generation to democratize video production and enhance storytelling. It compares three models (ModelScope, Text2Video (Zero), and Motion Consistency), assessing the quality of the generated videos with CLIP scores and evaluating statistical significance through t-tests and homogeneity tests. Results indicate that ModelScope outperformed the others, though the differences were not statistically significant. These findings underscore AI's transformative role in content creation, making high-quality video production more accessible. The second study evaluates the semantic consistency of a text-to-image-to-text pipeline using four models: DALL·E, Imagen, Grok, and Stable Diffusion. Text prompts were used to generate images, which were then converted back to text using image captioning models. BERTScore, METEOR, ROUGE, and BLEU were employed to assess the similarity between the original prompts and the reconstructed text. Pearson correlation analysis and paired t-tests indicated no statistically significant differences among the models (p > 0.05), although Stable Diffusion exhibited slightly higher scores. The results highlight the strengths and limitations of current multi-modal models in maintaining semantic fidelity across complex prompts. The third study focuses on automatic text summarization (ATS), evaluating four leading transformer models (Pegasus, BART, T5, and FLAN-T5) on Amazon review datasets. The models were fine-tuned and assessed using ROUGE metrics to measure contextual fluency and coherence. Statistical analyses, including paired t-tests, revealed Pegasus as the top-performing model, excelling in fluency and structural coherence. These findings provide valuable insights into the effectiveness of transformer models for summarization tasks. Together, the three studies offer a comprehensive understanding of AI applications across diverse domains. They provide developers, researchers, and organizations with the knowledge to make informed decisions about integrating AI technologies to enhance content generation, optimize data processing, and improve overall system performance.
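
The paired t-tests mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each model is scored on the same set of items (for example, one metric value per prompt or per review), so the scores can be compared pairwise. The function name `paired_t_statistic` and the score lists are hypothetical and are not taken from the thesis; the sketch computes only the t statistic and degrees of freedom, and a p-value would still need to be read from a t distribution.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test.

    scores_a and scores_b hold one score per item, in the same item order.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two pairs")

    # Work on the per-item differences between the two models.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    # t = mean difference divided by its standard error.
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical per-item scores, for illustration only (not thesis data).
    model_a = [0.31, 0.28, 0.35, 0.30, 0.33]
    model_b = [0.29, 0.27, 0.36, 0.28, 0.30]
    t_stat, dof = paired_t_statistic(model_a, model_b)
    print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

A small absolute t value relative to the critical value for the given degrees of freedom corresponds to the "not statistically significant" outcomes reported above.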

Research Data and Supplementary Material

Yes
