ANLP 108 – Multimodal NLP (Text + Vision + Speech) – NVIT | WIOA Approved AI, Data & Cloud Career College

(Weeks 15–17 | Lec 9 Hrs / Lab 18 Hrs / Ext 0 Hrs | 27 Total Hrs | 1.2 Credit Hours)
Students will:

Combine text, vision, and audio modalities in AI systems.
Work with CLIP models and multimodal datasets.
Build applications combining image, text, and audio inputs.
Prerequisite: ANLP 107 – Speech Processing & Real-Time NLP
Tools: OpenAI CLIP, Hugging Face Multimodal