ANLP 108 – Multimodal NLP (Text + Vision + Speech)

John Enoh · November 29, 2025

(Weeks 15–17 | Lec 9 Hrs / Lab 18 Hrs / Ext 0 Hrs | 27 Total Hrs | 1.2 Credit Hours)
Students will:

  • Combine NLP, vision, and audio modalities in AI systems.
  • Work with CLIP models and multimodal datasets.
  • Build applications combining image, text, and audio inputs.

Prerequisite: ANLP 107 – Speech Processing & Real-Time NLP
Tools: OpenAI CLIP, Hugging Face Multimodal

Course Includes

  • 10 Lessons
