Want your team maximizing Claude? I run 1:1 and team AI workshops for companies doing $1M+ per year: https://aibuilder.academy/yt/Ot2c5MKN_-w

Multimodal (Large) Language Models expand an LLM's text-only capabilities to include other modalities. Here are three ways to do this.

Resources:
Blog: https://medium.com/towards-data-science/multimodal-models-llms-that-can-see-and-hear-5c6737c981d3?sk=d0897db8457c91706170d3043ebdbcf0
LLM Playlist: https://youtu.be/eC6Hd1hFvos
GitHub Repo: https://github.com/ShawhinT/YouTube-Blog/tree/main/multimodal-ai

References:
[1] Multimodal Machine Learning: https://arxiv.org/abs/1705.09406
[2] A Survey on Multimodal Large Language Models: https://arxiv.org/abs/2306.13549
[3] Visual Instruction Tuning: https://arxiv.org/abs/2304.08485
[4] GPT-4o System Card: https://arxiv.org/abs/2410.21276
[5] Janus: https://arxiv.org/abs/2410.13848
[6] Learning Transferable Visual Models From Natural Language Supervision: https://arxiv.org/abs/2103.00020
[7] Flamingo: https://arxiv.org/abs/2204.14198
[8] Mini-Omni2: https://arxiv.org/abs/2410.11190
[9] Emu3: https://arxiv.org/abs/2409.18869
[10] Chameleon: https://arxiv.org/abs/2405.09818

Introduction - 0:00
Multimodal LLMs - 1:49
Path 1: LLM + Tools - 4:24
Path 2: LLM + Adapters - 7:20
Path 3: Unified Models - 11:19
Example: LLaMA 3.2 for Vision Tasks (Ollama) - 13:24
What's next? - 19:58
