How to build a Jarvis-like, super interactive AI that can listen, watch and talk back? We rebuilt the Gemini demo with GPT4V + Whisper + TTS; here is how it really performed…

Build AI-powered ad assets at scale with HubSpot campaign assistant for free: https://www.hubspot.com/campaign-assistant?utm_source=youtube&utm_medium=social&utm_campaign=CR00163Dec2023_AIJason%2Fpartner_youtube

🔗 Links
- Follow me on Twitter: https://twitter.com/jasonzhou1993
- Join my AI email list: https://crafters.ai/
- My Discord: https://discord.gg/eZXprSaCDE
- GitHub - Gemini demo with GPT4V: https://www.crafters.ai/aitools/rebuild-gemini-demo-with-gpt-4-vision

⏱️ Timestamps
0:00 Quick demo
1:41 Project plan & challenges
3:11 Open source Gemini demo & overview
9:37 Project setup
11:22 Setup video recorder
14:37 Setup silence aware audio recorder
16:36 Create img grid
19:44 Whisper
24:31 Connect to GPT4V
27:36 Streaming result & TTS
29:19 Demo

👋🏻 About Me
My name is Jason Zhou, a product designer who shares interesting AI experiments & products. Email me if you need help building AI apps! ask@ai-jason.com

#gpt4v #gemini #autogen #gpt4 #autogpt #ai #artificialintelligence #tutorial #stepbystep #openai #llm #chatgpt #largelanguagemodels #largelanguagemodel #bestaiagent #agentgpt #agent #babyagi
