GPT-4o is WAY More Powerful than OpenAI is Telling Us...
262,700
Published 2024-05-16
▼ Link(s) From Today’s Video:
GPT-4o Page: openai.com/index/hello-gpt-4o/
Min Choi's Awesome Thread: twitter.com/minchoi/status/1790416703404302463
OpenAI YT channel: / @openai
Greg Brockman GPT-4o image gen: twitter.com/gdb/status/1790869434174746805/photo/1
SmokeAwayyy prediction: twitter.com/SmokeAwayyy/status/1791142705244127481
► MattVidPro Discord: discord.gg/bQgcbjs2Sg
► Follow Me on Twitter: twitter.com/MattVidPro
► Buy me a Coffee! buymeacoffee.com/mattvidpro
-------------------------------------------------
▼ Extra Links of Interest:
AI LINKS MASTER LIST: www.futurepedia.io/
General AI Playlist: • General MattVidPro AI Playlist
AI I use to edit videos: www.descript.com/?lmref=nA4fDg
Instagram: instagram.com/mattvidpro
Tiktok: tiktok.com/@mattvidpro
Second Channel: / @matt_pie
Let's work together!
- For brand & sponsorship inquiries: tally.so/r/3xdz4E
- For all other business inquiries: [email protected]
Thanks for watching Matt Video Productions! I make all sorts of videos here on YouTube! Technology, Tutorials, and Reviews! Enjoy your stay here, and subscribe!
All Suggestions, Thoughts And Comments Are Greatly Appreciated… Because I Actually Read Them.
Timestamps:
00:00 Introduction and Initial Reactions
00:36 Overview of GPT-4o and Multimodal AI
01:42 Comparison with GPT-4 Turbo
03:22 Text Generation Capabilities
07:22 Audio Generation Capabilities
12:22 Image Generation Capabilities
19:04 Advanced Features
23:27 Video Understanding Capabilities
27:34 Conclusion
All Comments (21)
-
I think the image editing is one of THE most mind blowing pieces of this... What do you guys think?
-
14:17 Matt, the multiple whiteboards/chalkboards at the top ARE realistic. This is actually how chalkboards in older classrooms used to work. They would have multiple chalkboards on sliders that you could pull up and down.
-
One of the things I think I would have to try with GPT-4o is to take a photo of a page from a manga or comic book or even a novel and ask it to read back the text in the voices of the characters as they speak.
-
I don't know about everyone else, but most of the people I come in contact with have no clue about the rapid developments in AI. Kind of eerie...
-
Idk if I'm more impressed with the life-like sound of the voice, or how human it feels to interact with (i.e. it understands our emotions)
-
Chalkboards often have multiple boards that slide on top of each other
-
Timestamps for y'all:
00:00 - Introduction and Initial Reactions: Introduction to the video. Reaction to OpenAI's real-time AI companion.
00:36 - Overview of GPT-4o and Multimodal AI: Explanation of GPT-4o. What does "multimodal" mean?
01:42 - Comparison with GPT-4 Turbo: Differences between GPT-4o and GPT-4 Turbo. Audio capabilities of GPT-4o.
03:22 - Text Generation Capabilities: Speed and quality of GPT-4o's text generation. Examples of high-speed text generation.
07:22 - Audio Generation Capabilities: Demonstration of GPT-4o's audio generation. Examples of emotive and natural voice outputs.
12:22 - Image Generation Capabilities: Explanation of GPT-4o's image generation. Examples of high-quality image outputs.
19:04 - Advanced Features: Image recognition and video understanding. Examples of practical applications and scenarios.
23:27 - Video Understanding Capabilities: Discussion on GPT-4o's video capabilities. Potential future developments and limitations.
27:34 - Conclusion: Final thoughts on GPT-4o's impact and potential. Invitation to viewers to subscribe and join the community.
-
GPT-4o is also A LOT more reliable when it comes to long-form text processing. Not even comparable to either GPT-4 or Gemini. It follows the prompt much better, doesn't get lazy so easily, and doesn't start to hallucinate so quickly. I tried for four hours to get GPT-4 and Gemini to do what I wanted, and they failed miserably. GPT-4o completed the whole damn task in 40 minutes without so much as a hiccup.
-
Services like Audible should release AI that reads the books, but also allows you to talk about the topics, do quiz tests, and more, making the entire book library an instant interactive homeschooling study resource for anyone wanting to level up in life. In contrast to just 'consuming' audiobooks as we do in today's passive, one-way relationship dynamic.
-
The most mind-blowing thing is the speed. With that speed and variety of natural voices you can make a real RPG with AI NPCs
-
Man, the image understanding of GPT-4o is crazy
-
15:53 Actually no, the image generation didn't screw up. If you look, that's actually EXACTLY what is written, including capitalisation (or lack thereof). What's even more impressive is that it actually split the word "sound's" across multiple lines and it did it completely correctly! Actually mind-blowing! 🤯🤯
-
Honestly, regarding images: What we really need IS multi-modality. The images produced by common models like SD are good enough. The problem is that the model doesn't really understand what it is doing. If they can keep the quality of current models and just add a deep understanding to it, that multiplies the actual quality of the outcome by orders of magnitude, in the sense that you get what you actually want AND can change specific things, instead of getting images that only somewhat follow a prompt and then inpainting and hoping for the best.
-
About the chalkboard. I think the dual chalkboards are not unrealistic. We had those a lot when I was studying. You could move them up and down to have more space.
-
The ability to read or screen-share your desktop and dictate is a game changer for context, as you can demonstrate what you want done rather than just trying to describe it.
-
An odd thing about GPT-4o is that it's better at poetry than it used to be. It has a better idea of the meter of a limerick or a sonnet than it did before it had a multimodal understanding of what words sounded like. Words like "love" and "prove" don't rhyme any more. You can see this by asking GPT-4 Turbo and GPT-4o to produce poems using the existing text interface. It's also the first time I found a model that can reliably produce a Petrarchan/Italian sonnet instead of a Shakespearean/Elizabethan sonnet; previous models always used the much-more-common Elizabethan rhyme scheme.
-
This is the first AI model that I feel the urge to use. The capabilities are incredible.
-
14:10 Many university blackboards like this come in sets of three at different depths above the wall. You can slide them up and down to access the other boards. It allows the lecturer to keep writing on a new board while allowing students to still see previous steps in the lesson if they need to look back, and also means the professor doesn't have to waste time erasing the whole board every 5/10 mins.
-
Cracked me up at “I wouldn’t even be able to tell you this was a missile in the first place! This thing’s a professional!” 😂
-
The image editing capabilities are truly mind-blowing. With music, video, and audio generation advancements on the horizon, the creative possibilities are endless. Many thanks.