GPT-4o is WAY More Powerful than Open AI is Telling us...

262,700
Published 2024-05-16
OpenAI just unveiled their new GPT-4o model, and it's more powerful than we ever imagined! In this video, we dive deep into what makes GPT-4o truly multimodal, capable of generating text, images, audio, and even video. Discover the groundbreaking features and hidden capabilities that OpenAI didn't fully reveal. From stunning image creation to lifelike audio generation, GPT-4o is set to revolutionize the AI landscape. Watch now to uncover the full potential of this game-changing model!

▼ Link(s) From Today’s Video:

GPT-4o Page: openai.com/index/hello-gpt-4o/

Min Choi's Awesome Thread: twitter.com/minchoi/status/1790416703404302463

OpenAI YT channel: @openai

Greg Brockman GPT-4o image gen: twitter.com/gdb/status/1790869434174746805/photo/1

Smoke away prediction: twitter.com/SmokeAwayyy/status/1791142705244127481

► MattVidPro Discord: discord.gg/bQgcbjs2Sg

► Follow Me on Twitter: twitter.com/MattVidPro

► Buy me a Coffee! buymeacoffee.com/mattvidpro
-------------------------------------------------

▼ Extra Links of Interest:

AI LINKS MASTER LIST: www.futurepedia.io/

General AI Playlist: General MattVidPro AI Playlist

AI I use to edit videos: www.descript.com/?lmref=nA4fDg

Instagram: instagram.com/mattvidpro

Tiktok: tiktok.com/@mattvidpro

Second Channel: @matt_pie

Let's work together!
- For brand & sponsorship inquiries: tally.so/r/3xdz4E
- For all other business inquiries: [email protected]

Thanks for watching Matt Video Productions! I make all sorts of videos here on YouTube! Technology, Tutorials, and Reviews! Enjoy your stay here, and subscribe!

All Suggestions, Thoughts And Comments Are Greatly Appreciated… Because I Actually Read Them.

Timestamps:
00:00 Introduction and Initial Reactions
00:36 Overview of GPT-4o and Multimodal AI
01:42 Comparison with GPT-4 Turbo
03:22 Text Generation Capabilities
07:22 Audio Generation Capabilities
12:22 Image Generation Capabilities
19:04 Advanced Features
23:27 Video Understanding Capabilities
27:34 Conclusion

All Comments (21)
  • @MattVidPro
    I think the image editing is one of THE most mind blowing pieces of this... What do you guys think?
  • 14:17 Matt, the multiple whiteboards/chalkboards at the top ARE realistic. This is actually how chalkboards in older classrooms used to work. They would have multiple chalkboards on sliders that you could pull up and down.
  • @reifuTD
    One of the things I think I would try with GPT-4o is to take a photo of a page from a manga or comic book or even a novel and ask it to read back the text in the voices of the characters as they speak.
  • @chrisbtr7657
    I don't know about everyone else, but most of the people I come in contact with have no clue about the rapid developments in AI. Kind of eerie...
  • @MikeWoot65
    Idk if i'm more impressed with the life-like sound of the voice, or how human it feels to interact with (ie. it understands our emotions)
  • @evil1knight
    Chalkboards often have multiple boards that slide on top of each other
  • @MattVidPro
    Timestamps for y'all:
    00:00 - Introduction and Initial Reactions: Introduction to the video. Reaction to OpenAI's real-time AI companion.
    00:36 - Overview of GPT-4o and Multimodal AI: Explanation of GPT-4o. What does "multimodal" mean?
    01:42 - Comparison with GPT-4 Turbo: Differences between GPT-4o and GPT-4 Turbo. Audio capabilities of GPT-4o.
    03:22 - Text Generation Capabilities: Speed and quality of GPT-4o's text generation. Examples of high-speed text generation.
    07:22 - Audio Generation Capabilities: Demonstration of GPT-4o's audio generation. Examples of emotive and natural voice outputs.
    12:22 - Image Generation Capabilities: Explanation of GPT-4o's image generation. Examples of high-quality image outputs.
    19:04 - Advanced Features: Image recognition and video understanding. Examples of practical applications and scenarios.
    23:27 - Video Understanding Capabilities: Discussion on GPT-4o's video capabilities. Potential future developments and limitations.
    27:34 - Conclusion: Final thoughts on GPT-4o's impact and potential. Invitation to viewers to subscribe and join the community.
  • @helge666
    GPT-4o is also A LOT more reliable when it comes to long-form text processing. Not even comparable to either GPT-4 or Gemini. It follows the prompt much better, doesn't get lazy so easily, and doesn't start to hallucinate so quickly. I tried for four hours to get GPT-4 and Gemini to do what I wanted, and they failed miserably. GPT-4o completed the whole damn task in 40 minutes without so much as a hiccup.
  • @fynnjackson2298
    Services like Audible should release AI that reads the books, but also allows you to talk about the topics, take quizzes, and more, making the entire book library an instant interactive homeschooling study resource for anyone wanting to level up in life. That contrasts with just 'consuming' audiobooks, as we do in today's passive, one-way relationship dynamic.
  • @kfrfansub
    The most mind-blowing thing is the speed. With that speed and variety of natural voices you can make a real RPG game with AI NPCs
  • @SpikyBlade
    Man, the image understanding of GPT-4o is crazy
  • @starblaiz1986
    15:53 Actually no, the image generation didn't screw up. If you look, that's actually EXACTLY what is written, including capitalisation (or lack thereof). What's even more impressive is that it actually split the word "sound's" across multiple lines and it did it completely correctly! Actually mind-blowing! 🤯🤯
  • @johannesdolch
    Honestly regarding images: What we really need IS multi-modality. The images produced by common models like SD are good enough. The problem is that it doesn't really understand what it is doing. If they can keep the quality of current models and just add a deep understanding to it, that multiplies the actual quality of the outcome by orders of magnitude in the sense that you get what you actually want AND can change specific things instead of getting images that so-so follow a prompt somewhat and then inpainting and hoping for the best.
  • @fabiankliebhan
    About the chalkboard. I think the dual chalkboards are not unrealistic. We had those a lot when I was studying. You could move them up and down to have more space.
  • @user-xj6ke4qk8t
    The ability to screen share your desktop and dictate is a game changer for context, as you can demonstrate what you want done rather than just trying to describe it.
  • @nathanbanks2354
    An odd thing about GPT-4o is that it's better at poetry than it used to be. It has a better idea of the meter of a limerick or a sonnet than it did before it had a multimodal understanding of what words sounded like. Words like "love" and "prove" don't rhyme any more. You can see this by asking GPT-4 Turbo and GPT-4o to produce poems using the existing text interface. It's also the first time I found a model that can reliably produce a Petrarchan/Italian sonnet instead of a Shakespearean/Elizabethan sonnet; previous models always used the much-more-common Elizabethan rhyming scheme.
  • @wannaBtraceur
    This is the first AI model that I feel the urge to use. The capabilities are incredible.
  • @alansmithee419
    14:10 Many university blackboards like this come in sets of three at different depths from the wall. You can slide them up and down to access the other boards. It allows the lecturer to keep writing on a new board while allowing students to still see previous steps in the lesson if they need to look back, and also means the professor doesn't have to waste time erasing the whole board every 5/10 mins.
  • @iamjohnbuckley
    Cracked me up at “I wouldn’t even be able to tell you this was a missile in the first place! This thing’s a professional!” 😂
  • @I-Dophler
    The image editing capabilities are truly mind-blowing. With music, video, and audio generation advancements on the horizon, the creative possibilities are endless. Many thanks.