Fresh Insights: OpenAI’s Spring Update

Spring Fling

OpenAI’s Chief Technology Officer, Mira Murati, led the company’s Spring Update announcements yesterday with news of several advancements in OpenAI’s technology. The two key announcements were the release of GPT-4o (that’s GPT-4 with the letter “o” for “omni”) and a desktop version of ChatGPT.


GPT-4o: Text, Audio, and Vision

GPT-4o stands out for its ability to interact with users through voice and video. This shift from text-based interactions to a more natural conversational flow opens doors for new applications. Another significant aspect of the experience is “memory.” ChatGPT can remember past interactions, allowing for a more coherent and personalized experience. Imagine having a conversation with a virtual assistant that remembers your preferences and tailors its responses accordingly.

Here are some of the key capabilities of GPT-4o:

  • Voice and Video Chat: Interact with GPT-4o through voice or video for a more natural and engaging experience.
  • Conversational Memory: GPT-4o remembers past interactions, allowing for more coherent and personalized conversations.
  • Live Translation: GPT-4o can translate between languages in real time, breaking down communication barriers and fostering global interactions.
  • Information Retrieval: Access and process information in real time, making GPT-4o a powerful tool for research and problem-solving.

During the demos, much attention was given to the quality of GPT-4o’s simulated voice. The interactions were as fast as a normal human conversation, and the generated voice displayed a range of emotions.


ChatGPT Desktop Version: Bringing the LLM Home

The desktop version of ChatGPT brings the power of this AI model to your fingertips. The new user interface allows for easier access to GPT-4o’s functionalities and opens doors for a wider range of applications.

A key feature of the desktop version is the ability to share your live camera view. One interesting capability demonstrated was writing the equation “3x + 1 = 4” on a piece of paper and asking the chatbot to read the equation and help solve for x.
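For reference, the algebra in that demo takes two steps: subtract 1 from both sides, then divide by 3. A minimal sketch in Python, assuming a hypothetical helper named solve_linear (it is not part of ChatGPT or of the OpenAI demo), shows the same steps:

```python
from fractions import Fraction


def solve_linear(a, b, c):
    """Solve a*x + b = c for x (hypothetical helper, for illustration only)."""
    if a == 0:
        raise ValueError("no unique solution when a is 0")
    # Subtract b from both sides, then divide both sides by a.
    return Fraction(c - b, a)


# The demo equation: 3x + 1 = 4  ->  3x = 3  ->  x = 1
x = solve_linear(3, 1, 4)
print(x)
```

Using Fraction keeps the answer exact when the result is not a whole number, for example 2x = 1.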

The desktop version also boasts several other functionalities:

  • Multilingual Support: ChatGPT can now handle conversations in 50 different languages, making it a valuable tool for communication across borders.
  • Improved Speed and Efficiency: The new model is faster and cheaper to run, making it more accessible to a wider range of users.


Implications of GPT-4o

OpenAI’s new release provided some significant advances but was far from “mind-blowing.” On the positive side, the naturalness of the generated human voice was stunning. For example, the conversation continued smoothly when one of the OpenAI employees interrupted the generated voice mid-sentence. During the demo, the speed of response was also excellent, mirroring natural human speech. The vision capabilities could be impressive, although reading “3x + 1 = 4” written in black magic marker on a white sheet of paper is basic. It would have been more impressive to show GPT-4o a handwritten diagram and have it generate the VBA code required to turn that diagram into a PowerPoint slide.

The desktop version of ChatGPT was only briefly described. It represents forward progress in AI technology. Frequent users might like the convenience of not working through a browser. Performance could be enhanced, depending on the desktop configuration, and integration with local system resources could improve. Security could be more overtly managed, and push notifications could be enabled.