OpenAI has been showing some of its customers a new multimodal AI model that can talk to you and recognize objects, according to a new report from The Information. Citing anonymous sources who have seen it, the outlet says this could be part of what the company plans to show on Monday.
The new model reportedly offers faster, more accurate interpretation of images and audio than its existing standalone transcription and text-to-speech models. It could apparently help customer service agents “better understand the intonation of callers' voices or whether they're being sarcastic,” and “theoretically” the model can help students with math or translate real-world signs, writes The Information.
The outlet's sources say the model may outperform GPT-4 Turbo in “answering some types of questions,” but it's still susceptible to confidently getting things wrong.
OpenAI may also be preparing a new built-in ChatGPT capability for making phone calls, according to developer Ananay Arora, who posted the screenshot above of call-related code. Arora also spotted evidence that OpenAI had provisioned servers intended for real-time audio and video communication.
None of this would be GPT-5, if it's introduced next week. CEO Sam Altman has explicitly denied that the upcoming announcement has anything to do with the model that's supposed to be “materially better” than GPT-4. The Information writes that GPT-5 may be released before the end of the year.