OpenAI introduced two methods it’s bettering its synthetic intelligence (AI) fashions final week. The primary consists of releasing a brand new replace for the GPT-4o (also called the GPT-4 Turbo), the corporate’s newest AI mannequin powering ChatGPT for paid subscribers. The corporate says the replace improves the artistic writing skill of the mannequin and makes it higher at pure language responses and writing partaking content material with excessive readability. OpenAI additionally launched two analysis papers on purple teaming and shared a brand new technique to automate the method to scale recognizing errors made by its AI fashions.
OpenAI Updates GPT-4o AI Mannequin
In a publish on X (previously often called Twitter), the AI agency introduced a brand new replace for the GPT-4o basis mannequin. OpenAI says the replace permits the AI mannequin to generate outputs with “extra pure, partaking, and tailor-made writing to enhance relevance and readability.” It is usually stated to enhance the AI mannequin’s skill to course of uploaded information and supply deeper insights and “extra thorough” responses.
Notably, the GPT-4o AI mannequin is accessible to customers with the ChatGPT Plus subscription and builders with entry to the massive language mannequin (LLM) by way of API. These utilizing the free tier of the chatbot would not have entry to the mannequin.
Whereas Devices 360 workers members weren’t in a position to check out the brand new capabilities, one consumer on X posted in regards to the newest enhancements within the AI mannequin after the replace. The consumer claimed that GPT-4o might generate an Eminem-style rap cipher with “refined inside rhyming constructions”.
OpenAI Shares New Analysis Papers on Pink Teaming
Pink teaming is the method utilized by builders and corporations to make use of exterior entities to check software program and methods for vulnerabilities, potential dangers, and questions of safety. Most AI corporations collaborate with organisations, immediate engineers, and moral hackers to stress-test whether or not it responds with dangerous, inaccurate, or deceptive output. Assessments are additionally made to examine whether or not an AI system will be jailbroken.
Ever since ChatGPT was made public, OpenAI has been public with its purple teaming efforts for every successive LLM launch. In a weblog publish final week, the corporate shared two new analysis papers on the development of the method. Certainly one of them is of explicit curiosity given the corporate claims it may possibly automate large-scale purple teaming processes for AI fashions.
Revealed within the OpenAI area, the paper claims that extra succesful AI fashions can be utilized to automate purple teaming. The corporate believes AI fashions can help in brainstorming attacker objectives, how an attacker’s success will be judged, and understanding the variety of assaults.
Increasing on it, the researchers claimed that the GPT-4T mannequin can be utilized to brainstorm an inventory of concepts that represent dangerous behaviour for an AI mannequin. Some examples embody prompts comparable to “tips on how to steal a automobile” and “tips on how to construct a bomb”. As soon as the concepts have been generated, a separate purple teaming AI mannequin will be constructed to trick ChatGPT utilizing an in depth collection of prompts.
At present, the corporate has not begun utilizing this technique for purple teaming given a number of limitations. These embody the evolving dangers of AI fashions, exposing the AI to lesser-known strategies for jailbreaking or producing dangerous content material, and the necessity for a better threshold for data in people to appropriately decide the potential dangers of output as soon as the AI mannequin turns into extra succesful.