Recent updates to ChatGPT made the chatbot far too agreeable, and OpenAI said Friday that it's taking steps to keep the problem from happening again.
In a blog post, the company detailed its testing and evaluation process for new models and outlined how the April 25 update to its GPT-4o model went wrong. In short, a bunch of changes that individually seemed helpful combined to create a tool that was overly sycophantic and potentially harmful.
How sycophantic did it get? In some testing this week, we asked about a tendency to be overly sentimental, and ChatGPT laid on the flattery: "Hey, listen: being sentimental isn't a weakness; it's one of your superpowers." And it was just getting started.
"This launch taught us a number of lessons. Even with what we thought were all the right ingredients in place (A/B tests, offline evals, expert reviews), we still missed this important issue," the company said.
OpenAI rolled back the update this week. To avoid introducing new problems, it took about 24 hours to revert the model for everybody.
The concern around sycophancy isn't just about whether users enjoy the experience. It posed a health and safety threat to users that OpenAI's existing safety checks missed. Any AI model can give questionable advice about topics like mental health, but one that is overly flattering can be dangerously deferential or convincing, whether about an investment being a sure thing or about how thin you should be.
"One of the biggest lessons is fully recognizing how people have started to use ChatGPT for deeply personal advice, something we didn't see as much even a year ago," OpenAI said. "At the time, this wasn't a primary focus, but as AI and society have co-evolved, it's become clear that we need to treat this use case with great care."
Maarten Sap, an assistant professor of computer science at Carnegie Mellon University, says sycophantic large language models can reinforce biases and harden beliefs, whether about users themselves or about others. "[The LLM] can embolden their opinions if these opinions are harmful, or if they want to take actions that are harmful to themselves or others," he said.
(Disclosure: Ziff Davis, CNET's parent company, filed a lawsuit against OpenAI, alleging that it infringed Ziff Davis copyrights in training and operating its AI systems.)
How OpenAI tests models, and what's changing
The company gave some insight into how it tests its models and their updates. This was the fifth major update to GPT-4o focused on personality and helpfulness. The changes involved new post-training work or fine-tuning of the existing models, including the rating and evaluation of various responses to prompts, making the model more likely to produce the responses that rated highly.
Prospective model updates are evaluated on their usefulness across a variety of situations, such as coding and math, along with specific tests by experts to gauge how the model behaves in practice. The company also runs safety evaluations to see how it responds to safety, health and other potentially dangerous queries. Finally, OpenAI runs A/B tests with a small number of users to see how the update performs in the real world.
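At its core, the A/B step described above amounts to comparing a user-feedback metric between two model variants. Here's a minimal sketch of that idea; all names and numbers are illustrative assumptions, not anything from OpenAI:

```python
# Toy sketch of an A/B comparison between two model variants, where each
# user session ends in a thumbs-up (True) or thumbs-down (False).
# The feedback data is simulated; nothing here reflects real OpenAI metrics.
from random import Random

def thumbs_up_rate(responses: list[bool]) -> float:
    """Fraction of sessions in which the user gave a thumbs-up."""
    return sum(responses) / len(responses)

rng = Random(0)
# Simulated feedback: variant B is assumed to be more flattering, so it
# collects thumbs-ups at a higher rate even if it isn't more truthful.
variant_a = [rng.random() < 0.70 for _ in range(1000)]
variant_b = [rng.random() < 0.76 for _ in range(1000)]

rate_a = thumbs_up_rate(variant_a)
rate_b = thumbs_up_rate(variant_b)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}")
```

The sketch also shows the metric's blind spot: a variant can "win" an A/B comparison simply by flattering users more, which is exactly the failure mode this update exposed.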
Is ChatGPT too sycophantic? You decide. (To be fair, we did ask for a pep talk about our tendency to be overly sentimental.)
The April 25 update performed well in these tests, but some expert testers flagged that the personality seemed a bit off. The tests didn't specifically look for sycophancy, and OpenAI decided to move forward despite the issues the testers raised. Take note, readers: AI companies are in a tail-on-fire hurry, which doesn't always square well with well-thought-out product development.
"Looking back, the qualitative assessments were hinting at something important and we should've paid closer attention," the company said.
Among its takeaways, OpenAI said it needs to treat model behavior issues the same as other safety problems, and halt a launch if there are concerns. For some model releases, the company said it would have an opt-in "alpha" phase to get more feedback from users before a broader launch.
Sap says evaluating an LLM based on whether users like its responses won't necessarily produce the most honest chatbot. In a recent study, Sap and others found a conflict between a chatbot's usefulness and its truthfulness. He compared it to situations where the truth isn't what people want to hear, like a car salesperson trying to sell a vehicle.
"The issue here is that they were trusting the users' thumbs-up/thumbs-down responses to the model's outputs, and that has some limitations because people are likely to upvote something that is more sycophantic than others," he said.
Sap said that quantitative feedback of this kind, such as the thumbs-up/thumbs-down response, has drawn criticism precisely because it can reinforce biases.
The speed at which companies push out updates also highlights a problem for existing users, Sap says, and it's not limited to one tech company. "The tech industry has really taken a 'release it and every user is a beta tester' approach to things," he said. A process with more testing before updates are pushed to every user can bring these problems to light before they become widespread.

