On Monday, OpenAI launched a new family of models called GPT-4.1. Yes, "4.1," as if the company's naming conventions weren't confusing enough already.
There's GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, all of which OpenAI says "excel" at coding and instruction following. Available through OpenAI's API but not ChatGPT, the multimodal models have a 1-million-token context window, meaning they can take in roughly 750,000 words in one go (longer than "War and Peace").
GPT-4.1 arrives as OpenAI rivals like Google and Anthropic ratchet up efforts to build state-of-the-art programming models. Google's recently released Gemini 2.5 Pro, which also has a 1-million-token context window, ranks highly on popular coding benchmarks. So do Anthropic's Claude 3.7 Sonnet and Chinese AI startup DeepSeek's upgraded V3.
It's the goal of many tech giants, including OpenAI, to train AI coding models capable of performing complex software engineering tasks. OpenAI's grand ambition is to create an "agentic software engineer," as CFO Sarah Friar put it during a tech summit in London last month. The company asserts its future models will be able to program entire apps end-to-end, handling aspects such as quality assurance, bug testing, and documentation writing.
GPT -4.1 is a step in this direction.
"We've optimized GPT-4.1 for real-world use based on direct feedback to improve in areas that developers care most about: frontend coding, making fewer extraneous edits, following formats reliably, adhering to response structure, and more," an OpenAI spokesperson told TechCrunch via email. "These improvements enable developers to build agents that are considerably better at real-world software engineering tasks."
OpenAI claims the full GPT-4.1 model outperforms its GPT-4o and GPT-4o mini models on coding benchmarks, including SWE-bench. GPT-4.1 mini and nano are said to be more efficient and faster at the cost of some accuracy, with OpenAI saying GPT-4.1 nano is its speediest and cheapest model ever.
GPT-4.1 costs $2 per million input tokens and $8 per million output tokens. GPT-4.1 mini is $0.40/M input tokens and $1.60/M output tokens, and GPT-4.1 nano is $0.10/M input tokens and $0.40/M output tokens.
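To make the price spread concrete, the cost of a given workload follows directly from these per-million-token rates. A minimal sketch (the model labels below are illustrative keys for this calculation, not verified API identifiers):

```python
# Estimate per-request cost from the per-million-token prices quoted above.
# Rates are in USD per 1M tokens.
PRICES = {
    "gpt-4.1":      {"input": 2.00, "output": 8.00},
    "gpt-4.1-mini": {"input": 0.40, "output": 1.60},
    "gpt-4.1-nano": {"input": 0.10, "output": 0.40},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    rates = PRICES[model]
    return (input_tokens * rates["input"]
            + output_tokens * rates["output"]) / 1_000_000

# Example: a 10,000-token prompt with a 2,000-token completion.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 10_000, 2_000):.4f}")
```

For that sample request, the full model works out to $0.0360, mini to $0.0072, and nano to $0.0018 — a 20x difference between the largest and smallest tiers.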
According to OpenAI's internal testing, GPT-4.1, which can generate more tokens at once than GPT-4o (32,768 versus 16,384), scored between 52% and 54.6% on SWE-bench Verified, a human-validated subset of SWE-bench. (OpenAI noted in a blog post that some solutions to SWE-bench Verified problems couldn't run on its infrastructure, hence the range of scores.) Those figures are somewhat under the scores Google and Anthropic report for Gemini 2.5 Pro (63.8%) and Claude 3.7 Sonnet (62.3%), respectively, on the same benchmark.
In a separate evaluation, OpenAI probed GPT-4.1 using Video-MME, which is designed to measure a model's ability to "understand" content in videos. GPT-4.1 reached 72% accuracy on the "long, no subtitles" video category, OpenAI claims.
While GPT-4.1 scores well on benchmarks and has a more recent knowledge cutoff (up to June 2024), giving it a better frame of reference for current events, it's important to bear in mind that even some of the best models today struggle with tasks that wouldn't trip up experts. For example, many studies have shown that code-generating models often fail to fix, and even introduce, security vulnerabilities and bugs.
OpenAI also admits that GPT-4.1 becomes less reliable (i.e., more prone to mistakes) the more input tokens it has to deal with. On one of the company's own tests, OpenAI-MRCR, the model's accuracy decreased from around 84% with 8,000 tokens to 50% with 1 million tokens. GPT-4.1 also tends to be more "literal" than GPT-4o, the company says, sometimes necessitating more specific, explicit prompts.

