Researchers at DeepSeek on Monday released a new experimental model called V3.2-exp, designed to dramatically lower inference costs when used in long-context operations. DeepSeek announced the model with a post on Hugging Face, also posting a linked academic paper on GitHub.
The most important feature of the new model is called DeepSeek Sparse Attention, an intricate system described in detail in the diagram below. In essence, the system uses a module called a "lightning indexer" to prioritize specific excerpts from the context window. After that, a separate system called a "fine-grained token selection system" chooses specific tokens from within those excerpts to load into the module's limited attention window. Taken together, these allow sparse attention models to operate over long portions of context with comparatively small server loads.
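The two-stage idea described above can be sketched in code. This is a minimal illustrative toy, not DeepSeek's actual implementation: it assumes a cheap per-block "indexer" score (here, a dot product with the mean key of each block) followed by fine-grained top-k token selection within the chosen blocks, so that full attention runs only over a small selected subset of the context. All function and parameter names here are hypothetical.

```python
import numpy as np

def sparse_attention(q, k, v, block_size=4, top_blocks=2, top_tokens=4):
    """Toy two-stage sparse attention for a single query vector:
    1) an indexer cheaply scores blocks of the context,
    2) top tokens are selected within the highest-scoring blocks,
    3) dense attention runs only over those selected tokens."""
    n, d = k.shape
    # Stage 1: indexer -- score each block cheaply via its mean key.
    n_blocks = n // block_size
    blocks = k[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    block_scores = blocks.mean(axis=1) @ q              # (n_blocks,)
    keep = np.argsort(block_scores)[-top_blocks:]       # best blocks
    # Stage 2: fine-grained token selection inside the kept blocks.
    idx = np.concatenate(
        [np.arange(b * block_size, (b + 1) * block_size) for b in keep]
    )
    token_scores = k[idx] @ q
    idx = idx[np.argsort(token_scores)[-top_tokens:]]   # best tokens
    # Stage 3: standard softmax attention over the small subset only.
    w = np.exp(k[idx] @ q / np.sqrt(d))
    w /= w.sum()
    return w @ v[idx]

# Example: a 16-token context attended over via only 4 selected tokens.
rng = np.random.default_rng(0)
q = rng.standard_normal(8)
k = rng.standard_normal((16, 8))
v = rng.standard_normal((16, 8))
out = sparse_attention(q, k, v)
print(out.shape)  # (8,)
```

The cost saving comes from stage 3: attention weights are computed over `top_tokens` entries rather than the full context length, which is where the server-load reduction in long-context settings would come from.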

For long-context operations, the benefits of the system are significant. Preliminary testing by DeepSeek found that the price of a simple API call could be reduced by as much as half in long-context situations. Further testing will be needed to build a more robust assessment, but because the model is open-weight and freely available on Hugging Face, it shouldn't be long before third-party tests can evaluate the paper's claims.
DeepSeek's new model is one of a string of recent breakthroughs tackling the problem of inference costs, essentially the server costs of operating a pre-trained AI model, as distinct from the cost of training it. In DeepSeek's case, researchers were looking for ways to make the fundamental transformer architecture operate more efficiently, and found that significant improvements could be made.
Based in China, DeepSeek has been an unusual figure in the AI boom, particularly for those who view AI research as a nationalist struggle between the United States and China. The company made waves earlier this year with its R1 model, trained primarily using reinforcement learning at a far lower cost than its American competitors. But the model did not spark the wholesale revolution in AI training that some predicted, and the company receded from the spotlight within a few months.
The new "sparse attention" approach is unlikely to produce the same stir as R1, but it could still provide US providers with some much-needed tricks to help keep inference costs low.
