August 2025: AI updates from the past month

Anthropic begins testing a Claude extension for Chrome

The extension will allow Claude to take action on websites on behalf of the user. “We’ve spent recent months connecting Claude to your calendar, documents, and many other pieces of software. The next logical step is letting Claude work directly in your browser,” the company says.

The company is starting off with a small pilot of 1,000 Max plan users, and will gradually expand the program to more people if the pilot goes well.

According to Anthropic, one of the big safety challenges with agents that use the browser is prompt injection attacks, and some of the steps the company has taken to defend against them are providing site-level permissions and requiring action confirmations. This pilot will test how well these defenses hold up in real-world scenarios.

Google integrates Gemini CLI into Zed code editor

Google announced that it has brought the Gemini CLI to the open source code editor Zed. The new integration will enable Zed users to generate and refactor code in the editor, get instant answers on code or error messages, and chat naturally in the terminal.

Developers will be able to follow along live with the Gemini agent as it makes changes. Once the agent is done working, Zed will display the changes in a review interface that shows a clear diff for each edit that can be reviewed, accepted, or modified, providing the same level of control as a code review.

Users will also be able to provide context beyond the codebase by pointing the agent to external sources like a URL with documentation or an API spec.

Microsoft packs Visual Studio August update with smarter AI features

Microsoft has released the August update for Visual Studio 2022, adding several features related to AI-assisted development.

The company announced that GPT-5 is now integrated into the IDE, and support for MCP is generally available as well. MCP support enables developers to authenticate with any OAuth provider directly from the IDE, perform one-click installation of MCP servers, and manage MCP access from GitHub policy settings.

Copilot Chat was updated with the ability to surface relevant code snippets more reliably, using improved semantic code search to determine when queries should trigger a code lookup. Developers can now connect models from OpenAI, Google, and Anthropic to Visual Studio Chat as well.

Agent Mode in Gemini Code Assist now available in VS Code and IntelliJ

This mode was introduced last month to the Insiders Channel for VS Code to expand the capabilities of Code Assist beyond prompts and responses to support actions like multiple file edits, full project context, and built-in tools and integration with ecosystem tools.

Since being added to the Insiders Channel, several new features have been added, including the ability to edit code changes using Gemini’s Inline diff, user-friendly quota updates, real-time shell command output, and state preservation between IDE restarts.

Separately, Google also announced new agentic capabilities in its AI Mode in Search, such as the ability to make dinner reservations based on factors like party size, date, time, location, and preferred type of food. U.S. users opted into the AI Mode experiment in Labs will also now see results that are more specific to their own preferences and interests. The company also announced that AI Mode is now available in over 180 new countries.

GitHub’s coding agent can now be launched from anywhere on the platform using new Agents panel

GitHub has added a new panel to its UI that allows developers to invoke the Copilot coding agent from anywhere on the site.

From the panel, developers can assign background tasks, monitor running tasks, or review pull requests. The panel is a lightweight overlay on GitHub.com, but developers can also open the panel in full-screen mode by clicking “View all tasks.”

The agent can be launched from a single prompt, like “Add integration tests for LoginController” or “Fix #877 using pull request #855 as an example.” It can also run multiple tasks simultaneously, such as “Add unit test coverage for utils.go” and “Add unit test coverage for helpers.go.”

Anthropic adds Claude Code to Enterprise, Team plans

With this change, both Claude and Claude Code will be available under a single subscription. Admins will be able to assign standard or premium seats to users based on their individual roles. By default, seats include enough usage for a typical workday, but additional usage can be added during periods of heavy use. Admins can also create a maximum limit for extra usage.

Other new admin settings include a usage analytics dashboard and the ability to deploy and enforce settings, such as tool permissions, file access restrictions, and MCP server configurations.

Microsoft adds Copilot-powered debugging features for .NET in Visual Studio

Copilot can now suggest appropriate locations for breakpoints and tracepoints based on current context. Similarly, it can troubleshoot non-binding breakpoints and walk developers through the potential cause, such as mismatched symbols or incorrect build configurations.

Another new feature is the ability to generate LINQ queries on massive collections in the IEnumerable Visualizer, which renders data into a sortable, filterable tabular view. For example, a developer could ask for a LINQ query that will surface problematic rows causing a filter issue. Additionally, developers can hover over any LINQ statement and get an explanation from Copilot of what it’s doing, evaluate it in context, and highlight potential inefficiencies.

Copilot can also now help developers deal with exceptions by summarizing the error, identifying potential causes, and offering targeted code fix suggestions.

Groundcover launches observability solution for LLMs and agents

The eBPF-based observability provider groundcover announced an observability solution specifically for monitoring LLMs and agents.

It captures every interaction with LLM providers like OpenAI and Anthropic, including prompts, completions, latency, token usage, errors, and reasoning paths.

Because groundcover uses eBPF, it operates at the infrastructure layer and can achieve full visibility into every request. This allows it to do things like follow the reasoning path of failed outputs, investigate prompt drift, or pinpoint when a tool call introduces latency.

IBM and NASA release open-source AI model for predicting solar weather

The model, Surya, analyzes high-resolution solar observation data to predict how solar activity affects Earth. According to IBM, solar storms can damage satellites, impact airline travel, and disrupt GPS navigation, which can negatively impact industries like agriculture and disrupt food production.

The solar images that Surya was trained on are 10x larger than typical AI training data, so the team had to create a multi-architecture system to handle them.

The model was released on Hugging Face.

Preview of NuGet MCP Server now available

Last month, Microsoft announced support for building MCP servers with .NET and then publishing them to NuGet. Now, the company is announcing an official NuGet MCP Server to integrate NuGet package information and management tools into AI development workflows.

“Since the NuGet package ecosystem is always evolving, large language models (LLMs) get out-of-date over time and there is a need for something that assists them in getting information in realtime. The NuGet MCP server provides LLMs with information about new and updated packages that have been published after the models as well as tools to complete package management tasks,” Jeff Kluge, principal software engineer at Microsoft, wrote in a blog post.

Opsera’s Codeglide.ai lets developers easily turn legacy APIs into MCP servers

Codeglide.ai, a subsidiary of the DevOps company Opsera, is launching its MCP server lifecycle platform that will enable developers to turn APIs into MCP servers.

The solution constantly monitors API changes and updates the MCP servers accordingly. It also provides context-aware, secure, and stateful AI access without the developer needing to write custom code.

According to Opsera, large enterprises may maintain 2,000 to 8,000 APIs, 60% of which are legacy APIs, and MCP provides a way for AI to efficiently interact with those APIs. The company says that this new offering can reduce AI integration time by 97% and costs by 90%.

Confluent announces Streaming Agents

Streaming Agents is a new feature in Confluent Cloud for Apache Flink that brings agentic AI into data stream processing pipelines. It enables users to build, deploy, and orchestrate agents that can act on real-time data.

Key features include tool calling via MCP, the ability to connect to models or databases using Flink, and the ability to enrich streaming data with non-Kafka data sources, like relational databases and REST APIs.

“Even your smartest AI agents are flying blind if they don’t have fresh business context,” said Shaun Clowes, chief product officer at Confluent. “Streaming Agents simplifies the messy work of integrating the tools and data that create real intelligence, giving organizations a solid foundation to deploy AI agents that drive meaningful change across the business.”

Anthropic expands Claude Sonnet 4’s context window to 1M tokens

With this larger context window, Claude can process codebases with 75,000+ lines of code in a single request. This allows it to better understand project architecture and cross-file dependencies, and to make suggestions that fit with the whole system design.

Longer context windows are now in beta on the Anthropic API and Amazon Bedrock, and will soon be available in Google Cloud’s Vertex AI.

For prompts over 200K tokens, pricing will increase to $6 / million tokens (MTok) for input and $22.50 / MTok for output. The pricing for requests under 200K tokens will be $3 / MTok for input and $15 / MTok for output.
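To make the two pricing tiers concrete, here is a minimal Python sketch that estimates the cost of a single request from the rates quoted above. The function name `estimate_cost` is hypothetical, and the sketch assumes that the higher rates apply to the entire request once the prompt exceeds 200K tokens and that a prompt of exactly 200K tokens is billed at the lower rates; the announcement as summarized here does not spell out either detail.

```python
# Rates quoted in the article, in US dollars per million tokens (MTok).
STANDARD_RATES = {"input": 3.00, "output": 15.00}      # prompts up to 200K tokens
LONG_CONTEXT_RATES = {"input": 6.00, "output": 22.50}  # prompts over 200K tokens
LONG_CONTEXT_THRESHOLD = 200_000  # prompt size at which the higher tier begins


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request (hypothetical helper).

    Assumption: the tier is chosen by prompt (input) size alone, and the
    chosen tier's rates apply to all input and output tokens in the request.
    """
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        rates = LONG_CONTEXT_RATES
    else:
        rates = STANDARD_RATES
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # A 100K-token prompt with a 2K-token reply stays on the standard tier.
    print(f"${estimate_cost(100_000, 2_000):.3f}")
    # A 500K-token prompt with a 2K-token reply falls in the long-context tier.
    print(f"${estimate_cost(500_000, 2_000):.3f}")
```

Under these assumptions, crossing the 200K threshold doubles the input rate and raises the output rate by half, so very large prompts are dominated by input cost.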

The company also extended its learning mode, designed for students, into Claude.ai and Claude Code. Learning mode asks users questions to guide them through concepts instead of providing immediate answers, to promote critical thinking about problems.

OpenAI adds GPT-4o as a legacy model in ChatGPT

With this update, paid users will now be able to select GPT-4o when using ChatGPT, along with other models like o3, GPT-4.1, and GPT-5 Thinking mini.

The model picker for GPT-5 also now includes Auto, Fast, and Thinking modes. Fast prioritizes giving the fastest answers, Thinking prioritizes giving deeper answers that take longer to think through, and Auto chooses between the two.

The company also increased the message limit for Plus and Team users to 3,000 per week on GPT-5 Thinking.

Google releases Gemma 3 270M

This new model is “designed from the ground up for task-specific fine-tuning with strong instruction-following and text structuring capabilities already trained in,” according to Google.

It’s ideal in situations where there is a high-volume, well-defined task; speed and cost matter; user privacy needs to be protected; or there is a need for a fleet of specialized task models.

Both pretrained and instruction-tuned versions of the model are available for download from Hugging Face, Ollama, Kaggle, LM Studio, and Docker. Alternatively, the models can be tried out in Vertex AI.

NVIDIA releases latest models in Llama Nemotron family

Llama Nemotron is a family of reasoning models, and the latest updates include a new hybrid model architecture, compact quantized models, and a configurable thinking budget to give developers more control over token generation.

“This combination lets the models reason more deeply and respond faster, without needing more time or computing power. This means better results at a lower cost,” the company wrote in an announcement.

Google’s coding agent Jules gets critique functionality

Google is enhancing its AI coding agent, Jules, with new functionality that reviews and critiques code while Jules is still working on it.

“In a world of rapid iteration, the critic moves the review to earlier in the process and into the act of generation itself. This means the code you review has already been interrogated, refined, and stress-tested … Great developers don’t just write code, they question it. And now, so does Jules,” Google wrote in a blog post.

According to the company, the coding critic is like a peer reviewer who is familiar with code quality principles and is “unafraid to point out when you’ve reinvented a risky wheel.”

GitHub to be folded into Microsoft’s CoreAI org

GitHub’s CEO Thomas Dohmke has announced his plans to leave the company at the end of the year.

In a memo to employees, he said that Microsoft doesn’t plan to replace him; rather, GitHub and its leadership team will now operate under Microsoft’s CoreAI organization, a group within the company focused on developing AI-powered tools, including GitHub Copilot.

“Today, GitHub Copilot is the leader of the most successful and thriving market in the age of AI, with over 20 million users and counting,” he wrote. “We did this by innovating ahead of the curve and showing grit and determination when challenged by the disruptors in our space. In just the last year, GitHub Copilot became the first multi-model solution at Microsoft, in partnership with Anthropic, Google, and OpenAI. We enabled Copilot Free for millions and introduced the synchronous agent mode in VS Code as well as the asynchronous coding agent native to GitHub.”

Sentry launches MCP monitoring tool

Application monitoring company Sentry is making it easier to gain visibility into MCP servers with the launch of a new monitoring tool.

With MCP monitoring, developers can understand things like which clients are experiencing errors, which tools are most used, or which tools are running slow. They can also correlate errors with events like traffic spikes or new release deployments, or figure out if errors are only occurring on one type of transport.

According to Cody De Arkland, head of developer experience at Sentry, when Sentry launched its own MCP server, it was getting over 30 million requests per month. He said that at that scale, it’s inevitable that errors will occur, and existing monitoring tools were struggling with MCP servers.

bitHuman launches SDK for creating AI avatars

AI company bitHuman has announced a visual SDK for creating avatars to be used as chat agents, instructors, virtual coaches, companions, and experts in different fields.

According to the company, the SDK allows avatars to be created on Arm-based and x86 systems without a GPU. The avatars have a small footprint and can be run online or offline on devices like Chromebooks, Mac Minis, and Raspberry Pis.

Because of their small footprint, these characters can be brought to a wide range of environments, including classrooms, kiosks, mobile apps, or edge devices.

OpenAI launches GPT-5

OpenAI announced the availability of GPT-5, which it says is “smarter across the board” compared to earlier models.

Specifically for coding, GPT-5 achieved significant improvement in complex front-end generation and debugging larger repositories. Early testers said that it made better design choices in terms of spacing, typography, and white space, according to the company.

“We think you’ll love using GPT-5 much more than any previous AI,” CEO Sam Altman said during the livestream. “It’s useful. It’s smart. It’s fast. It’s intuitive.”

Anthropic releases Claude Opus 4.1

This latest update improves the model’s research and data analysis skills, and achieves 74.5% on SWE-bench Verified (compared to 72.5% for Opus 4).

It’s available to paid Claude users, in Claude Code, and on Anthropic’s API, Amazon Bedrock, and Google Cloud’s Vertex AI.

The company plans to release larger improvements across its models in the coming weeks as well.

AWS introduces Automated Reasoning checks to reduce AI hallucinations

Automated Reasoning checks are part of Amazon Bedrock Guardrails, and validate the accuracy of AI-generated content against domain knowledge. According to AWS, this feature provides 99% verification accuracy.

This was first introduced as a preview at AWS re:Invent, and with this general availability release, several new features are being added, including support for large documents in a single build, simplified policy validation, automated scenario generation, enhanced policy feedback, and customizable validation settings.

Google adds Gemini CLI to GitHub Actions

This new offering is designed to act as an agent for routine coding tasks. At launch, it includes three workflows: intelligent issue triage, pull request reviews, and the ability to mention @gemini-cli in any issue or pull request to delegate tasks.

It’s available in beta, and Google is offering free-of-charge quotas for Google AI Studio. It is also supported in Vertex AI and the Standard and Enterprise tiers of Gemini Code Assist.

OpenAI announces two open weight reasoning models

OpenAI is joining the open weight model game with the launch of gpt-oss-120b and gpt-oss-20b.

Gpt-oss-120b is optimized for production, high reasoning use cases, and gpt-oss-20b is designed for lower latency or local use cases.

According to the company, these open models are comparable to its closed models in terms of performance and capability, but at a much lower cost. For example, gpt-oss-120b running on an 80 GB GPU achieved similar performance to o4-mini on core reasoning benchmarks, while gpt-oss-20b running on an edge device with 16 GB of memory was comparable to o3-mini on several common benchmarks.

Google DeepMind launches Genie 3

Genie 3 is a frontier model for generating real-world environments. It can model physical properties of the world, like water, lighting, and environmental actions.

Users can also use prompts to change the generated world, for example to add new objects and characters or change weather conditions.

According to DeepMind, this research is important because it can enable AI agents to be trained in a variety of simulated environments.
