
OpenAI announces agentic security researcher that can find and fix vulnerabilities
OpenAI has launched a private beta for a new AI agent called Aardvark that acts as a security researcher, finding vulnerabilities and applying fixes at scale.
“Software security is one of the most critical, and challenging, frontiers in technology. Each year, tens of thousands of new vulnerabilities are discovered across enterprise and open-source codebases. Defenders face the daunting tasks of finding and patching vulnerabilities before their adversaries do. At OpenAI, we are working to tip that balance in favor of defenders,” OpenAI wrote in a blog post.
The agent continuously analyzes source code repositories to identify vulnerabilities, assess their exploitability, prioritize severity, and propose patches. Instead of using traditional analysis techniques like fuzzing or software composition analysis, Aardvark uses LLM-powered reasoning and tool use.
Cursor 2.0 enables eight agents to work in parallel without interfering with each other
The AI coding editor Cursor announced the launch of Cursor 2.0, the next iteration of the platform, featuring a new interface for working with multiple agents and its first-ever coding model.
The new multi-agent interface centers around agents instead of files. With this new interface, up to eight agents can work in parallel, using git worktrees or remote machines to prevent them from interfering with each other. It also allows developers to have multiple models attempt the same problem and see which one produces the best output.
While this new interface is designed for agents, developers will still be able to open files or switch back to the classic IDE as needed.
The new coding model, Composer, is four times faster than similar models, the company claims. It was designed for low-latency agentic coding tasks in Cursor, and it can complete most turns in less than 30 seconds.
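For readers curious how worktree isolation keeps parallel agents apart, here is a minimal Python sketch. It is not Cursor’s implementation: the branch naming scheme, the sibling-directory layout, and the `worktree_commands` helper are assumptions made for illustration. The code only builds standard `git worktree add` commands, one per agent, and executes nothing.

```python
from pathlib import Path

MAX_AGENTS = 8  # the parallel-agent limit stated in the article


def worktree_commands(repo_dir, agent_names, base="main"):
    """Return one `git worktree add` argv list per agent (nothing is executed)."""
    if len(agent_names) > MAX_AGENTS:
        raise ValueError(f"at most {MAX_AGENTS} agents can run in parallel")
    repo = Path(repo_dir)
    commands = []
    for name in agent_names:
        branch = f"agent/{name}"  # hypothetical branch naming scheme
        # Hypothetical layout: each checkout is a sibling directory of the
        # main repository, so agents never write to the same files on disk.
        path = repo.parent / f"{repo.name}-{name}"
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]
        )
    return commands


if __name__ == "__main__":
    for argv in worktree_commands("/tmp/demo-repo", ["alpha", "beta"]):
        print(" ".join(argv))
```

Passing each returned list to something like `subprocess.run` would give every agent its own checkout and branch of the same repository, which is the property that prevents one agent’s edits from clobbering another’s.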
Workato launches Enterprise MCP for SaaS platforms
Organizations are spending heavily on AI agents, but are finding that integrating the agents into all the systems the business needs to function is a very high hurdle.
To help make SaaS platforms agent-ready, integration orchestration company Workato launched Workato Enterprise MCP, which the company said in its announcement can “turn existing workflows, integrations, and APIs into rich, multi-step agent skills that any large-language-model (LLM)-based agent can call, including ChatGPT, Claude, Gemini, and Cursor.”
Adam Seligman, chief technology officer at Workato, told SD Times that “the thing we keep coming back to over and over again is agents show a lot of promise, but to really work for business, they have to get access to business data. And they have to be able to do things inside your business, but do it in a way that you trust. And it’s really hard to get those two things right.”
JetBrains launches open benchmarking platform for measuring AI productivity
JetBrains has launched a new tool designed to enable developers to measure their actual productivity gains from AI tools.
The company’s Developer Productivity AI Arena (DPAI Arena) is an open benchmarking platform for how well AI development tools complete real-world software engineering tasks. According to the company, existing benchmarks that LLMs are run against rely on outdated datasets, cover a narrow range of technologies, and focus primarily on issue-to-patch workflows.
“As AI coding tools advance rapidly, the industry still lacks a neutral, standards-based framework to measure their real impact on developer productivity,” the company wrote in a blog post.
DPAI Arena uses a flexible, track-based architecture to enable reproducible comparisons across workflows like patching, bug fixes, PR review, test generation, static analysis, and more.
GitHub unveils Agent HQ, the next evolution of its platform that focuses on agent-based development
During its annual conference, GitHub Universe, GitHub shared its plans for Agent HQ, its vision for the future of the platform where AI agents are natively integrated across all of GitHub.
As part of this Agent HQ initiative, over the next several months, paid GitHub Copilot users will gain direct access to popular coding agents from Anthropic, OpenAI, Google, Cognition, xAI, and more.
Agent HQ brings with it several new capabilities to support this next evolution, the first of which is mission control, a central command center for assigning, steering, and monitoring the work of multiple agents across GitHub, Copilot CLI, and VS Code.
Mission control’s branch controls give developers granular oversight over running checks for code created by the agents. Identity features will also be introduced to allow developers to manage agents like they would other coworkers and control which agent is building a task, manage access, and enforce policies.
OpenAI completes restructuring, strikes new deal with Microsoft
OpenAI today announced that it has completed the restructuring of its business. When the company was founded in 2015, it was launched as a non-profit organization, and that non-profit has controlled the for-profit arm of the business.
Today’s restructuring turns the for-profit arm into a public benefit corporation called OpenAI PBC. The OpenAI Foundation, the new name for the non-profit, will still control the for-profit and hold a 26% equity stake in OpenAI PBC, a stake currently valued at around $130 billion.
A public benefit corporation differs from traditional corporate structures in that it is “required to advance its stated mission and consider the broader interests of all stakeholders, ensuring the company’s mission and commercial success advance together,” OpenAI’s website explains.
Microsoft announces public preview for planning capability that improves how Copilot in Visual Studio handles complex tasks
Microsoft has announced a public preview for a new feature that aims to enable Copilot in Visual Studio to handle more complex projects.
With its new planning capability in Agent Mode, Copilot will research the codebase to break down big tasks into smaller, more manageable ones, while also iterating on its plan as it works through the steps.
“Planning makes Copilot more predictable and consistent by giving it a structured way to reason about your project. It builds on techniques from hierarchical and closed-loop planning research – enabling Copilot to plan at a high level, execute step by step, and adjust dynamically as it learns more about your codebase and issues encountered during implementation,” Rhea Patel, product manager at Microsoft, wrote in a blog post.
GitKraken releases Insights to help companies measure ROI of AI
GitKraken, a software engineering intelligence company that specializes in improving the developer experience, announced the launch of GitKraken Insights to provide companies with better insights into AI’s impact on developer productivity.
Matt Johnston, CEO of GitKraken, told SD Times that despite the incremental investments in and perceived velocity gains from AI, companies struggle to understand the impact. “I was talking to a VP of developer experience at a large Silicon Valley company, and he was basically saying, ‘We’ve made investments of thousands of seats in Cursor and Copilot and Claude, and we can’t really tell what’s being used… and how on earth do I measure this in a way that’s compelling to my business leaders.’”
GitKraken Insights brings together several different metrics (DORA metrics, code quality analysis, technical debt tracking, AI impact measurement, and developer experience indicators) to paint a picture of what’s happening within the development lifecycle.
Mabl announces updates to Agentic Testing Teammate
The Agentic Testing Teammate works alongside human testers to make the process more efficient. New updates include AI vectorizations and test semantic search, improvements to test coverage, and enhancements to the MCP Server that enable testers to do a number of tasks directly within their IDE, including Test Impact Analysis, intelligent test creation, and failure recommendations.
“This new work is built on the idea that an agent can become an integral part of your testing team,” said Dan Belcher, co-founder of mabl. “Unlike scripting frameworks and general-purpose large language models, mabl builds deep knowledge about your application over time and uses that knowledge to make it, and your team, more effective.”
Couchbase 8.0 adds three new vector indexing and retrieval capabilities
These new capabilities are designed to support diverse vector workloads that facilitate real-time AI applications.
Hyperscale Vector Index is based on the DiskANN nearest-neighbor search algorithm and enables operation across partitioned disks for distributed processing. Composite Vector Index supports pre-filtered queries that can scope the specific vector being sought. Search Vector Index supports hybrid searches containing vectors, lexical search, and structured query criteria in a single SQL++ request.
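The pre-filtering idea behind a composite index can be illustrated with a small, self-contained Python sketch. It does not use Couchbase or SQL++, and it swaps DiskANN for a brute-force cosine-similarity scan; the document fields, the sample data, and the `prefiltered_search` helper are all hypothetical. A scalar predicate narrows the candidate set first, and only the survivors are ranked by vector similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def prefiltered_search(docs, query_vec, predicate, k=2):
    """Apply the scalar filter first, then rank only the remaining vectors."""
    candidates = [d for d in docs if predicate(d)]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(d["vec"], query_vec),
        reverse=True,
    )
    return [d["id"] for d in ranked[:k]]


# Hypothetical sample documents: a scalar field plus a 2-D embedding.
DOCS = [
    {"id": "a", "category": "shoes", "vec": [1.0, 0.0]},
    {"id": "b", "category": "shoes", "vec": [0.6, 0.8]},
    {"id": "c", "category": "hats", "vec": [1.0, 0.1]},
    {"id": "d", "category": "shoes", "vec": [0.0, 1.0]},
]

if __name__ == "__main__":
    print(prefiltered_search(DOCS, [1.0, 0.0], lambda d: d["category"] == "shoes"))
```

Document "c" is close to the query vector but never gets scored, because the category filter removes it before any similarity is computed. That ordering is what distinguishes pre-filtering from filtering the results afterward.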
Anthropic expands memory to all paid Claude users
Anthropic announced that the recent memory feature in Claude is being rolled out to Pro and Max plan users, making it available to all paid users now.
Memory was initially announced in early September, but was only available to Team and Enterprise users to start.
Memory allows Claude to remember your projects and preferences so that you don’t need to re-explain important context across sessions. “Great work builds over time. With memory, each conversation with Claude improves the next,” Anthropic wrote in its initial announcement.
Harness brings vibe coding to database migration with new AI-Powered Database Migration Authoring feature
Harness is on a mission to make it easier for developers to do database migrations with its new AI-Powered Database Migration Authoring feature. This new capability allows users to describe schema changes in natural language to receive a production-ready migration.
For example, a developer could ask “Create a table named animals with columns for genus_species and common_name. Then add a related table named birds that tracks unladen airspeed and proper name. Add rows for Captain Canary, African swallow, and European swallow.”
Harness’ platform would then analyze the current schema and policies, generate a backward-compatible migration, validate the change for safety and compliance, commit it to Git for testing, and create rollback migrations.
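To make the example prompt concrete, the following Python sketch shows one plausible forward migration and its rollback, run against an in-memory SQLite database. This is not output from Harness: the column types, the `animal_id` foreign key, and the decision to store all three names in `common_name` are assumptions, and airspeed values are left NULL because the prompt does not supply them.

```python
import sqlite3

# Forward migration: one reading of the prompt's schema (types are assumed).
FORWARD = """
CREATE TABLE animals (
    id INTEGER PRIMARY KEY,
    genus_species TEXT,
    common_name TEXT NOT NULL
);
CREATE TABLE birds (
    id INTEGER PRIMARY KEY,
    animal_id INTEGER NOT NULL REFERENCES animals(id),
    unladen_airspeed REAL,
    proper_name TEXT
);
"""

# Rollback migration: drop the dependent table first, then its parent.
ROLLBACK = """
DROP TABLE birds;
DROP TABLE animals;
"""

NAMES = ["Captain Canary", "African swallow", "European swallow"]


def migrate(conn):
    """Create both tables and add one animals row plus one birds row per name."""
    conn.executescript(FORWARD)
    for name in NAMES:
        cur = conn.execute("INSERT INTO animals (common_name) VALUES (?)", (name,))
        conn.execute("INSERT INTO birds (animal_id) VALUES (?)", (cur.lastrowid,))
    conn.commit()


def rollback(conn):
    """Undo the forward migration."""
    conn.executescript(ROLLBACK)


def table_names(conn):
    """List user tables currently present in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    migrate(conn)
    print(table_names(conn))
    rollback(conn)
    print(table_names(conn))
```

Pairing every forward script with a rollback that restores the prior state is the property the article attributes to the generated migrations. A real tool would also need to check the existing schema before applying either script.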
Red Hat Developer Lightspeed brings AI assistance to Red Hat’s Developer Hub and migration toolkit
Red Hat Developer Lightspeed has been integrated into both the Red Hat Developer Hub and the migration toolkit for applications (MTA).
In the Red Hat Developer Hub, it acts as an assistant to speed up non-coding tasks, like exploring application design approaches, writing documentation, generating test plans, and troubleshooting applications.
In the migration toolkit, Red Hat Developer Lightspeed automates source code refactoring within the IDE. It leverages MTA’s static code analysis to understand migration issues and how to fix them, and also improves over time by learning what made past changes successful.
MariaDB unifies transactional, analytical, and vector databases in MariaDB Enterprise Platform 2026 release
MariaDB’s Enterprise Platform 2026 release was announced this week, with the promise that it will act as “the definitive database platform for building next-generation intelligent applications.”
To support agentic AI, the company added native RAG for grounding LLMs with context from MariaDB without needing embeddings, vector stores, or retrieval pipelines. The company also added ready-to-use agents within the platform, including a developer copilot that connects to the database and can respond to natural language queries, and a DBA copilot that can handle tasks like performance tuning and debugging.
Additionally, the company added an integrated MCP server so that agents can interact with MariaDB databases. The MCP interface in MariaDB allows users to combine vector search, LLMs, and standard SQL operations, and enables agents to launch serverless databases in the cloud.
Spotify Portal now generally available and packed with features for improving dev experience
Spotify Portal for Backstage provides developers with a ready-to-use version of Backstage, Spotify’s open source solution for building internal developer portals (IDPs).
AiKA, which is an AI assistant for Portal, can now connect to third-party MCP servers and trigger actions in Portal. AiKA itself also functions as an MCP server, allowing developers to connect it to tools like Cursor or Copilot and access Portal data.
“The general availability of Spotify Portal marks a pivotal moment in how organizations build, measure, and optimize developer experience. What began as an internal tool for Spotify engineers is now a fully-fledged platform for enterprises, combining the reliability of Backstage, the insight of Confidence, and the speed of AI-driven workflows,” Spotify wrote.
Sonar announces new solution to optimize training datasets for coding LLMs
Sonar, a company that specializes in code quality, announced a new solution that will improve how LLMs are trained for coding purposes.
According to the company, LLMs that are used to help with software development are often trained on publicly available, open source code containing security issues and bugs, which become amplified throughout the training process. “Even a small amount of flawed data can degrade models of any size, disproportionately degrading their output,” Sonar wrote in an announcement.
SonarSweep (now in early access) aims to mitigate these issues by ensuring that models are learning from high-quality, secure examples.
It works by identifying and fixing code quality and security issues in the training data itself. After analyzing the dataset, it applies a strict filtering process to remove low-quality code while also balancing the updated dataset to ensure it will still offer diverse and representative learning.
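As a rough illustration of the filter-then-balance step described above, here is a short Python sketch. It is not SonarSweep’s algorithm, and it skips the issue-fixing stage entirely: the sample fields (`language`, `issues`), the zero-issue threshold, and the per-language cap are all hypothetical choices. Samples with too many detected issues are dropped, then each language is capped so no single one dominates what remains.

```python
from collections import defaultdict


def filter_and_balance(samples, max_issues=0, per_language_cap=2):
    """Drop low-quality samples, then cap how many each language contributes."""
    # Step 1: strict filter on the (assumed) per-sample issue count.
    clean = [s for s in samples if s["issues"] <= max_issues]

    # Step 2: group the survivors by language.
    by_language = defaultdict(list)
    for s in clean:
        by_language[s["language"]].append(s)

    # Step 3: keep at most `per_language_cap` samples from each language.
    balanced = []
    for language in sorted(by_language):
        balanced.extend(by_language[language][:per_language_cap])
    return balanced


# Hypothetical dataset: an ID, a language tag, and a detected-issue count.
SAMPLES = [
    {"id": 1, "language": "python", "issues": 0},
    {"id": 2, "language": "python", "issues": 3},
    {"id": 3, "language": "python", "issues": 0},
    {"id": 4, "language": "python", "issues": 0},
    {"id": 5, "language": "go", "issues": 0},
]

if __name__ == "__main__":
    print([s["id"] for s in filter_and_balance(SAMPLES)])
```

Capping is only one way to balance a dataset; weighting or resampling under-represented groups would serve the same goal of keeping the filtered data diverse.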
Amazon launches Quick Suite to provide agentic AI across applications and AWS services
Amazon Quick Suite allows users to ask questions, conduct deep research, analyze and visualize data, and create automations.
It can connect to internal repositories, like wikis or intranets, and AWS services. Amazon also offers 50+ built-in connectors to applications like Adobe Analytics, SharePoint, Snowflake, Google Drive, OneDrive, Outlook, ServiceNow, and Databricks, as well as support for more than 1,000 apps via connecting to their MCP servers.
This deep connection across the enterprise enables Quick Sight to analyze data across all of a company’s systems and create complex business workflows across multiple applications and departments.
“Unlike traditional business intelligence tools that work only with databases and data warehouses, Quick Sight’s agentic experience analyzes all forms of data across all your systems and apps, including your documents,” Amazon wrote in a blog post.
Google unveils Gemini Enterprise to offer companies a more unified platform for AI innovation
Google is announcing a new offering built around Gemini, designed specifically with large enterprise use in mind.
Gemini Enterprise consolidates six core components:
- Advanced Gemini models
- A no-code workbench for analyzing information and orchestrating agents
- Pre-built Google agents for tasks like deep research or data insights
- The ability to connect to company data
- A central governance framework for visualizing and securing all agents
- Access to an ecosystem of over 100,000 industry partners
“By bringing all of these components together through a single interface, Gemini Enterprise transforms how teams work. It moves beyond simple tasks to automate entire workflows and drive smarter business outcomes, all on Google’s secure, enterprise-grade architecture,” Thomas Kurian, CEO of Google Cloud, wrote in a blog post.
Atlassian shares major updates to its genAI assistant Rovo at Team ’25 Europe
Atlassian is hosting its annual user conference Team ’25 Europe this week in Barcelona, and during the event, the company shared several new and upcoming updates to its generative AI assistant Rovo.
Atlassian announced the general availability of its AI coding agent Rovo Dev. Rovo Dev can help with code reviews, documentation, dependency cleanups, and more, and it leverages context from tickets, docs, incidents, and business goals to provide developers with information that can help them make more informed decisions.
Additionally, starting early next year, Rovo Search will become the default search in Jira, which will allow Jira’s search to suggest relevant issues and projects.
Rovo Chat will also be getting over 100 out-of-the-box modular capabilities from Atlassian and its partners that can be used in chat, agents, and workflows. Other new Chat capabilities include the ability to remember past conversations and preferences and a new collaborative workspace called Canvas.
Google launches ecosystem of extensions for Gemini CLI
Google is launching Gemini CLI extensions to allow different development tools to connect to the Gemini CLI.
Each extension includes a playbook that teaches the CLI how to effectively use that tool, eliminating the need for developers to configure them. “If you want to look under the hood, Gemini CLI extensions package instructions, MCP servers and custom commands into a familiar and user-friendly format,” Google wrote in a blog post.
Twenty-two extensions are available at launch from Google partners Atlassian, Canva, Confluent, Dynatrace, Elastic, Figma, GitLab, Grafana Labs, Harness, HashiCorp, MongoDB, Neo4j, Pinecone, Postman, Qodo, Shopify, Snyk, Sonar, Stripe, ThoughtSpot, Weights & Biases by CoreWeave, and WIX.
IBM adds new capabilities to watsonx Orchestrate to facilitate agentic AI at scale
As IBM kicked off its annual developer event TechXchange 2025, it announced several new capabilities to enable organizations to unlock value from agentic AI.
“There’s certainly been a lot of buzz in the industry,” said Bruno Aziza, vice president of Data, AI, and Analytics Strategy at IBM Software. “I think if you look at the context of everything that’s going on, customers are struggling. They’re struggling to get value from their investment.”
IBM announced many updates to its AI agent orchestration platform, watsonx Orchestrate. The platform now includes AgentOps, an observability and governance layer for AI agents; Agentic Workflows, standardized and reusable flows that can be used to build and sequence multi-agent systems; and Langflow integration to reduce agent setup time.
OpenAI DevDay: ChatGPT Apps, AgentKit, and GA release of Codex
OpenAI held its annual Developer Day event this week where it announced several updates to its products.
The company unveiled apps in ChatGPT as well as an SDK for developers to build them. Companies that have created apps that are already available include Booking.com, Canva, Coursera, Figma, Expedia, Spotify, and Zillow.
When a user says the name of an available app in a prompt, ChatGPT will automatically surface that app in the chat. For example, saying “Spotify, make a playlist for my party this Friday” will bring in the Spotify app. ChatGPT will also be able to suggest apps when it thinks they’re relevant to the conversation, such as suggesting Zillow’s app in a conversation about buying a house.
Google’s coding agent Jules now works in the command line
Google’s coding agent Jules can now be used directly in developers’ command lines so that it can act as more of a coding companion.
According to Google, it created this new command line interface, called Jules Tools, out of a recognition that the terminal is where developers spend most of their time.
Jules Tools allows developers to spin up tasks, inspect what Jules is doing, and integrate Jules into automation. “Think of Jules Tools as both a dashboard and a command surface for your coding agent,” Google wrote in a blog post.
Amazon Bedrock AgentCore MCP server now available
The AgentCore MCP server offers built-in support for runtime, gateway integration, identity management, and agent memory. It was created to speed up the process of creating components that are compatible with Bedrock AgentCore.
“What typically takes significant time and effort, for example learning about Bedrock AgentCore services, integrating Runtime and Tools Gateway, managing security configurations, and deploying to production can now be completed in minutes through conversational commands with your coding assistant,” AWS wrote in a blog post.
DigitalOcean updates Gradient AI Platform
The Gradient AI Platform is a platform for building AI agents without needing to manage the underlying infrastructure. New features that have been added include support for image generation, auto-indexing of knowledge bases, and VPC integration.
Additionally, DigitalOcean revealed that it will be expanding the platform further in the next few weeks with new offerings like the Gradient AI AgentDevelopmentKit and Gradient AI Genie, which integrates into IDEs and can be used to manage multi-agent systems using natural language.
Microsoft announces preview of its new Agent Framework
Microsoft has announced a preview of the Microsoft Agent Framework, an open-source development kit for .NET and Python for creating AI agents and multi-agent workflows.
It supports creating individual agents as well as graph-based workflows to connect multiple agents.
According to Microsoft, the Agent Framework is a direct successor to its other projects Semantic Kernel and AutoGen, utilizing foundations from both. It brings together Semantic Kernel’s enterprise-grade features like thread-based state management, type safety, filters, telemetry, and model and embedding support, with AutoGen’s abstractions for single- and multi-agent patterns.
Mendix updates its low-code platform with agentic AI features
New agent and genAI features include an agent builder, the ability to create project plans using generative AI, the ability to create microflows and workflows with AI, and support for MCP.
Another focus area of the release is business process automation, and new features related to that include the ability for Mendix Workflows to call AI agents, dynamic case management, and Global Inbox, a single view for all tasks from multiple distributed workflows.
California passes law to ensure safe innovation of frontier AI models
Earlier this week, California’s governor Gavin Newsom signed a new law designed to ensure safe development and deployment of frontier AI models.
“California has proven that we can establish regulations to protect our communities while also ensuring that the growing AI industry continues to thrive,” Newsom said. “This legislation strikes that balance. AI is the new frontier in innovation, and California is not only here for it – but stands strong as a national leader by enacting the first-in-the-nation frontier AI safety legislation that builds public trust as this emerging technology rapidly evolves.”
The law, SB 53, establishes requirements for companies developing frontier AI models, spanning five categories: transparency, innovation, safety, accountability, and responsiveness.
Slack evolves to support agentic capabilities built on conversation data
Salesforce is announcing several major updates to Slack that will enable customers to leverage their conversation history for AI apps and agents.
The company is announcing a real-time search (RTS) API, which surfaces up-to-date discussions, files, and channels to provide agents with context-aware information. To ensure secure use of data, data stays in Slack, and the API adheres to existing user access permissions and only retrieves data that’s relevant to the query.
“It unlocks your organization’s collective intelligence, securely connecting agents to conversations and decisions that were once trapped in silos,” Salesforce wrote in a blog post.
Anthropic claims its newly released Claude Sonnet 4.5 is the “best coding model in the world”
Claude Sonnet 4.5 achieves 77.2% on SWE-bench for software engineering, compared to 74.5% for Claude Opus 4.1 and 72.7% for Claude Sonnet 4. For external comparison, GPT-5 Codex scored 74.5%, GPT-5 scored 72.8%, and Gemini 2.5 Pro scored 67.2%.
Additionally, it leads in the OSWorld benchmark, which tests AI models on real-world computer tasks. It scored 61.4% on that benchmark, beating out Claude Sonnet 4, which scored 42.2%.
“Sonnet 4.5 can produce near-instant responses or extended, step-by-step thinking that’s made visible to the user,” Anthropic says.
According to Anthropic, Claude Sonnet 4.5 also shows better domain-specific knowledge and reasoning in the fields of finance, law, and medicine.
Workato announces MCP platform
Workato Enterprise MCP provides customers with access to over 100 fully managed MCP servers that can connect with different LLMs and agents, including ChatGPT, Claude.AI, Amazon Q, Cursor, and Google Gemini. Some of the MCP servers available in the platform include ones from Atlassian, Box, Reddit, Salesforce, Okta, and Shopify.
“At Workato, we hear every day that while MCP is exciting, enterprises still face challenges making MCP work securely, effectively, and reliably at scale,” said Adam Seligman, Chief Technology Officer at Workato. “Workato Enterprise MCP changes that by bringing the full spectrum of business processes, from the front office to the back office and everything in between, to AI agents through MCP. With pre-built, enterprise-grade servers and skills, we’re giving global enterprises a first-of-its-kind solution that unlocks AI agents to safely execute real business processes at scale, delivering measurable business value.”
VibeSec embeds security analysis into AI coding models to prevent generation of insecure code
OX Security is shifting security as far left as it can go with the launch of VibeSec, which it says can stop insecure AI-generated code before the code even gets generated.
It does this by embedding dynamic security context into the coding model so that it doesn’t suggest code that contains security issues.
“VibeSec doesn’t just accelerate security – it fundamentally changes how security operates. For the first time, security moves faster than vulnerabilities,” said Neatsun Ziv, co-founder and CEO at OX Security.
OutSystems launches Agent Workbench
Agent Workbench allows users to create and orchestrate AI agents that leverage their company’s data sets and workflows. For example, in early access, Axos Bank built a log analysis agent to interpret error logs and Thermo Fisher Scientific used it to build a Customer Escalation Agent that interprets unstructured data from customer interactions.
“Agent Workbench was created to give our customers the tools they need to build the agentic future with OutSystems. Our Early Access Program participants have realized impressive results with Agent Workbench, positioning them as industry leaders in agentic AI,” said Woodson Martin, CEO of OutSystems.