
Coder creates AI Maturity Self-Assessment and AI Maturity Curve
These new tools will allow software development teams to assess how successfully they’ve adopted AI. The assessment asks teams questions like how standardized their developer environment is, what their governance strategy for AI is, how they handle risks like sensitive data exposure, and more.
“As AI agents take on more responsibility within engineering workflows, organizations need a clearer, more tangible way to understand maturity and governance readiness,” said Eric Paulsen, field CTO at Coder. “Without that baseline, it becomes difficult to scale agentic AI safely or predictably. Our self-assessment gives teams a concrete view of where they stand, so they can plan adoption intentionally, manage risk and scale with confidence.”
Anthropic makes tools within Claude interactive
Anthropic has announced that users will now be able to directly interact with certain tools within Claude.
Claude already had the ability to connect to tools and take action on a user’s behalf in those tools, so what’s new today is the ability for the user to go in and interact with those tools directly in the Claude window.
The tools that currently support interactivity include Amplitude, Asana, Box, Canva, Clay, Figma, Hex, monday.com, and Slack, and there are plans to add support for Salesforce soon as well.
OpenAI will retire GPT-4o, GPT-4.1, GPT-4.1 mini, and OpenAI o4-mini in ChatGPT
The company had initially deprecated GPT-4o when it launched GPT-5, but brought it back after users said they needed more time to make the transition and that they preferred the older model’s conversational style and warmth.
OpenAI has since incorporated that feedback into GPT-5.1 and GPT-5.2 by making personality improvements, offering greater support for creative ideation, and adding more ways to customize responses.
“We’re announcing the upcoming retirement of GPT‑4o today because these improvements are now in place, and because the vast majority of usage has shifted to GPT‑5.2, with only 0.1% of users still choosing GPT‑4o each day,” OpenAI wrote in a post.
Teleport tackles agentic trust with new Agentic Identity Framework
Teleport has announced the launch of its new Agentic Identity Framework that defines policies, practices, developer tools, and a reference architecture for securely deploying agents in production.
According to the company, agentic AI introduces new security challenges, as agents invoke tools, access sensitive data, delegate tasks, and operate across environments at scale, all without human involvement.
Teleport says current identity, access, and security models weren’t designed for non-deterministic systems, and existing attempts at deploying agentic systems have led to identity fragmentation, secrets sprawl, limited visibility, and systemic risk.
The Agentic Identity Framework attempts to solve these issues by establishing an identity layer that’s secured cryptographically with a hardware root of trust. It enables zero trust authentication, zero standing privileges, and real-time visibility into identity behavior.
Apiiro announces Guardian Agent
Guardian Agent rewrites developer prompts to make them more secure and ensure they meet the current needs of the software architecture, runtime environments, organizational policies, and regulatory requirements.
According to the company, because of AI, security debt is being added faster than it can be fixed, and asking developers to fix vulnerabilities after code is written is no longer sufficient. “The reality is clear: Detection will never scale at the speed of AI. Only prevention will,” the company wrote in a blog post.
Ai2 releases Open Coding Agents
Open Coding Agents are a family of open agents that use a training method that makes it easier for developers to build their own coding agent trained on their internal codebases.
The first release is SERA (Soft-verified Efficient Repository Agent), which uses a fine-tuning method that can be specialized to any codebase. The company is also releasing SERA’s training data to help researchers examine what worked and improve on it.
“Accessible open models can now inherit strong agentic behavior through a simple, reproducible pipeline – no large-scale RL infrastructure or engineering team required. Case in point, SERA was built largely by a single Ai2 researcher,” Ai2 wrote in a blog post.
Rocket Software launches AI assistant for operational diagnostics
Rocket EVA allows teams to ask questions about their core systems and trace issues from initial symptoms to system interactions to the responsible code. It also provides recommendations to help teams resolve issues more quickly.
“By tracing issues from the first symptom to the exact line of code, EVA provides a unified path to insight without the multi-product complexity other vendors require,” said Michael Curry, president of data modernization at Rocket Software. “Its ability to extend diagnostics across platforms and integrate third-party MCP tools reduces the time to resolve issues, setting a new benchmark for how enterprises maintain resilient, high‑performing systems.”
Report: AI hallucinates 27% of upgrade recommendations for open source projects
Open-source adoption is being accelerated by AI and automation, but developers need to proceed with caution to ensure they’re not introducing additional risk into their software supply chain.
Brian Fox, co-founder and CTO of Sonatype, explained that AI can accelerate good engineering, but it can also scale mistakes faster, especially if it doesn’t have real-world data to pull from. For example, if a model doesn’t know what versions exist or which ones have vulnerabilities, it predicts and fills in the blank, leading to upgrades to versions that don’t exist or recommendations that break builds.
In its 2026 State of Software Supply Chain report, Sonatype analyzed over 1.2 million malicious packages, 1,700 vulnerability records, and 37,000 AI-driven upgrade recommendations. It found that AI models recommended over 10,000 non-existent versions, which is a 27.75% hallucination rate.
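The failure mode Fox describes, a model recommending a version that was never published, can be guarded against by checking each recommendation against the versions a registry actually lists. The following is a minimal sketch, assuming a hypothetical in-memory `KNOWN_VERSIONS` mapping in place of a real registry lookup. The package names, versions, and helper names are illustrative and are not taken from Sonatype's report.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a registry lookup; the packages and versions
# here are made up for illustration only.
KNOWN_VERSIONS: dict[str, set[str]] = {
    "example-lib": {"1.0.0", "1.1.0", "2.0.0"},
    "sample-utils": {"0.9.1", "0.9.2"},
}


@dataclass(frozen=True)
class Recommendation:
    """One AI-suggested upgrade target (hypothetical structure)."""
    package: str
    version: str


def is_hallucinated(rec: Recommendation, known: dict[str, set[str]]) -> bool:
    """Return True when the recommended version is not in the known set."""
    return rec.version not in known.get(rec.package, set())


def hallucination_rate(recs: list[Recommendation], known: dict[str, set[str]]) -> float:
    """Share of recommendations (0.0 to 1.0) that point at unknown versions."""
    if not recs:
        return 0.0
    bad = sum(is_hallucinated(r, known) for r in recs)
    return bad / len(recs)


recommendations = [
    Recommendation("example-lib", "2.0.0"),   # exists
    Recommendation("example-lib", "3.5.0"),   # version never published
    Recommendation("sample-utils", "0.9.2"),  # exists
    Recommendation("unknown-pkg", "1.0.0"),   # package not in the mapping
]

rate = hallucination_rate(recommendations, KNOWN_VERSIONS)
print(f"{rate:.2%} of recommendations point at unknown versions")
# prints "50.00% of recommendations point at unknown versions"
```

In a real pipeline the lookup would query the package registry at recommendation time, so the check reflects what is actually published rather than what a model remembers.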
GitHub Copilot SDK now in technical preview
The SDK allows developers to embed agentic capabilities into their applications using the same execution loop used by the GitHub Copilot CLI. The SDK repository includes setup instructions, starter examples, and SDK references for all of the supported languages.
GitHub recommends starting by defining a single task, such as updating files or running a command, and letting Copilot plan and execute steps while the application supplies domain-specific tools and constraints.
Anthropic drafts new constitution for Claude models
The constitution is Anthropic’s vision for Claude’s values and behavior. The main sections in this updated version include specifications related to helpfulness, ethics, safety, nature, and guidelines for how to handle specific issues, like medical advice or cybersecurity requests.
“The constitution is a crucial part of our model training process, and its content directly shapes Claude’s behavior. Training models is a difficult task, and Claude’s outputs might not always adhere to the constitution’s ideals. But we think that the way the new constitution is written – with a thorough explanation of our intentions and the reasons behind them – makes it more likely to cultivate good values during training,” Anthropic wrote.
OpenAI adds age prediction to ChatGPT
The company announced that it will be using age prediction technology on ChatGPT consumer plans to determine if a user is under 18.
“Age prediction builds on protections already in place. Teens who tell us they are under 18 when they sign up automatically receive additional safeguards to reduce exposure to sensitive or potentially harmful content. This also enables us to treat adults like adults and use our tools in the way that they want, within the bounds of safety,” OpenAI wrote in a post.
GitLab’s Duo Agent Platform is now generally available
GitLab has made its Duo Agent Platform generally available, providing development teams with agentic AI automation that has access to an organization’s full context, standards, and guardrails.
The GA release includes Agentic Chat, providing context-aware assistance throughout the GitLab platform. Agentic Chat builds on the previously released Duo Chat, brings in context from issues, merge requests, pipelines, security findings, and more, and can perform actions on a developer’s behalf.
For example, in the Web UI, Agentic Chat can create issues, epics, and merge requests, highlight key findings, and create actionable guidance based on organizational context. Additionally, in the IDE, it can generate code, configurations, and infrastructure-as-code, as well as fix bugs, generate tests, and produce documentation.
Agentic Chat can also help developers understand, configure, or troubleshoot CI/CD pipelines or create new ones. On the security front, it can explain vulnerabilities, help with issue prioritization, and recommend fixes.
Codenotary updates its free SBOM scanning tool with capabilities that better support AI apps
Codenotary is adding new capabilities to its SBOM.sh service, which provides free analysis of software bills of materials (SBOMs).
According to the company, the updates were made in consideration of AI applications, and the tool now treats datasets as software supply chain artifacts.
“Traditional SBOM tools were built for an earlier era – focusing primarily on source code to improve visibility into the software supply chain,” said Moshe Bar, CEO and co-founder of Codenotary. “Security teams are swimming in SBOMs, but they’re not getting the actionable clarity they need – especially as AI transforms software, with AI applications built on datasets which are completely ignored by traditional SBOMs.”
Testlio launches new AI-powered QA analysis solution
Testlio has announced the release of a new AI-driven QA analysis solution called LeoInsights.
The new platform is powered by the company’s intelligence layer LeoAI Engine, which was trained on 13 years of testing data, 2.6+ million test cases, and 600,000+ devices.
It can provide executive summaries featuring key changes, emerging risks, and critical issues, simplifying multiple QA reports into one that can be shared with leaders.
LeoInsights also offers a value calculator that quantifies efficiency gains, cost savings, and quality impact, helping QA teams better demonstrate their value to leadership. The calculator can aggregate data across workspaces, do scenario modeling with adjustable inputs, and generate PDFs that can be shared with executives for budgeting and investment discussions.
New Relic adds monitoring for ChatGPT apps
New Relic customers will now be able to monitor their custom ChatGPT apps to ensure they’re delivering the intended performance, reliability, and user experience.
“Bringing business services into the natural flow of a ChatGPT conversation is a powerful, intuitive, and revenue-generating strategy,” said Brian Emerson, chief product officer of New Relic. “But once your carefully crafted application instantiates within ChatGPT, it traditionally enters a black box where standard browser monitoring tools can fail.”
The company went on to explain that when an app is rendered in a conversation, developers can’t see things like layout shifts or broken buttons. Additionally, security headers, content security policies, iframe sandbox rules, and limitations on client-side storage can hide important performance and user experience data.
Google unveils new open-source standard for agentic commerce
Google has announced a new open-source standard for agentic commerce called the Universal Commerce Protocol (UCP).
Developed in collaboration with a number of commerce companies, including Shopify, Etsy, Wayfair, Target, and Walmart, UCP establishes a common language and primitives for the commerce journey between consumer surfaces, businesses, and payment providers.
“As consumers embrace conversational experiences, they expect seamless transitions from brainstorming and research to final purchase. That means it’s essential to support real-time inventory checks, dynamic pricing, and instant transactions, all within the user’s current conversational context,” Google wrote in a blog post.
Newly redesigned Slackbot is now generally available
Salesforce announced that the newly redesigned Slackbot is now generally available, offering users an out-of-the-box AI agent that lives within Slack.
“By bringing the full power of the Agentic Enterprise where billions of workplace conversations already happen every week, working with enterprise-grade AI becomes as natural as talking to a coworker,” Salesforce wrote in an announcement.
According to Salesforce, Slackbot leverages context within Slack and connected tools to help find answers, organize work, create content, schedule meetings, and take action.
Kaggle introduces Community Benchmarks to allow for custom evaluations of AI models
Kaggle has announced that it now offers Community Benchmarks, enabling AI practitioners to design, run, and share their own benchmarks for evaluating AI models.
Kaggle is a community platform run by Google that offers models and resources for data scientists and machine learning practitioners. Last year, it launched Kaggle Benchmarks to provide evaluations from research groups, such as Meta’s MultiLoKo and Google’s FACTS suite benchmarks.
This latest announcement extends this to the community as a whole, allowing members to create benchmarks specific to their own use cases. According to Google, AI capabilities are evolving so quickly that the existing ways of benchmarking and evaluating them aren’t able to keep up. With Community Benchmarks, the company hopes to bridge this gap and provide a more flexible and transparent framework for evaluation.
Copilot Studio Extension now available in VS Code
Microsoft has announced the general availability of its Copilot Studio Extension for Visual Studio Code.
The extension allows developers to build and manage Copilot Studio agents directly from within their IDE.
According to Microsoft, the extension is useful because developers need the same controls and processes when developing agents as they do for other applications: source control, pull requests, change history, and repeatable deployments.
Box Extract intelligently pulls information from unstructured content to help with workflow automation
Box announced the launch of Box Extract, which intelligently pulls information from content and saves it as metadata, helping organizations automate workflows and accelerate decision-making by making information more easily accessible.
According to the company, a lot of organizational knowledge lives in contracts, product specs, policy documents, charts, and other types of unstructured content. Box Extract uses agentic capabilities and AI models from Google, Anthropic, and OpenAI to accurately extract this information.
Box explained that legacy tools often focus only on extracting text, while Box Extract understands document structure and meaning. It breaks the document down into components like paragraphs, tables, and charts, and then pulls out important information from those components.
Google releases TranslateGemma
TranslateGemma is a set of open translation models built on Gemma 3. The models were trained and evaluated on 55 language pairs, and were additionally trained on almost 500 language pairs as a starting point for researchers, even though those pairs haven’t been evaluated yet.
According to Google, TranslateGemma significantly reduces error rates in translation compared to baseline Gemma models alone.
The 4B model is optimized for mobile and edge deployment, the 12B model is optimized for consumer laptops, and the 27B model is designed for maximum fidelity and can run on something like a single H100 GPU or TPU in the cloud.