OpenAIが新しいツールを発表、開発者が強力で信頼性の高いAIエージェントを構築を支援します。

OpenAI has released a new tool today to help developers build powerful and reliable AI Agents. 🤖🔧

It feels like there's a lot of stuff in OpenAI’s library just waiting to be unleashed, and whenever something eye-catching happens in the industry, they quickly pull out a game-changing tool from their pocket.

In recent years, AI Agents have gradually become an important direction for the development of AI technology. Unlike traditional AI models that can only answer questions or make predictions, AI Agents can independently complete complex tasks assigned by users, such as helping users quickly search for information, automatically processing documents, or even manipulating computers to perform specific operations.

However, despite significant improvements in reasoning, multimodal interaction, and security in existing large language models (LLMs), there are still many challenges in converting these powerful functions into stable and reliable product-level AI agents, such as requiring extensive prompt engineering and complex logical orchestration, and lacking built-in visual debugging tools.

To address these issues, OpenAI officially launched a series of new APIs and tools specifically designed to simplify the AI agent application development process.

Overview of the new tools:

Responses API: Combining the ease of use of the Chat Completions API with the tool-calling functionality of the Assistants API, making it convenient for developers to quickly set up AI agents.
Built-in practical tools: Including web search, file search, and computer operation functions, enabling AI agents to autonomously complete more complex tasks.
Agents SDK: Supporting single-agent and multi-agent process orchestration, significantly reducing the complexity of AI agent development.
Integrated observability tools: Can track and review the task execution of AI agents in real time, helping developers quickly locate and solve problems.

OpenAI stated that these new tools will greatly lower the threshold for developers entering the AI agent development field, making it possible to quickly deploy efficient and stable AI agents. In the coming months, OpenAI will also release more features to continuously optimize and simplify the development and deployment process of AI agent applications.

OpenAI releases Responses API

OpenAI has just launched the new Responses API, which is a core API tool for building powerful AI Agents. It effectively combines the ease of use of Chat Completions with the tool-calling capabilities of Assistants API.

The Responses API provides a series of built-in tools, including web search, file search, and computer use, allowing models to interact more deeply with the real world and thus more efficiently complete various tasks.

Powerful tool integration

Web search tool：

Helps developers quickly obtain real-time and accurate information while providing clear source references.
Applied to gpt-4o and gpt-4o-mini models.
Provides clear source links, making it convenient for users to explore further and helping content providers reach a wider audience.

const response = await openai.responses.create({
    model: "gpt-4o",
    tools: [ { type: "web_search_preview" } ],
    input: "What was a positive news story that happened today?",
});

console.log(response.output_text);

For example, Hebbia integrates the web search function to help investment managers and law firms quickly gain real-time insights, achieving efficient market intelligence analysis.
Pricing: $30 per thousand queries for GPT-4o and $25 for 4o-mini.
Performance: GPT‑4o search preview and GPT‑4o mini search preview achieved accuracies of 90% and 88%, respectively, on the SimpleQA benchmark test.

File search (File search) tool

Can quickly review FAQs, legal cases, and technical documents, significantly improving efficiency.

const productDocs = await openai.vectorStores.create({
    name: "Product Documentation",
    file_ids: [file1.id, file2.id, file3.id],
});

const response = await openai.responses.create({
    model: "gpt-4o-mini",
    tools: [{
        type: "file_search",
        vector_store_ids: [productDocs.id],
    }],
    input: "What is deep research by OpenAI?",
});

console.log(response.output_text);

For example, Navan uses this tool to optimize customer support experiences, providing customized answers to different users, saving time and ensuring accuracy.

Computer operation (Computer use) tool

Supports AI models to autonomously execute mouse and keyboard operations, directly converting them into executable instructions in the environment.

const response = await openai.responses.create({
    model: "computer-use-preview",
    tools: [{
        type: "computer_use_preview",
        display_width: 1024,
        display_height: 768,
        environment: "browser",
    }],
    truncation: "auto",
    input: "I'm looking for a new camera. Help me find the best one.",
});

console.log(response.output);

Successfully applied to complex tasks, such as assisting Unify in detecting real estate expansion through automatic web map browsing.
Current performance in practical task automation stands at 38.1%, with room for further improvement.
Charging standard: $3 per million input tokens, $12 per million output tokens.

Innovative advantages of the Responses API:

Unified entry design and simplified polymorphic interface.
Intuitive streaming events and easy-to-access SDK helpers (such asresponse.output_text）。

Differences and connections between Chat Completions and Responses API

Responses API is a superset of Chat Completions and fully compatible. If developers do not need built-in tools, they can still safely use Chat Completions. Meanwhile, Responses API will gradually achieve full functional parity with Assistants API, including support for Assistant and Thread objects and code interpretation tools, expected to officially replace Assistants API by mid-2026.

Agents SDK

In addition to building the core logic for AI agents and providing tools, developers also need to effectively orchestrate workflows between agents. To simplify this process, OpenAI has released the new open-source Agents SDK. Compared to last year's experimental SDK "Swarm," the new version of the SDK has made significant improvements in orchestrating multi-agent processes.

Key improvements in Agents SDK

Flexible agent configuration: Easy-to-configure large language models (LLMs) with clear instructions and built-in tools.
Smart handoff control: Supports intelligent task handoffs between different agents.
Safety checks: Configurable input/output safety verification to ensure secure agent operation.
Visualization tracking and observability: Through execution trace visualization, it facilitates developers' debugging and optimization of performance.

from agents import Agent, Runner, WebSearchTool, function_tool, guardrail

@function_tool
def submit_refund_request(item_id: str, reason: str):
    # Your refund logic goes here
    return"success"

support_agent = Agent(
    name="Support & Returns",
    instructions="You are a support agent who can submit refunds [...]",
    tools=[submit_refund_request],
)

shopping_agent = Agent(
    name="Shopping Assistant",
    instructions="You are a shopping assistant who can search the web [...]",
    tools=[WebSearchTool()],
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Route the user to the correct agent.",
    handoffs=[shopping_agent, support_agent],
)

output = Runner.run_sync(
    starting_agent=triage_agent,
    input="What shoes might work best with my outfit so far?",
)

Use cases

Agents SDK applies to practical scenarios such as customer support automation, multi-step research, content generation, code reviews, and sales outreach.

Coinbase use case: Coinbase quickly developed and deployed the AgentKit toolkit using Agents SDK, enabling seamless interaction between AI agents and cryptocurrency wallets and blockchain activities. Custom operations were integrated within hours, greatly simplifying the agent setup process and allowing developers to focus on important feature integrations.
Box use case: Within days, Box quickly built an agent through Agents SDK and Web search tools that helps businesses securely search and extract insights from internal unstructured data in Box and public internet information. Enterprise users can access the latest information securely while strictly adhering to internal permissions and security policies. For example, financial services companies can use Box AI agents combined with internal market analysis and real-time economic data to assist analysts in making more comprehensive investment decisions.

Supported APIs and compatibility

Agents SDK is compatible with OpenAI's Responses API and Chat Completions API. Additionally, as long as similar Chat Completions-style API endpoints are provided, it can also be compatible with models from other providers. Currently, developers can directly integrate Agents SDK into Python libraries, and Node.js support will soon follow.

When designing Agents SDK, OpenAI drew on the successful experiences of excellent community projects such as Pydantic, Griffe, and MkDocs, and will continue to remain open source to allow the community to develop together.