Back to articles
AIOpenAI News

Introducing more enterprise-grade features for API customers

Increasing enterprise support with more security features and controls, updates to our Assistants API, and tools to better manage costs.

The RSS feed only provided an excerpt. FlowMarket recovered the public content available from the original page without bypassing restricted content.

April 23, 2024

Introducing more enterprise-grade features for API customers

More Enterprise Grade Features Hero Image

We work with many enterprises like Klarna ⁠ , Morgan Stanley ⁠ , Oscar ⁠ , Salesforce ⁠ , and Wix ⁠ to help them build AI solutions from scratch and safely deploy AI across their organizations and products. We’re deepening our support for enterprises with new features that are useful for both large businesses and any developers who are scaling quickly on our platform.

Enhanced enterprise-grade security

We’ve introduced Private Link, a new way that customers can ensure direct communication between Azure and OpenAI while minimizing exposure to the open internet. We’ve also released native Multi-Factor Authentication ⁠ (opens in a new window) (MFA) to help ensure compliance with increasing access control requirements. These are new additions to our existing stack of enterprise security features ⁠ (opens in a new window) including SOC 2 Type II certification, single sign-on (SSO), data encryption at rest using AES-256 and in transit using TLS 1.2, and role-based access controls. We also offer Business Associate Agreements ⁠ (opens in a new window) for healthcare companies that require HIPAA compliance and a zero data retention policy for API customers with a qualifying use case.

Better administrative control

With our new Projects ⁠ (opens in a new window) feature, organizations will have more granular control and oversight over individual projects in OpenAI. This includes the ability to scope roles and API keys to specific projects, restrict/allow which models to make available, and set usage- and rate-based limits to give access and avoid unexpected overages. Project owners will also have the ability to create service account API keys, which give access to projects without being tied to an individual user.

More Enterprise Grade Features Product Demo-1

Assistants API improvements

We’ve introduced several updates to the Assistants API for more accurate retrieval, flexibility around model behavior and tools used to complete tasks, and better control over costs. These features include:

  • Improved retrieval with ‘file_search’ which can ingest up to 10,000 files per assistant—a 500x increase from the previous file limit of 20. The tool is faster, supports parallel queries through multi-threaded searches, and has enhanced reranking and query rewriting.
  • Streaming support for real-time, conversational responses—one of the top requests from developers and enterprises. New ‘vector_store’ objects in the API so files can be added to a vector store and automatically parsed, chunked, and embedded in preparation for file search. Vector stores can be used across assistants and threads, simplifying file management and billing.
  • Control over the maximum number of tokens used per run, plus limits on previous and recent messages used in each run, so you can manage token usage costs. New ‘tool_choice’ parameter to select a specific tool (like ‘file_search’, ‘code_interpreter’, or ‘function’) in a particular run.
  • Support for fine-tuned GPT‑3.5 Turbo models in the API (to start, we’ll support fine-tunes of ‘gpt-3.5-turbo-0125’).
More Enterprise Grade Features Product Demo-2

More options for cost management

To help organizations scale their AI usage without over-extending their budgets, we’ve added two new ways to reduce costs on consistent and asynchronous workloads:

  • Discounted usage on committed throughput: Customers with a sustained level of tokens per minute (TPM) usage on GPT‑4 or GPT‑4 Turbo can request access to provisioned throughput to get discounts ranging from 10–50% based on the size of the commitment.
  • Reduced costs on asynchronous workloads: Customers can use our new Batch API ⁠ (opens in a new window) to run non-urgent workloads asynchronously. Batch API requests are priced at 50% off shared prices, offer much higher rate limits, and return results within 24 hours. This is ideal for use cases like model evaluation, offline classification, summarization, and synthetic data generation.

We plan to keep adding new features focused on enterprise-grade security, administrative controls, and cost management. For more information on these launches, visit our API documentation ⁠ (opens in a new window) or get in touch with our team ⁠ to discuss custom solutions for your enterprise.

  • API Platform
  • 2024

Author

Related research

A flock of paper planes flying over and through treetops.

Video generation models as world simulators

Sora

Image de l'article

Publication Jan 31, 2024

Weak To Strong Generalization

Safety Dec 14, 2023

Practices For Governing Agentic AI Systems

Practices for Governing Agentic AI Systems

Safety & Alignment

Need an n8n workflow or help installing it?

After the briefing, move to execution: find an n8n template or a creator who can adapt it to your tools.

Source

OpenAI News - openai.com

View original publication