How OpenAI is scaling the PostgreSQL database to 800 million users

While vector databases still have many valid use cases, organizations including OpenAI are leaning on PostgreSQL to get things done.

In a blog post on Thursday, OpenAI disclosed how it's using the open-source PostgreSQL database.

OpenAI runs ChatGPT and its API platform for 800 million users on a single-primary PostgreSQL instance, not a distributed database and not a sharded cluster. One Azure PostgreSQL Flexible Server handles all writes. Nearly 50 read replicas spread across multiple regions handle reads. The system processes millions of queries per second while maintaining low double-digit millisecond p99 latency and five-nines availability.

The setup challenges conventional scaling wisdom and offers enterprise architects insight into what actually works at massive scale.

The lesson here isn't to copy OpenAI's stack. It's that architectural decisions should be driven by workload patterns and operational constraints, not by scale panic or fashionable infrastructure choices. OpenAI's PostgreSQL setup shows how far proven systems can stretch when teams optimize deliberately instead of re-architecting prematurely.

"For years, PostgreSQL has been one of the most critical, under-the-hood data systems powering core products like ChatGPT and OpenAI's API," OpenAI engineer Bohan Zhang wrote in a technical disclosure. "Over the past year, our PostgreSQL load has grown by more than 10x, and it continues to rise quickly."

The company achieved this scale through targeted optimizations, including connection pooling that cut connection time from 50 milliseconds to 5 milliseconds and cache locking to prevent 'thundering herd' problems, where cache misses trigger database overload.

Why PostgreSQL matters for enterprises

PostgreSQL handles operational data for ChatGPT and OpenAI's API platform. The workload is heavily read-oriented, which makes PostgreSQL a good fit. However, PostgreSQL's multiversion concurrency control (MVCC) creates challenges under heavy write loads.

When updating data, PostgreSQL copies entire rows to create new versions, causing write amplification and forcing queries to scan through multiple versions to find current data.
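The effect is easier to see in a toy model. The sketch below is not how PostgreSQL stores tuples internally; `ToyTable` and `RowVersion` are hypothetical names invented for illustration. It only shows that each update copies the whole row and leaves a dead version behind that later reads have to step over.

```python
from dataclasses import dataclass, field


@dataclass
class RowVersion:
    data: dict
    live: bool = True  # becomes False once a newer version supersedes it


@dataclass
class ToyTable:
    """Toy model of MVCC-style updates (illustrative only)."""
    versions: dict = field(default_factory=dict)  # row id -> list of RowVersion

    def insert(self, row_id, data):
        self.versions[row_id] = [RowVersion(dict(data))]

    def update(self, row_id, changes):
        chain = self.versions[row_id]
        current = chain[-1]
        current.live = False
        # The entire row is copied, even when only one column changes.
        chain.append(RowVersion({**current.data, **changes}))

    def read(self, row_id):
        # Walk the version chain until the live version is found.
        scanned = 0
        for version in self.versions[row_id]:
            scanned += 1
            if version.live:
                return version.data, scanned
        raise KeyError(row_id)

    def dead_versions(self):
        return sum(not v.live for chain in self.versions.values() for v in chain)


table = ToyTable()
table.insert(1, {"name": "a", "visits": 0})
for n in range(1, 4):
    table.update(1, {"visits": n})

row, scanned = table.read(1)
print(row, scanned, table.dead_versions())
```

Three single-column updates leave three dead copies of the full row, and the read has to pass over them before reaching the current one. In a real deployment, cleanup of dead versions is the job of vacuuming.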

Rather than fighting this limitation, OpenAI built its strategy around it. At OpenAI's scale, these tradeoffs aren't theoretical: they determine which workloads stay on PostgreSQL and which ones must move elsewhere.

How OpenAI is optimizing PostgreSQL

At large scale, conventional database wisdom points to one of two paths: shard PostgreSQL across multiple primary instances so writes can be distributed, or migrate to a distributed SQL database like CockroachDB or YugabyteDB designed to handle massive scale from the start. Most organizations would have taken one of these paths years ago, well before reaching 800 million users.

Sharding or moving to a distributed SQL database eliminates the single-writer bottleneck. A distributed SQL database handles this coordination automatically, but both approaches introduce significant complexity: application code must route queries to the correct shard, distributed transactions become harder to manage, and operational overhead increases significantly.
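To make the routing burden concrete, here is a minimal sketch of hash-based shard selection, the kind of logic application code takes on once data is split across primaries. The shard names and the `shard_for` and `route` helpers are hypothetical; nothing here describes a scheme OpenAI uses.

```python
import hashlib

# Hypothetical shard identifiers; a real system would hold connection settings.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used to keep routing stable.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def route(key: str) -> str:
    """Return the shard that owns this key."""
    return SHARDS[shard_for(key, len(SHARDS))]


print(route("user-12345"))
```

Every query that touches user data has to go through something like `route`, and any query spanning users on different shards becomes a multi-shard operation. Simple modulo routing also means that changing the shard count reassigns most keys, which is one reason resharding is costly.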

Instead of sharding PostgreSQL, OpenAI established a hybrid strategy: no new tables in PostgreSQL. New workloads default to sharded systems like Azure Cosmos DB. Existing write-heavy workloads that can be horizontally partitioned get migrated out. Everything else stays in PostgreSQL with aggressive optimization.

This approach offers enterprises a practical alternative to wholesale re-architecture. Rather than spending years rewriting hundreds of endpoints, teams can identify specific bottlenecks and move only those workloads to purpose-built systems.

Why this matters

OpenAI's experience scaling PostgreSQL reveals several practices that enterprises can adopt regardless of their scale.

Build operational defenses at multiple layers. OpenAI's approach combines cache locking to prevent "thundering herd" problems, connection pooling (which dropped their connection time from 50ms to 5ms), and rate limiting at application, proxy and query levels. Workload isolation routes low-priority and high-priority traffic to separate instances, ensuring a poorly optimized new feature can't degrade core services.
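Cache locking is the least familiar of these defenses, so a small sketch may help. The idea: when many requests miss the cache for the same key at once, only one of them queries the database and the rest wait for its result. The `SingleFlightCache` class and `slow_db_lookup` function below are hypothetical stand-ins, not OpenAI's implementation.

```python
import threading
import time


class SingleFlightCache:
    """In-process cache where only one caller per key loads from the database."""

    def __init__(self, loader):
        self._loader = loader            # function that performs the database query
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()   # protects the two dicts above

    def get(self, key):
        with self._guard:
            if key in self._values:
                return self._values[key]
            lock = self._key_locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check: another thread may have filled the cache while we waited.
            with self._guard:
                if key in self._values:
                    return self._values[key]
            value = self._loader(key)
            with self._guard:
                self._values[key] = value
                self._key_locks.pop(key, None)
            return value


calls = []


def slow_db_lookup(key):
    # Stand-in for an expensive query; records each call it receives.
    time.sleep(0.05)
    calls.append(key)
    return f"row-for-{key}"


cache = SingleFlightCache(slow_db_lookup)
threads = [threading.Thread(target=cache.get, args=("user:42",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"database calls for 20 concurrent misses: {len(calls)}")
```

Without the per-key lock, all 20 threads would hit the database at the same moment. A production version would also need expiry and would typically coordinate across processes rather than threads.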

Review and monitor ORM-generated SQL in production. Object-Relational Mapping (ORM) frameworks like Django, SQLAlchemy, and Hibernate automatically generate database queries from application code, which is convenient for developers. However, OpenAI found one ORM-generated query joining 12 tables that caused multiple high-severity incidents when traffic spiked. The convenience of letting frameworks generate SQL creates hidden scaling risks that only surface under production load. Make reviewing these queries a standard practice.
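One lightweight way to start such a review is to scan logged SQL for queries with an unusually high join count. The sketch below is a deliberately naive text check: the `count_joins` and `flag_heavy_queries` helpers and the threshold of five joins are assumptions for illustration, and a keyword match will miscount `JOIN` inside string literals or comments.

```python
import re

JOIN_PATTERN = re.compile(r"\bJOIN\b", re.IGNORECASE)


def count_joins(sql: str) -> int:
    """Count JOIN keywords in a SQL string (naive text match, not a parser)."""
    return len(JOIN_PATTERN.findall(sql))


def flag_heavy_queries(queries, max_joins=5):
    """Return the queries whose join count exceeds the (assumed) threshold."""
    return [sql for sql in queries if count_joins(sql) > max_joins]


# Build a sample query that joins 12 tables, i.e. 11 JOIN clauses.
twelve_table_query = "SELECT * FROM t0 " + " ".join(
    f"JOIN t{i} ON t{i}.id = t0.id" for i in range(1, 12)
)
simple_query = "SELECT id FROM users WHERE id = 1"

flagged = flag_heavy_queries([twelve_table_query, simple_query])
print(len(flagged), count_joins(twelve_table_query))
```

Flagged queries are candidates for inspection of their execution plans, not proof of a problem; a join count is only a rough proxy for cost.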

Enforce strict operational discipline. OpenAI permits only lightweight schema changes; anything triggering a full table rewrite is prohibited. Schema changes have a 5-second timeout. Long-running queries get automatically terminated to prevent blocking database maintenance operations. When backfilling data, they enforce rate limits so aggressive that operations can take over a week.
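A rate-limited backfill can be as simple as processing rows in small batches with a pause in between. The sketch below assumes a hypothetical `apply_batch` callback that would run the actual update; the batch size and pause length are illustrative values, not figures from OpenAI.

```python
import time


def backfill_in_batches(row_ids, apply_batch, batch_size=100,
                        pause_seconds=1.0, sleep=time.sleep):
    """Apply a backfill in small batches, pausing between them.

    apply_batch is a caller-supplied function (hypothetical here) that
    updates one batch of rows. The sleep function is injectable so the
    pacing can be tested without waiting.
    """
    processed = 0
    for start in range(0, len(row_ids), batch_size):
        batch = row_ids[start:start + batch_size]
        apply_batch(batch)
        processed += len(batch)
        # Pause between batches, but not after the final one.
        if start + batch_size < len(row_ids):
            sleep(pause_seconds)
    return processed


# Demo with recording stand-ins instead of a real database and real sleeps.
batches = []
pauses = []
total = backfill_in_batches(
    list(range(250)), batches.append,
    batch_size=100, pause_seconds=2.0, sleep=pauses.append,
)
print(total, [len(b) for b in batches], pauses)
```

Smaller batches and longer pauses trade completion time for headroom on the primary, which is how a backfill ends up taking days instead of minutes.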

Read-heavy workloads with burst writes can run on single-primary PostgreSQL longer than commonly assumed. The decision to shard should depend on workload patterns rather than user counts.

This approach is particularly relevant for AI applications, which often have heavily read-oriented workloads with unpredictable traffic spikes. These characteristics align with the pattern where single-primary PostgreSQL scales effectively.

The lesson is straightforward: identify actual bottlenecks, optimize proven infrastructure where possible, and migrate selectively when necessary. Wholesale re-architecture isn't always the answer to scaling challenges.
