* 2 core-site datacenters with Postgres infrastructure tested to be capable of handling well over 1,100,000 database queries per minute.
* Replication between those datacenters running in a master-slave model with a scheduled role reversal every 3 months.
* Distinct core-site SLAs for read-only availability and read-write availability. No scheduled outages; less than 30 minutes per year of scheduled read-only time.
* 3 different flavors of replication, with 5-6 distinct replication topologies.
* 3 distinct development environments for engineers complete with full database snapshots that refresh from production every week.
* Widely varying workloads, including:
  * Servers with 768GB of RAM so everything fits in memory.
  * Multi-terabyte databases where only 5% can fit in RAM.
  * A sharded core-site (not warehouse!) table with over 2,600,000,000 tuples.
  * A real-time main-site anomaly detection database that ingests 87,000 tuples per second.
* All of this is maintained by a team of 1 person full time and 4 people part time.
This talk will be a look at how Postgres can form the backbone of a site at the scale of 315 million unique visitors a month.
Matt Kelly has been working with TripAdvisor for the past 3.5 years, starting as a general software engineer. Last year he switched over to the operations team to be the person focused on advancing the Postgres infrastructure at TripAdvisor.