Hacker News
PostgreSQL partitioning explained (github.com/fiksu)
228 points by keithgabryelski on May 4, 2014 | 33 comments


Reading this reminded me a little of old works of philosophy. I found it tremendously educational, not just about postgres partitioning, but about databases in general. That degree of patience and thoroughness on the part of the teacher isn't something you see every day.

I wish more such conversations A) took place, and B) were transcribed and shared in a similar fashion.


Well, that is about the best compliment I've ever heard. Thank you!


This is quite a good informal high-level explanation.

A thing I can't understand is why this isn't made automatic.

If there are 100 companies, you don't want to create 100 tables, then add 100 FOREIGN KEY constraints, and then do it all again each time a new company appears, do you? Wouldn't "syntactic sugar" like "PARTITION BY company_id" that does all this automatically be possible?

Just asking if there is a reason something like this is not implemented.
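For reference, the manual setup the question describes looks roughly like this with PostgreSQL's inheritance mechanism (a sketch; the table and column names are made up for illustration):

```sql
CREATE TABLE employees (
    id         serial,
    company_id integer NOT NULL,
    name       text
);

-- one child table per company, each with its own CHECK constraint...
CREATE TABLE employees_1 (CHECK (company_id = 1)) INHERITS (employees);
CREATE TABLE employees_2 (CHECK (company_id = 2)) INHERITS (employees);
-- ...repeated by hand every time a new company appears.
```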


It probably will be. There are two primary reasons it hasn't happened yet (in my opinion):

1. It takes a while to get widespread agreement on the semantics to make sure it really solves the problem in a comprehensive way.

2. The existing mechanism is "good enough", so nobody is quite experiencing enough pain that they dedicate the effort to solve it.

These aren't great reasons, because other databases have had partitioning features for a long time. But they are reasons.


I'm experiencing the pain. I've been championing postgres in my company for a couple of years now. The partitioning limitations have been quite embarrassing. It's not just the annoying lack of helper syntax to make partitioning better. It's the actual huge limitations on what the planner can consider when trying to find a derived table.


What are the other contenders you are championing PostgreSQL against? How do they do at partitioning?


It pains me to say this, but when I was working with Oracle, its partitioning support was miles ahead of postgres (though not perfect by any stretch of the imagination).


i suspect this was the simple solution that the postgresql team could implement easily.

that is actually why i wrote the partitioned gem (which this explanation is a part of): https://github.com/fiksu/partitioned the lack of automatic table management is pretty painful and the gem helps you manage such things.

there is some support for redshift also: https://github.com/fiksu/activerecord-redshift-adapter

(this is all for rails)



Why can't they just improve the indexing to do this automatically behind the scenes? Isn't the problem of finding things quickly in exchange for more work during a write exactly what indexing is supposed to solve?


Well, be specific about what indexing is: it layers a B-tree on top of what programmers would think of as an array structure. Indexes are not just something which is always faster than a sequential scan in exchange for slightly slower writes - there are important cases where the index is slower than just doing a sequential scan. Obviously you hope the query planner will avoid using an index in these cases, but the important thing to bear in mind is that indexes are not magic go-faster stripes :)

The problem happens when this B-tree is too large to fit into your working memory: swapping parts of the B-tree in and out of memory repeatedly to answer a query is very slow. The solution that partitioning offers is to allow you to split your indexes (and your data) so that, depending on your access-patterns you could see less swapping of that index.
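Concretely: each child table carries its own much smaller index, so a query the planner can confine to one partition only touches that partition's B-tree. A sketch (table and index names assumed):

```sql
CREATE INDEX employees_1_name_idx ON employees_1 (name);
CREATE INDEX employees_2_name_idx ON employees_2 (name);

-- with constraint exclusion enabled, this reads only employees_1
-- and its small index, rather than every partition's:
SET constraint_exclusion = partition;
SELECT * FROM employees WHERE company_id = 1 AND name = 'Alice';
```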


Why can't the index store the branches of the b-tree as separate objects on disk instead of loading all branches into memory piece by piece?


for example, you are keeping daily logs (daily partitions), and you only want to keep the last 30 days, so you drop the table for the oldest partition, which is faster than deleting X number of rows
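Sketched out (partition and column names assumed):

```sql
-- dropping a day's partition is cheap metadata work:
DROP TABLE logs_2014_04_03;

-- ...whereas the equivalent delete touches every expired row,
-- generating WAL and dead tuples that vacuum must clean up:
DELETE FROM logs WHERE logged_at < now() - interval '30 days';
```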


In other words, it's different from a traditional index in that it physically stores like values together.


Somehow http://www.postgresql.org/docs/current/static/ddl-inherit.ht... has always stopped me from using table inheritance; you don't get unique constraints over the table and its child tables, so for example you can't have an autoincrement primary key that works safely across them all. (Disclaimer: I haven't run into large enough tables yet that I'd need partitioning).

If this constraint was lifted, table inheritance would allow some really cool things, but currently I'm too conservative to use it in production.


Sequences are guaranteed to produce unique values. Just manually configure each child table to use the same sequence for the primary key column. Yes, uniqueness won't be enforced on the parent table, but the values will still be unique (unless you manually update them in your queries)
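A minimal sketch of that workaround (table and sequence names assumed):

```sql
-- point every child's id column at one shared sequence, so ids
-- stay unique across partitions even without a spanning
-- UNIQUE constraint:
ALTER TABLE employees_1
    ALTER COLUMN id SET DEFAULT nextval('employees_id_seq');
ALTER TABLE employees_2
    ALTER COLUMN id SET DEFAULT nextval('employees_id_seq');
```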


Until the sequence rolls over. Why design a schema with built in failure (an integer PK that overflows)?


Sequences won't roll over by default. nextval() fails if you're out of values. You can create a sequence which is allowed to roll over (by specifying cycle), but it's not the default.

Yes. It's not optimal and you can shoot yourself in the foot, but it works in its default configuration. I just gave a workable workaround for a problem, I wasn't saying it's the perfect solution (a working unique constraint on the parent table would be)
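The cycle option mentioned above, as a sketch:

```sql
-- default: nextval() raises an error once the sequence is exhausted
CREATE SEQUENCE safe_seq;

-- opt-in wraparound, which is the behavior you'd want to avoid here:
CREATE SEQUENCE wrapping_seq CYCLE;
```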


The default sequence size is 64-bit. Even if you are drawing 100,000 ids a second from it, you won't run out for millions of years.


I'm no database expert, but I don't think the features you want are possible, really. To get a unique constraint over multiple tables you'd have to lock them all, and if you lock them all you have lost the performance benefit of partitioning tables.

There are things you could do, like key on a GUID, that would ensure each insert has a unique ID in a lock-free way. But the semantics of that are going to be very different from a traditional autoincrementing ID.


Well, the locks would only be necessary for updates (or INSERTS where the constrained column doesn't come from the sequence), and that's a price I'd be willing to pay (hint, how often do you update the primary key column?)


it's simply not a problem in practice -- even for very very large tables. Use bigserial if you must.

you can also modify the child table to use a different sequence for each table.


Great introduction to partitions. I did not know how exactly they are used. And now I think that I do. Thank you!

What I noticed (some improvement suggestions):

1) "the planner could tell (using knowledge from the check constraint) that employees_1 was the only table it needed to look at."

Well, I think it still checks the parent table, right?

2) You could mention that it is possible to create a trigger function which inserts new data automatically into the correct child table (as described in your final link, here: http://www.postgresql.org/docs/9.3/static/ddl-partitioning.h...).

I think it would be cool if PostgreSQL could create these trigger functions automatically based on the check constraints...

I'm also wondering why foreign key constraints are not inherited?


yes, the parent is also a target table -- but for partitioning you never put anything in the parent. I could probably talk about the troubles of putting data in the parent table.
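You can see the parent showing up directly in the query plan; a sketch of what to look for (exact plan output varies by version and data):

```sql
EXPLAIN SELECT * FROM employees WHERE company_id = 1;
-- typically shows an Append node over the (empty) parent table
-- plus employees_1; the other children are excluded by their
-- CHECK constraints.
```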

and yes, a trigger function is possible -- and I could have mentioned it (although this was written as an explanation of partitioning using the gem I created). I should mention the alternatives and why they weren't employed.
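For the record, the trigger approach from the linked docs looks roughly like this (a sketch adapted to assumed table names, not the gem's actual code):

```sql
CREATE OR REPLACE FUNCTION employees_insert_trigger()
RETURNS trigger AS $$
BEGIN
    IF NEW.company_id = 1 THEN
        INSERT INTO employees_1 VALUES (NEW.*);
    ELSIF NEW.company_id = 2 THEN
        INSERT INTO employees_2 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for company_id %', NEW.company_id;
    END IF;
    RETURN NULL;  -- suppress the insert into the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER employees_insert
    BEFORE INSERT ON employees
    FOR EACH ROW EXECUTE PROCEDURE employees_insert_trigger();
```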


Great intro to Pg partitioning. I understand partitions can improve performance of a single query, scanning fewer rows, but do any of you have an idea about throughput implications? Is partitioning better or worse when it comes to 1000 small queries/updates per second? Thanks!


So in large web applications would the partitioned tables be distributed among multiple DB servers?


you could -- that isn't "partitioning" in the way I discuss it in the linked article

when it is split across machines it is generally referred to as sharding: http://en.wikipedia.org/wiki/Shard_(database_architecture)


I see, thanks for clearing that up!


Can this be used to enforce a temporal limit on data? That is, if the check is for time to be within 24 hours of now will it purge older than 24 hours data as time progresses? My gut tells me no, but my gut is often wrong when it comes to databases.


if I understand you correctly, this type of thing is implemented by partitioning by hour in a day and then having a janitor process drop tables that are older than 24 hours.


ALTER TABLE .. TRUNCATE PARTITION should be much faster than deleting as well. At least it is in MySQL.


you can in oracle, and I believe db2. It's called range partitioning.


Could you use a check constraint of FALSE on the parent table to prevent rows from being erroneously inserted there instead of the actual partition tables?
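One sketch of that idea; note the constraint would need to be marked NO INHERIT (available since 9.2), since a plain CHECK on the parent is inherited by the children and would block their inserts too:

```sql
-- parent-only constraint: rejects direct inserts into the parent
-- without affecting the child partitions
ALTER TABLE employees
    ADD CONSTRAINT employees_no_direct_rows CHECK (false) NO INHERIT;
```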



