We do have this problem! It's only a matter of how small the PgBouncer machine is compared to the PostgreSQL server. In our case the PgBouncer had to handle 16,000 req/s coming from 700 clients, funneled through 100 connections to PostgreSQL.
We now have 4 PGBouncers for this DB, and we're 'load balancing' them with DNS.
Works great for me. There's some minimal configuration explained in the pgbouncer docs. What the pgbouncer buildpack does is just route your DATABASE_URL through the local pgbouncer.
In my Python app, I switched from an SQLAlchemy connection pool per process (e.g. per gunicorn worker) to using a NullPool everywhere. No increase in latency; far fewer connections needed.
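A minimal sketch of that switch, assuming SQLAlchemy is available (the in-memory SQLite URL is a stand-in; the real app would point at its Postgres DSN behind PgBouncer). With NullPool, SQLAlchemy opens a fresh connection on every checkout and closes it on release, so pooling is delegated entirely to PgBouncer:

```python
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# Stand-in URL for illustration; the real app would use something like
# "postgresql://app@pgbouncer-host:6432/appdb" (hypothetical host/port).
engine = create_engine("sqlite://", poolclass=NullPool)

# Every connect() now creates a real connection, and close() really closes it,
# instead of returning it to an in-process pool.
with engine.connect() as conn:
    pass
```

The trade-off is that each checkout pays the cost of a new connection to PgBouncer, which is cheap because PgBouncer itself holds the expensive Postgres connections open.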
We enabled PgBouncer's transaction pooling mode and had to set 'prepared_statements: false' in our database.yml because transaction mode doesn't support server-side prepared statements. Works fine for us.
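For reference, the relevant database.yml entry looks roughly like this (host and port are placeholders for wherever PgBouncer listens; 6432 is its conventional default):

```yaml
production:
  adapter: postgresql
  host: pgbouncer-host      # placeholder
  port: 6432                # PgBouncer's default listen port
  prepared_statements: false
```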
Bound parameters are the safety net against SQL injection, and prepared statements are not a prerequisite for that; it depends on the database client API (psycopg2, for example, doesn't use prepared statements).
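To illustrate: Python DB-API drivers accept the parameters separately from the SQL text, so hostile input never gets spliced into the query by your code. A runnable sketch using the stdlib sqlite3 driver, which follows the same DB-API pattern as psycopg2 (only the placeholder style differs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

# Hostile input is passed as a bound parameter, not interpolated into the SQL.
evil = "alice' OR '1'='1"
rows = conn.execute("SELECT * FROM users WHERE name = ?", (evil,)).fetchall()
assert rows == []  # the injection attempt is treated as a literal string

# psycopg2 looks the same, just with %s placeholders:
#   cur.execute("SELECT * FROM users WHERE name = %s", (evil,))
```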
I appreciate the comment, but since this was in regard to Rails, the two options were basically use prepared statements or don't. You don't get any safety without them and have to rely on Rails's own quoting, which does an admirable job but has faltered from time to time.
If turning off prepared_statements makes Rails stop using the two-step parse/execute PostgreSQL protocol messages (which PgBouncer supports without issue, and which are how you are supposed to get bound parameters), then that's pathetic behavior (even for Rails) and should really be fixed. Intuitively, that setting should only disable standard named SQL prepared statements.
If done right they can be huge. The fact that they do nothing when you just run "prepare" followed immediately by "execute" in place of running the query directly is not a problem with PostgreSQL or prepared statements; it's a problem of doing something dumb. For simple selects, the parse and plan stages can be 90% of the time spent. The problem is that PostgreSQL doesn't have stored procedures, so you have to make sure every connection is initialized by running all of your "prepare"s before it is made available to the pool.
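The initialization step described above can be sketched like this. The query names and SQL are made up for illustration; the idea is simply that every fresh connection runs all the PREPAREs once, before the pool ever hands it to the application:

```python
# Illustrative named queries; in a real app these would be your hot queries.
PREPARED_QUERIES = {
    "get_user": ("SELECT * FROM users WHERE id = $1", ["int"]),
    "get_orders": ("SELECT * FROM orders WHERE user_id = $1", ["int"]),
}

def prepare_statements(queries):
    """Build the PREPARE commands to run once per new connection."""
    return [
        f"PREPARE {name} ({', '.join(argtypes)}) AS {sql}"
        for name, (sql, argtypes) in queries.items()
    ]

def init_connection(conn):
    # Run on every fresh connection before it enters the pool; afterwards
    # the app issues e.g. "EXECUTE get_user(42)" and skips parse/plan.
    with conn.cursor() as cur:
        for stmt in prepare_statements(PREPARED_QUERIES):
            cur.execute(stmt)
```

Since prepared statements are per-session, this only works if the pooler hands out whole sessions (PgBouncer session mode, or an in-process pool), not transaction-level multiplexing.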
You can get detailed timing data about parse, plan, and execute timings from PostgreSQL logging. Parsing is almost always trivial. Planning can get expensive for complex queries with many joins (a large space of solutions to explore). For simple queries there's almost no benefit from prepared statements in Postgres. Prepared statements are a maintenance headache and don't play well with session-multiplexed connection pooling (like PgBouncer); it's generally best to avoid them unless you have measured a concrete and significant benefit.
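The per-stage timing data mentioned above comes from settings like these in postgresql.conf (they can also be set per session with SET; the stats settings dump their output to the server log):

```
log_parser_stats = on
log_planner_stats = on
log_executor_stats = on
# Or, more commonly, just log any statement slower than 100 ms:
log_min_duration_statement = 100
```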
>You can get detailed timing data about parse, plan, execute timings from postgresql logging. Parsing is almost always trivial.
And yet you can get >50% speedups for super common queries like "select * from foo where id = ?" if you prepare it once and then use execute instead of just running the select every time. Seems like maybe you're making assumptions you shouldn't.
I think the statements I benchmarked were more complex and slower, so there wasn't much in it. It was a while ago, though. For simple selects it could make much more difference.
PL/pgSQL function plans are cached. Functions written in other languages may not be, I don't know, but in my experience PL/pgSQL is the most widely used.
Read your link. They can be cached. You have no way to ensure that they are. And you have no way to have the planning done up front at all, you have to wait until the function is run, and even then every code path in the function has to be exercised to get it planned. And then half the time it doesn't think it should bother to cache the plan anyways. And it is per-session, not shared. So every single connection has to start over with an empty cache every time, and there's tons of duplicate plans in the cache wasting space. Not cool, and by far the biggest thing keeping a ton of "enterprise" customers on SQL server and oracle.
You have some valid criticisms of the implementation, though pgsql-hackers has their reasons (http://www.postgresql.org/message-id/CAHyXU0ybwZZUbuQQVFQMK3...) for the way things are. Regardless, my point is that the situation is not so bad as "Every time that function is executed, it is re-parsed and re-planned."
Re work_mem: I think this is per query and thus might affect every connection?! At least that's what I gather from 'PostgreSQL 9.0 High Performance'.
Re temp_buffers: 'Sets the maximum number of temporary buffers used by each database session.' For us it practically means per connection. Not for you?
Shit, I'm missing the version! 9.3. I edited the blog post.
Re memory: I think that really depends on the application. We have many idle connections.
> Re temp_buffers: 'Sets the maximum number of temporary buffers used by each database session.' For us it practically means per connection. Not for you?
It means that potentially every connection may at some point use this amount of memory, but idle connections will not. If you have 20 busy connections and 350 idle ones, at most 20 x temp_buffers will be used and there is no benefit from PGbouncer in this regard. If you have 370 busy connections, all of them might use temp_buffers and your tradeoff can be 370 concurrently executing queries at higher memory cost vs. e.g. 20 concurrently executing queries via PGbouncer at lower memory cost and 350 stalled queries with potentially high latency.
> We have many idle connections.
In this case, you will not have any issues with memory use from temp_buffers, work_mem, etc.