
I think people are too quick to throw out Apache + mod_php. It remains the de facto deployment environment for most open source PHP apps, which means that it is the best understood and best supported. For example, if you deploy using Apache, you don't have to worry about converting the .htaccess rules that WordPress and some caching plugins create into an Nginx equivalent.

People bitch a lot about the memory consumption of apache, but it's often overstated. They add up the resident set for each individual apache process and come up with a huge number, missing the fact that a significant amount is actually shared between processes.
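To put rough numbers on that overcounting, here's a minimal sketch. The process count and the RSS/shared split are made-up figures for illustration, not measurements from any real server, and the function names are hypothetical.

```python
def naive_total_mb(n_procs, rss_mb):
    """Sum of per-process RSS: counts the shared pages once per process."""
    return n_procs * rss_mb


def shared_aware_total_mb(n_procs, rss_mb, shared_mb):
    """Count the shared pages once, plus each process's private pages."""
    private_mb = rss_mb - shared_mb
    return shared_mb + n_procs * private_mb


# Assumed figures: 50 Apache processes, 40 MB RSS each, 25 MB of it shared.
print(naive_total_mb(50, 40))             # 2000
print(shared_aware_total_mb(50, 40, 25))  # 775
```

With those assumed figures, adding up resident sets overstates the real footprint by more than 2x.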

Apache's memory consumption problems have more to do with how it handles clients. Each client ties up a worker process for the entire duration of a request, or beyond if keepalives are enabled (as they should be). That means that the memory overhead of the PHP interpreter is locked up doing nothing after the page is generated while it's being fed to the client. Even worse, that overhead is incurred while transmitting a static file.

I use nginx as a reverse proxy to Apache. It lets me deploy PHP apps easily and efficiently. Apache returns the dynamic and static requests as quickly as it can and moves on to the next request. Nginx buffers the result from apache and efficiently manages returning the data to all the clients. I used to have Nginx serve the static files directly, which is more efficient, but added complexity to my config. I chose simplicity.
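A back-of-the-envelope way to see why the buffering matters: the number of busy Apache workers is roughly the request rate times how long each request holds a worker (Little's law). The sketch below uses invented timings and a hypothetical helper name, and ignores keepalives and the short time it takes to hand the response to nginx.

```python
import math


def workers_needed(requests_per_s, hold_ms):
    """Little's law: average busy workers = arrival rate * time a request holds a worker."""
    return math.ceil(requests_per_s * hold_ms / 1000)


# Assumed figures: 100 req/s, 50 ms to generate a page, 950 ms to drain it to a slow client.
direct = workers_needed(100, 50 + 950)  # Apache feeds the client itself
proxied = workers_needed(100, 50)       # nginx buffers the response; Apache moves on
print(direct, proxied)                  # 100 5
```

Under those assumptions the same traffic needs about a twentieth of the PHP-laden worker processes once something lightweight handles the slow clients.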

A PHP opcode cache, like APC, is also a big win, because it cuts the overhead of parsing and loading the PHP source files. I'm not convinced of the value of other caching for most uses. CPU time usually isn't the scarce resource, RAM is. The DB and filesystem cache are already trying to keep needed data in RAM. Adding more caching layers generally means more copies of the same data in different forms, which means less of the underlying data fits in RAM.



> Nginx buffers the result from apache and efficiently manages returning the data to all the clients. I used to have Nginx serve the static files directly, which is more efficient, but added complexity to my config. I chose simplicity.

Funny, I felt the same way about ditching Apache entirely. Just one more moving part I don't need.


So true. Apache + mod_php is a very reasonable choice. Anyone who advocates using FastCGI instead of mod_php to "save memory" just doesn't understand what the actual memory footprint of Apache + mod_php really is, and how to adjust the number of Apache processes.

In fact FastCGI still ties up an Apache process for the duration of the request: Apache hands the PHP request off to a FastCGI worker, then waits for that PHP worker to send back the output, so the Apache process is still blocked waiting on PHP in either scenario.

Also, the overhead of Apache serving static content is minuscule compared to the amount of work PHP does per dynamic request, unless the static content is very large, like large media files.


It's not just about steady state memory footprint. It's about the whole stack, how apache and mod_php and mysql (if you use it) interact.

Excessive traffic means a spike in simultaneously served connections, which means a spike in apache threads (assuming worker MPM, iirc there's 1 process per 25 threads by default). With the mod_php model, the per-thread php memory usage can be very expensive when you have dozens or hundreds of apache threads serving requests. A spike in running php instances leads to a spike in mysql connections for a typical web app. If you haven't tuned mysql carefully, which most typical environments have not, mysql memory usage will also skyrocket.

Then for the coup de grace you get stupid apps which think it's perfectly fine to issue long-running queries occasionally (occasionally meaning something like .1% to a few percent of page loads). When that happens, if you're using myisam tables which were the default with mysql < 5.5 (and which therefore dominate deployments, even if "everyone knows" you're supposed to be using innodb), then those infrequent long-running queries block the mysql thread queue, leading to an often catastrophically severe mysql thread backlog. Since php threads are issuing those queries, the apache+mod_php threads stack up as well, and they do not use trivial amounts of memory.

The result is that you have to severely over-engineer the machine with excess memory if you want to survive large traffic spikes. If you don't, you can easily hit swap which will kill your site temporarily, or worse, run out of swap too and have the oomkiller kill something... either your webserver or mysql.

The benefit to fastcgi is it takes the memory allocation of php out of apache's hands, so every new apache thread is more limited in how much bloat it adds to the system. With a limited pool of fastcgi processes, you can also limit the number of db connections which further improves the worst-case memory usage scenario.
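As a rough sketch of that worst case, compare the two layouts during a spike. Every number here (per-thread Apache overhead, PHP memory, per-connection mysql memory, thread count, pool size) is an assumption picked for illustration, and the function names are hypothetical.

```python
def mod_php_peak_mb(threads, apache_mb, php_mb, mysql_conn_mb):
    """Every Apache thread carries PHP and may hold its own mysql connection."""
    return threads * (apache_mb + php_mb) + threads * mysql_conn_mb


def fastcgi_peak_mb(threads, apache_mb, pool_size, php_mb, mysql_conn_mb):
    """Apache threads stay thin; PHP and mysql connections are capped by the pool."""
    return threads * apache_mb + pool_size * (php_mb + mysql_conn_mb)


# Assumed spike: 300 Apache threads, 2 MB per thin thread, 30 MB per PHP instance,
# 5 MB per mysql connection, and a FastCGI pool of 20 workers.
print(mod_php_peak_mb(300, 2, 30, 5))      # 11100
print(fastcgi_peak_mb(300, 2, 20, 30, 5))  # 1300
```

The point isn't the exact figures, it's that with a fixed pool the PHP and mysql terms stop growing with the number of connections.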

The advantage of in-apache-process php is that it serves php faster when there are few parallel requests, but the difference is on the order of single-digit milliseconds (the extra overhead of sending requests through a fastcgi socket), which is dwarfed by network round-trip times even if none of the above pathologies rear their heads.

The apache+mod_php model is to do php processing for all active connections in parallel. The fastcgi model is to do php processing for at most x connections where x is the php fastcgi pool size, leaving all other requests to wait for an open slot. It may intuitively seem like the fastcgi model is going to be slower because some requests have to wait for a fastcgi process to become free, but if you think about average php request time it's going to be better for high parallelism, because the limiting factors are cpu and i/o. The apache model ends up using ridiculous amounts of resources just so no php request has to wait to begin getting processed by php. The high contention particularly for i/o created by apache and mysql when they have large numbers of threads is what makes the fastcgi model superior.
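Here's a toy model of why the bounded pool wins on average request time. It assumes a burst of identical CPU-bound requests arriving at once on a single core, with whatever is running sharing the core equally; the function name and numbers are hypothetical. It deliberately leaves out the extra i/o contention mentioned above, which would only make the unbounded case look worse.

```python
def mean_completion_ms(n_requests, cpu_ms_each, pool_size, cores=1):
    """Mean time to finish when at most pool_size requests run at once.

    Requests run in batches of up to pool_size; a batch shares the cores
    equally, so every request in a batch finishes at the same moment.
    """
    clock = 0.0
    weighted_finish = 0.0
    remaining = n_requests
    while remaining > 0:
        batch = min(pool_size, remaining)
        # A batch bigger than the core count is slowed down proportionally.
        clock += max(batch, cores) * cpu_ms_each / cores
        weighted_finish += clock * batch
        remaining -= batch
    return weighted_finish / n_requests


# Assumed burst: 100 requests needing 100 ms of CPU each, one core.
print(mean_completion_ms(100, 100, pool_size=100))  # 10000.0 (everything in parallel)
print(mean_completion_ms(100, 100, pool_size=10))   # 5500.0  (pool of 10)
```

The last request finishes at the same time in both cases; the pool just lets most requests finish much earlier instead of making them all crawl together.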



