Optimizing VPS Server Memory
WHAT I'M TRYING TO DO
Server resources sometimes run tight, so to help prevent memory exhaustion I've had to limit server processes. I need a little expert help to know whether I'm on the right track, and perhaps to spot any obvious settings changes that would help the system run with more stability.
HISTORY
My company recently upgraded from shared hosting to a VPS. We had outgrown shared hosting and began having problems: the host kept suspending our site for excessive CPU usage on weekends. Our website traffic roughly doubles every Friday and Saturday, which is not unexpected in our case (about 5,000 visits [~2,500 visitors] per day during the week, and about 9,500 visits [~4,500 visitors] per day on weekends).
Now that we are on a VPS, we have no CPU problems. (In fact, the CentOS WHM control panel says we are at ".000201% CPU load".) However, we are having out-of-memory problems, leading to crashes.
SUMMARY OF ISSUE
Our website is WordPress based. However, aside from comments, there is very little "write" activity; mostly users are viewing fairly static pages that we've created.
When we first upgraded to a VPS several months ago, in October 2012, the website ran well during the week, but choked on memory every weekend. Often it would crash repeatedly (5-20 times during a 24-hour period, sporadically), usually starting Friday evening, and continuing through Saturday afternoon.
During the week, the server ran consistently at 65-90% memory usage, and on the weekend it would hit 100%, causing crashes.
STEPS TAKEN TO CORRECT IT
Since I was new to VPS, I started with all the default settings. I later started tweaking, following advice I read about solving memory issues here on this site and other websites.
I've made adjustments to MySQL, PHP, and Apache, summarized below under "Current Configuration". I also recompiled Apache and PHP to remove unwanted modules, installed a better caching plugin for WordPress (W3 Total Cache), and added APC opcode caching. I also enabled gzip compression and moved a lot of static files to a separate subdomain.
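For reference, gzip compression in Apache is typically enabled with mod_deflate; a minimal sketch (the MIME-type list here is an assumption, adjust to taste):

<IfModule mod_deflate.c>
    # Compress text responses; images and archives are already compressed
    AddOutputFilterByType DEFLATE text/html text/plain text/css application/javascript
</IfModule>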
I wrote a little script that checks the server status on a schedule, restarts it as needed, and emails me a transcript of the server error log to help with troubleshooting. (I know it's just a band-aid, if that, but it was important to keep the website online, since no one wants to sit around and monitor it on the weekend.)
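The script amounts to something like this (a minimal sketch; the URL, log path, and address are placeholders, and it assumes the stock CentOS 6 service commands):

#!/bin/bash
# Restart Apache if the site stops responding, and mail the recent error log.
# Run from cron, e.g. every 5 minutes: */5 * * * * /root/bin/watchdog.sh
URL="http://example.com/"                  # placeholder for the real site URL
LOG="/usr/local/apache/logs/error_log"     # placeholder error-log path
ADMIN="admin@example.com"                  # placeholder address

if ! curl -sf --max-time 20 "$URL" > /dev/null; then
    service httpd restart
    tail -n 50 "$LOG" | mail -s "Apache restarted on $(hostname)" "$ADMIN"
fi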
Just recently, a week or so ago (January 2013), I upgraded the server RAM from 1 GB (2 GB burstable) to 2 GB (3 GB burstable). This seems to have fixed the majority of the problem, but I still get an occasional notice (once a week or so) that the server is hanging, along with "can't apply process slot" PHP errors.
CURRENT CONFIGURATION
It's an Apache server, running CentOS 6, Apache 2 (Worker MPM), PHP 5.3.20 (FastCGI via mod_fcgid), and MySQL 5.5.28, with 2 GB RAM (3 GB burstable) and 24 CPUs.
MySQL currently uses about 618 MB, about 20.1% of RAM. PHP uses up to 89 MB per process. Apache uses up to 14 MB per process.
Typical weekday top output:
top - 15:31:13 up 89 days, 5:26, 1 user, load average: 1.54, 1.00, 0.70
Tasks: 49 total, 1 running, 48 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 0.1%sy, 0.0%ni, 99.7%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3145728k total, 1046444k used, 2099284k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached
Unfortunately I do not have a current example of weekend/busiest time top output.
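(To have data next time, a cron entry like the following can capture snapshots during the busy period; the log path is just an example.)

# Snapshot process/memory state every 5 minutes on Fridays and Saturdays
*/5 * * * 5,6 top -b -n 1 >> /var/log/top-weekend.log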
Apache config:
StartServers: 5
MinSpareThreads: 5
MaxSpareServers: 10
ServerLimit: 80
MaxClients: 56
MaxRequestsPerChild: 5000
KeepAlive: Off
PHP (mod_fcgid) config:
MaxRequestsPerProcess 500
FcgidMaxProcesses 15
FcgidMinProcessesPerClass 0
FcgidMaxProcessesPerClass 8
FcgidIdleTimeout 30
FcgidIdleScanInterval 15
FcgidProcessLifeTime 60
FcgidIOTimeout 300
FcgidMaxRequestLen 268435456
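Back-of-the-envelope, these limits allow a worst case of roughly the following (using the per-process figures above; actual resident usage is lower because of shared pages):

15 PHP processes x 89 MB    ≈ 1,335 MB   (FcgidMaxProcesses)
MySQL                       ≈   618 MB
Apache workers, ~7 x 14 MB  ≈    98 MB
---------------------------------------
Total                       ≈ 2,051 MB   -- right at the 2 GB guaranteed limit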
MySQL config:
[mysqld]
max_user_connections = 75
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
skip-external-locking
sort_buffer_size = 512K
# MyISAM #
key_buffer_size = 32M
myisam_sort_buffer_size = 16M
#myisam_recover = FORCE,BACKUP
# SAFETY #
max_allowed_packet = 8M
#max_connect_errors = 1000000
# CACHES AND LIMITS #
tmp_table_size = 104M
max_heap_table_size = 104M
join_buffer_size = 208K
#query_cache_type = 0
query_cache_size = 32M
max_connections = 150
thread_cache_size = 4
#open_files_limit = 65535
table_cache = 512
#table_definition_cache = 1024
table_open_cache = 2048
wait_timeout = 300
# INNODB #
#innodb_flush_method = O_DIRECT
#innodb_log_files_in_group = 2
#innodb_log_file_size = 64M
#innodb_flush_log_at_trx_commit = 1
#innodb_file_per_table = 1
innodb_buffer_pool_size = 416M
# This setting ensures that aio limits are not exceeded
# (default is 65536, each instance of mysql takes 2661 with this enabled)
innodb_use_native_aio = 0
# LOGGING #
log-slow-queries
log-queries-not-using-indexes
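As a sanity check, the usual global-buffers-plus-per-thread-buffers estimate puts MySQL's worst case under this config at about 705 MB, consistent with the ~618 MB observed (a rough upper bound, not an exact number):

Global buffers: innodb_buffer_pool 416M + key_buffer 32M
                + query_cache 32M                          ≈ 480 MB
Per connection: read_buffer 256K + read_rnd_buffer 512K
                + sort_buffer 512K + join_buffer 208K
                + net_buffer 8K                            ≈ 1.5 MB
Worst case:     480 MB + (150 connections x 1.5 MB)        ≈ 705 MB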
Any help/suggestions would be much appreciated. The website address is 3abn.org.
Solution 1:
You're already running PHP with FastCGI, so I'm not sure what else you can do to slim this down. You're kind of at a crossroads here.
A couple of options:
- Tune everything down to the smallest possible footprint: replace Apache with nginx (if you can; a sketch follows this list), tune MySQL so that it's not buffering more data than needed, and so on
- Throw more RAM at the box
- Separate your tiers into dedicated VMs: one database server, one application server, and one front end. This will make it a lot easier to scale.
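A minimal sketch of what the nginx route might look like, serving static files directly and handing PHP to a FastCGI backend (the docroot, server name, and PHP-FPM socket are assumptions, not your current setup):

server {
    listen 80;
    server_name example.com;                      # placeholder
    root /var/www/html;                           # placeholder docroot
    gzip on;

    location / {
        try_files $uri $uri/ /index.php?$args;    # standard WordPress permalinks
    }
    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/var/run/php-fpm.sock;  # assumes PHP-FPM on a socket
    }
}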
Edit: You say that you have installed lots of caching. Cache = eat more RAM so that the next request(s) go faster. If you're very low on RAM, caching might not be the best thing in the world.
Solution 2:
My number one recommendation: GET OFF THE VPS.
I have heard enough grousing about memory (and OOM-Killer) related problems on VPS systems that I'm of the opinion the typical VPS hosting provider is not providing a Production Grade solution -- it's not a "virtual private server", it's "paravirtualization on top of an existing OS with poorly designed resource limiting that consequently behaves differently than a real machine would".
(You appear to be bitten by the most common difference: a "VPS" has no swap space, so the moment you use even one byte more RAM than your provider allocated, things fall apart.)
If you are unable or unwilling to host on your own hardware colocated at a quality datacenter you should consider a cloud service that appears like a "normal" server with swap and the like (Amazon EC2 is one such option). These solutions are priced somewhere between "VPS" solutions and dedicated hardware, but provide an operational experience much closer to "real hardware", and let you avoid situations like you're in now.
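For example, on a KVM/Xen cloud instance or real hardware you can simply add swap, which is exactly what a typical OpenVZ-style VPS won't let you do:

# Create and enable a 2 GB swap file
dd if=/dev/zero of=/swapfile bs=1M count=2048
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile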
Note that in any case you still need to size your system adequately -- your VPS/Cloud Solution/Dedicated Hardware should have enough RAM to handle peak load without swapping.
The advantage of (quality) Cloud or Dedicated Hardware solutions is you have more control over what happens when you reach the swap point (disabling the OOM-killer and letting malloc() fail, for example).
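For instance, with strict overcommit accounting the kernel makes malloc() fail instead of handing out memory it doesn't have and OOM-killing later (the ratio here is illustrative):

# /etc/sysctl.conf -- strict overcommit: fail allocations rather than OOM-kill
vm.overcommit_memory = 2
vm.overcommit_ratio = 80    # commit limit = swap + 80% of RAM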
Solution 3:
From the info you posted:
- Your VPS seems to be running under OpenVZ or Parallels Virtuozzo. If the hosting provider overallocates (a lot do), your server will never be able to use the third gigabyte of memory.
Worse, your VPS may be allowed to burst for a short period, but then the OOM killer starts killing processes. The OOM killer can be tweaked (from the hosting provider's side) to try to keep the more important processes (say ssh, bind, apache, mysql) from being killed, using priorities, but since the other clients on the same node probably run the same typical setup, it doesn't help much.
Get 3 GB guaranteed, with burst disabled.
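For reference, the host-side tweak is the per-process OOM score; on a 2.6.32-era kernel like CentOS 6's it looks like this (inside a container you generally can't set it yourself):

# Exempt sshd from the OOM killer (-17 means "never kill"; range is -17..15)
echo -17 > /proc/$(pidof -s sshd)/oom_adj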
- If the web pages are really static and small, you could use a caching reverse proxy. Yes, caching uses memory, but caching also prevents spawning PHP processes, which tend to be very demanding.
(You need to do some maths and/or experiment... this is not an absolute solution.)
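A minimal sketch with nginx's proxy cache in front of Apache (this assumes Apache is moved to port 8080; the cache path and timings are illustrative):

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=site:10m max_size=200m;

server {
    listen 80;
    location / {
        proxy_pass http://127.0.0.1:8080;    # Apache moved to 8080
        proxy_cache site;
        proxy_cache_valid 200 5m;            # cache good responses briefly
        proxy_set_header Host $host;
    }
}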
- Disable APC, or run PHP-FPM.
With FastCGI (fcgid), each PHP process has its own APC opcode cache. Use FPM so that all PHP processes share a common cache, or disable APC; you don't seem to need it anyway (an opcode cache is much more effective at reducing CPU usage than memory usage ;))
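A minimal PHP-FPM pool to replace fcgid might look like this (the values are guesses sized to fit 2 GB, not recommendations):

; /etc/php-fpm.d/www.conf -- all children share one APC cache under FPM
[www]
listen = /var/run/php-fpm.sock
user = apache
group = apache
pm = dynamic
pm.max_children = 10        ; caps worst-case PHP memory at ~10 x 89 MB
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3
pm.max_requests = 500       ; recycle children, like MaxRequestsPerProcess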