Opened 4 years ago

Closed 2 years ago

#1485975 closed Bugs (fixed)

roundcube causes server high load with apache

Reported by: waster
Owned by:
Priority: 5
Milestone: 0.6-beta
Component: IMAP connection
Version: 0.2.2
Severity: major
Keywords:
Cc: mcmorran@…, roundcube@…, james.doherty@…, ptaryasw@…, p.heinlein+roundcubetrac@…, raphael@…

Description

I recently installed Roundcube 0.2.2 for our users on a server with 1024 MB of RAM and a 3 GHz CPU, and we are now seeing high server load.
According to the logs, we have about 1000 messages sent every day and about 4000 login attempts, successful and unsuccessful. The load increases throughout the day, and after midnight it can reach about 9, caused by Apache processes. That is very strange, because at that time almost all users are asleep and we have only a few logins and sent messages. This is the only virtual host running on the server. I've tried both Debian Etch and Lenny.

The problem, as I see in 'ps aux | grep apache', is that some Apache processes are stuck in the R process state, each with a TCP connection to the IMAP port that stays in CLOSE_WAIT.
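The sockets described above can be spotted by filtering 'netstat -tn' style output for connections to the IMAP port that are in CLOSE_WAIT. The Python sketch below is only an illustration: the function name find_close_wait, the sample addresses, and the assumed six-column layout (proto, recv-q, send-q, local, remote, state) are assumptions, not part of the original report.

```python
def find_close_wait(netstat_lines, imap_port=143):
    """Return (local, remote) address pairs in CLOSE_WAIT toward imap_port."""
    stuck = []
    for line in netstat_lines:
        fields = line.split()
        # Assumed layout: proto recv-q send-q local remote state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers and non-TCP lines
        local, remote, state = fields[3], fields[4], fields[5]
        remote_port = remote.rsplit(":", 1)[-1]
        if state == "CLOSE_WAIT" and remote_port == str(imap_port):
            stuck.append((local, remote))
    return stuck


# Hypothetical sample lines; the addresses are made up for illustration.
sample = [
    "tcp 0 0 10.0.0.5:51336 10.0.0.9:143 CLOSE_WAIT",
    "tcp 0 0 10.0.0.5:80 10.0.0.77:40112 ESTABLISHED",
    "tcp 0 0 10.0.0.5:51400 10.0.0.9:143 ESTABLISHED",
]
print(find_close_wait(sample))
```

Only the first sample line matches, because it is the only one that is both in CLOSE_WAIT and connected to port 143.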

I can also confirm that after restarting Apache the load drops to 0.2-0.4, but then it slowly starts to increase again. So as a temporary solution I use "apache restart" instead of "apache reload" in the logrotate config.

I guess the problem is in Roundcube somewhere?

Change History (27)

comment:1 Changed 4 years ago by alec

  • Component changed from Core functionality to IMAP connection
  • Milestone set to 0.3-stable

What apache/php/imap versions? You should also check current Roundcube svn-trunk version.

comment:2 Changed 4 years ago by waster

Sorry, sure.

Apache 2.2.9
PHP 5.2.6
Courier-imap-4.2.1

Unfortunately, it is hard to test the svn version because of our users, but I'll try.

comment:3 Changed 4 years ago by alec

In my opinion it could be a Courier issue. Consider using Dovecot or a newer Courier version.

comment:4 Changed 4 years ago by waster

We can't use Dovecot because we have about 30000 Maildirs, and a migration to Dovecot could be very dangerous. I'll try updating Courier and check the results.

comment:5 Changed 4 years ago by waster

OK, Courier IMAP was upgraded, but it did not help. Stuck Apache processes with a connection in CLOSE_WAIT still appear:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
...
www-data 28696 90.9 2.7 40268 14332 ? R 10:32 148:31 /usr/sbin/apache2 -k start
...

tcp 0 0 <IP>:51336 <IP>:143 CLOSE_WAIT

comment:6 Changed 4 years ago by alec

http://bugs.php.net/bug.php?id=46218 may be related. Try a newer PHP version; you can also increase MaxClients in httpd.conf or use another MPM module.

comment:7 Changed 4 years ago by waster

Increased MaxClients, but it did not help. I've now installed PHP 5.2.9 and am monitoring the load.

comment:8 Changed 4 years ago by waster

Same problem. There is already one stuck Apache process again. Are you sure that the problem is in PHP or Apache? I'm now leaning toward RC being the cause.

comment:9 Changed 4 years ago by alec

Any errors in the Roundcube logs? In shutdown_function, Roundcube logs out from IMAP and closes the connection. So it could be an Apache/PHP issue that the socket stays in the CLOSE_WAIT state. Did you try another MPM module?

comment:10 Changed 4 years ago by waster

There are many errors in the Roundcube logs, but it is not possible to find the related one because we don't know when a stuck Apache process will appear. Any debugging ideas? As far as I know, it is not safe to use another MPM module such as worker with PHP. The Debian PHP package uses the mpm-prefork Apache package for this reason. If we need to use the worker MPM module, it will be necessary to build PHP from source.

comment:11 Changed 4 years ago by roymcmorran

  • Cc mcmorran@… added

For what it's worth, this sounds a lot like the problem I was having a year ago. See the thread at:
http://lists.roundcube.net/mail-archive/dev/2008-04/0000022.html

Things got much better when we migrated from UW-IMAP+mbox to Dovecot+Maildir. I've never used Courier here, so I can't say for certain that it applies. Might be a similar scenario though.

comment:12 Changed 4 years ago by waster

Upgraded RC to 0.3 stable. The problem with CLOSE_WAIT states and the corresponding high load remains.

comment:13 Changed 3 years ago by alec

Some IMAP code has been changed, so please test with current svn-trunk.

comment:14 Changed 3 years ago by Lazlo

  • Cc roundcube@… added

Today RoundCube again caused high load on my server:

user 16.01 0.29 0.3
Top Process %CPU 72.3 /usr/bin/php /home/user/public_html/webmail/index.php
Top Process %CPU 72.2 /usr/bin/php /home/user/public_html/webmail/index.php
Top Process %CPU 72.0 /usr/bin/php /home/user/public_html/webmail/index.php

Running on RoundCube 0.4 beta, Apache 2.0.63 and PHP 5.2.13.

Mailserver is Dovecot: OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS MULTIAPPEND UNSELECT IDLE CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS QUOTA]

comment:15 follow-up: Changed 3 years ago by jamesd

  • Cc james.doherty@… added

We're seeing the same thing on our roundcube 0.3 server (Apache 2.2.8, PHP 5.2.4, Xeon E5335, 2gb ram). Our roundcube server has imapproxy installed and accesses an imap cluster running Dovecot 1.1.11. The server is solely for roundcube.

Like others I see some apache child workers sitting in the CLOSE_WAIT state and using high CPU. We had just migrated from squirrelmail and our roundcube server was getting so overwhelmed with the load that I was forced to write a script to kill the out of control apache processes. I also had to create a second web frontend and move some of our users onto that. This has made things much more bearable and now our users can actually use our webmail interface. The initial problem with stale child workers is still there however and sometimes the load will pick up to 10-20 until the processes have been around long enough for my script to kill them. Once they have been killed, the load often returns to less than 0.5.

I'm quite satisfied that the machine is more than capable of handling the amount of users that it has. server-status typically shows 20-40 connections at a time.
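The comment above mentions a script that kills runaway Apache children but does not show it. The Python sketch below illustrates only the selection step, under stated assumptions: it parses 'ps aux' style lines (11 columns, TIME as minutes:seconds) and picks processes in state R that exceed a CPU threshold and an accumulated CPU time. The function names and thresholds are hypothetical, and this is not the script the commenter actually used.

```python
def parse_cpu_time(value):
    """Convert a ps TIME field such as '148:31' (minutes:seconds) to seconds."""
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + int(seconds)


def runaway_pids(ps_lines, min_cpu=50.0, min_cpu_seconds=600, command="apache2"):
    """Return PIDs of running processes that look stuck in a busy loop."""
    pids = []
    for line in ps_lines:
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)  # COMMAND may contain spaces
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip the header and malformed lines
        pid, cpu, stat, cpu_time, cmd = (
            fields[1], fields[2], fields[7], fields[9], fields[10]
        )
        if command not in cmd:
            continue
        if (stat.startswith("R")
                and float(cpu) >= min_cpu
                and parse_cpu_time(cpu_time) >= min_cpu_seconds):
            pids.append(int(pid))
    return pids


sample_ps = [
    "USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND",
    "www-data 28696 90.9 2.7 40268 14332 ? R 10:32 148:31 /usr/sbin/apache2 -k start",
    "www-data 28700 0.3 2.5 39800 13100 ? S 10:40 0:02 /usr/sbin/apache2 -k start",
]
print(runaway_pids(sample_ps))
```

Sending a signal to the selected PIDs (for example with os.kill) is deliberately left out; a workaround like this hides the symptom and does not address why the processes spin.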

comment:16 in reply to: ↑ 15 ; follow-up: Changed 3 years ago by ptaryasw

  • Cc ptaryasw@… added

Replying to jamesd:

We're seeing the same thing on our roundcube 0.3 server (Apache 2.2.8, PHP 5.2.4, Xeon E5335, 2gb ram). Our roundcube server has imapproxy installed and accesses an imap cluster running Dovecot 1.1.11. The server is solely for roundcube.

Like others I see some apache child workers sitting in the CLOSE_WAIT state and using high CPU. We had just migrated from squirrelmail and our roundcube server was getting so overwhelmed with the load that I was forced to write a script to kill the out of control apache processes. I also had to create a second web frontend and move some of our users onto that. This has made things much more bearable and now our users can actually use our webmail interface. The initial problem with stale child workers is still there however and sometimes the load will pick up to 10-20 until the processes have been around long enough for my script to kill them. Once they have been killed, the load often returns to less than 0.5.

I'm quite satisfied that the machine is more than capable of handling the amount of users that it has. server-status typically shows 20-40 connections at a time.

May I know the PHP and Apache HTTPD server configuration, please? For example:

  • Is PHP used as a module (mod_php) or via FastCGI (mod_fastcgi/mod_fcgid/php_fpm) with Apache?
  • Is any PHP opcode cache (APC, XCache, eAccelerator) installed as an extension?
  • What is the MPM child process configuration, such as the values of MaxClients and MaxRequestsPerChild?

Some reasoning related to the issue above: when PHP is used via mod_php, it increases the resources used by each httpd process, because the PHP interpreter is embedded as a module, which in turn increases the resident memory used by a single httpd process. There is also a common calculation for a suggested MaxClients value based on that per-process resident memory: if "MaxClients x base resident memory" is bigger than the available RAM, that would be a source of problems. The next setting is MaxRequestsPerChild, which kills an httpd child process after it has served a certain number of requests. If a memory leak occurred or an httpd process suffered from a network-bound operation, the process will be killed once it reaches that request limit.
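The "MaxClients x base resident memory" check can be written out as a small calculation. The Python sketch below is a rough illustration: it borrows the 1024 MB of RAM from the ticket description and the 14332 KB RSS of the stuck process from comment 5, while the 256 MB reserved for the rest of the system and the function name estimate_max_clients are assumptions. RSS also counts pages shared between children, so the result is a conservative estimate rather than a tuning recommendation.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, rss_per_child_kb):
    """Estimate how many httpd children fit into RAM without swapping."""
    usable_kb = (total_ram_mb - reserved_mb) * 1024
    if usable_kb <= 0 or rss_per_child_kb <= 0:
        raise ValueError("usable RAM and per-child RSS must be positive")
    return usable_kb // rss_per_child_kb


# 1024 MB server, 256 MB kept back for the OS and other daemons (assumed),
# 14332 KB resident memory per Apache child (taken from the ps output).
print(estimate_max_clients(1024, 256, 14332))  # 54

# Memory needed if all 150 children of a "MaxClients 150" setup were busy.
print(150 * 14332 / 1024, "MB")
```

With these numbers, a MaxClients value of 150 would need roughly 2 GB if every child were busy at that size, about twice the RAM of the server in the original report.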

I'm using PHP 5.2.11, Apache 2.2.13 and Roundcube 0.3.1,1. I hope some of the points above help somebody out there, since I think this is not entirely a Roundcube issue.

comment:17 in reply to: ↑ 16 Changed 3 years ago by jamesd

Replying to ptaryasw:

May I know the PHP and Apache HTTPD server configuration, please? For example:

  • Is PHP used as a module (mod_php) or via FastCGI (mod_fastcgi/mod_fcgid/php_fpm) with Apache?
  • Is any PHP opcode cache (APC, XCache, eAccelerator) installed as an extension?
  • What is the MPM child process configuration, such as the values of MaxClients and MaxRequestsPerChild?

We're using mod_php without any opcode cache. MPM config is as follows:

StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0

comment:18 Changed 3 years ago by robs

Same problem here, with the same configuration on the Apache side.
Running Courier.

I submitted #1486778, but it seems to be the same issue.

comment:19 Changed 3 years ago by jamesd

I seem to have been able to solve this problem. First I upgraded from Ubuntu Hardy to Lucid and installed php-apc. This caused a massive slowdown when users were trying to use our Roundcube interface. Page loads took a long time and the server was generally unresponsive. I was able to fix this by increasing the MySQL innodb_buffer_pool_size from 8mb to 64mb. I also bumped innodb_additional_mem_pool_size to 20m.

This worked okay until I migrated users from our temporary secondary Roundcube server to the primary. Then I had the problem described above where Apache children would sit around for a long time chewing through cpu cycles and had to be killed off. I noticed that imapproxy had around 200 open connections to our IMAP servers. I disabled imapproxy and the connection count fell to a handful of connections. There was no noticeable increase in load on our IMAP servers. Over the next 30mins the load average came down from 30 to less than 1 as the old Apache children were killed off. The load average has stayed below one since then and it looks like this problem is now resolved.

If you're having problems like I was, try what I've mentioned above and if that doesn't fix it, look for other bottlenecks on your systems.

comment:20 Changed 3 years ago by till

I usually suggest the following setup:

  • php-cgi (fast-cgi)
  • a server like nginx or lighttpd
  • apc enabled

Can you guys get at least xdebug installed and some cachegrind files to see if RoundCube is eating resources anywhere?

If xdebug shows no evidence, you could try valgrind:
http://www.trafexisblogging.com/2009/12/how-to-debug-a-segmentation-fault-caused-by-php/

(I know this is not a segfault, but the howto for valgrind and php still applies.)

Please let me know what you find.

comment:21 Changed 3 years ago by pheinlein

  • Cc p.heinlein+roundcubetrac@… added

comment:22 Changed 3 years ago by rkallensee

  • Cc raphael@… added

I had at least a similar problem: Apache processes consumed 100% CPU and were stuck until max_execution_time was reached, which made my server nearly unresponsive during that time. This didn't occur on every request, but it was reproducible: after a few clicks RC hung, and most of the time I wasn't even able to log in again, which made RC unusable.

My environment: VPS (Xen), Ubuntu 10.04 (PHP 5.3.2-1ubuntu4), Courier IMAP. I configured RC to use the imapd-ssl on port 993 and activated IMAP_TLS_REQUIRED for Courier (which forces STARTTLS for every IMAP connection).

I tracked the problem down to the fsockopen call in rcube_imap_generic.php, where PHP seemed to be unable to establish a TLS connection and got stuck. I asked my VPS provider for help because the problems appeared after my VPS was migrated to a new host system; the only possible difference seemed to be the kernel version provided by the host system.

I temporarily fixed this by using unencrypted IMAP (RC resides on the same system as the imapd) and disabled IMAP_TLS_REQUIRED. But then I had similar problems with sending mails (Postfix, STARTTLS required). So the problem really seemed to be STARTTLS.

After my VPS provider installed a new kernel version and I restarted my system to use it (2.6.32.20), the problems seem to be completely gone. No stuck Apache processes, no STARTTLS problems.

At least for me the problem seemed to be somewhere between PHP, OpenSSL and the used kernel in a Xen VPS environment. Maybe this helps someone.

comment:23 Changed 3 years ago by brandond

The core issue is that there are a number of places in the IMAP code where Roundcube will go into an infinite loop if the server hangs up unexpectedly. See the following thread for more information:

http://lists.roundcube.net/mail-archive/dev/2010-07/0000022.html

Until it's fixed in Roundcube, the best you can do is avoid triggering any conditions in your IMAP server that might cause it to crash or unexpectedly disconnect clients.
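The failure mode described in this comment, a read loop that never notices the server has hung up, can be illustrated with a short Python sketch. It is not Roundcube's code (Roundcube is written in PHP), and the function name read_response, the tag A001, and the max_lines guard are assumptions. The point is that an empty read means the peer closed the connection, so the loop has to stop instead of retrying forever.

```python
import io


def read_response(stream, tag, max_lines=10000):
    """Read lines until the tagged IMAP response arrives.

    Raises ConnectionError if the server closes the connection first,
    instead of spinning on empty reads.
    """
    lines = []
    for _ in range(max_lines):
        line = stream.readline()
        if not line:
            # readline() returns b"" at end of stream: the peer hung up.
            raise ConnectionError("server closed the connection unexpectedly")
        text = line.decode("ascii", "replace").rstrip("\r\n")
        lines.append(text)
        if text.startswith(tag + " "):
            return lines
    raise RuntimeError("no tagged response within max_lines")


# In-memory streams stand in for a socket file object.
complete = io.BytesIO(b"* 3 EXISTS\r\nA001 OK SELECT completed\r\n")
print(read_response(complete, "A001"))

truncated = io.BytesIO(b"* 3 EXISTS\r\n")  # server vanished mid-response
try:
    read_response(truncated, "A001")
except ConnectionError as exc:
    print("aborted:", exc)
```

A loop that only tests for the tagged line, and treats an empty read as "no data yet", would keep consuming CPU on the truncated stream, which matches the stuck R-state processes reported earlier in this ticket.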

comment:24 Changed 2 years ago by alec

I read some related (I think) findings at http://pear.php.net/bugs/bug.php?id=18176. We have many socket connections open at a time in Roundcube too. So maybe a fix for this issue would be to just close the IMAP connection after the SMTP and LDAP connections. See rcmail::shutdown().

comment:25 Changed 2 years ago by neubauer

Last edited 2 years ago by neubauer

comment:26 Changed 2 years ago by alec

@neubauer: Did you read the comments above? I think a proxy will not solve this issue.

comment:27 Changed 2 years ago by alec

  • Resolution set to fixed
  • Status changed from new to closed

Done in [dd07e795]. I'm closing this for now. The issue should be re-tested using current code. Reopen with more feedback information.
