Opened 3 years ago

Closed 19 months ago

#1487424 closed Bugs (fixed)

High load when parsing big HTML messages

Reported by: umount
Owned by:
Priority: 5
Milestone: 0.8-beta
Component: Core functionality
Version: git-master
Severity: normal
Keywords:
Cc:

Description

I tried to open an HTML mail message about 8 MB in size.
The load on the server was very high, and memory usage reached 150 MB.
Then I got an Nginx 504 Gateway Time-out error.

I looked at the Xdebug profiler output, and it is crazy: some parts of the program are called 500000 times (screen 1).

I tried using the uppercase S (study) pattern modifier: http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php
It helped a little (screen 2).

I think it is wrong that someone can put heavy load on the server, or even bring it down, just by sending users HTML messages about 10 MB in size.


Attachments (3)

1.png (244.9 KB) - added by umount 3 years ago.
screen 1
2.png (243.2 KB) - added by umount 3 years ago.
screen 2
top.png (50.3 KB) - added by umount 3 years ago.

Change History (13)

Changed 3 years ago by umount

screen 1

Changed 3 years ago by umount

screen 2

comment:1 Changed 3 years ago by umount

  • Component changed from Addressbook to Core functionality
  • Milestone changed from later to 0.5-stable

comment:2 follow-up: Changed 3 years ago by alec

  • Summary changed from Hi load html size 8Mb to High load when parsing big HTML messages

Could you provide a patch? Here I've got 27MB memory peak for 2MB of the message.

comment:3 in reply to: ↑ 2 Changed 3 years ago by umount

Replying to alec:

Could you provide a patch? Here I've got 27MB memory peak for 2MB of the message.

What kind of patch do you need?

See the screenshot (top.png): physical memory usage is 124 MB, virtual 268 MB.
The HTML message size is 8.2 MB.

Changed 3 years ago by umount

comment:4 follow-up: Changed 3 years ago by alec

Where did you add the S flag? I've tried to add it in some places, but have not found any speedup nor memory usage reduction. What PHP version? Enable devel_mode in Roundcube config, this will print real PHP memory usage into logs/console.

comment:5 in reply to: ↑ 4 Changed 3 years ago by umount

Replying to alec:

Where did you add the S flag? I've tried to add it in some places, but have not found any speedup nor memory usage reduction. What PHP version? Enable devel_mode in Roundcube config, this will print real PHP memory usage into logs/console.

You are absolutely right, the S flag does not help.

devel_mode shows:
[03-Dec-2010 10:28:57 +0300]: mail/preview [44 MB/145 MB]: 34,2061 sec
[03-Dec-2010 10:35:50 +0300]: mail/preview [44 MB/145 MB]: 33,5005 sec
With XCache:
[03-Dec-2010 10:24:20 +0300]: mail/preview [39 MB/141 MB]: 33,4470 sec

processor
model name : Intel(R) Pentium(R) 4 CPU 2.40GHz

php
PHP 5.3.1 (cli) (built: Dec 7 2009 18:10:26)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2009 Zend Technologies

comment:6 Changed 3 years ago by umount

Gmail crops big HTML messages before displaying them.

comment:7 Changed 3 years ago by umount

I think we should truncate the message before parsing its text, and add a link at the end to download the message uncropped.
Otherwise, every time we preview such a message we wait a long time and the server is under high load.

comment:8 Changed 19 months ago by alec

For a 5MB message, current svn-trunk uses [15 MB/47 MB]: 26.1509 sec. So, we are getting better.

comment:9 Changed 19 months ago by alec

Now it's 43MB. Further investigation turned up some more info. We have a few bottlenecks here. One group is preg_replace/str_replace usage, which can probably be fixed by parsing in chunks. The second bottleneck is the washtml class (DOMDocument). Both cause a memory peak of 5-6 x the message body size.
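
To illustrate the "parsing in chunks" idea, here is a minimal sketch in Python rather than Roundcube's PHP. It is not the project's code: the function names `iter_chunks` and `replace_in_chunks` and the 64 KB chunk size are assumptions made for the example. It cuts the body only after a newline and applies the substitution one chunk at a time, so it is only valid for patterns that never match across a line break.

```python
import re
from typing import Iterator, Pattern


def iter_chunks(text: str, chunk_size: int = 64 * 1024) -> Iterator[str]:
    """Yield consecutive pieces of text, each cut right after a newline.

    A piece is at least chunk_size characters long unless it is the last
    one; it is extended to the next newline so no line is split in two.
    """
    start = 0
    total = len(text)
    while start < total:
        end = min(start + chunk_size, total)
        if end < total:
            newline = text.find("\n", end)
            end = total if newline == -1 else newline + 1
        yield text[start:end]
        start = end


def replace_in_chunks(text: str, pattern: Pattern[str], repl: str,
                      chunk_size: int = 64 * 1024) -> Iterator[str]:
    """Apply pattern.sub() chunk by chunk and yield each rewritten chunk.

    Assumption: the pattern never needs to match across a newline,
    otherwise matches that straddle a chunk boundary would be missed.
    """
    for chunk in iter_chunks(text, chunk_size):
        yield pattern.sub(repl, chunk)
```

A caller can write each yielded chunk straight to the output instead of joining them, which avoids holding several full-size copies of the body at once. Whether this removes the 5-6 x peak in practice would have to be measured.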

comment:10 Changed 19 months ago by alec

  • Milestone changed from 0.7-stable to 0.8-beta
  • Resolution set to fixed
  • Status changed from new to closed

The simplest solution is to not even try to parse big messages. We can display a notice with a link to download the part directly. Done in [e0960f63].
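
The approach can be sketched as a simple size guard. This is a Python illustration only, not the code from the changeset: `render_html_part`, `MAX_PARSE_SIZE`, the 1 MB limit, the notice markup, and the `sanitize` callback are all hypothetical names chosen for the example.

```python
import html
from typing import Callable

# Hypothetical limit; a real deployment would make this configurable.
MAX_PARSE_SIZE = 1024 * 1024


def render_html_part(body: bytes, download_url: str,
                     sanitize: Callable[[str], str],
                     max_size: int = MAX_PARSE_SIZE) -> str:
    """Return sanitized HTML for small parts, or a download notice for big ones.

    The size check runs before any decoding or parsing, so an oversized
    part never reaches the expensive sanitizer.
    """
    size = len(body)
    if size > max_size:
        link = html.escape(download_url, quote=True)
        return (
            '<p class="notice">This message part is too big to display '
            f'({size} bytes). <a href="{link}">Download it</a> instead.</p>'
        )
    return sanitize(body.decode("utf-8", errors="replace"))
```

The key design point is that the check costs only a length lookup, so the cost of handling an oversized part no longer grows with its size.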
