Ticket #1484608 (new Bugs)

Opened 11 months ago

Last modified 3 months ago

Serialization of header data with non-ASCII characters fails on Unicode PostgreSQL database

Reported by: NetCompany Owned by:
Priority: 5 Milestone: 0.2-beta
Component: Database Version: svn-trunk
Severity: normal Keywords: mdb2 utf8
Cc:

Description

Some non-compliant messages contain latin1 characters in their headers. These messages generate the following error when loading my inbox:

[Sat Oct 13 15:05:21 2007] [error] [client 213.84.186.140] PHP Notice:  DB Error: unknown error Query: INSERT INTO messages (user_id, del, cache_key, created, idx, uid,
 subject, "from", "to", cc, date, size, headers, structure) VALUES ('1', 0, 'INBOX.msg', now(), '20021', '102068', 'RE: Re: [Fwd: MAGISTER-SYNC: geen records opgehaald 
uit magister!]', 'Ren\xc3\xa9 Klein <rene.klein@hml.nl>', 'Kasper Schoonman <kasper@netcompany.nl>', '', '2007-10-11 13:55:35', 17378, 'O:14:"iilBasicHeader":25:{s:2:"i
d";s:5:"20021";s:3:"uid";s:6:"102068";s:7:"subject";s:66:"RE: Re: [Fwd: MAGISTER-SYNC: geen records opgehaald uit mag in /usr/share/roundcube/program/include/bugs.inc o
n line 80, referer: http://netmail.nu/

I think the problem is in rcube_imap.inc around line 2142, where the header data is serialized with the PHP serialize() function. If this header data is neither ASCII nor UTF-8, PostgreSQL with a Unicode database will complain about malformed UTF-8 sequences.
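To illustrate why such data is rejected, here is a minimal sketch in Python (not Roundcube's PHP code). The helper name `is_valid_utf8` and the sample sender value are hypothetical; the sketch only shows that a Latin-1 encoded "é" is not a valid UTF-8 byte sequence, which is the kind of input a UTF-8 database would refuse in a text column.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical header value, encoded two different ways.
latin1_from = "René Klein".encode("latin-1")  # 'é' is the single byte 0xE9
utf8_from = "René Klein".encode("utf-8")      # 'é' is the two bytes 0xC3 0xA9

print(is_valid_utf8(latin1_from))  # the Latin-1 bytes fail UTF-8 validation
print(is_valid_utf8(utf8_from))    # the UTF-8 bytes pass
```

A check of this kind could be run on each header value before the INSERT to find out which messages trigger the error.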

I do not immediately see what the right solution is. Either the field has to change from text to bytea type, or the data has to be encapsulated using base64 or maybe even converted to UTF-8.
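The two encapsulation options can be sketched as follows, again in Python as an illustration rather than as a patch for rcube_imap.inc. All function names are hypothetical, and the Latin-1 fallback is an assumption: the real charset of a non-compliant header is not known in general.

```python
import base64


def encapsulate_base64(raw: bytes) -> str:
    """Option 1: wrap arbitrary bytes in base64 so the stored text is pure ASCII."""
    return base64.b64encode(raw).decode("ascii")


def decapsulate_base64(text: str) -> bytes:
    """Reverse of encapsulate_base64; returns the original bytes unchanged."""
    return base64.b64decode(text)


def to_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Option 2: keep valid UTF-8 as-is, otherwise re-encode from a fallback charset.

    The fallback charset is an assumption, not something the message guarantees.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


raw_header = "René Klein".encode("latin-1")  # hypothetical non-compliant header
stored = encapsulate_base64(raw_header)      # safe to store in a text column
restored = decapsulate_base64(stored)        # byte-identical to raw_header
converted = to_utf8(raw_header)              # readable, but relies on the charset guess
```

Base64 keeps the original bytes intact at the cost of readability and size, while conversion keeps the column readable but depends on guessing the charset. Because PHP's serialize() records the byte length of each string, any conversion would have to happen before serialization, not on the serialized result.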

Change History

  Changed 7 months ago by seansan

  • milestone set to 0.1.1

Review in 0.1.1

  Changed 4 months ago by alec

  • version changed from 0.1-rc1 to svn-trunk

Confirmed. The problem is not limited to the headers and structure columns; subject, from and to must also be taken into consideration. The bytea type is not a solution for this problem.

follow-up: ↓ 4   Changed 3 months ago by till

  • keywords mdb2 utf8 added

I thought MDB2 handled all that? Or is the data itself not UTF-8?

in reply to: ↑ 3   Changed 3 months ago by alec

Replying to till:

I thought MDB2 handled all that? Or is the data itself not UTF-8?

The data itself is not UTF-8.
