#1488093 closed Bugs (fixed)

charset iso-8859-1 not decoded

Reported by: phKU Owned by:
Priority: 2 Milestone: 0.7-beta
Component: Core functionality Version: 0.5.4
Severity: normal Keywords: charset encoding reading
Cc: ralph-1@…

Description

Context: incoming mails

Incoming mails whose content is iso-8859-1 encoded are not decoded at all. Raw character codes are shown instead of the correct localized characters, for example '=E9' instead of 'é', '=2C' instead of ',', etc. (see the attached mail).

  • Same problem on different Roundcube installs, on different servers.
  • These mails display correctly in other mail clients (Squirrel, Entourage).
  • Windows-1252 and UTF-8 encoded mails display without problems.

Thanks in advance for any help ;)

Attachments (2)

example.txt (11.4 KB) - added by phKU 20 months ago.
Incoming mail source example
imap.txt (25.2 KB) - added by phKU 20 months ago.
imap transaction


Change History (19)

Changed 20 months ago by phKU

Incoming mail source example

comment:1 Changed 20 months ago by alec

  • Milestone set to 0.6-stable
  • Resolution set to worksforme
  • Status changed from new to closed

Works for me with svn-trunk version.

comment:2 Changed 20 months ago by phKU

  • Cc ralph-1@… added

comment:3 Changed 20 months ago by phKU

  • Resolution worksforme deleted
  • Status changed from closed to reopened

Sorry, but the issue is still there on:

comment:4 Changed 20 months ago by thomasb

Works for me, too. Tested with Roundcube release 0.5.4, 0.6-rc and trunk.

Where exactly are the strings not decoded? Subject, From/To, Content?

comment:5 Changed 20 months ago by thomasb

What IMAP server software do you use? Maybe the message structure with the charset header isn't correctly transferred to Roundcube.

comment:6 Changed 20 months ago by phKU

Thanks for your help ;)

This issue affects only the [content] part, although the [subject] and [from] headers also contain iso-8859-1 encoded text. I don't know the IMAP server version (located at mail.dreamhost.com), but I doubt this is the cause, since the mails are read perfectly by the other mail clients I cited above.

1° Source extract of broken mail:

From: =?iso-8859-1?B?QWxiZXJ0IEr2cmltYW5u?= <username@hotmail.com>
To: <username@gmail.com>
Subject: =?iso-8859-1?Q?FW:_"Combi?= =?iso-8859-1?Q?en_co=FBte_u?=
 =?iso-8859-1?Q?n_revenu_u?= =?iso-8859-1?Q?niversel"=2C?=
 =?iso-8859-1?Q?_article_d?= =?iso-8859-1?Q?e_M._Vince?=
 =?iso-8859-1?Q?nt_Achetti?= =?iso-8859-1?Q?z_sur_votr?=
 =?iso-8859-1?Q?e_site_jlr?= =?iso-8859-1?Q?v?=
Date: Fri, 16 Sep 2011 16:02:05 +0200
Importance: Normal
In-Reply-To:
 <1462431152-1316180224-cardhu_decombobulator_blackberry.rim.net-1794688638-@b27.c19.bise7.blackberry>
References:
 <CABY5vwAqEuWK281_zCHhTODtex6hjEeu-4V-1fWKoiCPuxw3MQ@mail.gmail.com><CAEJGJ=2J397=c-9z_G8EvgwbpAacnCeoawMq5Z_uzszTXMMAzg@mail.gmail.com>,<1462431152-1316180224-cardhu_decombobulator_blackberry.rim.net-1794688638-@b27.c19.bise7.blackberry>
MIME-Version: 1.0
X-OriginalArrivalTime: 16 Sep 2011 14:02:06.0923 (UTC) FILETIME=[382CD9B0:01CC7479]

--_2cf2a1f3-712d-4c41-9e63-e381a1ad81f5_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable


voil=E0 quand-m=EAme les premi=E8res =E9l=E9ments de r=E9ponse du jplrv.
Bonne journ=E9e=2C Albert

render to:
voilà quand-même les premières éléments de réponse du jplrv.

Bonne journée, Albert


2° Source extract of working example (from same sender):

From: =?Windows-1252?B?QWxiZXJ0IEr2cmltYW5u?= <username@hotmail.com>
To: <username@gmail.com>
Subject: =?Windows-1252?Q?FW:_"Combi?= =?Windows-1252?Q?en_co=FBte_u?=
 =?Windows-1252?Q?n_revenu_u?= =?Windows-1252?Q?niversel"=2C?=
 =?Windows-1252?Q?_article_d?= =?Windows-1252?Q?e_M._Vince?=
 =?Windows-1252?Q?nt_Achetti?= =?Windows-1252?Q?z_sur_votr?=
 =?Windows-1252?Q?e_site_jlr?= =?Windows-1252?Q?v?=
Date: Fri, 16 Sep 2011 17:06:44 +0200
Importance: Normal
In-Reply-To:
 <1374314012-1316183924-cardhu_decombobulator_blackberry.rim.net-1558811235-@b27.c19.bise7.blackberry>
References:
 <CABY5vwAqEuWK281_zCHhTODtex6hjEeu-4V-1fWKoiCPuxw3MQ@mail.gmail.com><CAEJGJ=2J397=c-9z_G8EvgwbpAacnCeoawMq5Z_uzszTXMMAzg@mail.gmail.com><1462431152-1316180224-cardhu_decombobulator_blackberry.rim.net-1794688638-@b27.c19.bise7.blackberry><CABY5vwCBTdH+d1uf1h4BYWq8aDBHdMNnPhGZkAe-smLDH67uDg@mail.gmail.com>,<1374314012-1316183924-cardhu_decombobulator_blackberry.rim.net-1558811235-@b27.c19.bise7.blackberry>
MIME-Version: 1.0
X-OriginalArrivalTime: 16 Sep 2011 15:06:45.0384 (UTC) FILETIME=[3FEB0880:01CC7482]

--_a1b6c844-8334-47dc-a422-ab93a5981191_
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable


et voil=E0 la suite et la fin temporaire de notre gentil =E9change de courr=
iers. Nous avons notre temps de pr=E9parer quelque chose dans les d=E9lais =
utiles.
=20
bonnne soir=E9e=2C Albert
=20

render to:
et voilà la suite et la fin temporaire de notre gentil échange de courriers. Nous avons notre temps de préparer quelque chose dans les délais utiles.

bonnne soirée, Albert
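For reference, the decoding a client is expected to perform on the extracts above can be sketched in a few lines of Python. This is only an illustration using the Python standard library, not Roundcube's PHP code; the helper names decode_qp_body and decode_encoded_word are hypothetical.

```python
import quopri
from email.header import decode_header


def decode_qp_body(raw: str, charset: str) -> str:
    """Undo quoted-printable, then decode the bytes with the part's charset."""
    return quopri.decodestring(raw.encode("ascii")).decode(charset)


def decode_encoded_word(word: str) -> str:
    """Decode a single RFC 2047 encoded word such as '=?iso-8859-1?B?...?='."""
    data, charset = decode_header(word)[0]
    return data.decode(charset)


# Body line from the broken iso-8859-1 mail.
body = "voil=E0 quand-m=EAme les premi=E8res =E9l=E9ments de r=E9ponse du jplrv."
print(decode_qp_body(body, "iso-8859-1"))

# A soft line break ("=" at end of line) from the Windows-1252 mail is joined.
print(decode_qp_body("courr=\niers", "windows-1252"))

# Encoded word taken from the From: header.
print(decode_encoded_word("=?iso-8859-1?B?QWxiZXJ0IEr2cmltYW5u?="))
```

Both charsets map 0xE9 to 'é', so the two body extracts should decode identically at this level.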

comment:7 Changed 20 months ago by alec

Please provide a debug log of the IMAP conversation.

Changed 20 months ago by phKU

imap transaction

comment:8 Changed 20 months ago by phKU

IMAP conversation as attachment #2

comment:9 Changed 20 months ago by alec

The IMAP response looks fine. What PHP version, what OS?

comment:10 Changed 20 months ago by phKU

  • Linux 2.6.32.8-grsec-2.1.14-modsign-xeon-64 #2 SMP x86_64
  • Apache 2.2.17-1
  • PHP Version 5.3.5, PHP Version 5.3.2 or 5.2.x cgi (same result on the different versions)

Note: the imap server and the client server are provided by dreamhost.com, a very popular hosting company.

comment:11 follow-up: Changed 20 months ago by phKU

Some new clues:

  • the problem occurs only on some "iso-8859-1" messages from one of my correspondents.
  • other "iso-8859-1" messages display correctly, even from the same sender.
  • if I remove line #2501 of rcube_imap.php, which contains $body = rcube_charset_convert($body, $o_part->charset);, the characters display correctly (it seems that in this case the function call adds a superfluous and faulty conversion layer).

For these reasons, I first thought the issue was the sender's fault (a malformed message), until I learned that none of his other recipients have any problem and until I ran a successful test with my other mail clients.
At this stage, I guess the issue could be related to some malformation in the affected messages that the other mail clients correct but Roundcube does not.
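
To illustrate what a "superfluous conversion layer" would generally look like, here is a minimal Python sketch. It is an assumption-laden illustration of the hypothesis only, not a reproduction of rcube_charset_convert, and the variable names are hypothetical: text that is already UTF-8 gets interpreted a second time as ISO-8859-1.

```python
original = "journée"

# First (correct) conversion: the text is stored as UTF-8 bytes.
utf8_bytes = original.encode("utf-8")

# Correct handling: the bytes are interpreted with the charset they really use.
correct = utf8_bytes.decode("utf-8")

# Redundant conversion: the same UTF-8 bytes are treated as ISO-8859-1 again,
# so each multi-byte character is split into two wrong characters.
double_converted = utf8_bytes.decode("iso-8859-1")

print(correct)
print(double_converted)
```

Note that a redundant conversion of this kind produces garbled characters rather than raw '=E9' sequences, so it may not match the symptom described in this ticket.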

comment:12 in reply to: ↑ 11 Changed 20 months ago by alec

Replying to phKU:

  • if I remove line #2501 of rcube_imap.php, which contains $body = rcube_charset_convert($body, $o_part->charset);, the characters display correctly (it seems that in this case the function call adds a superfluous and faulty conversion layer).

Add console($body); before this line and check logs/console. You could also debug other places in the code this way.

At this stage, I guess the issue could be related to some malformation in the affected messages that the other mail clients correct but Roundcube does not.

But why does it work in my environment?

comment:13 follow-up: Changed 20 months ago by phKU

After further investigation, I noted the following points:

  • Contrary to what I thought at the beginning, the issue does not depend on one particular sender, nor on one particular encoding (it also happened with the "Windows-1252" charset).
  • It seems to happen only on replied and/or forwarded mails.
  • I tried a fresh and plain install (no plugins) on a local Mac OS X machine and got exactly the same result with the same mails...

As I have no knowledge of the RC program workflow, I'm sorry, but I don't think I can spend the learning time needed for a serious debugging attempt. For the moment I will fall back to Squirrel mail or another client to manage the broken mails :(

@alec: If you think it could be useful, I could set up a test install for you with a test account and forward (raw forwarding) some affected mails there for further testing.

Thanks anyway for your help ;)


comment:14 in reply to: ↑ 13 Changed 20 months ago by alec

Replying to phKU:

@alec: If you think it could be useful, I could set up a test install for you with a test account and forward (raw forwarding) some affected mails there for further testing.

Because it's working in my environment, I would need ssh access to your Roundcube instance to investigate. If that is possible, write to me - alec at alec dot pl.

comment:15 Changed 20 months ago by alec

  • Resolution set to worksforme
  • Status changed from reopened to closed

comment:16 Changed 19 months ago by alec

  • Milestone changed from 0.6-stable to 0.7-beta
  • Resolution worksforme deleted
  • Status changed from closed to reopened

I've investigated the issue on the requester's system. The problem is with the HTML parts of the messages.

Precisely, the messages contain malformed HTML content where the <meta> tag with the charset specification is placed inside the <body> tag. This is not handled correctly by PHP's DOMDocument. Characters are broken inside the washtml class.

Since it works on my system, I assume it has been fixed in PHP 5.3. The broken system uses PHP 5.2.17. As a workaround, we can make sure the <meta> tag is inside the head.
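
The workaround idea (moving a misplaced charset <meta> tag into the head before the HTML is parsed) could look roughly like the Python sketch below. This is not the code from the actual fix, which is in PHP and not shown in this ticket; the function name move_meta_to_head and the regex-based approach are assumptions made purely for illustration.

```python
import re

# Matches a <meta ...> tag whose attributes mention "charset" (case-insensitive).
META_CHARSET_RE = re.compile(r"<meta\b[^>]*\bcharset\b[^>]*>", re.IGNORECASE)
HEAD_OPEN_RE = re.compile(r"<head\b[^>]*>", re.IGNORECASE)
HTML_OPEN_RE = re.compile(r"<html\b[^>]*>", re.IGNORECASE)


def move_meta_to_head(html: str) -> str:
    """Hypothetical helper: relocate the charset <meta> tag into <head>."""
    metas = META_CHARSET_RE.findall(html)
    if not metas:
        return html  # nothing to relocate

    # Remove every charset <meta> from wherever it currently sits,
    # then re-insert the first one at the start of the head.
    stripped = META_CHARSET_RE.sub("", html)
    first_meta = metas[0]

    head = HEAD_OPEN_RE.search(stripped)
    if head:
        pos = head.end()
        return stripped[:pos] + first_meta + stripped[pos:]

    # No <head> at all: create one, after <html> if that tag exists.
    new_head = "<head>" + first_meta + "</head>"
    html_tag = HTML_OPEN_RE.search(stripped)
    if html_tag:
        pos = html_tag.end()
        return stripped[:pos] + new_head + stripped[pos:]
    return new_head + stripped


broken = (
    "<html><head><title>t</title></head><body>"
    '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
    "<p>voil\u00e0</p></body></html>"
)
print(move_meta_to_head(broken))
```

A regex is used here only to keep the sketch short; it does not handle every form of malformed markup.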

comment:17 Changed 19 months ago by alec

  • Resolution set to fixed
  • Status changed from reopened to closed

Fixed in [104e2353].
