Opened 5 years ago

Closed 4 years ago

#1485311 closed Bugs (fixed)

Common office formats are misdetected by mime detection code

Reported by: arekm Owned by: thomasb
Priority: 5 Milestone: 0.2.1
Component: MIME parsing Version: 0.2-alpha
Severity: major Keywords:
Cc:

Description (last modified by thomasb)

Continuation of #1485152 (no permission to reopen original bug).

finfo doesn't solve this problem. finfo uses the libmagic library, which itself isn't able to detect many types of Microsoft Office attachments properly.

Example using sample files found on the internet:

$ rpm -q libmagic
libmagic-4.25-1.i686
$ file -i *
20080529_zalacznik_nr_3.xls:           application/octet-stream
Alice's Adventures in Wonderland.doc:  application/msword
Alice's Adventures in Wonderland.docx: application/zip
Office Open XML sample.doc:            application/msword
Office Open XML sample.docx:           application/zip
StockChart.ods:                        application/octet-stream
p35-47.ppt:                            application/octet-stream

The xls file is misdetected as octet-stream. The docx files are detected as zip files (they really are zip archives, but they have their own MIME types). The PowerPoint file and the OpenOffice Calc ods file are also misdetected as octet-stream.
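
To illustrate why a content-based detector reports application/zip for docx, here is a minimal sketch. The archive built here is only a stand-in with docx-like member names, not a valid Word document, and looks_like_zip is a hypothetical helper, not part of finfo or libmagic.

```python
import io
import zipfile


def looks_like_zip(data: bytes) -> bool:
    """Return True if the data starts with the ZIP local file header signature."""
    return data[:4] == b"PK\x03\x04"


# Build a tiny in-memory archive with docx-like member names.
# Assumption: this is a stand-in container, not a real .docx file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("[Content_Types].xml", "<Types/>")
    zf.writestr("word/document.xml", "<document/>")
sample = buf.getvalue()

# Both checks only see a ZIP container; telling docx apart would need
# a look inside the archive or at the file name.
print(looks_like_zip(sample))
print(zipfile.is_zipfile(io.BytesIO(sample)))
```

Judging by the leading bytes alone, such a file cannot be told apart from any other zip archive.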

I think that roundcube should do filename-based type detection for these commonly misdetected formats. Unfortunately, Microsoft didn't put explicit magic in these file formats to make detection easy, so filename (extension) based detection is the only way.
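
A rough sketch of the idea, in Python for illustration only (roundcube itself is PHP, and this is not the attached patch). The names OVERRIDES, GENERIC and guess_mime are hypothetical; the MIME strings are the registered types for these extensions.

```python
import os

# Hypothetical override table for the formats named in this ticket.
OVERRIDES = {
    ".doc": "application/msword",
    ".xls": "application/vnd.ms-excel",
    ".ppt": "application/vnd.ms-powerpoint",
    ".docx": "application/vnd.openxmlformats-officedocument"
             ".wordprocessingml.document",
    ".xlsx": "application/vnd.openxmlformats-officedocument"
             ".spreadsheetml.sheet",
    ".pptx": "application/vnd.openxmlformats-officedocument"
             ".presentationml.presentation",
    ".ods": "application/vnd.oasis.opendocument.spreadsheet",
}

# Sniffed results that are too generic to trust for these extensions
# (an assumption made for this sketch).
GENERIC = {"application/octet-stream", "application/zip"}


def guess_mime(filename: str, sniffed: str) -> str:
    """Prefer the extension-based type when content sniffing was too generic."""
    ext = os.path.splitext(filename)[1].lower()
    if sniffed in GENERIC and ext in OVERRIDES:
        return OVERRIDES[ext]
    return sniffed


# The sniffed values below are taken from the `file -i` output above.
print(guess_mime("p35-47.ppt", "application/octet-stream"))
print(guess_mime("Office Open XML sample.docx", "application/zip"))
```

The sniffed type is kept whenever it is already specific, so a correctly detected .doc or a real .zip archive is left alone.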

Common mime types: http://svn.apache.org/viewvc/httpd/httpd/trunk/docs/conf/mime.types?view=markup , OpenXML types: http://blogs.msdn.com/dmahugh/archive/2006/08/08/692600.aspx

Ideas, comments?

This bug is quite important to fix since office formats are very commonly seen as email attachments. Emails sent from roundcube to the outside world with wrong MIME types cause other email clients/webmails to misbehave.

Attachments (1)

mime-detect.patch (3.9 KB) - added by arekm 5 years ago.
proposed solution for commonly misdetected formats

Change History (6)

Changed 5 years ago by arekm

proposed solution for commonly misdetected formats

comment:2 Changed 5 years ago by tensor1982

An empty .rar file is often misdetected as text/plain, too. Consider including it in the list.

comment:3 Changed 4 years ago by alec

  • Milestone changed from later to 0.2.1

comment:4 Changed 4 years ago by thomasb

  • Description modified (diff)
  • Owner set to thomasb

I think having a hard-coded list will solve this issue best... unfortunately.

Marked #1485730 and #1485697 as duplicates of this bug.

comment:5 Changed 4 years ago by thomasb

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in [0ea569c2]
