[Zope-dev] Re: inconsistent mimetype assignment for uploaded files
Tres Seaver
tseaver at zope.com
Thu Sep 30 10:03:05 EDT 2004
Jan-Wijbrand Kolman wrote:
> Hello,
>
>
> we recently realised mimetype assignment in Zope to e.g. Zope File
> objects is inconsistent and can vary when different clients (browsers)
> upload files with the same file extensions.
>
> Example: when a file called "foobar.rtf" is uploaded to a Zope File
> object from Linux Firefox, the mimetype assigned is (can be)
> 'application/rtf'. However, the same file uploaded to the same Zope
> File object in the same Zope instance, using IE on Windows 2000 (with MS
> Office installed) will get 'application/msword' assigned.
>
> The mimetype assignment for uploaded files is done in OFS/Image.py
> (maybe there are more places or other Products that do this - I know
> that at least ExtFile does this too). Line 463 of OFS/Image.py, Zope
> 2.7.2:
>
> def _get_content_type(self, file, body, id, content_type=None):
>     headers=getattr(file, 'headers', None)
>     if headers and headers.has_key('content-type'):
>         content_type=headers['content-type']
>     else:
>         if type(body) is not type(''): body=body.data
>         content_type, enc=guess_content_type(
>             getattr(file, 'filename',id), body, content_type)
>     return content_type
>
> Then I understood that the headers as sent by the client for this file
> (may?) have a content-type entry that takes precedence over both the
> mimetypes 'database' and the content_type passed in as an argument.
>
> We could deal with the inconsistent assignment on the application
> level (in this case Silva), but I'd rather consider changing this
> behaviour on the Zope level. I could imagine changing the way a
> mimetype is 'guessed' from an uploaded File to something like:
>
> def _get_content_type(self, file, body, id, content_type=None):
>     """
>     Order of precedence:
>     1) see if guess_content_type resolves to a mimetype for the
>        filename
>     2) if not use content_type as sent in the headers if
>        available
>     3) else use argument passed in
>     """
>     headers = getattr(file, 'headers', {})
>     content_type = headers.get('content-type', content_type)
>     if type(body) is not type(''):
>         body = body.data
>     name = getattr(file, 'filename', id)
>     content_type, enc = guess_content_type(name, body, content_type)
>     return content_type
>
> Does anyone have an opinion on this? Is the current behaviour
> completely intentional, maybe even according to some specification
> (and thus I should deal with it on the application level)? Should I
> file a collector issue?
-1 for using the "guessed" value over the one from the headers; +1 for
using the argument over the guessed value (so that the application can
"fix" the problem). I agree that having different clients supply
different types is painful, but I don't think that "fixing" it at the
low level is reasonable (mechanism vs. policy).
In summary, I would prefer the precedence to be:
1. Passed value
2. Request header
3. Guessed value
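
That ordering can be sketched roughly as follows. This is only an illustration, not the Zope implementation: `resolve_content_type` and its parameters are hypothetical names, and the standard library's `mimetypes.guess_type` stands in for Zope's `guess_content_type` (which can also inspect the body).

```python
import mimetypes


def resolve_content_type(filename, header_type=None, passed_type=None,
                         default="application/octet-stream"):
    """Pick a content type: passed value, then request header, then guess.

    Hypothetical helper for illustration; not part of Zope.
    """
    # 1. A value passed explicitly by the application always wins.
    if passed_type:
        return passed_type
    # 2. Otherwise use the Content-Type the client sent with the upload.
    if header_type:
        return header_type
    # 3. Fall back to guessing from the file name (stdlib stand-in for
    #    Zope's guess_content_type).
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


# The application overrides what IE sent for the .rtf upload.
print(resolve_content_type("foobar.rtf",
                           header_type="application/msword",
                           passed_type="application/rtf"))
```

With this ordering an application such as Silva can pass its own normalized type to override whatever the browser sent, while uploads without an explicit argument still get the client-supplied header.
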
Tres.
--
===============================================================
Tres Seaver tseaver at zope.com
Zope Corporation "Zope Dealers" http://www.zope.com