One could use file(1) on *nix systems to analyze file content. It shouldn't be too hard to reimplement the needed parts of file(1) in Python, because AFAIK it mainly uses the /etc/magic (/usr/lib/magic) database to do its job. I just checked, and it is capable of recognizing Word, Excel and RTF files, and it is not fooled by file extensions.
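To give a rough idea of what "reimplementing the needed parts" might look like, here is a minimal sketch that checks only two well-known signatures: the OLE2 compound-document header used by legacy Word and Excel files, and the `{\rtf` prefix of RTF files. The function names and the returned labels are made up for illustration, and this is nowhere near what the magic database covers. In particular, the OLE2 header alone cannot tell Word from Excel; file(1) looks further into the file for that.

```python
# Minimal content sniffing by leading "magic" bytes.
# Labels and function names are hypothetical, chosen for this sketch only.

# Header shared by legacy OLE2 documents (.doc, .xls, ...).
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")
# RTF files start with this literal prefix.
RTF_MAGIC = b"{\\rtf"


def sniff_bytes(head):
    """Return a coarse label for the given leading bytes, or None if unknown."""
    if head.startswith(OLE2_MAGIC):
        return "ole2"  # Word or Excel; not distinguishable from the header alone
    if head.startswith(RTF_MAGIC):
        return "rtf"
    return None


def sniff_file(path):
    """Read the first few bytes of a file and classify them; the name is ignored."""
    with open(path, "rb") as f:
        return sniff_bytes(f.read(8))
```

Since only the content is read, a file named report.txt that starts with the OLE2 header would still come back as "ole2".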
Yes, that's what we use here. At least on Debian and Mandrake systems, file(1) has a "-i" option that outputs a MIME content type instead of an English description. Very convenient. -- Florent -- Florent Guillaume, Nuxeo SARL (Paris, France) +33 1 40 33 79 10 http://nuxeo.com mailto:fg@nuxeo.com
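For reference, calling file(1) with "-i" from Python could look roughly like the sketch below. The function name is hypothetical, and the code assumes that your file(1) also accepts "-b" (brief output, no filename prefix) and "--" to end the options. Note that "-i" does not mean the same thing everywhere (on some BSD-derived systems the MIME option is spelled differently), so check your man page first.

```python
import shutil
import subprocess


def mime_type_via_file(path):
    """Return the MIME type reported by `file -b -i`, or None if it cannot be determined.

    Hypothetical helper: assumes a file(1) where -i prints a MIME type.
    """
    # Bail out quietly if file(1) is not installed on this system.
    if shutil.which("file") is None:
        return None
    result = subprocess.run(
        ["file", "-b", "-i", "--", path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    # Output may look like "text/rtf; charset=us-ascii"; keep only the type part.
    mime = result.stdout.split(";")[0].strip()
    return mime or None
```

Returning None instead of raising keeps the caller free to fall back to something else, for example guessing from the extension.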