[Checkins] SVN: z3c.filetype/branches/1.1.0/ - added interfaces for Microsoft office files
Juergen Kartnaller
juergen at kartnaller.at
Mon Jan 19 07:30:04 EST 2009
Log message for revision 94826:
- added interfaces for Microsoft office files
Note: with the current magic.mimes file it is not possible to reliably
detect Microsoft Office files from their contents. All Office files are
detected as application/msword. For now the only way to tell them apart
is to use the filename (see README.txt).
Changed:
U z3c.filetype/branches/1.1.0/CHANGES.txt
U z3c.filetype/branches/1.1.0/src/z3c/filetype/README.txt
U z3c.filetype/branches/1.1.0/src/z3c/filetype/api.py
U z3c.filetype/branches/1.1.0/src/z3c/filetype/interfaces/filetypes.py
U z3c.filetype/branches/1.1.0/src/z3c/filetype/magic.txt
A z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/excel.xls
A z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/portable.pdf
A z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/powerpoingt.ppt
A z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/word.doc
-=-
Modified: z3c.filetype/branches/1.1.0/CHANGES.txt
===================================================================
--- z3c.filetype/branches/1.1.0/CHANGES.txt 2009-01-19 07:07:37 UTC (rev 94825)
+++ z3c.filetype/branches/1.1.0/CHANGES.txt 2009-01-19 12:30:04 UTC (rev 94826)
@@ -5,6 +5,12 @@
After
=====
+ - added interfaces for Microsoft office files
+ Note : with the current magic.mimes file it is not possible to reliably
+ detect Microsoft Office files. All Office files are detected as
+ application/msword. The only way for now is to use the filename to
+ detect the type. (see README.txt)
+
2007/12/21 1.1.1
================
Modified: z3c.filetype/branches/1.1.0/src/z3c/filetype/README.txt
===================================================================
--- z3c.filetype/branches/1.1.0/src/z3c/filetype/README.txt 2009-01-19 07:07:37 UTC (rev 94825)
+++ z3c.filetype/branches/1.1.0/src/z3c/filetype/README.txt 2009-01-19 12:30:04 UTC (rev 94826)
@@ -19,13 +19,15 @@
>>> for name in fileNames:
... if name==".svn": continue
... path = os.path.join(testData, name)
- ... i = api.getInterfacesFor(file(path, 'rb'))
+ ... i = api.getInterfacesFor(file(path, 'rb'), filename=name)
... print name
... print sorted(i)
DS_Store
[<InterfaceClass z3c.filetype.interfaces.filetypes.IBinaryFile>]
IMG_0504.JPG
[<InterfaceClass z3c.filetype.interfaces.filetypes.IJPGFile>]
+ excel.xls
+ [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
faces_gray.avi
[<InterfaceClass z3c.filetype.interfaces.filetypes.IAVIFile>]
ftyp.mov
@@ -42,6 +44,10 @@
[<InterfaceClass z3c.filetype.interfaces.filetypes.IAudioMPEGFile>]
noface.bmp
[<InterfaceClass z3c.filetype.interfaces.filetypes.IBMPFile>]
+ portable.pdf
+ [<InterfaceClass z3c.filetype.interfaces.filetypes.IPDFFile>]
+ powerpoingt.ppt
+ [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
test.flv
[<InterfaceClass z3c.filetype.interfaces.filetypes.IFLVFile>]
test.gnutar
@@ -62,7 +68,30 @@
[<InterfaceClass z3c.filetype.interfaces.filetypes.IHTMLFile>]
thumbnailImage_small.jpeg
[<InterfaceClass z3c.filetype.interfaces.filetypes.IJPGFile>]
+ word.doc
+ [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
+It is not possible to reliably detect Microsoft Office files from file data.
+The only way right now is to use the filename.
+
+ >>> for name in fileNames:
+ ... if name==".svn": continue
+ ... i = api.getInterfacesFor(filename=name)
+ ... print name
+ ... print sorted(i)
+ DS_Store
+ [<InterfaceClass z3c.filetype.interfaces.filetypes.IBinaryFile>]
+ ...
+ excel.xls
+ [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSExcelFile>]
+ ...
+ powerpoingt.ppt
+ [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSPowerpointFile>]
+ ...
+ word.doc
+ [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
+
+
The filename is only used if no interface is found, because we should
not trust the filename in most cases.
Modified: z3c.filetype/branches/1.1.0/src/z3c/filetype/api.py
===================================================================
--- z3c.filetype/branches/1.1.0/src/z3c/filetype/api.py 2009-01-19 07:07:37 UTC (rev 94825)
+++ z3c.filetype/branches/1.1.0/src/z3c/filetype/api.py 2009-01-19 12:30:04 UTC (rev 94826)
@@ -13,7 +13,7 @@
def byMimeType(t):
"""returns interfaces implemented by mimeType"""
-
+
ifaces = [iface for name, iface in vars(filetypes).items() \
if name.startswith("I")]
res = InterfaceSet()
@@ -30,7 +30,7 @@
objects (file argument) with an optional filename as name or
mimeType as mime-type
"""
-
+
ifaces = set()
if file is not None:
types = magicFile.detect(file)
Modified: z3c.filetype/branches/1.1.0/src/z3c/filetype/interfaces/filetypes.py
===================================================================
--- z3c.filetype/branches/1.1.0/src/z3c/filetype/interfaces/filetypes.py 2009-01-19 07:07:37 UTC (rev 94825)
+++ z3c.filetype/branches/1.1.0/src/z3c/filetype/interfaces/filetypes.py 2009-01-19 12:30:04 UTC (rev 94826)
@@ -118,3 +118,22 @@
"""XML File"""
IXMLFile.setTaggedValue(MTM,re.compile('text/xml'))
IXMLFile.setTaggedValue(MT,'text/xml')
+
+class IMSOfficeFile(IBinaryFile):
+ """Microsoft Office File"""
+
+class IMSWordFile(IMSOfficeFile):
+ """Microsoft Word File"""
+IMSWordFile.setTaggedValue(MTM,re.compile('application/.*msword'))
+IMSWordFile.setTaggedValue(MT,'application/msword')
+
+class IMSExcelFile(IMSOfficeFile):
+ """Microsoft Excel File"""
+IMSExcelFile.setTaggedValue(MTM,re.compile('application/.*excel'))
+IMSExcelFile.setTaggedValue(MT,'application/msexcel')
+
+class IMSPowerpointFile(IMSOfficeFile):
+ """Microsoft Powerpoint File"""
+IMSPowerpointFile.setTaggedValue(MTM,re.compile('application/.*powerpoint'))
+IMSPowerpointFile.setTaggedValue(MT,'application/mspowerpoint')
+
Modified: z3c.filetype/branches/1.1.0/src/z3c/filetype/magic.txt
===================================================================
--- z3c.filetype/branches/1.1.0/src/z3c/filetype/magic.txt 2009-01-19 07:07:37 UTC (rev 94825)
+++ z3c.filetype/branches/1.1.0/src/z3c/filetype/magic.txt 2009-01-19 12:30:04 UTC (rev 94826)
@@ -15,6 +15,7 @@
... print '%s --> %r' % (name,sorted(m.detect(file(path))))
DS_Store --> []
IMG_0504.JPG --> ['image/jpeg']
+ excel.xls --> ['application/msword']
faces_gray.avi --> ['video/x-msvideo']
ftyp.mov --> ['video/quicktime']
ipod.mp4 --> ['video/mp4', 'video/quicktime']
@@ -23,6 +24,8 @@
logo.gif.bz2 --> ['application/x-bzip2']
mpeglayer3.mp3 --> ['audio/mpeg']
noface.bmp --> ['image/bmp']
+ portable.pdf --> ['application/pdf']
+ powerpoingt.ppt --> ['application/msword']
test.flv --> ['video/x-flv']
test.gnutar --> ['application/x-tar']
test.html --> ['text/html']
@@ -33,3 +36,5 @@
test2.html --> ['text/html']
test2.thml --> ['text/html']
thumbnailImage_small.jpeg --> ['image/jpeg']
+ word.doc --> ['application/msword']
+
Added: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/excel.xls
===================================================================
(Binary files differ)
Property changes on: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/excel.xls
___________________________________________________________________
Added: svn:mime-type
+ application/octet-stream
Added: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/portable.pdf
===================================================================
(Binary files differ)
Property changes on: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/portable.pdf
___________________________________________________________________
Added: svn:mime-type
+ application/octet-stream
Added: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/powerpoingt.ppt
===================================================================
(Binary files differ)
Property changes on: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/powerpoingt.ppt
___________________________________________________________________
Added: svn:mime-type
+ application/octet-stream
Added: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/word.doc
===================================================================
(Binary files differ)
Property changes on: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/word.doc
___________________________________________________________________
Added: svn:mime-type
+ application/octet-stream
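
The filename fallback described in the log message can be sketched outside
of z3c.filetype as follows. This is a minimal illustration under stated
assumptions, not the package's implementation: it uses Python's standard
mimetypes module rather than api.getInterfacesFor, the function name
office_types_for and the label strings are hypothetical stand-ins for the
interface classes, and the vendor MIME strings registered for .doc, .xls and
.ppt are assumptions chosen to show how the patterns added in filetypes.py
would match.

```python
import mimetypes
import re

# Patterns copied from the tagged values added in filetypes.py in this commit.
# The label strings are stand-ins for the real interface classes.
OFFICE_PATTERNS = [
    ("IMSWordFile", re.compile('application/.*msword')),
    ("IMSExcelFile", re.compile('application/.*excel')),
    ("IMSPowerpointFile", re.compile('application/.*powerpoint')),
]

# A private registry so the result does not depend on the host's mime.types.
# ASSUMPTION: these extension-to-type mappings are illustrative choices.
_registry = mimetypes.MimeTypes()
_registry.add_type('application/msword', '.doc')
_registry.add_type('application/vnd.ms-excel', '.xls')
_registry.add_type('application/vnd.ms-powerpoint', '.ppt')


def office_types_for(filename):
    """Return the labels whose pattern matches the type guessed from filename."""
    mime_type, _encoding = _registry.guess_type(filename)
    if mime_type is None:
        # No usable extension, so the filename tells us nothing.
        return []
    return [label for label, pattern in OFFICE_PATTERNS
            if pattern.match(mime_type)]


if __name__ == "__main__":
    for name in ("excel.xls", "powerpoingt.ppt", "word.doc"):
        print(name, office_types_for(name))
```

Because the guess rests on the extension alone, a renamed file is
misclassified. That is why the README applies the filename only when content
detection finds no interface.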