PyMOTW: zipfile

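  • Module: zipfile
  • Purpose: Read and write ZIP archive files.
  • Python version: 1.6+

The zipfile module can be used to work with ZIP archive files.

Limitations

The zipfile module does not support ZIP files with appended comments or multi-disk ZIP files. It does support ZIP files larger than 4 GB that use the ZIP64 extensions.

Testing ZIP Files

The is_zipfile() function returns a boolean indicating whether the given file is a valid ZIP file.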

import zipfile

for filename in [ 'README.txt', 'example.zip', 'bad_example.zip', 'notthere.zip' ]:
    print '%20s %s' % (filename, zipfile.is_zipfile(filename))

Note that if the file does not exist at all, is_zipfile() returns False.

$ python zipfile_is_zipfile.py
README.txt False
example.zip True
bad_example.zip False
notthere.zip False
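
The same check can be tried without the sample files used above. This is a minimal sketch, assuming Python 3; the file names good.zip, plain.txt and missing.zip are hypothetical and are created in a temporary directory.

```python
import os
import tempfile
import zipfile

# Work in a throwaway directory so the sketch does not depend on example.zip.
workdir = tempfile.mkdtemp()
good = os.path.join(workdir, 'good.zip')        # hypothetical file names
plain = os.path.join(workdir, 'plain.txt')
missing = os.path.join(workdir, 'missing.zip')  # deliberately never created

# A real (tiny) archive with one member.
with zipfile.ZipFile(good, 'w') as zf:
    zf.writestr('hello.txt', 'hello')

# An ordinary text file that is not an archive.
with open(plain, 'w') as f:
    f.write('this is not a ZIP archive')

results = {os.path.basename(p): zipfile.is_zipfile(p)
           for p in (good, plain, missing)}
for name in sorted(results):
    print('%20s %s' % (name, results[name]))
```

Only the real archive should report True; the text file and the missing file both report False rather than raising an exception.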

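Reading Metadata from a ZIP Archive

Use the ZipFile class to work directly with a ZIP archive. It supports reading data from existing archives as well as modifying archives by adding additional files.

To read the names of the files in an existing archive, use the namelist() method.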

import zipfile

zf = zipfile.ZipFile('example.zip', 'r')
print zf.namelist()

The return value is a list of strings with the names of the archive contents.

$ python zipfile_namelist.py
['README.txt']

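The list of names is only a small part of the information available from the archive, though. To access all of the metadata about the archive contents, use the infolist() or getinfo() methods.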

import datetime
import zipfile

def print_info(archive_name):
    zf = zipfile.ZipFile(archive_name)
    for info in zf.infolist():
        print info.filename
        print '\tComment:\t', info.comment
        print '\tModified:\t', datetime.datetime(*info.date_time)
        print '\tSystem:\t\t', info.create_system, '(0 = Windows, 3 = Unix)'
        print '\tZIP version:\t', info.create_version
        print '\tCompressed:\t', info.compress_size, 'bytes'
        print '\tUncompressed:\t', info.file_size, 'bytes'
        print

if __name__ == '__main__':
    print_info('example.zip')

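There are additional fields beyond those printed here, but deciphering their values into anything useful requires a careful reading of the PKZIP Application Note that accompanies the ZIP file specification.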

$ python zipfile_infolist.py
README.txt
  Comment:
  Modified: 2007-12-16 10:08:52
  System: 3 (0 = Windows, 3 = Unix)
  ZIP version: 23
  Compressed: 63 bytes
  Uncompressed: 75 bytes

If you know the name of an archive member in advance, you can retrieve its ZipInfo object with the getinfo() method.

import zipfile

zf = zipfile.ZipFile('example.zip')
for filename in [ 'README.txt', 'notthere.txt' ]:
    try:
        info = zf.getinfo(filename)
    except KeyError:
        print 'ERROR: Did not find %s in zip file' % filename
    else:
        print '%s is %d bytes' % (info.filename, info.file_size)

If the archive member is not present, getinfo() raises a KeyError.

$ python zipfile_getinfo.py
README.txt is 75 bytes
ERROR: Did not find notthere.txt in zip file
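
The KeyError handling can be wrapped in a small helper. The sketch below assumes Python 3 and builds its archive in memory; member_size() is a hypothetical helper name, not part of the zipfile module.

```python
import io
import zipfile


def member_size(zf, name):
    """Return the uncompressed size of a member, or None if it is absent."""
    try:
        return zf.getinfo(name).file_size
    except KeyError:
        # getinfo() signals a missing member with KeyError.
        return None


# Build a small archive in memory so the sketch needs no files on disk.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('README.txt', 'x' * 75)

with zipfile.ZipFile(buf) as zf:
    sizes = [member_size(zf, n) for n in ('README.txt', 'notthere.txt')]

print(sizes)
```

Returning None keeps the calling code free of try/except blocks, at the cost of hiding the difference between a missing member and an empty one only if the caller tests truthiness instead of comparing against None.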

Extracting Archived Files from a ZIP Archive

To access the data from an archive member, use the read() method, passing the member's name.

import zipfile

zf = zipfile.ZipFile('example.zip')
for filename in [ 'README.txt', 'notthere.txt' ]:
    try:
        data = zf.read(filename)
    except KeyError:
        print 'ERROR: Did not find %s in zip file' % filename
    else:
        print filename, ':'
        print repr(data)
        print

The data is automatically decompressed for you, if necessary.

$ python zipfile_read.py
README.txt :
'The examples for the zipfile module use this file and example.zip as data.\n'

ERROR: Did not find notthere.txt in zip file
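
To see the automatic decompression at work, the following sketch (assuming Python 3 with zlib available) writes a deflated member into an in-memory archive and reads it back. On Python 3, read() returns bytes, so the text is decoded explicitly.

```python
import io
import zipfile

original = 'zipfile example data\n' * 10

# Write one deflated member into an in-memory archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w', compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr('README.txt', original)

with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo('README.txt')
    data = zf.read('README.txt')    # bytes, already decompressed
    text = data.decode('utf-8')

print('stored in archive:', info.compress_size, 'bytes')
print('returned by read():', len(data), 'bytes')
```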

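Creating New Archives

To create a new archive, simply instantiate ZipFile with a mode of 'w'. Any existing file is truncated and a new archive is started. To add files, use the write() method.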

from zipfile_infolist import print_info
import zipfile

print 'creating archive'
zf = zipfile.ZipFile('zipfile_write.zip', mode='w')
try:
    print 'adding README.txt'
    zf.write('README.txt')
finally:
    print 'closing'
    zf.close()

print
print_info('zipfile_write.zip')

By default, the contents of the archive are not compressed:

$ python zipfile_write.py
creating archive
adding README.txt
closing

README.txt
  Comment:
  Modified: 2007-12-16 10:08:50
  System: 3 (0 = Windows, 3 = Unix)
  ZIP version: 20
  Compressed: 75 bytes
  Uncompressed: 75 bytes

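Compression is provided by the zlib module. If zlib is available, you can set the compression mode for individual files or for the archive as a whole using zipfile.ZIP_DEFLATED. The default compression mode is zipfile.ZIP_STORED.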

from zipfile_infolist import print_info
import zipfile
try:
    import zlib
    compression = zipfile.ZIP_DEFLATED
except ImportError:
    compression = zipfile.ZIP_STORED

modes = { zipfile.ZIP_DEFLATED: 'deflated',
    zipfile.ZIP_STORED: 'stored',
}

print 'creating archive'
zf = zipfile.ZipFile('zipfile_write_compression.zip', mode='w')
try:
    print 'adding README.txt with compression mode', modes[compression]
    zf.write('README.txt', compress_type=compression)
finally:
    print 'closing'
    zf.close()

print
print_info('zipfile_write_compression.zip')

This time the archive member is compressed:

$ python zipfile_write_compression.py
creating archive
adding README.txt with compression mode deflated
closing

README.txt
  Comment:
  Modified: 2007-12-16 10:08:50
  System: 3 (0 = Windows, 3 = Unix)
  ZIP version: 20
  Compressed: 63 bytes
  Uncompressed: 75 bytes
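
The difference between the two modes can be measured directly. This is a rough sketch, assuming Python 3 with zlib available; compressed_size() is a hypothetical helper and the payload is made-up, highly repetitive data.

```python
import io
import zipfile

payload = b'compress me ' * 100     # 1200 bytes of repetitive data


def compressed_size(compress_type):
    """Archive the payload with the given mode and report its stored size."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', compression=compress_type) as zf:
        zf.writestr('payload.txt', payload)
    # Re-open the archive and read the size recorded in its metadata.
    with zipfile.ZipFile(buf) as zf:
        return zf.getinfo('payload.txt').compress_size


stored = compressed_size(zipfile.ZIP_STORED)
deflated = compressed_size(zipfile.ZIP_DEFLATED)
print('stored:  ', stored, 'bytes')
print('deflated:', deflated, 'bytes')
```

With ZIP_STORED the member occupies exactly as many bytes as the input. How much ZIP_DEFLATED saves depends on the data, so real files will not shrink as dramatically as this artificial payload.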

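Using Alternate Archive Member Names

It is easy to add a file to an archive under a name other than its original file name, by passing the arcname argument to write().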

from zipfile_infolist import print_info
import zipfile

zf = zipfile.ZipFile('zipfile_write_arcname.zip', mode='w')
try:
    zf.write('README.txt', arcname='NOT_README.txt')
finally:
    zf.close()
print_info('zipfile_write_arcname.zip')

There is no sign of the original file name in the archive:

$ python zipfile_write_arcname.py
NOT_README.txt
  Comment:
  Modified: 2007-12-16 10:08:50
  System: 3 (0 = Windows, 3 = Unix)
  ZIP version: 20
  Compressed: 75 bytes
  Uncompressed: 75 bytes
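
arcname is also a convenient way to control the directory layout inside the archive. A minimal sketch, assuming Python 3; the paths and the docs/ prefix are hypothetical.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()

# A source file whose on-disk path includes the temporary directory.
source = os.path.join(workdir, 'README.txt')
with open(source, 'w') as f:
    f.write('example data\n')

archive = os.path.join(workdir, 'arcname_demo.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    # Store the file under a name unrelated to its location on disk.
    zf.write(source, arcname='docs/NOT_README.txt')

with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()

print(names)
```

Without arcname, the member name would be derived from the path passed to write(), which would leak the temporary directory name into the archive.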

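Writing Data from Sources Other Than Files

Sometimes it is necessary to write data to a ZIP archive that did not come from an existing file. Rather than writing the data to a file and then adding that file to the ZIP archive, you can use the writestr() method to add a string of bytes to the archive directly.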

from zipfile_infolist import print_info
import zipfile

msg = 'This data did not exist in a file before being added to the ZIP file'
zf = zipfile.ZipFile('zipfile_writestr.zip',
 mode='w',
 compression=zipfile.ZIP_DEFLATED,
)
try:
    zf.writestr('from_string.txt', msg)
finally:
    zf.close()

print_info('zipfile_writestr.zip')

zf = zipfile.ZipFile('zipfile_writestr.zip', 'r')
print zf.read('from_string.txt')

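In this case, I passed the compression argument to ZipFile to compress the data, because writestr() did not accept a compression argument of its own in the Python version these examples were written for.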

$ python zipfile_writestr.py
from_string.txt
  Comment:
  Modified: 2007-12-16 11:38:14
  System: 3 (0 = Windows, 3 = Unix)
  ZIP version: 20
  Compressed: 62 bytes
  Uncompressed: 68 bytes

This data did not exist in a file before being added to the ZIP file
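
Later Python releases (2.7 and 3.x) added a compress_type argument to writestr(), so the mode can be chosen per member. The sketch below assumes Python 3 with zlib available; the member names are hypothetical.

```python
import io
import zipfile

msg = 'This data did not exist in a file before being added to the ZIP file'

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w', compression=zipfile.ZIP_STORED) as zf:
    # Uses the archive-wide default (stored, no compression).
    zf.writestr('archive_default.txt', msg)
    # Overrides the default for this member only.
    zf.writestr('deflated.txt', msg, compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(buf) as zf:
    types = {info.filename: info.compress_type for info in zf.infolist()}
    roundtrip = zf.read('deflated.txt').decode('utf-8')

print(types)
print(roundtrip)
```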

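Writing with a ZipInfo Instance

Normally, the modification date is computed for you when you add a file or string to the archive. When using writestr(), you can instead pass a ZipInfo instance to define the modification date and other metadata yourself.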

import time
import zipfile
from zipfile_infolist import print_info

msg = 'This data did not exist in a file before being added to the ZIP file'
zf = zipfile.ZipFile('zipfile_writestr_zipinfo.zip', mode='w',)
try:
    info = zipfile.ZipInfo('from_string.txt', date_time=time.localtime(time.time()),)
    info.compress_type=zipfile.ZIP_DEFLATED
    info.comment='Remarks go here'
    info.create_system=0
    zf.writestr(info, msg)
finally:
    zf.close()

print_info('zipfile_writestr_zipinfo.zip')

In this example, I set the modified time to the current time, compress the data, and set create_system to 0. I also attach a comment to the new file.

$ python zipfile_writestr_zipinfo.py
from_string.txt
  Comment: Remarks go here
  Modified: 2007-12-16 11:44:14
  System: 0 (0 = Windows, 3 = Unix)
  ZIP version: 20
  Compressed: 62 bytes
  Uncompressed: 68 bytes
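
To confirm that the custom metadata survives a round trip, the values can be read back with getinfo(). This sketch assumes Python 3, where the member comment must be bytes, and uses a fixed timestamp (taken from the output above) instead of the current time so the result is predictable.

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    # date_time is a (year, month, day, hour, minute, second) tuple.
    info = zipfile.ZipInfo('from_string.txt',
                           date_time=(2007, 12, 16, 11, 44, 14))
    info.compress_type = zipfile.ZIP_DEFLATED
    info.comment = b'Remarks go here'   # bytes on Python 3
    info.create_system = 0
    zf.writestr(info, 'This data did not exist in a file')

with zipfile.ZipFile(buf) as zf:
    back = zf.getinfo('from_string.txt')

print(back.date_time, back.comment, back.create_system)
```

The ZIP format stores times with a two-second resolution, so an odd number of seconds would not come back unchanged.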

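Appending to Files

In addition to creating new archives, it is possible to append to an existing archive, or to add an archive at the end of an existing file (such as an .exe file for a self-extracting archive). To open a file for appending, use mode 'a'.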

from zipfile_infolist import print_info
import zipfile

print 'creating archive'
zf = zipfile.ZipFile('zipfile_append.zip', mode='w')
try:
    zf.write('README.txt')
finally:
    zf.close()

print
print_info('zipfile_append.zip')

print 'appending to the archive'
zf = zipfile.ZipFile('zipfile_append.zip', mode='a')
try:
    zf.write('README.txt', arcname='README2.txt')
finally:
    zf.close()

print
print_info('zipfile_append.zip')

The resulting archive ends up with 2 members:

$ python zipfile_append.py
creating archive

README.txt
  Comment:
  Modified: 2007-12-16 10:08:50
  System: 3 (0 = Windows, 3 = Unix)
  ZIP version: 20
  Compressed: 75 bytes
  Uncompressed: 75 bytes

appending to the archive

README.txt
  Comment:
  Modified: 2007-12-16 10:08:50
  System: 3 (0 = Windows, 3 = Unix)
  ZIP version: 20
  Compressed: 75 bytes
  Uncompressed: 75 bytes

README2.txt
  Comment:
  Modified: 2007-12-16 10:08:50
  System: 3 (0 = Windows, 3 = Unix)
  ZIP version: 20
  Compressed: 75 bytes
  Uncompressed: 75 bytes
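
Both uses of mode 'a' can be sketched together: adding a member to an existing archive, and attaching a new archive to the end of a file that is not a ZIP file. This assumes Python 3; stub.bin is a hypothetical stand-in for something like a self-extractor executable.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()

# Case 1: append a second member to an existing archive.
archive = os.path.join(workdir, 'append_demo.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('README.txt', 'first member')
with zipfile.ZipFile(archive, 'a') as zf:
    zf.writestr('README2.txt', 'second member')
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()

# Case 2: attach a new archive to the end of a non-ZIP file.
stub = os.path.join(workdir, 'stub.bin')
with open(stub, 'wb') as f:
    f.write(b'pretend this is an executable stub\n')
with zipfile.ZipFile(stub, 'a') as zf:
    zf.writestr('payload.txt', 'carried after the stub')
with zipfile.ZipFile(stub) as zf:
    payload = zf.read('payload.txt')
with open(stub, 'rb') as f:
    starts_with_stub = f.read().startswith(b'pretend')

print(names)
print(payload)
print(starts_with_stub)
```

In the second case the original bytes stay at the front of the file and the archive follows them, which is the layout a self-extracting archive relies on.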

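Python ZIP Archives

Since version 2.3, Python has been able to import modules from inside ZIP archives, provided those archives appear in sys.path. The zipfile.PyZipFile class can be used to construct an archive suitable for this use. When you call its extra method writepy(), PyZipFile scans a directory for .py files and adds the corresponding .pyo or .pyc file to the archive. If neither compiled form exists, a .pyc file is created and added.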

import sys
import zipfile

if __name__ == '__main__':
    # Package the current directory into the archive, compiling the .py scripts along the way.
    zf = zipfile.PyZipFile('zipfile_pyzipfile.zip', mode='w')
    try:
        zf.debug = 3
        print 'Adding python files'
        zf.writepy('.')
    finally:
        zf.close()
    for name in zf.namelist():
        print name

    print
    sys.path.insert(0, 'zipfile_pyzipfile.zip')
    import zipfile_pyzipfile
    print 'Imported from:', zipfile_pyzipfile.__file__

With the debug attribute of the PyZipFile set to 3, verbose debugging is enabled and you can watch as it compiles each .py file it finds.

$ python zipfile_pyzipfile.py
Adding python files
Adding package in . as .
Compiling ./__init__.py
Adding ./__init__.pyc
Compiling ./zipfile_append.py
Adding ./zipfile_append.pyc
Compiling ./zipfile_getinfo.py
Adding ./zipfile_getinfo.pyc
Compiling ./zipfile_infolist.py
Adding ./zipfile_infolist.pyc
Compiling ./zipfile_is_zipfile.py
Adding ./zipfile_is_zipfile.pyc
Compiling ./zipfile_namelist.py
Adding ./zipfile_namelist.pyc
Compiling ./zipfile_printdir.py
Adding ./zipfile_printdir.pyc
Compiling ./zipfile_pyzipfile.py
Adding ./zipfile_pyzipfile.pyc
Compiling ./zipfile_read.py
Adding ./zipfile_read.pyc
Compiling ./zipfile_write.py
Adding ./zipfile_write.pyc
Compiling ./zipfile_write_arcname.py
Adding ./zipfile_write_arcname.pyc
Compiling ./zipfile_write_compression.py
Adding ./zipfile_write_compression.pyc
Compiling ./zipfile_writestr.py
Adding ./zipfile_writestr.pyc
Compiling ./zipfile_writestr_zipinfo.py
Adding ./zipfile_writestr_zipinfo.pyc
__init__.pyc
zipfile_append.pyc
zipfile_getinfo.pyc
zipfile_infolist.pyc
zipfile_is_zipfile.pyc
zipfile_namelist.pyc
zipfile_printdir.pyc
zipfile_pyzipfile.pyc
zipfile_read.pyc
zipfile_write.pyc
zipfile_write_arcname.pyc
zipfile_write_compression.pyc
zipfile_writestr.pyc
zipfile_writestr_zipinfo.pyc

Imported from: zipfile_pyzipfile.zip/zipfile_pyzipfile.pyc
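
A smaller variation on the same idea, assuming Python 3: writepy() is given a single .py file instead of a directory, and the resulting archive is placed on sys.path. The module name zipdemo_greeting and its contents are hypothetical.

```python
import importlib
import os
import sys
import tempfile
import zipfile

workdir = tempfile.mkdtemp()

# A tiny module to package. Its directory is never added to sys.path,
# so the only way to import it is through the archive.
source = os.path.join(workdir, 'zipdemo_greeting.py')
with open(source, 'w') as f:
    f.write('MESSAGE = "imported from a ZIP archive"\n')

archive = os.path.join(workdir, 'modules_demo.zip')
with zipfile.PyZipFile(archive, mode='w') as zf:
    zf.writepy(source)          # compiles the file and adds the .pyc
    names = zf.namelist()

sys.path.insert(0, archive)
importlib.invalidate_caches()
module = importlib.import_module('zipdemo_greeting')

print(names)
print(module.MESSAGE)
print('Imported from:', module.__file__)
```

As in the example above, the reported __file__ points inside the archive rather than at the original source file.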