Python ExifTool combines Subject and Keywords tags

exiftool pdf python-2.7

Wondercricket · Tech Community · 10 years ago

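I have a Python script that uses exiftool to update the metadata in a given PDF. The documentation and download for the exiftool wrapper can be found here: PyExifTool

Here is my current code: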

if __name__ == '__main__':
    import os
    from exif_tool import ExifTool, fsencode

    source_file = 'D:\\my_file.pdf'
    author = 'some author'
    keywords = 'some keywords'
    subject = 'some subject'
    title = 'some title'   

    with ExifTool('exiftool.exe') as et:
        params = map(fsencode, ['-Title=%s' % title,
                                '-Author=%s' % author,
                                '-Creator=%s' % author,
                                '-Subject=%s' % subject,
                                '-Keywords=%s' % keywords,
                                '%s' % source_file])

        et.execute(*params)
        os.remove('%s_original' % source_file)

        for key, value in dict(et.get_metadata(source_file)).items():
            if key.startswith('PDF:') and ('Author' in key or 'Keywords' in key or 'Title' in key or 'Subject' in key):
                print key, value


>>> PDF:Keywords [u'some', u'keywords']
>>> PDF:Title some title
>>> PDF:Subject some subject
>>> PDF:Author some author

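The code above works and updates the PDF metadata accordingly. However, when I view the PDF metadata in Adobe Acrobat or Adobe Reader, the Subject and Keywords values both show up in the Keywords field.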
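One way to narrow this down is to check which metadata groups hold each tag, since an unqualified argument such as `-Subject=...` may be written to more than one group. The following is a minimal sketch, assuming the dict returned by `get_metadata` uses `Group:Tag` keys as in the output above. The helper name `groups_by_tag` and the `sample` dict are hypothetical and only illustrate the shape of the data; they are not real output from my file.

```python
from collections import defaultdict


def groups_by_tag(metadata, wanted=('Subject', 'Keywords')):
    """Map each wanted tag name to the sorted list of groups that store it."""
    found = defaultdict(list)
    for key in metadata:
        # Keys look like 'PDF:Subject'; skip keys without a group prefix.
        group, sep, name = key.partition(':')
        if sep and name in wanted:
            found[name].append(group)
    return dict((name, sorted(groups)) for name, groups in found.items())


# Hypothetical metadata, shaped like the dict returned by et.get_metadata().
sample = {
    'PDF:Subject': 'some subject',
    'XMP:Subject': 'some subject',
    'PDF:Keywords': ['some', 'keywords'],
    'PDF:Title': 'some title',
}

print(groups_by_tag(sample))
```

If a tag such as Subject turns up under both `PDF` and `XMP`, the duplicate copy is a plausible candidate for what the Adobe viewers are picking up.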

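Overall, this is not a critical issue in most cases, but I can foresee getting a lot of complaints about it.

I am probably just missing some small configuration or setting, but I have read through the documentation and have not found anything that addresses it.

Does anyone have any ideas?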

1 reply | up to 10 years ago

Wondercricket · 10 years ago

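I found a solution, and this is what I came up with. To keep Subject and Keywords from being combined in the Keywords field, the tags need to be written explicitly to the PDF group with ExifTool, and the XMP tags of the same name need to be cleared: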

params = map(fsencode, ['-PDF:Subject=%s' % subject,
                        '-XMP:Subject=',
                        '-PDF:Title=%s' % title,
                        '-XMP:Title=',
                        '-PDF:Author=%s' % author,
                        '-XMP:Author=',
                        '-PDF:Keywords=%s' % keywords,
                        '-XMP:Keywords=',
                        '-overwrite_original',
                        '%s' % source_file])
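
The same argument list can be generated instead of written out by hand. This is a minimal sketch of that pattern, not part of PyExifTool: `build_pdf_only_args` is a hypothetical helper name, and it assumes every tag should be set in the PDF group and removed from the XMP group, exactly as in the parameters above.

```python
def build_pdf_only_args(source_file, tags):
    """Return exiftool arguments that set each tag in the PDF group and
    delete the same-named tag from the XMP group.

    tags is a list of (name, value) pairs so the argument order is stable.
    """
    args = []
    for name, value in tags:
        args.append('-PDF:%s=%s' % (name, value))  # write the PDF group value
        args.append('-XMP:%s=' % name)             # an empty value deletes the XMP tag
    args.append('-overwrite_original')             # do not leave a *_original backup
    args.append(source_file)
    return args


args = build_pdf_only_args('D:\\my_file.pdf', [
    ('Subject', 'some subject'),
    ('Keywords', 'some keywords'),
])
for arg in args:
    print(arg)
```

The resulting list can then be passed through `map(fsencode, args)` and on to `et.execute(*params)` as before. Because `-overwrite_original` is included, the separate `os.remove('%s_original' % source_file)` call should no longer be needed.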