
Best way to check in Python whether two files are identical regardless of newline style

compare file python

Kai Huppmann · 14 years ago

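I tried

filecmp.cmp(file1,file2)

but it doesn't work, because the files are identical except for their newline characters. Is there an option for this in filecmp, or in some other convenient function or library? Or do I have to read both files line by line and compare them?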
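To illustrate the problem, here is a minimal sketch. The file names and contents are made up for the demo, and `compare_demo` is a hypothetical helper, not part of the original question. It writes the same two lines once with LF and once with CRLF line endings and compares them with `filecmp.cmp`:

```python
import filecmp
import os
import tempfile

def compare_demo():
    """Compare two files that differ only in their line endings."""
    with tempfile.TemporaryDirectory() as tmp:
        unix_file = os.path.join(tmp, "unix.txt")  # hypothetical file name
        dos_file = os.path.join(tmp, "dos.txt")    # hypothetical file name
        # Binary mode so that the line endings are written exactly as given
        with open(unix_file, "wb") as f:
            f.write(b"one\ntwo\n")
        with open(dos_file, "wb") as f:
            f.write(b"one\r\ntwo\r\n")
        # shallow=False asks filecmp to look at the contents, not just os.stat()
        return filecmp.cmp(unix_file, dos_file, shallow=False)

print(compare_demo())  # False: the files differ byte for byte
```

filecmp compares raw bytes, so the differing line endings alone make the files unequal.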

3 answers

AndiDog 14 years ago

I think a simple convenience function like this should do the job:

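from itertools import zip_longest

def areFilesIdentical(filename1, filename2):
    # Text mode with the default newline=None translates "\r\n" and "\r"
    # to "\n" (universal newlines), so the Python 2 "U" flag is not needed.
    with open(filename1, "rt") as a, open(filename2, "rt") as b:
        # Note that "all" and "zip_longest" are lazy
        # (will stop at the first line that's not identical).
        # zip_longest pads the shorter file with None, so files with a
        # different number of lines are reported as not identical.
        return all(lineA == lineB
                   for lineA, lineB in zip_longest(a, b))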

Community CDub 5 years ago

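The difflib module provides classes and functions for comparing sequences.

For your needs, the difflib.Differ class looks interesting.

class difflib.Differ

This is a class for comparing sequences of lines of text, and producing human-readable differences or deltas. Differ uses SequenceMatcher both to compare sequences of lines, and to compare sequences of characters within similar (near-matching) lines.

See the differ example, which compares two texts. The sequences to compare can also be obtained from the readlines() method of file-like objects.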
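As a rough sketch of how this could be applied to the question, the following reads both files in text mode (where Python 3 translates all line endings to "\n" by default) and feeds the result of readlines() to difflib.Differ. The function name `diff_ignoring_newline_style` is a hypothetical helper chosen for illustration, not part of difflib:

```python
import difflib

def diff_ignoring_newline_style(path1, path2):
    """Return the Differ lines that mark a difference; [] means identical."""
    # Text mode with the default newline=None maps "\r\n" and "\r" to "\n"
    with open(path1, "rt") as f1, open(path2, "rt") as f2:
        lines1 = f1.readlines()
        lines2 = f2.readlines()
    delta = difflib.Differ().compare(lines1, lines2)
    # Differ prefixes unchanged lines with two spaces; keep everything else
    return [line for line in delta if not line.startswith("  ")]
```

An empty result means the files match apart from their line endings. Differ is meant for human-readable output and loads both files into memory, so for a plain yes/no answer on large files a lazy line-by-line comparison is probably the better fit.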

Anurag Uniyal 14 years ago

It looks like you just need to check whether the files are the same or not, ignoring whitespace/newlines.

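def do_cmp(f1, f2):
    bufsize = 8*1024
    rest1 = rest2 = b""  # stripped bytes not yet matched against the other file
    # "with" closes both files, even on an early return
    with open(f1, 'rb') as fp1, open(f2, 'rb') as fp2:
        while True:
            b1 = fp1.read(bufsize)
            b2 = fp2.read(bufsize)
            rest1 += strip_newlines(b1)
            rest2 += strip_newlines(b2)
            # Compare only the common prefix: after stripping, the two
            # chunks may have different lengths
            n = min(len(rest1), len(rest2))
            if rest1[:n] != rest2[:n]:
                return False
            rest1, rest2 = rest1[n:], rest2[n:]
            # One file is exhausted while the other still has content
            if (not b1 and rest2) or (not b2 and rest1):
                return False
            if not b1 and not b2:
                return True

def strip_newlines(data):
    # The files are opened in binary mode, so work on bytes; drop "\r" as
    # well so that "\r\n" and "\n" line endings compare equal
    return data.replace(b"\r", b"").replace(b"\n", b"")

You can adapt strip_newlines to your requirements, for example to ignore other whitespace as well.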
