How can I quickly compare two files using .NET?

18 answers | most recent 7 years ago

1

103

Community CDub 8 years ago

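A checksum comparison will most likely be slower than a byte-by-byte comparison.

To generate a checksum, you have to load every byte of the file and process it, and then do the same for the second file. That processing will almost certainly be slower than the comparison check itself.

As for generating a checksum: you can do this easily with the cryptography classes. Here's a short example of generating an MD5 checksum with C#.

However, a checksum may be faster and make more sense if you can precompute the checksum of the "test" or "base" case. If you have an existing file and are checking whether a new file is identical to it, precomputing the checksum of the "existing" file means the disk I/O only has to be done once, on the new file. That would likely be faster than a byte-by-byte comparison.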
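A minimal sketch of the precomputed-checksum idea, assuming SHA-256 from System.Security.Cryptography as the checksum (the answer mentions MD5; any hash is used the same way). The names ChecksumCompare, ComputeChecksum and MatchesStoredChecksum are hypothetical and only serve the illustration.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class ChecksumCompare
{
    // Hash a file once; the result can be stored next to the "existing" file.
    public static byte[] ComputeChecksum(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return sha.ComputeHash(stream);
        }
    }

    // Only the new file is read from disk; the stored checksum is reused.
    public static bool MatchesStoredChecksum(string newFilePath, byte[] storedChecksum)
    {
        return ComputeChecksum(newFilePath).SequenceEqual(storedChecksum);
    }
}
```

Here the existing file is hashed once up front, and each later check costs a single pass over the new file only.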

2

117

Andrew Arnott 7 years ago

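The slowest possible method is to compare two files byte by byte. The fastest I've been able to come up with is a similar comparison, but instead of one byte at a time, you use an array of bytes sized to an Int64 and then compare the resulting numbers.

Here's what I came up with: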

    const int BYTES_TO_READ = sizeof(Int64);

    static bool FilesAreEqual(FileInfo first, FileInfo second)
    {
        if (first.Length != second.Length)
            return false;

        if (string.Equals(first.FullName, second.FullName, StringComparison.OrdinalIgnoreCase))
            return true;

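        // long rather than int: an int iteration count overflows for files larger than about 16 GB.
        long iterations = (long)Math.Ceiling((double)first.Length / BYTES_TO_READ);

        using (FileStream fs1 = first.OpenRead())
        using (FileStream fs2 = second.OpenRead())
        {
            byte[] one = new byte[BYTES_TO_READ];
            byte[] two = new byte[BYTES_TO_READ];

            for (long i = 0; i < iterations; i++)
            {
                // The return value of Read is ignored, which assumes every call
                // fills the buffer (apart from the final, shorter read).
                fs1.Read(one, 0, BYTES_TO_READ);
                fs2.Read(two, 0, BYTES_TO_READ);

                if (BitConverter.ToInt64(one,0) != BitConverter.ToInt64(two,0))
                    return false;
            }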
        }

        return true;
    }

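In my testing, I saw this outperform a straightforward ReadByte() scenario by almost 3:1. Averaged over 1000 runs, this method came in at 1063ms, and the method below (straightforward byte-by-byte comparison) at 3031ms. Hashing always came back sub-second, at an average of around 865ms. This testing was done with a video file of roughly 100MB.

Here are the ReadByte and hashing methods I used for comparison: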

    static bool FilesAreEqual_OneByte(FileInfo first, FileInfo second)
    {
        if (first.Length != second.Length)
            return false;

        if (string.Equals(first.FullName, second.FullName, StringComparison.OrdinalIgnoreCase))
            return true;

        using (FileStream fs1 = first.OpenRead())
        using (FileStream fs2 = second.OpenRead())
        {
            for (long i = 0; i < first.Length; i++) // long: files can exceed int.MaxValue bytes
            {
                if (fs1.ReadByte() != fs2.ReadByte())
                    return false;
            }
        }

        return true;
    }

    static bool FilesAreEqual_Hash(FileInfo first, FileInfo second)
    {
        // Dispose the hash object and both streams when done.
        using (MD5 md5 = MD5.Create())
        using (FileStream fs1 = first.OpenRead())
        using (FileStream fs2 = second.OpenRead())
        {
            byte[] firstHash = md5.ComputeHash(fs1);
            byte[] secondHash = md5.ComputeHash(fs2);

            for (int i = 0; i < firstHash.Length; i++)
            {
                if (firstHash[i] != secondHash[i])
                    return false;
            }
        }
        return true;
    }

3

33

dtb 16 years ago

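In addition to Reed Copsey's answer:

The worst case is when the two files are identical. In that case, it's best to compare the files byte by byte.
If the two files are not identical, you can speed things up by detecting sooner that they differ.

For example, if the two files have different lengths, you know they cannot be identical, and you don't even have to compare their actual content.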

4

31

Glenn Slayden 7 years ago

If you do decide you truly need a full byte-by-byte comparison (see other answers for discussion of hashing), then the one-line solution is:

bool filesAreEqual = File.ReadAllBytes(path1).SequenceEqual(File.ReadAllBytes(path2));

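Unlike some other posted answers, this works for any kind of file: binary, text, media, executable, etc. But because it is a full binary comparison, files that differ only in "unimportant" ways (such as BOM, line endings, character encoding, media metadata, whitespace, padding, source code comments, etc.) will always be considered not equal.

This code loads both files entirely into memory, so it should not be used to compare truly huge files. Beyond that, full loading is not really a penalty; in fact, this could be an optimal .NET solution for files expected to be smaller than 85K, because small allocations are very cheap in .NET and the code maximally delegates file performance and optimization to the CLR/BCL.

Furthermore, for such workaday scenarios, concerns about the performance of byte-by-byte comparison via LINQ enumerators (as shown here) are moot, since hitting the disk at all for file I/O dwarfs, by several orders of magnitude, the benefit of the various in-memory comparison alternatives. For example, even though SequenceEqual does in fact give us the "optimization" of abandoning on the first mismatch, this hardly matters once the files' contents have already been fetched, each of which is fully needed to confirm a match.

On the other hand, the code above does not include an eager abort for files of different sizes, which can make a tangible (possibly measurable) performance difference. It is tangible because, whereas the file length is available in the WIN32_FILE_ATTRIBUTE_DATA structure (which must be fetched first anyway for any file access), going on to access the file's contents requires an entirely different fetch, which can potentially be avoided. If you are concerned about this, the solution becomes two lines: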

// slight optimization over the code shown above
bool filesAreEqual = new FileInfo(path1).Length == new FileInfo(path2).Length && 
       File.ReadAllBytes(path1).SequenceEqual(File.ReadAllBytes(path2));

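You could also extend this to avoid the secondary fetches if the (equal) Length values are both found to be zero (not shown), and/or to avoid building each FileInfo twice (also not shown).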
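One possible reading of those two "not shown" extensions, as a hedged sketch rather than the answer author's own code: each length is queried once, and two empty files are reported equal without touching their contents. WholeFileCompare and FilesAreEqual are hypothetical names.

```csharp
using System.IO;
using System.Linq;

public static class WholeFileCompare
{
    public static bool FilesAreEqual(string path1, string path2)
    {
        // Query each length exactly once.
        long length1 = new FileInfo(path1).Length;
        long length2 = new FileInfo(path2).Length;

        if (length1 != length2)
            return false;   // different sizes: contents are never read

        if (length1 == 0)
            return true;    // both empty: nothing left to fetch

        return File.ReadAllBytes(path1).SequenceEqual(File.ReadAllBytes(path2));
    }
}
```

As with the one-liner, this still loads both files fully into memory once the lengths match, so the same caveat about very large files applies.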

5

15

Lars 9 years ago

If you read larger chunks in a loop instead of small 8-byte chunks, it gets faster. I cut the average comparison time to 1/4.

    public static bool FilesContentsAreEqual(FileInfo fileInfo1, FileInfo fileInfo2)
    {
        bool result;

        if (fileInfo1.Length != fileInfo2.Length)
        {
            result = false;
        }
        else
        {
            using (var file1 = fileInfo1.OpenRead())
            {
                using (var file2 = fileInfo2.OpenRead())
                {
                    result = StreamsContentsAreEqual(file1, file2);
                }
            }
        }

        return result;
    }

    private static bool StreamsContentsAreEqual(Stream stream1, Stream stream2)
    {
        const int bufferSize = 1024 * sizeof(Int64);
        var buffer1 = new byte[bufferSize];
        var buffer2 = new byte[bufferSize];

        while (true)
        {
            int count1 = stream1.Read(buffer1, 0, bufferSize);
            int count2 = stream2.Read(buffer2, 0, bufferSize);

            if (count1 != count2)
            {
                return false;
            }

            if (count1 == 0)
            {
                return true;
            }

            int iterations = (int)Math.Ceiling((double)count1 / sizeof(Int64));
            for (int i = 0; i < iterations; i++)
            {
                if (BitConverter.ToInt64(buffer1, i * sizeof(Int64)) != BitConverter.ToInt64(buffer2, i * sizeof(Int64)))
                {
                    return false;
                }
            }
        }
    }

6

15

Glenn Slayden 8 years ago

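The only thing that might make a checksum comparison slightly faster than a byte-by-byte comparison is that you read one file at a time, which somewhat reduces the seek time of the disk head. That slight gain may well be eaten up by the added time of calculating the hash, however.

Also, a checksum comparison of course only has a chance of being faster if the files are identical. If they are not, a byte-by-byte comparison ends at the first difference, which makes it much faster.

You should also consider that a hash code comparison only tells you that the files are very likely identical. To be 100% certain you need to do a byte-by-byte comparison.

If the hash code is 32 bits, for example, you are about 99.99999998% certain that the files are identical if the hash codes match. That is close to 100%, but if you truly need 100% certainty, that's not it.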
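The 32-bit figure can be reproduced with a small calculation, under the assumption of an idealized hash whose values are uniformly distributed, so that two different files collide with probability 2^-32. HashCertainty and NoCollisionProbability are hypothetical names.

```csharp
using System;

public static class HashCertainty
{
    // Probability that two different inputs do NOT collide under an
    // idealized, uniformly distributed hash of the given bit width.
    public static double NoCollisionProbability(int bits)
    {
        return 1.0 - Math.Pow(2.0, -bits);
    }
}
```

HashCertainty.NoCollisionProbability(32) evaluates to roughly 0.9999999998, which is the "about 99.99999998%" quoted above. Real hash functions only approximate the uniform assumption.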

7

10

Sam Harwell 16 years ago

Edit: This method will not work for comparing binary files!

In .NET 4.0, the File class has the following two new methods:

public static IEnumerable<string> ReadLines(string path)
public static IEnumerable<string> ReadLines(string path, Encoding encoding)

Which means you can use:

bool same = File.ReadLines(path1).SequenceEqual(File.ReadLines(path2));

8

6

RandomInsano 10 years ago

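Honestly, I think you need to prune your search tree as much as possible.

Things to check before going byte by byte:

Are the sizes the same?
Is the last byte in file A different from the last byte in file B?

Also, reading large blocks at a time is more efficient, since drives read sequential bytes faster. Going byte by byte not only causes far more system calls, it also makes the read head of a traditional hard drive seek back and forth more often if both files are on the same drive.

Read chunk A and chunk B into byte buffers and compare them (do NOT use Array.Equals, see comments). Tune the size of the blocks until you hit what you feel is a good trade-off between memory and performance. You could also multi-thread the comparison, but don't multi-thread the disk reads.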
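The two cheap pre-checks listed above could be sketched as follows. This is an illustration only; QuickChecks and DefinitelyDifferent are hypothetical names, and a result of false means "not yet known", so the chunked content comparison is still needed afterwards.

```csharp
using System.IO;

public static class QuickChecks
{
    // Returns true when the files are certainly different; false means
    // "unknown", and a full content comparison is still required.
    public static bool DefinitelyDifferent(string pathA, string pathB)
    {
        using (var a = File.OpenRead(pathA))
        using (var b = File.OpenRead(pathB))
        {
            if (a.Length != b.Length)
                return true;
            if (a.Length == 0)
                return false;

            // Compare only the final byte of each file.
            a.Seek(-1, SeekOrigin.End);
            b.Seek(-1, SeekOrigin.End);
            return a.ReadByte() != b.ReadByte();
        }
    }
}
```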

9

2

romeok 14 years ago

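My experiments show that it definitely helps to call Stream.ReadByte() fewer times, but using BitConverter to pack bytes does not make much difference compared with comparing the bytes in a byte array.

So it is possible to replace the "Math.Ceiling and iterations" loop in the comment above with the simplest one: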

            for (int i = 0; i < count1; i++)
            {
                if (buffer1[i] != buffer2[i])
                    return false;
            }

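I guess this is because BitConverter.ToInt64 needs to do a bit of work before the comparison (check the arguments, then perform the bit shifting), and that ends up being the same amount of work as comparing 8 bytes in two arrays.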
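As an alternative to hand-writing either loop, the buffer comparison can be delegated to MemoryExtensions.SequenceEqual on spans. This is a sketch under the assumption that Span<T> is available (.NET Core 2.1 or later, or the System.Memory package); it is not part of the answer above, and BufferCompare and PrefixEquals are hypothetical names.

```csharp
using System;

public static class BufferCompare
{
    // Compares only the first `count` bytes of the two buffers.
    public static bool PrefixEquals(byte[] buffer1, byte[] buffer2, int count)
    {
        return buffer1.AsSpan(0, count).SequenceEqual(buffer2.AsSpan(0, count));
    }
}
```

In a read loop this would stand in for the inner for loop, with count set to the number of bytes actually read in that iteration.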

10

2

Andrew Arnott 7 years ago

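My answer is a derivative of @lars's, but fixes the call to Stream.Read. I also added some fast-path checks from other answers, plus input validation. In short, this should be the answer: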

using System;
using System.IO;

namespace ConsoleApp4
{
    class Program
    {
        static void Main(string[] args)
        {
            var fi1 = new FileInfo(args[0]);
            var fi2 = new FileInfo(args[1]);
            Console.WriteLine(FilesContentsAreEqual(fi1, fi2));
        }

        public static bool FilesContentsAreEqual(FileInfo fileInfo1, FileInfo fileInfo2)
        {
            if (fileInfo1 == null)
            {
                throw new ArgumentNullException(nameof(fileInfo1));
            }

            if (fileInfo2 == null)
            {
                throw new ArgumentNullException(nameof(fileInfo2));
            }

            if (string.Equals(fileInfo1.FullName, fileInfo2.FullName, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }

            if (fileInfo1.Length != fileInfo2.Length)
            {
                return false;
            }
            else
            {
                using (var file1 = fileInfo1.OpenRead())
                {
                    using (var file2 = fileInfo2.OpenRead())
                    {
                        return StreamsContentsAreEqual(file1, file2);
                    }
                }
            }
        }

        private static int ReadFullBuffer(Stream stream, byte[] buffer)
        {
            int bytesRead = 0;
            while (bytesRead < buffer.Length)
            {
                int read = stream.Read(buffer, bytesRead, buffer.Length - bytesRead);
                if (read == 0)
                {
                    // Reached end of stream.
                    return bytesRead;
                }

                bytesRead += read;
            }

            return bytesRead;
        }

        private static bool StreamsContentsAreEqual(Stream stream1, Stream stream2)
        {
            const int bufferSize = 1024 * sizeof(Int64);
            var buffer1 = new byte[bufferSize];
            var buffer2 = new byte[bufferSize];

            while (true)
            {
                int count1 = ReadFullBuffer(stream1, buffer1);
                int count2 = ReadFullBuffer(stream2, buffer2);

                if (count1 != count2)
                {
                    return false;
                }

                if (count1 == 0)
                {
                    return true;
                }

                int iterations = (int)Math.Ceiling((double)count1 / sizeof(Int64));
                for (int i = 0; i < iterations; i++)
                {
                    if (BitConverter.ToInt64(buffer1, i * sizeof(Int64)) != BitConverter.ToInt64(buffer2, i * sizeof(Int64)))
                    {
                        return false;
                    }
                }
            }
        }
    }
}

Or, if you want to be super awesome, you can use the async variant:

using System;
using System.IO;
using System.Threading.Tasks;

namespace ConsoleApp4
{
    class Program
    {
        static void Main(string[] args)
        {
            var fi1 = new FileInfo(args[0]);
            var fi2 = new FileInfo(args[1]);
            Console.WriteLine(FilesContentsAreEqualAsync(fi1, fi2).GetAwaiter().GetResult());
        }

        public static async Task<bool> FilesContentsAreEqualAsync(FileInfo fileInfo1, FileInfo fileInfo2)
        {
            if (fileInfo1 == null)
            {
                throw new ArgumentNullException(nameof(fileInfo1));
            }

            if (fileInfo2 == null)
            {
                throw new ArgumentNullException(nameof(fileInfo2));
            }

            if (string.Equals(fileInfo1.FullName, fileInfo2.FullName, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }

            if (fileInfo1.Length != fileInfo2.Length)
            {
                return false;
            }
            else
            {
                using (var file1 = fileInfo1.OpenRead())
                {
                    using (var file2 = fileInfo2.OpenRead())
                    {
                        return await StreamsContentsAreEqualAsync(file1, file2).ConfigureAwait(false);
                    }
                }
            }
        }

        private static async Task<int> ReadFullBufferAsync(Stream stream, byte[] buffer)
        {
            int bytesRead = 0;
            while (bytesRead < buffer.Length)
            {
                int read = await stream.ReadAsync(buffer, bytesRead, buffer.Length - bytesRead).ConfigureAwait(false);
                if (read == 0)
                {
                    // Reached end of stream.
                    return bytesRead;
                }

                bytesRead += read;
            }

            return bytesRead;
        }

        private static async Task<bool> StreamsContentsAreEqualAsync(Stream stream1, Stream stream2)
        {
            const int bufferSize = 1024 * sizeof(Int64);
            var buffer1 = new byte[bufferSize];
            var buffer2 = new byte[bufferSize];

            while (true)
            {
                int count1 = await ReadFullBufferAsync(stream1, buffer1).ConfigureAwait(false);
                int count2 = await ReadFullBufferAsync(stream2, buffer2).ConfigureAwait(false);

                if (count1 != count2)
                {
                    return false;
                }

                if (count1 == 0)
                {
                    return true;
                }

                int iterations = (int)Math.Ceiling((double)count1 / sizeof(Int64));
                for (int i = 0; i < iterations; i++)
                {
                    if (BitConverter.ToInt64(buffer1, i * sizeof(Int64)) != BitConverter.ToInt64(buffer2, i * sizeof(Int64)))
                    {
                        return false;
                    }
                }
            }
        }
    }
}

11

1

Cecil Has a Name 16 years ago

If the files are not too big, you can use:

public static byte[] ComputeFileHash(string fileName)
{
    using (var stream = File.OpenRead(fileName))
        return System.Security.Cryptography.MD5.Create().ComputeHash(stream);
}

Comparing hashes is only worthwhile if the hashes are useful to store.

(Edited the code to be cleaner.)

12

1

Thomas Kjørnes 15 years ago

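Another improvement for large files of identical length might be to not read the files sequentially, but rather to compare more or less random blocks.

You can use multiple threads, starting at different positions in the file and comparing either forward or backward.

This way you can detect changes in the middle or at the end of the file faster than you would with a sequential approach.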
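A simplified, single-threaded sketch of the sampling idea, assuming evenly spaced blocks rather than random ones and no threading. SampledCompare, FillFrom and SamplesDiffer are hypothetical names, and the default sample count and block size are arbitrary. A result of false only means no difference was found in the samples, so a full comparison must still follow.

```csharp
using System;
using System.IO;

public static class SampledCompare
{
    // Fills `count` bytes starting at `offset`, looping because Stream.Read
    // may return fewer bytes than requested.
    private static void FillFrom(Stream stream, long offset, byte[] buffer, int count)
    {
        stream.Seek(offset, SeekOrigin.Begin);
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0)
                throw new EndOfStreamException();
            total += read;
        }
    }

    // Returns true if the lengths differ or any sampled block differs.
    public static bool SamplesDiffer(string pathA, string pathB, int samples = 8, int blockSize = 4096)
    {
        using (var a = File.OpenRead(pathA))
        using (var b = File.OpenRead(pathB))
        {
            if (a.Length != b.Length)
                return true;

            long length = a.Length;
            int count = (int)Math.Min(blockSize, length);
            if (count == 0)
                return false;

            var bufA = new byte[count];
            var bufB = new byte[count];
            long lastStart = length - count;

            for (int s = 0; s < samples; s++)
            {
                // Evenly spaced offsets; the last sample ends at end-of-file.
                long offset = samples > 1 ? lastStart * s / (samples - 1) : 0;
                FillFrom(a, offset, bufA, count);
                FillFrom(b, offset, bufB, count);

                for (int i = 0; i < count; i++)
                {
                    if (bufA[i] != bufB[i])
                        return true;
                }
            }
            return false;
        }
    }
}
```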

13

1

CAFxX 13 years ago

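If you only need to compare two files, I guess the fastest way would be (in C; I don't know whether it applies to .NET):

open both files f1, f2
get the respective file lengths l1, l2
if l1 != l2 the files are different; stop
mmap() both files
use memcmp() on the mmap()ed files

If you need to find whether there are duplicate files in a set of N files, then the fastest way is undoubtedly to use a hash to avoid N-way bit-by-bit comparisons.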
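For the N-files case, one way to use hashing is to group the files by length first and then by hash, so that files with a unique length are never read. This is a sketch that assumes SHA-256 as the hash; DuplicateFinder, HashFile and FindDuplicateGroups are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class DuplicateFinder
{
    private static string HashFile(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return Convert.ToBase64String(sha.ComputeHash(stream));
        }
    }

    // Returns groups of paths whose length and hash both match.
    // Files with a unique length are never hashed.
    public static List<List<string>> FindDuplicateGroups(IEnumerable<string> paths)
    {
        return paths
            .GroupBy(p => new FileInfo(p).Length)
            .Where(sameLength => sameLength.Count() > 1)
            .SelectMany(sameLength => sameLength.GroupBy(p => HashFile(p)))
            .Where(sameHash => sameHash.Count() > 1)
            .Select(sameHash => sameHash.ToList())
            .ToList();
    }
}
```

Each file is read at most once. If certainty is required, the members of a group can still be confirmed with a byte-by-byte comparison.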

14

1

Zar Shardan 9 years ago

Something reasonably efficient:

public class FileCompare
{
    public static bool FilesEqual(string fileName1, string fileName2)
    {
        return FilesEqual(new FileInfo(fileName1), new FileInfo(fileName2));
    }

    /// <summary>
    /// 
    /// </summary>
    /// <param name="file1"></param>
    /// <param name="file2"></param>
    /// <param name="bufferSize">8kb seemed like a good default</param>
    /// <returns></returns>
    public static bool FilesEqual(FileInfo file1, FileInfo file2, int bufferSize = 8192)
    {
        if (!file1.Exists || !file2.Exists || file1.Length != file2.Length) return false;

        var buffer1 = new byte[bufferSize];
        var buffer2 = new byte[bufferSize];

        using (var stream1 = file1.Open(FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            using (var stream2 = file2.Open(FileMode.Open, FileAccess.Read, FileShare.Read))
            {

                while (true)
                {
                    var bytesRead1 = stream1.Read(buffer1, 0, bufferSize);
                    var bytesRead2 = stream2.Read(buffer2, 0, bufferSize);

                    if (bytesRead1 != bytesRead2) return false;
                    if (bytesRead1 == 0) return true;
                    if (!ArraysEqual(buffer1, buffer2, bytesRead1)) return false;
                }
            }
        }
    }

    /// <summary>
    /// 
    /// </summary>
    /// <param name="array1"></param>
    /// <param name="array2"></param>
    /// <param name="bytesToCompare"> 0 means compare entire arrays</param>
    /// <returns></returns>
    public static bool ArraysEqual(byte[] array1, byte[] array2, int bytesToCompare = 0)
    {
        if (array1.Length != array2.Length) return false;

        var length = (bytesToCompare == 0) ? array1.Length : bytesToCompare;
        var tailIdx = length - length % sizeof(Int64);

        //check in 8 byte chunks
        for (var i = 0; i < tailIdx; i += sizeof(Int64))
        {
            if (BitConverter.ToInt64(array1, i) != BitConverter.ToInt64(array2, i)) return false;
        }

        //check the remainder of the array, always shorter than 8 bytes
        for (var i = tailIdx; i < length; i++)
        {
            if (array1[i] != array2[i]) return false;
        }

        return true;
    }
}

15

1

Simon Mourier 9 years ago

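Here are some utility functions that let you determine whether two files (or two streams) contain identical data.

I've provided a "fast" version that is multi-threaded, as it compares the byte arrays (each buffer filled with what has been read from each file) on different threads using Tasks.

As expected, it's faster (around 3x faster), but it consumes more CPU (because it's multi-threaded) and more memory (because it needs two byte array buffers per comparison thread).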

    public static bool AreFilesIdenticalFast(string path1, string path2)
    {
        return AreFilesIdentical(path1, path2, AreStreamsIdenticalFast);
    }

    public static bool AreFilesIdentical(string path1, string path2)
    {
        return AreFilesIdentical(path1, path2, AreStreamsIdentical);
    }

    public static bool AreFilesIdentical(string path1, string path2, Func<Stream, Stream, bool> areStreamsIdentical)
    {
        if (path1 == null)
            throw new ArgumentNullException(nameof(path1));

        if (path2 == null)
            throw new ArgumentNullException(nameof(path2));

        if (areStreamsIdentical == null)
            throw new ArgumentNullException(nameof(areStreamsIdentical));

        if (!File.Exists(path1) || !File.Exists(path2))
            return false;

        using (var thisFile = new FileStream(path1, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            using (var valueFile = new FileStream(path2, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
            {
                if (valueFile.Length != thisFile.Length)
                    return false;

                if (!areStreamsIdentical(thisFile, valueFile))
                    return false;
            }
        }
        return true;
    }

    public static bool AreStreamsIdenticalFast(Stream stream1, Stream stream2)
    {
        if (stream1 == null)
            throw new ArgumentNullException(nameof(stream1));

        if (stream2 == null)
            throw new ArgumentNullException(nameof(stream2));

        const int bufsize = 80000; // 80000 is below LOH (85000)

        var tasks = new List<Task<bool>>();
        do
        {
            // consumes more memory (two buffers for each tasks)
            var buffer1 = new byte[bufsize];
            var buffer2 = new byte[bufsize];

            int read1 = stream1.Read(buffer1, 0, buffer1.Length);
            if (read1 == 0)
            {
                int read3 = stream2.Read(buffer2, 0, 1);
                if (read3 != 0) // not eof
                    return false;

                break;
            }

            // both stream read could return different counts
            int read2 = 0;
            do
            {
                int read3 = stream2.Read(buffer2, read2, read1 - read2);
                if (read3 == 0)
                    return false;

                read2 += read3;
            }
            while (read2 < read1);

            // consumes more cpu
            var task = Task.Run(() =>
            {
                return IsSame(buffer1, buffer2);
            });
            tasks.Add(task);
        }
        while (true);

        Task.WaitAll(tasks.ToArray());
        return !tasks.Any(t => !t.Result);
    }

    public static bool AreStreamsIdentical(Stream stream1, Stream stream2)
    {
        if (stream1 == null)
            throw new ArgumentNullException(nameof(stream1));

        if (stream2 == null)
            throw new ArgumentNullException(nameof(stream2));

        const int bufsize = 80000; // 80000 is below LOH (85000)
        var buffer1 = new byte[bufsize];
        var buffer2 = new byte[bufsize];

        do
        {
            int read1 = stream1.Read(buffer1, 0, buffer1.Length);
            if (read1 == 0)
                return stream2.Read(buffer2, 0, 1) == 0; // check not eof

            // both stream read could return different counts
            int read2 = 0;
            do
            {
                int read3 = stream2.Read(buffer2, read2, read1 - read2);
                if (read3 == 0)
                    return false;

                read2 += read3;
            }
            while (read2 < read1);

            if (!IsSame(buffer1, buffer2))
                return false;
        }
        while (true);
    }

    public static bool IsSame(byte[] bytes1, byte[] bytes2)
    {
        if (bytes1 == null)
            throw new ArgumentNullException(nameof(bytes1));

        if (bytes2 == null)
            throw new ArgumentNullException(nameof(bytes2));

        if (bytes1.Length != bytes2.Length)
            return false;

        for (int i = 0; i < bytes1.Length; i++)
        {
            if (bytes1[i] != bytes2[i])
                return false;
        }
        return true;
    }

16

0

antonio 9 years ago

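I think there are applications where a "hash" is faster than comparing byte by byte, for example if you need to compare one file against several others, or keep a thumbnail of a photo that can change. It depends on where and how it is used.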

private bool CompareFilesByte(string file1, string file2)
{
    using (var fs1 = new FileStream(file1, FileMode.Open))
    using (var fs2 = new FileStream(file2, FileMode.Open))
    {
        if (fs1.Length != fs2.Length) return false;
        int b1, b2;
        do
        {
            b1 = fs1.ReadByte();
            b2 = fs2.ReadByte();
            if (b1 != b2) return false; // both streams hit end-of-file together (-1 == -1), so the loop then exits and returns true
        }
        while (b1 >= 0);
    }
    return true;
}

private string HashFile(string file)
{
    using (var fs = new FileStream(file, FileMode.Open))
    using (var hash = SHA512.Create())
    {
        // Hash the whole stream. Note that file.Length would be the
        // length of the path string, not the size of the file.
        return Convert.ToBase64String(hash.ComputeHash(fs));
    }
}

private bool CompareFilesWithHash(string file1, string file2)
{
    var str1 = HashFile(file1);
    var str2 = HashFile(file2);
    return str1 == str2;
}

Here you can measure which one is fastest.

var sw = new Stopwatch();
sw.Start();
var compare1 = CompareFilesWithHash(receiveLogPath, logPath);
sw.Stop();
Debug.WriteLine(string.Format("Compare using Hash {0}", sw.ElapsedTicks));
sw.Reset();
sw.Start();
var compare2 = CompareFilesByte(receiveLogPath, logPath);
sw.Stop();
Debug.WriteLine(string.Format("Compare byte-byte {0}", sw.ElapsedTicks));

Optionally, we can save the hash in a database.

Hope this helps.

17

0

Andrew Taylor 7 years ago

Yet another answer, derived from @chsh: MD5 with usings and shortcuts for same file, file does not exist, and differing lengths:

/// <summary>
/// Performs an md5 on the content of both files and returns true if
/// they match
/// </summary>
/// <param name="file1">first file</param>
/// <param name="file2">second file</param>
/// <returns>true if the contents of the two files is the same, false otherwise</returns>
public static bool IsSameContent(string file1, string file2)
{
    if (file1 == file2)
        return true;

    FileInfo file1Info = new FileInfo(file1);
    FileInfo file2Info = new FileInfo(file2);

    if (!file1Info.Exists && !file2Info.Exists)
       return true;
    if (!file1Info.Exists && file2Info.Exists)
        return false;
    if (file1Info.Exists && !file2Info.Exists)
        return false;
    if (file1Info.Length != file2Info.Length)
        return false;

    using (FileStream file1Stream = file1Info.OpenRead())
    using (FileStream file2Stream = file2Info.OpenRead())
    { 
        byte[] firstHash = MD5.Create().ComputeHash(file1Stream);
        byte[] secondHash = MD5.Create().ComputeHash(file2Stream);
        for (int i = 0; i < firstHash.Length; i++)
        {
            if (i>=secondHash.Length||firstHash[i] != secondHash[i])
                return false;
        }
        return true;
    }
}

18

-1

Jay Byford-Rew 8 years ago

I find this works well: first compare the lengths without reading any data, then compare the byte sequences that were read:

private static bool IsFileIdentical(string a, string b)
{            
   if (new FileInfo(a).Length != new FileInfo(b).Length) return false;
   return (File.ReadAllBytes(a).SequenceEqual(File.ReadAllBytes(b)));
}