
Compare text files using two given column names as input

  •  0
  • MAK  · Tech Community  · 7 years ago

    I have two text files whose columns appear in arbitrary order.

    File 1: file1.txt

    ID|Name|Number|Date
    1|John|991122|23-12-2017
    2|Smith|889911|24-12-2017
    3|Mak|776532|25-12-2017
    

    File 2: file2.txt

    Number|ID|Date|Name
    991122|1|23-Dec-2017|John
    889911|2|24-Dec-2017|Smith
    776532|3|25-Dec-2017|Mak
    987654|4|26-Dec-2017|Joseph
    765551|5|27-Dec-2017|William
    

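    I want to compare file1 and file2 based on two specified columns and store the result as a .txt file.

    Expected output.txt, based on the specified columns ID and Date: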

    Number|ID|Date|Name
    987654|4|26-Dec-2017|Joseph
    765551|5|27-Dec-2017|William
    

    Note: the Date column may have a different (unknown) format in either file.

    What I tried:

    # Raw strings keep the backslashes in Windows paths from being read as escapes
    file1 = r'E:\Python\File Comparison Files\File1.txt'
    file2 = r'E:\Python\File Comparison Files\File2.txt'
    file3 = r'E:\Python\File Comparison Files\outputfile.txt'
    
    with open(file1) as b:
        first_line_b = b.readline()
        print 'File1 Columns:', first_line_b
    
    file1Column1 = raw_input('Enter File1 column1 name to compare:')
    file1Column2 = raw_input('Enter File1 column2 name to compare:')
    
    
    with open(file2) as a:
        first_line_a = a.readline()
        print '\nFile2 Columns:', first_line_a
    
    file2Column1 = raw_input('Enter File2 column1 name to compare:')
    file2Column2 = raw_input('Enter File2 column2 name to compare:')
    
    # The following compares whole lines, not the specified columns
    with open(file1) as b:
        blines = set(b)
    with open(file2) as a:
        first_line = a.readline()
        with open(file3, 'w') as result:
            result.write(first_line)
            for line in a:
                if line not in blines:
                    result.write(line)
    

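    The code above compares complete lines rather than the specified columns/fields. I want to compare based on two columns chosen from each file and store the result in a third file.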

    1 answer  |  up to 7 years ago
  •  0
  •   Sunitha    7 years ago

    You can use csv.DictReader and csv.DictWriter. The example here compares on a single column:

    import csv
    
    file1, file2, file3 = 'file1.txt', 'file2.txt', 'file3.txt'
    col_name = raw_input('Enter File1 column1 name to compare:')
    
    uids = set()
    with open(file1) as fo1:
        for row in csv.DictReader(fo1, delimiter='|'):
            uids.add(row[col_name])
    
    with open(file2) as fo2:
        with open(file3, 'w') as fo3:
            reader = csv.DictReader(fo2, delimiter='|')
            writer = csv.DictWriter(fo3, delimiter='|', fieldnames=reader.fieldnames)
            writer.writeheader()
            for row in reader:
                if row[col_name] in uids:
                    continue
                writer.writerow(row)
    

    file3.txt

    Number|ID|Date|Name
    987654|4|26-Dec-2017|Joseph
    765551|5|27-Dec-2017|William
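
    The question asks for a comparison on two columns, with dates that may be formatted differently in each file. One way this might be done is sketched below, assuming Python 3. The names `DATE_FORMATS`, `normalize`, `row_key` and `write_unmatched` are hypothetical helpers introduced for illustration, and the two date layouts are only guesses based on the sample data. Each row is reduced to a tuple of normalized values from the chosen columns, and rows of the second file are kept only when their tuple does not occur in the first file.

```python
import csv
from datetime import datetime

# Assumed candidate date layouts, tried in order; extend for other formats.
# Note: %b matches month names in the current locale (English by default).
DATE_FORMATS = ("%d-%m-%Y", "%d-%b-%Y")


def normalize(value):
    """Return an ISO date string if the value parses as a date, else the stripped text."""
    text = (value or "").strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # not this layout, try the next one
    return text


def row_key(row, columns):
    """Build a comparable tuple from the chosen columns of one row."""
    return tuple(normalize(row[col]) for col in columns)


def write_unmatched(path1, path2, out_path, cols1, cols2, delimiter="|"):
    """Write rows of path2 whose key (cols2) does not occur in path1 (cols1).

    Returns the number of data rows written.
    """
    with open(path1, newline="") as f1:
        seen = {row_key(row, cols1)
                for row in csv.DictReader(f1, delimiter=delimiter)}

    kept = 0
    with open(path2, newline="") as f2, open(out_path, "w", newline="") as out:
        reader = csv.DictReader(f2, delimiter=delimiter)
        writer = csv.DictWriter(out, delimiter=delimiter,
                                fieldnames=reader.fieldnames,
                                lineterminator="\n")
        writer.writeheader()
        for row in reader:
            if row_key(row, cols2) not in seen:
                writer.writerow(row)
                kept += 1
    return kept
```

    A call such as `write_unmatched('file1.txt', 'file2.txt', 'output.txt', ['ID', 'Date'], ['ID', 'Date'])` keeps only the file2 rows whose ID and Date pair has no counterpart in file1. The column lists are passed separately per file because the question prompts for column names in each file. Truly unknown date formats cannot be resolved this way: a value like 01-02-2017 is ambiguous between day-first and month-first, so the order of `DATE_FORMATS` decides how it is read.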