Why doesn't fuzzywuzzy's process.extractBests give a 100% score when the tested string 100% contains the query string?

fuzzywuzzy string-matching nlp python

Franck Dernoncourt · 1 year ago

I am testing fuzzywuzzy's process.extractBests() as follows:

from fuzzywuzzy import process

# Define the query string
query = "Apple"

# Define the list of choices
choices = ["Apple", "Apple Inc.", "Apple Computer", "Apple Records", "Apple TV"]

# Call the process.extractBests function
results = process.extractBests(query, choices)

# Print the results
for result in results:
    print(result)

It outputs:

('Apple', 100)
('Apple Inc.', 90)
('Apple Computer', 90)
('Apple Records', 90)
('Apple TV', 90)

Since all the strings fully contain the query string ("Apple"), why doesn't the scorer give all of them 100?

I am using fuzzywuzzy==0.18.0 with Python 3.11.7.

1 answer | 1 year ago

Mippy · 1 year ago

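fuzzywuzzy's extractBests() does not give 100% here because it scores similarity rather than checking for containment. The score depends on things such as the lengths of the two strings and how the content of each choice compares with the query, among other factors. In your case it does not output 100% because "Apple Inc." is not an exact match for your query "Apple". That is why only the choice "Apple" scores 100%: it matches the query "Apple" exactly. I hope this helps!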
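
To make the 90 less mysterious, here is a rough standard-library sketch of how a length-aware weighted scorer can arrive at it. As far as I know, extractBests() defaults to the fuzz.WRatio scorer, which scales down partial (substring) matches when one string is noticeably longer than the other. The function names below are hypothetical, and the 1.5 length threshold and 0.9 scale are assumptions based on my reading of WRatio. This is not fuzzywuzzy's actual code: the real scorer also strips punctuation, considers token-based scores, and handles very long strings differently.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> float:
    """Similarity of the two whole strings on a 0-100 scale."""
    return 100 * SequenceMatcher(None, a, b).ratio()


def simple_partial_ratio(a: str, b: str) -> float:
    """Best score of the shorter string against any equally long
    window of the longer string."""
    shorter, longer = sorted((a, b), key=len)
    windows = (
        longer[i:i + len(shorter)]
        for i in range(len(longer) - len(shorter) + 1)
    )
    return max(simple_ratio(shorter, window) for window in windows)


def weighted_score(a: str, b: str) -> int:
    """Hypothetical, simplified stand-in for a weighted scorer."""
    a, b = a.lower(), b.lower()
    if not a or not b:
        return 0
    full = simple_ratio(a, b)
    length_ratio = max(len(a), len(b)) / min(len(a), len(b))
    if length_ratio < 1.5:  # assumed threshold: lengths are comparable
        return round(full)
    partial_scale = 0.9  # assumed penalty applied to substring matches
    return round(max(full, simple_partial_ratio(a, b) * partial_scale))


choices = ["Apple", "Apple Inc.", "Apple Computer", "Apple Records", "Apple TV"]
for choice in choices:
    print(choice, weighted_score("Apple", choice))
```

Under these assumptions, a perfect substring match is capped at 100 × 0.9 = 90, which is consistent with the output in the question. If containment is what you want to reward, one option is to pass a different scorer, for example process.extractBests(query, choices, scorer=fuzz.partial_ratio) after from fuzzywuzzy import fuzz, which I would expect to score every choice here at 100. A plain query.lower() in choice.lower() check is another.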