代码之家  ›  专栏  ›  技术社区  ›  Dan D.

使用布尔全文搜索按相关性排序

  •  2
  • Dan D.  · 技术社区  · 15 年前

    我的问题是:

    SELECT g.id, MATCH(g.searchable_name) AGAINST ('test*' IN BOOLEAN MODE) AS relevance
     FROM games g
     WHERE MATCH(g.searchable_name) AGAINST ('test*' IN BOOLEAN MODE)
     ORDER BY relevance DESC
     LIMIT 0, 31
    

    提前谢谢。

    1 回复  |  直到 15 年前
        1
  •  0
  •   Community Mohan Dere    9 年前

    首先你应该考虑一下 IN BOOLEAN MODE does not return a score ,而是返回二进制(1=已找到,0=未找到):

    mysql>SELECT
            topic_id,
            MATCH(topic_text) AGAINST('+tuning' IN BOOLEAN MODE) AS binary
        FROM
            topics_search
        LIMIT 10
    +----------+----------+
    | topic_id | binary   |
    +----------+----------+
    | 2        | 0        |
    | 4        | 0        |
    | 5        | 0        |
    | 6        | 1        |
    | 7        | 0        |
    | 8        | 0        |
    | 11       | 0        |
    | 12       | 0        |
    | 13       | 0        |
    | 14       | 0        |
    +----------+----------+
    10 rows in set (9 ms)
    

    只有自然的全文搜索才能生成一个分数 IN NATURAL LANGUAGE MODE

    mysql>SELECT SQL_NO_CACHE
            topic_id,
            MATCH(topic_text) AGAINST('tuning') AS score
        FROM
            topics_search
        WHERE
            host_id = 1
        ORDER BY
            score DESC
        LIMIT 10
    +--------------------+--------------------+
    | topic_id           | score              |
    +--------------------+--------------------+
    | 153257             | 5.161948204040527  |
    | 17925              | 4.781417369842529  |
    | 66459              | 4.648380279541016  |
    | 373176             | 4.570812702178955  |
    | 117173             | 4.55166482925415   |
    | 167016             | 4.462575912475586  |
    | 183286             | 4.4519267082214355 |
    | 366132             | 4.348565101623535  |
    | 95502              | 4.293642520904541  |
    | 29615              | 4.178250789642334  |
    +--------------------+--------------------+
    10 rows in set (478 ms)
    

    旁注: 难以置信的慢因为 score

    所以你需要自然搜索来按分数排序。但是自然搜索不支持像 * 通配符。现在我们陷入了两难的境地,因为寻找它是没有用的 tunin* 在里面 BOOLEAN NATURAL 使用钥匙 tunin 因为没有文本会包含那个部分单词。

    mysql>SELECT SQL_NO_CACHE
            topic_id,
            MATCH(topic_text) AGAINST('tunin') AS score
        FROM
            topics_search
        WHERE
            MATCH(topic_text) AGAINST('tunin*' IN BOOLEAN MODE)
        AND
            MATCH(topic_text) AGAINST('tunin') > 0
        ORDER BY
            score DESC
        LIMIT 10
    Empty set (170 ms)
    

    结论

    除非你找到一种方法来获取全文索引中所有被通配符搜索命中的单词,并在第二个查询中使用它们,或者你根据这些单词建立自己的分数 LIKE a new question .