代码之家  ›  专栏  ›  技术社区  ›  Nick Haslam

如何获取SQL Server查询中返回的一组多行中最常见的n个单词?

  •  6
  • Nick Haslam  · 技术社区  · 14 年前

    我想从SQL server中的查询中返回10个最常见的单词,因此针对一组行运行,例如:

    敏捷的棕色狐狸

    慢绿狐


    狐狸
    缓慢的

    棕色的
    黄色的

    3 回复  |  直到 14 年前
        1
  •  1
  •   Community CDub    8 年前

    了解如何声明性地执行此操作(即不使用 while 循环),看看我的答案(所有东西的代码): Build an ASCII chart of the most commonly used words in a given text

        2
  •  3
  •   nsmyself    14 年前

    CREATE FUNCTION dbo.Split(@String varchar(8000), @Delimiter char(1))     
    returns @temptable TABLE (items varchar(8000))     
    as     
    begin     
        declare @idx int     
        declare @slice varchar(8000)     
    
        select @idx = 1     
            if len(@String)<1 or @String is null  return     
    
        while @idx!= 0     
        begin     
            set @idx = charindex(@Delimiter,@String)     
            if @idx!=0     
                set @slice = left(@String,@idx - 1)     
            else     
                set @slice = @String     
    
            if(len(@slice)>0)
                insert into @temptable(Items) values(@slice)     
    
            set @String = right(@String,len(@String) - @idx)     
            if len(@String) = 0 break     
        end 
    return     
    end
    

    您应该从游标或其他地方调用这个函数;在其内部,只需使用以下内容:

    insert into #tmp (word) select * from dbo.split(' ', @row)
    

    select top 10 count(*) as number, word from separated_words_table order by number
    

    Source here

        3
  •  1
  •   Martin Smith    14 年前

    另一种方式借自 here )

    WITH Sentences AS
    (
    SELECT 'quick brown fox' AS Sentence UNION ALL
    SELECT 'slow yellow fox' UNION ALL
    SELECT 'slow green fox'
    ),
    Xmlified AS
    (
        SELECT
           CAST('<M>' + REPLACE(Sentence,' ','</M><M>') + '</M>' AS XML)  AS xSentence
    FROM Sentences
    ),
    Words AS
    (
    SELECT
     Split.a.value('.', 'VARCHAR(100)') AS word
    FROM Xmlified
    CROSS APPLY xSentence.nodes('/M') Split(a)
    )
    SELECT COUNT(*) AS C, word FROM Words
    GROUP BY word
    ORDER BY C DESC