代码之家 › 专栏 › 技术社区 › user7055375

如何创建自定义哈希,其中键查找基于我定义的函数?

key hash function ruby

user7055375 · 技术社区 · 7 年前

我用的是Ruby 2.4。如何创建自定义哈希,其中键的查找函数是我定义的?我有这个功能。。。

  def str_almost_equal(a, b)
    a.downcase == b.downcase || (a.size == b.size && a.downcase.chars.map.with_index{|c, i| c == b.downcase[i]}.count(false) == 1)
  end

如果键的两个类都是字符串并且它们与上面的方法匹配,我希望我的哈希能够查找键。例如,如果我有

hash = []
hash["aaa"] = 1

然后我希望散列[aab]返回1,因为“aaa”和“aab”使用我定义的函数求值为true。

4 回复 | 直到 7 年前

ulferts 7 年前

您可以定义自己的类继承自 Hash [](key) 你喜欢的方法。因此,例如。

class MyHash < Hash
  def [](key)
    # do something to return the appropriate value
  end
end

基于调用代码所需的哈希接口的其他方法,您可能需要重写其他方法。

除此之外,我不认为在哈希中提供查找键时返回1的行为是非常类似哈希的行为。我宁愿退赛。返回1将违反最小意外原则。

OP评论后添加

class VariantsSet < Set 

  PLACEHOLDER = '_' # Use a character that will not be used in your keys

  def add_variants(string)
    merge(all_variants(string))
  end 

  def delete(string)
    all_variants(string).each { |variant| super(variant) }
  end

  def includes_any_variant?(string)
    all_variants(string).any? { |variant| include?(variant) }
  end

  private

  def all_variants(string)
    downcase_string = string.downcase
    string_length = string.length

    variants = [downcase_string]

    string_length.times do |i|
      variants << downcase_string[0, i] + PLACEHOLDER + downcase_string[i + 1 , string_length]
    end

    variants
  end
end

使用方法如下:

2.4.2 :026 > s = VariantsSet.new
 => #<VariantsSet: {}> 
2.4.2 :027 > s.add_variants('foobar')
 => #<VariantsSet: {"foobar", "_oobar", "f_obar", "fo_bar", "foo_ar", "foob_r", "fooba_"}> 
2.4.2 :028 > s.includes_any_variant?('foobaz')
 => true 
2.4.2 :029 > s.includes_any_variant?('blubs')
 => false

展开以显示哈希实现

class VariantsHash < Hash

  PLACEHOLDER = '_' # Use a character that will not be used in your keys

  def []=(string, value)
    all_variants(string).each do |variant|
      super(variant, value)
    end
  end 

  def delete(string)
    all_variants(string).each do |variant|
      super(variant)
    end
  end 

  def [](string)
    match = all_variants(string).detect do |variant|
      super(variant)
    end

    super(match) if match
  end

  private

  def all_variants(string)
    downcase_string = string.downcase
    string_length = string.length

    variants = [downcase_string]

    string_length.times do |i|
      variants << downcase_string[0, i] + PLACEHOLDER + downcase_string[i + 1 , string_length]
    end

    variants
  end
end

Kapuzzi 7 年前

这可能不是最好的答案,但我认为它可以帮助你

class NewHash < Hash
  def [](key)
    if keys.include?(key)
      super
    else
      found = keys.find { |k| compare_almost(k, key) }
      self[found] if found
    end
  end

  private

  def compare_almost(a, b)
    a.downcase == b.downcase ||
      (a.size == b.size && a.downcase.chars.map.with_index{|c, i| c == b.downcase[i]}.count(false) == 1)
  end
end

hash = NewHash.new
hash['aaa'] = 1
p hash['aab'] # => 1
p hash['acs'] # => nil

对每个对象使用模块

module CompareAlmost
  def [](key)
    if keys.include?(key)
      super
    else
      found = keys.find { |k| compare_almost(k, key) }
      self[found] if found
    end
  end

  private

  def compare_almost(a, b)
    a.downcase == b.downcase ||
      (a.size == b.size && a.downcase.chars.map.with_index{|c, i| c == b.downcase[i]}.count(false) == 1)
  end
end

hash = {'aaa' => 2}
hash.extend CompareAlmost
p hash['aab'] # => 2

替换原始哈希类

class Hash
  def [](key)
    if keys.include?(key)
      fetch key
    else
      found = keys.find { |k| compare_almost(k, key) }
      self[found] if found
    end
  end

  private

  def compare_almost(a, b)
    a.downcase == b.downcase ||
      (a.size == b.size && a.downcase.chars.map.with_index{|c, i| c == b.downcase[i]}.count(false) == 1)
  end
end

hash = { 'aaa' => 3 }
p hash['aab'] # => 3

EJAg 7 年前

## your custom function, changed to a Proc so it can be passed to a method
str_almost_equal = Proc.new do |a,b|
    a.downcase == b.downcase || (a.size == b.size && a.downcase.chars.map.with_index{|c, i| c == b.downcase[i]}.count(false) == 1)
  end

## iterator to run all values through your function
def check_all(a, array, func)
  array.each do |n|
    return n if func.call(a, n)
  end
end

## Hash with custom default value
h = Hash.new do |h,k| 
  if k.is_a? String
    key = check_all(k, h.keys, str_almost_equal) 
    h[key] if key
  end

h["aaa"]    ## nil
h["aaa"] = 3
h["aab"]    ## 3
h["bab"]    ## nil
h["aab"] = 4
h["aab"]    ## 4

Jack Noble 7 年前

对于用户创建的类型,哈希查找使用两种方法 #eql? 和 #hash . 这是覆盖哈希行为的规范方法。

class StringWrapper
  attr_reader :string

  def initialize(string)
    @string = string
  end

  def eql?(other)
    string.downcase == other.string.downcase ||
    (string.size == other.string.size && 
     string.downcase.chars.map.with_index{|c, i| c == 
     other.string.downcase[i]}.count(false) == 1
    )
  end

  def hash
    # never actually do this part
    0
  end
end

hash = {}
hash[StringWrapper.new("aaa")] = 1
hash[StringWrapper.new("aab")] # => 1

但使用哈希而不是其他数据类型的原因是,无论有多少数据,查找密钥所需的时间通常不会增长。但在上面的实现中,我们打破了这一点。让我们看看散列是如何工作的。在后台,您的数据大致是这样存储的 :

|0  |1  |2  |3  |4  |5  |6  |7  |8  |9  |10 |
|   |   |   |aaa|   |   |aad|   |   |   |aab|
|   |   |   |aae|   |   |   |   |   |   |   |

"aaa".hash 例如,返回2707793439826082248。 2707793439826082248 % 11 是 3 始终返回0。

|0  |1  |2  |3  |4  |5  |6  |7  |8  |9  |10 |
|aaa|   |   |   |   |   |   |   |   |   |   |
|aab|   |   |   |   |   |   |   |   |   |   |
|aac|   |   |   |   |   |   |   |   |   |   |
|aad|   |   |   |   |   |   |   |   |   |   |

每个对象都存储在同一个存储桶中,因此我们破坏了哈希查找性能。定义自定义散列方法时,您需要一些每次计算时都相同并且看起来随机的东西,以便模运算将数据均匀分布在各个存储桶中。但是哈希方法还必须尊重

 def hash
   string[0].hash
 end

您所要求的问题是,可以构造一系列查找,其中每个字符串彼此相等。“foo”=“boo”。“boo”=“bao”。“bao”=“bar”。为了使程序尊重您对等式的定义,所有这些字符串都需要位于同一个哈希桶中。这意味着为了使哈希按您描述的方式运行 每个字符串 需要具有相同的哈希桶。根据定义,这不是真正的散列。所以我认为你应该退后一步,考虑一下你要解决的是什么问题,以及散列是否真的是正确的方法。

你当然可以按照乌尔弗茨的建议去做,但代价是使用更多的内存

hash = { "aaa" => 1 }
hash["cab"] = 2
hash["aab"]

:包含少量项的Ruby哈希实际上是作为数组实现的,至少在MRI中是这样。希望这些图有助于解释哈希表在高级别上的工作方式,而不是实际显示给定Ruby是如何实现它们的。我确信它们有很多不准确之处。