
How to generate random Unicode strings with Lucene TestUtil

  •  2
  • Prakhar Nigam  · Tech Community  · 6 years ago

    I am writing some code that generates random Unicode strings. I tried using the Lucene test utilities to generate them, as shown below:

        for (int i = 0; i < 5000; i++) {
            Random random = new Random();
            final String s = TestUtil.randomUnicodeString(random, 12);
            // final String s = TestUtil.randomUnicodeString(random); // tried both overloads
            final byte[] utf8 = new byte[s.length() * UnicodeUtil.MAX_UTF8_BYTES_PER_CHAR];
            final int utf8Len = UnicodeUtil.UTF16toUTF8(s, 0, s.length(), utf8);
            if(utf8Len !=8)
            {
                System.out.println("$$$$");
            }
        }
    

    So I checked the implementation of randomUnicodeString in the Lucene 6.2.0 source code:

    public static String randomUnicodeString(Random r, int maxLength) {
      final int end = nextInt(r, 0, maxLength);
      if (end == 0) {
        // allow 0 length
        return "";
      }
      final char[] buffer = new char[end];
      randomFixedLengthUnicodeString(r, buffer, 0, buffer.length);
      return new String(buffer, 0, end);
    }

    public static void randomFixedLengthUnicodeString(Random random, char[] chars, int offset, int length) {
      int i = offset;
      final int end = offset + length;
      while (i < end) {
        final int t = random.nextInt(5);
        if (0 == t && i < length - 1) {
          // Make a surrogate pair
          // High surrogate
          chars[i++] = (char) nextInt(random, 0xd800, 0xdbff);
          // Low surrogate
          chars[i++] = (char) nextInt(random, 0xdc00, 0xdfff);
        } else if (t <= 1) {
          chars[i++] = (char) random.nextInt(0x80);
        } else if (2 == t) {
          chars[i++] = (char) nextInt(random, 0x80, 0x7ff);
        } else if (3 == t) {
          chars[i++] = (char) nextInt(random, 0x800, 0xd7ff);
        } else if (4 == t) {
          chars[i++] = (char) nextInt(random, 0xe000, 0xffff);
        }
      }
    }
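
    For reference when reading the generator above: each branch draws from a code-point range with a different UTF-8 width (1, 2, or 3 bytes per char, and 4 bytes for a surrogate pair), so the encoded length varies from string to string. A minimal sketch using only the JDK; the class and method names here are illustrative and not part of Lucene:

```java
import java.nio.charset.StandardCharsets;

public class Utf8WidthDemo {
    // Returns the number of bytes needed to encode s as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // One sample from each range used by the generator's branches.
        System.out.println(utf8Length("A"));            // U+0041, below 0x80
        System.out.println(utf8Length("\u00e9"));       // U+00E9, in 0x80..0x7ff
        System.out.println(utf8Length("\u20ac"));       // U+20AC, in 0x800..0xd7ff
        System.out.println(utf8Length("\ud83d\ude00")); // U+1F600, a surrogate pair
    }
}
```

    Because the string length itself is also random (0 up to maxLength), a check against one fixed byte count will fail for most generated strings.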

    So what is the reason I am getting this exception?

    Exception in thread "main" java.lang.NoClassDefFoundError: com/carrotsearch/randomizedtesting/generators/RandomInts
    at org.apache.lucene.util.TestUtil.nextInt(TestUtil.java:433)
    at org.apache.lucene.util.TestUtil.randomUnicodeString(TestUtil.java:505)
    at luceneLab.lab.main(lab.java:33)
    Caused by: java.lang.ClassNotFoundException: com.carrotsearch.randomizedtesting.generators.RandomInts
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 3 more
    

    Thanks in advance.

    1 reply  |  as of 6 years ago
  •  2
  •   femtoRgon    6 years ago

    The exception means that com/carrotsearch/randomizedtesting/generators/RandomInts is missing from your project. It looks like TestUtil uses com.carrotsearch.randomizedtesting
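
    One way to confirm this diagnosis is to ask the class loader directly whether the class named in the stack trace can be found. The sketch below is a hypothetical helper, not part of Lucene, and it assumes it runs with the same classpath as the failing program:

```java
public class ClasspathCheck {
    // Returns true if the named class can be located by this class's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace in the question.
        String name = "com.carrotsearch.randomizedtesting.generators.RandomInts";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
    }
}
```

    If this reports false, the jar that provides the class is either absent from the classpath or present in a version that does not contain it, and adding the matching dependency should make the NoClassDefFoundError go away.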