
What is the ideal way to work with Solr from PHP?

  •  5
  • Tom cmoron  · Tech community  · 15 years ago

    Firstly, I know there are a few similar questions on this topic, but I think this situation is different enough to justify its own question.

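    I am running a Solr index through Jetty, installed on a LAMP server. I currently pull in search results with the simplexml_load_file function and then parse them through a few functions. I was happy with this process until I ran into a fundamental problem.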

    The field names are not carried through by the SimpleXML functions. For example, this result:

    <doc>
      <float name="score">0.73325396</float>
      <str name="add1">Ravensbridge Drive</str>
      <str name="comments">0</str>
      <str name="company">Stratstone Lotus Leicester</str>
      <str name="feed_id"/>
      <str name="id">1711765</str>
      <str name="pcode">LE4 0BX</str>
      <str name="psearch">LE4</str>
      <str name="rating">0</str>
    </doc>
    

    looks like this in the SimpleXML object:

     [doc] => Array
     (
       [0] => SimpleXMLElement Object
       (
         [float] => 0.73325396
         [str] => Array
         (
           [0] => Ravensbridge Drive
           [1] => 0
           [2] => Stratstone Lotus Leicester
           [3] => SimpleXMLElement Object
           (
             [@attributes] => Array
             (
               [name] => feed_id
             )
           )
           [4] => 1711765
           [5] => LE4 0BX
           [6] => LE4
           [7] => 0
         )
       )
    

    When a full data set is found, 11 pieces of data are stored in the array, but when some of it is missing, the values shift position and my parser breaks.

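    So I have looked into libraries/classes that do this properly, namely the two main ones: Apache Solr (the PHP Solr extension) and solr-php-client. Both seem overly complicated, there are very few practical examples, and neither appears to support different Solr cores, of which I use several.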

    What is the best thing to use? I am stuck on this, and any help would be much appreciated.

    Thanks!

    1 answer  |  last active 12 years ago
        1
  •  8
  •   nuqqsa    15 years ago

    Definitely use one of the existing clients. Multicore support is as simple as creating one client instance for each Solr instance.
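    As a minimal sketch of the one-client-per-core idea with the PHP Solr extension: the helper name `solrOptionsForCore` and the core names `products` and `reviews` are hypothetical, and the `/solr/<core>` path layout is an assumption about how the cores are exposed, so adjust it to your setup.

```php
<?php
// Hypothetical helper: builds SolrClient options for a single core.
// Assumes every core is served under /solr/<core> on the same host and port.
function solrOptionsForCore(string $core, string $host = 'localhost', int $port = 8080): array
{
    return array(
        'hostname' => $host,
        'port'     => $port,
        'path'     => '/solr/' . $core,
    );
}

// One client per core; the core names below are placeholders.
$clients = array();
if (class_exists('SolrClient')) { // only when the Solr extension is loaded
    foreach (array('products', 'reviews') as $core) {
        $clients[$core] = new SolrClient(solrOptionsForCore($core));
    }
}
```

    With solr-php-client the same idea should apply by passing the core path as the third constructor argument, for example `new Apache_Solr_Service('localhost', '8080', '/solr/products')`, again assuming that path layout.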

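    The Solr extension is the more powerful of the two, yet it is still quite intuitive to use. Here are two sample snippets that run a basic query and return the results, one for each library: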

    PHP Solr extension

    <?php
    $options = array
    (
        'hostname' => 'localhost',
        'port'     => '8080',
        'path'     => '/solr'
    );
    
    $client = new SolrClient($options);
    
    $query = new SolrQuery();
    $query->setQuery('fox');
    $query->setStart(0);
    $query->setRows(50);
    // specify which fields we want to retrieve
    $query->addField('id')->addField('title_t')->addField('source_t');
    
    $res = $client->query($query)->getResponse();
    
    // what does the response look like?
    var_dump($res);
    /*
    object(SolrObject)[4]
      public 'responseHeader' => 
        object(SolrObject)[5]
          public 'status' => int 0
          public 'QTime' => int 0
          public 'params' => 
            object(SolrObject)[6]
              public 'fl' => string 'id,title_t,source_t' (length=19)
              public 'indent' => string 'on' (length=2)
              public 'start' => string '0' (length=1)
              public 'q' => string 'fox' (length=3)
              public 'wt' => string 'xml' (length=3)
              public 'rows' => string '50' (length=2)
              public 'version' => string '2.2' (length=3)
      public 'response' => 
        object(SolrObject)[7]
          public 'numFound' => int 39
          public 'start' => int 0
          public 'docs' => 
            array
              0 => 
                object(SolrObject)[8]
                  ...
              1 => 
                object(SolrObject)[9]
                  ...
              2 => 
                object(SolrObject)[10]
                  ...
              (...)
    */
    // what does a document look like?
    var_dump($res->response->docs[0]);
    /*
    object(SolrObject)[8]
      public 'id' => int 11408
      public 'source_t' => string 'CBD News Headlines' (length=18)
      public 'title_t' => string 'Hunting across Southeast Asia weakens forests' survival' (length=55)
    */
    

    solr-php-client (official usage example)

    <?php
    require_once 'library/SolrPhpClient/Apache/Solr/Service.php';
    
    $solr = new Apache_Solr_Service('localhost', '8080', '/solr');
    
    if (!$solr->ping()) {
        exit('Solr service not responding.');
    }
    
    $offset = 0;
    $limit = 50;
    
    $query = 'fox';
    $res = $solr->search($query, $offset, $limit);
    
    // what does the response look like?
    var_dump($res->response);
    
    /*
    object(stdClass)[6]
      public 'numFound' => int 39
      public 'start' => int 0
      public 'docs' => 
        array
          0 => 
            object(Apache_Solr_Document)[46]
              protected '_documentBoost' => boolean false
              protected '_fields' => 
                array
                  ...
              protected '_fieldBoosts' => 
                array
                  ...
          1 => 
            object(Apache_Solr_Document)[47]
              protected '_documentBoost' => boolean false
              protected '_fields' => 
                array
                  ...
              protected '_fieldBoosts' => 
                array
                  ...
         (...)
    */
    
    // what does a document look like?
    var_dump($res->response->docs[0]);
    
    /*
    object(Apache_Solr_Document)[46]
      protected '_documentBoost' => boolean false
      protected '_fields' => 
        array
          'publicationTime_i' => int 1257724800
          'publicationDate_t' => string 'Mon, 9 Nov 2009' (length=15)
          'url_s' => string 'http://news.mongabay.com/2009/1108-hance_corlett.html' (length=53)
          'language_s' => string 'EN' (length=2)
          'title_t' => string 'Hunting across Southeast Asia weakens forests' survival' (length=55)
          'text' => string 'A large flying fox eats a fruit ingesting its seeds.' (length=52)
          'id' => int 11408
          'relevance_i' => int 27
          'source_t' => string 'CBD News Headlines' (length=18)
      protected '_fieldBoosts' => 
        array
          'publicationTime_i' => boolean false
          'publicationDate_t' => boolean false
          'url_s' => boolean false
          'language_s' => boolean false
          'title_t' => boolean false
          'text' => boolean false
          'id' => boolean false
          'relevance_i' => boolean false
          'source_t' => boolean false
    */
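    If switching clients is not an option, the SimpleXML approach from the question can likely be made robust as well, because each element still carries its `name` attribute. The sketch below keys a `<doc>` by field name instead of by position; the helper name `solrDocToArray` is hypothetical, and it assumes flat, single-valued fields such as `<str>` and `<float>` (multi-valued `<arr>` fields would need extra handling).

```php
<?php
// Hypothetical helper: converts one Solr <doc> element into an array keyed by
// each child's "name" attribute, so missing or empty fields cannot shift positions.
function solrDocToArray(SimpleXMLElement $doc): array
{
    $fields = array();
    foreach ($doc->children() as $child) {
        // Attribute access on SimpleXMLElement uses array syntax.
        $fields[(string) $child['name']] = (string) $child;
    }
    return $fields;
}

// Trimmed copy of the <doc> element shown in the question.
$xml = <<<XML
<doc>
  <float name="score">0.73325396</float>
  <str name="add1">Ravensbridge Drive</str>
  <str name="feed_id"/>
  <str name="id">1711765</str>
  <str name="pcode">LE4 0BX</str>
</doc>
XML;

$fields = solrDocToArray(simplexml_load_string($xml));
echo $fields['pcode'], "\n"; // the empty feed_id field is kept as ''
```

    Looking fields up by name means an empty element such as `feed_id` no longer changes where the other values end up.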