Distributed Searching Basics

  categories: Search Resources  tags:,   author:



Solr accepts a search request on one server (a single shard), distributes it to the individual shards, and finally merges the search results.
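The distribute-then-merge flow described above can be sketched in a few lines of Python. This is only an illustration, not Solr's implementation: the in-memory SHARDS data, the score threshold standing in for a real query, and the function names are all assumptions made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "shards": each maps a document ID to a relevance score.
SHARDS = [
    {"a1": 0.9, "a2": 0.4},
    {"b1": 0.7, "b2": 0.2},
]


def search_shard(shard, min_score):
    """Return (doc_id, score) pairs from one shard that pass the filter."""
    return [(doc_id, score) for doc_id, score in shard.items() if score >= min_score]


def distributed_search(shards, min_score, rows):
    """Fan the request out to every shard, then merge and keep the top rows."""
    with ThreadPoolExecutor() as pool:
        partial_results = list(pool.map(lambda s: search_shard(s, min_score), shards))
    # Flatten the per-shard hit lists and sort the combined list by score.
    merged = [hit for hits in partial_results for hit in hits]
    merged.sort(key=lambda hit: hit[1], reverse=True)
    return merged[:rows]


top = distributed_search(SHARDS, min_score=0.3, rows=3)
```

Here the coordinating step only sees per-shard hit lists, which is why the merged ranking depends on scores that each shard computed on its own.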



1. Running Distributed Searching with the shards Parameter

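We can run a distributed search by adding the shards parameter to a search request. Its value is a comma-separated list of shard addresses, in the following format:

  host:port/base_url[,host:port/base_url]*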
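As a rough illustration, a request URL carrying the shards parameter could be assembled like this in Python. The host names, ports, query string, and the helper name build_distributed_query_url are assumptions for the example; only the q and shards request parameters come from Solr itself.

```python
from urllib.parse import urlencode


def build_distributed_query_url(coordinator, shards, query):
    """Build a /select URL whose shards parameter lists every shard address."""
    params = {"q": query, "shards": ",".join(shards)}
    # Keep ':', '/' and ',' unescaped so the shard list stays readable.
    return f"http://{coordinator}/select?{urlencode(params, safe=':/,')}"


# Hypothetical two-shard setup running on one machine.
url = build_distributed_query_url(
    "localhost:8983/solr",
    ["localhost:8983/solr", "localhost:7574/solr"],
    "ipod solr",
)
print(url)
```

Because every shard address ends up in the URL, a long shard list directly increases the length of a GET request.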






2. Components That Support Distributed Searching

Only the following components support Distributed Searching:

  • The Query component that returns documents matching a query
  • The Facet component, for facet.query and facet.field requests where facets are sorted by count (the default). Solr 1.4 and later also support sorting by name.
  • The Highlighting component
  • The Stats component
  • The Spell Check Component
  • The Terms Component
  • The Term Vector Component
  • The Debug component
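To give a feel for what distributed support means for a component, the sketch below merges count-sorted facet results from several shards. It is a simplified model under stated assumptions: the per-shard counts are invented, the function name is hypothetical, and it simply sums the counts each shard returned, whereas Solr also runs a refinement step for facet values that some shards did not return.

```python
from collections import Counter


def merge_facet_counts(per_shard_counts, limit):
    """Sum facet counts across shards and return the top values by count."""
    total = Counter()
    for counts in per_shard_counts:
        total.update(counts)
    # Sort by count (descending), then by name so ties are deterministic.
    return sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Hypothetical facet.field results from two shards.
shard_a = {"electronics": 12, "memory": 3}
shard_b = {"electronics": 5, "memory": 4, "camera": 1}

merged = merge_facet_counts([shard_a, shard_b], limit=2)
```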


3. Limitations of Distributed Searching

Distributed Searching is also subject to a number of limitations, as follows:

  • Each document indexed must have a unique key.
  • If Solr discovers duplicate document IDs, Solr selects the first document and discards subsequent ones.
  • Inverse-document frequency (IDF) calculations cannot be distributed.
  • Distributed searching does not support the QueryElevationComponent, which configures the top results for a given query regardless of Lucene’s scoring. For more information, see http://wiki.apache.org/solr/QueryElevationComponent.
  • The index for distributed searching may become out of date; for example, a document that once matched a query and was subsequently changed may no longer match the query but will still be retrieved.
    (That is, the index can change while a distributed search is still in progress.)
  • Distributed searching supports only sorted-field faceting, not date faceting.
  • The number of shards is limited by number of characters allowed for GET method’s URI; most Web servers generally support at least 4000 characters, but many servers limit URI length to reduce their vulnerability to Denial of Service (DoS) attacks.
  • TF/IDF computations are per shard. This may not matter if content is well (randomly) distributed.
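Two of the limitations above, duplicate IDs and per-shard IDF, can be made concrete with a small Python sketch. Everything in it is illustrative: the hit lists and document counts are invented, "first" is assumed to mean the first hit in shard order, and the IDF function uses the classic Lucene formula 1 + ln(numDocs / (docFreq + 1)) as an assumption about the similarity in use.

```python
import math


def merge_first_wins(shard_results):
    """Merge per-shard hits; on a duplicate ID keep the first one seen."""
    seen = {}
    for hits in shard_results:
        for doc_id, score in hits:
            if doc_id not in seen:  # later duplicates are discarded
                seen[doc_id] = score
    return list(seen.items())


def classic_idf(doc_freq, num_docs):
    """Classic Lucene-style IDF, computed from one shard's own statistics."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# "doc1" exists on both shards; only the first occurrence survives the merge.
merged = merge_first_wins([
    [("doc1", 0.8), ("doc2", 0.5)],
    [("doc1", 0.3), ("doc3", 0.6)],
])

# The same term is rare on shard A but common on shard B, so each shard
# weights it differently even though both hold 1000 documents.
idf_shard_a = classic_idf(doc_freq=9, num_docs=1000)
idf_shard_b = classic_idf(doc_freq=499, num_docs=1000)
```

With documents spread randomly across shards, the per-shard document frequencies would be close to each other and the two IDF values would nearly agree, which is why the last limitation often does not matter in practice.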


