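solr.xml Configuration

Solr is a high-performance full-text search server built on Lucene. It extends Lucene with a richer query language, is configurable and extensible, optimizes query performance, and provides a complete administration interface, which makes it an excellent full-text search engine.

Source: https://cwiki.apache.org/confluence/display/solr/Format+of+solr.xml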

Format of solr.xml

You can find solr.xml in your Solr Home directory. The default discovery solr.xml file looks like this:

<solr>
  <solrcloud>
    <str name="host">${host:}</str>
    <int name="hostPort">${jetty.port:8983}</int>
    <str name="hostContext">${hostContext:solr}</str>
    <int name="zkClientTimeout">${zkClientTimeout:15000}</int>
    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
  </solrcloud>
  <shardHandlerFactory name="shardHandlerFactory"
                       class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:0}</int>
    <int name="connTimeout">${connTimeout:0}</int>
  </shardHandlerFactory>
</solr>

As you can see, the discovery Solr configuration is "SolrCloud friendly". However, the presence of the <solrcloud> element does not mean that the Solr instance is running in SolrCloud mode. Unless -DzkHost or -DzkRun is specified at startup, this section is ignored.

Using Multiple SolrCores

It is possible to segment Solr into multiple cores, each with its own configuration and indices. Cores may be dedicated to a single application or to very different ones, but all are administered through a common administration interface. You can create new Solr cores on the fly, shut down cores, and even replace one running core with another, all without stopping or restarting your servlet container.

Solr cores are configured by placing a file named core.properties in a subdirectory under solr.home. There is no a priori limit on the depth of the tree, nor on the number of cores that can be defined. Cores may be anywhere in the tree, except that a core may not be defined under an existing core. That is, the following is not allowed:

./cores/core1/core.properties
./cores/core1/coremore/core5/core.properties

The enumeration will stop at core1 in the above example.

The following is legal:

./cores/somecores/core1/core.properties
./cores/somecores/core2/core.properties
./cores/othercores/core3/core.properties
./cores/extracores/deepertree/core4/core.properties

A minimal core.properties file looks like this:

name=collection1

This is very different from the legacy solr.xml <core> tag. In fact, your core.properties file can be empty. Suppose the core.properties file is located in ./cores/core1 (relative to solr_home). In that case, the core name is assumed to be "core1", the instance dir is the folder containing core.properties (./cores/core1), the dataDir is ./cores/core1/data, and so on.

You can run Solr without configuring any cores.

Solr.xml Parameters

The <solr> Element

There are no attributes that you can specify in the <solr> tag, which is the root element of solr.xml. The tables below list the child nodes of each XML element in solr.xml.

The persistent attribute is no longer supported in solr.xml. The properties in solr.xml are immutable, and any changes to individual cores are persisted in the individual core.properties files.

Node | Description
--- | ---
<str name="adminHandler"> | If used, this node should be set to the FQN (fully qualified name) of a class that inherits from CoreAdminHandler. For example, <str name="adminHandler">com.myorg.MyAdminHandler</str> would configure the custom admin handler (MyAdminHandler) to handle admin requests. If this node isn't set, Solr uses the default admin handler, org.apache.solr.handler.admin.CoreAdminHandler. For more information on this parameter, see the Solr Wiki at http://wiki.apache.org/solr/CoreAdmin#cores.
<int name="coreLoadThreads"> | Specifies the number of threads that will be assigned to load cores in parallel.
<str name="coreRootDirectory"> | The root of the core discovery tree. Defaults to SOLR_HOME.
<str name="managementPath"> | No-op at present.
<str name="sharedLib"> | Specifies the path to a common library directory that will be shared across all cores. Any JAR files in this directory will be added to the search path for Solr plugins. This path is relative to the top-level container's Solr Home.
<str name="shareSchema"> | When set to true, ensures that multiple cores pointing to the same schema.xml refer to the same IndexSchema object. Sharing the IndexSchema object makes loading the core faster. If you use this feature, make sure that no core-specific property is used in your schema.xml.
<int name="transientCacheSize"> | Defines how many cores with transient=true can be loaded before the least recently used core is swapped out for a new core.
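As an illustration of the nodes listed above, a solr.xml that sets a few of them might look like the sketch below. The values (thread count, library directory name, cache size) are assumptions chosen for the example, not defaults documented on this page.

```xml
<solr>
  <!-- Load up to 4 cores in parallel at startup (illustrative value) -->
  <int name="coreLoadThreads">4</int>
  <!-- Shared plugin JARs; "lib" is an assumed directory name, relative to Solr Home -->
  <str name="sharedLib">lib</str>
  <!-- Reuse one IndexSchema object for cores that point at the same schema.xml -->
  <str name="shareSchema">true</str>
  <!-- Keep at most 10 transient cores loaded at once (illustrative value) -->
  <int name="transientCacheSize">10</int>
</solr>
```

Nodes that are left out simply keep their defaults, so a real file usually lists only the settings it needs to change.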

The <solrcloud> element

This element defines several parameters that relate to SolrCloud. This section is ignored unless the Solr instance is started with either -DzkRun or -DzkHost.

Node | Description
--- | ---
<int name="distribUpdateConnTimeout"> | Used to set the underlying "connTimeout" for intra-cluster updates.
<int name="distribUpdateSoTimeout"> | Used to set the underlying "socketTimeout" for intra-cluster updates.
<str name="host"> | The hostname Solr uses to access cores.
<str name="hostContext"> | The servlet context path.
<int name="hostPort"> | The port Solr uses to access cores. In the default solr.xml file shown above, this is set to ${jetty.port:8983}, which uses the port defined for Jetty and falls back to 8983.
<int name="leaderVoteWait"> | When SolrCloud is starting up, how long each Solr node will wait for all known replicas for that shard to be found before assuming that any nodes that haven't reported are down.
<int name="zkClientTimeout"> | A timeout for connection to a ZooKeeper server. It is used with SolrCloud.
<str name="zkHost"> | In SolrCloud mode, the URL of the ZooKeeper host that Solr should use for cluster state information.
<bool name="genericCoreNodeNames"> | If true, node names are not based on the address of the node, but on a generic name that identifies the core. When a different machine takes over serving that core, things will be much easier to understand.
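A fuller <solrcloud> section, extending the default file shown earlier, might look like the following sketch. The three added timeout values are illustrative assumptions (taken here to be milliseconds), and the system property names used for them inside ${...} are hypothetical names chosen to mirror the node names.

```xml
<solr>
  <solrcloud>
    <!-- These four nodes are copied from the default discovery solr.xml -->
    <str name="host">${host:}</str>
    <int name="hostPort">${jetty.port:8983}</int>
    <str name="hostContext">${hostContext:solr}</str>
    <int name="zkClientTimeout">${zkClientTimeout:15000}</int>

    <!-- Illustrative values; property names inside ${...} are assumptions -->
    <int name="leaderVoteWait">${leaderVoteWait:180000}</int>
    <int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:15000}</int>
    <int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:120000}</int>

    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
  </solrcloud>
</solr>
```

Writing each value as ${property:default} keeps the file itself unchanged while still allowing an override with -Dproperty=value at startup.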

The <logging> element

Node | Description
--- | ---
<str name="class"> | The class to use for logging. The corresponding JAR file must be available to Solr, perhaps through a <lib> directive in solrconfig.xml.
<str name="enabled"> | true/false: whether to enable logging or not.

The <logging><watcher> element

Node | Description
--- | ---
<int name="size"> | The number of log events that are buffered.
<int name="threshold"> | The logging level above which your particular logging implementation will record. For example, when using log4j one might specify DEBUG, WARN, INFO, etc.
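The two logging tables can be combined into a single section, as in the sketch below. The class name is hypothetical, the size and threshold values are illustrative, and the element types simply follow the tables above (including the int type listed for threshold, even though its value is a level name).

```xml
<solr>
  <logging>
    <!-- Hypothetical class name; its JAR must be available to Solr -->
    <str name="class">com.example.logging.MyLogWatcher</str>
    <str name="enabled">true</str>
    <watcher>
      <!-- Buffer the 50 most recent log events (illustrative value) -->
      <int name="size">50</int>
      <!-- Record events at WARN and above (illustrative value) -->
      <int name="threshold">WARN</int>
    </watcher>
  </logging>
</solr>
```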

The <shardHandlerFactory> element

If you wish to create a custom shard handler, you can define it in solr.xml:

<shardHandlerFactory name="ShardHandlerFactory" class="qualified.class.name">

However, since this is a custom shard handler, sub-elements are specific to the implementation.
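For illustration only, a complete custom definition could look like the sketch below. Both the class name and the maxRetries sub-element are hypothetical; a real implementation defines its own settings.

```xml
<solr>
  <!-- com.example.MyShardHandlerFactory is a hypothetical class name -->
  <shardHandlerFactory name="shardHandlerFactory"
                       class="com.example.MyShardHandlerFactory">
    <!-- Hypothetical setting; valid sub-elements depend on the implementation -->
    <int name="maxRetries">3</int>
  </shardHandlerFactory>
</solr>
```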

Individual core.properties files

Core discovery replaces the individual <core> tags in solr.xml with a core.properties file located on disk. The presence of the core.properties file defines the instanceDir for that core. The core.properties file is a simple Java Properties file where each line is just a key=value pair, e.g. name=core1. Notice that no quotes are required.

Java properties files allow the hash "#" or bang "!" characters to specify comment-to-end-of-line. This table defines the recognized properties:

Key | Description
--- | ---
name | The name of the SolrCore. You'll use this name to reference the SolrCore when running commands with the CoreAdminHandler.
config | The configuration file name for a given core. The default is solrconfig.xml.
schema | The schema file name for a given core. The default is schema.xml.
dataDir | The core's data directory (where indexes are stored), as an absolute pathname or a path relative to the value of instanceDir.
properties | The name of the properties file for this core. The value can be an absolute pathname or a path relative to the value of instanceDir.
transient | If true, the core can be unloaded if Solr reaches the transientCacheSize. The default if not specified is false. Cores are unloaded in order of least recently used first.
loadOnStartup | If true (the default if not specified), the core will be loaded when Solr starts.
coreNodeName | Added in Solr 4.2, this attribute allows naming a core. The name can then be used later if you need to replace a machine with a new one. By assigning the new machine the same coreNodeName as the old core, it will take over for the old SolrCore.
ulogDir | The absolute or relative directory for the update log for this core (SolrCloud).
shard | The shard to assign this core to (SolrCloud).
collection | The name of the collection this core is part of (SolrCloud).
roles | Future parameter for SolrCloud, or a way for users to mark nodes for their own use.

The minimal core.properties file is an empty file, in which case all of the properties are defaulted appropriately.

Implicit properties

There are several properties that Solr defines automatically for each core. These properties are described in the table below:

Property | Description
--- | ---
solr.core.dataDir | The core's data directory, ${solr.core.instanceDir}/data by default.
solr.core.configName | The name of the core's configuration file, solrconfig.xml by default.
solr.core.schemaName | The name of the core's schema file, schema.xml by default.

Any of the above properties can be referenced by name in schema.xml or solrconfig.xml.

When defining properties, you can assign a property a default value that will be used if another value isn't specified. For example:

<!-- Blank unless company.name variable is defined -->
<str name="foo">${company.name}</str>
<!-- "SearchCo MegaIndex" if company.name variable is not defined -->
<str name="bar">${some.variable.name:SearchCo MegaIndex}</str>
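Combining the two ideas, an implicit core property can be referenced with a fallback value inside solrconfig.xml. The fragment below is a sketch under stated assumptions: the handler path and the parameter names coreConfig and coreSchema are hypothetical and exist only to show the substitution syntax.

```xml
<!-- solrconfig.xml fragment; the handler simply echoes its parameters back -->
<requestHandler name="/debug/dump" class="solr.DumpRequestHandler">
  <lst name="defaults">
    <!-- Implicit property, falling back to the documented default name -->
    <str name="coreConfig">${solr.core.configName:solrconfig.xml}</str>
    <str name="coreSchema">${solr.core.schemaName:schema.xml}</str>
  </lst>
</requestHandler>
```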


