diff --git a/src/site/en/xdoc/2.0/admin/browserType-guide.xml b/src/site/en/xdoc/2.0/admin/browserType-guide.xml
new file mode 100644
index 000000000..251bb4b32
--- /dev/null
+++ b/src/site/en/xdoc/2.0/admin/browserType-guide.xml
@@ -0,0 +1,19 @@
+
+
+
+ Setting the browser type
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to the browser type. Browser type information can be attached to indexed data, so that search results can be separated for each type of browsing browser.
+
+
After logging in with an administrator account, click the browser types menu.
+
+
+
+
You can set the display name and value. Use this if you want to add new terminal types. No special customization is required; use it only where necessary.
+
+
+
+
diff --git a/src/site/en/xdoc/2.0/admin/crawl-guide.xml b/src/site/en/xdoc/2.0/admin/crawl-guide.xml
new file mode 100644
index 000000000..c98ce95b4
--- /dev/null
+++ b/src/site/en/xdoc/2.0/admin/crawl-guide.xml
@@ -0,0 +1,100 @@
+
+
+
+ The General crawl settings
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to crawling.
+
+
After logging in with an administrator account, click the crawl General menu.
+
+
You can specify the path to the generated index and enable the replication feature.
+
+
+
+
You can set the interval at which crawls for Web sites or file systems run. The default is the following.
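As a sketch, reconstructed from the description below (the original expression is not shown here), a daily crawl at 0:00 AM in the cron-like format explained next would be:

0 0 0 * * ?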
+
+
The figures represent, from left to right: seconds, minutes, hours, day of month, month, and day of week. The format is similar to Unix cron settings. This example crawls daily at 0:00 AM.
+
Following are examples of how to write.
+
+
+
+
0 0 12 * * ?
+
Starts at 12:00 PM (noon) every day
+
+
+
0 15 10 ? * *
+
Starts at 10:15 AM every day
+
+
+
0 15 10 * * ?
+
Starts at 10:15 AM every day
+
+
+
0 15 10 * * ? *
+
Starts at 10:15 AM every day
+
+
+
0 15 10 * * ? 2005
+
Starts at 10:15 AM every day during the year 2005
+
+
+
0 * 14 * * ?
+
Starts every minute from 2:00 PM to 2:59 PM every day
+
+
+
0 0/5 14 * * ?
+
Starts every 5 minutes from 2:00 PM to 2:59 PM every day
+
+
+
0 0/5 14,18 * * ?
+
Starts every 5 minutes from 2:00 PM to 2:59 PM and from 6:00 PM to 6:59 PM every day
+
+
+
0 0-5 14 * * ?
+
Starts every minute from 2:00 PM to 2:05 PM every day
+
+
+
0 10,44 14 ? 3 WED
+
Starts at 2:10 PM and 2:44 PM every Wednesday in March
+
+
+
0 15 10 ? * MON-FRI
+
Starts at 10:15 AM Monday through Friday
+
+
+
+
Note that the schedule is checked at 60-second intervals by default, so the seconds field is not honored exactly. If you need second-level precision, customize the taskScanIntervalTime value in webapps/fess/WEB-INF/classes/chronosCustomize.dicon; if the default granularity is sufficient, no change is needed.
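As a sketch of the customization mentioned above, assuming the Seasar2 dicon property syntax used by Fess configuration files (the enclosing component definition is omitted, and the value shown is illustrative, in milliseconds):

<!-- in webapps/fess/WEB-INF/classes/chronosCustomize.dicon -->
<property name="taskScanIntervalTime">1000</property>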
+
+
+
Search results for PC-oriented websites may not display correctly on mobile devices. If you select a mobile conversion option, a PC site can be converted for display on mobile terminals. If you choose Google, the Google Wireless Transcoder is used to display the content on mobile phones: when browsing search results on a mobile terminal, the result links pass through the Google Wireless Transcoder, so PC sites can be viewed smoothly from mobile search.
+
+
+
Enabling the replication feature applies an already generated Solr index by copying it. For example, you can use this to search only on a front-facing search server, using an index crawled and generated on a different server.
+
+
+
After data is registered in Solr, it becomes searchable once a commit or optimize is issued. If optimize is selected, Solr index optimization is issued; if commit is selected, a commit is issued.
+
+
+
Fess can combine multiple Solr servers into a group and can manage multiple groups. Different groups are used for updates and for search. For example, with two groups, group 2 may be used for updates while group 1 is used for search; after a crawl completes, the roles switch, with group 1 taking updates and group 2 taking search. This setting is only meaningful if multiple Solr server groups are registered.
+
+
+
To improve indexing performance, Fess sends documents to Solr in batches of 20 while crawling. Because continuing to add documents without committing degrades the performance of documents already added to Solr, Fess issues a commit after the number of documents specified here. By default, a commit is issued after every 1000 documents added.
+
+
+
Fess crawls documents by Web crawling and by file system crawling. The value specified here limits how many crawl settings run simultaneously. For example, with 3 concurrent crawls and Web crawl settings 1 through 10, crawling starts with settings 1 through 3; when any of them completes, crawl setting 4 starts, and so on, one new crawl starting as each finishes until setting 10 is done.
+
Note that this value is the number of crawl settings that run at the same time, not the number of threads; the thread count is specified in each crawl setting. For example, with 3 concurrent crawl settings and 5 threads per setting, up to 3 x 5 = 15 threads crawl at once.
+
+
+
You can automatically delete data after it has been indexed. If you select 5 days, documents that were indexed at least 5 days ago and have not been updated since are removed. If you leave this disabled, data remains available even after its source has been removed.
+
+
+
If replication is enabled, the index information copied from the index directory to the snapshot path is applied.
+
+
+
+
diff --git a/src/site/en/xdoc/2.0/admin/crawlingSession-guide.xml b/src/site/en/xdoc/2.0/admin/crawlingSession-guide.xml
new file mode 100644
index 000000000..43afb4364
--- /dev/null
+++ b/src/site/en/xdoc/2.0/admin/crawlingSession-guide.xml
@@ -0,0 +1,34 @@
+
+
+
+ Set session information
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to session information. The result of one crawl is saved as one session information record. You can check the run time and the number of documents indexed.
+
+
After logging in with an administrator account, click the session information menu.
+
+
+
+
You can remove all session information by clicking the Delete All link; sessions that are currently running are excluded.
+
+
+
+
By clicking a session ID, you can see the content of that crawl.
+
+
Crawler*: information about the entire crawl
+
FsCrawl*: information about file system crawling
+
WebCrawl*: information about Web crawling
+
Optimize*: information about optimize requests issued to the Solr server
+
Commit*: information about commit requests issued to the Solr server.
Here, we describe how to back up and restore Fess settings information.
+
+
After logging in with an administrator account, click the backup and restore menu.
+
+
+
+
Click the download link to download the Fess settings information in XML format. The settings information saved is listed below.
+
+
The General crawl settings
+
Web crawl settings
+
File system Crawl settings
+
Path mapping
+
Web authentication
+
Compatible browsers
+
Session information
+
+
The Solr index data and the data being crawled are not backed up. That data can be regenerated by crawling again after restoring the Fess settings information.
+
+
+
You can restore settings information by uploading the XML output by the backup. Specify the XML file and click the restore button to upload it.
+
If overwriting of data is enabled, existing entries with matching data are updated.
Here are the settings for the design of the search screens.
+
+
After logging in with an administrator account, click the design menu.
+
+
You can edit the search screens on the screen shown below.
+
+
+
+
You can upload image files to use in the search screen. The supported image file formats are jpg, gif, and png.
+
+
+
Specify a file name if you want to change the name of the uploaded image file. If omitted, the name of the uploaded file is used.
+
+
+
You can edit the JSP files of the search screen. Press the Edit button of a JSP file to edit its current contents; press the Default button to edit the JSP file as it was at installation time. Saving with the update button on the edit screen applies the changes.
+
The editable JSP files are listed below.
+
+
+
+
Top page (frame)
+
This is the JSP file for the search home page. This JSP includes the JSP files of each part.
+
+
+
Top page (within the Head tags)
+
This JSP file represents the content inside the head tag of the search home page. Change it if you want to edit meta tags, title tags, script tags, and so on.
+
+
+
Top page (content)
+
This is the JSP file that represents the body tag content of the search home page.
+
+
+
Search results pages (frames)
+
This is the JSP file for the search results list page. This JSP includes the JSP files of each part.
+
+
+
Search results page (within the Head tags)
+
This JSP file represents the content inside the head tag of the search results list page. Change it if you want to edit meta tags, title tags, script tags, and so on.
+
+
+
Search results page (header)
+
This JSP file represents the header of the search results list page. It contains the search form at the top.
+
+
+
Search results page (footer)
+
This JSP file represents the footer of the search results list page. It contains the copyright notice at the bottom of the page.
+
+
+
Search results pages (content)
+
This JSP file represents the search results section of the results list page. It is used when there are search results. Change it if you want to customize how search results are rendered.
+
+
+
Search results page (no results)
+
This JSP file represents the search results section of the results list page. It is used when there are no search results.
+
+
+
+
The mobile screens can be edited in the same way as the PC screens.
+
+
+
+
diff --git a/src/site/en/xdoc/2.0/admin/fileCrawlingConfig-guide.xml b/src/site/en/xdoc/2.0/admin/fileCrawlingConfig-guide.xml
new file mode 100644
index 000000000..1f7628385
--- /dev/null
+++ b/src/site/en/xdoc/2.0/admin/fileCrawlingConfig-guide.xml
@@ -0,0 +1,96 @@
+
+
+
+ Settings for crawling a file system using
+ Shinsuke Sugaya
+
+
+
+
Describes the settings for crawling using a file system.
+
If you want to index more than 100,000 documents, we recommend splitting them across multiple crawl settings of several tens of thousands of documents each. Indexing performance degrades when one crawl setting targets more than 100,000 documents.
+
+
After logging in with an administrator account, click the file menu.
+
+
+
+
This is the name that appears on the list page.
+
+
+
You can specify multiple paths. Each path must start with file:. For example,
+
+
and so on. Everything below the specified directory is crawled.
+
In a Windows environment, the path must be written as a URI; for example, c:\Documents\taro is specified as file:/c:/Documents/taro.
+
+
+
By specifying regular expressions, you can limit or exclude crawling and search for given path patterns.
+
+
+
+
Path to crawl
+
Paths matching the specified regular expression are crawled.
+
+
+
The path to exclude from being crawled
+
Paths matching the specified regular expression are not crawled. Even if a path is specified as a crawl target, this specification takes precedence.
+
+
+
Path to be searched
+
Paths matching the specified regular expression are searchable. Even if a path is specified to be excluded from search, this specification takes precedence.
+
+
+
Path to exclude from searches
+
Paths matching the specified regular expression are not searched. Use this when you want to exclude only some paths from search: a path excluded from crawling cannot have its links followed, whereas a path excluded only from search is still crawled.
+
+
+
+
For example, if you want to crawl only paths under /home/, specify the following as the path to crawl:
+
+
To exclude paths with the png extension, specify the following as the path to exclude:
+
+
and so on. Multiple patterns can be specified, one per line.
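As a sketch of the two patterns described above (the expressions are illustrative regular expressions; the first is a path to crawl, the second a path to exclude):

file:/home/.*
.*\.png$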
+
Note that the URI is handled in the same way as java.io.File.
You can specify the maximum number of documents to retrieve in a crawl.
+
+
+
Specifies the number of crawler threads. A value of 5 means 5 threads crawl simultaneously.
+
+
+
This is the interval, in milliseconds, between document retrievals. With a value of 5000, each thread retrieves a document every 5 seconds.
+
With 5 threads and an interval of 1000 milliseconds, up to 5 documents are retrieved per second.
+
+
+
You can weight the documents of this crawl setting for search. Use this if you want these results ranked above others. The default is 1; documents with higher values are displayed nearer the top of the search results. If you want these results favored above all others, specify a sufficiently large value such as 10000.
+
The value must be an integer greater than 0. It is used as the boost value when adding documents to Solr.
+
+
+
Crawled documents are registered with the selected browser types. If you select only PC, for example, the documents do not appear in search results on mobile devices. Use this if you want documents shown only on specific mobile devices.
+
+
+
You can restrict documents so that they appear in search results only for users with a particular role. Roles must be registered beforehand. This is useful, for example, in systems that require login, such as portal servers, when you want to control which search results each user sees.
+
+
+
You can attach labels to search results. When labels are enabled, a label can be specified on the search screen to search within it.
+
+
+
Set this to enabled for the setting to be crawled. Use this if you want to stop crawling temporarily.
Here are the settings for labels. Labels classify the documents that appear in search results and are selected in the crawl settings. If labels are registered, a label selection drop-down box is shown to the right of the search box.
+
+
After logging in with an administrator account, click the label menu.
+
+
+
+
+
Specifies the name displayed in the label drop-down on the search screen.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must consist of alphanumeric characters.
Here are the settings for duplicate hosts. Use this when the same content is crawled under different host names and those hosts should be treated as one. For example, use it if www.example.com and example.com serve the same site.
+
+
After logging in with an administrator account, click the duplicate host menu.
+
+
+
+
+
Specify the canonical host name. Duplicate host names are replaced by the canonical host name.
+
+
+
Specify the duplicated host name, that is, the host name you want replaced.
Here are the settings for path mapping. Use path mapping if you want to replace the links that appear in search results.
+
+
After logging in with an administrator account, click the path mapping menu.
+
+
+
+
+
Path mapping replaces the parts of a link that match the specified regular expression with the replacement string. When crawling a local file system, the resulting search result links may not be valid for the person searching; in such cases, path mapping lets you control the links shown in the search results. Multiple path mappings can be specified.
Here are the settings for request headers. The request header feature adds header information to the requests issued when crawling documents. It is useful, for example, with authentication systems that check header information and log you in automatically when certain values are present.
+
+
After logging in with an administrator account, click the request header menu.
+
+
+
+
+
Specifies the name of the request header to append to requests.
+
+
+
Specifies the value of the request header to append to requests.
+
+
+
Select the Web crawl setting name to which the request headers are added. The headers are appended to requests only for the selected crawl settings.
+
+
+
+
diff --git a/src/site/en/xdoc/2.0/admin/roleType-guide.xml b/src/site/en/xdoc/2.0/admin/roleType-guide.xml
new file mode 100644
index 000000000..b6e3776fb
--- /dev/null
+++ b/src/site/en/xdoc/2.0/admin/roleType-guide.xml
@@ -0,0 +1,23 @@
+
+
+
+ Settings for a role
+ Shinsuke Sugaya
+
+
+
+
Here are the settings for roles. A role selected in the crawl settings can classify the documents that appear in search results. For how to use roles, please see the settings for role-based search.
+
+
After logging in with an administrator account, click the role menu.
+
+
+
+
+
Specifies the name that appears in the list.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must consist of alphanumeric characters.
Describes the settings related to the Solr servers registered in Fess. Solr servers are registered in groups, as defined in a configuration file.
+
+
After logging in with an administrator account, click the Solr menu.
+
+
+
+
The update server shows a running status while documents are being added. A crawl process displays its session ID while running. You can shut down the Fess server safely when nothing is running; if you shut down Fess while a crawl is running, the process does not terminate until the crawl finishes.
+
+
+
Displays the server group names used for search and for updates.
+
+
+
A server that becomes unavailable changes to the disabled status; for example, a Solr server that becomes inaccessible changes to disabled. To use a server again after it has recovered, enable it.
+
+
+
You can issue index commit and optimize operations to server groups. You can also delete the documents of a specific session ID.
Describes the Web authentication settings, which are required when crawling sites that need authentication. Fess supports crawling with BASIC authentication and DIGEST authentication.
+
+
After logging in with an administrator account, click the Web authentication menu.
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the setting applies to any host name in the specified Web crawl settings.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports. If omitted, the setting applies to any port in the specified Web crawl settings.
+
+
+
Specifies the realm name of the site that requires authentication. If omitted, the setting applies to any realm name in the specified Web crawl settings.
+
+
+
Select the authentication method. You can use BASIC authentication or DIGEST authentication.
+
+
+
Specifies the user name for authentication.
+
+
+
Specifies the password for the authentication site.
+
+
+
Select the Web crawl setting name to which the above authentication settings apply. The Web crawl settings must be registered in advance.
+
+
+
+
diff --git a/src/site/en/xdoc/2.0/admin/webCrawlingConfig-guide.xml b/src/site/en/xdoc/2.0/admin/webCrawlingConfig-guide.xml
new file mode 100644
index 000000000..316e0cda4
--- /dev/null
+++ b/src/site/en/xdoc/2.0/admin/webCrawlingConfig-guide.xml
@@ -0,0 +1,99 @@
+
+
+
+ Settings for crawling the Web using
+ Shinsuke Sugaya
+
+
+
+
Describes the settings for crawling using the Web.
+
If you want to index more than 100,000 documents, we recommend splitting them across multiple crawl settings of several tens of thousands of documents each. Indexing performance degrades when one crawl setting targets more than 100,000 documents.
+
+
After logging in with an administrator account, click the Web menu.
+
+
+
+
This is the name that appears on the list page.
+
+
+
You can specify multiple URLs. Each URL must start with http: or https:. For example,
+
+
and so on.
+
+
+
By specifying regular expressions, you can limit or exclude crawling and search for specific URL patterns.
+
+
+
+
URL to crawl
+
URLs matching the specified regular expression are crawled.
+
+
+
Excluded from the crawl URL
+
URLs matching the specified regular expression are not crawled. Even if a URL is specified as a crawl target, this specification takes precedence.
+
+
+
To search for URL
+
URLs matching the specified regular expression are searchable. Even if a URL is specified to be excluded from search, this specification takes precedence.
+
+
+
To exclude from the search URL
+
URLs matching the specified regular expression are not searched. Use this when you want to exclude only some URLs from search: a URL excluded from crawling cannot have its links followed, whereas a URL excluded only from search is still crawled.
+
+
+
+
For example, to crawl only URLs under http://localhost/, specify the following as the URL to crawl:
+
+
To exclude URLs with the png extension, specify the following as the URL to exclude:
+
+
and so on. Multiple patterns can be specified, one per line.
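As a sketch of the two patterns described above (illustrative regular expressions; the first is a URL to crawl, the second a URL to exclude):

http://localhost/.*
.*\.png$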
+
+
+
Specifies how deep to follow the links contained in crawled documents.
+
+
+
You can specify the maximum number of documents to retrieve in a crawl.
+
+
+
You can specify the user agent to use when crawling.
+
+
+
Specifies the number of crawler threads. A value of 5 means 5 threads crawl the website simultaneously.
+
+
+
This is the interval, in milliseconds, between document retrievals. With a value of 5000, each thread retrieves a document every 5 seconds.
+
With 5 threads and an interval of 1000 milliseconds, up to 5 documents are retrieved per second. Set an appropriate value so that crawling does not overload the target Web server.
+
+
+
You can weight the documents of this crawl setting for search. Use this if you want these results ranked above others. The default is 1; documents with higher values are displayed nearer the top of the search results. If you want these results favored above all others, specify a sufficiently large value such as 10000.
+
The value must be an integer greater than 0. It is used as the boost value when adding documents to Solr.
+
+
+
Crawled documents are registered with the selected browser types. If you select only PC, for example, the documents do not appear in search results on mobile devices. Use this if you want documents shown only on specific mobile devices.
+
+
+
You can restrict documents so that they appear in search results only for users with a particular role. Roles must be registered beforehand. This is useful, for example, in systems that require login, such as portal servers, when you want to control which search results each user sees.
+
+
+
You can attach labels to search results. When labels are enabled, a label can be specified on the search screen to search within it.
+
+
+
Set this to enabled for the setting to be crawled. Use this if you want to stop crawling temporarily.
+
+
+
+
+
Fess can crawl a sitemap file and crawl the URLs defined in it. Sitemaps follow the http://www.sitemaps.org/ specification. The supported formats are XML sitemaps, XML sitemap indexes, and text sitemaps (one URL per line).
+
Specify the sitemap URL in the URL field. Because a sitemap is an XML or text file, Fess cannot distinguish it from an ordinary URL while crawling; by default, URLs whose file names match sitemap.*.xml, sitemap.*.gz, or sitemap.*.txt are handled as sitemaps (this can be customized in webapps/fess/WEB-INF/classes/s2robot_rule.dicon).
+
URLs obtained from a sitemap are crawled in the next crawl, in the same way as links found in crawled HTML files.
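For reference, a minimal XML sitemap in the sitemaps.org format mentioned above (the URL is a placeholder):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
  </url>
</urlset>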
+
+
+
+
diff --git a/src/site/en/xdoc/2.0/config/index.xml b/src/site/en/xdoc/2.0/config/index.xml
new file mode 100644
index 000000000..a69b85c73
--- /dev/null
+++ b/src/site/en/xdoc/2.0/config/index.xml
@@ -0,0 +1,12 @@
+
+
+
+ Set up Guide
+ Shinsuke Sugaya
+
+
+
+
Fess log output (Solr logs go to logs/catalina.out) is written to webapps/fess/WEB-INF/logs/fess.out. The content written to fess.out is configured in webapps/fess/WEB-INF/classes/log4j.xml. By default, INFO-level messages are output.
+
For example, if you want to log the documents Fess sends to Solr, uncomment the section below in log4j.xml.
Depending on the contents of the crawl settings, an OutOfMemory error similar to the following may occur.
+
+
If it occurs, increase the maximum heap memory: change the -Xmx value in bin/setenv.[sh|bat] to -Xmx1024m (in this case setting the maximum to 1024M).
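A minimal sketch of the change, assuming the usual Tomcat setenv.sh convention (the layout of the bundled file may differ):

# bin/setenv.sh: raise the maximum heap to 1024 MB
JAVA_OPTS="$JAVA_OPTS -Xmx1024m"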
+
+
+
+
diff --git a/src/site/en/xdoc/2.0/config/mobile-device.xml b/src/site/en/xdoc/2.0/config/mobile-device.xml
new file mode 100644
index 000000000..98235ae27
--- /dev/null
+++ b/src/site/en/xdoc/2.0/config/mobile-device.xml
@@ -0,0 +1,17 @@
+
+
+
+ Mobile device information settings
+ Shinsuke Sugaya
+
+
+
+
The mobile device information uses data provided by ValueEngine Inc. To use the latest mobile device information, download the device profile, remove the _YYYY-MM-DD suffix from the file name, and save it in webapps/fess/WEB-INF/classes/device. Restart Fess to apply the change.
Fess applies a stemming process when indexing and searching.
+
Stemming normalizes English words; for example, recharging and rechargable are both normalized to the form recharg. A search for recharging therefore also hits rechargable, so fewer results are missed.
+
+
+
Because stemming is basic rule-based processing, unintended normalization can occur. For example, the word Maine (the state name) is normalized to main.
+
In this case, adding Maine to protwords.txt excludes it from the stemming process.
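For reference, protwords.txt lists one protected word per line, so the exclusion described above is a single line:

Maine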
Fess can copy Solr index data from a specified path. By building two Fess servers, one for crawling and index creation and one for search, you can distribute the load of index creation.
+
To use the Fess replication feature, the Solr index files must be placed on a shared disk, such as NFS, that each Fess server can reference.
+
+
+
+
Download and install Fess; assume it is installed in /net/server1/usr/local/fess.
+
After starting Fess, register crawl settings as in a normal setup and crawl to create the index (the index building procedure is exactly the same as the normal procedure).
+
+
+
Download and install Fess; assume it is installed in /net/server2/usr/local/fess.
+
After starting Fess, check the box that enables the replication feature in the crawl settings of the management screen and set the snapshot path. The snapshot path designates the location of the index created by the crawling Fess; in this case it is /net/server1/usr/local/fess/solr/core1/data/index.
+
+
Press the update button to save the settings; replication of the index is then performed at the time set in the schedule.
Fess can partition search results using the credentials of users authenticated by an arbitrary authentication system. For example, a document carrying role A appears in the search results of a user who has role A but not of a user B who does not. Using this feature in portal and single sign-on environments where users log in, you can scope search by department or job title.
+
Fess role-based search can obtain role information from the following sources:
+
+
Request parameter
+
Request header
+
Cookies
+
J2EE authentication information
+
+
When Fess runs behind a portal or an agent-based single sign-on system that stores authentication information in a cookie, role information can be retrieved from the cookie if its domain and path allow Fess to read it. With a reverse-proxy single sign-on system, role information can be retrieved from authentication information added to the request headers or request parameters of requests sent to Fess.
+
+
+
Describes how to set up role-based search using J2EE authentication information.
+
+
Add roles and users to conf/tomcat-users.xml. In this example, role-based search is performed with the role1 role, and we log in as a user that has role1.
+
+
+
+
+
+
+
+
+
+]]>
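A minimal sketch of the kind of tomcat-users.xml entries meant above (the user name and password are hypothetical placeholders):

<role rolename="role1"/>
<user username="user1" password="password1" roles="role1"/>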
+
+
+
Set webapps/fess/WEB-INF/classes/app.dicon as shown below.
+
+ <property name="authenticatedRoles">"role1"</property>
+   :
+ <property name="defaultRoleList">{"guest"}</property>
+   :
+]]>
+
authenticatedRoles can list multiple roles separated by commas (,). Setting defaultRoleList defines the role information used when there is no authentication information; by setting it so that a role is required, you can keep search results from being displayed to users who are not logged in.
+
+
+
Set webapps/fess/WEB-INF/web.xml as shown below.
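A minimal sketch of the kind of web.xml additions meant here, using standard servlet-spec J2EE authentication elements (the url-pattern is a hypothetical placeholder; the constraints shipped with Fess may differ):

<security-constraint>
  <web-resource-collection>
    <web-resource-name>Fess Authentication</web-resource-name>
    <!-- url-pattern is a hypothetical placeholder -->
    <url-pattern>/login/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>role1</role-name>
  </auth-constraint>
</security-constraint>
<security-role>
  <role-name>role1</role-name>
</security-role>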
Start Fess and log in as an administrator. From the role menu, register a role with the name Role1 (any name) and the value role1. Then, in each crawl setting that you want users with role1 to search, select Role1 in the crawl setting's roles.
+
+
+
Log out of the management screen and log in as the user with role1. On a successful login, you are redirected to the top of the search screen.
+
Searching as usual then displays only the documents of crawl settings that have the Role1 role set.
+
A search performed without logging in is executed as the guest user.
+
+
+
Whether you are logged out or logged in with a non-admin role, accessing http://localhost:8080/fess/admin displays the logout screen; pressing the logout button logs you out.
Fess uses port 8080 by default. To change it, follow the steps below.
+
+
Change the port of the Tomcat bundled with Fess by modifying the ports described below in conf/server.xml.
+
+
8080: HTTP access port
+
8005: shut down port
+
8009: AJP port
+
8443: SSL HTTP access port (disabled by default)
+
+
+
+
In the standard configuration, Solr runs in the same Tomcat, so if you change the Tomcat port you must also change the Solr server information that Fess references. Change webapps/fess/WEB-INF/classes/fess_solr.dicon.
+ "http://localhost:8080/solr"
+]]>
+
Note: if you change the Tomcat port without changing the URL above, the Solr server cannot be accessed and errors are displayed on search and on index updates.
+
Fess manages multiple Solr servers as groups and keeps server and group status information; for example, the status changes when a Solr server becomes inaccessible.
+
The Solr server status can be changed in the system settings. maxErrorCount, maxRetryStatusCheckCount, maxRetryUpdateQueryCount, and minActiveServer can be defined in webapps/fess/WEB-INF/classes/fess_solr.dicon.
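A minimal sketch of the kind of definitions meant here, using the dicon property syntax (the values are illustrative, not the shipped defaults):

<property name="minActiveServer">1</property>
<property name="maxErrorCount">3</property>
<property name="maxRetryStatusCheckCount">10</property>
<property name="maxRetryUpdateQueryCount">3</property>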
+
+
+
When the number of Solr servers in a valid state within a Solr group falls below minActiveServer, the Solr group is disabled.
+
While a group is disabled because the number of valid Solr servers is at or below minActiveServer, Fess checks whether the disabled Solr servers can be accessed, up to maxRetryStatusCheckCount times; if a server can be accessed, its status is changed from disabled back to valid. If its status could not be returned to valid even though the Solr server could be accessed, the server is put into the index corrupted state.
+
A disabled Solr group cannot be used.
+
To enable a Solr group, change the status of the Solr servers in the group to enabled on the system settings management screen.
+
+
+
+
+
Search queries are sent to a valid Solr group.
+
Search queries are sent only to Solr servers in the valid state.
+
If multiple Solr servers are registered in a Solr group, the search query is sent to the Solr server with the fewest accesses.
+
If sending a search query to a Solr server fails more than maxErrorCount times, that Solr server is changed to the disabled state.
+
+
+
+
+
Update queries are sent to a Solr group in the valid state.
+
Update queries are sent only to Solr servers in the valid state.
+
If multiple Solr servers are registered in a Solr group, the update query is sent to every valid Solr server.
+
If sending an update query to a Solr server fails more than maxRetryUpdateQueryCount times, that Solr server is changed to the index corrupted state.
+
+
+
+
+
diff --git a/src/site/en/xdoc/2.0/config/tokenizer.xml b/src/site/en/xdoc/2.0/config/tokenizer.xml
new file mode 100644
index 000000000..4181e30ff
--- /dev/null
+++ b/src/site/en/xdoc/2.0/config/tokenizer.xml
@@ -0,0 +1,36 @@
+
+
+
+ Settings for the index string extraction
+ Sone, Takaaki
+
+
+
+
+
When creating an index for search, documents must be split into the tokens that are registered in the index.
+
A tokenizer is used for this.
+
Basically, a search term smaller than the units produced by the tokenizer will not hit.
+
For example, suppose a sentence containing the word Tokyo (東京都) is split by the tokenizer into the unit 東京都. A search for 東京都 hits, but a search for Kyoto (京都), which appears only as a substring of 東京都, does not.
+
The choice of tokenizer is therefore important.
+
By default Fess uses CJKTokenizer; you can change the tokenizer by editing the analyzer part of schema.xml.
+
+
+
+
CJKTokenizer indexes multibyte strings, such as Japanese, as bi-grams, that is, in units of two characters. With it, single-character words cannot be found.
+
+
+
+
StandardTokenizer indexes multibyte strings, such as Japanese, as uni-grams, that is, one character at a time, so fewer search results are missed. Single-character search queries that cannot be handled with CJKTokenizer can be searched with StandardTokenizer.
+
The following example changes the analyzer part of schema.xml to use StandardTokenizer.
+
+
+
+
+ <tokenizer class="solr.StandardTokenizerFactory"/>
+ :
+]]>
+
+
+
+
diff --git a/src/site/en/xdoc/2.0/config/windows-service.xml b/src/site/en/xdoc/2.0/config/windows-service.xml
new file mode 100644
index 000000000..c9aec8e24
--- /dev/null
+++ b/src/site/en/xdoc/2.0/config/windows-service.xml
@@ -0,0 +1,45 @@
+
+
+
+ Register for the Windows service
+ Shinsuke Sugaya
+
+
+
+
You can register Fess as a Windows service in a Windows environment. The procedure is the same as registering Tomcat as a service.
+
+
First, after installing Fess, run service.bat from the command prompt (on Vista and later you must launch it as administrator). In this example Fess is installed in C:\Java\fess-server-2.0.0.
+ cd C:\Java\fess-server-2.0.0\bin
+> service.bat install fess
+...
+The service 'fess' has been installed.
+]]>
+
Next, set the properties for Fess. Running the following opens the Tomcat properties window.
+ tomcat6w.exe //ES//fess
+]]>
+
Set the following in the Java Options in the Java tab.
+
+
Change the value of the maximum memory pool to 512 MB, press the OK button to save the settings, and then start Fess as a normal Windows service.
Labels registered on the management screen enable search by label on the search screen. Use labels if you want to classify search results. If no label is registered, the label drop-down box is not displayed.
+
+
Labels are applied when the index is created, so documents can be searched by the label specified in each crawl setting. A search with no label specified searches all results as usual.
You can sort search results by specifying fields at search time.
+
You can sort the following fields by default.
+
+
+
+
tstamp
+
Date and time of the crawl
+
+
+
contentLength
+
Size of the crawled content
+
+
+
lastModified
+
Last modified date of the crawled content
+
+
+
+
+
To sort, include 'sort:<field name>' in the search form when searching.
+
For example, to search for Fess and sort the results by content size in ascending order, enter the following:
+
+
To sort in descending order, enter the following:
+
+
To sort by multiple fields, separate them with commas, as shown below:
+
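A sketch of the syntax described above, assuming the sort:<field>.<order> form with comma-separated fields and Fess as the search term:

Fess sort:contentLength.asc
Fess sort:contentLength.desc
Fess sort:contentLength.desc,lastModified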
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/admin/browserType-guide.xml b/src/site/en/xdoc/3.0/admin/browserType-guide.xml
new file mode 100644
index 000000000..3a1b65470
--- /dev/null
+++ b/src/site/en/xdoc/3.0/admin/browserType-guide.xml
@@ -0,0 +1,19 @@
+
+
+
+ Setting the browser type
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to the browser type. Browser type information can be attached to indexed data, so that search results can be separated for each type of browsing browser.
+
+
After logging in with an administrator account, click the browser types menu.
+
+
+
+
You can set the display name and value. Use this if you want to add new terminal types. No special customization is required; use it only where necessary.
You can use the Settings Wizard to set up Fess easily.
+
+
After logging in with an administrator account, click the Settings Wizard menu.
+
+
First, set the schedule.
+
Fess crawls and creates the index at the specified time.
+
By default, it runs at 0:00 AM every day.
+
+
Next, the crawl settings.
+
A crawl setting registers a URI to be crawled.
+
For the crawl setting name, use any name that is easy to identify.
+
In the URI field, enter the URI you want indexed and made searchable.
+
+
For example, to make http://example.com searchable, the settings look like the following.
+
+
This is the last setting.
+
Press the Crawl Start button to begin crawling immediately. If you press the Finish button instead, crawling does not start until the time specified in the schedule settings.
+
+
+
+
Settings made in the Settings Wizard can also be changed later from the crawl General, Web, and file system settings.
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/admin/crawl-guide.xml b/src/site/en/xdoc/3.0/admin/crawl-guide.xml
new file mode 100644
index 000000000..29e3108ea
--- /dev/null
+++ b/src/site/en/xdoc/3.0/admin/crawl-guide.xml
@@ -0,0 +1,100 @@
+
+
+
+ The General crawl settings
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to crawling.
+
+
After logging in with an administrator account, click the crawl General menu.
+
+
You can specify the path to the generated index and enable the replication feature.
+
+
+
+
You can set the interval at which crawls for Web sites or file systems run. The default is the following.
+
+
The figures represent, from left to right: seconds, minutes, hours, day of month, month, and day of week. The format is similar to Unix cron settings. This example crawls daily at 0:00 AM.
+
Following are examples of how to write.
+
+
+
+
0 0 12 * * ?
+
Starts at 12:00 PM (noon) every day
+
+
+
0 15 10 ? * *
+
Starts at 10:15 AM every day
+
+
+
0 15 10 * * ?
+
Starts at 10:15 AM every day
+
+
+
0 15 10 * * ? *
+
Starts at 10:15 AM every day
+
+
+
0 15 10 * * ? 2005
+
Starts at 10:15 AM every day during the year 2005
+
+
+
0 * 14 * * ?
+
Starts every minute from 2:00 PM to 2:59 PM every day
+
+
+
0 0/5 14 * * ?
+
Starts every 5 minutes from 2:00 PM to 2:59 PM every day
+
+
+
0 0/5 14,18 * * ?
+
Starts every 5 minutes from 2:00 PM to 2:59 PM and from 6:00 PM to 6:59 PM every day
+
+
+
0 0-5 14 * * ?
+
Starts every minute from 2:00 PM to 2:05 PM every day
+
+
+
0 10,44 14 ? 3 WED
+
Starts at 2:10 PM and 2:44 PM every Wednesday in March
+
+
+
0 15 10 ? * MON-FRI
+
Starts at 10:15 AM Monday through Friday
+
+
+
+
Note that the schedule is checked at 60-second intervals by default, so the seconds field is not honored exactly. If you need second-level precision, customize the taskScanIntervalTime value in webapps/fess/WEB-INF/classes/chronosCustomize.dicon; if the default granularity is sufficient, no change is needed.
+
+
+
Search results for PC-oriented websites may not display correctly on mobile devices. If you select a mobile conversion option, a PC site can be converted for display on mobile terminals. If you choose Google, the Google Wireless Transcoder is used to display the content on mobile phones: when browsing search results on a mobile terminal, the result links pass through the Google Wireless Transcoder, so PC sites can be viewed smoothly from mobile search.
+
+
+
Enabling the replication feature applies an already generated Solr index by copying it. For example, you can use this to search only on a front-facing search server, using an index crawled and generated on a different server.
+
+
+
After data is registered in Solr, it becomes searchable once a commit or optimize is issued. If optimize is selected, Solr index optimization is issued; if commit is selected, a commit is issued.
+
+
+
Fess can combine multiple Solr servers into a group and can manage multiple groups. Different groups are used for updates and for search. For example, with two groups, group 2 may be used for updates while group 1 is used for search; after a crawl completes, the roles switch, with group 1 taking updates and group 2 taking search. This setting is only meaningful if multiple Solr server groups are registered.
+
+
+
To improve indexing performance, Fess sends documents to Solr in batches of 20 while crawling. Because continuing to add documents without committing degrades the performance of documents already added to Solr, Fess issues a commit after the number of documents specified here. By default, a commit is issued after every 1000 documents added.
+
+
+
Fess crawls documents by Web crawling and by file system crawling. The value specified here limits how many crawl settings run simultaneously. For example, with 3 concurrent crawls and Web crawl settings 1 through 10, crawling starts with settings 1 through 3; when any of them completes, crawl setting 4 starts, and so on, one new crawl starting as each finishes until setting 10 is done.
+
Note that this value is the number of crawl settings that run at the same time, not the number of threads; the thread count is specified in each crawl setting. For example, with 3 concurrent crawl settings and 5 threads per setting, up to 3 x 5 = 15 threads crawl at once.
+
+
+
You can automatically delete data after it has been indexed. If you select 5 days, documents that were indexed at least 5 days ago and have not been updated since are removed. If you leave this disabled, data remains available even after its source has been removed.
+
+
+
If replication is enabled, the index information copied from the index directory to the snapshot path is applied.
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/admin/crawlingSession-guide.xml b/src/site/en/xdoc/3.0/admin/crawlingSession-guide.xml
new file mode 100644
index 000000000..efde2ce79
--- /dev/null
+++ b/src/site/en/xdoc/3.0/admin/crawlingSession-guide.xml
@@ -0,0 +1,34 @@
+
+
+
+ Set session information
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to session information. The result of one crawl is saved as one session information record. You can check the run time and the number of documents indexed.
+
+
After logging in with an administrator account, click the session information menu.
+
+
+
+
You can remove all session information by clicking the Delete All link; sessions that are currently running are excluded.
+
+
+
+
By clicking a session ID, you can see the content of that crawl.
+
+
Crawler*: information about the entire crawl
+
FsCrawl*: information about file system crawling
+
WebCrawl*: information about Web crawling
+
Optimize*: information about optimize requests issued to the Solr server
+
Commit*: information about commit requests issued to the Solr server.
Here, we describe how to back up and restore Fess settings information.
+
+
After logging in with an administrator account, click the backup and restore menu.
+
+
+
+
Click the download link to download the Fess settings information in XML format. The settings information saved is listed below.
+
+
The General crawl settings
+
Web crawl settings
+
File system Crawl settings
+
Path mapping
+
Web authentication
+
Compatible browsers
+
Session information
+
+
The Solr index data and the data being crawled are not backed up. That data can be regenerated by crawling again after restoring the Fess settings information.
+
+
+
You can restore settings information by uploading the XML output by the backup. Specify the XML file and click the restore button to upload it.
+
If overwriting of data is enabled, existing entries with matching data are updated.
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/admin/dataStoreCrawling-guide.xml b/src/site/en/xdoc/3.0/admin/dataStoreCrawling-guide.xml
new file mode 100644
index 000000000..1ed9547ce
--- /dev/null
+++ b/src/site/en/xdoc/3.0/admin/dataStoreCrawling-guide.xml
@@ -0,0 +1,129 @@
+
+
+
+ Data store configuration
+ Sone, Takaaki
+
+
+
+
Fess can crawl databases. Here are the settings required for a data store.
+
+
After logging in with an administrator account, click the data store menu.
+
+
As an example, we create settings to connect to a MySQL database named testdb with user name hoge and password fuga, and crawl the following table.
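A sketch of the kind of table meant above (the table and column names are hypothetical, chosen to match the script keys described later):

CREATE TABLE doc (
    id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    url VARCHAR(255) NOT NULL,
    title VARCHAR(100) NOT NULL,
    content TEXT NOT NULL
);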
+
+
+
+
An example of the parameter settings looks like the following.
+
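A sketch of parameter settings consistent with the connection details above (the JDBC URL and SQL statement are illustrative):

driver=com.mysql.jdbc.Driver
url=jdbc:mysql://localhost:3306/testdb?useUnicode=true&characterEncoding=UTF-8
username=hoge
password=fuga
sql=SELECT * FROM doc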
+
Parameters are in "key = value" format. The keys are described below.
+
+
+
+
driver
+
Driver class name
+
+
+
URL
+
Connection URL
+
+
+
username
+
User name for connecting to the DB
+
+
+
password
+
Password for connecting to the DB
+
+
+
SQL
+
SQL statement that retrieves the data to crawl
+
+
+
+
+
+
An example of the script settings looks like the following.
+
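A sketch of script settings mapping the hypothetical columns above to the keys described below (values are OGNL expressions; the URL is a placeholder):

url="http://localhost/doc?id=" + id
host="localhost"
site="localhost"
title=title
content=content
digest=content
cache=content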
+
+ Scripts are also in "key = value" format. The keys are described below.
+
+ The value side is written in OGNL: enclose strings in double quotation marks. Database column values can be accessed by their column names.
+
+
+
+
URL
+
URL (the link shown in search results)
+
+
+
host
+
Host name
+
+
+
site
+
Site path
+
+
+
title
+
Title
+
+
+
content
+
Content (the indexed string)
+
+
+
cache
+
Content cache (not indexed)
+
+
+
digest
+
Digest shown in search results
+
+
+
anchor
+
Links to content (not usually required)
+
+
+
contentLength
+
The length of the content
+
+
+
lastModified
+
Last modified date of the content
+
+
+
+
+
+
A driver is needed to connect to the database. Place the JDBC driver jar file in webapps/fess/WEB-INF/cmd/lib.
Here are the settings for the design of the search screens.
+
+
After logging in with an administrator account, click the design menu.
+
+
You can edit the search screens on the screen shown below.
+
+
+
+
You can upload image files to use in the search screen. The supported image file formats are jpg, gif, and png.
+
+
+
Specify a file name if you want to change the name of the uploaded image file. If omitted, the name of the uploaded file is used.
+
+
+
You can edit the JSP files of the search screen. Press the Edit button of a JSP file to edit its current contents; press the Default button to edit the JSP file as it was at installation time. Saving with the update button on the edit screen applies the changes.
+
The editable JSP files are listed below.
+
+
+
+
Top page (frame)
+
This is the JSP file for the search home page. This JSP includes the JSP files of each part.
+
+
+
Top page (within the Head tags)
+
This JSP file represents the content inside the head tag of the search home page. Change it if you want to edit meta tags, title tags, script tags, and so on.
+
+
+
Top page (content)
+
This is the JSP file that represents the body tag content of the search home page.
+
+
+
Search results pages (frames)
+
This is the JSP file for the search results list page. This JSP includes the JSP files of each part.
+
+
+
Search results page (within the Head tags)
+
This JSP file represents the content inside the head tag of the search results list page. Change it if you want to edit meta tags, title tags, script tags, and so on.
+
+
+
Search results page (header)
+
This JSP file represents the header of the search results list page. It contains the search form at the top.
+
+
+
Search results page (footer)
+
This JSP file represents the footer of the search results list page. It contains the copyright notice at the bottom of the page.
+
+
+
Search results pages (content)
+
This JSP file represents the search results section of the results list page. It is used when there are search results. Change it if you want to customize how search results are rendered.
+
+
+
Search results page (no results)
+
This JSP file represents the search results section of the results list page. It is used when there are no search results.
+
+
+
+
The mobile screens can be edited in the same way as the PC screens.
+
+
+
+
+
If you want search results to display the date a document was registered or modified in Fess, edit the search results page (content) and write the following.
tstampDate holds the registration date and lastModifiedDate holds the last modified date. The output date format is specified as a SimpleDateFormat pattern.
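A minimal sketch of the kind of JSP fragment meant here, assuming the JSTL fmt taglib is available (the variable access is illustrative; the actual Fess JSP may use different tags):

<%@ taglib prefix="fmt" uri="http://java.sun.com/jsp/jstl/fmt" %>
Registered: <fmt:formatDate value="${doc.tstampDate}" pattern="yyyy-MM-dd HH:mm"/>
Modified: <fmt:formatDate value="${doc.lastModifiedDate}" pattern="yyyy-MM-dd HH:mm"/>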
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/admin/fileCrawlingConfig-guide.xml b/src/site/en/xdoc/3.0/admin/fileCrawlingConfig-guide.xml
new file mode 100644
index 000000000..47b371320
--- /dev/null
+++ b/src/site/en/xdoc/3.0/admin/fileCrawlingConfig-guide.xml
@@ -0,0 +1,96 @@
+
+
+
+ Settings for crawling a file system using
+ Shinsuke Sugaya
+
+
+
+
Describes the settings for crawling using a file system.
+
If you want to index more than 100,000 documents, we recommend splitting them across multiple crawl settings of several tens of thousands of documents each. Indexing performance degrades when one crawl setting targets more than 100,000 documents.
+
+
After logging in with an administrator account, click the file menu.
+
+
+
+
This is the name that appears on the list page.
+
+
+
You can specify multiple paths. Each path must start with file:. For example,
+
+
and so on. Everything below the specified directory is crawled.
+
In a Windows environment, the path must be written as a URI; for example, c:\Documents\taro is specified as file:/c:/Documents/taro.
+
+
+
By specifying regular expressions, you can limit or exclude crawling and search for given path patterns.
+
+
+
+
Path to crawl
+
Paths matching the specified regular expression are crawled.
+
+
+
The path to exclude from being crawled
+
Paths matching the specified regular expression are not crawled. Even if a path is specified as a crawl target, this specification takes precedence.
+
+
+
Path to be searched
+
Paths matching the specified regular expression are searchable. Even if a path is specified to be excluded from search, this specification takes precedence.
+
+
+
Path to exclude from searches
+
Paths matching the specified regular expression are not searched. Use this when you want to exclude only some paths from search: a path excluded from crawling cannot have its links followed, whereas a path excluded only from search is still crawled.
+
+
+
+
For example, if you want to crawl only paths under /home/, specify the following as the path to crawl:
+
+
To exclude paths with the png extension, specify the following as the path to exclude:
+
+
and so on. Multiple patterns can be specified, one per line.
+
Note that the URI is handled in the same way as java.io.File.
You can specify the maximum number of documents to retrieve in a crawl.
+
+
+
Specifies the number of crawler threads. A value of 5 means 5 threads crawl simultaneously.
+
+
+
This is the interval, in milliseconds, between document retrievals. With a value of 5000, each thread retrieves a document every 5 seconds.
+
With 5 threads and an interval of 1000 milliseconds, up to 5 documents are retrieved per second.
+
+
+
You can weight the documents of this crawl setting for search. Use this if you want these results ranked above others. The default is 1; documents with higher values are displayed nearer the top of the search results. If you want these results favored above all others, specify a sufficiently large value such as 10000.
+
The value must be an integer greater than 0. It is used as the boost value when adding documents to Solr.
+
+
+
Crawled documents are registered with the selected browser types. If you select only PC, for example, the documents do not appear in search results on mobile devices. Use this if you want documents shown only on specific mobile devices.
+
+
+
You can restrict documents so that they appear in search results only for users with a particular role. Roles must be registered beforehand. This is useful, for example, in systems that require login, such as portal servers, when you want to control which search results each user sees.
+
+
+
You can attach labels to search results. When labels are enabled, a label can be specified on the search screen to search within it.
+
+
+
Set this to enabled for the setting to be crawled. Use this if you want to stop crawling temporarily.
Here are the settings for labels. Labels classify the documents that appear in search results and are selected in the crawl settings. If labels are registered, a label selection drop-down box is shown to the right of the search box.
+
+
After logging in with an administrator account, click the label menu.
+
+
+
+
+
Specifies the name displayed in the label drop-down on the search screen.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must consist of alphanumeric characters.
Here are the settings for duplicate hosts. Use this when the same content is crawled under different host names and those hosts should be treated as one. For example, use it if www.example.com and example.com serve the same site.
+
+
After logging in with an administrator account, click the duplicate host menu.
+
+
+
+
+
Specify the canonical host name. Duplicate host names are replaced by the canonical host name.
+
+
+
Specify the duplicated host name, that is, the host name you want replaced.
Here are the settings for path mapping. Use path mapping if you want to replace the links that appear in search results.
+
+
After logging in with an administrator account, click the path mapping menu.
+
+
+
+
+
Path mapping replaces the parts of a link that match the specified regular expression with the replacement string. When crawling a local file system, the resulting search result links may not be valid for the person searching; in such cases, path mapping lets you control the links shown in the search results. Multiple path mappings can be specified.
Here are the settings for request headers. The request header feature adds header information to the requests issued when crawling documents. It is useful, for example, with authentication systems that check header information and log you in automatically when certain values are present.
+
+
After logging in with an administrator account, click the request header menu.
+
+
+
+
+
Specifies the name of the request header to append to requests.
+
+
+
Specifies the value of the request header to append to requests.
+
+
+
Select the Web crawl setting name to which the request headers are added. The headers are appended to requests only for the selected crawl settings.
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/admin/roleType-guide.xml b/src/site/en/xdoc/3.0/admin/roleType-guide.xml
new file mode 100644
index 000000000..0e16265d2
--- /dev/null
+++ b/src/site/en/xdoc/3.0/admin/roleType-guide.xml
@@ -0,0 +1,23 @@
+
+
+
+ Settings for a role
+ Shinsuke Sugaya
+
+
+
+
Here are the settings for roles. A role selected in the crawl settings can classify the documents that appear in search results. For how to use roles, please see the settings for role-based search.
+
+
After logging in with an administrator account, click the role menu.
+
+
+
+
+
Specifies the name that appears in the list.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must consist of alphanumeric characters.
Describes the settings related to the Solr servers registered in Fess. Solr servers are registered in groups, as defined in a configuration file.
+
+
After logging in with an administrator account, click the Solr menu.
+
+
+
+
The update server shows a running status while documents are being added. A crawl process displays its session ID while running. You can shut down the Fess server safely when nothing is running; if you shut down Fess while a crawl is running, the process does not terminate until the crawl finishes.
+
+
+
Displays the server group names used for search and for updates.
+
+
+
A server that becomes unavailable changes to the disabled status; for example, a Solr server that becomes inaccessible changes to disabled. To use a server again after it has recovered, enable it.
+
+
+
You can issue index commit and optimize operations to server groups. You can also delete the documents of a specific session ID.
Describes the Web authentication settings, which are required when crawling sites that need authentication. Fess supports crawling with BASIC authentication and DIGEST authentication.
+
+
After logging in with an administrator account, click the Web authentication menu.
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the setting applies to any host name in the specified Web crawl settings.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports. If omitted, the setting applies to any port in the specified Web crawl settings.
+
+
+
Specifies the realm name of the site that requires authentication. If omitted, the setting applies to any realm name in the specified Web crawl settings.
+
+
+
Select the authentication method. You can use BASIC authentication or DIGEST authentication.
+
+
+
Specifies the user name for authentication.
+
+
+
Specifies the password for the authentication site.
+
+
+
Select the Web crawl setting name to which the above authentication settings apply. The Web crawl settings must be registered in advance.
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/admin/webCrawlingConfig-guide.xml b/src/site/en/xdoc/3.0/admin/webCrawlingConfig-guide.xml
new file mode 100644
index 000000000..b55cee33b
--- /dev/null
+++ b/src/site/en/xdoc/3.0/admin/webCrawlingConfig-guide.xml
@@ -0,0 +1,99 @@
+
+
+
+ Settings for crawling the Web using
+ Shinsuke Sugaya
+
+
+
+
Describes the settings for crawling using the Web.
+
If you want to index more than 100,000 documents, we recommend splitting them across multiple crawl settings of several tens of thousands of documents each. Indexing performance degrades when one crawl setting targets more than 100,000 documents.
+
+
After logging in with an administrator account, click the Web menu.
+
+
+
+
This is the name that appears on the list page.
+
+
+
You can specify multiple URLs. Each URL must start with http: or https:. For example,
+
+
and so on.
+
+
+
By specifying regular expressions, you can limit or exclude crawling and search for specific URL patterns.
+
+
+
+
URL to crawl
+
URLs matching the specified regular expression are crawled.
+
+
+
Excluded from the crawl URL
+
URLs matching the specified regular expression are not crawled. Even if a URL is specified as a crawl target, this specification takes precedence.
+
+
+
To search for URL
+
URLs matching the specified regular expression are searchable. Even if a URL is specified to be excluded from search, this specification takes precedence.
+
+
+
To exclude from the search URL
+
URLs matching the specified regular expression are not searched. Use this when you want to exclude only some URLs from search: a URL excluded from crawling cannot have its links followed, whereas a URL excluded only from search is still crawled.
+
+
+
+
For example, to crawl only URLs under http://localhost/, specify the following as the URL to crawl:
+
+
To exclude URLs with the png extension, specify the following as the URL to exclude:
+
+
and so on. Multiple patterns can be specified, one per line.
+
+
+
Specifies how deep to follow the links contained in crawled documents.
+
+
+
You can specify the maximum number of documents to retrieve in a crawl.
+
+
+
You can specify the user agent to use when crawling.
+
+
+
Specifies the number of crawler threads. A value of 5 means 5 threads crawl the website simultaneously.
+
+
+
This is the interval, in milliseconds, between document retrievals. With a value of 5000, each thread retrieves a document every 5 seconds.
+
With 5 threads and an interval of 1000 milliseconds, up to 5 documents are retrieved per second. Set an appropriate value so that crawling does not overload the target Web server.
+
+
+
You can weight the documents of this crawl setting for search. Use this if you want these results ranked above others. The default is 1; documents with higher values are displayed nearer the top of the search results. If you want these results favored above all others, specify a sufficiently large value such as 10000.
+
The value must be an integer greater than 0. It is used as the boost value when adding documents to Solr.
+
+
+
Crawled documents are registered with the selected browser types. If you select only PC, for example, the documents do not appear in search results on mobile devices. Use this if you want documents shown only on specific mobile devices.
+
+
+
You can restrict documents so that they appear in search results only for users with a particular role. Roles must be registered beforehand. This is useful, for example, in systems that require login, such as portal servers, when you want to control which search results each user sees.
+
+
+
You can attach labels to search results. When labels are enabled, a label can be specified on the search screen to search within it.
+
+
+
Crawl crawl time, is set to enable. If you want to avoid crawling temporarily available.
+
+
+
+
+
Fess can crawl sitemap files defined as URLs to crawl. Sitemaps follow the http://www.sitemaps.org/ specification; supported formats are XML Sitemaps, XML Sitemap Index files, and text sitemaps (one URL per line).
+
Specify the sitemap URL as a URL to crawl. Because a sitemap may be an XML or text file, Fess cannot tell an ordinary URL and a sitemap apart while crawling; by default, URLs whose file names match sitemap.*.xml, sitemap.*.gz, or sitemap.*.txt are handled as sitemaps (this can be customized in webapps/fess/WEB-INF/classes/s2robot_rule.dicon).
+
When a sitemap file is crawled, the URLs listed in it are extracted like links in an HTML file and are crawled in the next crawl.
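For reference, a text sitemap in the sense above is simply a list of URLs, one per line (hosts are illustrative):

http://www.example.com/
http://www.example.com/page1.html
http://www.example.com/page2.html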
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/config/filesize.xml b/src/site/en/xdoc/3.0/config/filesize.xml
new file mode 100644
index 000000000..dc6c6adb0
--- /dev/null
+++ b/src/site/en/xdoc/3.0/config/filesize.xml
@@ -0,0 +1,28 @@
+
+
+
+ Crawled file size settings
+ Shinsuke Sugaya
+
+
+
+
You can limit the size of files Fess crawls. By default, HTML files are processed up to 2.5 MB and other files up to 10 MB. To change these limits, edit webapps/fess/WEB-INF/classes/s2robot_contentlength.dicon. The standard s2robot_contentlength.dicon is as follows.
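The listing was stripped here; a sketch of what the standard file defines, assuming the 2.5 MB / 10 MB defaults above (component and method names follow S2Robot's ContentLengthHelper):

<!-- s2robot_contentlength.dicon (sketch) -->
<component name="contentLengthHelper"
    class="org.seasar.robot.helper.ContentLengthHelper" instance="singleton">
  <property name="defaultMaxLength">10485760</property><!-- 10 MB for other files -->
  <initMethod name="addMaxLength">
    <arg>"text/html"</arg>
    <arg>2621440</arg><!-- 2.5 MB for HTML -->
  </initMethod>
</component>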
Change the value of defaultMaxLength to change the default limit. Limits can also be specified per content type; the entry above sets the maximum size for HTML files (content type text/html).
+
When raising the maximum file size, keep an eye on the amount of heap memory this requires. For how to configure memory, see the memory-related section.
+
+
+
diff --git a/src/site/en/xdoc/3.0/config/index-backup.xml b/src/site/en/xdoc/3.0/config/index-backup.xml
new file mode 100644
index 000000000..930ea9df7
--- /dev/null
+++ b/src/site/en/xdoc/3.0/config/index-backup.xml
@@ -0,0 +1,13 @@
+
+
+
+ Index backup and restore
+ Shinsuke Sugaya
+
+
+
+
Index data is managed by Solr. It can be backed up from the Fess administration screen, but once the index reaches several gigabytes, backing it up that way may no longer be possible.
+
If you need to back up the index data, stop Fess and back up the solr/core1/data directory. To restore, copy the backed-up index data back into place.
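A minimal sketch of that procedure on a Unix-like system, assuming the default directory layout of the Fess distribution:

./bin/shutdown.sh                              # stop Fess
cp -a solr/core1/data /backup/solr-data-copy   # copy the index data away
./bin/startup.sh                               # start Fess again
# restore: stop Fess, copy the backup back over solr/core1/data, start Fess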
+
+
+
diff --git a/src/site/en/xdoc/3.0/config/index.xml b/src/site/en/xdoc/3.0/config/index.xml
new file mode 100644
index 000000000..064e9aeaa
--- /dev/null
+++ b/src/site/en/xdoc/3.0/config/index.xml
@@ -0,0 +1,12 @@
+
+
+
+ Set up Guide
+ Shinsuke Sugaya
+
+
+
+
Depending on the contents of the crawl settings, an OutOfMemory error like the following may occur.
+
+
If it does, increase the maximum heap memory: in bin/setenv.[sh|bat], change the setting to -Xmx1024m (here the maximum is set to 1024 MB).
+
+
+
+
+ The maximum heap memory on the crawler side can also be changed.
+ The default is 512 MB.
+
+ To change it, uncomment crawlerJavaOptions in webapps/fess/WEB-INF/classes/fess.dicon and set the value to -Xmx1024m (here the maximum is set to 1024 MB).
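A sketch of the uncommented property in fess.dicon (the exact option list varies by version; only the -Xmx1024m value matters here):

<property name="crawlerJavaOptions">new String[] {
  "-Djava.awt.headless=true", "-server",
  "-Xmx1024m", "-XX:MaxPermSize=128m"
}</property>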
+
Mobile device information is provided by ValueEngine Inc. To use the latest mobile device information, download the device profile, save it in webapps/fess/WEB-INF/classes/device with the _YYYY-MM-DD suffix removed from the file name, and restart Fess to apply the change.
Fess applies a stemming process when indexing and searching.
+
Stemming normalizes English words; for example, recharging and rechargable are both normalized to the form recharg. A search for recharging therefore also hits documents containing rechargable, which reduces missed results.
+
+
+
Because stemming is a basic rule-based process, it can also normalize words in unintended ways. For example, Maine (the state name) is normalized to main.
+
In such cases you can exclude a word from stemming by adding it to protwords.txt.
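The corresponding protwords.txt entry is just the word itself, one protected word per line (the file sits in the Solr conf directory, solr/core1/conf in the default layout):

# protwords.txt - words listed here are protected from stemming
Maine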
Fess can replicate Solr index data from a specified path. By running two Fess servers, one for crawling and index creation and one for search, you can keep the indexing load off the search server.
+
To use the replication feature, the Solr index files must be on a shared disk, such as NFS, that each Fess server can access.
+
+
+
+
Download and install Fess; assume it is installed in /NET/Server1/usr/local/Fess.
+
After this Fess starts, register the crawl settings and create the index by crawling, exactly as in a normal setup (the index-building procedure is no different from the usual one).
+
+
+
Download and install Fess; assume it is installed in /NET/Server2/usr/local/Fess.
+
After this Fess starts, check the box that enables the replication feature in the crawl settings of the management screen and set the "snapshot path". The snapshot path designates the location of the index created by the indexing Fess; in this case it is /NET/Server1/usr/local/Fess/solr/core1/data/index.
+
+
Press the update button to save the settings; replication of the index is then performed at the time set in the schedule.
Fess can partition search results according to the credentials of users authenticated by an arbitrary authentication system. For example, a document carrying role A appears in the search results of a user who has role A, but not in those of user B who lacks it. Using this feature in a portal-login or single sign-on environment, you can scope searches to the department or job title a user belongs to.
+
Role-based search in Fess can obtain role information from the following sources:
+
+
Request parameters
+
Request headers
+
Cookies
+
J2EE authentication information
+
+
When Fess runs behind a portal or an agent-based single sign-on system that stores authentication information in cookies, role information can be retrieved from cookies whose domain and path make them visible to Fess. With a reverse-proxy type single sign-on system, role information can be retrieved from authentication information that the proxy adds to request headers or request parameters.
+
+
+
The following describes how to set up role-based search using J2EE authentication information.
+
+
Add roles and users to conf/tomcat-users.xml. In this example, role-based search is performed with the role1 role, so create a user that logs in with role1.
+
+
+
+
+
+
+
+
+
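The listing was stripped; a sketch of the kind of tomcat-users.xml entries involved (user name and password are placeholders):

<tomcat-users>
  <role rolename="role1"/>
  <user username="taro" password="taropass" roles="role1"/>
</tomcat-users>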
+
+
+
Next, configure webapps/fess/WEB-INF/classes/app.dicon as shown below.
+
+
+ {"guest"}
+
+
+ :
+]]>
+
By setting defaultRoleList, you can assign role information to requests that carry no authentication information. Here it ensures that users who are not logged in do not see search results that require a role.
+
+
+
Next, configure webapps/fess/WEB-INF/classes/fess.dicon as shown below.
+
+ "role1"
+
+ :
+]]>
+
Multiple roles can be listed in authenticatedRoles, separated by commas (,).
+
+
+
Next, configure webapps/fess/WEB-INF/web.xml as shown below.
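The listing for this step was stripped; a sketch of the kind of change involved, adding role1 to the container authentication constraints (the exact structure of the shipped web.xml is an assumption):

<security-constraint>
  <web-resource-collection>
    <web-resource-name>Fess Authentication</web-resource-name>
    <url-pattern>/login/login</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>role1</role-name>
  </auth-constraint>
</security-constraint>
<security-role>
  <role-name>role1</role-name>
</security-role>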
Start Fess and log in as an administrator. From the role menu, register a role with name Role1 (any name will do) and value role1. Then, in each crawl configuration you want to expose to users with role1, select Role1 as the role.
+
+
+
Log out of the management screen and log in as the role1 user. On successful login you are redirected to the top of the search screen.
+
Search as usual; only documents from crawl configurations assigned the Role1 role are displayed.
+
Searches performed without logging in run as the guest user.
+
+
+
To log out, access http://localhost:8080/fess/admin while logged in with a non-admin role; a logout screen appears, and pressing the logout button logs you out.
By default, Fess uses port 8080. To change it, follow the steps below.
+
+
Change the ports of the Tomcat that Fess runs on by editing the following entries in conf/server.xml.
+
+
8080: HTTP access port
+
8005: shutdown port
+
8009: AJP port
+
8443: SSL HTTP access port (disabled by default)
+
+
+
+
In the standard configuration Solr runs in the same Tomcat, so if you change the Tomcat port you may also need to change the Solr server URL that Fess references; change it in webapps/fess/WEB-INF/classes/fess_solr.dicon.
+ "http://localhost:8080/solr"
+]]>
+
+ Note: if you change the Tomcat port but leave the above URL unchanged, Fess cannot reach the Solr server and errors appear on search and index update.
+
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/config/solr-dynamic-field.xml b/src/site/en/xdoc/3.0/config/solr-dynamic-field.xml
new file mode 100644
index 000000000..483b0a5f9
--- /dev/null
+++ b/src/site/en/xdoc/3.0/config/solr-dynamic-field.xml
@@ -0,0 +1,48 @@
+
+
+
+ How to use Solr dynamic fields
+ Shinsuke Sugaya
+
+
+
+
Solr registers each document item (field) according to a defined schema. The Solr schema used by Fess is defined in solr/core1/conf/schema.xml. It defines standard fields such as title and content, as well as dynamic fields whose names can be chosen freely; the dynamic fields available in Fess are those defined in schema.xml. For the meaning of the parameter values, see the Solr documentation.
Dynamic fields are most often used in database crawling, for example when registering extra columns in data store crawl settings. To register a dynamic field in a database crawl, add other_t = hoge to the script setting; the data in the hoge column is then stored in the Solr field other_t.
+
To retrieve dynamic field data from Solr, the field must also be added in webapps/fess/WEB-INF/classes/app.dicon; add other_t there.
With the settings above the value is returned from Solr, so edit the JSP file to display it on the page. Log in to the management screen and open the design page. Search results are rendered by the "search results page (content)" JSP file, so edit that file: write ${f:h(doc.other_t)} where you want the other_t value to appear, and the registered value is displayed.
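A minimal sketch of such an edit in the search results JSP (the surrounding markup is illustrative):

<%-- inside the per-result markup of the search results page (content) --%>
<span class="other">${f:h(doc.other_t)}</span>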
Fess manages Solr servers in groups and can manage multiple groups. Fess keeps status information for the servers and groups, and changes that status when a Solr server becomes inaccessible.
+
Solr server status can be changed in the system settings. maxErrorCount, maxRetryStatusCheckCount, maxRetryUpdateQueryCount and minActiveServer can be defined in webapps/fess/WEB-INF/classes/fess_solr.dicon.
+
+
+
When the number of Solr servers in the enabled state within a Solr group falls below minActiveServer, the group is disabled.
+
If a Solr server is in the disabled state while its Solr group has not been disabled, Fess checks up to maxRetryStatusCheckCount times whether the server can be reached; if it can, the server's status is changed from disabled back to enabled. If the server can be reached but the status cannot be restored to enabled, the server is placed in the index-corrupted state.
+
A disabled Solr group cannot be used.
+
To re-enable a Solr group, change the status of its Solr servers to enabled on the system settings screen of the management UI.
+
+
+
+
+
Search queries are sent to an enabled Solr group.
+
Search queries are sent only to enabled Solr servers.
+
If multiple Solr servers are registered in a Solr group, search queries are sent to the server with the fewest errors.
+
If search queries sent to a Solr server fail more than maxErrorCount times, that server is changed to the disabled state.
+
+
+
+
+
Update queries are sent to an enabled Solr group.
+
Update queries are sent only to enabled Solr servers.
+
If multiple Solr servers are registered in a Solr group, update queries are sent to every enabled Solr server.
+
If update queries sent to a Solr server fail more than maxRetryUpdateQueryCount times, that server is changed to the index-corrupted state.
+
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/config/tokenizer.xml b/src/site/en/xdoc/3.0/config/tokenizer.xml
new file mode 100644
index 000000000..4181e30ff
--- /dev/null
+++ b/src/site/en/xdoc/3.0/config/tokenizer.xml
@@ -0,0 +1,36 @@
+
+
+
+ Settings for index string extraction
+ Sone, Takaaki
+
+
+
+
+
Documents must be split into tokens before they can be registered in a search index.
+
A tokenizer is used for this.
+
Essentially, a search cannot hit units smaller than the tokens carved out by the tokenizer.
+
For example, suppose the sentence 東京に住んでいます。 ("I live in Tokyo.") is split by the tokenizer into tokens such as 東京 and 住んで. A search for the word 東京 will then hit this sentence, but a search for the single character 京 will not.
+
The choice of tokenizer is therefore important.
+
By default Fess uses CJKTokenizer; you can change the tokenizer by editing the analyzer section of schema.xml.
+
+
+
+
CJKTokenizer indexes Japanese and other multibyte strings as bi-grams, that is, in units of two characters. As a result, single-character words cannot be found.
+
+
+
+
StandardTokenizer indexes Japanese and other multibyte strings as uni-grams, that is, one character at a time, which reduces missed results. Single-character search queries that CJKTokenizer cannot handle also become searchable.
+
The following example changes the analyzer section of schema.xml to use StandardTokenizer.
+
+
+
+
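The listing was stripped; a sketch of the change, swapping the tokenizer inside the text field type of schema.xml (attribute details follow stock Solr conventions and are assumptions):

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- was: a CJKTokenizer-based analyzer -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
     :
  </analyzer>
</fieldType>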
+
+
+
+
diff --git a/src/site/en/xdoc/3.0/config/windows-service.xml b/src/site/en/xdoc/3.0/config/windows-service.xml
new file mode 100644
index 000000000..a277c3a1a
--- /dev/null
+++ b/src/site/en/xdoc/3.0/config/windows-service.xml
@@ -0,0 +1,49 @@
+
+
+
+ Register for the Windows service
+ Shinsuke Sugaya
+
+
+
+
You can register Fess as a Windows service in a Windows environment. The procedure is the same as registering Tomcat as a service.
+
+
When Fess runs as a Windows service, the crawling process reads the Windows system environment variables, so register JAVA_HOME as a system environment variable and add %JAVA_HOME%\bin to Path.
+
+
+
Edit webapps\fess\WEB-INF\classes\fess.dicon and remove the -server option. (The pdfbox.cjk.support option no longer exists as of 3.1.0.)
First, after installing Fess, run service.bat from a command prompt (on Vista and later, launch the prompt as administrator). In this example Fess is installed in C:\Java\fess-server-3.0.0.
> cd C:\Java\fess-server-3.0.0\bin
> service.bat install fess
...
The service 'fess' has been installed.
+
+
+
You can review the service properties for Fess by running the following command, which opens the Tomcat properties window.
> tomcat6w.exe //ES//fess
+
+
+
In Control Panel > Administrative Tools > Services, you can then configure automatic startup just like any other Windows service.
Labels registered on the management screen enable label-based search on the search screen. Use labels to narrow down search results. If no labels are registered, the label drop-down box is not displayed.
+
+
Because labels are set when the index is created, each crawl configuration's documents can be searched through the label specified for it. A search with no label selected searches all results as usual.
Use a NOT search to find documents that do not contain a given word.
+ Write NOT before the word that must not appear. NOT must be in uppercase, with a space before and after it.
+
For example, to find documents that contain search term 1 but not search term 2, enter: search term 1 NOT search term 2
Use an OR search to find documents that contain any of the search terms.
+ When multiple words are entered in the search box, an AND search is performed by default.
+ To perform an OR search, write OR between the search words. OR must be in uppercase, with a space before and after it.
+
For example, to find documents that contain either search term 1 or search term 2, enter: search term 1 OR search term 2. OR can be used between any number of words.
You can sort search results by specifying a field such as the crawl time.
+
The following fields are available for sorting by default.
+
+
+
+
tstamp
+
Time the document was crawled
+
+
+
contentLength
+
Size of the crawled content
+
+
+
lastModified
+
Last modified time of the crawled content
+
+
+
+
+
To sort, add sort:<field name> to the query in the search form.
+
For example, to search for Fess with results sorted by content size in ascending order, enter the following (the field names follow the table above):
+
Fess sort:contentLength.asc
+
To sort in descending order:
+
Fess sort:contentLength.desc
+
To sort by multiple fields, separate them with commas:
+
Fess sort:contentLength.desc,lastModified.asc
+
+
+
diff --git a/src/site/en/xdoc/4.0/admin/browserType-guide.xml b/src/site/en/xdoc/4.0/admin/browserType-guide.xml
new file mode 100644
index 000000000..1a88c20bc
--- /dev/null
+++ b/src/site/en/xdoc/4.0/admin/browserType-guide.xml
@@ -0,0 +1,19 @@
+
+
+
+ Setting the browser type
+ Shinsuke Sugaya
+
+
+
+
This section describes the browser type settings. Browser type information can be attached to indexed data, so that search results can be partitioned by the type of browser they are viewed with.
+
+
After logging in with an administrator account, click browser types in the menu.
+
+
+
+
You can set a display name and a value. Use this when you want to support additional device types; no special customization is needed, so use it only where necessary.
You can use the Settings Wizard to set up Fess.
+
+
After logging in with an administrator account, click Settings Wizard in the menu.
+
+
First, set the schedule.
+
Fess crawls and creates the index at the scheduled time.
+
By default it runs every day at 0:00.
+
+
Next, the crawl settings.
+
A crawl configuration registers the URI you want to search.
+
Give the crawl configuration any name that is easy to identify.
+
Enter the URI of the content you want indexed and searchable.
+
+
For example, to make http://example.com searchable, the settings look like the following.
+
+
This is the last step.
+
Press the Start Crawling button to begin crawling immediately. If you press the Finish button instead, crawling does not start until the time specified in the schedule settings.
+
+
+
+
Settings made in the Setup Wizard can be changed later from the crawl General, Web, and file system pages.
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/admin/crawl-guide.xml b/src/site/en/xdoc/4.0/admin/crawl-guide.xml
new file mode 100644
index 000000000..3f90227b6
--- /dev/null
+++ b/src/site/en/xdoc/4.0/admin/crawl-guide.xml
@@ -0,0 +1,139 @@
+
+
+
+ The General crawl settings
+ Shinsuke Sugaya
+
+
+
+
This section describes the general crawl settings.
+
+
After logging in with an administrator account, click crawl General in the menu.
+
+
You can specify the path where the index is generated and enable the replication feature.
+
+
+
+
You can set the schedule on which Web sites and file systems are crawled. The default is the following:
+
0 0 0 * * ?
+
The fields are, from left to right: seconds, minutes, hours, day of month, month, and day of week. The format is similar to Unix cron settings. The example above crawls every day at 0:00.
+
The following are further examples.
+
+
+
+
0 0 12 * * ?
+
Starts at 12:00 noon every day
+
+
+
0 15 10 ? * *
+
Starts at 10:15 am every day
+
+
+
0 15 10 * * ?
+
Starts at 10:15 am every day
+
+
+
0 15 10 * * ? *
+
Starts at 10:15 am every day
+
+
+
0 15 10 * * ? 2005
+
Starts at 10:15 am every day during 2005
+
+
+
0 * 14 * * ?
+
Starts every minute from 2:00 pm to 2:59 pm, every day
+
+
+
0 0/5 14 * * ?
+
Starts every 5 minutes from 2:00 pm to 2:55 pm, every day
+
+
+
0 0/5 14,18 * * ?
+
Starts every 5 minutes from 2:00 pm to 2:55 pm and from 6:00 pm to 6:55 pm, every day
+
+
+
0 0-5 14 * * ?
+
Starts every minute from 2:00 pm to 2:05 pm, every day
+
+
+
0 10,44 14 ? 3 WED
+
Starts at 2:10 pm and 2:44 pm every Wednesday in March
+
+
+
0 15 10 ? * MON-FRI
+
Starts at 10:15 am every Monday through Friday
+
+
+
+
The seconds field of the schedule is checked only every 60 seconds by default. If you need the seconds setting to be honored exactly, customize the taskScanIntervalTime value in webapps/fess/WEB-INF/classes/chronosCustomize.dicon; if coarser precision is enough, no change is needed.
+
+
+
When a user performs a search, the search is written to a log. Enable this if you want to collect search statistics.
+
+
+
Appends the search term to search result links, which makes it possible to highlight the searched terms when a PDF is displayed.
+
+
+
Search results can be retrieved in XML format by accessing http://localhost:8080/fess/xml?query=[search term].
+
+
+
Search results can be retrieved in JSON format by accessing http://localhost:8080/fess/json?query=[search term].
+
+
+
A page designed for PCs may not display correctly on a mobile device. If mobile conversion is selected, PC sites in search results can be converted for mobile terminals. Choosing Google uses the Google Wireless Transcoder to render content on mobile phones: when results are browsed from a mobile terminal, the result links pass through the Google Wireless Transcoder, giving smooth mobile conversion of the linked pages.
+
+
+
You can specify a label that is selected by default when searching. Specify the value of the label.
+
+
+
Specifies whether the search screens are available. If Web is selected, the mobile search screen is not available; if not available is selected, no search screen can be used. Select not available, for example, when building a dedicated index server.
+
+
+
Popular search words become available in JSON format and can be retrieved by accessing http://localhost:8080/fess/hotsearchword.
+
+
+
Deletes session logs older than the specified number of days. Old logs are removed by the log purge, which runs once a day.
+
+
+
Deletes search logs older than the specified number of days. Old logs are removed by the log purge, which runs once a day.
+
+
+
Specifies, separated by commas (,), the bot names (matched against the user agent) whose entries are removed from the search log. The logs are removed by the log purge, which runs once a day.
+
+
+
Specifies the encoding of the CSV files used by backup and restore.
+
+
+
Enabling the replication feature applies a Solr index that has already been generated and copied. For example, use this when the index is crawled and created on a separate server and this server, placed in front, should only serve searches.
+
+
+
After data is registered in Solr, a commit or optimize must be issued before the registered data becomes searchable. If optimize is selected, an optimize is issued to the Solr index; if commit is selected, a commit is issued.
+
+
+
Fess can combine multiple Solr servers into a group and manage multiple groups, using different groups for updates and for searches. For example, with two groups, group 2 may receive updates while group 1 serves searches; after a crawl completes, the roles switch, with group 1 receiving updates and group 2 serving searches. This setting is only meaningful when multiple Solr server groups are registered.
+
+
+
To improve indexing performance, Fess sends documents to Solr in batches of 20 while crawling. Because continuously adding documents without committing affects Solr performance, Fess issues a commit each time the number of documents specified here has been added. By default, a commit is issued after every 1000 documents.
+
+
+
Fess crawls documents by Web crawling and by file system crawling. The value specified here is how many crawl configurations of each kind run simultaneously. For example, with a concurrency of 3 and Web crawl configurations 1 through 10, crawling starts with configurations 1 through 3; whenever one completes, the next configuration starts, until all ten have run.
+
Note that each crawl configuration also specifies its own thread count; the concurrency here is the number of configurations run at once, not the number of threads. For example, with a concurrency of 3 and 5 threads per crawl configuration, up to 3 x 5 = 15 threads crawl at the same time.
+
+
+
Data can be deleted automatically a set time after it was indexed. If you select 5 days, documents registered in the index at least 5 days ago and not updated since become eligible for removal. This can be used to purge documents whose source content has been deleted.
+
+
+
URLs registered as failure URLs are excluded from the next crawl once they exceed the failure count; with this value set, you do not need to track the individual failure types.
+
+
+
A failure URL that exceeds the specified number of failures is excluded from crawling.
+
+
+
If replication is enabled, the index information copied from the index directory designated as the snapshot path is applied.
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/admin/crawlingSession-guide.xml b/src/site/en/xdoc/4.0/admin/crawlingSession-guide.xml
new file mode 100644
index 000000000..fa010c7bd
--- /dev/null
+++ b/src/site/en/xdoc/4.0/admin/crawlingSession-guide.xml
@@ -0,0 +1,34 @@
+
+
+
+ Session information settings
+ Shinsuke Sugaya
+
+
+
+
This section describes session information. The results of one crawl run are saved as one session information record, where you can check the run time and the number of documents indexed.
+
+
After logging in with an administrator account, click session information in the menu.
+
+
+
+
Clicking the Delete All link removes all session information records that are not currently running.
+
+
+
+
By selecting a session ID you can inspect details of that crawl.
+
+
Crawler*: information about the crawl as a whole
+
FsCrawl*: information about file system crawling
+
WebCrawl*: information about Web crawling
+
Optimize*: information about optimize operations issued to the Solr server
+
Commit*: information about commits issued to the Solr server
This section describes how to back up and restore Fess settings.
+
+
After logging in with an administrator account, click backup and restore in the menu.
+
+
+
+
Click the download link to output the Fess settings in XML format. The saved settings are the following:
+
+
The general crawl settings
+
Web crawl settings
+
File system crawl settings
+
Path mapping
+
Web authentication
+
Browser types
+
+
Session information, search logs, and click logs can be downloaded in CSV format.
+
The Solr index data and the data being crawled are not backed up; they can be regenerated by crawling again after the Fess settings are restored.
+
+
+
You can restore the settings and the various logs by uploading the XML or CSV files produced by a backup. Specify the file and click the restore button.
+
If overwriting is enabled when restoring settings from an XML file, entries that match existing data update that data.
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/admin/dataStoreCrawling-guide.xml b/src/site/en/xdoc/4.0/admin/dataStoreCrawling-guide.xml
new file mode 100644
index 000000000..5edc1cd38
--- /dev/null
+++ b/src/site/en/xdoc/4.0/admin/dataStoreCrawling-guide.xml
@@ -0,0 +1,129 @@
+
+
+
+ Data store configuration
+ Sone, Takaaki
+
+
+
+
Fess can crawl databases. This section describes the data store settings required to do so.
+
+
After logging in with an administrator account, click data store in the menu.
+
+
As an example, we crawl a table in a MySQL database named testdb, connecting with user name hoge and password fuga.
+
+
+
+
An example of the parameter settings looks like the following.
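The stripped example would look something like the following sketch (the driver class and URL follow standard MySQL JDBC conventions; the table name doc is a placeholder):

driver=com.mysql.jdbc.Driver
url=jdbc:mysql://localhost:3306/testdb?useUnicode=true&characterEncoding=UTF-8
username=hoge
password=fuga
sql=select * from doc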
+
+
Parameters are written in "key=value" format. The keys are described below.
+
+
+
+
driver
+
Driver class name
+
+
+
url
+
Connection URL
+
+
+
username
+
User name for connecting to the database
+
+
+
password
+
Password for connecting to the database
+
+
+
sql
+
SQL statement that retrieves the rows to crawl
+
+
+
+
+
+
An example of the script settings looks like the following.
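A sketch of the script side, assuming the table has id, title and content columns (the values are OGNL expressions, per the notes below; the URL base is a placeholder):

url="http://localhost/db/" + id
host="localhost"
site="localhost"
title=title
content=content
cache=content
digest=content
anchor=
contentLength=content.length()
lastModified=new java.util.Date()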
+
+
+ The script is also written in "key=value" format.
+ The keys are described below.
+
+ The value side is written in OGNL. Enclose strings in double quotation marks.
+ Database columns are accessed by column name, which yields the column's value.
+
+
+
+
url
+
URL (the link shown in search results)
+
+
+
host
+
Host name
+
+
+
site
+
Site path
+
+
+
title
+
Title
+
+
+
content
+
Content (the string to be indexed)
+
+
+
cache
+
Content cache (not indexed)
+
+
+
digest
+
Digest text shown in search results
+
+
+
anchor
+
Links to content (not usually required)
+
+
+
contentLength
+
The length of the content
+
+
+
lastModified
+
Last modified time of the content
+
+
+
+
+
+
A JDBC driver is required to connect to the database; place its jar file in webapps/fess/WEB-INF/cmd/lib.
This section describes the design settings for the search screens.
+
+
After logging in with an administrator account, click design in the menu.
+
+
You can edit the search screens on the page shown below.
+
+
+
+
You can upload image files for use on the search screens. Supported image file types are jpg, gif, and png.
+
+
+
Specify a file name for the uploaded image if you want one; if omitted, the name of the uploaded file is used.
+
+
+
You can edit the JSP files of the search screens. Press the Edit button of a JSP file to edit its current contents, or press the default button to edit the JSP file as it was at installation time. Saving with the Update button on the edit screen applies the changes.
+
The following JSP files can be edited.
+
+
+
+
Top page (frame)
+
The JSP file for the search home page. This JSP includes the JSP files of the individual parts.
+
+
+
Top page (within the Head tags)
+
The JSP file for the contents of the head tag on the search home page. Edit it to change meta tags, the title tag, script tags, and so on.
+
+
+
Top page (content)
+
The JSP file for the contents of the body tag on the search home page.
+
+
+
Search results pages (frames)
+
The JSP file for the search results list page. This JSP includes the JSP files of the individual parts.
+
+
+
Search results page (within the Head tags)
+
The JSP file for the contents of the head tag on the search results page. Edit it to change meta tags, the title tag, script tags, and so on.
+
+
+
Search results page (header)
+
The JSP file for the header of the search results page. It contains the search form at the top.
+
+
+
Search results page (footer)
+
The JSP file for the footer of the search results page. It contains the copyright notice at the bottom.
+
+
+
Search results pages (content)
+
The JSP file for the search results section of the results page, used when there are search results. Edit it to customize how results are rendered.
+
+
+
Search results page (result no)
+
The JSP file for the search results section of the results page, used when there are no search results.
+
+
+
+
The mobile screens can be edited in the same way as the PC screens.
+
+
+
+
+
To display the time a document was crawled and registered, or its modification time, in the search results, write the following in the search results page (content).
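The listing was stripped; a sketch of the kind of markup meant here, using the tstamp and lastModified fields from the sort-field list (the formatting is illustrative):

<%-- in the search results page (content), inside each result --%>
Registered: ${f:h(doc.tstamp)} / Last modified: ${f:h(doc.lastModified)}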
This section describes failure URLs. URLs that could not be retrieved at crawl time are recorded and can be reviewed here as failure URLs.
+
+
After logging in with an administrator account, click failure URL in the menu.
+
+
Click the confirmation link of a failure URL to display its details.
+
+
+
+
The list shows at a glance which URLs could not be crawled and when.
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/admin/fileAuthentication-guide.xml b/src/site/en/xdoc/4.0/admin/fileAuthentication-guide.xml
new file mode 100644
index 000000000..991676a42
--- /dev/null
+++ b/src/site/en/xdoc/4.0/admin/fileAuthentication-guide.xml
@@ -0,0 +1,40 @@
+
+
+
+ Settings for file system authentication
+ Shinsuke Sugaya
+
+
+
+
This section describes how to configure authentication for file system crawls that require it. Fess supports crawling Windows shared folders.
+
+
After logging in with an administrator account, click file system authentication in the menu.
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the settings apply to any host name in the selected file system crawl configuration.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the settings apply to any port of the selected file system crawl configuration.
+
+
+
Selects the authentication method. SAMBA (Windows shared folder authentication) is available.
+
+
+
Specifies the user name used to log in.
+
+
+
Specifies the password used to log in to the authentication site.
+
+
+
Sets any additional values required to log in to the authentication site. For SAMBA you can set the domain, written in the form domain=[domain name].
+
+
+
+
Selects the file system crawl configuration name to which the above authentication settings apply. The file system crawl configuration must be registered in advance.
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/admin/fileCrawlingConfig-guide.xml b/src/site/en/xdoc/4.0/admin/fileCrawlingConfig-guide.xml
new file mode 100644
index 000000000..2634dfd32
--- /dev/null
+++ b/src/site/en/xdoc/4.0/admin/fileCrawlingConfig-guide.xml
@@ -0,0 +1,98 @@
+
+
+
+ Settings for file system crawling
+ Shinsuke Sugaya
+
+
+
+
This section describes the settings used for file system crawling.
+
If you want to index more than 100,000 documents, we recommend splitting them across multiple crawl configurations of up to a few tens of thousands of documents each; indexing performance degrades when a single crawl configuration targets more than 100,000 documents.
+
+
After logging in with an administrator account, click file in the menu.
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple paths, each starting with file: or smb:. For example:
+
+
Specify the paths in this form. Everything below the specified directory is crawled.
+
Windows paths must be written as URIs; for example, c:\Documents\taro is specified as file:/c:/Documents/taro.
+
For a Windows shared folder, for example to crawl the share folder on host1, specify smb://host1/share/ (the trailing / is required). If the shared folder requires authentication, set the credentials on the file system authentication screen.
+
+
+
By specifying regular expressions, you can include or exclude specific path patterns for crawling and searching.
+
+
+
+
Paths to crawl
+
Only paths matching the specified regular expression are crawled.
+
+
+
Paths to exclude from the crawl
+
Paths matching the specified regular expression are not crawled. This takes precedence even over paths matched by "Paths to crawl".
+
+
+
Paths to search
+
Paths matching the specified regular expression are searchable. This takes precedence even if the path also matches "Paths to exclude from the search".
+
+
+
Paths to exclude from the search
+
Paths matching the specified regular expression are crawled but not shown in search results. Excluding a path from the crawl instead would leave everything linked beyond it unsearchable, not just the path itself.
+
+
+
+
For example, to crawl only paths under /home/, specify the following as a path to crawl (the pattern is a regular expression):
+
file:/home/.*
+
To exclude paths with the png extension, specify the following as a path to exclude from the crawl:
+
.*\.png$
+
Multiple patterns can be specified, one per line.
+
Paths are handled as java.io.File URIs, in the form shown above.
Specifies the maximum number of documents to retrieve in the crawl.
+
+
+
Specifies the number of crawler threads. A value of 5 means the file system is crawled by 5 threads at the same time.
+
+
+
The interval, in milliseconds, between document retrievals. With one thread, a value of 5000 retrieves one document every 5 seconds.
+
With 5 threads and a 1000 millisecond interval, up to 5 documents are retrieved per second.
+
+
+
Assigns a weight to the paths in this crawl configuration. Use it when you want these results ranked above others. The default is 1; the higher the value, the higher the documents appear in search results. To rank these results above all others, specify a sufficiently large value such as 10000.
+
The value must be an integer greater than 0. It is used as the boost value when documents are added to Solr.
+
+
+
Crawled documents are registered with the browser types selected here. If you select only PC, the documents do not appear in search results on mobile devices. You can also make documents visible only on specific mobile devices.
+
+
+
Restricts the documents so that they appear in search results only for users with a particular role. Roles must be set up beforehand. This is useful, for example, on systems that require a login, such as portal servers, when you want to control which users see which results.
+
+
+
Attaches labels to the documents in the search results. Labels can be selected on the search screen to restrict a search to documents carrying that label.
+
+
+
When set to enabled, this configuration is crawled at crawl time. Disable it to skip crawling temporarily.
This section describes label settings. Labels, selected in crawl configurations, classify the documents that appear in search results. When labels are registered, a label drop-down box is shown to the right of the search box.
+
+
After logging in with an administrator account, click label in the menu.
+
+
+
+
+
Specifies the name displayed in the label drop-down on the search screen.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must be alphanumeric.
This section describes duplicate host settings. Use them when different host names should be treated as the same host while crawling, for example when www.example.com and example.com serve the same site.
+
+
After logging in with an administrator account, click duplicate host in the menu.
+
+
+
+
+
Specifies the canonical host name. Duplicate host names are replaced by the canonical host name.
+
+
+
Specifies the duplicate host name, that is, the host name to be replaced.
This section describes path mapping settings. Use path mapping when you want to replace parts of the links shown in search results.
+
+
After logging in with an administrator account, click path mappings in the menu.
+
+
+
+
+
Path mapping replaces the part of a link that matches the specified regular expression with the replacement string. When crawling a local file system, the resulting search result links may not work in every environment; in such cases path mapping lets you adjust the links. Multiple path mappings can be specified.
This section describes request headers. The request header feature adds header information to the requests issued when crawling documents. It is useful, for example, with authentication systems that inspect header information and log the client in automatically when a specific value is present.
+
+
After logging in with an administrator account, click request header in the menu.
+
+
+
+
+
Specifies the name of the request header to add to requests.
+
+
+
Specifies the value of the request header to add to requests.
+
+
+
Selects the Web crawl configuration to which the request header is added. The header is added only to requests of the selected crawl configuration.
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/admin/roleType-guide.xml b/src/site/en/xdoc/4.0/admin/roleType-guide.xml
new file mode 100644
index 000000000..a56c16f7c
--- /dev/null
+++ b/src/site/en/xdoc/4.0/admin/roleType-guide.xml
@@ -0,0 +1,23 @@
+
+
+
+ Settings for a role
+ Shinsuke Sugaya
+
+
+
+
This section describes role settings. Roles, selected in crawl configurations, classify the documents that appear in search results. For how roles are used, see the role-based search documentation.
+
+
After logging in with an administrator account, click role in the menu.
+
+
+
+
+
Specifies the name displayed in the list.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must be alphanumeric.
After logging in with an administrator account, click search in the menu.
+
+
+
+
You can search with criteria you specify. On the regular search screen, role and browser conditions are added implicitly, but the management search does not add them, and from its results you can remove specific documents from the index.
This section describes the search log. Searches performed on the search screen are logged; the search log records the search terms and the date, and can also record the URL when a user follows a search result.
+
+
After logging in with an administrator account, click search logs in the menu.
+
+
+
+
Search terms and dates are listed. Click a URL to review its details.
This section describes the Solr settings in Fess. Solr servers are registered in groups, as defined in the configuration file.
+
+
After logging in with an administrator account, click Solr in the menu.
+
+
+
+
While documents are being added, the update server is shown as running; while a crawl is running, its session ID is displayed. It is safe to shut down the Fess server when nothing is running. If you shut down Fess while a crawl is running, the process may not terminate until the crawl finishes.
+
+
+
The names of the server groups used for searching and for updates are displayed.
+
+
+
A server that becomes unavailable is shown with disabled status; for example, a Solr server that cannot be reached changes to disabled. Once the server recovers, re-enable it to make it available again.
+
+
+
You can issue index commit and optimize operations to a server group. You can also delete the documents of a specific session ID, or delete individual documents by specifying their URL.
+
+
+
The number of documents registered in each session is shown. Click a session name to see the list of results.
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/admin/systemInfo-guide.xml b/src/site/en/xdoc/4.0/admin/systemInfo-guide.xml
new file mode 100644
index 000000000..d1f28fc5f
--- /dev/null
+++ b/src/site/en/xdoc/4.0/admin/systemInfo-guide.xml
@@ -0,0 +1,28 @@
+
+
+
+ System information
+ Shinsuke Sugaya
+
+
+
+
Here you can check current property information, such as system environment variables.
+
+
After logging in with an administrator account, click system information in the menu.
+
+
+
+
Lists the server's environment variables.
+
+
+
Lists the Java system properties of Fess.
+
+
+
Shows Fess setup information.
+
+
+
A list of properties to attach when reporting a bug; only values containing no personal information are extracted.
This section describes the Web authentication settings used when Web crawling requires authentication. Fess supports crawling sites protected by BASIC and DIGEST authentication.
+
+
After logging in with an administrator account, click Web authentication in the menu.
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the settings apply to any host name in the selected Web crawl configuration.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the settings apply to any port of the selected Web crawl configuration.
+
+
+
Specifies the realm name of the site that requires authentication. If omitted, the settings apply to any realm name in the selected Web crawl configuration.
+
+
+
Selects the authentication method. BASIC, DIGEST, and NTLM authentication are available.
+
+
+
Specifies the user name used to log in.
+
+
+
Specifies the password used to log in to the authentication site.
+
+
+
Sets any additional values required to log in to the authentication site. For NTLM authentication you can set the workstation and domain, written one per line in the form workstation=[value] and domain=[value].
+
+
+
+
Selects the Web crawl configuration name to which the above authentication settings apply. The Web crawl configuration must be registered in advance.
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/admin/webCrawlingConfig-guide.xml b/src/site/en/xdoc/4.0/admin/webCrawlingConfig-guide.xml
new file mode 100644
index 000000000..24417d168
--- /dev/null
+++ b/src/site/en/xdoc/4.0/admin/webCrawlingConfig-guide.xml
@@ -0,0 +1,99 @@
+
+
+
+ Settings for crawling the Web
+ Shinsuke Sugaya
+
+
+
+
This section describes the settings used for Web crawling.
+
If you want to index more than 100,000 documents, we recommend splitting them across multiple crawl configurations of up to a few tens of thousands of documents each; indexing performance degrades when a single crawl configuration targets more than 100,000 documents.
+
+
After logging in with an administrator account, click Web in the menu.
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple URLs, each starting with http: or https:. For example:
+
+
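The listing here was stripped; illustrative entries, one URL per line (the host names are placeholders, not from the original listing):

http://www.example.com/
http://www.example.com:8080/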
Specify the URLs in this form.
+
+
+
By specifying regular expressions, you can include or exclude specific URL patterns for crawling and searching.
+
+
+
+
URLs to crawl
+
Only URLs matching the specified regular expression are crawled.
+
+
+
URLs to exclude from the crawl
+
URLs matching the specified regular expression are not crawled. This takes precedence even over URLs matched by "URLs to crawl".
+
+
+
URLs to search
+
URLs matching the specified regular expression are searchable. This takes precedence even if the URL also matches "URLs to exclude from the search".
+
+
+
URLs to exclude from the search
+
URLs matching the specified regular expression are crawled but not shown in search results. Excluding a URL from the crawl instead would leave everything linked beyond it unsearchable, not just the page itself.
+
+
+
+
For example, to crawl only URLs under http://localhost/, specify the following as a URL to crawl (the pattern is a regular expression):
+
http://localhost/.*
+
To exclude URLs with the png extension, specify the following as a URL to exclude from the crawl:
+
.*\.png$
+
Multiple patterns can be specified, one per line.
+
+
+
Specifies the depth to which links found in crawled documents are followed.
+
+
+
Specifies the maximum number of documents to retrieve in the crawl.
+
+
+
Specifies the user agent used when crawling.
+
+
+
Specifies the number of crawler threads. A value of 5 means the website is crawled by 5 threads at the same time.
+
+
+
The interval, in milliseconds, between document retrievals. With one thread, a value of 5000 retrieves one document every 5 seconds.
+
With 5 threads and a 1000 millisecond interval, up to 5 documents are retrieved per second. When crawling an external website, choose a value that does not put undue load on the Web server.
+
+
+
Assigns a weight to the URLs in this crawl configuration. Use it when you want these results ranked above others. The default is 1; the higher the value, the higher the documents appear in search results. To rank these results above all others, specify a sufficiently large value such as 10000.
+
The value must be an integer greater than 0. It is used as the boost value when documents are added to Solr.
+
+
+
Crawled documents are registered with the browser types selected here. If you select only PC, the documents do not appear in search results on mobile devices. You can also make documents visible only on specific mobile devices.
+
+
+
Restricts the documents so that they appear in search results only for users with a particular role. Roles must be set up beforehand. This is useful, for example, on systems that require a login, such as portal servers, when you want to control which users see which results.
+
+
+
Attaches labels to the documents in the search results. Labels can be selected on the search screen to restrict a search to documents carrying that label.
+
+
+
When set to enabled, this configuration is crawled at crawl time. Disable it to skip crawling temporarily.
+
+
+
+
+
Fess can crawl sitemap files defined as URLs to crawl. Sitemaps follow the http://www.sitemaps.org/ specification; supported formats are XML Sitemaps, XML Sitemap Index files, and text sitemaps (one URL per line).
+
Specify the sitemap URL as a URL to crawl. Because a sitemap may be an XML or text file, Fess cannot tell an ordinary URL and a sitemap apart while crawling; by default, URLs whose file names match sitemap.*.xml, sitemap.*.gz, or sitemap.*.txt are handled as sitemaps (this can be customized in webapps/fess/WEB-INF/classes/s2robot_rule.dicon).
+
When a sitemap file is crawled, the URLs listed in it are extracted like links in an HTML file and are crawled in the next crawl.
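For reference, a text sitemap in the sense above is simply a list of URLs, one per line (hosts are illustrative):

http://www.example.com/
http://www.example.com/page1.html
http://www.example.com/page2.html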
+ With the increasing awareness of security in browser environments in recent years, Web pages can no longer open a local file (for example, c:\hoge.txt) directly.
+ Having to copy a link out of the search results, paste it, and open the file by hand is poor usability.
+ To address this, Fess provides a desktop search feature.
+
+
+
+ The desktop search feature is disabled by default.
+ Enable it with the following settings.
+
First, in bin/setenv.bat, change java.awt.headless from true to false.
+
+
Then add the required setting to webapps/fess/WEB-INF/conf/crawler.properties.
+
+
After making the settings above, start Fess. Basic usage is unchanged.
+
+
+
+
Keep Fess inaccessible from outside (for example, do not expose port 8080 externally).
+
Because java.awt.headless is set to false, image size conversion for mobile devices is not available.
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/config/filesize.xml b/src/site/en/xdoc/4.0/config/filesize.xml
new file mode 100644
index 000000000..dc6c6adb0
--- /dev/null
+++ b/src/site/en/xdoc/4.0/config/filesize.xml
@@ -0,0 +1,28 @@
+
+
+
+ Crawled file size settings
+ Shinsuke Sugaya
+
+
+
+
You can limit the size of files Fess crawls. By default, HTML files are processed up to 2.5 MB and other files up to 10 MB. To change these limits, edit webapps/fess/WEB-INF/classes/s2robot_contentlength.dicon. The standard s2robot_contentlength.dicon is as follows.
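The listing was stripped here; a sketch of what the standard file defines, assuming the 2.5 MB / 10 MB defaults above (component and method names follow S2Robot's ContentLengthHelper):

<!-- s2robot_contentlength.dicon (sketch) -->
<component name="contentLengthHelper"
    class="org.seasar.robot.helper.ContentLengthHelper" instance="singleton">
  <property name="defaultMaxLength">10485760</property><!-- 10 MB for other files -->
  <initMethod name="addMaxLength">
    <arg>"text/html"</arg>
    <arg>2621440</arg><!-- 2.5 MB for HTML -->
  </initMethod>
</component>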
Change the value of defaultMaxLength to change the default limit. Limits can also be specified per content type; the entry above sets the maximum size for HTML files (content type text/html).
+
When raising the maximum file size, keep an eye on the amount of heap memory this requires. For how to configure memory, see the memory-related section.
+
+
+
diff --git a/src/site/en/xdoc/4.0/config/index-backup.xml b/src/site/en/xdoc/4.0/config/index-backup.xml
new file mode 100644
index 000000000..930ea9df7
--- /dev/null
+++ b/src/site/en/xdoc/4.0/config/index-backup.xml
@@ -0,0 +1,13 @@
+
+
+
+ Index backup and restore
+ Shinsuke Sugaya
+
+
+
+
Index data is managed by Solr. It can be backed up from the Fess administration screen, but once the index reaches several gigabytes, backing it up that way may no longer be possible.
+
If you need to back up the index data, stop Fess and back up the solr/core1/data directory. To restore, copy the backed-up index data back into place.
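A minimal sketch of that procedure on a Unix-like system, assuming the default directory layout of the Fess distribution:

./bin/shutdown.sh                              # stop Fess
cp -a solr/core1/data /backup/solr-data-copy   # copy the index data away
./bin/startup.sh                               # start Fess again
# restore: stop Fess, copy the backup back over solr/core1/data, start Fess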
+
+
+
diff --git a/src/site/en/xdoc/4.0/config/index.xml b/src/site/en/xdoc/4.0/config/index.xml
new file mode 100644
index 000000000..de14b4810
--- /dev/null
+++ b/src/site/en/xdoc/4.0/config/index.xml
@@ -0,0 +1,12 @@
+
+
+
+ Set up Guide
+ Shinsuke Sugaya
+
+
+
+
This is the setup guide for Fess 4.0.
+
+
+
diff --git a/src/site/en/xdoc/4.0/config/install-on-tomcat.xml b/src/site/en/xdoc/4.0/config/install-on-tomcat.xml
new file mode 100644
index 000000000..0aec49b11
--- /dev/null
+++ b/src/site/en/xdoc/4.0/config/install-on-tomcat.xml
@@ -0,0 +1,43 @@
+
+
+
+ Install to an existing Tomcat
+ Shinsuke Sugaya
+
+
+
+
+ The standard distribution of Fess is shipped with Tomcat in an already-deployed state.
+ Since Fess does not depend on that bundled Tomcat, it can be deployed on any Java application server.
+ This section describes how to deploy Fess on an existing Tomcat.
+ Extract the downloaded Fess server.
+ Let $FESS_HOME be the home directory of the extracted Fess server,
+ and $TOMCAT_HOME the top directory of the existing Tomcat 6.
+ Copy the Fess server data as shown in the sketch below.
+
+
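+ A sketch of the copy step (the exact directory list is an assumption; compare with your distribution):
+
+ cp -r $FESS_HOME/webapps/fess $TOMCAT_HOME/webapps/
+ cp -r $FESS_HOME/webapps/solr $TOMCAT_HOME/webapps/
+ cp -r $FESS_HOME/solr $TOMCAT_HOME/
+ cp $FESS_HOME/bin/setenv.sh $TOMCAT_HOME/bin/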
+ If you have changed any of the destination files, compare them with a diff command and apply only your differences.
+
Depending on the contents of the crawl settings, an OutOfMemory error like the following may occur.
+
+
If it does, increase the maximum heap memory: in bin/setenv.[sh|bat], change the setting to -Xmx1024m (here the maximum is set to 1024 MB).
+
+
+
+
+ The maximum heap memory on the crawler side can also be changed.
+ The default is 512 MB.
+
+ To change it, uncomment crawlerJavaOptions in webapps/fess/WEB-INF/classes/fess.dicon and set the value to -Xmx1024m (here the maximum is set to 1024 MB).
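A sketch of the uncommented property in fess.dicon (the exact option list varies by version; only the -Xmx1024m value matters here):

<property name="crawlerJavaOptions">new String[] {
  "-Djava.awt.headless=true", "-server",
  "-Xmx1024m", "-XX:MaxPermSize=128m"
}</property>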
+
Mobile device information is provided by ValueEngine Inc. To use the latest mobile device information, download the device profile, save it in webapps/fess/WEB-INF/classes/device with the _YYYY-MM-DD suffix removed from the file name, and restart Fess to apply the change.
+ To search PDF files that are protected by a password, you must register the password in a settings file.
+
+
+
+
+ First, create webapps/fess/WEB-INF/classes/s2robot_extractor.dicon.
+ The example sets the password test_~ for the file a.pdf.
+ For multiple files, add multiple addPassword settings.
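+ A sketch of the file (component wiring and argument order are assumptions; the pattern/password pair follows the prose above):
+
+ <component name="pdfExtractor"
+     class="org.seasar.robot.extractor.impl.PdfExtractor">
+   <initMethod name="addPassword">
+     <arg>".*a\.pdf"</arg>
+     <arg>"test_~"</arg>
+   </initMethod>
+ </component>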
Fess applies a stemming process when indexing and searching.
+
Stemming normalizes English words; for example, recharging and rechargable are both normalized to the form recharg. A search for recharging therefore also hits documents containing rechargable, which reduces missed results.
+
+
+
Because stemming is a basic rule-based process, it can also normalize words in unintended ways. For example, Maine (the state name) is normalized to main.
+
In such cases you can exclude a word from stemming by adding it to protwords.txt.
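The corresponding protwords.txt entry is just the word itself, one protected word per line (the file sits in the Solr conf directory, solr/core1/conf in the default layout):

# protwords.txt - words listed here are protected from stemming
Maine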
Fess can replicate Solr index data from a specified path. By running two Fess servers, one for crawling and index creation and one for search, you can keep the indexing load off the search server.
+
To use the replication feature, the Solr index files must be on a shared disk, such as NFS, that each Fess server can access.
+
+
+
+
Download and install Fess; assume it is installed in /NET/Server1/usr/local/Fess.
+
After this Fess starts, register the crawl settings and create the index by crawling, exactly as in a normal setup (the index-building procedure is no different from the usual one).
+
+
+
Download and install Fess; assume it is installed in /NET/Server2/usr/local/Fess.
+
After this Fess starts, check the box that enables the replication feature in the crawl settings of the management screen and set the "snapshot path". The snapshot path designates the location of the index created by the indexing Fess; in this case it is /NET/Server1/usr/local/Fess/solr/core1/data/index.
+
+
Press the update button to save the settings; replication of the index is then performed at the time set in the schedule.
Fess can partition search results according to the credentials of users authenticated by an arbitrary authentication system. For example, a document carrying role A appears in the search results of a user who has role A, but not in those of user B who lacks it. Using this feature in a portal-login or single sign-on environment, you can scope searches to the department or job title a user belongs to.
+
Role-based search in Fess can obtain role information from the following sources:
+
+
Request parameters
+
Request headers
+
Cookies
+
J2EE authentication information
+
+
When Fess runs behind a portal or an agent-based single sign-on system that stores authentication information in cookies, role information can be retrieved from cookies whose domain and path make them visible to Fess. With a reverse-proxy type single sign-on system, role information can be retrieved from authentication information that the proxy adds to request headers or request parameters.
+
+
+
The following describes how to set up role-based search using J2EE authentication information.
+
+
Add roles and users to conf/tomcat-users.xml. In this example, role-based search is performed with the role1 role, so create a user that logs in with role1.
+
+
+
+
+
+
+
+
+
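The listing was stripped; a sketch of the kind of tomcat-users.xml entries involved (user name and password are placeholders):

<tomcat-users>
  <role rolename="role1"/>
  <user username="taro" password="taropass" roles="role1"/>
</tomcat-users>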
+
+
+
Next, configure webapps/fess/WEB-INF/classes/app.dicon as shown below.
+
+
+ {"guest"}
+
+
+ :
+]]>
+
By setting defaultRoleList, you can assign role information to requests that carry no authentication information. Here it ensures that users who are not logged in do not see search results that require a role.
+
+
+
Next, configure webapps/fess/WEB-INF/classes/fess.dicon as shown below.
+
+ "role1"
+
+ :
+]]>
+
Multiple roles can be listed in authenticatedRoles, separated by commas (,).
+
+
+
Next, configure webapps/fess/WEB-INF/web.xml as shown below.
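The listing for this step was stripped; a sketch of the kind of change involved, adding role1 to the container authentication constraints (the exact structure of the shipped web.xml is an assumption):

<security-constraint>
  <web-resource-collection>
    <web-resource-name>Fess Authentication</web-resource-name>
    <url-pattern>/login/login</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>role1</role-name>
  </auth-constraint>
</security-constraint>
<security-role>
  <role-name>role1</role-name>
</security-role>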
Start Fess and log in as an administrator. From the role menu, register a role with name Role1 (any name will do) and value role1. Then, in each crawl configuration you want to expose to users with role1, select Role1 as the role.
+
+
+
Log out of the management screen and log in as the role1 user. On successful login you are redirected to the top of the search screen.
+
Search as usual; only documents from crawl configurations assigned the Role1 role are displayed.
+
Searches performed without logging in run as the guest user.
+
+
+
To log out, access http://localhost:8080/fess/admin while logged in with a non-admin role; a logout screen appears, and pressing the logout button logs you out.
By default, Fess uses port 8080. To change it, follow the steps below.
+
+
Change the ports of the Tomcat that Fess runs on by editing the following entries in conf/server.xml.
+
+
8080: HTTP access port
+
8005: shutdown port
+
8009: AJP port
+
8443: SSL HTTP access port (disabled by default)
+
19092: database port (used by H2 Database)
+
+
+
+
In the standard configuration Solr runs in the same Tomcat, so if you change the Tomcat port you may also need to change the Solr server URL that Fess references; change it in webapps/fess/WEB-INF/classes/fess_solr.dicon.
+ "http://localhost:8080/solr"
+]]>
+
+ Note: if you change the Tomcat port but leave the above URL unchanged, Fess cannot reach the Solr server and errors appear on search and index update.
+
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/config/solr-dynamic-field.xml b/src/site/en/xdoc/4.0/config/solr-dynamic-field.xml
new file mode 100644
index 000000000..483b0a5f9
--- /dev/null
+++ b/src/site/en/xdoc/4.0/config/solr-dynamic-field.xml
@@ -0,0 +1,48 @@
+
+
+
+ How to use Solr dynamic fields
+ Shinsuke Sugaya
+
+
+
+
Solr registers each document item (field) according to a defined schema. The Solr schema used by Fess is defined in solr/core1/conf/schema.xml. It defines standard fields such as title and content, as well as dynamic fields whose names can be chosen freely; the dynamic fields available in Fess are those defined in schema.xml. For the meaning of the parameter values, see the Solr documentation.
Dynamic fields are most often used in database crawling, for example when registering extra columns in data store crawl settings. To register a dynamic field in a database crawl, add other_t = hoge to the script setting; the data in the hoge column is then stored in the Solr field other_t.
+
To retrieve dynamic field data from Solr, the field must also be added in webapps/fess/WEB-INF/classes/app.dicon; add other_t there.
With the settings above the value is returned from Solr, so edit the JSP file to display it on the page. Log in to the management screen and open the design page. Search results are rendered by the "search results page (content)" JSP file, so edit that file: write ${f:h(doc.other_t)} where you want the other_t value to appear, and the registered value is displayed.
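A minimal sketch of such an edit in the search results JSP (the surrounding markup is illustrative):

<%-- inside the per-result markup of the search results page (content) --%>
<span class="other">${f:h(doc.other_t)}</span>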
Solr server group in the Fess, managing multiple groups. Change the status of servers and groups if the server and group information that keeps a Fess, inaccessible to the Solr server.
+
SOLR server state information can change in system setting. maxErrorCount, maxRetryStatusCheckCount, maxRetryUpdateQueryCount and minActiveServer can be defined in the webapps/fess/WEB-INF/classes/fess_solr.dicon.
+
+
+
When the number of Solr servers in a valid state within a Solr group falls below minActiveServer, the Solr group becomes disabled.
+
If the Solr group has not become disabled, the status of a Solr server that cannot be accessed is changed to disabled. The status of a disabled Solr server is then checked up to maxRetryStatusCheckCount times; if the server can be accessed again, its status is changed back to valid, and if it still cannot be accessed, it is changed to the index corrupted state.
+
A disabled Solr group cannot be used.
+
To enable a Solr group again, change the status of the Solr servers in the group to enabled in the system settings management screen.
+
+
+
+
+
Search queries are sent to a valid Solr group.
+
Search queries are sent only to Solr servers in a valid state.
+
If multiple Solr servers are registered in a Solr group, the search query is sent to the Solr server that is less used.
+
If sending a search query to a Solr server fails more than maxErrorCount times, that Solr server's status is changed to disabled.
+
+
+
+
+
Update queries are sent to a Solr group in a valid state.
+
Update queries are sent only to Solr servers in a valid state.
+
If multiple Solr servers are registered in a Solr group, the update query is sent to every Solr server in a valid state.
+
If sending an update query to a Solr server fails more than maxRetryUpdateQueryCount times, that Solr server's status is changed to the index corrupted state.
+
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/config/tokenizer.xml b/src/site/en/xdoc/4.0/config/tokenizer.xml
new file mode 100644
index 000000000..4181e30ff
--- /dev/null
+++ b/src/site/en/xdoc/4.0/config/tokenizer.xml
@@ -0,0 +1,36 @@
+
+
+
+ Settings for the index string extraction
+ Sone, Takaaki
+
+
+
+
+
When creating an index for search, the documents must be split into units in order to register them in the index.
+
A tokenizer is used for this.
+
Basically, searching with a unit smaller than the units carved out by the tokenizer finds no hits.
+
For example, suppose the sentence 'I live in Tokyo' (東京都に住む) is split by the tokenizer into units such as 'Tokyo' (東京都) and 'live' (住む). In this case, searching for the word 'Tokyo' gets a hit. However, searching for the word 'Kyoto' (京都), which in Japanese is only a substring of 'Tokyo' (東京都), gets no hit.
+
For this reason, the selection of the tokenizer is important.
+
In Fess, CJKTokenizer is used by default; you can change the tokenizer by editing the analyzer part of schema.xml.
+
+
+
+
CJKTokenizer creates a bi-gram index, in other words an index in two-character units, for multibyte strings such as Japanese. With this tokenizer, one-character words cannot be found.
+
+
+
+
StandardTokenizer creates a uni-gram index, in other words an index in one-character units, for multibyte strings such as Japanese. Therefore, there is less search leakage. Also, StandardTokenizer can handle the single-character search queries that CJKTokenizer cannot.
+
The following example changes the analyzer part of schema.xml so that StandardTokenizer is used.
+
+
+
+
+ :
+]]>
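The change is along these lines, a sketch based on the standard Solr factories; the field type name and the rest of the filter chain in the actual schema.xml may differ:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- was: <tokenizer class="solr.CJKTokenizerFactory"/> -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>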
+
+
+
+
diff --git a/src/site/en/xdoc/4.0/config/windows-service.xml b/src/site/en/xdoc/4.0/config/windows-service.xml
new file mode 100644
index 000000000..a0caee850
--- /dev/null
+++ b/src/site/en/xdoc/4.0/config/windows-service.xml
@@ -0,0 +1,54 @@
+
+
+
+ Register for the Windows service
+ Shinsuke Sugaya
+
+
+
+
In a Windows environment, you can register Fess as a Windows service. The way to register the service is the same as for Tomcat.
+
+
Because the crawling process of a registered Windows service sees only the Windows system environment variables, you must register JAVA_HOME as a system environment variable and add %JAVA_HOME%\bin to the Path.
+
+
+
Edit webapps\fess\WEB-INF\classes\fess.dicon and remove the -server option.
First, after installing Fess, run service.bat from a command prompt (on Vista and later you must launch it as administrator). In this example Fess is installed in C:\Java\fess-server-4.0.0.
+ cd C:\Java\fess-server-4.0.0\bin
+> service.bat install fess
+...
+The service 'fess' has been installed.
+]]>
+
+
+
With the following command you can review the properties for Fess. Running it opens the Tomcat properties window.
+ tomcat6w.exe //ES//fess
+]]>
+
+
+
Open Control Panel - Administrative Tools - Services, and you can configure automatic start just like a normal Windows service.
+
+
+
+
+
The Tomcat binaries distributed with Fess are 32-bit Windows builds. If you use 64-bit Windows, get the 64-bit Windows zip from the Tomcat site and replace tomcat6.exe and tomcat6w.exe.
+ Use boost search if you want to give priority to specific search terms.
+ For example, if you want pages containing 'apple' to rank above pages containing 'orange', search for 'apple^100 orange', appending '^number' to the term.
+ Specify an integer greater than 1 for the number.
+
+
+
diff --git a/src/site/en/xdoc/4.0/user/search-field.xml b/src/site/en/xdoc/4.0/user/search-field.xml
new file mode 100644
index 000000000..b30448189
--- /dev/null
+++ b/src/site/en/xdoc/4.0/user/search-field.xml
@@ -0,0 +1,57 @@
+
+
+
+ Search by specifying a search field
+ Shinsuke Sugaya
+
+
+
+
Fess saves the crawl results in fields such as title and content. You can search specifying one of these fields.
+
The following fields can be searched by default.
+
+
+
+
URL
+
The crawl URL
+
+
+
host
+
The host name included in the crawled URL
+
+
+
site
+
The site name included in the crawled URL
+
+
+
title
+
Title
+
+
+
content
+
Text
+
+
+
contentLength
+
The size of the crawled content
+
+
+
lastModified
+
The last modified time of the crawled content
+
+
+
mimetype
+
The MIME type of the content
+
+
+
+
If you do not specify a field, title and content are searched.
+
+
To search a field, enter "fieldname:search term" in the search form.
Fess supports fuzzy search based on the Levenshtein distance (fuzzy search).
+ To apply fuzzy search, append '~' to the search word.
+ For example, 'Solr~' finds documents that contain terms similar to the string "Solr" (such as "Solar").
+
+ Furthermore, you can specify a number between 0 and 1 after the '~'; the closer to 1, the more similar the matches must be.
+ For example, use the form 'Solr~0.8'.
+ If no number is specified, the default value 0.5 is used.
Labels registered in the management screen enable searching by label in the search screen. You can use labels if you want to narrow down search results. If no labels are registered, the label drop-down box is not displayed.
+
+
Labels are set when the index is created, so each search can be restricted to the labels specified in a crawl setting. A search that does not specify a label searches all results as usual.
To find documents that do not contain a word, use NOT search.
+ Put NOT in front of the word that must not be contained. NOT must be uppercase, with spaces before and after it.
+
For example, to find documents that contain search term 1 but not search term 2, enter 'search term 1 NOT search term 2'.
To find documents that contain any of several search terms, use OR search.
+ When you write multiple words in the search box, an AND search is performed by default.
+ For an OR search, write OR between the search words. OR must be uppercase, with spaces before and after it.
+
For example, to find documents that contain either search term 1 or search term 2, enter 'search term 1 OR search term 2'. OR can be used between more than two terms.
Range searches can be performed on fields.
+ For a range search, enter 'fieldname:value TO value' as the search term.
+ For example, to find documents whose contentLength field is between 1,000 and 10,000 bytes, search for 'contentLength:1000 TO 10000'.
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/admin/browserType-guide.xml b/src/site/en/xdoc/5.0/admin/browserType-guide.xml
new file mode 100644
index 000000000..8e229d1ad
--- /dev/null
+++ b/src/site/en/xdoc/5.0/admin/browserType-guide.xml
@@ -0,0 +1,19 @@
+
+
+
+ Setting the browser type
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to browser types. Browser types can be attached to the crawled data, so that search results can be divided according to the type of browser viewing them.
+
+
After logging in with an administrator account, click 'Browser Types' in the menu.
+
+
+
+
You can set the display name and value. Use this if you want to add new terminal types. No special customization is needed, so use it only where necessary.
You can use the Settings Wizard to set up Fess easily.
+
+
After logging in with an administrator account, click 'Settings Wizard' in the menu.
+
+
First, set a schedule.
+
Fess crawls and creates the index at the scheduled time.
+
By default, it is 0:00 am every day.
+
+
Next, the crawl settings.
+
A crawl setting registers a URI to crawl.
+
Give the crawl setting any name that is easy to identify.
+
In the URI part, put the URI that you want indexed and searchable.
+
+
For example, if you want to search http://example.com, it looks like the following.
+
+
This is the last setting.
+
Press the 'Start Crawling' button to start crawling now. If you press the Finish button instead, crawling does not start until the time specified in the schedule settings.
+
+
+
+
Settings made in the Settings Wizard can be changed later from the crawl general, Web, and file system settings.
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/admin/crawl-guide.xml b/src/site/en/xdoc/5.0/admin/crawl-guide.xml
new file mode 100644
index 000000000..ed8fc3c13
--- /dev/null
+++ b/src/site/en/xdoc/5.0/admin/crawl-guide.xml
@@ -0,0 +1,139 @@
+
+
+
+ The General crawl settings
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to crawling.
+
+
After logging in with an administrator account, click 'Crawl General' in the menu.
+
+
You can specify the path of the generated index and enable the replication feature.
+
+
+
+
You can set the interval at which Web sites and file systems are crawled. By default, it is the following.
+
+
The figures represent, from left: seconds, minutes, hours, day of the month, month, and day of the week. The format is similar to Unix cron settings. This example crawls every day at 0:00 am.
+
The following are examples of how to write the schedule.
+
+
+
+
0 0 12 * * ?
+
Starts at 12:00 pm every day
+
+
+
0 15 10 ? * *
+
Starts at 10:15 am every day
+
+
+
0 15 10 * * ?
+
Starts at 10:15 am every day
+
+
+
0 15 10 * * ? *
+
Starts at 10:15 am every day
+
+
+
0 15 10 * * ? 2005
+
Starts at 10:15 am every day during 2005
+
+
+
0 * 14 * * ?
+
Starts every minute from 2:00 pm to 2:59 pm every day
+
+
+
0 0/5 14 * * ?
+
Starts every 5 minutes from 2:00 pm to 2:59 pm every day
+
+
+
0 0/5 14,18 * * ?
+
Starts every 5 minutes from 2:00 pm to 2:59 pm and from 6:00 pm to 6:59 pm every day
+
+
+
0 0-5 14 * * ?
+
Starts every minute from 2:00 pm to 2:05 pm every day
+
+
+
0 10,44 14 ? 3 WED
+
Starts at 2:10 pm and 2:44 pm every Wednesday in March
+
+
+
0 15 10 ? * MON-FRI
+
Starts at 10:15 am Monday through Friday
+
+
+
+
You can also set the seconds field; note that by default the schedule is checked at 60-second intervals. If you need the seconds to be exact, customize the taskScanIntervalTime value in webapps/fess/WEB-INF/classes/chronosCustomize.dicon; if minute-level precision is enough, no change is needed.
+
+
+
When a user performs a search, the search is logged. Enable this if you want to collect search statistics.
+
+
+
The search terms are appended to the search result links. This makes it possible, for example, to display the searched terms when opening a PDF.
+
+
+
Search results can be retrieved in XML format, by accessing http://localhost:8080/fess/xml?query=search term.
+
+
+
Search results can be retrieved in JSON format, by accessing http://localhost:8080/fess/json?query=search term.
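For example, the following request returns the results for the query 'fess' (only the query parameter is taken from this guide):
curl "http://localhost:8080/fess/json?query=fess"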
+
+
+
A PC web site shown in search results on a mobile device may not display correctly. By selecting a mobile conversion, PC sites can be converted for mobile terminals when they are viewed. If you choose Google, the Google Wireless Transcoder is used to display the content on mobile phones: when a PC site is opened from the search results on a mobile terminal, the result link passes through the Google Wireless Transcoder, giving smooth mobile conversion in mobile search.
+
+
+
You can specify the label that is selected by default. Specify the value of the label.
+
+
+
You can specify whether the search screens are displayed. If you select Web, the mobile search screen is unavailable; if you select unavailable, no search screen can be used. Select unavailable if you want to build a dedicated index server.
+
+
+
Frequently searched words become available in JSON format, retrieved by accessing http://localhost:8080/fess/hotsearchword.
+
+
+
Deletes session logs older than the specified number of days. The log purge runs once a day and deletes old logs.
+
+
+
Deletes search logs older than the specified number of days. The log purge runs once a day and deletes old logs.
+
+
+
Specifies the names of bots whose logs you want to remove from the search log, comma (,) separated, matched against the user agent. The logs are deleted by the log purge once a day.
+
+
+
Specifies the encoding for the CSV files used in backup and restore.
+
+
+
Enabling the replication feature applies a copy of an already generated Solr index. For example, you can use it if you want a front search server to search an index that was crawled and generated on a different server.
+
+
+
After data is registered in Solr, the registered data becomes searchable when a commit or optimize is issued. If optimize is selected, Solr index optimization is issued; if commit is selected, a commit is issued.
+
+
+
Fess can group multiple Solr servers, and can manage multiple groups. Different Solr server groups are used for updates and for searches. For example, with two groups, Fess may use group 2 for updates and group 1 for searches. After a crawl completes, the update group switches to group 1 and the search group switches to group 2. This is only valid when multiple Solr server groups are registered.
+
+
+
To improve indexing performance, Fess sends documents to Solr in batches of 20 while crawling. Because documents added to Solr do not become searchable until committed, Fess issues a commit each time the number of documents specified here has been added. By default, a commit is issued after every 1000 documents.
+
+
+
Fess crawls documents by Web crawling and file system crawling. In each of them, only the number of crawl settings specified here run simultaneously. For example, if the number of simultaneous crawl settings is 3 and there are Web crawl settings 1 to 10, crawl settings 1 to 3 run first. When any of them completes, crawl setting 4 starts, and so on until setting 10: each time one completes, the next one starts.
+
The number of simultaneous crawl settings does not indicate the number of threads to start; the number of threads is specified separately in each crawl setting. For example, if the number of simultaneous crawl settings is 3 and each crawl setting specifies 5 threads, up to 3 x 5 = 15 threads crawl at the same time.
+
+
+
You can automatically delete data after it has been indexed. If you select 5 days, data that was registered in the index at least 5 days ago and has not been updated since is removed. You can use this to drop documents whose source content has been removed.
+
+
+
URLs registered as failure URLs are excluded from the next crawl once they exceed the failure count. By specifying this value, URLs are re-crawled next time regardless of the type of failure.
+
+
+
A failure URL that exceeds the number of failures is excluded from crawling.
+
+
+
If replication is enabled, the index information is copied from the index directory to the snapshot path and then applied.
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/admin/crawlingSession-guide.xml b/src/site/en/xdoc/5.0/admin/crawlingSession-guide.xml
new file mode 100644
index 000000000..1a48a8ff6
--- /dev/null
+++ b/src/site/en/xdoc/5.0/admin/crawlingSession-guide.xml
@@ -0,0 +1,34 @@
+
+
+
+ Set session information
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to session information. The results of one crawl are saved as one session information record. You can check the run time and the number of documents indexed.
+
+
After logging in with an administrator account, click 'Session Information' in the menu.
+
+
+
+
Clicking the 'delete all' link removes all session information that is not currently running.
+
+
+
+
By specifying a session ID, you can see the content of that crawl.
+
+
Crawler*: information about the entire crawl
+
FsCrawl*: information about file system crawling
+
WebCrawl*: information about Web crawling
+
Optimize*: information about optimize requests issued to the Solr server
+
Commit*: information about commits issued to the Solr server.
Here, describes how to back up and restore the Fess settings.
+
+
After logging in with an administrator account, click 'Backup/Restore' in the menu.
+
+
+
+
Click the download link to output the Fess settings in XML format. The following settings are saved.
+
+
The General crawl settings
+
Web crawl settings
+
File system Crawl settings
+
Path mapping
+
Web authentication
+
Compatible browsers
+
+
Session information, search logs, and click logs are available in CSV format.
+
The Solr index data and the data currently being crawled are not backed up. Those data can be regenerated by crawling again after restoring the Fess settings.
+
+
+
You can restore the settings and the various logs by uploading the XML or CSV files output by the backup. Specify the file and press the restore button for that data.
+
If 'overwrite data' is enabled when restoring XML settings, existing entries are updated when the same data already exists.
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/admin/dataStoreCrawling-guide.xml b/src/site/en/xdoc/5.0/admin/dataStoreCrawling-guide.xml
new file mode 100644
index 000000000..4075637c2
--- /dev/null
+++ b/src/site/en/xdoc/5.0/admin/dataStoreCrawling-guide.xml
@@ -0,0 +1,153 @@
+
+
+
+ Data store configuration
+ Sone, Takaaki
+ Shinsuke Sugaya
+
+
+
+
Fess can crawl databases. Here are the settings required for a data store.
+
+
After logging in with an administrator account, click 'Data Store' in the menu.
+
+
As an example, we connect to a MySQL database named testdb, with user name hoge and password fuga, holding the following table.
+
+
Assume data like the following has been inserted.
+
+
+
+
An example of the parameter settings looks like the following.
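A sketch of such parameters for the testdb example above; the table name doc and the exact SQL are assumptions, the key names come from the table below.
driver=com.mysql.jdbc.Driver
url=jdbc:mysql://localhost:3306/testdb
username=hoge
password=fuga
sql=select * from doc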
+
+
Parameters are in "key = value" format. The keys are described below.
+
+
+
+
driver
+
Driver class name
+
+
+
URL
+
URL
+
+
+
username
+
User name to connect to the DB
+
+
+
password
+
Password to connect to the DB
+
+
+
SQL
+
SQL statement that retrieves the data to crawl
+
+
+
+
+
+
An example of the script configuration looks like the following.
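A sketch of such a script, assuming columns named id, title and content in the crawled table; the 'key = value with OGNL on the value side' form is described below.
url="http://example.com/docs/" + id
title=title
content=content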
+
+
+ Parameters are in "key = value" format.
+ The keys are described below.
+
+ The value side is written in OGNL. Enclose strings in double quotation marks.
+ The value of a database column is accessible by its column name.
+
+
+
+
URL
+
URLs (links appear in search results)
+
+
+
host
+
Host name
+
+
+
site
+
Site path
+
+
+
title
+
Title
+
+
+
content
+
Content (the string to be indexed)
+
+
+
cache
+
Content cache (not indexed)
+
+
+
Digest
+
The digest shown in the search results
+
+
+
anchor
+
Links to content (not usually required)
+
+
+
contentLength
+
The length of the content
+
+
+
lastModified
+
Content last updated
+
+
+
+
+
+
A driver is needed to connect to the database. Put the JDBC driver jar file in webapps/fess/WEB-INF/cmd/lib.
+
+
+
To display item values such as latitude_s in the search results, set the following in webapps/fess/WEB-INF/classes/app.dicon, and then add ${doc.latitude_s} to searchResults.jsp.
Here are the settings for the design of the search screens.
+
+
After logging in with an administrator account, click 'Design' in the menu.
+
+
You can edit the search screens in the screen below.
+
+
+
+
You can upload image files to use in the search screens. The supported image file types are jpg, gif, and png.
+
+
+
Specify a file name for the uploaded image file if you want to use one. If omitted, the uploaded file's name is used.
+
+
+
You can edit the JSP files of the search screens. Pressing the Edit button of a JSP file lets you edit the current JSP file, and pressing the 'use default' button lets you edit the JSP file as it was at install time. Changes are applied when you save them with the update button in the edit screen.
+
The following JSP files can be edited.
+
+
+
+
Top page (frame)
+
The JSP file for the search home page. This JSP includes the JSP files of each part.
+
+
+
Top page (within the Head tags)
+
The JSP file for the head tag content of the search home page. Edit it to change meta tags, title tags, script tags, and so on.
+
+
+
Top page (content)
+
The JSP file for the body tag content of the search home page.
+
+
+
Search results pages (frames)
+
The JSP file for the search results list page. This JSP includes the JSP files of each part.
+
+
+
Search results page (within the Head tags)
+
The JSP file for the head tag content of the search results list page. Edit it to change meta tags, title tags, script tags, and so on.
+
+
+
Search results page (header)
+
The JSP file for the header of the search results list page. It contains the search form at the top.
+
+
+
Search results page (footer)
+
The JSP file for the footer of the search results page. It contains the copyright notice at the bottom of the page.
+
+
+
Search results pages (content)
+
The JSP file for the search results part of the list page. It is used when there are search results. Edit it to customize the representation of the results.
+
+
+
Search results page (result no)
+
The JSP file for the search results part of the list page. It is used when there are no search results.
+
+
+
+
You can edit the screens for PC and for mobile in the same way.
+
+
+
+
+
To display the registration or modification date of files crawled by Fess in the search results, write the following in the search results page (content).
Here is the failure URL page. URLs that could not be retrieved at crawl time are recorded and can be checked as failure URLs.
+
+
After logging in with an administrator account, click 'Failure URL' in the menu.
+
+
Clicking the confirmation link of a failure URL displays more details.
+
+
+
+
You can see at a glance the URLs that could not be crawled and the dates.
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/admin/fileAuthentication-guide.xml b/src/site/en/xdoc/5.0/admin/fileAuthentication-guide.xml
new file mode 100644
index 000000000..5e07af7ca
--- /dev/null
+++ b/src/site/en/xdoc/5.0/admin/fileAuthentication-guide.xml
@@ -0,0 +1,40 @@
+
+
+
+ Settings for file system authentication
+ Shinsuke Sugaya
+
+
+
+
Here, describes how to configure file system authentication, used when crawling a file system that requires authentication. Fess supports crawling Windows shared folders.
+
+
After logging in with an administrator account, click 'File System Authentication' in the menu.
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the settings apply to any host name in the specified file system crawl settings.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the settings apply to any port in the specified file system crawl settings.
+
+
+
Select the authentication method. You can use SAMBA (Windows shared folder authentication).
+
+
+
Specifies the user name used for authentication.
+
+
+
Specifies the password for the authentication site.
+
+
+
Sets additional settings required to log in to the authentication site. For SAMBA, the domain value can be set, written as follows.
+
+
+
+
Select the file system crawl setting names to apply the above authentication settings to. The file system crawl settings must be registered beforehand.
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/admin/fileCrawlingConfig-guide.xml b/src/site/en/xdoc/5.0/admin/fileCrawlingConfig-guide.xml
new file mode 100644
index 000000000..3356ff798
--- /dev/null
+++ b/src/site/en/xdoc/5.0/admin/fileCrawlingConfig-guide.xml
@@ -0,0 +1,98 @@
+
+
+
+ Settings for crawling a file system using
+ Shinsuke Sugaya
+
+
+
+
Describes the settings for crawling using a file system.
+
If you want to index more than 100,000 documents with Fess, we recommend dividing them into crawl settings of one to several tens of thousands of documents each. Indexing performance degrades when a single crawl setting targets more than 100,000 documents.
+
+
After logging in with an administrator account, click 'File System' in the menu.
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple paths, starting with file: or smb:. For example:
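For instance (smb://host1/share/ is discussed below; the other path is a sample):
file:/home/taro/
smb://host1/share/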
+
+
Crawling proceeds below the specified directories.
+
The path must be written as a URI, so a Windows path such as c:\Documents\taro is specified as file:/c:/Documents/taro.
+
For a Windows shared folder, for example to crawl the share folder of host1, specify smb://host1/share/ in the crawl settings (the trailing / is required). If the shared folder requires authentication, set the authentication information in the file system authentication screen.
+
+
+
By specifying regular expressions, you can limit crawling and searching to given path patterns or exclude them.
+
+
+
+
Path to crawl
+
Crawls the paths matching the specified regular expression.
+
+
+
The path to exclude from being crawled
+
Does not crawl the paths matching the specified regular expression. This wins even over a path specified as a path to crawl.
+
+
+
Path to be searched
+
Makes the paths matching the specified regular expression searchable. This wins even over a path specified as excluded from search.
+
+
+
Path to exclude from searches
+
Does not make the paths matching the specified regular expression searchable. A path excluded from the crawl cannot be searched at all, since it is never crawled; use this when you want a path crawled, so that its links are followed, but excluded from search only.
+
+
+
+
For example, to crawl only paths under /home/, specify the following as the path to crawl:
+
+
Also, to exclude paths with the png extension, specify the following as the path to exclude:
+
+
You can specify multiple patterns, one per line.
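A sketch of such patterns, assuming standard Java regular expressions:
Path to crawl:    file:/home/.*
Path to exclude:  .*\.png$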
+
The URI is handled as in java.io.File and looks like the following:
You can specify the maximum number of documents to retrieve when crawling.
+
+
+
Specifies the number of threads used for crawling. A value of 5 means 5 threads crawl the site simultaneously.
+
+
+
The interval, in milliseconds, between document retrievals. With one thread, a value of 5000 retrieves one document every 5 seconds.
+
With 5 threads and a 1000 millisecond interval, up to 5 documents are retrieved per second.
+
+
+
You can weight the URLs of this crawl setting for search, to rank them above others in the search results. The standard value is 1; the higher the value, the higher the documents appear in the search results. If you absolutely want these results favored over others, specify a sufficiently large value such as 10,000.
+
The value must be an integer greater than 0. It is used as the boost value when adding documents to Solr.
+
+
+
The crawled documents are registered with the browser types selected here. If you select only PC, the documents do not appear in search results on mobile devices. You can also use this to show documents only on specific mobile devices.
+
+
+
You can make documents appear in search results only for users with particular roles. Roles must be set up beforehand. For example, this is useful when you want to separate search results per user in a system that requires a login, such as a portal server.
+
+
+
You can attach labels to the search results. If labels are enabled, each search in the search screen can specify a label.
+
+
+
The crawl runs at crawl time when set to enabled. Use this setting to suspend crawling temporarily.
Here are the settings for labels. Labels, selected in the crawl settings, classify the documents that appear in search results. If labels are registered, a label drop-down box is shown to the right of the search box.
+
+
After logging in with an administrator account, click 'Label' in the menu.
+
+
+
+
+
Specifies the name displayed in the label drop-down when searching.
+
+
+
Specifies the identifier assigned to classified documents. This value is sent to Solr. It must consist of alphanumeric characters.
Here are the settings for duplicate hosts. Use them when different host names should be treated as the same host while crawling; for example, when www.example.com and example.com are the same site.
+
+
After logging in with an administrator account, click 'Duplicate Host' in the menu.
+
+
+
+
+
Specifies the canonical host name. Duplicate host names are replaced by the canonical host name.
+
+
+
Specifies the duplicated host name, that is, the host name to be replaced.
Here are the settings for path mapping. Use path mapping if you want to replace the links that appear in the search results.
+
+
After logging in with an administrator account, click 'Path Mapping' in the menu.
+
+
+
+
+
Path mapping replaces the parts of a link that match the specified regular expression with the replacement string. When crawling a local file system, the resulting search result links may not work in the user's environment; in such cases, path mapping lets you control the search result links. You can specify multiple path mappings.
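For example, a mapping that rewrites local file links to an HTTP file server could look like this (illustrative values):
Regular expression:  file:/home/share/
Replacement:         http://fileserver/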
Here is the request header. The request header feature adds header information to the requests sent when crawling documents. It is useful, for example, for systems that log the user in automatically when certain header values are present.
+
+
After logging in with an administrator account, click 'Request Header' in the menu.
+
+
+
+
+
Specifies the name of the request header to add to requests.
+
+
+
Specifies the value of the request header to add to requests.
+
+
+
Select the Web crawl setting names to add the request header to. The header is added only to requests of the selected crawl settings.
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/admin/roleType-guide.xml b/src/site/en/xdoc/5.0/admin/roleType-guide.xml
new file mode 100644
index 000000000..920a0329d
--- /dev/null
+++ b/src/site/en/xdoc/5.0/admin/roleType-guide.xml
@@ -0,0 +1,23 @@
+
+
+
+ Settings for a role
+ Shinsuke Sugaya
+
+
+
+
Here are the settings for roles. Roles, selected in the crawl settings, classify the documents that appear in search results. For how to use them, see the role-based search settings.
+
+
After logging in with an administrator account, click 'Role' in the menu.
+
+
+
+
+
Specifies the name that appears in the list.
+
+
+
Specifies the identifier assigned to classified documents. This value is sent to Solr. It must consist of alphanumeric characters.
After logging in with an administrator account, click 'Search' in the menu.
+
+
+
+
You can search with the criteria you specify. The regular search screen implicitly adds role and browser conditions, but this management search does not. From the search results, you can remove specific documents from the index.
Here is the search log. When users search in the search screen, a search log is recorded with the search terms and the date. The URL can also be recorded when a user follows a search result.
+
+
After logging in with an administrator account, click 'Search Log' in the menu.
+
+
+
+
The search terms and dates are listed. You can see the details by clicking the URL.
Describes the settings related to the Solr servers registered in Fess. Solr servers are registered in configuration files and organized in groups.
+
+
After logging in with an administrator account, click 'Solr' in the menu.
+
+
+
+
The update server is shown as running while documents are being added, and the crawl process shows its session ID while running. You can shut the Fess server down safely when nothing is running. If you shut Fess down while a crawl is running, it finishes the crawl process before terminating.
+
+
+
The server group names used for search and for update are displayed.
+
+
+
A server that becomes unavailable changes to the disabled status. For example, a Solr server that cannot be accessed changes to disabled. To use a recovered server again, enable it.
+
+
+
You can issue commit and optimize to the server groups. You can also delete the documents of a specific session ID, or delete only specific documents by specifying their URL.
+
+
+
The number of documents registered in each session is shown. You can check the result list by clicking a session name.
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/admin/systemInfo-guide.xml b/src/site/en/xdoc/5.0/admin/systemInfo-guide.xml
new file mode 100644
index 000000000..1268f60ce
--- /dev/null
+++ b/src/site/en/xdoc/5.0/admin/systemInfo-guide.xml
@@ -0,0 +1,28 @@
+
+
+
+ System information
+ Shinsuke Sugaya
+
+
+
+
Here you can check the current system environment variables, properties, and similar information.
+
+
After logging in with an administrator account, click 'System Information' in the menu.
+
+
+
+
You can list the server environment variables.
+
+
+
You can list the system properties of Fess.
+
+
+
The Fess setup information is available.
+
+
+
A list of properties to attach when reporting a bug. The extracted values contain no personal information.
Describes the Web authentication settings, used when crawling sites that require authentication. Fess supports crawling with BASIC authentication and DIGEST authentication.
+
+
After logging in with an administrator account, click 'Web Authentication' in the menu.
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the settings apply to any host name in the specified Web crawl settings.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the settings apply to any port in the specified Web crawl settings.
+
+
+
Specifies the realm name of the site that requires authentication. If omitted, the settings apply to any realm name in the specified Web crawl settings.
+
+
+
Select the authentication method. You can use BASIC authentication, DIGEST authentication or NTLM authentication.
+
+
+
Specifies the user name used for authentication.
+
+
+
Specifies the password for the authentication site.
+
+
+
Sets additional settings required to log in to the authentication site. For NTLM authentication, the workstation and domain values can be set, written as follows.
+
+
+
+
Select the Web crawl setting names to apply the above authentication settings to. The Web crawl settings must be registered beforehand.
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/admin/webCrawlingConfig-guide.xml b/src/site/en/xdoc/5.0/admin/webCrawlingConfig-guide.xml
new file mode 100644
index 000000000..0f196a47a
--- /dev/null
+++ b/src/site/en/xdoc/5.0/admin/webCrawlingConfig-guide.xml
@@ -0,0 +1,99 @@
+
+
+
+ Settings for crawling the Web using
+ Shinsuke Sugaya
+
+
+
+
Describes the settings for crawling using the Web.
+
If you want to index more than 100,000 documents with Fess, we recommend dividing them into crawl settings of one to several tens of thousands of documents each. Indexing performance degrades when a single crawl setting targets more than 100,000 documents.
+
+
After logging in with an administrator account, click 'Web' in the menu.
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple URLs, starting with http: or https:. For example:
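For instance (sample URLs):
http://localhost/
http://example.com/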
+
+
Crawling proceeds from the specified URLs.
+
+
+
By specifying regular expressions, you can limit crawling and searching to specific URL patterns or exclude them.
+
+
+
+
URL to crawl
+
Crawls the URLs matching the specified regular expression.
+
+
+
Excluded from the crawl URL
+
Does not crawl the URLs matching the specified regular expression. This wins even over a URL specified as a URL to crawl.
+
+
+
To search for URL
+
Makes the URLs matching the specified regular expression searchable. This wins even over a URL specified as excluded from search.
+
+
+
To exclude from the search URL
+
Does not make the URLs matching the specified regular expression searchable. A URL excluded from the crawl cannot be searched at all, since it is never crawled; use this when you want a URL crawled, so that its links are followed, but excluded from search only.
+
+
+
+
For example, to crawl only URLs under http://localhost/, specify the following as the URL to crawl:
+
+
Also, to exclude URLs with the png extension, specify the following as the URL to exclude:
+
+
You can specify multiple patterns, one per line.
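A sketch of such patterns, assuming standard Java regular expressions:
URL to crawl:    http://localhost/.*
URL to exclude:  .*\.png$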
+
+
+
You can specify the tracing depth, that is, how far the crawler follows the links contained in documents.
+
+
+
You can specify the maximum number of documents to retrieve when crawling.
+
+
+
You can specify the user agent to use when crawling.
+
+
+
Specifies the number of threads used for crawling. A value of 5 means 5 threads crawl the site simultaneously.
+
+
+
The interval, in milliseconds, between document retrievals. With one thread, a value of 5000 retrieves one document every 5 seconds.
+
With 5 threads and a 1000 millisecond interval, up to 5 documents are retrieved per second. Set an adequate value when crawling a Web site so that you do not put too much load on the Web server.
+
+
+
You can weight the URLs of this crawl setting for search, to rank them above others in the search results. The standard value is 1; the higher the value, the higher the documents appear in the search results. If you absolutely want these results favored over others, specify a sufficiently large value such as 10,000.
+
The value must be an integer greater than 0. It is used as the boost value when adding documents to Solr.
+
+
+
The crawled documents are registered with the browser types selected here. If you select only PC, the documents do not appear in search results on mobile devices. You can also use this to show documents only on specific mobile devices.
+
+
+
You can make documents appear in search results only for users with particular roles. Roles must be set up beforehand. For example, this is useful when you want to separate search results per user in a system that requires a login, such as a portal server.
+
+
+
You can attach labels to the search results. If labels are enabled, each search in the search screen can specify a label.
+
+
+
The crawl runs at crawl time when set to enabled. Use this setting to suspend crawling temporarily.
+
+
+
+
+
Fess crawls the URLs defined in sitemap files. Sitemaps follow the http://www.sitemaps.org/ specification. The available formats are XML Sitemaps, XML Sitemap Index, and text (one URL per line).
+
Specify the sitemap URL as a URL to crawl. A sitemap is an XML or text file, so when crawling, an ordinary URL cannot be distinguished from a sitemap. By default, URLs whose file names match sitemap.*.xml, sitemap.*.gz, or sitemap.*txt are handled as sitemaps (this can be customized in webapps/fess/WEB-INF/classes/s2robot_rule.dicon).
+
The URLs listed in a crawled sitemap file are treated like links found in an HTML file and are crawled in the next crawl.
Under normal circumstances, H2 Database is used as the database. You can use another database by changing the settings.
+
+
+
+
Extract the MySQL binaries.
+
+
+
Create the databases.
+ create database fess_db;
+mysql> grant all privileges on fess_db.* to fess_user@localhost identified by 'fess_pass';
+mysql> create database fess_robot;
+mysql> grant all privileges on fess_robot.* to s2robot@localhost identified by 's2robot';
+mysql> FLUSH PRIVILEGES;
+]]>
+
Create the tables in the databases. The DDL files are located in extension/mysql.
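Loading them could look like the following; the DDL file names are placeholders, check the actual names under extension/mysql:
> mysql -u fess_user -p fess_db < extension/mysql/fess.ddl
> mysql -u s2robot -p fess_robot < extension/mysql/robot.ddl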
+ With the increased awareness of security in browser environments in recent years, Web pages can no longer open local files (for example, c:\hoge.txt).
+ Having to copy the link from the search results and paste it into the browser to open it is poor usability.
+ To address this, Fess provides a desktop search feature.
+
+
+
+ The desktop search feature is turned off by default.
+ Enable it with the following settings.
+
First, edit bin/setenv.bat and change java.awt.headless from true to false.
+
+
Then add the following to webapps/fess/WEB-INF/conf/crawler.properties.
+
+
Start Fess after making the settings above. The basic usage remains the same.
+
+
+
+
Make sure Fess is not accessible from the outside (for example, do not open port 8080 externally).
+
Because java.awt.headless is false, image size conversion for mobile devices is not available.
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/config/filesize.xml b/src/site/en/xdoc/5.0/config/filesize.xml
new file mode 100644
index 000000000..dc6c6adb0
--- /dev/null
+++ b/src/site/en/xdoc/5.0/config/filesize.xml
@@ -0,0 +1,28 @@
+
+
+
+ File size you want to crawl settings
+ Shinsuke Sugaya
+
+
+
+
You can set limits on the file sizes that Fess crawls. By default, HTML files are handled up to 2.5 MB, and other files up to 10 MB. To change the file sizes handled, edit webapps/fess/WEB-INF/classes/s2robot_contentlength.dicon. The standard s2robot_contentlength.dicon is as follows.
Change the value of defaultMaxLength to change the default limit. The file size handled can also be specified per content type; the example describes the maximum file size handled for text/html, that is, HTML files.
+
Note the amount of heap memory required when increasing the maximum handled file size. For how to set it, see the memory-related documentation.
Combined with Google Maps, for example, you can use geolocation (GEO) search for documents that carry latitude and longitude information.
+
+
+
+
The location information is defined in a field named location.
+ When generating the index in Solr, set latitude,longitude in the location field in a format such as 45.17614,-93.87341 and register the document.
+ Also set the values in the latitude_s and longitude_s fields if you want to display latitude and longitude in the search results. *_s is available as a Solr string dynamic field.
+
+
+
At search time, specify the latitude, longitude and distance in the request parameters.
+ The results within the given distance (km) of the position (latitude, longitude) are returned. Latitude, longitude and distance are treated as double values.
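A request could therefore look like the following; the latitude, longitude, and distance parameter names come from this guide, the rest of the URL is an assumption:
http://localhost:8080/fess/?query=fess&latitude=45.17614&longitude=-93.87341&distance=10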
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/config/index-backup.xml b/src/site/en/xdoc/5.0/config/index-backup.xml
new file mode 100644
index 000000000..930ea9df7
--- /dev/null
+++ b/src/site/en/xdoc/5.0/config/index-backup.xml
@@ -0,0 +1,13 @@
+
+
+
+ Index backup and restore
+ Shinsuke Sugaya
+
+
+
+
The index data is managed by Solr. It can be backed up from the administration screen of Fess, but index data of gigabyte size or with very many files cannot be backed up that way.
+
If you need to back up such index data, stop Fess and back up the solr/core1/data directory. To restore, put the backed-up index data back.
+
+
+
diff --git a/src/site/en/xdoc/5.0/config/index.xml b/src/site/en/xdoc/5.0/config/index.xml
new file mode 100644
index 000000000..76a68d7da
--- /dev/null
+++ b/src/site/en/xdoc/5.0/config/index.xml
@@ -0,0 +1,12 @@
+
+
+
+ Set up Guide
+ Shinsuke Sugaya
+
+
+
+
Here are the Fess 5.0 setup instructions.
+
+
+
diff --git a/src/site/en/xdoc/5.0/config/install-on-tomcat.xml b/src/site/en/xdoc/5.0/config/install-on-tomcat.xml
new file mode 100644
index 000000000..7d1afeaf2
--- /dev/null
+++ b/src/site/en/xdoc/5.0/config/install-on-tomcat.xml
@@ -0,0 +1,43 @@
+
+
+
+ Install to an existing Tomcat
+ Shinsuke Sugaya
+
+
+
+
+ The standard distribution of Fess is distributed with Tomcat in an already deployed state.
+ Because Fess does not depend on Tomcat, it can be deployed on any Java application server.
+ Here, describes how to deploy Fess on an already available Tomcat.
+ Extract the downloaded Fess server.
+ Let $FESS_HOME be the top directory of the extracted Fess server,
+ and $TOMCAT_HOME the top directory of the existing Tomcat 6.
+ Copy the Fess server data.
+
+
+ If you have changed any of the destination files, compare them, for example with the diff command, and apply only your differences.
+
Depending on the contents of the crawl settings, an OutOfMemory error like the following may occur.
+
+
In that case, increase the maximum heap memory: change the value in bin/setenv.[sh|bat] to -Xmx1024m (in this example the maximum is set to 1024M).
+
+
+
+
+ The maximum memory of the crawler side can also be changed.
+ The default is 512 m.
+
+ Uncomment the crawlerJavaOptions entry in webapps/fess/WEB-INF/classes/fess.dicon and change it to -Xmx1024m (in this example the maximum is set to 1024M).
+
+ The mobile device information is provided by ValueEngine Inc. To use the latest mobile device information, download the device profile, remove the _YYYY-MM-DD suffix from the file name, and save it as webapps/fess/WEB-INF/classes/device. Restart Fess to apply the change.
+ To make password-protected PDF files searchable, you must register the passwords in the settings file.
+
+
+
+
+ First, create webapps/fess/WEB-INF/classes/s2robot_extractor.dicon.
+ This example registers a password for the PDF file matching the specified file name pattern.
+ If there are multiple files, add multiple addPassword settings.
In Fess, a stemming process is applied when indexing and searching.
+
This normalizes English words; for example, words such as 'recharging' and 'rechargable' are normalized to the form 'recharg'. As a result, a search for 'recharging' also hits the word 'rechargable', so less search leakage is expected.
+
+
+
The stemming process is basic rule-based processing, so unintended normalization can occur. For example, the word 'Maine' (the state name) is normalized to 'main'.
+
In such a case, you can exclude the word from the stemming process by adding Maine to protwords.txt.
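protwords.txt lists one protected word per line, so adding the line below keeps 'Maine' out of the stemming process:
Maine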
Fess can copy the Solr index data to another path. Using this, you can distribute load by building two Fess servers, one for crawling and index creation and one for search.
+
To use the replication feature of Fess, the Solr index files must be on a shared disk, such as NFS, that each Fess server can reference.
+
+
+
+
Download and install Fess. Assume it is installed in /NET/Server1/usr/local/Fess.
+
After starting Fess, register the crawl settings and create the index by crawling, as in a normal setup (the index building procedure for this Fess is the same as the normal procedure).
+
+
+
Download and install Fess. Assume it is installed in /NET/Server2/usr/local/Fess.
+
After starting Fess, check the 'enable replication' check box in the crawl settings of the management screen and set the snapshot path. The snapshot path specifies the index location of the index-creating Fess; in this case it is /NET/Server1/usr/local/Fess/solr/core1/data/index.
+
+
Press the update button to save the settings; replication of the index is performed at the time set in the schedule.
Fess can partition search results according to the user credentials of any authentication system. For example, a document that appears in the results of a user who has role A is not displayed for a user who has only role B. Using this feature, you can enable search by department or job title for logged-in users in portal and single sign-on environments.
+
Fess role-based search can obtain the role information from the following sources.
+
+
Request parameter
+
Request header
+
Cookies
+
J2EE authentication information
+
+
For portal and agent type single sign-on systems, save the authentication information in a cookie whose domain and path let Fess retrieve the role information. For reverse proxy type single sign-on systems, Fess can retrieve the role information from authentication information added to the request headers or request parameters of accesses to Fess.
+
+
+
Describes how to set up role-based search using J2EE authentication information.
+
+
Add roles and users to conf/tomcat-users.xml. In this example the role1 role performs role-based search, and we log in as a role1 user.
+
+
+
+
+
+
+
+
+
+]]>
+
+
+
Set webapps/fess/WEB-INF/classes/app.dicon as shown below.
+
+
+ {"guest"}
+
+
+ :
+]]>
+
By setting defaultRoleList, you can define the role information used when there is no authentication information. With this setting, search results that require other roles are not displayed to users who are not logged in.
+
+
+
Set webapps/fess/WEB-INF/classes/fess.dicon as shown below.
+
+ "role1"
+
+ :
+]]>
+
Multiple roles can be listed in authenticatedRoles, separated by commas (,).
+
+
+
Set webapps/fess/WEB-INF/web.xml as shown below.
Start Fess and log in as an administrator. From the role menu, register a role with the name Role1 (any name is fine) and the value role1. Then, in each crawl setting that should be available to users with the role1 role, select Role1 as the crawl setting's role.
+
+
+
Log out of the management screen and log in as a user with the role1 role. On successful login, you are redirected to the top of the search screen.
+
Search as usual, and only results from crawl settings with the Role1 role are displayed.
+
Also, searches by users who are not logged in are performed as the guest user.
+
+
+
To log out, access http://localhost:8080/fess/admin while logged in (users without an admin role see the login screen) and press the logout button.
Fess uses port 8080 by default. To change it, follow the steps below.
+
+
Change the port of the Tomcat that runs Fess. Modify the ports described below in conf/server.xml.
+
+
8080: HTTP access port
+
8005: shut down port
+
8009: AJP port
+
8443: SSL HTTP access port (disabled by default)
+
19092: database port (used by H2 Database)
+
+
+
+
If you change the Tomcat port, you may also need to change the Solr server information referenced by Fess, because in the standard configuration Solr runs in the same Tomcat. Change webapps/fess/WEB-INF/classes/fess_solr.dicon.
+ "http://localhost:8080/solr"
+]]>
+
+ Note: if Fess cannot access the Solr server, errors are displayed on search and index update. If you change the Tomcat port, do not forget to change the above setting to match.
+
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/config/solr-dynamic-field.xml b/src/site/en/xdoc/5.0/config/solr-dynamic-field.xml
new file mode 100644
index 000000000..483b0a5f9
--- /dev/null
+++ b/src/site/en/xdoc/5.0/config/solr-dynamic-field.xml
@@ -0,0 +1,48 @@
+
+
+
+ How to use the dynamic field of SOLR
+ Shinsuke Sugaya
+
+
+
+
Solr registers documents according to the fields defined in its schema. The Solr schema available in Fess is defined in solr/core1/conf/schema.xml. In addition to standard fields such as title and content, dynamic fields, whose field names can be defined freely, are available. The dynamic fields defined in schema.xml can be used in Fess. For details on the parameter values, see the Solr documentation.
Dynamic fields are often used when registering data in database crawling, that is, in the data store crawl settings. For example, to register the data of a column named hoge into the Solr field other_t during a database crawl, put other_t = hoge in the script setting.
+
To retrieve dynamic field data from Solr, you need to add the field in webapps/fess/WEB-INF/classes/app.dicon as follows. Add other_t.
With the settings above the field is returned from Solr, so edit the JSP file to display it on the page. Log in to the management screen and open the design page. Edit the JSP file for the search results part of the search results page (the content). Where you want to display the value, write ${f:h(doc.other_t)} to output the registered value.
Fess manages multiple Solr servers in groups. If a Solr server becomes inaccessible, Fess changes the status of the server and group information that it keeps.
+
The Solr server state information can be changed in the system settings. maxErrorCount, maxRetryStatusCheckCount, maxRetryUpdateQueryCount and minActiveServer can be defined in webapps/fess/WEB-INF/classes/fess_solr.dicon.
+
+
+
When the number of Solr servers in a valid state within a Solr group falls below minActiveServer, the Solr group becomes disabled.
+
If the Solr group has not become disabled, the status of a Solr server that cannot be accessed is changed to disabled. The status of a disabled Solr server is then checked up to maxRetryStatusCheckCount times; if the server can be accessed again, its status is changed back to valid, and if it still cannot be accessed, it is changed to the index corrupted state.
+
A disabled Solr group cannot be used.
+
To enable a Solr group again, change the status of the Solr servers in the group to enabled in the system settings management screen.
+
+
+
+
+
Search queries are sent to a valid Solr group.
+
Search queries are sent only to Solr servers in a valid state.
+
If multiple Solr servers are registered in a Solr group, the search query is sent to the Solr server that is less used.
+
If sending a search query to a Solr server fails more than maxErrorCount times, that Solr server's status is changed to disabled.
+
+
+
+
+
Update queries are sent to a Solr group in a valid state.
+
Update queries are sent only to Solr servers in a valid state.
+
If multiple Solr servers are registered in a Solr group, the update query is sent to every Solr server in a valid state.
+
If sending an update query to a Solr server fails more than maxRetryUpdateQueryCount times, that Solr server's status is changed to the index corrupted state.
+
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/config/tokenizer.xml b/src/site/en/xdoc/5.0/config/tokenizer.xml
new file mode 100644
index 000000000..4181e30ff
--- /dev/null
+++ b/src/site/en/xdoc/5.0/config/tokenizer.xml
@@ -0,0 +1,36 @@
+
+
+
+ Settings for the index string extraction
+ Sone, Takaaki
+
+
+
+
+
When creating an index for search, the documents must be split into units in order to register them in the index.
+
A tokenizer is used for this.
+
Basically, searching with a unit smaller than the units carved out by the tokenizer finds no hits.
+
For example, suppose the sentence 'I live in Tokyo' (東京都に住む) is split by the tokenizer into units such as 'Tokyo' (東京都) and 'live' (住む). In this case, searching for the word 'Tokyo' gets a hit. However, searching for the word 'Kyoto' (京都), which in Japanese is only a substring of 'Tokyo' (東京都), gets no hit.
+
For this reason, the selection of the tokenizer is important.
+
In Fess, CJKTokenizer is used by default; you can change the tokenizer by editing the analyzer part of schema.xml.
+
+
+
+
CJKTokenizer creates a bi-gram index, in other words an index in two-character units, for multibyte strings such as Japanese. With this tokenizer, one-character words cannot be found.
+
+
+
+
StandardTokenizer creates a uni-gram index, in other words an index in one-character units, for multibyte strings such as Japanese. Therefore, there is less search leakage. Also, StandardTokenizer can handle the single-character search queries that CJKTokenizer cannot.
+
The following example changes the analyzer part of schema.xml so that StandardTokenizer is used.
+
+
+
+
+ :
+]]>
+
+
+
+
diff --git a/src/site/en/xdoc/5.0/config/windows-service.xml b/src/site/en/xdoc/5.0/config/windows-service.xml
new file mode 100644
index 000000000..16ce65f50
--- /dev/null
+++ b/src/site/en/xdoc/5.0/config/windows-service.xml
@@ -0,0 +1,54 @@
+
+
+
+ Register for the Windows service
+ Shinsuke Sugaya
+
+
+
+
In a Windows environment, you can register Fess as a Windows service. The way to register the service is the same as for Tomcat.
+
+
Because the crawling process of a registered Windows service sees only the Windows system environment variables, you must register JAVA_HOME as a system environment variable and add %JAVA_HOME%\bin to the Path.
+
+
+
Edit webapps\fess\WEB-INF\classes\fess.dicon and remove the -server option.
First, after installing Fess, run service.bat from a command prompt (on Vista and later you must launch it as administrator). In this example Fess is installed in C:\Java\fess-server-5.0.0.
+ cd C:\Java\fess-server-5.0.0\bin
+> service.bat install fess
+...
+The service 'fess' has been installed.
+]]>
+
+
+
With the following command you can review the properties for Fess. Running it opens the Tomcat properties window.
+ tomcat6w.exe //ES//fess
+]]>
+
+
+
Open Control Panel - Administrative Tools - Services, and you can configure automatic start just like a normal Windows service.
+
+
+
+
+
The Tomcat binaries distributed with Fess are 32-bit Windows builds. If you use 64-bit Windows, get the 64-bit Windows zip from the Tomcat site and replace tomcat6.exe and tomcat6w.exe.
+ Use boost search if you want to give priority to specific search terms.
+ For example, if you want pages containing 'apple' to rank above pages containing 'orange', search for 'apple^100 orange', appending '^number' to the term.
+ Specify an integer greater than 1 for the number.
+
+
+
diff --git a/src/site/en/xdoc/5.0/user/search-field.xml b/src/site/en/xdoc/5.0/user/search-field.xml
new file mode 100644
index 000000000..b30448189
--- /dev/null
+++ b/src/site/en/xdoc/5.0/user/search-field.xml
@@ -0,0 +1,57 @@
+
+
+
+ Search by specifying a search field
+ Shinsuke Sugaya
+
+
+
+
Fess saves the crawl results in fields such as title and content. You can search specifying one of these fields.
+
The following fields can be searched by default.
+
+
+
+
URL
+
The crawl URL
+
+
+
host
+
The host name included in the crawled URL
+
+
+
site
+
The site name included in the crawled URL
+
+
+
title
+
Title
+
+
+
content
+
Text
+
+
+
contentLength
+
The size of the crawled content
+
+
+
lastModified
+
The last modified time of the crawled content
+
+
+
mimetype
+
The MIME type of the content
+
+
+
+
If you do not specify a field, title and content are searched.
+
+
To search a field, enter "fieldname:search term" in the search form.
Fess supports fuzzy search based on the Levenshtein distance (fuzzy search).
+ To apply fuzzy search, append '~' to the search word.
+ For example, 'Solr~' finds documents that contain terms similar to the string "Solr" (such as "Solar").
+
+ Furthermore, you can specify a number between 0 and 1 after the '~'; the closer to 1, the more similar the matches must be.
+ For example, use the form 'Solr~0.8'.
+ If no number is specified, the default value 0.5 is used.
Labels registered in the management screen enable searching by label in the search screen. You can use labels if you want to narrow down search results. If no labels are registered, the label drop-down box is not displayed.
+
+
Labels are assigned per crawl setting when the index is created, so searches can be narrowed to the label specified in each crawl setting. A search without a label behaves like a normal search over all results.
Use NOT search to find documents that do not contain a given word.
+ Put NOT in front of the word to exclude; NOT must be in upper case, with a space before and after.
+
For example, to find documents that contain search term 1 but not search term 2, enter 'search term 1 NOT search term 2'.
Use OR search to find documents that contain any of the search terms.
+ When multiple words are entered in the search box, an AND search is performed by default.
+ To perform an OR search instead, write OR between the search words; OR must be in upper case, with spaces before and after.
+
For example, to find documents that contain either search term 1 or search term 2, enter 'search term 1 OR search term 2'. OR can be used between more than two terms.
Range searches can be performed on a field.
+ Enter the search term in the form 'field name:[value TO value]'.
+ For example, to find documents whose contentLength field is between 1,000 and 10,000 bytes, search for 'contentLength:[1000 TO 10000]'.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/admin/browserType-guide.xml b/src/site/en/xdoc/6.0/admin/browserType-guide.xml
new file mode 100644
index 000000000..2398086c0
--- /dev/null
+++ b/src/site/en/xdoc/6.0/admin/browserType-guide.xml
@@ -0,0 +1,23 @@
+
+
+
+ Setting the browser type
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to browser types. Browser type data can be attached to search results, so that results can be separated by the type of browser viewing them.
+
+
+
+
After logging in with an administrator account, click 'Browser Types' in the menu.
+
+
+
+
+
+
You can set a display name and a value. This is used when you want to add new device types; no special customization is needed unless required.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/admin/crawl-guide.xml b/src/site/en/xdoc/6.0/admin/crawl-guide.xml
new file mode 100644
index 000000000..33c91309b
--- /dev/null
+++ b/src/site/en/xdoc/6.0/admin/crawl-guide.xml
@@ -0,0 +1,143 @@
+
+
+
+ The General crawl settings
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to crawling.
+
+
+
+
After logging in with an administrator account, click the crawl 'General' menu.
+
+
You can specify the path where the index is generated and enable the replication feature.
+
+
+
+
+
+
You can set the interval at which Web sites and file systems are crawled. The default is as follows.
+
+
The figures represent, from left to right: seconds, minutes, hours, day of month, month, and day of week. The format is similar to Unix cron settings. The default example (0 0 0 * * ?) crawls daily at 0:00 am.
+
The following are examples of how to write schedules.
+
+
+
+
0 0 12 * * ?
+
Starts at 12:00 pm (noon) every day
+
+
+
0 15 10 ? * *
+
Starts at 10:15 am every day
+
+
+
0 15 10 * * ?
+
Starts at 10:15 am every day
+
+
+
0 15 10 * * ? *
+
Starts at 10:15 am every day
+
+
+
0 15 10 * * ? 2009
+
Starts at 10:15 am every day during 2009
+
+
+
0 * 14 * * ?
+
Starts every minute from 2:00 pm to 2:59 pm, every day
+
+
+
0 0/5 14 * * ?
+
Starts every 5 minutes from 2:00 pm to 2:59 pm, every day
+
+
+
0 0/5 14,18 * * ?
+
Starts every 5 minutes from 2:00 pm to 2:59 pm and from 6:00 pm to 6:59 pm, every day
+
+
+
0 0-5 14 * * ?
+
Starts every minute from 2:00 pm to 2:05 pm, every day
+
+
+
0 10,44 14 ? 3 WED
+
Starts at 2:10 pm and 2:44 pm every Wednesday in March
+
+
+
0 15 10 ? * MON-FRI
+
Starts at 10:15 am every Monday through Friday
+
+
+
+
Note that the schedule is checked at 60-second intervals by default. If you need second-level precision, customize the taskScanIntervalTime value in webapps/fess/WEB-INF/classes/chronosCustomize.dicon; if hourly granularity is enough, the default is fine.
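A minimal sketch of such a customization, assuming the value is exposed as a property in milliseconds in that dicon file (here 10 seconds instead of the default 60):
<property name="taskScanIntervalTime">10000</property>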
+
+
+
When enabled, a log entry is written each time a user performs a search. Enable this if you want to collect search statistics.
+
+
+
Appends the search term to search result links. This makes it possible to highlight the searched terms when, for example, displaying a PDF.
+
+
+
Search results can be retrieved in XML format by accessing http://localhost:8080/fess/xml?query=search+term.
+
+
+
Search results are available in JSON format by accessing http://localhost:8080/fess/json?query=search+term.
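For example, assuming the default host and port, a request could look like the following (replace json with xml for the XML API):
curl "http://localhost:8080/fess/json?query=Fess"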
+
+
+
A PC-oriented Web site in the search results may not display correctly on mobile devices. By selecting a mobile conversion, PC sites can be converted for mobile terminals. If you choose Google Wireless Transcoder, content is converted for display on mobile phones: when a search result is opened on a mobile device, the result link is routed through Google Wireless Transcoder, so PC sites can be browsed smoothly from mobile search.
+
+
+
You can specify a label that is selected by default. Specify the value of the label.
+
+
+
You can specify whether to display a search screen. If you select 'Web', the mobile search screen cannot be used. If you select 'unavailable', no search screen is displayed; select this if you want to build a dedicated index server.
+
+
+
Popular search words become available in JSON format; they can be retrieved by accessing http://localhost:8080/fess/hotsearchword.
+
+
+
Session logs older than the specified number of days are deleted. The log purge runs once a day and removes the old logs.
+
+
+
Search logs older than the specified number of days are deleted. The log purge runs once a day and removes the old logs.
+
+
+
Specifies, separated by commas (,), the bot names (as contained in the user agent) whose entries should be removed from the search log. The logs are deleted by the daily log purge.
+
+
+
Specifies the encoding of the CSV files used by backup and restore.
+
+
+
Enables the replication feature, which applies a copy of an already generated Solr index. For example, use this when a front-facing server should handle only searches while crawling and indexing run on a different server.
+
+
+
After data is registered in Solr, a commit or optimize must be issued before the registered data becomes searchable. If optimize is selected, the Solr index is optimized; if commit is selected, a commit is issued.
+
+
+
Fess can combine multiple Solr servers into a group, and multiple groups can be managed. Different groups are used for updates and for searches. For example, with two groups, group 2 may be used for updates while group 1 serves searches. After a crawl completes, the update group switches to group 1 and the search group switches to group 2. This setting is only meaningful when multiple Solr server groups are registered.
+
+
+
For indexing performance, Fess sends documents to Solr in batches of 20 while crawling. Because continuously adding documents without committing degrades Solr performance, Fess issues a commit after the number of documents specified here has been added. By default, a commit is issued after 1000 documents.
+
+
+
Fess crawls documents by Web crawling and by file system crawling. Only the number of crawl settings specified here run simultaneously. For example, with a concurrency of 3 and Web crawl settings 1 through 10, crawl settings 1 through 3 run first; when one of them completes, crawl setting 4 starts, and so on through setting 10.
+
Note that each crawl setting can also specify a thread count; the value here is the number of crawl settings running simultaneously, not the number of threads. For example, with 3 concurrent crawl settings and 5 threads each, up to 3 x 5 = 15 threads crawl at the same time.
+
+
+
You can automatically delete data after it has been indexed. If you select 5 days, documents that were indexed at least 5 days ago and have not been updated since are removed. This can be used to expire data whose source content has been deleted.
+
+
+
A URL registered as a failure URL is excluded from the next crawl once it exceeds the failure count. By specifying this value, the failure type does not need to be taken into account.
+
+
+
Failure URLs that exceed the specified failure count are excluded from crawling.
+
+
+
Specifies the snapshot path. When replication is enabled, index information copied from the index directory given here is applied.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/admin/crawlingSession-guide.xml b/src/site/en/xdoc/6.0/admin/crawlingSession-guide.xml
new file mode 100644
index 000000000..316145428
--- /dev/null
+++ b/src/site/en/xdoc/6.0/admin/crawlingSession-guide.xml
@@ -0,0 +1,38 @@
+
+
+
+ Set session information
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to session information. The results of one crawl run are saved as a single session information record. You can check the run time and the number of documents indexed.
+
+
+
+
After logging in with an administrator account, click 'Session Information' in the menu.
+
+
+
+
+
+
Clicking the 'Delete All' link removes all session information except sessions that are currently running.
+
+
+
+
By specifying a session ID, you can view information about the crawled content.
+
+
Crawler*: information about the entire crawl
+
FsCrawl*: information about file system crawling
+
WebCrawl*: information about Web crawling
+
Optimize*: information about optimize requests issued to the Solr server
+
Commit*: information about commits issued to the Solr server
This section describes how to back up and restore Fess settings.
+
+
+
+
After logging in with an administrator account, click 'Backup/Restore' in the menu.
+
+
+
+
Clicking the download link outputs the Fess settings in XML format. The saved settings information is listed below.
+
+
The General crawl settings
+
Web crawl settings
+
File system Crawl settings
+
Path mapping
+
Web authentication
+
Compatible browsers
+
+
Session information, search logs, and click logs are available in CSV format.
+
The Solr index data and the data currently being crawled are not backed up. Those can be regenerated by crawling again after restoring the Fess settings.
+
+
+
You can restore the settings and logs by uploading the XML or CSV files output by backup. Specify the file and click the restore button for that data.
+
If 'overwrite data' is enabled for an XML settings file, existing entries that match the uploaded data are updated.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/admin/dataCrawlingConfig-guide.xml b/src/site/en/xdoc/6.0/admin/dataCrawlingConfig-guide.xml
new file mode 100644
index 000000000..5acee69c0
--- /dev/null
+++ b/src/site/en/xdoc/6.0/admin/dataCrawlingConfig-guide.xml
@@ -0,0 +1,157 @@
+
+
+
+ Settings for crawling the data store
+ Sone, Takaaki
+ Shinsuke Sugaya
+
+
+
+
Fess can crawl databases. This section describes the data store settings required to do so.
+
+
+
+
After logging in with an administrator account, click 'Data Store' in the menu.
+
+
As an example, we crawl the following table in a MySQL database named testdb, connecting with user name hoge and password fuga.
+
+
Assume data like the following has been inserted.
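A hypothetical table of this shape (the column names are illustrative) could be created as follows:
CREATE TABLE doc (
    id INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(100),
    content TEXT
);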
+
+
+
+
+
+
An example of the parameter settings looks like the following.
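A sketch of the parameters for the testdb example above (the SQL statement and table name are illustrative):
driver=com.mysql.jdbc.Driver
url=jdbc:mysql://localhost:3306/testdb
username=hoge
password=fuga
sql=select * from doc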
+
+
Parameters are given in 'key=value' format. The keys are described below.
+
+
+
+
driver
+
Driver class name
+
+
+
URL
+
URL
+
+
+
username
+
User name used to connect to the database
+
+
+
password
+
Password used to connect to the database
+
+
+
SQL
+
SQL statement used to fetch the rows to crawl
+
+
+
+
+
+
An example script configuration looks like the following.
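A sketch for the example table, assuming columns id, title, and content (the values are OGNL expressions; strings are double-quoted):
url="http://localhost/db/" + id
host="localhost"
site="localhost/db"
title=title
content=content
cache=content
digest=content
contentLength=content.length()
lastModified=new java.util.Date()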
+
+
+ Script parameters are also given in 'key=value' format.
+ The keys are described below.
+
+ The value side is written in OGNL: enclose strings in double quotation marks,
+ and access a database column's value by its column name.
+
+
+
+
URL
+
URL (the link shown in search results)
+
+
+
host
+
Host name
+
+
+
site
+
Site path
+
+
+
title
+
Title
+
+
+
content
+
Content (string index)
+
+
+
cache
+
Content cache (not indexed)
+
+
+
Digest
+
Digest shown in the search results
+
+
+
anchor
+
Links to content (not usually required)
+
+
+
contentLength
+
The length of the content
+
+
+
lastModified
+
Content last updated
+
+
+
+
+
+
A JDBC driver is needed to connect to the database. Place the driver's jar file in webapps/fess/WEB-INF/cmd/lib.
+
+
+
To display an item value such as latitude_s in the search results, add the field in webapps/fess/WEB-INF/classes/app.dicon and then reference it as ${doc.latitude_s} in searchResults.jsp.
This section describes the design settings of the search screens.
+
+
+
+
After logging in with an administrator account, click 'Design' in the menu.
+
+
You can edit the search screen in the screen below.
+
+
+
+
To display the registration or modification date of crawled files in the search results, edit the search results page (content) as follows.
tstampDate holds the registration date and lastModifiedDate the last modification date. The output date format is specified as a SimpleDateFormat pattern.
+
+
+
+
+
You can upload image files for use in the search screen. Supported image file types are jpg, gif, and png.
+
+
+
Specify a file name to use for the uploaded image file. If omitted, the name of the uploaded file is used.
+
+
+
You can edit the JSP files of the search screen. Pressing a JSP file's 'Edit' button lets you edit the current JSP file; pressing the default button lets you edit the JSP file as it was at install time. Saving with the 'Update' button in the edit screen applies the changes.
+
The editable JSP files are as follows.
+
+
+
+
Top page (frame)
+
The JSP file for the search top page. It includes the JSP files of the individual parts.
+
+
+
Top page (within the Head tags)
+
The JSP file for the content inside the head tag of the search top page. Edit it to change meta tags, the title tag, script tags, and so on.
+
+
+
Top page (content)
+
The JSP file for the content of the body tag on the search top page.
+
+
+
Search results pages (frames)
+
The JSP file for the search results list page. It includes the JSP files of the individual parts.
+
+
+
Search results page (within the Head tags)
+
The JSP file for the content inside the head tag of the search results page. Edit it to change meta tags, the title tag, script tags, and so on.
+
+
+
Search results page (header)
+
The JSP file for the header of the search results page. It contains the search form at the top.
+
+
+
Search results page (footer)
+
The JSP file for the footer of the search results page. It contains the copyright notice at the bottom.
+
+
+
Search results pages (content)
+
The JSP file for the search results section of the results page, used when there are search results. Edit it to customize how results are rendered.
+
+
+
Search results page (result no)
+
The JSP file for the search results section of the results page, used when there are no search results.
+
+
+
+
Screens for mobile devices can be edited in the same way as the PC screens.
This section describes failure URLs. URLs that could not be retrieved at crawl time are recorded and can be reviewed as failure URLs.
+
+
+
+
After logging in with an administrator account, click 'Failure URL' in the menu.
+
+
Clicking a failure URL's confirmation link displays its details.
+
+
+
+
You can see at a glance which URLs failed to crawl and when.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/admin/fileAuthentication-guide.xml b/src/site/en/xdoc/6.0/admin/fileAuthentication-guide.xml
new file mode 100644
index 000000000..747e6bf84
--- /dev/null
+++ b/src/site/en/xdoc/6.0/admin/fileAuthentication-guide.xml
@@ -0,0 +1,44 @@
+
+
+
+ Settings for file system authentication
+ Shinsuke Sugaya
+
+
+
+
This section describes how to configure authentication for file system crawling when it is required. Fess supports crawling Windows shared folders.
+
+
+
+
After logging in with an administrator account, click 'File System Authentication' in the menu.
+
+
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the setting applies to any host name in the specified file system crawl settings.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the setting applies to any port in the specified file system crawl settings.
+
+
+
Select the authentication method. You can use SAMBA (Windows shared folder authentication).
+
+
+
Specifies the user name used to log in.
+
+
+
Specifies the password used to log in to the authentication site.
+
+
+
Sets additional parameters required to log in to the authentication site. For SAMBA, the domain value can be set, written in 'key=value' form.
+
+
+
+
Select the file system crawl setting name to which the above authentication settings apply. The file system crawl setting must be registered beforehand.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/admin/fileCrawlingConfig-guide.xml b/src/site/en/xdoc/6.0/admin/fileCrawlingConfig-guide.xml
new file mode 100644
index 000000000..054df2c7e
--- /dev/null
+++ b/src/site/en/xdoc/6.0/admin/fileCrawlingConfig-guide.xml
@@ -0,0 +1,102 @@
+
+
+
+ Settings for file system crawling
+ Shinsuke Sugaya
+
+
+
+
This section describes the settings for crawling file systems.
+
If you want to index more than 100,000 documents in Fess, we recommend splitting them across crawl settings of a few tens of thousands of documents each. Indexing performance degrades when a single crawl setting targets more than 100,000 documents.
+
+
+
+
After logging in with an administrator account, click 'File System' in the menu.
+
+
+
+
+
+
The name shown on the list page.
+
+
+
You can specify multiple paths. Paths must start with file: or smb:. For example, paths can be written like the following (illustrative):
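file:/home/taro/
smb://host1/share/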
+
+
Paths are specified in this way; everything below the specified directory is crawled.
+
In a Windows environment, paths must be written as URIs: the path c:\Documents\taro is specified as file:/c:/Documents/taro.
+
For a Windows shared folder, for example to crawl the share folder on host1, set the crawl path to smb://host1/share/ (with a trailing /). If the shared folder requires authentication, set the authentication information on the file system authentication screen.
+
+
+
By specifying regular expressions you can include or exclude given path patterns from crawling and searching.
+
+
+
+
Path to crawl
+
Paths matching the regular expression are crawled.
+
+
+
The path to exclude from being crawled
+
Paths matching the regular expression are not crawled. This takes precedence even over paths specified as crawl targets.
+
+
+
Path to be searched
+
Paths matching the regular expression are searchable. This takes precedence even over paths specified as excluded from search.
+
+
+
Path to exclude from searches
+
Paths matching the regular expression are not searchable. Use this when a path should still be crawled, so that its links can be followed, but should not appear in search results; excluding it from the crawl instead would also prevent its links from being followed.
+
+
+
+
For example, to crawl only paths under /home/, a pattern like the following (sketch) goes in 'Path to crawl':
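file:/home/.*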
+
+
And to exclude files with the png extension, a pattern like the following goes in 'Path to exclude':
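.*\.png$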
+
+
Multiple patterns can be specified, one per line.
+
URIs are handled as in java.io.File. For example, paths are converted as follows (sketch):
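/home/taro -> file:/home/taro
c:\memo.txt -> file:/c:/memo.txt
\\server\memo.txt -> file:////server/memo.txt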
You can specify the maximum number of documents to retrieve in a crawl.
+
+
+
Specifies the number of crawler threads. A value of 5 means 5 threads crawl the file system simultaneously.
+
+
+
The interval, in milliseconds, between document retrievals. With one thread and a value of 5000, a document is fetched every 5 seconds.
+
With 5 threads and an interval of 1000 milliseconds, up to 5 documents are fetched per second.
+
+
+
You can weight the URLs in this crawl setting for searching. Use this when you want these documents displayed above others in the search results. The default is 1; the higher the value, the higher the documents appear in the results. To rank these documents above all others, give a sufficiently large value such as 10000.
+
The value must be an integer greater than 0. It is used as the boost value when documents are added to Solr.
+
+
+
The crawled documents are registered with the selected browser types. If you select only PC, the documents do not appear in results when searching from a mobile device. You can also make documents visible only to specific mobile devices.
+
+
+
You can restrict the documents so that they appear in search results only for users with a particular role. Roles must be set up beforehand. This is useful, for example, to control search results per user in a system that requires login, such as a portal server.
+
+
+
You can attach labels to the search results. Specify labels to enable per-label searches on the search screen.
+
+
+
The crawl runs at crawl time only while this is set to enabled. Disable it to skip crawling temporarily.
+
This section describes the label settings. Labels, selected in the crawl settings, classify the documents that appear in search results. If labels are registered, a label drop-down box is shown next to the search box.
+
+
+
+
After logging in with an administrator account, click 'Label' in the menu.
+
+
+
+
+
+
+
Specifies the name displayed in the label drop-down for selection at search time.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must consist of alphanumeric characters.
This section describes the duplicate host settings. Use them when different host names should be treated as the same host while crawling, for example when www.example.com and example.com serve the same site.
+
+
+
+
After logging in with an administrator account, click 'Duplicate Host' in the menu.
+
+
+
+
+
+
+
Specify the canonical host name. Duplicate host names are replaced by the canonical host name.
+
+
+
Specify the duplicate host name, i.e. the host name you want replaced.
This section describes the path mapping settings. Use path mapping when you want to replace part of the links that appear in search results.
+
+
+
+
After logging in with an administrator account, click 'Path Mapping' in the menu.
+
+
+
+
+
+
+
Path mapping replaces the part of a path that matches the specified regular expression with the replacement string. When crawling a local file system, the links in search results may not be valid in your environment; in such cases you can use path mapping to adjust the links. Multiple path mappings can be specified.
This section describes the request header settings. Request headers are added to the requests made when crawling documents. This is useful, for example, with authentication systems that log you in automatically when certain header values are present.
+
+
+
+
After logging in with an administrator account, click 'Request Header' in the menu.
+
+
+
+
+
+
+
Specifies the request header name to append to the request.
+
+
+
Specifies the request header value to append to the request.
+
+
+
Select the Web crawl setting name to which the request headers apply. Headers are appended only to requests from the selected crawl settings.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/admin/roleType-guide.xml b/src/site/en/xdoc/6.0/admin/roleType-guide.xml
new file mode 100644
index 000000000..2b63648c6
--- /dev/null
+++ b/src/site/en/xdoc/6.0/admin/roleType-guide.xml
@@ -0,0 +1,27 @@
+
+
+
+ Settings for a role
+ Shinsuke Sugaya
+
+
+
+
This section describes the role settings. Roles, selected in the crawl settings, classify the documents that appear in search results. For how to use roles, see the role-based search configuration.
+
+
+
+
After logging in with an administrator account, click 'Role' in the menu.
+
+
+
+
+
+
+
Specifies the name that appears in the list.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must consist of alphanumeric characters.
After logging in with an administrator account, click 'Search' in the menu.
+
+
+
+
You can search with the criteria you specify. On the regular search screen, role and browser conditions are added implicitly, but the management search does not add them. You can also remove a given document from the index directly from the search results.
This section describes the search log. When users search on the search screen, the searches are logged. The search terms and dates are recorded, along with the URLs of the search results that were then visited.
+
+
+
+
After logging in with an administrator account, click 'Search Log' in the menu.
+
+
+
+
Search terms and dates are listed. Click a URL to review the details.
This section describes the settings related to the Solr servers registered in Fess. Solr servers are registered as groups, defined in a configuration file.
+
+
+
+
After logging in with an administrator account, click 'Solr' in the menu.
+
+
+
+
+
+
The update server is shown as running while documents are being added. While a crawl process runs, its session ID is displayed. You can shut down the Fess server safely when nothing is running; if a crawl is running when you shut down Fess, the process terminates after the crawl finishes.
+
+
+
The names of the server groups used for searching and for updating are displayed.
+
+
+
When a server becomes unavailable, its status changes to disabled; for example, a Solr server that cannot be reached is marked disabled. After the server recovers, enable it to make it available again.
+
+
+
You can issue a commit or optimize to a server group's index. You can also delete the documents of a specific session ID, or delete only specific documents by specifying a URL.
+
+
+
The number of documents registered per session is shown. Click a session name to view the list of results.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/admin/systemInfo-guide.xml b/src/site/en/xdoc/6.0/admin/systemInfo-guide.xml
new file mode 100644
index 000000000..e00d6d2ef
--- /dev/null
+++ b/src/site/en/xdoc/6.0/admin/systemInfo-guide.xml
@@ -0,0 +1,32 @@
+
+
+
+ System information
+ Shinsuke Sugaya
+
+
+
+
Here you can check the current property information, such as system environment variables.
+
+
+
+
After logging in with an administrator account, click 'System Information' in the menu.
+
+
+
+
+
+
Lists the server's environment variables.
+
+
+
Lists the system properties of Fess.
+
+
+
Shows the Fess setup information.
+
+
+
Lists the properties to attach when reporting a bug. The extracted values contain no personal information.
This section describes how to configure Web authentication, which is required when crawling sites that need it. Fess supports crawling with BASIC and DIGEST authentication.
+
+
+
+
After logging in with an administrator account, click 'Web Authentication' in the menu.
+
+
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the setting applies to any host name in the specified Web crawl settings.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the setting applies to any port in the specified Web crawl settings.
+
+
+
Specifies the realm name of the site that requires authentication. If omitted, the setting applies to any realm name in the specified Web crawl settings.
+
+
+
Select the authentication method. You can use BASIC authentication, DIGEST authentication or NTLM authentication.
+
+
+
Specifies the user name used to log in.
+
+
+
Specifies the password used to log in to the authentication site.
+
+
+
Sets additional parameters required to log in to the authentication site. For NTLM authentication, the workstation and domain values can be set, written in 'key=value' form.
+
+
+
+
Select the Web crawl setting name to which the above authentication settings apply. The Web crawl setting must be registered beforehand.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/admin/webCrawlingConfig-guide.xml b/src/site/en/xdoc/6.0/admin/webCrawlingConfig-guide.xml
new file mode 100644
index 000000000..ad4b96254
--- /dev/null
+++ b/src/site/en/xdoc/6.0/admin/webCrawlingConfig-guide.xml
@@ -0,0 +1,103 @@
+
+
+
+ Settings for crawling Web site
+ Shinsuke Sugaya
+
+
+
+
This section describes the settings for crawling Web sites.
+
If you want to index more than 100,000 documents in Fess, we recommend splitting them across crawl settings of a few tens of thousands of documents each. Indexing performance degrades when a single crawl setting targets more than 100,000 documents.
+
+
+
+
After logging in with an administrator account, click 'Web' in the menu.
+
+
+
+
+
+
The name shown on the list page.
+
+
+
You can specify multiple URLs. URLs must start with http: or https:. For example, URLs can be written like the following (illustrative):
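http://localhost/
http://localhost:8080/fess/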
+
+
Crawling starts from the URLs specified in this way.
+
+
+
By specifying regular expressions you can include or exclude specific URL patterns from crawling and searching.
+
+
+
+
URL to crawl
+
URLs matching the regular expression are crawled.
+
+
+
Excluded from the crawl URL
+
URLs matching the regular expression are not crawled. This takes precedence even over URLs specified as crawl targets.
+
+
+
To search for URL
+
URLs matching the regular expression are searchable. This takes precedence even over URLs specified as excluded from search.
+
+
+
To exclude from the search URL
+
URLs matching the regular expression are not searchable. Use this when a URL should still be crawled, so that its links can be followed, but should not appear in search results; excluding it from the crawl instead would also prevent its links from being followed.
+
+
+
+
For example, to crawl only URLs under http://localhost/, a pattern like the following (sketch) goes in 'URL to crawl':
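http://localhost/.*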
+
+
And to exclude URLs with the png extension, a pattern like the following goes in 'Excluded from the crawl URL':
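.*\.png$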
+
+
Multiple patterns can be specified, one per line.
+
+
+
Specifies the depth to which links found in crawled documents are followed.
+
+
+
You can specify the maximum number of documents to retrieve in a crawl. If unspecified, the default is 100,000.
+
+
+
You can specify the user agent to use when crawling.
+
+
+
Specifies the number of crawler threads. A value of 5 means 5 threads crawl the Web site simultaneously.
+
+
+
The interval, in milliseconds, between document retrievals. With one thread and a value of 5000, a document is fetched every 5 seconds.
+
With 5 threads and an interval of 1000 milliseconds, up to 5 documents are fetched per second. Set an adequate value when crawling an external Web site so that you do not overload its Web server.
+
+
+
You can weight the URLs in this crawl setting for searching. Use this when you want these documents displayed above others in the search results. The default is 1; the higher the value, the higher the documents appear in the results. To rank these documents above all others, give a sufficiently large value such as 10000.
+
The value must be an integer greater than 0. It is used as the boost value when documents are added to Solr.
+
+
+
The crawled documents are registered with the selected browser types. If you select only PC, the documents do not appear in results when searching from a mobile device. You can also make documents visible only to specific mobile devices.
+
+
+
You can restrict the documents so that they appear in search results only for users with a particular role. Roles must be set up beforehand. This is useful, for example, to control search results per user in a system that requires login, such as a portal server.
+
+
+
You can attach labels to the search results. Specify labels to enable per-label searches on the search screen.
+
+
+
The crawl runs at crawl time only while this is set to enabled. Disable it to skip crawling temporarily.
+
+
+
+
+
Fess crawls sitemap files defined in the crawl URLs, following the sitemap specification at http://www.sitemaps.org/. Available formats are XML sitemaps, XML sitemap index files, and text sitemaps (one URL per line).
+
Specify the sitemap URL in the crawl URLs. Since a sitemap can be an XML file or a text file, Fess cannot distinguish a sitemap URL from an ordinary one while crawling; by default, URLs whose file names match sitemap.*.xml, sitemap.*.gz, or sitemap.*.txt are handled as sitemaps (this can be customized in webapps/fess/WEB-INF/classes/s2robot_rule.dicon).
+
URLs found in a crawled sitemap file are crawled in the next crawl, in the same way as links found in an HTML file.
You can use the Settings Wizard to set up Fess easily.
+
+
+
+
After logging in with an administrator account, click 'Settings Wizard' in the menu.
+
+
First, set the schedule.
+ Fess crawls and creates the index at the scheduled time. By default this is 0:00 am every day.
+
+
Next, the crawl settings.
+ A crawl setting registers a URI to crawl.
+ Give the crawl setting any name that is easy to identify, and enter the URI you want indexed and made searchable.
+
+
For example, to crawl and search http://fess.codelibs.org/, the settings look like the following.
+
+
This is the last step. Pressing the 'Start Crawling' button starts crawling immediately; if you press the 'Finish' button instead, the crawl does not start until the time specified in the schedule settings.
+
+
+
+
The settings made in the Settings Wizard can be changed later from the crawl general, Web, and file system pages.
Under normal circumstances Fess uses the H2 Database. You can use another database by changing the settings.
+
+
+
+
Expand the MySQL binaries.
+
+
+
Create a database.
+ mysql> create database fess_db;
+mysql> grant all privileges on fess_db.* to fess_user@localhost identified by 'fess_pass';
+mysql> create database fess_robot;
+mysql> grant all privileges on fess_robot.* to s2robot@localhost identified by 's2robot';
+mysql> FLUSH PRIVILEGES;
+]]>
+
Create the tables in the database. The DDL files are located in extension/mysql.
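A sketch of loading a DDL file (the exact file name under extension/mysql may differ by version):
$ mysql -u fess_user -p fess_db < extension/mysql/fess.ddl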
+ With the increased awareness of security in recent years, browsers no longer open local files (for example, c:\hoge.txt) from Web pages.
+ Having to copy a link from the search results and paste it elsewhere to reopen the file is poor usability.
+ To address this, Fess provides a desktop search feature.
+
+
+
+ The desktop search feature is turned off by default.
+ Enable it with the following settings.
+
First, edit bin/setenv.bat and change java.awt.headless from true to false.
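That is, the JVM option in bin/setenv.bat becomes (sketch):
set JAVA_OPTS=%JAVA_OPTS% -Djava.awt.headless=false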
+
+
Then add the following to webapps/fess/WEB-INF/conf/crawler.properties.
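Assuming the flag is named desktop (verify against your version's crawler.properties), the addition is:
desktop=true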
+
+
Start Fess after making the settings above. Basic usage remains the same.
+
+
+
+
Keep Fess inaccessible from outside (for example, do not open port 8080 externally).
+
Because java.awt.headless is false, image size conversion for mobile devices is not available.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/config/filesize.xml b/src/site/en/xdoc/6.0/config/filesize.xml
new file mode 100644
index 000000000..ba51e38d6
--- /dev/null
+++ b/src/site/en/xdoc/6.0/config/filesize.xml
@@ -0,0 +1,28 @@
+
+
+
+ File size you want to crawl settings
+ Shinsuke Sugaya
+
+
+
+
You can specify the file size limits for crawling in Fess. By default, HTML files are handled up to 2.5 MB and other files up to 10 MB. To change the handled file sizes, edit webapps/fess/WEB-INF/classes/s2robot_contentlength.dicon. The standard s2robot_contentlength.dicon is as follows.
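A sketch of its structure, based on the defaults described here (10 MB default, 2.5 MB for text/html):
<component name="contentLengthHelper" class="org.seasar.robot.helper.ContentLengthHelper" instance="singleton">
  <property name="defaultMaxLength">10485760</property><!-- 10M bytes -->
  <initMethod name="addMaxLength">
    <arg>"text/html"</arg>
    <arg>2621440</arg><!-- 2.5M bytes -->
  </initMethod>
</component>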
Change the value of defaultMaxLength to change the default limit. The handled file size can also be specified per content type; the text/html entry sets the maximum file size handled for HTML files.
+
When increasing the maximum handled file size, pay attention to the amount of heap memory used; see the memory-related configuration for how to set it.
Documents that carry latitude and longitude location information can be used in geo search, for example together with Google Maps.
+
+
+
+
Location information is defined in the location field of the document.
+ When generating the index, set latitude,longitude in the location field of the Solr document, in a format such as 45.17614,-93.87341, and register the document.
+ To also display latitude and longitude as search results, set the values in the latitude_s and longitude_s fields; *_s is available as a Solr dynamic string field.
+
+
+
At search time, specify the latitude, longitude, and distance as request parameters.
+ Results within the given distance (in km) of the point given by the latitude and longitude are displayed. Latitude, longitude, and distance are treated as double values.
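For example, a geo search request could look like this (the coordinates are illustrative):
http://localhost:8080/fess/?query=fess&latitude=35.681&longitude=139.767&distance=10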
The index data is managed by Solr. It is not backed up from the Fess administration screen, since its size and document count can reach gigabytes.
+
If you need to back up the index data, stop Fess and copy the solr/core1/data directory. To restore, put the backed-up index data back in place.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/config/install-on-tomcat.xml b/src/site/en/xdoc/6.0/config/install-on-tomcat.xml
new file mode 100644
index 000000000..314d28334
--- /dev/null
+++ b/src/site/en/xdoc/6.0/config/install-on-tomcat.xml
@@ -0,0 +1,43 @@
+
+
+
+ Install to an existing Tomcat
+ Shinsuke Sugaya
+
+
+
+
+ The standard distribution of Fess ships with Tomcat in an already deployed state.
+ Because Fess does not depend on Tomcat, it can be deployed on any Java application server.
+ This section describes how to deploy Fess on an already available Tomcat.
+ Expand the downloaded Fess server.
+ Let $FESS_HOME be the home directory of the expanded Fess server,
+ and $TOMCAT_HOME the top directory of the existing Tomcat 6.
+ Copy the Fess server data as sketched below.
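A sketch of the copy (the exact set of files and directories depends on the Fess version):
$ cp -r $FESS_HOME/webapps/fess $TOMCAT_HOME/webapps/
$ cp -r $FESS_HOME/solr $TOMCAT_HOME/
$ cp -r $FESS_HOME/extension $TOMCAT_HOME/
$ cp $FESS_HOME/bin/setenv.sh $TOMCAT_HOME/bin/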
+
+
+ If you have modified files at the destination, compare them with the diff command and apply only your own changes.
+
Depending on the contents of the crawl settings, an OutOfMemory error like the following may occur.
+
+
If it does, increase the maximum heap memory: change the -Xmx option in bin/setenv.[sh|bat] to -Xmx1024m (in this example the maximum is set to 1024 MB).
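For example, in bin/setenv.sh (sketch):
JAVA_OPTS="$JAVA_OPTS -Xmx1024m"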
+
+
+
+
+ The maximum memory on the crawler side can also be changed.
+ The default is 512 MB.
+
+ Uncomment the crawlerJavaOptions entry in webapps/fess/WEB-INF/classes/fess.dicon and change it to -Xmx1024m (in this example the maximum is set to 1024 MB).
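A minimal sketch of the uncommented property (the actual entry lists further options):
<property name="crawlerJavaOptions">new String[] {
    "-Xmx1024m"
}</property>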
+
The mobile device information is provided by ValueEngine Inc. To use the latest mobile device information, download the device profile, remove the _YYYY-MM-DD suffix from the file name, and save it under webapps/fess/WEB-INF/classes/device. Restart Fess to apply the change.
+ To search PDF files protected by a password, you must register the password in the settings file.
+
+
+
+
+ First, create webapps/fess/WEB-INF/classes/s2robot_extractor.dicon.
+ The following sets the password 'pass' for PDF files whose names match test_*.pdf.
+ If you have multiple such files, add multiple addPassword settings.
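A sketch of the relevant part, assuming a regex over the file name and the password string 'pass':
<initMethod name="addPassword">
  <arg>".*test_.*\.pdf"</arg>
  <arg>"pass"</arg>
</initMethod>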
Fess applies stemming when indexing and searching.
+
Stemming normalizes English words: for example, recharging and rechargable are both normalized to the form recharg. Thanks to this, a search for recharging also hits the word rechargable, so fewer results are missed.
+
+
+
Because stemming is basic rule-based processing, unintended normalization can occur. For example, the word Maine (the state name) is normalized to main.
+
In such cases, add Maine to protwords.txt to exclude the word from the stemming process.
Fess can copy the Solr index data between paths. By building two Fess servers, one for crawling and index creation and one for searching, you can distribute the load during indexing.
+
To use the Fess replication feature, place the Solr index files on a shared disk, such as NFS, that each Fess server can reference.
+
+
+
+
Download and install Fess; assume it is installed under /net/server1/usr/local/fess.
+
Start Fess as in a normal setup, register the crawl settings, and create the index by crawling (the index building procedure is the same as for a normal setup).
+
+
+
Download and install Fess; assume it is installed under /net/server2/usr/local/fess.
+
After starting Fess, check the box that enables the replication feature in the crawl settings of the management screen and set the 'snapshot path'. The snapshot path designates the location of the index created by the indexing Fess; in this case it is /net/server1/usr/local/fess/solr/core1/data/index.
+
+
Press the update button to save the settings; replication of the index is then performed at the times set in the schedule.
Fess can partition search results according to the credentials of users authenticated by an arbitrary authentication system. For example, a document that requires role A appears in search results for a user who has role A, but is not shown to user B, who does not. Using this feature, in environments with portal login or single sign-on, searches can be scoped by department or job title.
+
In Fess role-based search, role information can be obtained from the following sources.
+
+
Request parameter
+
Request header
+
Cookies
+
J2EE authentication information
+
+
When Fess runs behind a portal or an agent-based single sign-on system, role information can be retrieved from cookies by setting the domain and path under which the authentication information is stored. With a reverse-proxy type single sign-on system, role information can be retrieved by adding the authentication information to the request headers or request parameters of accesses to Fess.
+
+
+
Describes how to set up role-based search using J2EE authentication information.
+
+
Add the roles and users to conf/tomcat-users.xml. Here the role1 role is used for role-based search and we log in as a role1 user, for example with entries like the following (user names and passwords are placeholders).
+
+
+
+<?xml version='1.0' encoding='utf-8'?>
+<tomcat-users>
+  <role rolename="fess"/>
+  <role rolename="role1"/>
+  <user username="admin" password="admin" roles="fess"/>
+  <user username="taro" password="taropass" roles="role1"/>
+</tomcat-users>
+]]>
+
+
+
Configure webapps/fess/WEB-INF/classes/app.dicon as shown below.
+
+
+ {"guest"}
+
+
+ :
+]]>
+
By setting defaultRoleList you can assign role information to users who have no authentication information. The setting here ensures that search results requiring roles are not displayed to users who are not logged in.
+
+
+
Configure webapps/fess/WEB-INF/classes/fess.dicon as shown below.
+
+ "role1"
+
+ :
+]]>
+
authenticatedRoles can list multiple values separated by commas (,).
+
+
+
Configure webapps/fess/WEB-INF/web.xml as shown below.
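A minimal sketch of the addition (the role must be declared; your web.xml may also need role1 added to the relevant auth-constraint):
<security-role>
  <role-name>role1</role-name>
</security-role>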
Start Fess and log in as an administrator. From the role menu, register a role with name Role1 (any name will do) and value role1. Then select Role1 in each crawl setting that should be visible to users with the role1 role.
+
+
+
Log out of the management screen and log in as a role1 user. After a successful login you are redirected to the search top page.
+
Search as usual; only documents from crawl settings assigned the Role1 role are displayed.
+
Searches made while not logged in are performed as the guest user.
+
+
+
If you access http://localhost:8080/fess/admin while logged out, or while logged in without the admin role, the login screen appears. Pressing the logout button logs you out.
By default Fess uses port 8080. Change it with the following steps.
+
+
Change the port of the Tomcat on which Fess runs by modifying the following entries in conf/server.xml.
+
+
8080: HTTP access port
+
8005: shut down port
+
8009: AJP port
+
8443: SSL HTTP access port (disabled by default)
+
19092: database port (used by H2 Database)
+
+
+
+
In the standard configuration Solr runs in the same Tomcat, so if you change the Tomcat port you must also change the Solr server information referenced by Fess: edit webapps/fess/WEB-INF/classes/fess_solr.dicon.
+ "http://localhost:8080/solr"
+]]>
+
+ Note: if you change the Tomcat port but not the URL above, Fess cannot access the Solr server and errors are displayed on search and index update.
+
Solr registers document items (fields) according to a defined schema. The Solr schema used by Fess is defined in solr/core1/conf/schema.xml. It defines standard fields such as title and content, as well as dynamic fields whose field names can be chosen freely. The dynamic fields available in Fess are those defined in schema.xml; for advanced parameter values, see the Solr documentation.
Dynamic fields are mostly used when registering data in data store crawl settings such as database crawls. To register a dynamic field in a database crawl, add a line such as other_t=hoge to the script configuration; the data of the hoge column is then stored in the Solr other_t field.
+
To retrieve dynamic field data from Solr, the field must be added in webapps/fess/WEB-INF/classes/app.dicon; add other_t there.
With the settings above the value is returned from Solr, so edit the JSP file to display it on the page. Log in to the management screen and open the design page. Search results are rendered by the search results page (content), so edit that JSP file and place ${f:h(doc.other_t)} where you want to display the registered other_t value.
Fess manages Solr servers as groups, and multiple groups can be managed. Fess keeps the server and group status information and changes the status of a server or group when a Solr server becomes inaccessible.
+
The Solr server status can also be changed in the system settings. maxErrorCount, maxRetryStatusCheckCount, maxRetryUpdateQueryCount, and minActiveServer can be defined in webapps/fess/WEB-INF/classes/fess_solr.dicon.
+
+
+
+
When the number of Solr servers in the enabled state within a Solr group falls below minActiveServer, the Solr group becomes disabled.
+
While the group is not disabled, a Solr server that cannot be accessed is marked disabled. The status of disabled Solr servers is checked up to maxRetryStatusCheckCount times, and a server that becomes accessible again is changed from disabled back to enabled. A server that could be accessed but whose status could not be restored to enabled is put into the index corrupted state.
+
A disabled Solr group cannot be used.
+
To enable a Solr group again, change the status of the Solr servers in the group to enabled in the system settings management screen.
+
+
+
+
+
Search queries can be sent to an enabled Solr group.
+
Search queries are sent only to enabled Solr servers.
+
If multiple Solr servers are registered in a Solr group, search queries are sent to the server currently serving the fewest requests.
+
If search queries sent to a Solr server fail more than maxErrorCount times, that Solr server is changed to the disabled state.
+
+
+
+
+
Update queries can be sent to an enabled Solr group.
+
Update queries are sent only to enabled Solr servers.
+
If multiple Solr servers are registered in a Solr group, update queries are sent to every enabled Solr server.
+
If update queries sent to a Solr server fail more than maxRetryUpdateQueryCount times, that Solr server is changed to the index corrupted state.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/config/tokenizer.xml b/src/site/en/xdoc/6.0/config/tokenizer.xml
new file mode 100644
index 000000000..fcdb1422e
--- /dev/null
+++ b/src/site/en/xdoc/6.0/config/tokenizer.xml
@@ -0,0 +1,47 @@
+
+
+
+ Settings for the index string extraction
+ Sone, Takaaki
+
+
+
+
+
To create a search index, documents must first be split into the units that are registered in the index. The tokenizer does this.
+
Basically, a search cannot hit units smaller than those carved out by the tokenizer. For example, suppose the tokenizer splits the sentence 'I live in Tokyo.' into the words 'Tokyo' and 'live'. A search for the word 'Tokyo' then hits this sentence, but a search for the word 'Kyoto' does not. The choice of tokenizer is therefore important.
+
In Fess, CJKTokenizer is used by default; you can change the tokenizer in the analyzer section of schema.xml.
+
+
+
CJKTokenizer indexes multibyte strings such as Japanese as bi-grams, that is, in units of two characters. With it, single-character words cannot be found.
+
+
+
+
StandardTokenizer indexes multibyte strings such as Japanese as uni-grams, that is, one character at a time. This reduces missed results, and single-character search queries that CJKTokenizer cannot handle become searchable. Note, however, that the index size increases.
+
By changing the analyzer section of solr/core1/conf/schema.xml as in the following example, you can use StandardTokenizer.
+
+
+
+
+
+
+ :
+
+
+
+
+ :
+]]>
+
Also change useBigram, which is enabled by default, to false in webapps/fess/WEB-INF/classes/app.dicon.
+
+ true
+ :
+]]>
+
Then restart Fess.
+
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/config/use-libreoffice.xml b/src/site/en/xdoc/6.0/config/use-libreoffice.xml
new file mode 100644
index 000000000..fbd948604
--- /dev/null
+++ b/src/site/en/xdoc/6.0/config/use-libreoffice.xml
@@ -0,0 +1,240 @@
+
+
+
+ Use of LibreOffice
+ Shinsuke Sugaya
+
+
+
+
+ In the standard Fess environment, MS Office documents are crawled using Apache POI.
+ By crawling Office documents with LibreOffice or OpenOffice instead, even more accurate text extraction is possible.
+
+
+
Install JodConverter on the Fess server: download http://jodconverter.googlecode.com/jodconverter-core-3.0-Beta-4-Dist.zip, expand it, and copy the jar files to the Fess server.
+
+
Next, create a s2robot_extractor.dicon.
+
+
Make jodExtractor effective in s2robot_extractor.dicon with the following contents.
After these settings, generate the index by crawling as usual.
+
+
+
diff --git a/src/site/en/xdoc/6.0/config/windows-service.xml b/src/site/en/xdoc/6.0/config/windows-service.xml
new file mode 100644
index 000000000..3b1d40fd0
--- /dev/null
+++ b/src/site/en/xdoc/6.0/config/windows-service.xml
@@ -0,0 +1,54 @@
+
+
+
+ Register for the Windows service
+ Shinsuke Sugaya
+
+
+
+
You can register Fess as a Windows service in a Windows environment. The procedure for registering the service is similar to Tomcat's.
+
+
When Fess is registered as a Windows service, the crawling process reads the Windows system environment variables. You must therefore register JAVA_HOME as a system environment variable and add %JAVA_HOME%\bin to Path.
+
+
+
Also edit webapps\fess\WEB-INF\classes\fess.dicon and remove the -server option.
First, after installing Fess, run service.bat from a command prompt (on Vista and later you must launch the prompt as an administrator). The following assumes Fess is installed in C:\Java\fess-server-6.0.0.
+ cd C:\Java\fess-server-6.0.0\bin
+> service.bat install fess
+...
+The service 'fess' has been installed.
+]]>
+
+
+
Running the following command lets you review the service properties for Fess; it opens the Tomcat properties window.
+ tomcat6w.exe //ES//fess
+]]>
+
+
+
Open Control Panel > Administrative Tools > Services; there you can configure automatic startup just like for any other Windows service.
+
+
+
+
+
The Fess distribution is based on 32-bit Windows builds of Tomcat. If you use 64-bit Windows, download the 64-bit Windows zip from the Tomcat site and replace tomcat6.exe and tomcat6w.exe.
+
+Expand the downloaded fess-server-x.y.zip.
+If you installed in a UNIX environment, add execute permissions to the scripts in bin.
+
+
+
+
+The administrator account is managed by the application server. The standard Fess server uses Tomcat, so users are changed in the same way as for Tomcat.
+To change the password of the admin account, modify conf/tomcat-user.xml.
+
+]]>
+
+
+
+A password is required to access the Solr instance built into the Fess server.
+Change the default password in production environments.
+
+To change the password, first change the password attribute of the solradmin user in conf/tomcat-user.xml.
+
+
+
+]]>
+
+Then set the same password as in tomcat-user.xml at the following points in webapps/fess/WEB-INF/classes/fess_solr.dicon.
+
+Access http://localhost:8080/fess/ to confirm that Fess started.
+
+
+
+The management UI is at http://localhost:8080/fess/admin.
+The default administrator user name / password is admin/admin.
+The administrator account is managed by the application server.
+The Fess management UI treats users authenticated by the application server with the fess role as administrators.
+
+
+
+Stop a running Fess with the shutdown script.
+
+
+
+
+If a crawl or index creation is in progress, it may take a while for Fess to stop completely.
+
+
Use boost search if you want to give priority to specific search terms. Boost search weights search terms by their importance in the search.
+
+
To boost, append '^boost value' to the search term, where the boost value specifies the weight.
+
For example, when searching for apples and oranges, to rank pages containing 'apples' higher, type the following in the search form.
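apples^100 oranges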
+
+
The boost value must be an integer greater than 1.
+
+
+
+
diff --git a/src/site/en/xdoc/6.0/user/search-field.xml b/src/site/en/xdoc/6.0/user/search-field.xml
new file mode 100644
index 000000000..72500ebf0
--- /dev/null
+++ b/src/site/en/xdoc/6.0/user/search-field.xml
@@ -0,0 +1,62 @@
+
+
+
+ Search by specifying a search field
+ Shinsuke Sugaya
+
+
+
+
Fess saves crawl results in separate fields, such as title and full text. You can search against each of these fields, and specify search criteria such as the document type or size.
+
+
The following fields can be searched by default.
+
+
Available field list
+
+
+
URL
+
The crawl URL
+
+
+
host
+
The host name part of the crawled URL
+
+
+
site
+
The site name part of the crawled URL
+
+
+
title
+
Title
+
+
+
content
+
Text
+
+
+
contentLength
+
The size of the crawled content
+
+
+
lastModified
+
The last modified time of the crawled content
+
+
+
mimetype
+
The MIME type of the content
+
+
+
+
If no field is specified, the content field is searched. Custom fields are also available by using Solr dynamic fields.
+
For an HTML file, the string in the title tag is registered in the title field, and the content under the body tag in the content field.
+
+
+
To search a field, fill in the search form with the field name and the search term separated by a colon (:), as 'field name:search term'.
+
For example, to search for Fess in the title field, type the following.
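title:Fess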
+
+
Documents whose title field contains Fess appear as search results for the query above.
Fuzzy search is available for words that do not exactly match the search term. Fess supports fuzzy search based on the Levenshtein distance.
+
+
Append '~' to the search term to which the fuzzy search should be applied.
+
For example, to find documents that contain words near the word 'Solr', such as 'Solar', type 'Solr~' in the search form.
+
+
+
Furthermore, a number between 0 and 1 can follow the '~'; the closer to 1, the stricter the match. For example, write it in the form 'Solr~0.8'. If no number is specified, the default value 0.5 is used.
You can narrow a search by label: documents carry label information, and a label can be specified at search time. Registering labels in the administration screen enables label-based search on the search screen, where the label can be selected from a drop-down. If no label is registered, the label drop-down box is not displayed.
+
+
You can select the label information at search time.
+
+
Labels are assigned per crawl setting when the index is created, so searches can be narrowed to the label specified in each crawl setting. A search without a label behaves like a normal search over all results. If you change label information, update the index.
Use OR search to find documents that contain any of the search terms. When multiple words are entered in the search box, an AND search is performed by default.
+
+
To perform an OR search, write OR between the search words; OR must be in upper case, with spaces before and after.
+
For example, to find documents that contain either search term 1 or search term 2, type 'search term 1 OR search term 2' in the search form.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/admin/browserType-guide.xml b/src/site/en/xdoc/7.0/admin/browserType-guide.xml
new file mode 100644
index 000000000..a3452c477
--- /dev/null
+++ b/src/site/en/xdoc/7.0/admin/browserType-guide.xml
@@ -0,0 +1,23 @@
+
+
+
+ Setting the browser type
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to browser types. Browser type information can be attached to search result data, so that search results can be served differently for each type of browser.
+
+
+
+
After logging in with an administrator account, click 'Browser Types' in the menu.
+
+
+
+
+
+
You can set a display name and a value. This is used when you want to add new device types; no settings are needed here unless special customization is required.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/admin/crawl-guide.xml b/src/site/en/xdoc/7.0/admin/crawl-guide.xml
new file mode 100644
index 000000000..cbce17847
--- /dev/null
+++ b/src/site/en/xdoc/7.0/admin/crawl-guide.xml
@@ -0,0 +1,149 @@
+
+
+
+ The General crawl settings
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to crawling.
+
+
+
+
After logging in with an administrator account, click 'Crawl General' in the menu.
+
+
You can specify the path for the generated index and enable the replication feature.
+
+
+
+
+
+
You can set the interval at which Web sites and file systems are crawled. The default is the following.
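The default, reconstructed from the description below (crawling daily at 0:00 am; verify against your installation):
0 0 0 * * ?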
+
+
The fields represent, from left to right: seconds, minutes, hours, day of month, month, and day of week. The format is similar to Unix cron settings. In this example, crawling runs every day at 0:00 am.
+
The following are examples of how to write it.
+
+
+
+
0 0 12 * * ?
+
Starts every day at 12:00 noon
+
+
+
0 15 10 ? * *
+
Starts every day at 10:15 am
+
+
+
0 15 10 * * ?
+
Starts every day at 10:15 am
+
+
+
0 15 10 * * ? *
+
Starts every day at 10:15 am
+
+
+
0 15 10 * * ? 2009
+
Starts every day during 2009 at 10:15 am
+
+
+
0 * 14 * * ?
+
Starts every minute from 2:00 pm to 2:59 pm, every day
+
+
+
0 0/5 14 * * ?
+
Starts every 5 minutes from 2:00 pm to 2:59 pm, every day
+
+
+
0 0/5 14,18 * * ?
+
Starts every 5 minutes from 2:00 pm to 2:59 pm and from 6:00 pm to 6:59 pm, every day
+
+
+
0 0-5 14 * * ?
+
Starts every minute from 2:00 pm to 2:05 pm, every day
+
+
+
0 10,44 14 ? 3 WED
+
Starts at 2:10 pm and 2:44 pm every Wednesday in March
+
+
+
0 15 10 ? * MON-FRI
+
Starts at 10:15 am, Monday through Friday
+
+
+
+
Note also that the schedule check runs at 60-second intervals by default. If you need second-level precision, customize the taskScanIntervalTime value in webapps/fess/WEB-INF/classes/chronosCustomize.dicon; if a coarser check, such as hourly, is enough, you can increase the value instead.
+
+
+
When a user performs a search, the search is written to the search log. Enable this if you want to collect search statistics.
+
+
+
Saves user information at search time, which makes it possible to identify users.
+
+
+
You can collect search results that users judged to be good. A voting link appears next to each result in the search result list, and pressing it records the vote. The collected results can also be reflected in the index during crawling.
+
+
+
Appends the search term to search result links. This makes it possible to display the searched terms when opening results such as PDF files.
+
+
+
Search results can be retrieved in XML format by accessing http://localhost:8080/fess/xml?query=search-term.
+
+
+
Search results are available in JSON format by accessing http://localhost:8080/fess/json?query=search-term.
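For example, a quick check from the command line (the query value is illustrative):
curl "http://localhost:8080/fess/json?query=Fess"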
+
+
+
Search results for PC-oriented web sites may not display correctly on mobile devices. If you select a mobile conversion option, PC sites can be converted for display on mobile devices. If you choose Google, the Google Wireless Transcoder is used to render content on mobile phones: when browsing search results on a mobile device, result links pass through the Google Wireless Transcoder, enabling smooth mobile conversion in mobile search.
+
+
+
You can specify a label that is selected by default. Specify the value of the label.
+
+
+
You can specify whether the search screens are displayed. Selecting Web makes the mobile search screen unavailable; selecting unavailable disables the search screens entirely. Select unavailable if you want to build a dedicated index server.
+
+
+
Popular search words become available in JSON format and can be retrieved by accessing http://localhost:8080/fess/hotsearchword.
+
+
+
Deletes session logs older than the specified number of days. Old logs are deleted by the log purge that runs once a day.
+
+
+
Deletes search logs older than the specified number of days. Old logs are deleted by the log purge that runs once a day.
+
+
+
Specifies the names of bots whose logs should be removed from the search log, matched against the user agent and separated by commas (,). The logs are deleted by the log purge that runs once a day.
+
+
+
Specifies the encoding of the CSV files used for backup and restore.
+
+
+
Enabling the replication feature lets you copy and apply an already generated Solr index. For example, use this if you want a front-facing server to serve search only, while crawling and indexing run on a different server.
+
+
+
After data is registered in Solr, it becomes searchable once a commit or optimize is issued. If optimize is selected, Solr index optimization is issued; if commit is selected, a commit is issued.
+
+
+
Fess can combine multiple Solr servers into a group and manage multiple groups. Different groups are used for updates and for searches. For example, with two groups, group 2 might receive updates while group 1 serves searches; after the crawl completes, the roles switch so that group 1 receives updates and group 2 serves searches. This setting is only valid if multiple Solr server groups are registered.
+
+
+
Fess sends documents to Solr in batches of 10. A commit is issued to Solr each time the number of documents specified here has been sent. If 0 is specified, the commit is performed after the crawl completes.
+
+
+
Fess document crawling consists of Web crawling and file system crawling. Multiple crawl configurations can run simultaneously, up to the number specified here. For example, if the number of concurrent crawl configurations is 3 and Web crawl configurations 1 through 10 exist, crawling starts with configurations 1 to 3; when any of them completes, configuration 4 starts, and so on through configuration 10, starting one new configuration each time one finishes.
+
Note that the number of threads is specified separately in each crawl configuration; the concurrency value here does not indicate a thread count. For example, with 3 concurrent crawl configurations each using 5 threads, up to 3 x 5 = 15 threads crawl at the same time.
+
+
+
You can automatically delete data a set time after it was indexed. If you select 5 days, documents registered in the index at least 5 days ago with no updates since are removed. This is useful for removing documents whose source content has been deleted. If disabled, no data is deleted.
+
+
+
URLs registered as failure URLs are excluded from the next crawl once they exceed the failure count. By specifying this value, failure types that do not need attention will still be crawled next time.
+
+
+
Failure URLs that exceed this failure count are excluded from crawling.
+
+
+
Specifies the snapshot path to which index information is copied from the index directory; when replication is enabled, the copied index is applied.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/admin/crawlingSession-guide.xml b/src/site/en/xdoc/7.0/admin/crawlingSession-guide.xml
new file mode 100644
index 000000000..5a179a718
--- /dev/null
+++ b/src/site/en/xdoc/7.0/admin/crawlingSession-guide.xml
@@ -0,0 +1,27 @@
+
+
+
+ Set session information
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to session information. The results of one crawl are saved as one session information record. You can check the run time and the number of documents indexed.
+
+
+
+
After logging in with an administrator account, click 'Session Information' in the menu.
+
+
+
+
+
+
Clicking the 'Delete All' link removes all session information except sessions that are currently running. Expired sessions are removed at the next crawl.
+
+
+
You can check the crawl contents for each session ID: the crawl start and finish times and the number of indexed documents are listed.
This section describes how to back up and restore Fess information.
+
+
+
+
After logging in with an administrator account, click 'Backup/Restore' in the menu.
+
+
+
+
Click the download link to output the Fess settings in XML format. The saved settings are listed below.
+
+
The General crawl settings
+
Web crawl settings
+
File system Crawl settings
+
Datastore crawl settings
+
Label
+
Path mapping
+
Web authentication
+
File system authentication
+
Request header
+
Duplicate host
+
Role
+
Compatible browsers
+
+
Session information, search logs, and click logs are available in CSV format.
+
The Solr index data and data currently being crawled are not backed up. Those data can be regenerated by crawling again after restoring the Fess settings. If you need to back up the Solr index, back up the solr directory.
+
+
+
You can restore settings and various logs by uploading the XML or CSV files output by the backup. Specify the file and click the restore button to upload the data.
+
If data overwrite is enabled, existing data is updated when the specified XML file contains configuration entries with the same identifiers.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/admin/dataCrawlingConfig-guide.xml b/src/site/en/xdoc/7.0/admin/dataCrawlingConfig-guide.xml
new file mode 100644
index 000000000..012cedccd
--- /dev/null
+++ b/src/site/en/xdoc/7.0/admin/dataCrawlingConfig-guide.xml
@@ -0,0 +1,159 @@
+
+
+
+ Settings for crawling the data store
+ Sone, Takaaki
+ Shinsuke Sugaya
+
+
+
+
Fess can crawl databases. This section describes the required data store settings.
+
+
+
+
After logging in with an administrator account, click 'Data Store' in the menu.
+
+
As an example, we will connect to a MySQL database named testdb with user name hoge and password fuga, and crawl the following table.
+
+
Assume it contains data such as the following.
+
+
+
+
+
+
An example parameter setting looks like the following.
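A sketch of such a parameter block for the MySQL example above (host, port, table name, and connection options are assumptions):
driver=com.mysql.jdbc.Driver
url=jdbc:mysql://localhost:3306/testdb?useUnicode=true&characterEncoding=UTF-8
username=hoge
password=fuga
sql=select * from doc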
+
+
Parameters are in "key=value" format. The keys are described below.
+
+
Parameter keys for a DB crawl configuration
+
+
+
driver
+
Driver class name
+
+
+
url
+
Connection URL
+
+
+
username
+
User name for connecting to the DB
+
+
+
password
+
Password for connecting to the DB
+
+
+
sql
+
SQL statement that retrieves the data to crawl
+
+
+
+
+
+
An example script setting looks like the following.
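A sketch of such a script block, assuming the table has id, title, and content columns (the URL pattern is illustrative):
url="http://testdb.example.com/doc/" + id
title=title
content=content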
+
+
Parameters are in "key=value" format.
The keys are described below.
+
The value side is written in OGNL; enclose strings in double quotation marks.
Database column values can be accessed by their column names.
+
+
Script settings
+
+
+
url
+
URL (the link shown in search results)
+
+
+
host
+
Host name
+
+
+
site
+
Site path
+
+
+
title
+
Title
+
+
+
content
+
Content (string index)
+
+
+
cache
+
Content cache (not indexed)
+
+
+
digest
+
The digest fragment shown in the search results
+
+
+
anchor
+
Links to content (not usually required)
+
+
+
contentLength
+
The length of the content
+
+
+
lastModified
+
Content last updated
+
+
+
+
+
+
A driver is needed to connect to the database. Place the driver jar file in webapps/fess/WEB-INF/cmd/lib.
+
+
+
To display item values such as latitude_s in the search results, set the following in webapps/fess/WEB-INF/classes/app.dicon, then add ${doc.latitude_s} to searchResults.jsp.
This section describes the design settings for the search screens.
+
+
+
+
After logging in with an administrator account, click 'Design' in the menu.
+
+
You can edit the search screens in the screen shown below.
+
+
+
+
To display the date a file was registered or modified by the Fess crawler in the search results, edit the search results page (content) and write the following.
tstampDate holds the registration date and lastModifiedDate holds the modification date. The output date format is specified as a SimpleDateFormat pattern.
+
+
+
+
+
Files used on the search screen can be downloaded and removed here.
+
+
+
You can upload files to use on the search screen. Supported file extensions are jpg, gif, png, css, and js.
+
+
+
Use this if you want to specify the name under which the uploaded file is saved. If omitted, the uploaded file's own name is used.
+
+
+
You can edit the JSP files of the search screens. Pressing the Edit button of a JSP file lets you edit the current JSP file; pressing the default button lets you edit the JSP file as it was at installation. Saving with the update button on the edit screen applies the changes.
+
The editable JSP files are listed below.
+
+
Editable JSP files
+
+
+
Top page (frame)
+
The JSP file for the search home page. This JSP includes the JSP files of each part.
+
+
+
Top page (within the Head tags)
+
The JSP file for the content inside the head tag of the search home page. Edit this to change meta tags, the title tag, script tags, and so on.
+
+
+
Top page (content)
+
The JSP file for the content inside the body tag of the search home page.
+
+
+
Search results pages (frames)
+
The JSP file for the search result list page. This JSP includes the JSP files of each part.
+
+
+
Search results page (within the Head tags)
+
The JSP file for the content inside the head tag of the search result list page. Edit this to change meta tags, the title tag, script tags, and so on.
+
+
+
Search results page (header)
+
The JSP file for the header of the search result list page. It contains the search form at the top.
+
+
+
Search results page (footer)
+
The JSP file for the footer of the search result list page. It contains the copyright notice at the bottom of the page.
+
+
+
Search results pages (content)
+
The JSP file for the search result section of the result list page, used when there are search results. Edit this to customize how search results are rendered.
+
+
+
Search results page (result no)
+
The JSP file for the search result section of the result list page, used when there are no search results.
+
+
+
+
You can edit the screens for mobile devices in the same way as for PCs.
This section describes the popular URL log. When a user clicks the voting link on the search screen, the URL is registered as a favorite link in the popular URL log. You can disable this feature in the general crawl settings.
+
+
+
+
After logging in with an administrator account, click 'Popular URL' in the menu.
+
+
+
+
Popular URLs are listed.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/admin/fileAuthentication-guide.xml b/src/site/en/xdoc/7.0/admin/fileAuthentication-guide.xml
new file mode 100644
index 000000000..1ddb4dc1f
--- /dev/null
+++ b/src/site/en/xdoc/7.0/admin/fileAuthentication-guide.xml
@@ -0,0 +1,44 @@
+
+
+
+ Settings for file system authentication
+ Shinsuke Sugaya
+
+
+
+
This section describes how to set up file system authentication, which is required for some file system crawls. Fess supports crawling Windows shared folders.
+
+
+
+
After logging in with an administrator account, click 'File System Authentication' in the menu.
+
+
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the settings apply to any host name in the specified file system crawl configuration.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the settings apply to any port in the specified file system crawl configuration.
+
+
+
Select the authentication method. You can use SAMBA (Windows shared folder authentication).
+
+
+
Specifies the user name used for authentication.
+
+
+
Specifies the password for the authentication site.
+
+
+
Sets additional parameters required to log in to the authentication site. For SAMBA, the domain value can be set; write it as follows.
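For example (the domain name is illustrative):
domain=FESS_DOMAIN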
+
+
+
+
Selects the file system crawl configuration to which the above authentication settings apply. The file system crawl configuration must be registered beforehand.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/admin/fileCrawlingConfig-guide.xml b/src/site/en/xdoc/7.0/admin/fileCrawlingConfig-guide.xml
new file mode 100644
index 000000000..89f9667e2
--- /dev/null
+++ b/src/site/en/xdoc/7.0/admin/fileCrawlingConfig-guide.xml
@@ -0,0 +1,106 @@
+
+
+
+ Settings for file system crawling
+ Shinsuke Sugaya
+
+
+
+
This section describes the settings for crawling file systems.
+
If you want to index more than 100,000 documents, we recommend splitting them so that one crawl configuration covers up to a few tens of thousands of documents. Indexing performance degrades when a single crawl configuration targets 100,000 or more documents.
+
+
+
+
After logging in with an administrator account, click 'File System' in the menu.
+
+
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple paths. Paths must start with file: or smb:. For example:
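For example (illustrative paths):
file:/home/taro/
smb://host1/share/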
+
+
Specify them as above. Everything below the specified directory is then crawled.
+
In a Windows environment, paths must be written as URIs; for example, c:\Documents\taro is specified as file:/c:/Documents/taro.
+
For Windows shared folders, for example to crawl the share folder on host1, specify smb://host1/share/ (with a trailing /) in the crawl settings. If the shared folder requires authentication, set the authentication information on the file system authentication screen.
+
+
+
By specifying regular expressions, you can include or exclude given path patterns from crawling and searching.
+
+
Path filtering list
+
+
+
Path to crawl
+
Paths matching the specified regular expression are crawled.
+
+
+
The path to exclude from being crawled
+
Paths matching the specified regular expression are not crawled. This takes precedence even over paths specified as paths to crawl.
+
+
+
Path to be searched
+
Paths matching the specified regular expression are searchable. This takes precedence even over paths excluded from search.
+
+
+
Path to exclude from searches
+
Paths matching the specified regular expression are not searchable. Since excluding a path from crawling means its links can no longer be followed, use this when you want to crawl a path but exclude only part of it from search results.
+
+
+
+
For example, to crawl only paths below /home/, specify the following as a path to crawl:
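An illustrative pattern:
file:/home/.*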
+
+
To exclude files with the png extension, specify the following as a path to exclude from crawling:
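An illustrative pattern:
.*\.png$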
+
+
Multiple patterns can be specified, one per line.
+
URI handling follows java.io.File. Paths are specified like the following:
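Illustrative mappings (reconstructed; verify against your environment):
c:\memo.txt -> file:/c:/memo.txt
\\server\memo.txt -> file:////server/memo.txt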
You can specify the crawl configuration information.
+
+
+
Specifies the depth of the directory hierarchy to crawl.
+
+
+
Specifies the maximum number of documents to retrieve in the crawl.
+
+
+
Specifies the number of crawler threads. A value of 5 means 5 threads crawl at the same time.
+
+
+
The interval (in milliseconds) between document retrievals. With one thread and a value of 5000, a document is fetched every 5 seconds.
+
With 5 threads and a 1000 millisecond interval, up to 5 documents are fetched per second.
+
+
+
Weights the documents crawled by this configuration in search results. Use this when you want results from this configuration to rank above others. The default is 1; higher values rank higher in the search results. To rank these results above all others, specify a sufficiently large value such as 10,000.
+
The value must be an integer greater than 0. It is used as the boost value when documents are added to Solr.
+
+
+
Crawled documents are registered with the selected browser types. If you select only PC, the documents will not appear in search results on mobile devices. You can also make documents visible only on specific mobile devices.
+
+
+
You can restrict documents so that they appear in search results only for users with a particular role. Roles must be set up beforehand. This is useful, for example, in systems that require login, such as portal servers, where you want to separate search results by user.
+
+
+
You can attach labels to search results. Labels specified here enable per-label searching from the search screen.
+
+
+
The crawl runs at crawl time when set to enabled. Set this to disabled if you want to skip crawling temporarily.
+If you need commercial support, maintenance, or technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the usefulness of any third-party web site referred to in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for content, advertising, products, services, or other materials available through such sites or resources.
+The Fess project assumes no responsibility or obligation for any injury or damage caused, or alleged to be caused, by or in connection with the use of, or reliance on, any such content, advertising, products, services, or other materials available through such sites or resources.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and suggestions from readers.
+
This section describes label settings. Labels, selected in crawl configurations, classify the documents that appear in search results. If labels are registered, a label selection drop-down box is shown to the right of the search box.
+
+
+
+
After logging in with an administrator account, click 'Label' in the menu.
+
+
+
+
+
+
+
Specifies the name displayed in the label drop-down list on the search screen.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must be alphanumeric.
This section describes the duplicate host settings. Use duplicate hosts when different host names should be treated as the same host during crawling; for example, to treat www.example.com and example.com as the same site.
+
+
+
+
After logging in with an administrator account, click 'Duplicate Host' in the menu.
+
+
+
+
+
+
+
Specifies the canonical host name. Duplicate host names are replaced by the canonical host name.
+
+
+
Specifies the duplicate host name, that is, the host name to be replaced.
This section describes path mapping settings. Use path mapping when you want to replace parts of the links shown in search results.
+
+
+
+
After logging in with an administrator account, click 'Path Mapping' in the menu.
+
+
+
+
+
+
+
Path mapping replaces the part of a path that matches the specified regular expression with the replacement string. When crawling a local file system, the generated search result links may not work in your environment; in such cases, path mapping lets you adjust the links shown in search results. Multiple path mappings can be specified.
This section describes the request header settings. Request headers are added to requests when crawling documents. This is useful, for example, when an authentication system requires certain header values so that crawling is logged in automatically.
+
+
+
+
After logging in with an administrator account, click 'Request Header' in the menu.
+
+
+
+
+
+
+
Specifies the request header name to append to the request.
+
+
+
Specifies the request header value to append to the request.
+
+
+
Selects the name of the Web crawl configuration to which the request header is added. The header is appended only to requests made by the selected crawl configuration.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/admin/roleType-guide.xml b/src/site/en/xdoc/7.0/admin/roleType-guide.xml
new file mode 100644
index 000000000..686275f94
--- /dev/null
+++ b/src/site/en/xdoc/7.0/admin/roleType-guide.xml
@@ -0,0 +1,27 @@
+
+
+
+ Settings for a role
+ Shinsuke Sugaya
+
+
+
+
This section describes role settings. Roles, selected in crawl configurations, classify the documents that appear in search results. For how to use roles, see the role-based search documentation.
+
+
+
+
After logging in with an administrator account, click 'Role' in the menu.
+
+
+
+
+
+
+
Specifies the name that appears in the list.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must be alphanumeric.
After logging in with an administrator account, click 'Search' in the menu.
+
+
+
+
You can search using the criteria you specify. On the regular search screen, role and browser conditions are added implicitly, but this administrative search does not add them. From the search results, you can remove specific documents from the index.
This section describes the search log. When users search from the search screen, the searches are logged: search terms and dates are recorded, and the URLs of search results that users visited can also be recorded.
+
+
+
+
After logging in with an administrator account, click 'Search Log' in the menu.
+
+
+
+
Search terms and dates are listed. Click a URL to review the details.
This section describes the Solr settings registered in Fess. Solr servers are registered in groups, defined by configuration files.
+
+
+
+
After logging in with an administrator account, click 'Solr' in the menu.
+
+
+
+
+
+
The update server is shown as running while documents are being added or similar operations are in progress. The crawl process displays its session ID while running. You can shut down the Fess server safely when nothing is running; if you shut down while a crawl is running, the process will not terminate until the crawling process finishes.
+
+
+
The server group names available for searching and updating are shown.
+
+
+
When a server becomes unavailable, its status changes to disabled; for example, a Solr server that cannot be reached is marked disabled. After the server recovers, enable it to make it available again.
+
+
+
You can issue index commit and optimize operations for server groups. You can also delete the documents of a specific session ID, or delete only specific documents by specifying their URLs.
+
+
+
The number of documents registered per session is shown. Click a session name to check the result list.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/admin/systemInfo-guide.xml b/src/site/en/xdoc/7.0/admin/systemInfo-guide.xml
new file mode 100644
index 000000000..27ba783b6
--- /dev/null
+++ b/src/site/en/xdoc/7.0/admin/systemInfo-guide.xml
@@ -0,0 +1,32 @@
+
+
+
+ System information
+ Shinsuke Sugaya
+
+
+
+
Here you can check the current system information, such as environment variables and property settings.
+
+
+
+
After logging in with an administrator account, click 'System Information' in the menu.
+
+
+
+
+
+
Lists the server's environment variables.
+
+
+
Lists the system properties of the Fess process.
+
+
+
Shows the Fess installation information.
+
+
+
A list of properties to attach when reporting a bug. The extracted values contain no personal information.
This section describes the user log. When users search from the search screen, the user log identifies them, and the information can be used together with the search log and popular URL information. You can disable this feature in the general crawl settings.
+
+
+
+
After logging in with an administrator account, click 'User' in the menu.
+
+
+
+
User IDs are listed. Select the search log or popular URL links to see each user's logs.
This section describes the Web authentication settings required for some Web crawls. Fess supports crawling sites protected by BASIC and DIGEST authentication.
+
+
+
+
After logging in with an administrator account, click 'Web Authentication' in the menu.
+
+
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the settings apply to any host name in the specified Web crawl configuration.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the settings apply to any port in the specified Web crawl configuration.
+
+
+
Specifies the realm name of the site that requires authentication. If omitted, the settings apply to any realm in the specified Web crawl configuration.
+
+
+
Select the authentication method. You can use BASIC authentication, DIGEST authentication or NTLM authentication.
+
+
+
Specifies the user name used for authentication.
+
+
+
Specifies the password for the authentication site.
+
+
+
Sets additional parameters required to log in to the authentication site. For NTLM authentication, workstation and domain values can be set; write them as follows.
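For example (values are illustrative):
workstation=FESS_WS
domain=FESS_DOMAIN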
+
+
+
+
Selects the Web crawl configuration to which the above authentication settings apply. The Web crawl configuration must be registered beforehand.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/admin/webCrawlingConfig-guide.xml b/src/site/en/xdoc/7.0/admin/webCrawlingConfig-guide.xml
new file mode 100644
index 000000000..56a5d2c11
--- /dev/null
+++ b/src/site/en/xdoc/7.0/admin/webCrawlingConfig-guide.xml
@@ -0,0 +1,107 @@
+
+
+
+ Settings for crawling Web site
+ Shinsuke Sugaya
+
+
+
+
This section describes the settings for Web crawling.
+
If you want to index more than 100,000 documents, we recommend splitting them so that one crawl configuration covers up to a few tens of thousands of documents. Indexing performance degrades when a single crawl configuration targets 100,000 or more documents.
+
+
+
+
After logging in with an administrator account, click 'Web' in the menu.
+
+
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple URLs. URLs must start with http: or https:. For example:
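For example (illustrative URLs):
http://localhost/
https://example.com/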
+
+
Specify them as above.
+
+
+
By specifying regular expressions, you can include or exclude specific URL patterns from crawling and searching.
+
+
URL filtering list
+
+
+
URL to crawl
+
URLs matching the specified regular expression are crawled.
+
+
+
URL to exclude from the crawl
+
URLs matching the specified regular expression are not crawled. This takes precedence even over URLs specified as URLs to crawl.
+
+
+
URL to search
+
URLs matching the specified regular expression are searchable. This takes precedence even over URLs excluded from search.
+
+
+
URL to exclude from search
+
URLs matching the specified regular expression are not searchable. Since excluding a URL from crawling means its links can no longer be followed, use this when you want to crawl a URL but exclude only some pages from search results.
+
+
+
+
For example, to crawl only URLs below http://localhost/, specify the following as a URL to crawl:
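An illustrative pattern:
http://localhost/.*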
+
+
To exclude URLs with the png extension, specify the following as a URL to exclude from the crawl:
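An illustrative pattern:
.*\.png$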
+
+
Multiple patterns can be specified, one per line.
+
+
+
You can specify the crawl configuration information.
+
+
+
Specifies how deep the crawler follows links contained in the crawled documents.
+
+
+
Specifies the maximum number of documents to retrieve in the crawl. If unspecified, it defaults to 100,000.
+
+
+
You can specify the user agent to use when crawling.
+
+
+
Specifies the number of crawler threads. A value of 5 means 5 threads crawl the web site at the same time.
+
+
+
The interval (in milliseconds) between document retrievals. With one thread and a value of 5000, a document is fetched every 5 seconds.
+
With 5 threads and a 1000 millisecond interval, up to 5 documents are fetched per second. Set an appropriate value so that crawling does not overload the target Web server.
+
+
+
Weights the documents crawled by this configuration in search results. Use this when you want results from this configuration to rank above others. The default is 1; higher values rank higher in the search results. To rank these results above all others, specify a sufficiently large value such as 10,000.
+
The value must be an integer greater than 0. It is used as the boost value when documents are added to Solr.
+
+
+
Crawled documents are registered with the selected browser types. If you select only PC, the documents will not appear in search results on mobile devices. You can also make documents visible only on specific mobile devices.
+
+
+
You can restrict documents so that they appear in search results only for users with a particular role. Roles must be set up beforehand. This is useful, for example, in systems that require login, such as portal servers, where you want to separate search results by user.
+
+
+
You can attach labels to search results. Labels specified here enable per-label searching from the search screen.
+
+
+
The crawl runs at crawl time when set to enabled. Set this to disabled if you want to skip crawling temporarily.
+
+
+
+
+
Fess crawls sitemap files defined in the URLs to crawl, following the sitemap specification at http://www.sitemaps.org/. Available formats are XML Sitemaps, XML Sitemap Index, and text (one URL per line).
+
Specify the sitemap URL as a URL to crawl. Since a sitemap is an XML or text file, it cannot be distinguished from an ordinary XML or text URL at crawl time; by default, URLs whose file names match sitemap.*.xml, sitemap.*.gz, or sitemap.*.txt are handled as sitemaps (this can be customized in webapps/fess/WEB-INF/classes/s2robot_rule.dicon).
+
URLs found by crawling a sitemap file are crawled in the next crawl, in the same way as links found in HTML files.
You can use the Settings Wizard to set up Fess.
+
+
+
+
After logging in with an administrator account, click 'Settings Wizard' in the menu.
+
+
First, set a schedule.
Fess crawls and creates the index at the scheduled time.
By default it runs every day at 0:00 am. The schedule can also be changed later in the general crawl settings.
+
+
Next, the crawl settings.
A crawl configuration registers the URIs to crawl.
Give the crawl configuration any name that is easy to identify, and enter the URI you want indexed and searchable.
+
+
For example, to crawl and search http://fess.codelibs.org/, the settings look like the following.
+
+
For a file path such as c:\Users\taro, select 'File' as the type.
+
This is the last setting. Pressing the 'Start Crawling' button starts the crawl immediately; pressing the 'Finish' button instead defers the crawl until the time specified in the schedule settings.
+
+
+
+
Settings made in the Setup Wizard can be changed later from the crawl general, Web, and file system settings.
Under normal circumstances Fess uses the H2 Database. You can use other databases by changing the settings.
+
+
+
+
Set the MySQL character encoding: the following settings must be added to /etc/mysql/my.cnf.
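A sketch of the kind of addition involved (the option name depends on the MySQL version; this assumes a 5.x server):
[mysqld]
character-set-server = utf8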
+
+
+
+
Download and expand the MySQL binaries.
+
+
+
Create a database.
mysql> create database fess_db;
+mysql> grant all privileges on fess_db.* to fess_user@localhost identified by 'fess_pass';
+mysql> create database fess_robot;
+mysql> grant all privileges on fess_robot.* to s2robot@localhost identified by 's2robot';
+mysql> FLUSH PRIVILEGES;
+]]>
+
Create the tables in the database. The DDL files are located in extension/mysql.
+ With the heightened security awareness of recent browser environments, local files (for example, c:\hoge.txt) can no longer be opened from links on Web pages.
+ Copying and pasting the link from the search results to reopen it is poor usability.
+ To address this, Fess provides a desktop search feature.
+
+
+
+ The desktop search feature is turned off by default.
+ Enable it with the following settings.
+
First, edit bin/setenv.bat to change java.awt.headless from true to false.
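That is, the JVM option becomes the following (shown in isolation; the surrounding setenv.bat contents are omitted):
-Djava.awt.headless=false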
+
+
Then add the following to webapps/fess/WEB-INF/conf/crawler.properties.
+
+
Start Fess after completing the above setup. Basic usage remains the same.
+
+
+
+
Make sure Fess is not accessible from outside (for example, do not open port 8080 externally).
+
Because java.awt.headless is set to false, image size conversion for mobile devices is not available.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/config/filesize.xml b/src/site/en/xdoc/7.0/config/filesize.xml
new file mode 100644
index 000000000..1cf88e616
--- /dev/null
+++ b/src/site/en/xdoc/7.0/config/filesize.xml
@@ -0,0 +1,29 @@
+
+
+
+ File size you want to crawl settings
+ Shinsuke Sugaya
+
+
+
+
You can specify a limit on the file sizes Fess crawls. By default, HTML files are handled up to 2.5 MB and other files up to 10 MB. To change the file size handling, edit webapps/fess/WEB-INF/classes/s2robot_contentlength.dicon. The standard s2robot_contentlength.dicon is as follows.
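A sketch of the file, reconstructed from the defaults described here (10 MB overall, 2.5 MB for text/html); verify against your installation:
<components>
  <component name="contentLengthHelper" class="org.seasar.robot.helper.ContentLengthHelper" instance="singleton">
    <property name="defaultMaxLength">10485760</property><!-- 10MB -->
    <initMethod name="addMaxLength">
      <arg>"text/html"</arg>
      <arg>2621440</arg><!-- 2.5MB -->
    </initMethod>
  </component>
</components>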
Change the value of defaultMaxLength to change the default limit. The maximum file size can also be specified per content type; the setting above specifies the maximum size handled for text/html, that is, HTML files.
+
Note the amount of heap memory required when increasing the maximum handled file size. See the memory-related settings page for how to configure it.
For documents that include latitude and longitude location information, you can use geo search, for example together with Google Maps.
+
+
+
+
Location information is defined in the location field.
+ When generating the index, set latitude and longitude in the Solr location field in a format such as 45.17014,-93.87341 and register the document.
+ If you also want to display latitude and longitude in search results, set the values in the latitude_s and longitude_s fields; *_s is available as a Solr string dynamic field.
+
+
+
At search time, specify latitude, longitude, and distance as request parameters.
+ Results are returned within the distance (in km) of the given latitude and longitude. Latitude, longitude, and distance are treated as double values.
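An illustrative request (parameter names follow the description above; the path, query word, and coordinate values are assumptions):
http://localhost:8080/fess/search?query=hotel&latitude=45.17014&longitude=-93.87341&distance=10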
Index data is managed by Solr. It is not included in backups from the Fess administration screen, since index data can reach gigabyte sizes.
+
If you need to back up the index data, stop Fess and back up the solr/core1/data directory. To restore, copy the backed-up index data back in place.
+If you need commercial support, maintenance, or technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the usefulness of any third-party web site referred to in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for content, advertising, products, services, or other materials available through such sites or resources.
+The Fess project assumes no responsibility or obligation for any injury or damage caused, or alleged to be caused, by or in connection with the use of, or reliance on, any such content, advertising, products, services, or other materials available through such sites or resources.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and suggestions from readers.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/config/install-on-tomcat.xml b/src/site/en/xdoc/7.0/config/install-on-tomcat.xml
new file mode 100644
index 000000000..314d28334
--- /dev/null
+++ b/src/site/en/xdoc/7.0/config/install-on-tomcat.xml
@@ -0,0 +1,43 @@
+
+
+
+ Install to an existing Tomcat
+ Shinsuke Sugaya
+
+
+
+
+ The standard distribution of Fess is shipped with Tomcat already deployed.
+ Because Fess does not depend on Tomcat, it can be deployed on any Java application server.
+ This section describes how to deploy Fess on an existing Tomcat.
+ Expand the downloaded Fess server.
+ Let $FESS_HOME be the expanded Fess server home directory.
+ Let $TOMCAT_HOME be the top directory of the existing Tomcat 6.
+ Copy the Fess server data.
+
+
+ If you have changed any of the copied files, take a diff and apply only your changes to the destination.
+
Java limits the maximum memory per process, so a process will not use more memory even if the server has, say, 8 GB of physical RAM. Memory consumption changes significantly with the number of crawler threads and the crawl interval. If memory is insufficient, change the settings as described below.
+
+
+
Depending on the crawled contents, an OutOfMemory error like the following may occur.
+
+
In that case, increase the maximum heap memory: edit bin/setenv.[sh|bat] and change the option to -Xmx1024m (this sets the maximum to 1024 MB).
+
+
+
+
+ The maximum memory of the crawler process can also be changed.
+ The default is 512 MB.
+
+ Uncomment crawlerJavaOptions in webapps/fess/WEB-INF/classes/fess.dicon and change the option to -Xmx1024m (this sets the maximum to 1024 MB).
+
+ The mobile device information is provided by ValueEngine Inc. To use the latest mobile device information, download the device profile, save it under webapps/fess/WEB-INF/classes/device with the _YYYY-MM-DD suffix removed from the file name, and restart to apply the change.
+ To search PDF files that are protected by a password, register the password in the settings file.
+
+
+
+
+ First, create webapps/fess/WEB-INF/classes/s2robot_extractor.dicon.
+ This example registers a password for PDF files named like test_~.pdf (here the password 'pass').
+ If you have multiple such files, add multiple addPassword settings.
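A sketch of the addPassword portion only (the file-name pattern and password are illustrative; the rest of the extractor definition is omitted):
<initMethod name="addPassword">
  <arg>".*test_.*\.pdf$"</arg>
  <arg>"pass"</arg>
</initMethod>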
Fess applies stemming when indexing and searching.
+
Stemming normalizes English words; for example, recharging and rechargable are both normalized to the form recharg. A search for recharging therefore also hits rechargable, reducing missed results.
+
+
+
Because stemming is a basic rule-based process, unintended normalization can occur. For example, the word Maine (the state name) is normalized to main.
+
In such cases, you can exclude a word from stemming by adding it, for example Maine, to protwords.txt.
Fess can copy Solr index data between servers. By setting up two Fess servers, one for crawling and index creation and one for search, you can distribute the indexing load.
+
To use the Fess replication feature, the Solr index files must be on a shared disk, such as NFS, that each Fess server can access.
+
+
+
+
Download and install Fess. Assume it is installed at /net/Server1/usr/local/fess.
+
After starting Fess, register the crawl configurations and create the index by crawling, as in a normal setup (index building on this Fess follows the usual procedure).
+
+
+
Download and install Fess. Assume it is installed at /net/Server2/usr/local/fess.
+
After this Fess starts, check the box to enable the replication feature in the crawl settings of the management screen and set the 'snapshot path'. The snapshot path designates the location of the index created by the indexing Fess; in this case it will be /net/Server1/usr/local/fess/solr/core1/data/index.
+
+
Press the update button to save the settings; replication of the index is then performed at the scheduled times.
Fess can partition search results based on the credentials of users authenticated by an arbitrary authentication system. For example, a document assigned role A appears in search results for a user who has role A, but not for user B who lacks it. Using this feature in a portal or single sign-on login environment, you can restrict searches by department or job title.
+
Role-based search in Fess can obtain role information from the following sources.
+
+
Request parameter
+
Request header
+
Cookies
+
J2EE authentication information
+
+
If Fess runs behind a portal or an agent-based single sign-on system that saves authentication information in cookies, role information can be retrieved from cookies whose domain and path cover Fess. With a reverse-proxy type single sign-on system, role information can be retrieved from authentication information that the proxy adds to request headers or request parameters.
+
+
+
Describes how to set up role-based search using J2EE authentication information.
+
+
Add roles and users to conf/tomcat-users.xml. This example performs role-based search with the role1 role, so we add a user that logs in with role1.
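A sketch of the kind of entries involved (user names and passwords are illustrative):
<role rolename="fess"/>
<role rolename="role1"/>
<user username="admin" password="admin" roles="fess"/>
<user username="taro" password="taropass" roles="role1"/>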
+
+
+
+
+
+
+
+
+
+]]>
+
+
+
Set webapps/fess/WEB-INF/classes/app.dicon as shown below.
+
+
+ {"guest"}
+
+
+ :
+]]>
+
defaultRoleList sets the role information used when there is no authentication information. Here it ensures that users who are not logged in cannot see search results that require roles.
+
+
+
Set webapps/fess/WEB-INF/classes/fess.dicon as shown below.
+
+ "role1"
+
+ :
+]]>
+
authenticatedRoles can list multiple roles separated by commas (,).
+
+
+
Set webapps/fess/WEB-INF/web.xml as shown below.
Start Fess and log in as an administrator. From the role menu, register a role with name Role1 (any name is fine) and value role1. Then, in each crawl configuration that should be visible to users with role1, select Role1 in the crawl settings.
+
+
+
Log out from the management screen and log in as a user with role1. On successful login, you are redirected to the top of the search screen.
+
Search as usual; only documents from crawl configurations assigned the Role1 role are displayed.
+
Searches made without logging in are performed as the guest user.
+
+
+
If you access http://localhost:8080/fess/admin while logged out, or while logged in without the admin role, the login screen appears. Pressing the logout button logs you out.
By default Fess uses port 8080. Follow the steps below to change it.
+
+
Change the port of the Tomcat that runs Fess by modifying the entries described below in conf/server.xml.
+
+
8080: HTTP access port
+
8005: shut down port
+
8009: AJP port
+
8443: SSL HTTP access port (disabled by default)
+
19092: database port (used by H2 Database)
+
+
+
+
In the standard configuration, Solr runs in the same Tomcat, so if you change the Tomcat port you must also change the Solr server reference used by Fess: edit webapps/fess/WEB-INF/classes/fess_solr.dicon.
+ "http://localhost:8080/solr"
+]]>
+
Note: if you change the Tomcat port but not the setting above, Fess cannot access the Solr server and errors are shown on search and index update.
+
Solr registers document items (fields) according to a defined schema. The Solr schema used by Fess is defined in solr/core1/conf/schema.xml; it defines standard fields such as title and content, as well as dynamic fields whose field names can be chosen freely. The dynamic fields available in Fess are the ones defined in schema.xml. See the Solr documentation for advanced parameter values.
Dynamic fields are often used when registering data in data store crawl settings, such as database crawling. To register a dynamic field in a database crawl, add a line such as other_t=hoge to the script: the data of the hoge column is then stored in the Solr other_t field.
+
Next, to retrieve the data stored in the dynamic field, add the field to webapps/fess/WEB-INF/classes/app.dicon: add other_t as shown.
With the above settings the value is returned from Solr; edit the JSP file to display it on the page. Log in to the management screen and open the design page, then edit the JSP file for the search results page (content). Where you want to show the value, write ${f:h(doc.other_t)} to display what was registered in other_t.
Fess manages Solr servers in groups and can manage multiple groups. Fess keeps server and group status information and changes the status of a server or group when a Solr server becomes inaccessible.
+
Solr server status can be changed in the system settings. maxErrorCount, maxRetryStatusCheckCount, maxRetryUpdateQueryCount, and minActiveServer can be defined in webapps/fess/WEB-INF/classes/fess_solr.dicon.
+
+
+
+
If the number of Solr servers in the enabled state within a Solr group falls below minActiveServer, the Solr group is disabled.
+
If the Solr group has not been disabled even though the number of enabled Solr servers is at or below minActiveServer, Fess checks the status of disabled Solr servers up to maxRetryStatusCheckCount times; a Solr server that becomes accessible again changes from disabled back to enabled. A server that cannot be restored to the enabled state is put into the index-corrupted state.
+
A disabled Solr group cannot be used.
+
To re-enable a Solr group, change the status of its Solr servers to enabled on the system settings screen of the management UI.
+
+
+
+
+
Search queries are sent only to an enabled Solr group.
+
Search queries are sent only to enabled Solr servers.
+
If multiple Solr servers are registered in a Solr group, search queries are sent to the servers with fewer uses.
+
If search queries sent to a Solr server fail more than maxErrorCount times, that Solr server's status changes to disabled.
+
+
+
+
+
Update queries are sent only to an enabled Solr group.
+
Update queries are sent only to enabled Solr servers.
+
If multiple Solr servers are registered in the Solr group, the update query is sent to every enabled Solr server.
+
If update queries sent to a Solr server fail more than maxRetryUpdateQueryCount times, that Solr server's status changes to index corrupted.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/config/tokenizer.xml b/src/site/en/xdoc/7.0/config/tokenizer.xml
new file mode 100644
index 000000000..fcdb1422e
--- /dev/null
+++ b/src/site/en/xdoc/7.0/config/tokenizer.xml
@@ -0,0 +1,47 @@
+
+
+
+ Settings for the index string extraction
+ Sone, Takaaki
+
+
+
+
+
When creating a search index, documents must be split into tokens before they can be registered in the index. A tokenizer is used for this.
+
Basically, a search finds no hits for units smaller than those produced by the tokenizer. For example, suppose the Japanese sentence '東京都に住んでいます。' ('I live in Tokyo.') is split by the tokenizer into units such as '東京都', 'に', and '住ん'. In this case, a search for '東京都' (Tokyo) hits, but a search for '京都' (Kyoto) does not. The choice of tokenizer is therefore important.
+
You can change the tokenizer by editing the analyzer section of schema.xml; by default, Fess uses CJKTokenizer.
+
+
+
CJKTokenizer indexes multibyte strings, such as Japanese, as bi-grams, that is, in units of two characters. In this case, single-character words cannot be found.
+
+
+
+
StandardTokenizer indexes multibyte strings, such as Japanese, as uni-grams, that is, one character at a time, so fewer searches are missed. Single-character queries that CJKTokenizer cannot find become searchable with StandardTokenizer. Note, however, that the index size increases.
+
To use StandardTokenizer, change the analyzer section of solr/core1/conf/schema.xml as in the following example.
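The essential change is the tokenizer line inside the fieldType analyzer, roughly as follows (a sketch; the surrounding filter definitions are omitted):
<tokenizer class="solr.StandardTokenizerFactory"/>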
+
+
+
+
+
+
+ :
+
+
+
+
+ :
+]]>
+
Also, useBigram is enabled by default in webapps/fess/WEB-INF/classes/app.dicon; change it to false.
+
+ true
+ :
+]]>
+
Restart Fess afterwards.
+
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/config/use-libreoffice.xml b/src/site/en/xdoc/7.0/config/use-libreoffice.xml
new file mode 100644
index 000000000..363c9c3f3
--- /dev/null
+++ b/src/site/en/xdoc/7.0/config/use-libreoffice.xml
@@ -0,0 +1,85 @@
+
+
+
+ Use of LibreOffice
+ Shinsuke Sugaya
+
+
+
+
+ In the standard Fess environment, MS Office documents are crawled using Apache POI.
+ By using LibreOffice or OpenOffice to crawl Office documents, you can achieve even more accurate text extraction.
+
+
+
Install JodConverter on the Fess server: download it from http://jodconverter.googlecode.com/ (jodconverter-core-3.0-Beta-4-Dist.zip), expand it, and copy the jar files to the Fess server.
+
+
Next, create s2robot_extractor.dicon as follows.
+
+
Enable jodExtractor in s2robot_extractor.dicon with the following contents.
After these settings, generate the index by crawling as usual.
+
+
+
diff --git a/src/site/en/xdoc/7.0/config/windows-service.xml b/src/site/en/xdoc/7.0/config/windows-service.xml
new file mode 100644
index 000000000..093123730
--- /dev/null
+++ b/src/site/en/xdoc/7.0/config/windows-service.xml
@@ -0,0 +1,54 @@
+
+
+
+ Register for the Windows service
+ Shinsuke Sugaya
+
+
+
+
In a Windows environment, you can register Fess as a Windows service. The registration procedure is similar to Tomcat's.
+
+
When Fess is registered as a Windows service, the crawling process reads the Windows system environment variables, so you must register JAVA_HOME as a system environment variable and add %JAVA_HOME%\bin to Path.
+
+
+
Edit webapps\fess\WEB-INF\classes\fess.dicon and remove the -server option.
Then, after installing Fess, run service.bat from a command prompt (on Windows Vista and later, launch it as administrator). This example assumes Fess is installed at C:\Java\fess-server-7.0.0.
> cd C:\Java\fess-server-7.0.0\bin
+> service.bat install fess
+...
+The service 'fess' has been installed.
+]]>
+
+
+
You can review the service properties for Fess by running the following; the Tomcat properties window appears.
+ tomcat6w.exe //ES//fess
+]]>
+
+
+
In Control Panel, open Administrative Tools and then Services; there you can configure automatic startup like any normal Windows service.
+
+
+
+
+
The Tomcat binaries bundled with the Fess distribution are 32-bit Windows builds. If you use 64-bit Windows, get the 64-bit Windows zip from the Tomcat site and replace tomcat6.exe and tomcat6w.exe.
+If you need commercial support, maintenance, or technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the usefulness of any third-party web site referred to in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for content, advertising, products, services, or other materials available through such sites or resources.
+The Fess project assumes no responsibility or obligation for any injury or damage caused, or alleged to be caused, by or in connection with the use of, or reliance on, any such content, advertising, products, services, or other materials available through such sites or resources.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and suggestions from readers.
+
+Expand the downloaded fess-server-x.y.zip.
+If you installed in a UNIX environment, add execute permission to the scripts in bin.
+
+
+
+
+The administrator account is managed by the application server. The standard Fess server uses Tomcat, so users are changed in the same way as for Tomcat.
+To change the admin account password, modify conf/tomcat-user.xml.
+
+]]>
+
+
+
+A password is required to access the Solr instance inside the Fess server.
+Change the default password for production use.
+
+To change the password, first change the password attribute of the solradmin user in conf/tomcat-user.xml.
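+For example (the new password is illustrative):
+<user username="solradmin" password="NEW_PASSWORD" roles="solradmin"/>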
+
+
+
+]]>
+
+Then set the password given in tomcat-user.xml at the corresponding points in webapps/fess/WEB-INF/classes/fess_solr.dicon.
+
+Access http://localhost:8080/fess/ to confirm that the server started.
+
+
+
+The management UI is at http://localhost:8080/fess/admin.
+The default administrator user name/password is admin/admin.
+The administrator account is managed by the application server.
+The Fess management UI treats users who authenticate against the application server with the fess role as administrators.
+
+
+
+To stop a running Fess, run the shutdown script.
+
+
+
+
+It may take a while to stop completely if a crawl or index creation is in progress.
+
+If you need commercial support, maintenance, or technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the usefulness of any third-party web site referred to in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for content, advertising, products, services, or other materials available through such sites or resources.
+The Fess project assumes no responsibility or obligation for any injury or damage caused, or alleged to be caused, by or in connection with the use of, or reliance on, any such content, advertising, products, services, or other materials available through such sites or resources.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and suggestions from readers.
+
Use AND search to find documents that contain all of multiple search terms. When multiple words are entered in the search box separated by spaces, an AND search is performed; the AND itself can be omitted.
+
+
To use AND search explicitly, write AND between the search terms. AND must be written in capital letters, with spaces before and after. AND can be omitted.
+
For example, to find documents that contain both search-term-1 and search-term-2, type the following in the search form.
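An illustrative form (the terms are placeholders):
search-term-1 AND search-term-2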
Use boost search when you want to prioritize specific search terms. Boost search weights results according to the importance assigned to each search term.
+
+
To use boost search, add '^boost-value' after a search term; the boost value specifies the weight given to that term.
+
For example, to search for pages containing apples and oranges while ranking pages with 'apples' higher, enter the following in the search form.
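A possible form (the boost value 100 is an illustrative choice):
apples^100 oranges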
+
+
Specify an integer greater than 1 as the boost value.
+
+
+
+
diff --git a/src/site/en/xdoc/7.0/user/search-field.xml b/src/site/en/xdoc/7.0/user/search-field.xml
new file mode 100644
index 000000000..3c668053c
--- /dev/null
+++ b/src/site/en/xdoc/7.0/user/search-field.xml
@@ -0,0 +1,66 @@
+
+
+
+ Search by specifying a search field
+ Shinsuke Sugaya
+
+
+
+
In Fess, crawl results are saved in fields such as title and full text. You can search against any of these fields, which lets you specify criteria such as a document type or a small document size.
+
+
You can search the following fields by default.
+
+
List of available fields
+
+
+
Field name
+
Description
+
+
+
url
+
The crawled URL
+
+
+
host
+
The host name contained in the crawled URL
+
+
+
site
+
The site name contained in the crawled URL
+
+
+
title
+
Title
+
+
+
content
+
Text
+
+
+
contentLength
+
The size of the crawled content
+
+
+
lastModified
+
The last modified time of the crawled content
+
+
+
mimetype
+
The MIME type of the content
+
+
+
+
If you do not specify a field, the content field is searched. Custom fields are also available by using Solr dynamic fields.
+
For an HTML file, the string in the title tag is registered in the title field, and the text below the body tag is registered in the content field.
+
+
+
To search a specific field, enter field-name:search-term in the search form, separating the field name and the search term with a colon (:).
+
For example, to search for 'Fess' in the title field, type the following.
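For instance, using the title field from the table above:
title:Fess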
+
+
The above search returns documents whose title field contains 'Fess'.
Fuzzy search is available for matching words that are similar but not identical to the search term. Fess supports fuzzy searches based on the Levenshtein distance.
+
+
To apply a fuzzy search, add '~' after the search term.
+
For example, to find documents containing words close to 'Solr' (such as 'Solar'), type the following in the search form.
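For instance:
Solr~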
+
+
+
You can refine the match further by adding a number between 0 and 1 after '~'; the closer to 1, the stricter the match, for example 'Solr~0.8'. If no number is given, the default value 0.5 is used.
Label search narrows results to documents that carry the specified label information. Labels registered on the administration screen become searchable from the search screen, where they can be selected in a drop-down list; multiple labels can be selected when searching. If no labels are registered, the label drop-down box is not displayed.
+
+
You can select the label information at search time.
+
+
Labels are assigned when the index is created, so you can search by the label specified in each crawl setting. A search without a label returns all results as usual. If you change label information, update the index.
Use OR search when you want to find documents that contain any of the search words. When multiple words are entered in the search box, an AND search is performed by default, so OR must be written explicitly.
+
+
To write an OR search, put OR between the search words. OR must be written in capital letters, with a space before and after it.
+
For example, to find documents that contain either search term 1 or search term 2, type the following in the search form.
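A minimal sketch of such a query (the two terms stand in for any search words):

search term1 OR search term2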
You can use single-character and multiple-character wildcards in search terms. '?' is the single-character wildcard and '*' is the multiple-character wildcard. Wildcards cannot be used as the first character. Wildcards apply to words; they cannot be used across sentences.
+
+
The single-character wildcard '?' is used as shown below.
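A minimal sketch matching the 'text'/'test' example that follows:

te?t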
+
+
The above matches words such as 'text' or 'test', with '?' standing for any single character.
+
The multiple-character wildcard '*' is used as shown below.
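A minimal sketch matching the 'test'/'tests'/'tester' example that follows:

test*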
+
+
The above matches words such as 'test', 'tests', or 'tester', with '*' standing for any number of characters.
+
+
Wildcards can also be used in the middle of a search term.
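For example, a mid-term wildcard would look like:

te*t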
+
+
+
Wildcards operate on the indexed strings. Therefore, if the index was built with an analyzer such as bi-gram, which splits Japanese text into fixed-length fragments, wildcards on Japanese terms will not behave as expected. Use wildcards on fields indexed with morphological analysis.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/admin/browserType-guide.xml b/src/site/en/xdoc/8.0/admin/browserType-guide.xml
new file mode 100644
index 000000000..04cde963c
--- /dev/null
+++ b/src/site/en/xdoc/8.0/admin/browserType-guide.xml
@@ -0,0 +1,23 @@
+
+
+
+ Setting the browser type
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to browser types. Browser type information can be attached to search result data, so that search results can be presented separately for each type of browser.
+
+
+
+
After logging in with an administrator account, click 'Browser Types' in the menu.
+
+
+
+
+
+
You can set the display name and value. Use this when you want to add new terminal types. No special customization is required; use it only where necessary.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/admin/crawl-guide.xml b/src/site/en/xdoc/8.0/admin/crawl-guide.xml
new file mode 100644
index 000000000..cec008f07
--- /dev/null
+++ b/src/site/en/xdoc/8.0/admin/crawl-guide.xml
@@ -0,0 +1,147 @@
+
+
+
+ The General crawl settings
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to crawling.
+
+
+
+
After logging in with an administrator account, click 'General' under Crawl in the menu.
+
+
+
+
+
+
You can set the interval at which web sites and file systems are crawled. The default is shown below.
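Judging from the explanation below (crawling every day at 0:00 am), the default schedule is presumably the following cron-style expression:

0 0 0 * * ?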
+
+
The figures represent, from left: seconds, minutes, hours, day of month, month, and day of week. The format is similar to Unix cron settings. This example crawls every day at 0:00 am.
+
Following are examples of how to write.
+
+
+
+
0 0 12 * * ?
+
Fires at 12:00 pm (noon) every day
+
+
+
0 15 10 ? * *
+
Fires at 10:15 am every day
+
+
+
0 15 10 * * ?
+
Fires at 10:15 am every day
+
+
+
0 15 10 * * ? *
+
Fires at 10:15 am every day
+
+
+
0 15 10 * * ? 2009
+
Fires at 10:15 am every day during 2009
+
+
+
0 * 14 * * ?
+
Fires every minute from 2:00 pm to 2:59 pm every day
+
+
+
0 0/5 14 * * ?
+
Fires every 5 minutes from 2:00 pm to 2:59 pm every day
+
+
+
0 0/5 14,18 * * ?
+
Fires every 5 minutes from 2:00 pm to 2:59 pm and from 6:00 pm to 6:59 pm every day
+
+
+
0 0-5 14 * * ?
+
Fires every minute from 2:00 pm to 2:05 pm every day
+
+
+
0 10,44 14 ? 3 WED
+
Fires at 2:10 pm and 2:44 pm every Wednesday in March
+
+
+
0 15 10 ? * MON-FRI
+
Fires at 10:15 am every Monday through Friday
+
+
+
+
There is also a check box for whether to specify seconds; by default the scheduler checks the schedule at 60-second intervals. If you need exact second-level scheduling, customize the taskScanIntervalTime value in webapps/fess/WEB-INF/classes/chronosCustomize.dicon; if minute-level precision is enough, the default is sufficient.
+
+
+
When a user performs a search, the search is written to the search log. Enable this if you want to collect search statistics.
+
+
+
Saves information about search users, making it possible to identify individual users.
+
+
+
You can collect search results that users judged to be good. A voting link is shown for each result in the result list; pressing it records the vote. The collected results can also be reflected in the index during crawling.
+
+
+
Appends the search terms to the search result links, which makes it possible to highlight the searched terms when a PDF is displayed.
+
+
+
Search results can be retrieved in XML format by accessing http://localhost:8080/fess/xml?query=search term.
+
+
+
Search results can be retrieved in JSON format by accessing http://localhost:8080/fess/json?query=search term.
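As a quick sketch, assuming a default local installation, the JSON endpoint can be exercised with curl (the search word 'Fess' is just an example):

curl "http://localhost:8080/fess/json?query=Fess"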
+
+
+
Suggest candidates for search input can be retrieved in XML or JSON format. For example, to get words beginning with 'test', access http://localhost:8080/fess/json?type=suggest with the prefix passed in the query parameter.
+
+
+
Morphological analysis results can be retrieved in XML or JSON format. For example, to analyze the sentence 'today's weather is sunny', access http://localhost:8080/fess/json?type=analysis with the sentence passed in the query parameter.
+
+
+
Web sites designed for PCs may not display correctly on mobile devices. If you select a mobile conversion, PC sites are converted for viewing on mobile terminals. If you choose Google, the Google Wireless Transcoder is used to display content on mobile phones: when search results are viewed on a mobile terminal, the result links are passed through the Google Wireless Transcoder, giving seamless mobile conversion.
+
+
+
You can specify the label that is applied by default when no label is selected. Specify the label's value.
+
+
+
You can specify which search screens are available. If 'Web' is selected, the mobile search screen is not available; if 'unavailable' is selected, no search screen is shown. Select 'unavailable' if you want to build a dedicated index server.
+
+
+
Frequently searched words can be retrieved in JSON format by accessing http://localhost:8080/fess/hotsearchword.
+
+
+
Deletes session logs older than the specified number of days. Old logs are removed by the log purge once a day.
+
+
+
Deletes search logs older than the specified number of days. Old logs are removed by the log purge once a day.
+
+
+
Specifies, separated by commas (,), the bot names to remove from the search log, matched against the user agent recorded in the log. The logs are deleted by the log purge once a day.
+
+
+
Specifies the encoding of the CSV files used for backup and restore.
+
+
+
After data is registered in Solr, issuing a commit or an optimize makes the registered data searchable. If optimize is selected, a Solr index optimization is issued; if commit is selected, a commit is issued.
+
+
+
Fess can combine multiple Solr servers into a group and manage multiple groups. Different Solr server groups are used for updates and for searches. For example, with two groups, group 2 may be used for updates while group 1 serves searches. After a crawl completes, the roles are switched: updates go to group 1 and searches to group 2. This setting is only meaningful when multiple Solr server groups are registered.
+
+
+
Fess sends documents to Solr in units of 10. A commit is issued to Solr each time the number of documents specified here has been sent. If 0 is specified, the commit is performed after the crawl completes.
+
+
+
Fess crawls web sites and file systems. Only the number of crawl settings specified here run simultaneously. For example, with a concurrency of 3 and web crawl settings 1 through 10, settings 1 to 3 run first; when one of them completes, crawl setting 4 starts, and so on until setting 10 has run.
+
Note that this value is the number of crawl settings run simultaneously, not a thread count: each crawl setting also has its own number of threads, so with a concurrency of 3 and 5 threads per crawl setting, up to 3 x 5 = 15 threads crawl at the same time.
+
+
+
Data can be deleted automatically after it has been indexed. If 5 is selected, documents that were indexed at least 5 days ago and have not been updated since are removed. This can be used to drop documents whose source content has been removed.
+
+
+
URLs registered as failure URLs are excluded from the next crawl once they exceed the failure count. By specifying here the failure types that do not need monitoring, those URLs are still crawled next time.
+
+
+
A failure URL that exceeds the specified number of failures is excluded from crawling.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/admin/crawlingSession-guide.xml b/src/site/en/xdoc/8.0/admin/crawlingSession-guide.xml
new file mode 100644
index 000000000..1e833ceb8
--- /dev/null
+++ b/src/site/en/xdoc/8.0/admin/crawlingSession-guide.xml
@@ -0,0 +1,27 @@
+
+
+
+ Set session information
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to session information. The result of one crawl is saved as one session information record, in which you can check the run time and the number of indexed documents.
+
+
+
+
After logging in with an administrator account, click 'Session Information' in the menu.
+
+
+
+
+
+
Clicking the 'Delete all' link removes all session information records that are not currently running. Expired sessions are removed at the next crawl.
+
+
+
You can check the contents of a crawl by its session ID. The crawl start and finish times and the number of indexed documents are listed.
This section describes how to back up and restore Fess configuration information.
+
+
+
+
After logging in with an administrator account, click 'Backup/Restore' in the menu.
+
+
+
+
Click the download link to output the Fess configuration information in XML format. The saved settings information includes the following.
+
+
The General crawl settings
+
Web crawl settings
+
File system Crawl settings
+
Datastore crawl settings
+
Label
+
Path mapping
+
Web authentication
+
File system authentication
+
Request header
+
Duplicate host
+
Role
+
Compatible browsers
+
+
Session information, search logs, and click logs are available in CSV format.
+
The Solr index data and the data being crawled are not backed up. Those data can be regenerated by crawling again after restoring the Fess configuration. If you need to back up the Solr index, back up the solr directory.
+
+
+
You can restore the settings information and the various logs by uploading the XML or CSV files output by a backup. Specify the file and click the restore button.
+
If 'overwrite data' is enabled when an XML configuration file is specified, existing records are updated when the same data already exists.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/admin/dataCrawlingConfig-guide.xml b/src/site/en/xdoc/8.0/admin/dataCrawlingConfig-guide.xml
new file mode 100644
index 000000000..5ee6f9634
--- /dev/null
+++ b/src/site/en/xdoc/8.0/admin/dataCrawlingConfig-guide.xml
@@ -0,0 +1,159 @@
+
+
+
+ Settings for crawling the data store
+ Sone, Takaaki
+ Shinsuke Sugaya
+
+
+
+
You can crawl databases with Fess. This section describes the settings required for a data store crawl.
+
+
+
+
After logging in with an administrator account, click 'Data Store' in the menu.
+
+
As an example, we will crawl the following table in a MySQL database named testdb, connecting with user name hoge and password fuga.
+
+
Assume data like the following has been inserted.
+
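A minimal sketch of such a table and its data; the doc table name and its columns are assumptions chosen to match the parameter and script examples below:

CREATE TABLE doc (
    id BIGINT NOT NULL AUTO_INCREMENT,
    title VARCHAR(100) NOT NULL,
    content VARCHAR(255) NOT NULL,
    versionNo INTEGER NOT NULL,
    PRIMARY KEY (id)
);

INSERT INTO doc (title, content, versionNo) VALUES ('test1', 'This is a test document.', 0);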
+
+
+
+
+
An example of the parameter settings looks like the following.
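A minimal sketch, assuming the MySQL connection described above; the JDBC driver class and connection URL follow standard MySQL conventions, and the doc table is the hypothetical table sketched earlier:

driver=com.mysql.jdbc.Driver
url=jdbc:mysql://localhost:3306/testdb?useUnicode=true&characterEncoding=UTF-8
username=hoge
password=fuga
sql=select * from doc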
+
+
Parameters are written in "key = value" format. The keys are described below.
+
+
Parameters for a DB crawl
+
+
+
driver
+
Driver class name
+
+
+
URL
+
The connection URL
+
+
+
username
+
User name used to connect to the DB
+
+
+
password
+
Password used to connect to the DB
+
+
+
SQL
+
SQL statement that retrieves the data to crawl
+
+
+
+
+
+
An example of the script settings looks like the following.
+
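A minimal sketch, again assuming the hypothetical doc table above; each line assigns a Fess field from an OGNL expression over the current row:

url="http://localhost/" + id
host="localhost"
site="localhost"
title=title
content=content
cache=content
digest=content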
+
+ Parameters are written in "key = value" format.
+ The keys are described below.
+
+ The value side is written in OGNL. Enclose strings in double quotation marks.
+ Database column values are accessed by their column names.
+
+
Script settings
+
+
+
URL
+
URLs (links appear in search results)
+
+
+
host
+
Host name
+
+
+
site
+
Site path
+
+
+
title
+
Title
+
+
+
content
+
Content (the indexed string)
+
+
+
cache
+
Content cache (not indexed)
+
+
+
Digest
+
Digest part shown in the search results
+
+
+
anchor
+
Links to content (not usually required)
+
+
+
contentLength
+
The length of the content
+
+
+
lastModified
+
Last modified date of the content
+
+
+
+
+
+
A driver is required to connect to the database. Place the JDBC driver jar file in webapps/fess/WEB-INF/cmd/lib.
+
+
+
To display an item value such as latitude_s in the search results, set the following in webapps/fess/WEB-INF/classes/app.dicon, and then add ${doc.latitude_s} to searchResults.jsp.
This section describes the design settings for the search screens.
+
+
+
+
After logging in with an administrator account, click 'Design' in the menu.
+
+
You can edit the search screens in the screen shown below.
+
+
+
+
To display the registration or modification date of files crawled by Fess in the search results, edit the search results page (content) and write the following.
+
+]]>
+
tstampDate holds the registration date from the crawl, and lastModifiedDate holds the document's last modified date. Output date formats follow the fmt:formatDate specification.
+
+
+
+
+
The files used on the search screens are listed, and can be downloaded or removed.
+
+
+
You can upload files to be used on the search screens. Supported file names are images (jpg, gif, and png), css, and js.
+
+
+
Use this if you want to give the uploaded file a specific file name. If omitted, the name of the uploaded file is used.
+
+
+
You can edit the JSP files of the search screens. Pressing the Edit button of a JSP file lets you edit the current JSP file; pressing the default button lets you edit the JSP file as it was at installation. Changes take effect when you save them with the update button in the edit screen.
+
The editable JSP files are listed below.
+
+
Editable JSP files
+
+
+
Top page (frame)
+
The JSP file of the search top page. This JSP includes the JSP files of the individual parts.
+
+
+
Top page (within the head tag)
+
The JSP file for the head tag section of the search top page. Edit it to change meta tags, the title tag, script tags, and so on.
+
+
+
Top page (content)
+
The JSP file for the body tag section of the search top page.
+
+
+
Search results page (frame)
+
The JSP file of the search results list page. This JSP includes the JSP files of the individual parts.
+
+
+
Search results page (within the head tag)
+
The JSP file for the head tag section of the search results list page. Edit it to change meta tags, the title tag, script tags, and so on.
+
+
+
Search results page (header)
+
The JSP file for the header of the search results list page. It contains the search form at the top.
+
+
+
Search results page (footer)
+
The JSP file for the footer of the search results list page. It contains the copyright notice at the bottom.
+
+
+
Search results page (content)
+
The JSP file for the search results list itself. Edit it to customize how search results are rendered.
+
+
+
Search results page (no results)
+
The JSP file used when there are no search results.
+
+
+
+
Screens for PCs and for mobile devices can be edited in the same way.
This section describes the popular URL log. When a user clicks the voting link on the search screen, the URL is registered as a favorite link in the popular URL log. You can disable this feature in the general crawl settings.
+
+
+
+
After logging in with an administrator account, click 'Popular URL' in the menu.
+
+
+
+
The popular URLs are listed.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/admin/fileAuthentication-guide.xml b/src/site/en/xdoc/8.0/admin/fileAuthentication-guide.xml
new file mode 100644
index 000000000..3ae102fbb
--- /dev/null
+++ b/src/site/en/xdoc/8.0/admin/fileAuthentication-guide.xml
@@ -0,0 +1,44 @@
+
+
+
+ Settings for file system authentication
+ Shinsuke Sugaya
+
+
+
+
This section describes how to set up file system authentication, which is required when crawling file systems that need authentication. Fess supports crawling Windows shared folders.
+
+
+
+
After logging in with an administrator account, click 'File System Authentication' in the menu.
+
+
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the settings apply to any host name in the specified file system crawl settings.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the settings apply to any port in the specified file system crawl settings.
+
+
+
Selects the authentication method. SAMBA (Windows shared folder authentication) is available.
+
+
+
Specifies the user name for logging in to the authentication site.
+
+
+
Specifies the password for logging in to the authentication site.
+
+
+
Sets values required for logging in to the authentication site. For SAMBA, the domain value can be set; it is written as follows.
+
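A minimal sketch of the parameter format; the bracketed value is a placeholder:

domain=[domain name]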
+
+
+
Selects the file system crawl setting names to which the above authentication settings apply. The file system crawl settings must be registered beforehand.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/admin/fileCrawlingConfig-guide.xml b/src/site/en/xdoc/8.0/admin/fileCrawlingConfig-guide.xml
new file mode 100644
index 000000000..934b6538f
--- /dev/null
+++ b/src/site/en/xdoc/8.0/admin/fileCrawlingConfig-guide.xml
@@ -0,0 +1,106 @@
+
+
+
+ Settings for file system crawling
+ Shinsuke Sugaya
+
+
+
+
This section describes the settings for crawling file systems.
+
If you want to index more than 100,000 documents, we recommend splitting them across several crawl settings of a few tens of thousands of documents each. Indexing performance degrades when a single crawl setting targets more than 100,000 documents.
+
+
+
+
After logging in with an administrator account, click 'File System' in the menu.
+
+
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple paths. Each path must start with file: or smb:. For example, specify them as follows:
+
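A minimal sketch of such a path; the directory is a placeholder:

file:/home/taro/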
+
Everything below each specified directory is crawled.
+
Paths must be written as URIs; a Windows path such as c:\Documents\taro is specified as file:/c:/Documents/taro.
+
For a Windows shared folder, for example to crawl the share folder on host1, specify smb://host1/share/ in the crawl settings (the trailing / is required). If the shared folder requires authentication, set the authentication information on the file system authentication screen.
+
+
+
By specifying regular expressions, you can include or exclude given path patterns from crawling and searching.
+
+
Path filtering contents
+
+
+
Path to crawl
+
Crawls the paths that match the specified regular expression.
+
+
+
The path to exclude from being crawled
+
Does not crawl the paths that match the specified regular expression. This takes precedence even over paths specified as paths to crawl.
+
+
+
Path to be searched
+
Searches only the paths that match the specified regular expression. Exclusion from search takes precedence even over paths specified here.
+
+
+
Path to exclude from searches
+
Does not search the paths that match the specified regular expression. If a path is excluded from crawling, pages linked from it cannot be searched at all; use this when you want to exclude only some paths from search while still crawling them.
+
+
+
+
For example, to crawl only paths under /home/, specify the following as the path to crawl:
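A sketch of the pattern, a regular expression over the file: URI form described above:

file:/home/.*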
+
+
To exclude files with the png extension, specify the following as the path to exclude:
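A sketch of the exclusion pattern:

.*\.png$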
+
+
Multiple patterns can be specified, one per line.
+
URIs are specified in the format handled by java.io.File, for example:
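A sketch of the correspondence; the paths are placeholders:

file:/home/taro
c:\memo.txt        -> file:/c:/memo.txt
\\server\memo.txt  -> file:////server/memo.txt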
You can specify the crawl configuration information.
+
+
+
Specify the depth of a directory hierarchy.
+
+
+
You can specify the maximum number of documents to retrieve in the crawl.
+
+
+
Specifies the number of crawler threads. A value of 5 means 5 threads crawl simultaneously.
+
+
+
The time interval, in milliseconds, between document retrievals. With a value of 5000 and one thread, a document is fetched every 5 seconds.
+
With 5 threads and a 1000 millisecond interval, up to 5 documents are fetched per second.
+
+
+
You can weight the URLs of this crawl setting so that they rank above others in the search results. The standard value is 1; the higher the value, the higher the results are displayed. If you want these results ranked above everything else, give a sufficiently large value such as 10,000.
+
Valid values are integers greater than 0. This value is used as the boost value when documents are added to Solr.
+
+
+
Registers the selected browser types with the crawled documents. If only PC is selected, the documents do not appear in results when searching from a mobile device. Use this if documents should be visible only on specific mobile devices.
+
+
+
You can make documents appear in search results only for users with a particular role. Roles must be set up beforehand. This is useful, for example, when search results should differ per user in systems that require login, such as portal servers.
+
+
+
You can attach labels to the search results, enabling label-specific search by selecting the label on the search screen.
+
+
+
Set this to enabled to include the setting in crawls. Disable it if you want to skip crawling temporarily.
+If you need commercial support, maintenance, or technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the effectiveness of any third-party web sites described in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for the content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess project assumes no responsibility or obligation for any damage or loss caused, or alleged to be caused, by or in connection with the use of or reliance on any such content, advertising, products, services, or other materials.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and suggestions from readers.
+
This section describes the label settings. Labels classify the documents that appear in search results and are selected in the crawl settings. Even without selecting labels in the crawl settings, you can assign labels by specifying path regular expressions in the label settings. Registered labels are shown in a drop-down box to the right of the search box.
+
+
+
+
After logging in with an administrator account, click 'Label' in the menu.
+
+
+
+
+
+
+
Specifies the name shown in the label drop-down at search time.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must consist of alphanumeric characters.
+
+
+
Sets the paths to be labeled, as regular expressions. Multiple patterns can be specified on multiple lines. Documents whose paths match are labeled, regardless of the crawl configuration.
+
+
+
Sets the paths to exclude from labeling, as regular expressions. Multiple patterns can be specified on multiple lines.
This section describes the duplicate host settings. Use them when the same content is crawled under different host names and should be treated as one host, for example when www.example.com and example.com are the same site.
+
+
+
+
After logging in with an administrator account, click 'Duplicate Host' in the menu.
+
+
+
+
+
+
+
Specifies the canonical host name. Duplicate host names are replaced by the canonical host name.
+
+
+
Specifies the duplicated host name, that is, the host name to be replaced.
This section describes the path mapping settings. Use path mapping if you want to replace parts of the links shown in search results.
+
+
+
+
After logging in with an administrator account, click 'Path Mapping' in the menu.
+
+
+
+
+
+
+
Path mapping replaces the part of a link that matches the specified regular expression with the replacement string. When crawling a local file system, the generated search result links may not work in some environments; in such cases, path mapping lets you adjust the links. Multiple path mappings can be specified.
This section describes the request header settings. Request headers add header information to the requests issued when fetching documents during a crawl. This is useful, for example, when an authentication system logs users in automatically based on specific header values.
+
+
+
+
After logging in with an administrator account, click 'Request Header' in the menu.
+
+
+
+
+
+
+
Specifies the request header name to append to the request.
+
+
+
Specifies the request header value to append to the request.
+
+
+
Selects the web crawl setting names for which the request header is added. The header is appended only to requests of the selected crawl settings.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/admin/roleType-guide.xml b/src/site/en/xdoc/8.0/admin/roleType-guide.xml
new file mode 100644
index 000000000..3de5358c0
--- /dev/null
+++ b/src/site/en/xdoc/8.0/admin/roleType-guide.xml
@@ -0,0 +1,27 @@
+
+
+
+ Settings for a role
+ Shinsuke Sugaya
+
+
+
+
This section describes the role settings. Roles are selected in the crawl settings and classify the documents that appear in search results. For how to use roles, see the role-based search documentation.
+
+
+
+
After logging in with an administrator account, click 'Role' in the menu.
+
+
+
+
+
+
+
Specifies the name that appears in the list.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must consist of alphanumeric characters.
After logging in with an administrator account, click 'Search' in the menu.
+
+
+
+
You can search with the criteria you specify. On the regular search screen, role and browser conditions are added implicitly, but they are not applied in this management search. You can remove specific documents from the index via the search results.
This section describes the search log. When users search on the search screen, the searches are logged; the search terms and dates are recorded. The URLs of the results the user then visited can also be recorded.
+
+
+
+
After logging in with an administrator account, click 'Search Log' in the menu.
+
+
+
+
The search terms and dates are listed. Click a URL to review the details.
This section describes the Solr-related settings: the Solr servers registered for crawling and searching with Fess. The Solr servers are registered in configuration files, grouped into server groups.
+
+
+
+
After logging in with an administrator account, click 'System Settings' in the menu.
+
+
+
+
+
+
The update server is shown as running while documents are being added. The crawl process displays its session ID while running. The Fess server can be shut down safely when no crawl is running; if a crawl is running when Fess is shut down, the process does not terminate until the crawl finishes.
+
You can start crawling manually with the crawl start button, and stop a running crawl with the stop button.
+
+
+
The server group names used for searching and for updating are displayed.
+
+
+
Fess manages a server state and an index state for each Solr server. The server state indicates whether the Solr server can be accessed; the index state indicates whether crawling and indexing completed successfully. Search uses servers whose server state is enabled, regardless of the index state. Crawling uses servers whose server state is enabled and whose index state is preparing or completed. When a crawl is started manually, the index state changes to preparing automatically. A server in the recovery state is returned to the enabled state automatically.
+
+
+
You can check the state of each Solr server instance, and issue start, stop, and reload requests per instance.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/admin/systemInfo-guide.xml b/src/site/en/xdoc/8.0/admin/systemInfo-guide.xml
new file mode 100644
index 000000000..b0fd0710a
--- /dev/null
+++ b/src/site/en/xdoc/8.0/admin/systemInfo-guide.xml
@@ -0,0 +1,32 @@
+
+
+
+ System information
+ Shinsuke Sugaya
+
+
+
+
Here you can check system information such as environment variables and current property values.
+
+
+
+
After logging in with an administrator account, click 'System Info' in the menu.
+
+
+
+
+
+
Lists the server's environment variables.
+
+
+
Lists the system properties of Fess.
+
+
+
Shows the Fess installation information.
+
+
+
A list of properties to attach when reporting a bug. The extracted values contain no personal information.
This section describes the user log. When users search on the search screen, the user is identified and recorded in the user log, which can be used together with the search log and the popular URL information. You can disable this feature in the general crawl settings.
+
+
+
+
After logging in with an administrator account, click 'User' in the menu.
+
+
+
+
The user IDs are listed. Click the search log or popular URL links to see the corresponding logs for each user.
This section describes the web authentication settings, which are required when crawling sites that need authentication. Fess supports crawling with BASIC and DIGEST authentication.
+
+
+
+
After logging in with an administrator account, click 'Web Authentication' in the menu.
+
+
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the settings apply to any host name in the specified web crawl settings.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the settings apply to any port in the specified web crawl settings.
+
+
+
Specifies the realm name of the site that requires authentication. If omitted, the settings apply to any realm name in the specified web crawl settings.
+
+
+
Selects the authentication method. BASIC, DIGEST, and NTLM authentication are available.
+
+
+
Specifies the user name for logging in to the authentication site.
+
+
+
Specifies the password for logging in to the authentication site.
+
+
+
Sets values required for logging in to the authentication site. For NTLM authentication, the workstation and domain values can be set; they are written as follows.
+
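A minimal sketch of the parameter format; the bracketed values are placeholders:

workstation=[workstation name]
domain=[domain name]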
+
+
+
Selects the web crawl setting names to which the above authentication settings apply. The web crawl settings must be registered in advance.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/admin/webCrawlingConfig-guide.xml b/src/site/en/xdoc/8.0/admin/webCrawlingConfig-guide.xml
new file mode 100644
index 000000000..9c6a4eec2
--- /dev/null
+++ b/src/site/en/xdoc/8.0/admin/webCrawlingConfig-guide.xml
@@ -0,0 +1,107 @@
+
+
+
+ Settings for crawling Web site
+ Shinsuke Sugaya
+
+
+
+
This section describes the settings for crawling web sites.
+
If you want to index more than 100,000 documents, we recommend splitting them across several crawl settings of a few tens of thousands of documents each. Indexing performance degrades when a single crawl setting targets more than 100,000 documents.
+
+
+
+
After logging in with an administrator account, click 'Web' in the menu.
+
+
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple URLs. Each URL must start with http: or https:. For example, specify them as follows:
+
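A minimal sketch of such a URL list; the hosts are placeholders:

http://localhost/
http://localhost:8080/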
+
Crawling starts from the specified URLs.
+
+
+
By specifying regular expressions, you can include or exclude specific URL patterns from crawling and searching.
+
+
URL filtering contents
+
+
+
URL to crawl
+
Crawls the URLs that match the specified regular expression.
+
+
+
Excluded from the crawl URL
+
Does not crawl the URLs that match the specified regular expression. This takes precedence even over URLs specified as URLs to crawl.
+
+
+
To search for URL
+
Searches only the URLs that match the specified regular expression. Exclusion from search takes precedence even over URLs specified here.
+
+
+
To exclude from the search URL
+
Does not search the URLs that match the specified regular expression. If a URL is excluded from crawling, pages linked from it cannot be searched at all; use this when you want to exclude only some URLs from search while still crawling them.
+
+
+
+
For example, to crawl only URLs under http://localhost/, specify the following as the URL to crawl:
+
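A sketch of the pattern:

http://localhost/.*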
+
To exclude URLs with the png extension, specify the following as the URL to exclude:
+
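A sketch of the exclusion pattern:

.*\.png$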
+
Multiple patterns can be specified, one per line.
+
+
+
You can specify the crawl configuration information.
+
+
+
Specifies the depth to which links found in documents are followed during the crawl.
+
+
+
You can specify the maximum number of documents to retrieve in the crawl. If not specified, 100,000 documents are retrieved.
+
+
+
You can specify the user agent to use when crawling.
+
+
+
Specifies the number of crawler threads. A value of 5 means 5 threads crawl the web site simultaneously.
+
+
+
The interval, in milliseconds, between document retrievals. With a value of 5000 and one thread, a document is fetched every 5 seconds.
+
With 5 threads and a 1000 millisecond interval, up to 5 documents are fetched per second. Set an adequate value so that crawling does not overload the target web server.
+
+
+
You can weight the URLs of this crawl setting so that they rank above others in the search results. The standard value is 1; the higher the value, the higher the results are displayed. If you want these results ranked above everything else, give a sufficiently large value such as 10,000.
+
Valid values are integers greater than 0. This value is used as the boost value when documents are added to Solr.
+
+
+
Registers the selected browser types with the crawled documents. If only PC is selected, the documents do not appear in results when searching from a mobile device. Use this if documents should be visible only on specific mobile devices.
+
+
+
You can make documents appear in search results only for users with a particular role. Roles must be set up beforehand. This is useful, for example, when search results should differ per user in systems that require login, such as portal servers.
+
+
+
You can attach labels to the search results, enabling label-specific search by selecting the label on the search screen.
+
+
+
Set this to enabled to include the setting in crawls. Disable it if you want to skip crawling temporarily.
+
+
+
+
+
Fess can crawl sitemap files and crawl the URLs defined in them, following the sitemaps specification at http://www.sitemaps.org/. The available formats are XML sitemaps, XML sitemap indexes, and text sitemaps (one URL per line).
+
Specify the sitemap URL as a URL to crawl. Since a sitemap is an XML or text file, Fess cannot always distinguish a sitemap from an ordinary file when crawling the URL; by default, file names matching sitemap.*.xml, sitemap.*.gz, or sitemap.*txt are handled as sitemaps (this can be customized in webapps/fess/WEB-INF/classes/s2robot_rule.dicon).
+
URLs found in a crawled sitemap file are crawled in the next crawl, in the same way as links found in HTML files.
You can use the Settings Wizard to set up Fess easily.
+
+
+
+
After logging in with an administrator account, click 'Settings Wizard' in the menu.
+
+
First, set a schedule.
+ Fess crawls and creates the index at the scheduled time.
+ By default this is 0:00 every day. The schedule can also be changed later in the general crawl settings.
+
+
Next, create a crawl setting.
+ A crawl setting registers the URI you want to search.
+ Give the crawl setting any name that is easy to identify, and enter the URI of the part you want indexed and made searchable.
+
+
For example, to make http://fess.codelibs.org/ searchable, the setting looks like the following.
+
+
For a file system, type a path such as c:\Users\taro.
+
This is the last step. Press the crawl start button to begin crawling immediately; if you press the Finish button instead, crawling does not start until the time specified in the schedule settings.
+
+
+
+
The settings made in the Settings Wizard can be changed later in the general crawl, web, and file system settings.
The provided binaries use H2 Database. This section describes how to use MySQL instead; to use another database, change the settings in the source code and build it.
+
+
+
+
Set the MySQL character encoding: the following settings must be added to /etc/mysql/my.cnf.
+
+
+
+
Download the MySQL binaries and expand them.
+
+
+
Create a database.
+mysql> create database fess_db;
+mysql> grant all privileges on fess_db.* to fess_user@localhost identified by 'fess_pass';
+mysql> create database fess_robot;
+mysql> grant all privileges on fess_robot.* to s2robot@localhost identified by 's2robot';
+mysql> FLUSH PRIVILEGES;
+]]>
+
Create a table in the database. DDL file is located in extension/mysql.
+ Increasing awareness of security in the browser environment in recent years, open a local file (for example, c:\hoge.txt) from the Web pages on.
+ Standard in Fess, open a file on a file system using the Java applet.
+ As a Java applet and another, offer desktop search functionality.
+ You can use desktop environment launches a Fess on a local PC, access to the file in the file system.
+ In the environment of the server and client desktop search not available.
+
+
+
+ The desktop search feature is turned off by default.
+ Enable it with the following settings.
+
First, edit bin/setenv.bat and change java.awt.headless from true to false.
+
+
Next, add the following to webapps/fess/WEB-INF/conf/crawler.properties.
+
+
After making the above settings, start Fess. Basic usage remains the same.
+
+
+
+
Make sure Fess is not accessible from outside (for example, do not expose port 8080).
+
Because java.awt.headless is false, image size conversion for mobile devices is not available.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/config/filesize.xml b/src/site/en/xdoc/8.0/config/filesize.xml
new file mode 100644
index 000000000..ff556b6fc
--- /dev/null
+++ b/src/site/en/xdoc/8.0/config/filesize.xml
@@ -0,0 +1,29 @@
+
+
+
+ File size you want to crawl settings
+ Shinsuke Sugaya
+
+
+
+
You can limit the size of the files Fess crawls. By default, HTML files are handled up to 2.5 MB and other files up to 10 MB. To change the file size handling, edit webapps/fess/WEB-INF/classes/s2robot_contentlength.dicon. The standard s2robot_contentlength.dicon is as follows.
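A sketch consistent with the defaults described above (10 MB default, 2.5 MB for text/html); the component and helper class names follow s2robot conventions and are assumptions:

<component name="contentLengthHelper" class="org.seasar.robot.helper.ContentLengthHelper" instance="singleton">
  <property name="defaultMaxLength">10485760</property><!-- 10 MB -->
  <initMethod name="addMaxLength">
    <arg>"text/html"</arg>
    <arg>2621440</arg><!-- 2.5 MB -->
  </initMethod>
</component>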
To change the default limit, change the value of defaultMaxLength. The maximum size handled can be specified per content type; the entry for text/html describes the maximum size handled for HTML files.
+
Note the amount of heap memory used when increasing the maximum file size handled. For how to configure it, see the memory-related documentation.
Documents with latitude and longitude location information can be used for geo search, for example in conjunction with Google Maps.
+
+
+
+
Location information is defined as a feed that contains the latitude and longitude.
+ When generating the index, set the latitude and longitude in the location feed of the document registered in Solr, in a format such as 45.17614,-93.87341.
+ To display the latitude and longitude in the search results, also set the values in the latitude_s and longitude_s fields; *_s is available as a Solr dynamic string field.
+
+
+
At search time, specify the latitude, longitude, and distance in the request parameters.
+ Results within the specified distance (km) of the given latitude and longitude are returned. Latitude, longitude, and distance are treated as double values.
The index data is managed by Solr. Backing up from the Fess administration screen may not be possible once the index data reaches gigabyte sizes.
+
If you need to back up the index data, stop Fess and back up the solr/core1/data directory. To restore, copy the backed-up index data back into place.
+If you need commercial support, maintenance, or technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the effectiveness of any third-party web sites described in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for the content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess project assumes no responsibility or obligation for any damage or loss caused, or alleged to be caused, by or in connection with the use of or reliance on any such content, advertising, products, services, or other materials.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and suggestions from readers.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/config/install-on-tomcat.xml b/src/site/en/xdoc/8.0/config/install-on-tomcat.xml
new file mode 100644
index 000000000..314d28334
--- /dev/null
+++ b/src/site/en/xdoc/8.0/config/install-on-tomcat.xml
@@ -0,0 +1,43 @@
+
+
+
+ Install to an existing Tomcat
+ Shinsuke Sugaya
+
+
+
+
+ The standard Fess distribution is shipped already deployed to Tomcat.
+ Because Fess does not depend on Tomcat, it can be deployed to any Java application server.
+ This section describes how to deploy Fess to an existing Tomcat.
+ Expand the downloaded Fess server.
+ The expanded Fess server home directory is referred to as $FESS_HOME,
+ and the top directory of the existing Tomcat 6 as $TOMCAT_HOME.
+ Copy the Fess server data.
+
+
+ If you have changed any of the destination files, take a diff with the diff command and apply only your changes.
+
Set the maximum memory per Java process. A process will not use more than this limit even on a server with, say, 8 GB of physical memory. Memory consumption varies considerably with the number of crawl threads and the crawl interval. If memory is insufficient, change the settings as described below.
+
+
+
Depending on the crawl contents, an OutOfMemory error like the following may occur.
+
+
If it does, increase the maximum heap memory: change -Xmx in bin/setenv.[sh|bat] to, for example, -Xmx1024m (a maximum of 1024 MB).
+
+
+
+
+ The maximum memory on the crawler side can also be changed.
+ The default is 512 MB.
+
+ Uncomment crawlerJavaOptions in webapps/fess/WEB-INF/classes/fess.dicon and change -Xmx1024m (in this example the maximum is set to 1024 MB).
+
The mobile device information is provided by ValueEngine Inc. To use the latest mobile device information, download the device profile, remove the _YYYY-MM-DD suffix from the file name, and save it under webapps/fess/WEB-INF/classes/device. Restart to apply the change.
To search PDF files that are protected by a password, you must register the password in the settings file.
+
+
+
+
+ First, create webapps/fess/WEB-INF/classes/s2robot_extractor.dicon.
+ In it, the password to use is associated with a matching .pdf file name pattern.
+ If you have multiple password-protected files, add multiple addPassword settings.
Fess applies a stemming process when indexing and searching.
+
This normalizes English words; for example, 'recharging' and 'rechargable' are both normalized to the stem 'recharg'. As a result, a search for 'recharging' also hits the word 'rechargable', and fewer results are missed.
+
+
+
Because stemming is basic rule-based processing, unintended normalization can occur. For example, the word 'Maine' (the state name) is normalized to 'main'.
+
In such cases, add 'Maine' to protwords.txt to exclude it from the stemming process.
Sets up index replication using Solr's replication feature. By building one Fess server for crawling and index creation and another for searching, you can distribute the load during index creation.
+
+
+
+
Download and install Fess on the host named MasterServer; here we assume it is installed in /opt/fess_master. Edit solr/core1/conf/solrconfig.xml as follows.
Register the crawl settings and start Fess as in a normal setup. The steps for building the index are the same as the normal procedure.
+
+
+
Download and install Fess; here we assume it is installed in /opt/fess_slave. Edit solr/core1/conf/solrconfig.xml as follows.
In Fess, you can partition search results according to the credentials of users authenticated by any authentication system. For example, if role information 'a' is attached to documents, a user with role 'a' sees them in the search results, while user 'b' does not. Using this feature in environments where users log in, such as portals and single sign-on systems, searches can be restricted by department or job title.
+
In Fess role-based search, the role information listed below is available.
+
+
Request parameter
+
Request header
+
Cookies
+
J2EE authentication information
+
+
For agent-based single sign-on systems in a portal, Fess can store authentication information in a cookie and retrieve role information from it for the configured domain and path. For reverse-proxy single sign-on systems, role information can be retrieved from authentication information added to the request headers or request parameters on access to Fess.
+
+
+
This section describes how to set up role-based search using J2EE authentication information.
+
+
Add roles and users to conf/tomcat-users.xml. In this example, the role1 role is used for role-based search, and we log in as user role1.
+
+
+
+
+
+
+
+
+
+]]>
+
+
+
Next, configure webapps/fess/WEB-INF/classes/fess.dicon as shown below.
+
+
+ {"guest"}
+
+ :
+]]>
+
By setting defaultRoleList, you can assign role information to users who have no authentication information. Here it ensures that documents requiring roles are not shown to users who are not logged in.
+
+
+
Configure webapps/fess/WEB-INF/web.xml as shown below.
Start Fess and log in as an administrator. From the role menu, register a role with name Role1 (any name will do) and value role1. Then, in each crawl setting that should be visible to users with role1, select Role1 as the role.
+
+
+
Log out of the management screen and log in as user role1. On successful login you are redirected to the top of the search screen.
+
Searching as usual now shows only the documents from the crawl settings that were given the Role1 role.
+
Searches performed while not logged in are executed as the guest user.
+
+
+
If you access http://localhost:8080/fess/admin while logged out or logged in with a non-admin role, the login screen appears. Pressing the logout button logs you out.
Fess uses port 8080 by default. Change it with the following steps.
+
+
Change the ports used by the Tomcat in which Fess runs, by modifying conf/server.xml as described below.
+
+
8080: HTTP access port
+
8005: shut down port
+
8009: AJP port
+
8443: SSL HTTP access port (disabled by default)
+
19092: database port (use h2database)
+
+
+
+
Because the standard configuration runs Solr in the same Tomcat, changing the Tomcat port means the Solr server information referenced by Fess must be changed as well.
+
Change the following points in webapps/fess/WEB-INF/classes/app.dicon.
+ "http://localhost:8080/manager/text/"
+]]>
+
Change the following points in webapps/fess/WEB-INF/classes/solrlib.dicon.
+ "http://localhost:8080/solr/core1"
+]]>
+
+ Note: if you change the Tomcat port but do not change the ports above as well, search and index updates will show errors because the Solr server cannot be accessed.
+
Solr registers each document item (field) according to the defined schema. The Solr schema used by Fess is defined in solr/core1/conf/schema.xml. Standard fields such as title and content are defined, and dynamic fields whose field names can be chosen freely are also defined. The dynamic fields available in the Fess schema.xml are defined there; for advanced parameter values, see the Solr documentation.
Dynamic fields are most often used in database crawls, for example when registering values in the data store crawl settings. To register a dynamic field in a database crawl, put other_t = hoge in the script; the data of the hoge column is then stored in the Solr other_t field.
+
Next, to retrieve the data stored in a dynamic field, you must add the field to webapps/fess/WEB-INF/classes/app.dicon; add other_t there.
With the above settings the value is returned from Solr, so edit the JSP file to display it on the page. Log in to the management screen and open the design section. The search results are rendered by the search results page (content), so edit that JSP file: the value registered in other_t can be displayed wherever you like with ${f:h(doc.other_t)}.
Fess manages multiple Solr servers organized into Solr groups, and can manage multiple groups. Fess keeps server and group state information, and changes the status of servers and groups when a Solr server becomes inaccessible.
+
The Solr server state can be changed in the system settings. maxErrorCount, maxRetryStatusCheckCount, maxRetryUpdateQueryCount, and minActiveServer can be defined in webapps/fess/WEB-INF/classes/solrlib.dicon.
+
+
+
+
When the number of Solr servers in the valid state within a Solr group falls below minActiveServer, the Solr group is disabled.
+
While a group is disabled, Fess checks the status of its Solr servers up to maxRetryStatusCheckCount times; a disabled Solr server that becomes accessible is returned to the valid state. A server that cannot be returned to the valid state but can still be accessed is marked as index corrupted.
+
A disabled Solr group is not used.
+
To re-enable a Solr group, change the status of its Solr servers to enabled in the system settings management screen.
+
+
+
+
+
Search queries can be sent to a valid Solr group.
+
Search queries are sent only to Solr servers in the valid state.
+
If multiple Solr servers are registered in a Solr group, search queries are distributed among the available Solr servers.
+
If sending a search query to a Solr server fails more than maxErrorCount times, that Solr server is changed to the disabled state.
+
+
+
+
+
Update queries can be sent to a valid Solr group.
+
Update queries are sent only to Solr servers in the valid state.
+
If multiple Solr servers are registered in a Solr group, the update query is sent to every valid Solr server.
+
If sending an update query to a Solr server fails more than maxRetryUpdateQueryCount times, that Solr server is changed to the index corrupted state.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/config/tokenizer.xml b/src/site/en/xdoc/8.0/config/tokenizer.xml
new file mode 100644
index 000000000..b3d65c8e8
--- /dev/null
+++ b/src/site/en/xdoc/8.0/config/tokenizer.xml
@@ -0,0 +1,47 @@
+
+
+
+ Settings for the index string extraction
+ Sone, Takaaki
+
+
+
+
+
When creating a search index, documents must be split into tokens before they can be registered in the index. The tokenizer is used for this.
+
Basically, a search will not hit in units smaller than those carved out by the tokenizer. For example, suppose the sentence 'I live in Tokyo.' is split by the tokenizer into 'Tokyo' and 'live'. A search for the word 'Tokyo' then hits, but a search for a fragment of a token, such as 'kyo', does not. The selection of the tokenizer is therefore important.
+
Fess uses CJKTokenizer by default; you can change the tokenizer by editing the analyzer part of schema.xml.
+
+
+
CJKTokenizer indexes multibyte strings such as Japanese as bi-grams, that is, in units of two characters. In this case, words consisting of a single character cannot be found.
+
+
+
+
StandardTokenizer indexes multibyte strings such as Japanese as uni-grams, that is, one character at a time. Searches therefore miss fewer results, and single-character queries that CJKTokenizer cannot handle become searchable. Note, however, that the index size increases.
+
The following example changes the analyzer section of solr/core1/conf/schema.xml so that StandardTokenizer is used.
+
+
+
+
+
+
+ :
+
+
+
+
+ :
+]]>
+
Also, in webapps/fess/WEB-INF/classes/app.dicon, change useBigram, which is enabled by default, to false.
+
+ true
+ :
+]]>
+
Restart Fess afterwards.
+
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/config/use-libreoffice.xml b/src/site/en/xdoc/8.0/config/use-libreoffice.xml
new file mode 100644
index 000000000..edb25c54f
--- /dev/null
+++ b/src/site/en/xdoc/8.0/config/use-libreoffice.xml
@@ -0,0 +1,85 @@
+
+
+
+ Use of LibreOffice
+ Shinsuke Sugaya
+
+
+
+
+ In the standard Fess environment, MS Office documents are crawled using Apache POI.
+ If you use LibreOffice or OpenOffice to crawl Office documents instead, text can be extracted from the documents even more accurately.
+
+
+
Install JodConverter on the Fess server: download jodconverter-core-3.0-beta-4-dist.zip from http://jodconverter.googlecode.com/ and copy the expanded jar files to the Fess server.
+
+
Next, create s2robot_extractor.dicon.
+
+
In s2robot_extractor.dicon, enable jodExtractor with the following contents.
After these settings, generate the index by crawling as usual.
+
+
+
diff --git a/src/site/en/xdoc/8.0/config/windows-service.xml b/src/site/en/xdoc/8.0/config/windows-service.xml
new file mode 100644
index 000000000..5405ec93a
--- /dev/null
+++ b/src/site/en/xdoc/8.0/config/windows-service.xml
@@ -0,0 +1,54 @@
+
+
+
+ Register for the Windows service
+ Shinsuke Sugaya
+
+
+
+
In a Windows environment, Fess can be registered as a Windows service. The registration procedure is similar to Tomcat's.
+
+
+Note that when Fess is registered as a Windows service, the crawl process reads the Windows system environment variables: register JAVA_HOME as a system environment variable, and add %JAVA_HOME%\bin to Path.
+
+
+
Edit webapps\fess\WEB-INF\classes\fess.dicon and remove the -server option.
First, after installing Fess, run service.bat from the command prompt (on Windows Vista and later you must launch it as administrator). In this example Fess is installed in C:\Java\fess-server-8.0.0.
+ cd C:\Java\fess-server-8.0.0\bin
+> service.bat install fess
+...
+The service 'fess' has been installed.
+]]>
+
+
+
You can review the service properties for Fess as follows. Running the command below opens the Tomcat properties window.
+ tomcat7w.exe //ES//fess
+]]>
+
+
+
Open Control Panel - Administrative Tools - Services, where you can configure automatic startup in the same way as for normal Windows services.
+
+
+
+
+
+The Tomcat bundled with Fess is built as 64-bit Windows binaries. If you use 32-bit Windows, download the 32-bit Windows zip from the Tomcat site and replace tomcat7.exe, tomcat7w.exe, and tcnative-1.dll.
+If you need commercial support, maintenance, or technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the effectiveness of any third-party web sites described in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for the content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess project assumes no responsibility or obligation for any damage or loss caused, or alleged to be caused, by or in connection with the use of or reliance on any such content, advertising, products, services, or other materials.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and suggestions from readers.
+
+Expand the downloaded fess-server-x.y.zip.
+If you installed in a UNIX environment, add execute permission to the scripts in bin.
+
+
+
+
+The administrator account is managed by the application server. Fess uses Tomcat as standard, so user management is the same as for Tomcat.
+To change the password of the admin account, modify conf/tomcat-users.xml.
+
+]]>
+
+To manage tomcat-users.xml by a method other than the file, see the Tomcat documentation or the JAAS authentication specification.
+
+
+
+
+A password is required for access from the Fess server to Solr.
+Change the default password in production environments.
+
+To change the password, first change the password attribute of the solradmin user in conf/tomcat-users.xml.
+
+
+]]>
+
+Then set the same password as in tomcat-users.xml in the following part of webapps/fess/WEB-INF/classes/solrlib.dicon.
+
+
+ "solradmin"
+ "solradmin"
+
+]]>
+
+
+
+A password is also needed for the manager context deployed on Tomcat, which lets the Fess server manage Solr.
+Change the default password in production environments.
+
+To change the password, modify the password attribute of the manager user in conf/tomcat-users.xml.
+
+
+
+]]>
+
+Then set the same password as in tomcat-users.xml in the following part of webapps/fess/WEB-INF/classes/app.dicon.
+
+Access http://localhost:8080/fess/ to confirm that it has started.
+
+
+
+The management UI is at http://localhost:8080/fess/admin.
+The default administrator user name / password is admin/admin.
+The administrator account is managed by the application server.
+The Fess management UI treats users authenticated by the application server with the fess role as administrators.
+
+
+
+To stop a running Fess, run the shutdown script.
+
+
+
+
+It may take a while for Fess to stop completely if a crawl or index creation is in progress.
+
+If you need commercial support, maintenance, or technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the effectiveness of any third-party web sites described in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for the content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess project assumes no responsibility or obligation for any damage or loss caused, or alleged to be caused, by or in connection with the use of or reliance on any such content, advertising, products, services, or other materials.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and suggestions from readers.
+
Use additional parameters when you want search criteria that are applied without being shown as explicit conditions on the screen. The additional values are retained even across paging.
+
+
If a screen is displayed and searched without explicit conditions, appending the additional values as hidden form fields (for example, in a search form) keeps the conditions in effect across searches and paging transitions.
Use AND search when you want to find documents that contain all of several search words. When multiple words are entered in the search box separated by spaces, an AND search is performed by default.
+
+
To use AND search explicitly, write AND between the search words. AND must be written in capital letters, with a space before and after it. The AND itself can be omitted.
+
For example, if you want to find documents that contain both term1 and term2, type the following in the search form.
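term1 AND term2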
Use boost search if you want to give priority to specific search terms. With boost search, search words can be weighted by importance.
+
+
To perform a boost search, specify the boost value (weight) after the search term in the format '^boost-value'.
+
For example, if you want to find pages that contain apples or oranges, preferring pages with 'apples', type the following in the search form.
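apples^100 oranges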
+
+
Specify an integer greater than 1 as the boost value.
+
+
+
+
diff --git a/src/site/en/xdoc/8.0/user/search-field.xml b/src/site/en/xdoc/8.0/user/search-field.xml
new file mode 100644
index 000000000..3c668053c
--- /dev/null
+++ b/src/site/en/xdoc/8.0/user/search-field.xml
@@ -0,0 +1,66 @@
+
+
+
+ Search by specifying a search field
+ Shinsuke Sugaya
+
+
+
+
In Fess, crawl results are saved in separate fields such as the title and the full text. You can search against any one of these fields, specifying criteria such as the document type or the size.
+
+
The following fields can be searched by default.
+
+
List of available fields
+
+
+
Field name
+
Description
+
+
+
URL
+
The crawled URL
+
+
+
host
+
The host name of the crawled URL
+
+
+
site
+
The site name of the crawled URL
+
+
+
title
+
Title
+
+
+
content
+
Text
+
+
+
contentLength
+
The size of the crawled content
+
+
+
lastModified
+
The last modified time of the crawled content
+
+
+
mimetype
+
The MIME type of the content
+
+
+
+
If you do not specify a field, the content field is searched. Custom fields are also available by using Solr dynamic fields.
+
For an HTML file, the string in the title tag is registered in the title field, and the text under the body tag is registered in the content field.
+
+
+
To search a specific field, fill in the search form with the field name and the search word separated by a colon (:), as in field-name:search-word.
+
For example, to search for Fess in the title field, type the following.
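title:Fess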
+
+
With the above search, documents containing Fess in the title field appear in the results.
Fuzzy search is available for cases where the search word does not match exactly. Fess supports fuzzy search based on the Levenshtein distance.
+
+
Add '~' after the search word to which you want to apply fuzzy search.
+
For example, to find documents that contain words close to 'Solr' (such as 'Solar'), type the following in the search form.
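Solr~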
+
+
+
Furthermore, you can refine the match by adding a number between 0 and 1 after '~'; the closer to 1, the stricter the match. For example, write 'Solr~0.8'. If no number is given, the default value 0.5 is used.
Geo search uses location information at search time. It becomes possible by adding latitude and longitude information to each document when generating the index.
+
+
The following request parameters are available as standard.
+
+
Request parameter
+
+
+
geo.latitude
+
Specifies the latitude. Degrees are given as a double value.
Label search narrows results by label information attached to documents, like categories; specify a label when searching. Registering labels in the administration screen enables label search on the search screen. Available labels can be selected, with multiple selection, in a drop-down when searching. If no labels are registered, the label drop-down box is not displayed.
+
+
You can select label information at search time, in the search options dialog that appears when you press the options button.
+
+
Labels set on documents when the index is created can then be searched as labels. A search that does not specify a label returns all results as usual. If you change label information, update the index.
Options search lets you pass arbitrary search criteria, which makes migration from third-party search engines easy. Implement the handling of the passed criteria in QueryHelperImpl#buildOptionQuery.
+
+
The following request parameters are available as standard.
+
+
Request parameter
+
+
+
options.q
+
Similar to a normal query. Multiple options.q parameters can be specified; if several are given, they are combined as an AND search. Pass the value URL-encoded.
+
+
+
options.cq
+
Treated as an exact-match (phrase) search query. For example, specifying Fess Project searches for "Fess Project". Pass the value URL-encoded.
+
+
+
options.oq
+
Treated as an OR search. For example, specifying Fess Project searches for Fess OR Project. Pass the value URL-encoded.
+
+
+
options.nq
+
The label value; used to specify a label.
+
Treated as a NOT search. For example, specifying Fess searches for NOT Fess. Pass the value URL-encoded.
Use OR search if you want to find documents that contain any of the search terms. When multiple words are entered in the search box, an AND search is performed by default.
+
+
To use OR search, write OR between the search words. OR must be written in capital letters, with a space before and after it.
+
For example, if you want to find documents that contain either term1 or term2, type the following in the search form.
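term1 OR term2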
Sort search sorts the search results by a field specified at search time.
+
+
The following fields can be used for sorting by default.
+
+
List of sort fields
+
+
+
Field name
+
Description
+
+
+
tstamp
+
The crawl time
+
+
+
contentLength
+
The size of the crawled content
+
+
+
lastModified
+
The last modified time of the crawled content
+
+
+
+
Custom fields can be added as sort targets by customizing Fess.
+
+
+
You can select the sort criteria at search time, in the search options dialog that appears when you press the options button.
+
+
You can also sort from the search form by entering sort: followed by the field name, separated by a colon (:), together with the search words.
+
For example, to search for Fess and sort the results by content size in ascending order, type the following (ascending order is specified with the .asc suffix).
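Fess sort:contentLength.asc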
+
+
To sort in descending order, use the .desc suffix.
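Fess sort:contentLength.desc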
+
+
To sort by multiple fields, separate them with commas, as shown below.
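Fess sort:contentLength.asc,lastModified.desc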
You can use single-character and multiple-character wildcards within search terms. ? is the single-character wildcard and * is the multiple-character wildcard. A wildcard cannot be used as the first character. Wildcards apply to words; they cannot be used in phrase searches.
+
+
To use the single-character wildcard, use ? as shown below.
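te?t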
+
+
The above is treated as a single-character wildcard and matches words such as text or test.
+
To use the multiple-character wildcard, use * as shown below.
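test*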
+
+
The above is treated as a multiple-character wildcard and matches words such as test, tests, or tester.
+
+
A wildcard such as te*t can also be used within a search term.
+
+
+
Wildcards operate on the strings stored in the index. Therefore, if the index was built with bi-grams, Japanese text is indexed as fixed-length two-character strings, and wildcards will not behave as expected for Japanese. For fields where you want to use wildcards with Japanese, index them with morphological analysis.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/admin/browserType-guide.xml b/src/site/en/xdoc/9.0/admin/browserType-guide.xml
new file mode 100644
index 000000000..a5d112263
--- /dev/null
+++ b/src/site/en/xdoc/9.0/admin/browserType-guide.xml
@@ -0,0 +1,23 @@
+
+
+
+ Setting the browser type
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to browser types. Browser type information can be attached to the data in search results, so that search results can be separated by the type of browser used for browsing.
+
+
+
+
After logging in with an administrator account, click Browser Types in the menu.
+
+
+
+
+
+
You can set the display name and value. Use this when you want to add new device types. No customization is needed unless required.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/admin/crawl-guide.xml b/src/site/en/xdoc/9.0/admin/crawl-guide.xml
new file mode 100644
index 000000000..eaa445843
--- /dev/null
+++ b/src/site/en/xdoc/9.0/admin/crawl-guide.xml
@@ -0,0 +1,93 @@
+
+
+
+ The General crawl settings
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to crawling.
+
+
+
+
After logging in with an administrator account, click General under Crawl in the menu.
+
+
+
+
+
+
When a user performs a search, the search terms are output to a log. Enable this if you want to collect search statistics.
+
+
+
Saves user information at search time, which makes it possible to identify users.
+
+
+
You can collect search results that users judged to be good. A voting link is shown for each result in the list screen, and pressing the link records the vote. The collected results can be reflected in the index at crawl time.
+
+
+
Appends the search terms to the search result links. This makes it possible to highlight the searched terms when a PDF is displayed.
+
+
+
Search results can be retrieved in XML format by accessing http://localhost:8080/fess/xml?query=search-term.
+
+
+
Search results are also available in JSON format by accessing http://localhost:8080/fess/json?query=search-term.
+
+
+
If a website built for PCs appears in search results on a mobile device, it may not display correctly. By selecting a mobile conversion, PC sites can be converted for display on mobile devices. If you choose Google, the Google Wireless Transcoder is used: when browsing search results on a mobile device, the result links pass through the Google Wireless Transcoder, so PC sites can be viewed smoothly in mobile search.
+
+
+
You can specify the label to apply by default when no label is selected at search time. Specify the value of the label.
+
+
+
You can specify which search screens to provide. If you select Web, the mobile search screen is not available. If you select Unavailable, no search screen is provided; select this, for example, when building a dedicated index server.
+
+
+
Popular search words can be retrieved in JSON format by accessing http://localhost:8080/fess/json?type=hotsearchword.
+
+
+
Deletes search logs older than the specified number of days. Old logs are removed by the log purge that runs once a day.
+
+
+
Deletes job logs older than the specified number of days. Old logs are removed by the log purge that runs once a day.
+
+
+
Deletes user information older than the specified number of days. Old data is removed by the log purge that runs once a day.
+
+
+
Specifies, separated by commas (,), the names of bots whose logs should be removed from the search log, matched against the user agent. The logs are deleted by the log purge once a day.
+
+
+
Specifies the email address to which a report is sent when a crawl completes.
+
+
+
Specifies the encoding of the CSV files used for backup and restore.
+
+
+
Enables incremental crawling, in which only updated documents are crawled, by comparing the lastModified field value in the index with the target document's modification date (the LAST_MODIFIED value for HTTP, the file timestamp for files).
+
+
+
Adds the file's group access permission information to the roles.
+
+
+
Fess can manage multiple Solr servers combined into a group, and can manage multiple groups. Separate groups are used for updates and for searches. For example, with two groups, group 2 may be used for updates while group 1 serves searches. After a crawl completes, the groups are switched: group 1 receives updates and group 2 serves searches. This setting is only valid when multiple Solr server groups are registered.
+
+
+
Fess sends documents to Solr in batches of 10. A commit is issued to Solr every time the number of documents specified here has been sent. If 0, the commit is performed after the crawl completes.
+
+
+
Fess crawls documents by Web crawling and file system crawling. Up to the number of crawl configurations specified here can run simultaneously. For example, if the number of simultaneous crawls is 3 and Web crawl configurations 1 through 10 are defined, configurations 1 through 3 run first; when one of them completes, crawl configuration 4 starts, and so on until configuration 10 completes.
+
The number of threads can be specified per crawl configuration; the value here is the number of crawl configurations that run simultaneously, not the number of threads. For example, with 3 simultaneous crawls and 5 threads per crawl configuration, up to 3 x 5 = 15 threads crawl at the same time.
+
+
+
You can automatically delete data after it has been indexed. If you select 5 days, documents registered in the index more than 5 days ago and not updated since are removed. Use this, for example, to expire documents whose content has been deleted.
+
+
+
Registered failure URLs are excluded from the next crawl once they exceed the failure count. By specifying here failure types that do not need to be monitored, the corresponding URLs remain crawl targets next time.
+
+
+
Failure URLs that exceed the specified number of failures are excluded from crawling.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/admin/crawlingSession-guide.xml b/src/site/en/xdoc/9.0/admin/crawlingSession-guide.xml
new file mode 100644
index 000000000..d1658c22c
--- /dev/null
+++ b/src/site/en/xdoc/9.0/admin/crawlingSession-guide.xml
@@ -0,0 +1,27 @@
+
+
+
+ Set session information
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to session information. The results of one crawl run are saved as one session information record, in which you can check the execution time and the number of indexed documents.
+
+
+
+
After logging in with an administrator account, click Session Information in the menu.
+
+
+
+
+
+
Clicking the Delete All link removes all session information that is not currently running. Expired sessions are removed at the next crawl.
+
+
+
You can check the crawl contents for a session ID. The crawl start and finish times and the number of indexed documents are listed.
Here we describe how to back up and restore Fess information.
+
+
+
+
After logging in with an administrator account, click Backup/Restore in the menu.
+
+
+
+
Click the download link to output the Fess settings information in XML format. The saved settings information is listed below.
+
+
The General crawl settings
+
Web crawl settings
+
File system Crawl settings
+
Datastore crawl settings
+
Label
+
Path mapping
+
Web authentication
+
File system authentication
+
Request header
+
Duplicate host
+
Role
+
Browser types
+
+
Session information, search logs, and click logs are available in CSV format.
+
The Solr index data and data in the middle of being crawled are not backed up. Those data can be regenerated by crawling again after restoring the Fess settings information. If you need to back up the Solr index, back up the solr directory.
+
+
+
You can restore settings information and various logs by uploading the XML or CSV files output by backup. Specify the file and click the Restore button.
+
If overwriting is enabled when restoring XML settings information, existing data is updated when the same data already exists.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/admin/dataCrawlingConfig-guide.xml b/src/site/en/xdoc/9.0/admin/dataCrawlingConfig-guide.xml
new file mode 100644
index 000000000..32fd2d487
--- /dev/null
+++ b/src/site/en/xdoc/9.0/admin/dataCrawlingConfig-guide.xml
@@ -0,0 +1,159 @@
+
+
+
+ Settings for crawling the data store
+ Sone, Takaaki
+ Shinsuke Sugaya
+
+
+
+
Fess can crawl data sources such as databases and CSV files. Here we describe the settings required for the data store.
+
+
+
+
After logging in with an administrator account, click Data Store in the menu.
+
+
As an example, we crawl a table in a MySQL database named testdb, connecting with user name hoge and password fuga.
+
+
Assume the table contains data like the following.
+
+
+
+
+
+
An example of the parameter settings looks like the following.
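A sketch for the MySQL example above (the table name doc is an assumption; adjust to your schema):
driver=com.mysql.jdbc.Driver
url=jdbc:mysql://localhost:3306/testdb?useUnicode=true&characterEncoding=UTF-8
username=hoge
password=fuga
sql=select * from doc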
+
+
Parameter is a "key = value" format. Description of the key is as follows.
+
+
Parameters for a DB crawl
+
+
+
driver
+
Driver class name
+
+
+
URL
+
The connection URL
+
+
+
username
+
The user name used to connect to the DB
+
+
+
password
+
The password used to connect to the DB
+
+
+
SQL
+
The SQL statement that retrieves the data to crawl
+
+
+
+
+
+
An example of the script settings looks like the following.
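A sketch, assuming the table has id, title, and content columns (key names as in the table below; adjust to your schema):
url="http://localhost/" + id
host="localhost"
site="localhost"
title=title
content=content
cache=content
digest=content
contentLength=content.length()
lastModified=new java.util.Date()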
+
+
+ Parameter is a "key = value" format.
+ Description of the key is as follows.
+
+ Side of the value written in OGNL. Close the string in double quotation marks.
+ Access in the database column name, its value.
+
+
Script settings
+
+
+
URL
+
The URL (the link shown in search results)
+
+
+
host
+
Host name
+
+
+
site
+
The site path
+
+
+
title
+
Title
+
+
+
content
+
The content (the string to index)
+
+
+
cache
+
Content cache (not indexed)
+
+
+
Digest
+
The digest shown in search results
+
+
+
anchor
+
The anchor links to the content (not usually required)
+
+
+
contentLength
+
The length of the content
+
+
+
lastModified
+
The last modified time of the content
+
+
+
+
+
+
A JDBC driver is needed to connect to the database. Place the driver jar file in webapps/fess/WEB-INF/cmd/lib.
+
+
+
To display item values such as latitude_s in the search results, set the following in webapps/fess/WEB-INF/classes/app.dicon, and then add ${doc.latitude_s} to searchResults.jsp.
Here we describe the settings for the design of the search screens.
+
+
+
+
After logging in with an administrator account, click Design in the menu.
+
+
You can edit the search screen in the screen below.
+
+
+
+
If you want to display the registered or modified date of crawled files in the search results, write the following in the search results page (content).
+
+]]>
+
tstampDate holds the crawl (registration) time and lastModifiedDate holds the document's modification date. Output date formats follow the fmt:formatDate specification.
+
+
+
+
+
Files used on the search screens can be downloaded and deleted here.
+
+
+
You can upload files to be used on the search screens. Supported file extensions are jpg, gif, png, css, and js.
+
+
+
Use this if you want to specify a file name for the uploaded file. If omitted, the name of the uploaded file is used.
+
+
+
You can edit the JSP files of the search screens. Pressing the Edit button of a JSP file lets you edit the current JSP file; pressing the default edit button lets you edit the JSP file as it was at installation time. Saving with the Update button on the edit screen applies the changes.
+
The following JSP files can be edited.
+
+
JSP files that can be edited
+
+
+
Top page (frame)
+
The JSP file for the search top page. This JSP includes the JSP files of each part.
+
+
+
Header
+
The JSP file for the header part.
+
+
+
Footer
+
The JSP file for the footer part.
+
+
+
Search results pages (frames)
+
The JSP file for the search results list page. This JSP includes the JSP files of each part.
+
+
+
Search results pages (content)
+
The JSP file for the results portion of the search results list page, used when there are search results. Change this if you want to customize how search results are rendered.
+
+
+
Search results page (result no)
+
The JSP file for the results portion of the search results list page, used when there are no search results.
+
+
+
Help pages (frames)
+
The JSP file for the help page.
+
+
+
Search error page
+
The JSP file for the search error page. Change this if you want to customize how search errors are rendered.
+
+
+
Mobile home (frames)
+
The JSP file for the mobile top page. This JSP includes the JSP files of each part.
+
+
+
Mobile home (within the Head tags)
+
The JSP file for the contents of the head tag of the mobile top page. Change this to edit meta tags, title tags, script tags, and so on.
+
+
+
Mobile home (content)
+
The JSP file for the contents of the body tag of the mobile top page.
+
+
+
Portable search results pages (frames)
+
The JSP file for the mobile search results page. This JSP includes the JSP files of each part.
+
+
+
Portable search results page (within the Head tags)
+
The JSP file for the contents of the head tag of the mobile search results page. Change this to edit meta tags, title tags, script tags, and so on.
+
+
+
Portable search results page (header)
+
The JSP file for the header part of the mobile search results page. It contains the search form at the top.
+
+
+
Portable search results page (footer)
+
The JSP file for the footer part of the mobile search results page. It contains the copyright notice at the bottom.
+
+
+
Portable search results pages (content)
+
The JSP file for the results portion of the mobile search results page, used when there are search results. Change this if you want to customize how search results are rendered.
+
+
+
Portable search results page (result no)
+
The JSP file for the results portion of the mobile search results page, used when there are no search results.
+
+
+
File boot page
+
The JSP file for the file launcher page. This screen is used to open file system crawl results when displaying with the Java plug-in is enabled.
+
+
+
Error page (header)
+
The JSP file for the header part of the error pages.
+
+
+
Error page (footer)
+
The JSP file for the footer part of the error pages.
+
+
+
Error page (page not found)
+
The JSP file for the error page displayed when a page cannot be found.
+
+
+
Error (System error)
+
The JSP file for the error page displayed on a system error.
+
+
+
Error pages (redirects)
+
This is the JSP error page displayed when an HTTP redirect occurs.
+
+
+
Error (bad request)
+
The JSP file for the error page displayed on a bad request.
+
+
+
+
The mobile screens can be edited in the same way as the PC screens.
After logging in with an administrator account, click Dictionary in the menu. The dictionaries available for editing are listed.
+
+
+
+
+
You can register names, nouns, and terminology. Click the path of a registered user dictionary to display its word list.
+
+
Click the word you want to edit to display the edit screen.
+
+
+
Enter the word to register.
+
+
+
By registering a compound word split into its parts, searches for the component words also hit. For example, by registering 'full-text search engine' split into 'full-text', 'search', and 'engine', a search for any of these words will hit.
+
+
+
Enter the reading of the word in katakana.
+ When the word is split, enter the reading split in the same way; for example, enter the reading of each part of 'full-text search engine'.
+
+
+
Enter the part of speech of the word.
+
+
+
+
You can register words with the same meaning (GB, gigabyte, and so on). Click the path of a registered synonym dictionary to display its word list.
+
+
Click the word you want to edit to display the edit screen.
+
+
+
Enter the source word to be treated as a synonym.
+
+
+
The source word is expanded into the words entered here. For example, if you want 'TV' to be expanded into 'TV' and 'television', enter 'TV' as the source word and 'TV' and 'television' as the converted words.
Here we describe the popular URL log. When a user clicks the voting link on the search screen, the URL is registered in the popular URL log as a favorite link. You can disable this feature in the general crawl settings.
+
+
+
+
After logging in with an administrator account, click Popular URL in the menu.
+
+
+
+
The popular URLs are listed.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/admin/fileAuthentication-guide.xml b/src/site/en/xdoc/9.0/admin/fileAuthentication-guide.xml
new file mode 100644
index 000000000..c027a47b5
--- /dev/null
+++ b/src/site/en/xdoc/9.0/admin/fileAuthentication-guide.xml
@@ -0,0 +1,44 @@
+
+
+
+ Settings for file system authentication
+ Shinsuke Sugaya
+
+
+
+
Here we describe how to configure the file system authentication required when crawling file systems. Fess supports crawling Windows shared folders.
+
+
+
+
After logging in with an administrator account, click File System Authentication in the menu.
+
+
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the settings apply to any host name in the specified file system crawl configuration.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the settings apply to any port in the specified file system crawl configuration.
+
+
+
Selects the authentication method. SAMBA (Windows shared folder authentication) is available.
+
+
+
Specifies the user name for authentication.
+
+
+
Specifies the password for the authentication site.
+
+
+
Sets additional parameters required to log in to the authentication site. For SAMBA, the domain value can be set, written as follows.
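domain=[domain-name]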
+
+
+
+
Selects the file system crawl configuration to which the above authentication settings apply. The file system crawl configuration must be registered beforehand.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/admin/fileCrawlingConfig-guide.xml b/src/site/en/xdoc/9.0/admin/fileCrawlingConfig-guide.xml
new file mode 100644
index 000000000..d3cb3cc66
--- /dev/null
+++ b/src/site/en/xdoc/9.0/admin/fileCrawlingConfig-guide.xml
@@ -0,0 +1,106 @@
+
+
+
+ Settings for file system crawling
+ Shinsuke Sugaya
+
+
+
+
Here we describe the settings for crawling using the file system.
+
If you want to index more than 100,000 documents with Fess, we recommend splitting them across crawl configurations of a few tens of thousands of documents each. Indexing performance degrades when one crawl configuration targets more than 100,000 documents.
+
+
+
+
After logging in with an administrator account, click File System in the menu.
+
+
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple paths, starting with file: or smb:. For example, a path such as the following.
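file:/home/taro/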
+
+
Everything below the specified directory is crawled.
+
In a Windows environment, the path must be written as a URI; for example, specify the path c:\Documents\taro as file:/c:/Documents/taro.
+
For a Windows shared folder, for example to crawl the share folder on host1, set the crawl path to smb://host1/share/ (ending with /). If the shared folder requires authentication, set the authentication information on the file system authentication screen.
+
+
+
By specifying regular expressions, you can include or exclude given path patterns from crawling and searching.
+
+
Path filter list
+
+
+
Path to crawl
+
Paths matching the specified regular expression are crawled.
+
+
+
The path to exclude from being crawled
+
Paths matching the specified regular expression are not crawled. Exclusion wins here even for paths specified as crawl targets.
+
+
+
Path to be searched
+
Paths matching the specified regular expression are included in search results. Even paths excluded from search become searchable if specified here.
+
+
+
Path to exclude from searches
+
Paths matching the specified regular expression are not searchable. Excluding a path from crawling makes all links beneath it unreachable, but excluding it only from search still crawls it, so you can remove just part of the content from search.
+
+
+
+
For example, to crawl only paths under /home/, specify the following as the path to crawl.
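file:/home/.*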
+
+
To exclude paths with the png extension, specify the following as the path to exclude.
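.*\.png$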
+
+
Multiple patterns can be specified, one per line.
+
Paths are handled as java.io.File URIs, specified as follows.
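A sketch of the conversion (representative examples):
file:/home/taro
c:\memo.txt -> file:/c:/memo.txt
\\server\memo.txt -> file:////server/memo.txt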
You can specify the crawl configuration information.
+
+
+
Specifies the depth of the directory hierarchy to crawl.
+
+
+
Specifies the maximum number of documents to retrieve in a crawl.
+
+
+
Specifies the number of crawler threads. A value of 5 crawls with 5 threads simultaneously.
+
+
+
The interval between document retrievals, in milliseconds. With one thread and a value of 5000, a document is fetched every 5 seconds.
+
With 5 threads and an interval of 1000 milliseconds, up to 5 documents are fetched per second.
+
+
+
Sets the weight given in search results to URLs from this crawl configuration. Use it when you want these results displayed above others. The default is 1; the higher the value, the higher the results are ranked. To always rank these results above all others, specify a sufficiently large value such as 10000.
+
Valid values are integers greater than 0. The value is used as the boost value when documents are added to Solr.
+
+
+
The crawled documents are registered with the selected browser types. If only PC is selected, the documents do not appear in search results on mobile devices. Use this also if you want documents shown only on specific mobile devices.
+
+
+
You can make documents appear in search results only for users with a particular role. Roles must be registered beforehand. This is useful, for example, when you want to separate search results per user in systems that require login, such as portal servers.
+
+
+
You can attach labels to the search results. When labels are enabled, they can be specified on the search screen to search per label.
+
+
+
Set to Enabled for the configuration to be crawled at crawl time. Use Disabled if you want to skip crawling temporarily.
+If you need commercial support, maintenance, or technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess Project assumes no responsibility for the validity of third-party websites described in this document.
+The Fess Project assumes no responsibility, obligation, or guarantee for any content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess Project assumes no responsibility or liability for any damage or loss caused, or alleged to be caused, by or in connection with the use of, or reliance on, any content, advertising, products, services, or other materials available on or through such sites or resources.
+
+
+
+The Fess Project is committed to improving this document and welcomes comments and suggestions from readers.
+
After logging in with an administrator account, click Job Log in the menu.
+
+
+
+
The job execution logs are listed. You can check the job name, status, and start and finish times. You can also select an entry to check the details of each log.
+
+
+
You can check the contents of a job log. The job name, status, start and completion times, and results are displayed.
Here we describe the settings for labels. Labels classify the documents that appear in search results and are selected in the crawl configurations. Even without setting them in a crawl configuration, you can attach labels by specifying regular expressions in the label settings. Registered labels are shown in a select drop-down box to the right of the search box.
+
+
+
+
After logging in with an administrator account, click Label in the menu.
+
+
+
+
+
+
+
Specifies the name shown in the label drop-down at search time.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must consist of alphanumeric characters.
+
+
+
Sets the paths to label, as regular expressions. Multiple patterns can be specified, one per line. Documents whose paths match are labeled regardless of the crawl configuration.
+
+
+
Sets the paths to exclude from labeling, as regular expressions. Multiple patterns can be specified, one per line.
Here we describe the settings for duplicate hosts. Use them when different host names should be treated as the same host while crawling, for example when www.example.com and example.com serve the same site.
+
+
+
+
After logging in with an administrator account, click Duplicate Host in the menu.
+
+
+
+
+
+
+
Specifies the canonical host name. Duplicate host names are replaced by the canonical host name.
+
+
+
Specifies the duplicated host name, that is, the host name to be replaced.
Here we describe the settings for path mapping. Use path mapping when you want to replace the links that appear in search results.
+
+
+
+
After logging in with an administrator account, click Path Mapping in the menu.
+
+
+
+
+
+
+
Path mapping replaces the parts of a path that match the specified regular expression with the replacement string. When crawling a local file system, the search result links may not be valid in the users' environment; in such cases you can adjust the links with path mapping. Multiple path mappings can be specified.
Here we describe the request header settings. The request header feature adds header information to the requests issued when crawling documents. It is useful, for example, with authentication systems that log users in automatically when certain header values are present.
+
+
+
+
After logging in with an administrator account, click Request Header in the menu.
+
+
+
+
+
+
+
Specifies the request header name to append to the request.
+
+
+
Specifies the request header value to append to the request.
+
+
+
Selects the Web crawl configuration to which the request header is added. The header is appended only to requests for the selected crawl configuration.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/admin/roleType-guide.xml b/src/site/en/xdoc/9.0/admin/roleType-guide.xml
new file mode 100644
index 000000000..6a022b1d2
--- /dev/null
+++ b/src/site/en/xdoc/9.0/admin/roleType-guide.xml
@@ -0,0 +1,27 @@
+
+
+
+ Settings for a role
+ Shinsuke Sugaya
+
+
+
+
Here we describe the settings for roles. Roles are selected in the crawl configurations and classify the documents that appear in search results. For how to use them, see the role-based search settings.
+
+
+
+
After logging in with an administrator account, click Role in the menu.
+
+
+
+
+
+
+
Specifies the name that appears in the list.
+
+
+
Specifies the identifier used when classifying documents. This value is sent to Solr and must consist of alphanumeric characters.
After logging in with an administrator account, click Job Management in the menu.
+
+
+
+
+
+
+
The name that appears in the list.
+
+
+
Used as an identifier to decide whether the job is targeted when job commands are run directly from a batch or similar. For crawl command execution, specify 'all'.
+
+
+
Configures the schedule. The job script runs on the schedule set here.
+
The format is Cron-like: "seconds minutes hours day month day-of-week year (optional)". For example, "0 0 12 ? * WED" runs the job every Wednesday at 12:00 pm. For finer control of the format, see Quartz.
+
+
+
Specifies the script execution environment. Currently only 'groovy' is supported.
+
+
+
Describes how to run the job, written in the specified script language.
+
+
+
Enable this to record the run in the job log.
+
+
+
If enabled, the job is treated as a crawler job and is subject to the system's crawl start and stop operations.
+
+
+
Specifies whether the job is enabled or disabled. A disabled job does not run.
After logging in with an administrator account, click Search in the menu.
+
+
+
+
You can search with the criteria you specify. On the regular search screen, role and browser conditions are added implicitly, but this management search does not add them. From the search results you can remove a given document from the index.
Here we describe the search log. When users search on the search screen, the searches are logged. The search term and date are recorded; the URLs of the search results that users follow can also be recorded.
+
+
+
+
After logging in with an administrator account, click Search Log in the menu.
+
+
+
+
The search terms and dates are listed. You can click an entry to review its details, including the URL.
Here we describe the settings related to Solr: the servers registered in Fess for crawling and searching. Solr servers are registered in groups, defined in configuration files.
+
+
+
+
After logging in with an administrator account, click System Settings in the menu.
+
+
+
+
+
+
The update server appears as running while documents are being added, for example. The crawl process displays its session ID while running. You can shut down the Fess server safely when nothing is running; if you shut down Fess while a crawl is running, wait for the crawl process to finish.
+
You can start a crawl manually with the crawl start button, and stop it with the stop button if one is running.
+
+
+
The server group names available for search and update are shown.
+
+
+
Fess manages a server state and an index state for each Solr server. The server state manages whether the Solr server can be accessed; the index state manages whether crawling and indexing completed successfully. Search can use any server whose server state is enabled, regardless of the index state. Crawling uses servers whose server state is enabled and whose index state is ready or completed. Starting a crawl manually changes the index state to preparing automatically. If auto-recovery is enabled, failed servers return to the enabled state when they recover.
+
+
+
You can check the state of each Solr server instance, and request start, stop, and reload for each instance.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/admin/systemInfo-guide.xml b/src/site/en/xdoc/9.0/admin/systemInfo-guide.xml
new file mode 100644
index 000000000..4c6772054
--- /dev/null
+++ b/src/site/en/xdoc/9.0/admin/systemInfo-guide.xml
@@ -0,0 +1,32 @@
+
+
+
+ System information
+ Shinsuke Sugaya
+
+
+
+
Here you can check the current property information, such as system environment variables.
+
+
+
+
After logging in with an administrator account, click System Information in the menu.
+
+
+
+
+
+
The server's environment variables are listed.
+
+
+
The system properties of Fess are listed.
+
+
+
The Fess installation information is shown.
+
+
+
A list of properties to attach when reporting a bug. Only values containing no personal information are extracted.
Here we describe the user log. The user log identifies users when they search on the search screen, and can be used together with the search log and popular URL information. You can disable this feature in the general crawl settings.
+
+
+
+
After logging in with an administrator account, click User in the menu.
+
+
+
+
The user IDs are listed. You can follow the search log or popular URL links to see the logs for each user.
Here we describe the Web authentication settings required for Web crawling. Fess supports crawling sites protected by BASIC authentication and DIGEST authentication.
+
+
+
+
After logging in with an administrator account, click Web Authentication in the menu.
+
+
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the settings apply to any host name in the specified Web crawl configuration.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; in that case the settings apply to any port in the specified Web crawl configuration.
+
+
+
Specifies the realm name of the site that requires authentication. If omitted, the settings apply to any realm name in the specified Web crawl configuration.
+
+
+
Selects the authentication method. BASIC authentication, DIGEST authentication, or NTLM authentication is available.
+
+
+
Specifies the user name for authentication.
+
+
+
Specifies the password for the authentication site.
+
+
+
Sets additional parameters required to log in to the authentication site. For NTLM authentication, the workstation and domain values can be set, written as follows.
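workstation=[workstation-name]
domain=[domain-name]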
+
+
+
+
Selects the Web crawl configuration to which the above authentication settings apply. The Web crawl configuration must be registered in advance.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/admin/webCrawlingConfig-guide.xml b/src/site/en/xdoc/9.0/admin/webCrawlingConfig-guide.xml
new file mode 100644
index 000000000..5dc15b77e
--- /dev/null
+++ b/src/site/en/xdoc/9.0/admin/webCrawlingConfig-guide.xml
@@ -0,0 +1,107 @@
+
+
+
+ Settings for crawling Web site
+ Shinsuke Sugaya
+
+
+
+
Here we describe the settings for crawling websites.
+
If you want to index more than 100,000 documents with Fess, we recommend splitting them across crawl configurations of a few tens of thousands of documents each. Indexing performance degrades when one crawl configuration targets more than 100,000 documents.
+
+
+
+
After logging in with an administrator account, click Web in the menu.
+
+
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple URLs, starting with http: or https:. For example, URLs such as the following.
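http://localhost/
http://localhost:8080/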
+
+
Specify URLs in this form.
+
+
+
By specifying regular expressions, you can include or exclude specific URL patterns from crawling and searching.
+
+
URL filter list
+
+
+
URL to crawl
+
URLs matching the specified regular expression are crawled.
+
+
+
Excluded from the crawl URL
+
URLs matching the specified regular expression are not crawled. Exclusion wins here even for URLs specified as crawl targets.
+
+
+
To search for URL
+
URLs matching the specified regular expression are included in search results. Even URLs excluded from search become searchable if specified here.
+
+
+
To exclude from the search URL
+
URLs matching the specified regular expression are not searchable. Excluding a URL from crawling makes all links beneath it unreachable, but excluding it only from search still crawls it, so you can remove just part of the content from search.
+
+
+
+
For example, to crawl only URLs under http://localhost/, specify the following as the URL to crawl.
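http://localhost/.*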
+
+
To exclude URLs with the png extension, specify the following as the URL to exclude.
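.*\.png$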
+
+
Multiple patterns can be specified, one per line.
+
+
+
You can specify the crawl configuration information.
+
+
+
Specifies how deep the crawler follows the links contained in the documents.
+
+
+
Specifies the maximum number of documents to retrieve in a crawl. If not specified, it defaults to 100,000 documents.
+
+
+
You can specify the user agent to use when crawling.
+
+
+
Specifies the number of crawler threads. A value of 5 crawls the website with 5 threads simultaneously.
+
+
+
The interval between document retrievals, in milliseconds. With one thread and a value of 5000, a document is fetched every 5 seconds.
+
With 5 threads and an interval of 1000 milliseconds, up to 5 documents are fetched per second. Set an adequate value when crawling an external website so as not to overload the Web server.
+
+
+
Sets the weight given in search results to URLs from this crawl configuration. Use it when you want these results displayed above others. The default is 1; the higher the value, the higher the results are ranked. To always rank these results above all others, specify a sufficiently large value such as 10000.
+
Valid values are integers greater than 0. The value is used as the boost value when documents are added to Solr.
+
+
+
The crawled documents are registered with the selected browser types. If only PC is selected, the documents do not appear in search results on mobile devices. Use this also if you want documents shown only on specific mobile devices.
+
+
+
You can make documents appear in search results only for users with a particular role. Roles must be registered beforehand. This is useful, for example, when you want to separate search results per user in systems that require login, such as portal servers.
+
+
+
You can attach labels to the search results. When labels are enabled, they can be specified on the search screen to search per label.
+
+
+
Set to Enabled for the configuration to be crawled at crawl time. Use Disabled if you want to skip crawling temporarily.
+
+
+
+
+
+ Fess crawls sitemap files and crawls the URLs defined in them, following the sitemaps specification at http://www.sitemaps.org/. Available formats are XML Sitemaps, XML Sitemaps Index, and text (one URL per line).
+
+ Specify the sitemap URL as a crawl URL. Fess cannot distinguish a sitemap from an ordinary XML or text file while crawling, so by default URLs whose file names match sitemap.*.xml, sitemap.*.gz, or sitemap.*txt are handled as sitemaps (this can be customized in webapps/fess/WEB-INF/classes/s2robot_rule.dicon).
+
+ URLs found by crawling a sitemap file are crawled at the next crawl, like links in an HTML file.
You can use the Settings Wizard to set up Fess easily.
+
+
+
+
After logging in with an administrator account, click Settings Wizard in the menu.
+
+
Configure the crawl settings.
 A crawl configuration registers the URI you want to search.
 Give the crawl configuration any name that is easy to identify, and enter the URI of the content you want to index and search.
+
+
For example, to search http://fess.codelibs.org/, the settings look like the following.
+
+
For a file system, type a path such as c:\Users\taro.
+
This completes the setup. Pressing the Start Crawling button starts the crawl right away; pressing the Finish button instead leaves the crawl to start at the time specified in the scheduling settings.
+
+
+
+
The settings made in the Settings Wizard can be changed later from the crawl General, Web, and File System pages.
Binaries are provided for use with H2 Database and with MySQL. To use another database, change the settings in the source code and build Fess.
+
+
+
+
Configure the MySQL character set. The following settings must be added to /etc/mysql/my.cnf.
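A typical setting (adjust to your environment):
[mysqld]
character-set-server = utf8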
+
+
+
+
Download the MySQL binaries and extract them.
+
+
+
Create a database.
+mysql> create database fess_db;
+mysql> grant all privileges on fess_db.* to fess_user@localhost identified by 'fess_pass';
+mysql> create database fess_robot;
+mysql> grant all privileges on fess_robot.* to s2robot@localhost identified by 's2robot';
+mysql> FLUSH PRIVILEGES;
+]]>
+
Create the tables in the database. The DDL files are located in extension/mysql.
You can specify the file size limits for the Fess crawl. By default, HTML files are handled up to 2.5 MB and other files up to 10 MB. To change the file size handling, edit webapps/fess/WEB-INF/classes/s2robot_contentlength.dicon. The standard s2robot_contentlength.dicon is as follows.
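A sketch of the relevant component (default values shown; verify the class name against the file shipped with your version):
<component name="contentLengthHelper" class="org.seasar.robot.helper.ContentLengthHelper" instance="singleton">
  <property name="defaultMaxLength">10485760</property><!-- 10 MB default -->
  <initMethod name="addMaxLength">
    <arg>"text/html"</arg><!-- content type -->
    <arg>2621440</arg><!-- 2.5 MB for HTML -->
  </initMethod>
</component>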
Change the value of defaultMaxLength to change the default limit. File size limits can be specified per content type; the example describes the maximum file size for handling text/html, that is, HTML files.
+
+Note the amount of heap memory required when increasing the maximum file size to handle. For how to configure it, see the memory-related settings.
Geo search uses documents registered with latitude and longitude location information, and can be used in conjunction with Google Maps and the like.
+
+
+
+
+ Location information is defined in the location field.
+ When generating the index, set the latitude and longitude in the location field of the Solr document in a format such as 45.17614,-93.87341, and register the document.
+ If you also want to display the latitude and longitude in search results, set the values in the latitude_s and longitude_s fields. *_s is available as a Solr string dynamic field.
+
+
+
+ At search time, specify the latitude, longitude, and distance in the request parameters.
+ Results within the specified distance (km) of the given latitude and longitude are displayed. Latitude, longitude, and distance are treated as double values.
The index data is managed by Solr. It cannot be backed up from the Fess administration screen when the index grows to gigabyte sizes or very large document counts.
+
+If you need to back up the index data, stop Fess and back up the solr/core1/data and solr/core1-suggest/data directories. To restore, put the backed-up index data back in place.
+If you need commercial support, maintenance, or technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess Project assumes no responsibility for the validity of third-party websites described in this document.
+The Fess Project assumes no responsibility, obligation, or guarantee for any content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess Project assumes no responsibility or liability for any damage or loss caused, or alleged to be caused, by or in connection with the use of, or reliance on, any content, advertising, products, services, or other materials available on or through such sites or resources.
+
+
+
+The Fess Project is committed to improving this document and welcomes comments and suggestions from readers.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/config/install-on-tomcat.xml b/src/site/en/xdoc/9.0/config/install-on-tomcat.xml
new file mode 100644
index 000000000..17ef0f1b6
--- /dev/null
+++ b/src/site/en/xdoc/9.0/config/install-on-tomcat.xml
@@ -0,0 +1,43 @@
+
+
+
+ Install to an existing Tomcat
+ Shinsuke Sugaya
+
+
+
+
+ The standard distribution of Fess ships with Fess already deployed on Tomcat.
+ Because Fess does not depend on Tomcat, it can be deployed on any Java application server.
+ Here we describe how to deploy Fess on an existing Tomcat.
+ Extract the downloaded Fess server.
+ Let $FESS_HOME be the extracted Fess server home directory,
+ and $TOMCAT_HOME the top directory of the existing Tomcat 7.
+ Copy the Fess server data.
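+ A sketch of the copy step (assuming the default layout; the exact set of directories to copy can differ by version):
+ cp -r $FESS_HOME/webapps/fess $TOMCAT_HOME/webapps/
+ cp -r $FESS_HOME/solr $TOMCAT_HOME/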
+
+
+ If you have changed any of the destination files, compare them with the diff command and apply only your changes.
+
+Java sets a maximum amount of memory per process. So even on a server with 8 GB of physical memory, a process will not use more than the configured maximum. Memory consumption also changes significantly with the number of crawl threads and the crawl interval. If memory is insufficient, change the settings as described in the following steps.
+
+
+
+Depending on the contents of the crawl settings, an OutOfMemory error like the following may occur.
+
+
+If it occurs, increase the maximum heap memory: change the -Xmx option in bin/setenv.[sh|bat] (for example, -Xmx1g sets the maximum to 1 GB).
+
+
+
+
+ The maximum memory of the crawler side can also be changed.
+ The default is 512 MB.
+
+ Uncomment crawlerJavaOptions in webapps/fess/WEB-INF/classes/fess.dicon and change -Xmx1g (in this case the maximum is set to 1 GB).
+
+The mobile device information is provided by ValueEngine Inc. To use the latest mobile device information, download the device profile, remove the _YYYY-MM-DD suffix from the file name, and save it under webapps/fess/WEB-INF/classes/device. Restart Fess to apply the change.
+ To search PDF files protected by passwords, you must register the passwords in the settings file.
+
+
+
+
+ First, create webapps/fess/WEB-INF/classes/s2robot_extractor.dicon.
+ This example sets the password 'pass' for PDF files whose names match test_*.pdf.
+ If you have multiple files, add multiple addPassword settings.
Fess applies stemming when indexing and searching.
+
+Stemming normalizes English words; for example, the words recharging and rechargable are both normalized to the form recharg. As a result, a search for recharging also hits the word rechargable, and fewer matches are missed.
+
+
+
+Because stemming is basic rule-based processing, unintended normalization may occur. For example, the word Maine (the state name) is normalized to main.
+
+In such cases, you can exclude a word such as Maine from stemming by adding it to protwords.txt.
Here we configure index replication using the Solr replication feature. By building two Fess servers, one for crawling and index creation and one for searching, you can distribute the load during indexing.
+
+
+
+
+ Download and install Fess on the host named MasterServer. Assume it is installed in /opt/fess_master. Edit solr/core1/conf/solrconfig.xml as follows.
After that, start Fess and register crawl configurations as in a normal setup. The steps for building the index are the same as the normal procedure.
+
+
+
+ Download and install Fess on the slave server. Assume it is installed in /opt/fess_slave. Edit solr/core1/conf/solrconfig.xml as follows.
Fess can partition search results according to the credentials of users authenticated by an arbitrary authentication system. For example, a document that appears with role A is shown in the search results of user A, who has role A, but not of user B, who does not. Using this feature in environments with user login, such as portals and single sign-on systems, search can be segmented by department or job title.
+
+Role-based search in Fess can obtain role information from the following sources:
+
+
Request parameter
+
Request header
+
Cookies
+
J2EE authentication information
+
+
+When Fess runs behind a portal or agent-based single sign-on system that stores authentication information in cookies, role information can be retrieved from cookies whose domain and path reach Fess. With a reverse proxy type single sign-on system, role information can be retrieved from authentication information added to request headers or request parameters on accesses to Fess.
+
+
+
+Here we describe how to set up role-based search using J2EE authentication information.
+
+
+Add the roles and users to conf/tomcat-users.xml. In this example, role-based search is performed with the role1 role, logging in as user role1.
+
+
+
+
+
+
+
+
+
+]]>
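+A minimal sketch of the entries (user name and role from this example; choose your own password):
+<role rolename="role1"/>
+<user username="role1" password="role1" roles="role1"/>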
+
+
+
+Set webapps/fess/WEB-INF/classes/fess.dicon as shown below.
+
+
+ {"guest"}
+
+ :
+]]>
+
+By setting defaultRoleList, you can assign role information to users who have no authentication information. Here, users who are not logged in are given the guest role, so documents that require other roles do not appear in their search results.
+
+
+
+Set webapps/fess/WEB-INF/web.xml as shown below.
Start Fess and log in as an administrator. From the Role menu, register a role with the name Role1 (any name) and the value role1. Then, in each crawl configuration you want to expose to users with role1, select Role1 as the role.
+
+
+
+Log out of the management screen and log in as the user role1. On a successful login you are redirected to the top of the search screen.
+
+Search as usual; only documents from crawl configurations with the Role1 role setting are displayed.
+
+Searches performed without logging in are executed as the guest user.
+
+
+
+When a logged-in user without the admin role accesses http://localhost:8080/fess/admin, the logout screen appears. Pressing the logout button logs the user out.
By default, Fess uses port 8080. To change the port, follow the steps below.
+
+
+Change the ports in conf/server.xml of the Tomcat in which Fess runs. The following ports are used:
+
+
8080: HTTP access port
+
8005: shut down port
+
8009: AJP port
+
8443: SSL HTTP access port (disabled by default)
+
19092: database port (used by H2 Database)
+
+
+
+
+If you change the Tomcat port, you may also need to change the Solr server information referenced by Fess, since the standard configuration runs Solr in the same Tomcat.
+
Change webapps/fess/WEB-INF/classes/app.dicon at the following point:
+ "http://localhost:8080/manager/text/"
+]]>
+
Change webapps/fess/WEB-INF/classes/solrlib.dicon at the following point:
+ "http://localhost:8080/solr/core1"
+]]>
+
Change solr/core1/conf/solrconfig.xml at the following point:
+ http://localhost:8080/solr/core1-suggest
+]]>
+
+ Note: if you change the Tomcat port but do not update the above settings accordingly, Fess cannot access the Solr server, and errors are displayed on search and index updates.
+
Solr registers documents according to a schema that defines each document item (field). The Solr schema used by Fess is defined in solr/core1/conf/schema.xml. Standard fields such as title and content are defined, as well as dynamic fields whose field names can be chosen freely. For advanced parameter values, see the Solr documentation.
+
+
+
Dynamic fields are probably used most often when registering data in datastore crawl settings. To register data into a dynamic field in a database crawl, write a script setting such as other_t = hoge, which puts the hoge column data into the Solr other_t field.
+
Next, to retrieve the data stored in the dynamic field, add the field to webapps/fess/WEB-INF/classes/app.dicon. Add other_t as follows.
With the above settings the value is returned from Solr, so edit the JSP file to display it on the page. Log in to the management screen and open Design. Edit the JSP file for the search results page (content). You can display the registered value wherever you want with ${f:h(doc.other_t)}.
Fess manages Solr servers in groups, and can manage multiple groups. Fess keeps server and group status information, and changes the status of servers and groups when a Solr server becomes inaccessible.
+
+The Solr server state can be changed in the system settings. maxErrorCount, maxRetryStatusCheckCount, maxRetryUpdateQueryCount, and minActiveServer can be defined in webapps/fess/WEB-INF/classes/solrlib.dicon.
+
+
+
+
+If the number of Solr servers in the enabled state within a Solr group falls below minActiveServer, the Solr group becomes disabled.
+
+Even while a Solr group is not disabled, a Solr server that cannot be accessed is put into the disabled state. If the server can then be accessed during the status checks, its status is changed from disabled back to enabled after maxRetryStatusCheckCount checks. If the server can be accessed but is not changed back to the enabled state, it is put into the index corrupted state.
+
+A disabled Solr group cannot be used.
+
+To re-enable a Solr group, change the status of the Solr servers in the group to enabled on the system settings management screen.
+
+
+
+
+
+Search queries can be sent to an enabled Solr group.
+
+Search queries are sent only to enabled Solr servers.
+
+If multiple Solr servers are registered in a Solr group, a search query is sent to the server that is currently less used.
+
+A Solr server is changed to the disabled state when it fails more than maxErrorCount search queries.
+
+
+
+
+
+Update queries can be sent to an enabled Solr group.
+
+Update queries are sent only to enabled Solr servers.
+
+If multiple Solr servers are registered in a Solr group, update queries are sent to every enabled Solr server.
+
+A Solr server is changed to the index corrupted state when it fails more than maxRetryUpdateQueryCount update queries.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/config/tokenizer.xml b/src/site/en/xdoc/9.0/config/tokenizer.xml
new file mode 100644
index 000000000..296a09c3b
--- /dev/null
+++ b/src/site/en/xdoc/9.0/config/tokenizer.xml
@@ -0,0 +1,47 @@
+
+
+
+ Settings for the index string extraction
+ Sone, Takaaki
+
+
+
+
+
You must split documents into tokens in order to register them in the index when creating indexes for search. A tokenizer is used for this.
+
+Basically, a search will not hit units smaller than those carved out by the tokenizer. For example, suppose the sentence 東京都に住む ('live in Tokyo') is split by the tokenizer into 東京都, に, and 住む. In this case a search for the word 東京都 hits, but a search for the word 京都 (Kyoto) does not. The choice of tokenizer is therefore important.
+
+You can change the tokenizer by editing the analyzer section of schema.xml. By default, Fess uses StandardTokenizer with CJKBigramFilter.
+
+
+
+StandardTokenizer with CJKBigramFilter creates a bi-gram index, in other words two-character units, for multibyte strings such as Japanese. With this index, single-character words cannot be found.
+
+
+
+
+StandardTokenizer alone creates a uni-gram index, in other words one-character units, for multibyte strings such as Japanese. Missed matches are therefore rarer, and single-character search queries that cannot be searched with CJK bi-grams become searchable. However, note that the index size increases.
+
The following example to change the analyzer part like solr/core1/conf/schema.xml, you can use the StandardTokenizer.
+
+
+
+
+
+
+<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
+  <analyzer>
+    <tokenizer class="solr.StandardTokenizerFactory"/>
+    :
+  </analyzer>
+</fieldType>
+ :
+]]>
+
Also, useBigram, which is enabled by default in webapps/fess/WEB-INF/classes/app.dicon, must be changed to false.
+
+<property name="useBigram">true</property>
+ :
+]]>
+
After changing these settings, restart Fess.
+
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/config/use-libreoffice.xml b/src/site/en/xdoc/9.0/config/use-libreoffice.xml
new file mode 100644
index 000000000..aeb4c4363
--- /dev/null
+++ b/src/site/en/xdoc/9.0/config/use-libreoffice.xml
@@ -0,0 +1,85 @@
+
+
+
+ Use of LibreOffice
+ Shinsuke Sugaya
+
+
+
+
+ In the standard Fess environment, MS Office documents can be crawled using Apache POI.
+ If you crawl Office documents with LibreOffice or OpenOffice instead, more accurate text extraction is possible.
+
+
+
Install JodConverter on the Fess server. Download jodconverter-core-3.0-beta-4-dist.zip from http://jodconverter.googlecode.com/, expand it, and copy the jar files to the Fess server.
+
+
Next, create s2robot_extractor.dicon as follows.
+
+
In s2robot_extractor.dicon, enable jodExtractor with the following contents.
After these settings, crawl as usual and the index is generated; a skeleton is sketched below.
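A minimal skeleton of the idea; the class name is taken from s2robot as far as I recall it, and the office-manager wiring is deliberately omitted because it depends on the JodConverter version, so consult the shipped sample:
+<!-- s2robot_extractor.dicon (sketch; names assumed) -->
+<component name="jodExtractor"
+           class="org.seasar.robot.extractor.impl.JodExtractor">
+  <!-- configure the LibreOffice/OpenOffice office manager here -->
+</component>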
+
+
+
diff --git a/src/site/en/xdoc/9.0/config/windows-service.xml b/src/site/en/xdoc/9.0/config/windows-service.xml
new file mode 100644
index 000000000..84564daa0
--- /dev/null
+++ b/src/site/en/xdoc/9.0/config/windows-service.xml
@@ -0,0 +1,54 @@
+
+
+
+ Register for the Windows service
+ Shinsuke Sugaya
+
+
+
+
In a Windows environment, you can register Fess as a Windows service. The procedure for registering the service is similar to that for Tomcat.
+
+
When registered as a Windows service, the crawling process reads the Windows system environment variables. Register JAVA_HOME as a system environment variable, and also add %JAVA_HOME%\bin to Path.
+
+
+
Edit webapps\fess\WEB-INF\classes\fess.dicon and remove the -server option.
First, after installing Fess, run service.bat from the command prompt (on Vista and later it must be launched as administrator). Here we assume Fess was installed in C:\Java\fess-server-9.0.0.
+> cd C:\Java\fess-server-9.0.0\bin
+> service.bat install fess
+...
+The service 'fess' has been installed.
+]]>
+
+
+
You can review the properties of the Fess service as follows. Running the command below opens the Tomcat properties window.
+ tomcat7w.exe //ES//fess
+]]>
+
+
+
In Control Panel, open Administrative Tools - Services; there you can configure automatic startup, as with normal Windows services.
+
+
+
+
+
The Fess distribution is based on 64-bit Windows Tomcat builds. If you use 32-bit Windows, download the 32-bit Windows zip from the Tomcat site and replace tomcat7.exe, tomcat7w.exe, and tcnative-1.dll.
+If you need commercial support, maintenance, or technical support for this product, consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the effectiveness of the third-party Web sites described in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for any content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess project assumes no responsibility or obligation for any damage or loss caused, or alleged to be caused, by or in connection with the use of, or reliance on, any such content, advertising, products, services, or other materials.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and proposals from readers.
+
+Expand the downloaded fess-server-x.y.zip.
+If you install in a UNIX environment, add execute permission to the scripts in bin.
+
+
+
+
+Administrator accounts are managed by the application server. Fess Server uses Tomcat by default, so users are changed in the same way as for Tomcat.
+To change the password of the admin account, modify conf/tomcat-user.xml.
+<!-- default admin entry; change the password attribute -->
+<user username="admin" password="admin" roles="fess"/>
+]]>
+
+To manage users by a method other than the tomcat-user.xml file, see the Tomcat documentation or the JAAS authentication specification.
+
+
+
+
+A password is required for access from the Fess server to Solr.
+Change the default passwords in production environments.
+
+To change the password, first change the password attribute of the solradmin entry in conf/tomcat-user.xml.
+
+<!-- assumed shape of the entry; check the shipped tomcat-user.xml -->
+<user username="solradmin" password="solradmin" roles="solr"/>
+]]>
+
+Next, modify the following three files: webapps/fess/WEB-INF/classes/solrlib.dicon, fess_suggest.dicon, and solr/core1/conf/solrconfig.xml.
+In each, write the password you specified in tomcat-user.xml into the password setting.
+
+Modify the following areas of solrlib.dicon.
+
+Access http://localhost:8080/fess/ to confirm that Fess started.
+
+
+
+The management UI is at http://localhost:8080/fess/admin.
+The default administrator account user name / password is admin/admin.
+Administrator accounts are managed by the application server.
+In the Fess management UI, users authenticated by the application server with the fess role can operate as administrators.
+
+
+
+To stop a running Fess, run the shutdown script.
+
+
+
+
+It may take a while for Fess to stop completely if crawling or index creation is in progress.
+
+If you need commercial support, maintenance, or technical support for this product, consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the effectiveness of the third-party Web sites described in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for any content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess project assumes no responsibility or obligation for any damage or loss caused, or alleged to be caused, by or in connection with the use of, or reliance on, any such content, advertising, products, services, or other materials.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and proposals from readers.
+
The additional parameter can be used when you want a search to carry specific criteria that are not shown on the screen. The additional value is also retained across paging.
+
+
By appending additional values as hidden form fields (for example, in a search form), the search runs with criteria that do not appear on the screen, and the criteria are kept across paging transitions.
Use AND search to find documents that contain all of several search words. When multiple words are written in the search box separated by spaces, they are treated as an AND search.
+
+
To write an AND search explicitly, put AND between the search words. Write AND in capital letters, with spaces before and after it. AND can be omitted.
+
For example, to find documents that contain both search term 1 and search term 2, type the following in the search form.
Use boost search to give specific search terms higher priority. With boost search, search words can be weighted by importance.
+
+
In a boost search, specify the boost value (weight) after the search term in the form '^boost-value'.
+
For example, when searching for apples and oranges, if you want pages containing 'apples' ranked higher, type the following in the search form.
+
+
Specify an integer of 1 or greater as the boost value.
+
+
+
+
diff --git a/src/site/en/xdoc/9.0/user/search-field.xml b/src/site/en/xdoc/9.0/user/search-field.xml
new file mode 100644
index 000000000..3c668053c
--- /dev/null
+++ b/src/site/en/xdoc/9.0/user/search-field.xml
@@ -0,0 +1,66 @@
+
+
+
+ Search by specifying a search field
+ Shinsuke Sugaya
+
+
+
+
Crawl results in Fess are saved per field, such as title and full text. You can search against any of those fields. By specifying a field, you can search with criteria such as document type or size.
+
+
The following fields can be searched by default.
+
+
List of available fields
+
+
+
Field name
+
Description
+
+
+
url
+
The crawled URL
+
+
+
host
+
The host name contained in the crawled URL
+
+
+
site
+
The site name contained in the crawled URL
+
+
+
title
+
Title
+
+
+
content
+
Text
+
+
+
contentLength
+
The size of the crawled content
+
+
+
lastModified
+
The last modified date of the crawled content
+
+
+
mimetype
+
The MIME type of the content
+
+
+
+
If you do not specify a field, the content field is searched. Custom fields are also available by using Solr dynamic fields, as sketched below.
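For instance, a string dynamic-field pattern along the following lines in solr/core1/conf/schema.xml lets any field ending in _s be stored without a per-field definition (a typical Solr-style sketch; check the shipped schema for the patterns it already defines):
+<!-- solr/core1/conf/schema.xml: any field ending in _s is a stored string -->
+<dynamicField name="*_s" type="string" indexed="true" stored="true"/>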
+
For an HTML file, the string in the title tag is registered in the title field, and the text under the body tag is registered in the content field.
+
+
+
To search a specific field, enter the field name and the search word separated by a colon (:), as 'field-name:search-word', in the search form.
+
For example, to search for Fess in the title field, type title:Fess.
+
+
Documents whose title field contains Fess appear as results of the above search.
Fuzzy search finds words even when they do not exactly match the search word. Fess supports fuzzy search based on the Levenshtein distance.
+
+
Append '~' to the search word to which you want to apply fuzzy search.
+
For example, to find documents containing words close to 'Solr' (such as 'Solar'), type the search word with '~' in the search form.
+
+
+
You can refine the match further by putting a number between 0 and 1 after '~'; the closer to 1, the closer the match, as in 'Solr~0.8'. If you do not specify a number, the default value 0.5 is used.
Search using location information becomes possible by adding latitude and longitude to each document when generating the index.
+
+
The following parameters are available by default.
+
+
Request parameter
+
+
+
geo.latitude
+
Specifies the latitude in degrees as a double value.
You can narrow a search with label information specified at search time; labels classify documents into categories. Registering label information on the administration screen enables label-based search on the search screen. Registered labels can be chosen from a drop-down when searching, and multiple labels can be selected. If no labels are registered, the label drop-down box is not displayed.
+
+
You can select label information at search time. Labels can be selected in the search options dialog, which appears when you press the options button.
+
+
Labels are attached to each document when the index is created. A search that does not specify a label returns all results as usual. If you change label information, update the index.
You can pass arbitrary search criteria, which makes migration from third-party search engines easier. The passed criteria are processed in QueryHelperImpl#buildOptionQuery.
+
+
The following parameters are available by default.
+
+
Request parameter
+
+
+
options.q
+
This is similar to a normal query. Multiple options.q values can be specified; when several are given, they are treated as an AND search. Pass the values URL-encoded.
+
+
+
options.cq
+
Treated as an exact-match search query. For example, specifying Fess Project searches for "Fess Project". Pass the value URL-encoded.
+
+
+
options.oq
+
Treated as an OR search. For example, specifying Fess Project searches for Fess OR Project. Pass the value URL-encoded.
+
+
+
options.nq
+
The label value. Use this to specify a label.
+
Treated as a NOT search. For example, specifying Fess searches for NOT Fess. Pass the value URL-encoded.
Use OR search to find documents that contain any of the search words. When multiple words are written in the search box, they are treated as an AND search by default.
+
+
To use OR search, write OR between the search words. Write OR in capital letters, with spaces before and after it.
+
For example, to find documents that contain either search term 1 or search term 2, type the following in the search form.
If a field contains range data such as numbers, a range search can be performed on that field.
+
+
To limit "field name: value TO value ' fill in the search form.
+
For example, to search for documents whose contentLength field is between 1 KB and 10 KB, type the following in the search form.
+
+
For a date range search, enter 'lastModified:[date1 TO date2]' (date1 <= date2) in the search form.
+
Dates are written in ISO 8601 format.
+
+
+
Date and time (down to seconds and fractions)
+
Relative to the current date and time
+
+
+
YYYY-MM-DDThh:mm:ss.sZ (example: 2013-08-02T10:45:23.5Z)
+
NOW (the current date and time), YEAR (this year), MONTH (this month), DAY (today)
+
+
+
Relative expressions such as NOW and DAY accept + and - (addition and subtraction) and / (rounding).
+
For rounding, put / and a unit after the expression. For example, NOW-1DAY/DAY means 00:00 of the day before today, regardless of the current time.
+
For example, to search for documents whose lastModified field was updated within the 30 days before 2012-02-21 20:30 (the current time), type the following in the search form.
You can sort search results by specifying a field at search time.
+
+
The following fields can be used for sorting by default.
+
+
List of sort fields
+
+
+
Field name
+
Description
+
+
+
tstamp
+
The time when the document was crawled
+
+
+
contentLength
+
The size of the crawled content
+
+
+
lastModified
+
The last modified date of the crawled content
+
+
+
+
Custom fields can also be added for sorting; see the customization documentation.
+
+
+
You can select the sort criteria when searching. They can be chosen in the search options dialog, which appears when you press the options button.
+
+
You can also sort from the search form by entering 'sort:field-name', with sort and the field name separated by a colon (:).
+
For example, to search for Fess and sort by content size in ascending order, type as follows.
+
+
To sort in descending order, type as follows.
+
+
To sort by multiple fields, separate them with commas, as shown below.
You can use single-character and multi-character wildcards within search terms. ? is the single-character wildcard and * is the multi-character wildcard. A wildcard cannot be the first character. Wildcards apply to words; they cannot be used across whole sentences.
+
+
A single-character wildcard is used with ? as follows.
+
+
In the above, ? matches any one character, so words such as 'text' and 'test' match.
+
A multi-character wildcard is used with * as follows.
+
+
In the above, * matches any number of characters, so words such as 'test', 'tests', and 'tester' match. Also,
+
+
wildcards can be used in the middle of a search term, as above.
+
+
+
Wildcards are applied to the indexed strings. Therefore, if the index was built with bi-grams, Japanese text is indexed in fixed two-character units and wildcards will not behave as expected for Japanese. For fields where you want to use wildcards with Japanese, use morphological analysis; a sketch follows.
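For reference, a morphological-analysis field type in schema.xml could look roughly like this; a generic Solr-style sketch using the standard JapaneseTokenizerFactory, since the shipped schema may already define an equivalent type:
+<fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100">
+  <analyzer>
+    <!-- morphological analysis splits Japanese into words, not fixed-length grams -->
+    <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
+  </analyzer>
+</fieldType>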
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/admin/crawl-guide.xml b/src/site/en/xdoc/9.1/admin/crawl-guide.xml
new file mode 100644
index 000000000..b37ec0ba8
--- /dev/null
+++ b/src/site/en/xdoc/9.1/admin/crawl-guide.xml
@@ -0,0 +1,87 @@
+
+
+
+ The General crawl settings
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to crawling.
+
+
+
+
After logging in with an administrator account, click the crawl General menu.
+
+
+
+
+
+
When a user performs a search, the search is written to the log. Enable this if you want to collect search statistics.
+
+
+
Saves information about the users who searched. This makes it possible to identify users.
+
+
+
You can collect search results that users judged to be good. A voting link appears on each result in the list screen, and pressing the link records the vote. The collected results can also be reflected in the index during crawling.
+
+
+
Appends the search term to search result links. This makes it possible to highlight the search terms when displaying PDFs.
+
+
+
Search results can be retrieved in XML format by accessing http://localhost:8080/fess/xml?query=search-term.
+
+
+
Search results are also available in JSON format, by accessing http://localhost:8080/fess/json?query=search-term.
+
+
+
You can specify a label to be selected by default. Specify the value of the label.
+
+
+
You can specify whether to display the search screen. If set to unavailable, the search screen cannot be used. Select unavailable if, for example, you want to build a dedicated index server.
+
+
+
Popular search words become available in JSON format, by accessing http://localhost:8080/fess/json?type=hotsearchword.
+
+
+
Deletes search logs older than the specified number of days. Old logs are deleted by the log purge once a day.
+
+
+
Deletes job logs older than the specified number of days. Old logs are deleted by the log purge once a day.
+
+
+
Deletes user information older than the specified number of days. Old logs are deleted by the log purge once a day.
+
+
+
Specifies, separated by commas (,), the bot names (contained in the user agent) whose entries should be removed from the search log. The logs are deleted by the log purge once a day.
+
+
+
Specifies the email address to which a report is sent when a crawl completes.
+
+
+
Specifies the encoding of the CSV files used in backup and restore.
+
+
+
Enables incremental crawling, which crawls only updated documents by comparing the lastModified field value with the target document's modification date (the LAST_MODIFIED value for HTTP, the timestamp for files).
+
+
+
Adds the group access rights information of files to the roles.
+
+
+
Fess can manage multiple Solr servers combined into groups, and can manage multiple groups. Different Solr server groups are used for updates and for searches. For example, with two groups, group 2 may be used for updates while group 1 serves searches. After a crawl completes, the roles switch: group 1 receives updates and group 2 serves searches. This is only valid if multiple Solr server groups are registered.
+
+
+
Fess crawls documents by Web crawling and file system crawling. You can specify here how many crawl settings of each kind run simultaneously. For example, if the number of concurrent crawls is 3 and Web crawl settings 1 through 10 exist, crawling starts with settings 1 through 3; when one of them completes, crawling of setting 4 starts, and so on until setting 10 completes.
+
Note that the number of threads is specified separately in each crawl setting; the number of concurrent crawl settings does not indicate the number of threads. For example, with 3 concurrent crawl settings each using 5 threads, crawling runs with 3 x 5 = 15 threads.
+
+
+
You can automatically delete data after it has been indexed. If you select 5 days, documents registered in the index at least 5 days ago and not updated since are removed. This can be used to expire documents whose content has been deleted.
+
+
+
URLs registered as failed are excluded from the next crawl once they exceed the failure count. By specifying a failure type here, that type is not monitored and is crawled again next time.
+
+
+
URLs whose failures exceed this count are excluded from crawling.
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/admin/crawlingSession-guide.xml b/src/site/en/xdoc/9.1/admin/crawlingSession-guide.xml
new file mode 100644
index 000000000..20b862334
--- /dev/null
+++ b/src/site/en/xdoc/9.1/admin/crawlingSession-guide.xml
@@ -0,0 +1,27 @@
+
+
+
+ Set session information
+ Shinsuke Sugaya
+
+
+
+
Describes the settings related to session information. The results of one crawl are saved as one session information record. You can check the run time and the number of documents indexed.
+
+
+
+
After logging in with an administrator account, click the session information menu.
+
+
+
+
+
+
You can remove all session information that is not running by clicking the delete-all link. Expired sessions are removed at the next crawl.
+
+
+
You can check the crawl contents for each session ID. The crawl start and finish times and the number of indexed documents are listed.
Describes how to back up and restore Fess information.
+
+
+
+
After logging in with an administrator account, click the backup and restore menu.
+
+
+
+
Click the download link to output the Fess information in XML format. The saved settings information is listed below.
+
+
The General crawl settings
+
Web crawl settings
+
File system Crawl settings
+
Datastore crawl settings
+
Label
+
Path mapping
+
Web authentication
+
File system authentication
+
Request header
+
Duplicate host
+
Role
+
+
Session information, search logs, and click logs are available in CSV format.
+
The Solr index data and the data being crawled are not backed up. Those data can be regenerated by crawling again after restoring the Fess settings. If you need to back up the Solr index, back up the solr directory.
+
+
+
You can restore the settings and the various logs by uploading the XML or CSV files output by the backup. Specify the file and click the restore button for that data.
+
If overwriting is enabled when restoring XML configuration data, existing entries with the same data are updated.
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/admin/dataCrawlingConfig-guide.xml b/src/site/en/xdoc/9.1/admin/dataCrawlingConfig-guide.xml
new file mode 100644
index 000000000..f331705ba
--- /dev/null
+++ b/src/site/en/xdoc/9.1/admin/dataCrawlingConfig-guide.xml
@@ -0,0 +1,159 @@
+
+
+
+ Settings for crawling the data store
+ Sone, Takaaki
+ Shinsuke Sugaya
+
+
+
+
Fess can crawl data sources such as databases and CSV files. Here we describe the settings required for the data store.
+
+
+
+
After logging in with an administrator account, click the data store menu.
+
+
As an example, we will crawl the following table in a MySQL database named testdb, connecting with user name hoge and password fuga.
+
+
Assume the table contains data like the following.
+
+
+
+
+
+
An example of the parameter settings is as follows.
+
+
Parameter is a "key = value" format. Description of the key is as follows.
+
+
Parameters for a DB crawl configuration
+
+
+
driver
+
Driver class name
+
+
+
url
+
The connection URL
+
+
+
username
+
The user name for connecting to the DB
+
+
+
password
+
The password for connecting to the DB
+
+
+
sql
+
The SQL statement that retrieves the data to crawl
+
+
+
+
+
+
An example of the script settings is as follows.
+
+
+ Parameter is a "key = value" format.
+ Description of the key is as follows.
+
The value side is written in OGNL; enclose strings in double quotation marks.
A database column's value can be accessed by its column name.
+
+
Script settings
+
+
+
url
+
The URL (the link shown in search results)
+
+
+
host
+
Host name
+
+
+
site
+
The site path
+
+
+
title
+
Title
+
+
+
content
+
The content (the string that is indexed)
+
+
+
cache
+
Content cache (not indexed)
+
+
+
digest
+
The digest shown in search results
+
+
+
anchor
+
The link to the content (usually not required)
+
+
+
contentLength
+
The length of the content
+
+
+
lastModified
+
The last modified date of the content
+
+
+
+
+
+
A driver is needed to connect to the database. Put the driver's jar file in webapps/fess/WEB-INF/cmd/lib.
+
+
+
To display item values such as latitude_s in the search results, add the field in webapps/fess/WEB-INF/classes/app.dicon; after adding it, output it with ${doc.latitude_s} in searchResults.jsp.
Here are settings for the design of search screens.
+
+
+
+
After logging in with an administrator account, click the design menu.
+
+
You can edit the search screens in the screen shown below.
+
+
+
+
If you want to display the registration date or modified date of crawled files in the search results, write output like the sketch below in the search results page (content).
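A minimal JSP sketch, assuming the page exposes doc.tstampDate and doc.lastModifiedDate as described next; the surrounding markup and patterns are illustrative:
+<%-- print crawl registration and document modified dates --%>
+Registered: <fmt:formatDate value="${doc.tstampDate}" pattern="yyyy-MM-dd HH:mm"/>
+Modified: <fmt:formatDate value="${doc.lastModifiedDate}" pattern="yyyy-MM-dd HH:mm"/>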
+
tstampDate holds the date of registration by the crawl, and lastModifiedDate holds the modified date of the document. Output date formats follow the fmt:formatDate specification.
+
+
+
+
+
The files used on the search screen can be downloaded and removed here.
+
+
+
You can upload files to use on the search screen. The supported file extensions are jpg, gif, png, css, and js.
+
+
+
Use this if you want to specify the file name of the uploaded file. If omitted, the name of the uploaded file is used.
+
+
+
You can edit the JSP files of the search screen. Press the Edit button of a JSP file to edit the current file, or press the Default button to edit the JSP file as it was at installation. Changes take effect when you save with the update button in the edit screen.
+
The editable JSP files are listed below.
+
+
Editable JSP files
+
+
+
Top page (frame)
+
The JSP file of the search top page. This JSP includes the JSP files of the individual parts.
+
+
+
Header
+
The JSP file of the header.
+
+
+
Footer
+
The JSP file of the footer.
+
+
+
Search results pages (frames)
+
The JSP file of the search results list page. This JSP includes the JSP files of the individual parts.
+
+
+
Search results pages (content)
+
The JSP file that renders the results portion of the search results list page. It is used when there are search results. Change it to customize how search results are rendered.
+
+
+
Search results page (result no)
+
The JSP file that renders the results portion of the search results list page when there are no search results.
+
+
+
Help pages (frames)
+
The JSP file of the help page.
+
+
+
Search error page
+
The JSP file of the search error page. Change it to customize how search errors are rendered.
+
+
+
File boot page
+
The JSP file of the file boot page. This screen is used when the Java plug-in is enabled to display search results found by file system crawling.
+
+
+
Error page (header)
+
The JSP file of the error page header.
+
+
+
Error page (footer)
+
The JSP file of the error page footer.
+
+
+
Error page (page not found)
+
The error page JSP file displayed when a page cannot be found.
+
+
+
Error (System error)
+
The error page JSP file displayed on a system error.
+
+
+
Error pages (redirects)
+
This is the JSP error page displayed when an HTTP redirect occurs.
+
+
+
Error (bad request)
+
The error page JSP file displayed on a bad request.
After logging in with an administrator account, click the dictionary menu. The dictionaries available for editing are listed.
+
+
+
+
+
You can register names, nouns, and terminology. Click the path of a registered user dictionary to display its word list.
+
+
Click the word you want to edit to display the edit screen.
+
+
+
Enter the word to search for.
+
+
+
If you register a compound word together with its split form, searches for the component words will also hit. For example, register 'full-text search engine' split into 'full-text', 'search', and 'engine' so that any of those words can be matched.
+
+
+
Enter the reading of the word in katakana.
If the word is split, enter the readings split in the same way.
+
+
+
Enter the part of speech of the word.
+
+
+
+
You can register words with the same meaning (GB and gigabyte, for example). Click the path of a registered synonym dictionary to display its word list.
+
+
Click the word you want to edit to display the edit screen.
+
+
+
Enter the source word to be treated as a synonym.
+
+
+
Enter the words into which the source words expand. For example, to treat 'TV' and 'television' as the same term, enter both as sources and the normalized term after conversion.
Describes the popular URL log. When a user clicks the voting link on the search screen, the URL is registered as a favorite link. You can disable this feature in the General crawl settings.
+
+
+
+
After logging in with an administrator account, click the popular URL menu.
+
+
+
+
The popular URLs are listed.
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/admin/fileAuthentication-guide.xml b/src/site/en/xdoc/9.1/admin/fileAuthentication-guide.xml
new file mode 100644
index 000000000..83d8c1a34
--- /dev/null
+++ b/src/site/en/xdoc/9.1/admin/fileAuthentication-guide.xml
@@ -0,0 +1,44 @@
+
+
+
+ Settings for file system authentication
+ Shinsuke Sugaya
+
+
+
+
Describes how to set the file system authentication required when crawling file systems. Fess supports crawling Windows shared folders.
+
+
+
+
After logging in with an administrator account, click the file system authentication menu.
+
+
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, the authentication applies to any host name in the specified file system crawl settings.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; the authentication then applies to any port in the specified file system crawl settings.
+
+
+
Selects the authentication method. SAMBA (Windows shared folder authentication) can be used.
+
+
+
Specifies the user name used to log in for authentication.
+
+
+
Specifies the password used to log in to the authentication site.
+
+
+
Sets values required to log in to the authentication site. For SAMBA, the domain value can be set, written in the form domain=domain-name.
+
+
+
+
Selects the file system crawl setting names to which the above authentication applies. The file system crawl settings must be registered beforehand.
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/admin/fileCrawlingConfig-guide.xml b/src/site/en/xdoc/9.1/admin/fileCrawlingConfig-guide.xml
new file mode 100644
index 000000000..c43dfeaae
--- /dev/null
+++ b/src/site/en/xdoc/9.1/admin/fileCrawlingConfig-guide.xml
@@ -0,0 +1,103 @@
+
+
+
+ Settings for file system crawling
+ Shinsuke Sugaya
+
+
+
+
Describes the settings for crawling file systems.
+
If you want to index more than 100,000 documents, we recommend splitting them across multiple crawl settings of at most several tens of thousands of documents each. Indexing performance degrades when a single crawl setting targets more than 100,000 documents.
+
+
+
+
After logging in with an administrator account, click the file menu.
+
+
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple paths, starting with file: or smb:. For example,
+
+
and so on. Crawling proceeds below the specified directories.
+
Paths must be written as URIs; a Windows path such as c:\Documents\taro is specified as file:/c:/Documents/taro.
+
For a Windows shared folder, for example to crawl the share folder on host1, set smb://host1/share/ (with a trailing /) in the crawl settings. If the shared folder requires authentication, set the authentication information on the file system authentication screen.
+
+
+
By specifying regular expressions, you can limit crawling and searching to given path patterns, or exclude them.
+
+
Path filtering settings
+
+
+
Path to crawl
+
Only paths matching the specified regular expression are crawled.
+
+
+
The path to exclude from being crawled
+
Paths matching the specified regular expression are not crawled. Exclusion wins even for paths set to be crawled.
+
+
+
Path to be searched
+
Only paths matching the specified regular expression are searchable. Exclusion takes precedence even over paths specified here.
+
+
+
Path to exclude from searches
+
Paths matching the specified regular expression cannot be searched. Already-crawled documents remain searchable if you only exclude them from crawling; exclude them from search as well to hide them all.
+
+
+
+
For example, to crawl only under /home/, specify it as a path to crawl,
+
+
and to skip files with the png extension, add them to the paths to exclude.
+
+
Multiple values can be specified, one per line.
+
URIs are specified in the format handled by java.io.File, as follows:
You can specify the crawl configuration information.
+
+
+
Specifies the depth of directory traversal.
+
+
+
You can specify the number of documents to retrieve when crawling.
+
+
+
Specifies the number of crawler threads. A value of 5 crawls the file system with 5 threads simultaneously.
+
+
+
The interval, in milliseconds, between document retrievals. With one thread, a value of 5000 fetches a document every 5 seconds.
+
With 5 threads and a 1000 millisecond interval, up to 5 documents are fetched per second.
+
+
+
You can weight the URLs of this crawl setting for search, to rank them above others in the search results. The default is 1; documents with higher values appear nearer the top of the results. To rank this setting's results above all others, specify a sufficiently large value such as 10000.
+
Specify an integer greater than 0. The value is used as the boost value when documents are added to Solr.
+
+
+
You can make documents appear in search results only for particular user roles. Roles must be set up beforehand. This is useful, for example, when search is offered inside a system that requires login, such as a portal server, and results must be limited per user.
+
+
+
You can attach labels to the search results. When labels are enabled, a label can be specified on the search screen to search within it.
+
+
+
Sets whether this crawl setting is enabled at crawl time. Use this to skip crawling temporarily.
+If you need commercial support, maintenance, or technical support for this product, consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the effectiveness of the third-party Web sites described in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for any content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess project assumes no responsibility or obligation for any damage or loss caused, or alleged to be caused, by or in connection with the use of, or reliance on, any such content, advertising, products, services, or other materials.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and proposals from readers.
+
After logging in with an administrator account, click the users menu.
+
+
+
+
Lists the job run logs. You can see the job name, status, and the start and finish times. You can also open the details of each log entry.
+
+
+
You can check the contents of a job log: the job name, status, start and completion times, and the results are displayed.
Describes the settings for labels. Labels classify documents shown in search results and are selected in the crawl settings. Even without selecting them in a crawl setting, labels can be applied by specifying path regular expressions in the label settings. Registered labels appear in a drop-down box to the right of the search box.
+
+
+
+
After logging in with an administrator account, click the label menu.
+
+
+
+
+
+
+
Specifies the name shown in the label drop-down at search time.
+
+
+
Specifies the identifier attached when classifying documents. This value is sent to Solr. It must be alphanumeric.
+
+
+
Sets the paths to be labeled, as regular expressions. Multiple values can be written, one per line. Documents whose paths match are labeled regardless of their crawl configuration.
+
+
+
Sets, as regular expressions, the crawled paths to exclude from this label. Multiple values can be written, one per line.
Describes the settings for duplicate hosts. Use them when different host names should be treated as the same host while crawling; for example, when www.example.com and example.com are the same site.
+
+
+
+
After logging in with an administrator account, click the duplicate host menu.
+
+
+
+
+
+
+
Specifies the canonical host name. Duplicate host names are replaced with the canonical host name.
+
+
+
Specifies the duplicated host name, that is, the host name to be replaced.
Describes the settings for path mapping. Use path mapping if you want to replace the links shown in search results.
+
+
+
+
After logging in with an administrator account, click the path mappings menu.
+
+
+
+
+
+
+
Path mapping replaces the part of a path that matches the specified regular expression with the replacement string. When crawling a local file system, the links in the search results may not be usable as-is; in such cases, path mapping lets you control the links shown in the results. Multiple path mappings can be specified.
Describes the request header settings. The request header feature adds header information to the requests issued when crawling documents. It is useful, for example, with authentication systems that log users in automatically when certain header values are present.
+
+
+
+
After logging in with an administrator account, click the request header menu.
+
+
+
+
+
+
+
Specifies the name of the request header to append to requests.
+
+
+
Specifies the value of the request header to append to requests.
+
+
+
Selects the Web crawl setting names for which the request header is added. The header is appended only to requests of the selected crawl settings.
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/admin/roleType-guide.xml b/src/site/en/xdoc/9.1/admin/roleType-guide.xml
new file mode 100644
index 000000000..fd8572cec
--- /dev/null
+++ b/src/site/en/xdoc/9.1/admin/roleType-guide.xml
@@ -0,0 +1,27 @@
+
+
+
+ Settings for a role
+ Shinsuke Sugaya
+
+
+
+
Describes the settings for roles. A role is selected in the crawl settings and classifies the documents that appear in search results. For how to use roles, please see the role-based search documentation.
+
+
+
+
After logging in with an administrator account, click the role menu.
+
+
+
+
+
+
+
Specifies the name that appears in the list.
+
+
+
Specifies the identifier attached when classifying documents. This value is sent to Solr. It must be alphanumeric.
After logging in with an administrator account, click the job management menu.
+
+
+
+
+
+
+
It is the name that appears in the list.
+
+
+
An identifier used to determine whether the job is targeted when commands are run directly, for example in batch execution. For crawl command execution, do not specify 'all'.
+
+
+
Configures the schedule. The job runs its script on the schedule set here.
+
The format is Cron-like: 'seconds minutes hours day month day-of-week year (optional)'. For example, '0 0 12 ? * WED' runs the job every Wednesday at 12:00 pm. For finer control, see Quartz.
+
+
+
Specifies the script execution environment. At the moment only 'groovy' is supported.
+
+
+
Describes, in the specified language, what the job does when it runs.
+
For example, to run a crawl job for only three crawl settings (assuming Web crawl configuration ID 1 and file system crawl configuration IDs 1 and 2), write it accordingly.
+
+
+
+
Enables recording to the job log.
+
+
+
If enabled, the job is treated as a crawl job and is started and stopped together with the system's crawling.
+
+
+
Specifies whether the job is enabled or disabled. A disabled job does not run.
After logging in with an administrator account, click the search menu.
+
+
+
+
You can search with the criteria you specify. On the regular search screen, role and browser conditions are added implicitly, but this management search does not add them. From these search results you can remove specific documents from the index.
Describes the search log. When users search on the search screen, the search is logged: the search term and date are recorded, and the URLs of clicked search results can also be recorded.
+
+
+
+
After logging in with an administrator account, click the search logs menu.
+
+
+
+
The search words and dates are listed. Click a URL to review the details.
Describes the settings related to Solr. The Solr servers that Fess uses for crawling and searching are registered in configuration files and grouped into Solr server groups.
+
+
+
+
After logging in with an administrator account, click the system settings menu.
+
+
+
+
+
+
The update server is shown as running while it is adding documents and the like. The crawl process displays its session ID while running. The Fess server can be shut down safely when nothing is running; if a crawl is running at shutdown, Fess terminates after the crawl process finishes.
+
A crawl started manually with the crawl start button can be stopped by pressing the stop button while it runs.
+
+
+
The server group names available for updates and searches are displayed.
+
+
+
For each Solr server, Fess manages a server state and an index state. The server state indicates whether the Solr server can be accessed; the index state indicates whether crawling and indexing completed successfully. Search uses servers whose server state is enabled, regardless of the index state. Crawling uses servers whose server state is enabled and whose index state is preparing or completed. Starting a crawl manually changes the index state automatically to preparing. Server states recover to enabled automatically when a server becomes accessible again.
+
+
+
You can check the state of the Solr server instances, and request start, stop, and reload for each instance.
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/admin/systemInfo-guide.xml b/src/site/en/xdoc/9.1/admin/systemInfo-guide.xml
new file mode 100644
index 000000000..1b2c03d58
--- /dev/null
+++ b/src/site/en/xdoc/9.1/admin/systemInfo-guide.xml
@@ -0,0 +1,32 @@
+
+
+
+ System information
+ Shinsuke Sugaya
+
+
+
+
Here you can check the current system information, such as environment variables and properties.
+
+
+
+
After logging in with an administrator account, click the system information menu.
+
+
+
+
+
+
Lists the server's environment variables.
+
+
+
Lists the system properties of Fess.
+
+
+
Shows the Fess setup information.
+
+
+
Lists the properties to attach when reporting a bug. The extracted values contain no personal information.
Describes the user log. When users search on the search screen, the user is identified and recorded in the user log, which can be used together with the search log and popular URL information. You can disable this feature in the General crawl settings.
+
+
+
+
After logging in with an administrator account, click the users menu.
+
+
+
+
Lists the user IDs. Select the search log or popular URL links to see each user's logs.
Describes the Web authentication settings used when Web crawling requires authentication. Fess supports crawling with BASIC and DIGEST authentication.
+
+
+
+
After logging in with an administrator account, click the Web authentication menu.
+
+
+
+
+
+
Specifies the host name of the site that requires authentication. If omitted, it applies to any host name in the specified Web crawl settings.
+
+
+
Specifies the port of the site that requires authentication. Specify -1 to apply to all ports; it then applies to any port in the specified Web crawl settings.
+
+
+
Specifies the realm name of the site that requires authentication. If omitted, it applies to any realm name in the specified Web crawl settings.
+
+
+
Selects the authentication method. BASIC, DIGEST, or NTLM authentication can be used.
+
+
+
Specifies the user name used to log in for authentication.
+
+
+
Specifies the password used to log in to the authentication site.
+
+
+
Sets values required to log in to the authentication site. For NTLM authentication, the workstation and domain values can be set, written in the form workstation=value and domain=value.
+
+
+
+
Selects the Web crawl setting names to which the above authentication applies. The Web crawl settings must be registered beforehand.
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/admin/webCrawlingConfig-guide.xml b/src/site/en/xdoc/9.1/admin/webCrawlingConfig-guide.xml
new file mode 100644
index 000000000..13e735509
--- /dev/null
+++ b/src/site/en/xdoc/9.1/admin/webCrawlingConfig-guide.xml
@@ -0,0 +1,104 @@
+
+
+
+ Settings for crawling Web site
+ Shinsuke Sugaya
+
+
+
+
Describes the settings for Web crawling.
+
If you want to index more than 100,000 documents, we recommend splitting them across multiple crawl settings of at most several tens of thousands of documents each. Indexing performance degrades when a single crawl setting targets more than 100,000 documents.
+
+
+
+
After logging in with an administrator account, click the Web menu.
+
+
+
+
+
+
The name that appears on the list page.
+
+
+
You can specify multiple URLs, starting with http: or https:. For example,
+
+
and so on. Crawling starts from the specified URLs.
+
+
+
By specifying regular expressions, you can limit crawling and searching to specific URL patterns, or exclude them.
+
+
URL filtering settings
+
+
+
URL to crawl
+
Only URLs matching the specified regular expression are crawled.
+
+
+
Excluded from the crawl URL
+
URLs matching the specified regular expression are not crawled. Exclusion wins even for URLs set to be crawled.
+
+
+
To search for URL
+
Only URLs matching the specified regular expression are searchable. Exclusion takes precedence even over URLs specified here.
+
+
+
To exclude from the search URL
+
URLs matching the specified regular expression cannot be searched. Already-crawled documents remain searchable if you only exclude them from crawling; exclude them from search as well to hide them all.
+
+
+
+
For example, to crawl only URLs under http://localhost/, specify it as a URL to crawl,
+
+
and to skip URLs with the png extension, add them to the URLs to exclude.
+
+
Multiple values can be specified, one per line.
+
+
+
You can specify the crawl configuration information.
+
+
+
Specifies the depth to which links contained in the crawled documents are followed.
+
+
+
You can specify the number of documents to retrieve. If not specified, 100,000 is used.
+
+
+
You can specify the user agent to use when crawling.
+
+
+
Specifies the number of threads you want to crawl. Value of 5 in 5 threads crawling the website at the same time.
+
+
+
The interval, in milliseconds, between document retrievals. With one thread, a value of 5000 fetches a document every 5 seconds.
+
With 5 threads and a 1000 millisecond interval, up to 5 documents are fetched per second. Set an adequate value so that crawling does not overload the target Web server.
+
+
+
You can weight the URLs of this crawl setting for search, to rank them above others in the search results. The default is 1; documents with higher values appear nearer the top of the results. To rank this setting's results above all others, specify a sufficiently large value such as 10000.
+
Specify an integer greater than 0. The value is used as the boost value when documents are added to Solr.
+
+
+
You can make documents appear in search results only for particular user roles. Roles must be set up beforehand. This is useful, for example, when search is offered inside a system that requires login, such as a portal server, and results must be limited per user.
+
+
+
You can attach labels to the search results. When labels are enabled, a label can be specified on the search screen to search within it.
+
+
+
Sets whether this crawl setting is enabled at crawl time. Use this to skip crawling temporarily.
+
+
+
+
+
Fess also crawls sitemap files and crawls the URLs defined in them. Sitemaps follow the http://www.sitemaps.org/ specification; the available formats are XML Sitemaps, XML Sitemap Index files, and text (one URL per line). A minimal example is sketched after this section.
+
Specify the sitemap URL as a URL to crawl. Since a sitemap is an XML or text file, it cannot be distinguished from an ordinary XML or text file when that URL is crawled. Therefore, by default, URLs whose file names match sitemap.*.xml, sitemap.*.gz, or sitemap.*txt are handled as sitemaps (this can be customized in webapps/fess/WEB-INF/classes/s2robot_rule.dicon).
+
Links found in HTML files crawled via a sitemap are crawled at the next crawl.
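For reference, a minimal XML sitemap in the sitemaps.org format looks like this (the URL is an example):
+<?xml version="1.0" encoding="UTF-8"?>
+<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
+  <url>
+    <loc>http://www.example.com/</loc>
+  </url>
+</urlset>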
You can use the Settings Wizard to perform the initial setup of Fess.
+
+
+
+
After logging in with an administrator account, click the Settings Wizard menu.
+
+
Configure the crawl settings.
A crawl setting registers the URIs to be crawled and searched.
Give the crawl setting any name that is easy to identify, and enter the URI part you want indexed and searched.
+
+
For example, to make http://fess.codelibs.org/ searchable, enter it as follows.
+
+
For a file system, type a path such as c:\Users\taro.
+
This completes the setup. Pressing the crawl start button starts crawling; if you press the Finish button instead, crawling does not start until the time specified in the scheduling settings.
+
+
+
+
Settings made in the Settings Wizard can be changed later under crawl General, Web, and file system.
Binaries are provided for use with H2 Database and with MySQL. Other databases can be used by changing the settings in the source code and building it.
+
+
+
+
Configure the MySQL character encoding by adding the following settings to /etc/mysql/my.cnf.
+
+
+
+
Download the MySQL binaries and expand them.
+
+
+
Create the databases.
+mysql> create database fess_db;
+mysql> grant all privileges on fess_db.* to fess_user@localhost identified by 'fess_pass';
+mysql> create database fess_robot;
+mysql> grant all privileges on fess_robot.* to s2robot@localhost identified by 's2robot';
+mysql> FLUSH PRIVILEGES;
+]]>
+
Create the tables in the database. The DDL files are located in extension/mysql.
You can limit the size of files crawled by Fess. By default, HTML files are handled up to 2.5 MB and other files up to 10 MB. To change the size handling, edit webapps/fess/WEB-INF/classes/s2robot_contentlength.dicon; its standard contents are along the lines of the sketch below.
Change the value of defaultMaxLength to change the default limit. The size limit can also be specified per content type; the text/html entry sets the maximum size handled for HTML files.
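A sketch of the kind of definition involved; the helper class name and method are taken from s2robot as far as I recall them, so verify against the shipped file:
+<!-- webapps/fess/WEB-INF/classes/s2robot_contentlength.dicon (sketch) -->
+<component name="contentLengthHelper"
+           class="org.seasar.robot.helper.ContentLengthHelper" instance="singleton">
+  <!-- default limit: 10 MB -->
+  <property name="defaultMaxLength">10485760L</property>
+  <!-- HTML files: 2.5 MB -->
+  <initMethod name="addMaxLength">
+    <arg>"text/html"</arg>
+    <arg>2621440L</arg>
+  </initMethod>
+</component>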
+
When raising the maximum handled file size, mind the amount of heap memory used. For how to set it, see the memory-related settings.
Documents that carry latitude and longitude location information can be used for geosearch, for example in conjunction with Google Maps.
+
+
+
+
Location information is defined as a field that contains the location.
When generating the index, set the latitude and longitude in the location field in a format such as 45.17614,-93.87341, and register the document in Solr.
To display latitude and longitude in the search results, also set the values in the latitude_s and longitude_s fields. *_s is available as a Solr string dynamic field.
+
+
+
At search time, specify the latitude, longitude, and distance in the request parameters.
Results within the specified distance (km) of the given latitude and longitude are shown. Latitude, longitude, and distance are treated as doubles.
The index data is managed by Solr. It is not included in backups from the Fess administration screen, and index data that has grown to gigabytes may not be backed up that way.
+
+If you need to back up the index data, stop Fess and back up the solr/core1/data and solr/core1-suggest/data directories. To restore, put the backed-up index data back in place.
+If you need commercial support, maintenance, or technical support for this product, consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the effectiveness of the third-party Web sites described in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for any content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess project assumes no responsibility or obligation for any damage or loss caused, or alleged to be caused, by or in connection with the use of, or reliance on, any such content, advertising, products, services, or other materials.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and proposals from readers.
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/config/install-on-tomcat.xml b/src/site/en/xdoc/9.1/config/install-on-tomcat.xml
new file mode 100644
index 000000000..17ef0f1b6
--- /dev/null
+++ b/src/site/en/xdoc/9.1/config/install-on-tomcat.xml
@@ -0,0 +1,43 @@
+
+
+
+ Install to an existing Tomcat
+ Shinsuke Sugaya
+
+
+
+
+ The standard distribution of Fess is shipped with Tomcat in an already deployed state.
+ Because Fess does not depend on Tomcat, it can be deployed on any Java application server.
+ Here we describe how to deploy Fess to an existing Tomcat.
+ Expand the downloaded Fess server.
+ The expanded Fess server home directory is referred to as $FESS_HOME,
+ and the top directory of the existing Tomcat 7 as $TOMCAT_HOME.
+ Copy the Fess server data.
+
+
+ If you have changed any of the destination files, take a diff and apply only your own changes after the update.
+
Set the maximum memory per Java process. Even on a server with 8 GB of physical memory, a process does not use memory beyond this limit. Memory consumption also changes significantly with the number of crawl threads and the crawl interval. If memory is insufficient, change the settings as described in the following procedure.
+
+
+
Depending on the contents of the crawl settings, an OutOfMemory error like the following may occur.
+
+
If it does, increase the maximum heap memory by changing the -Xmx option in bin/setenv.[sh|bat] (for example, -Xmx1g sets the maximum to 1 GB).
+
+
+
+
+ The maximum memory on the crawler side can also be changed.
+ The default is 512 MB.
+
+ Uncomment crawlerJavaOptions in webapps/fess/WEB-INF/classes/fess.dicon and change -Xmx1g (in this example the maximum is set to 1 GB).
+
Mobile device information is provided by ValueEngine Inc. To use the latest mobile device information, download the device profile, remove the _YYYY-MM-DD suffix from the file name, and save it under webapps/fess/WEB-INF/classes/device. Restart to apply the change.
+ To search PDF files protected by a password, register the password in the settings file.
+
+
+
+
+ First of all, create webapps/fess/WEB-INF/classes/s2robot_extractor.dicon.
+ The following example sets a password for a matching PDF file (here, a.pdf).
+ If you have multiple files, add multiple addPassword settings.
Fess applies a stemming process when indexing and searching.
+
+Stemming normalizes English words; for example, 'recharging' and 'rechargable' are both normalized to the form 'recharg'. Even if you search with the word 'recharging', documents containing 'rechargable' are matched, so fewer results are missed.
+
+
+
Because stemming is basic rule-based processing, unintended normalization can occur. For example, 'Maine' (the state name) is normalized to 'main'.
+
+In such cases, add 'Maine' to protwords.txt to exclude it from the stemming process.
Sets up index replication using Solr's replication feature. By building two Fess servers, one for crawling and index creation and one for searching, you can distribute the load during index creation.
+
+
+
+
Download and install Fess on the host named MasterServer; here we assume it is installed in /opt/fess_master. Edit solr/core1/conf/solrconfig.xml as follows; a typical master-side fragment is sketched below.
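This is a typical Solr ReplicationHandler master setup, not the verbatim contents of the shipped solrconfig.xml; adjust the replication events to your needs:
+<!-- solr/core1/conf/solrconfig.xml on the master (sketch) -->
+<requestHandler name="/replication" class="solr.ReplicationHandler">
+  <lst name="master">
+    <!-- publish a new index version after optimize -->
+    <str name="replicateAfter">optimize</str>
+  </lst>
+</requestHandler>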
Register the crawl settings and start Fess as in a normal setup; the steps for building the index are the same as the normal procedure.
+
+
+
Download and install Fess on the search server; here we assume it is installed in /opt/fess_slave. Edit solr/core1/conf/solrconfig.xml as follows; a typical slave-side fragment is sketched below.
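And a typical slave-side counterpart; again a sketch, where the masterUrl assumes the MasterServer host and default port from above:
+<!-- solr/core1/conf/solrconfig.xml on the search server (sketch) -->
+<requestHandler name="/replication" class="solr.ReplicationHandler">
+  <lst name="slave">
+    <str name="masterUrl">http://MasterServer:8080/solr/core1/replication</str>
+    <str name="pollInterval">00:00:60</str>
+  </lst>
+</requestHandler>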
Fess can partition search results according to the credentials of users authenticated by an arbitrary authentication system. For example, if documents are shown for role 'a', user A who has role 'a' sees them in the search results, while user B without that role does not. Using this feature, you can provide search restricted by department or job title in portal and single sign-on environments where users log in.
+
In Fess role-based search, role information can be obtained from the following sources.
+
+
Request parameter
+
Request header
+
Cookies
+
J2EE authentication information
+
+
+To retrieve role information from cookies, set the domain and path so that the authentication information saved by the portal or agent-type single sign-on system is sent to Fess. With reverse-proxy type single sign-on systems, role information can be retrieved from authentication information added to the request headers or request parameters of accesses to Fess.
+
+
+
Describes how to set up role-based search using J2EE authentication information.
+
+
Add the roles and users to conf/tomcat-users.xml. In this example the role1 role performs role-based search, and we will log in as the role1 user.
+
+
+<role rolename="role1"/>
+<!-- example user; the password here is only an illustration -->
+<user username="role1" password="role1" roles="role1"/>
+]]>
+
+
+
Next, set webapps/fess/WEB-INF/classes/fess.dicon as shown below.
+
+<property name="defaultRoleList">
+  {"guest"}
+</property>
+ :
+]]>
+
By setting defaultRoleList, you can define the role information used when there is no authentication information. Here it prevents documents that require a role from being shown to users who are not logged in.
+
+
+
Next, set webapps/fess/WEB-INF/web.xml as shown below; the relevant pieces are sketched after this paragraph.
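A sketch of the standard servlet-spec pieces involved; the URL pattern is illustrative, so align it with the constraints already present in the shipped web.xml:
+<!-- webapps/fess/WEB-INF/web.xml (sketch): allow the new role -->
+<security-constraint>
+  <web-resource-collection>
+    <web-resource-name>Fess Authentication</web-resource-name>
+    <url-pattern>/login/login</url-pattern>
+  </web-resource-collection>
+  <auth-constraint>
+    <role-name>role1</role-name>
+  </auth-constraint>
+</security-constraint>
+<security-role>
+  <role-name>role1</role-name>
+</security-role>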
Start Fess and log in as an administrator. From the role menu, register a role with name Role1 (any name) and value role1. Then, in each crawl setting that should be visible to users with role1, select Role1 as the role.
+
+
+
Log out from the management screen and log in as the user role1. On successful login, you are redirected to the top of the search screen.
+
Searching as usual now displays only the results of crawl settings that have the Role1 role.
+
Searches performed without logging in are executed as the guest user.
+
+
+
When a user logged in with a non-admin role accesses http://localhost:8080/fess/admin, a logout screen appears whether or not the user has logged out; pressing the logout button logs the user out.
Fess uses port 8080 by default. To change it, follow these steps.
+
+
Change the ports of the Tomcat on which Fess runs by modifying the entries described below in conf/server.xml.
+
+
8080: HTTP access port
+
8005: shutdown port
+
8009: AJP port
+
8443: SSL HTTP access port (disabled by default)
+
19092: database port (when using H2 Database)
+
+
+
+
If you change the Tomcat port, you may also need to change the Solr server information referenced by Fess, because in the standard configuration Solr runs in the same Tomcat.
+
Change the following point in webapps/fess/WEB-INF/classes/app.dicon.
+ "http://localhost:8080/manager/text/"
+]]>
+
Change the following point in webapps/fess/WEB-INF/classes/solrlib.dicon.
+ "http://localhost:8080/solr/core1"
+]]>
+
Change the following point in solr/core1/conf/solrconfig.xml.
+ http://localhost:8080/solr/core1-suggest
+]]>
+
+ Note: if you change the Tomcat port without changing the ports above to match, Fess cannot access the Solr server and errors appear on search and index updates.
+
Solr registers documents according to a schema that defines each document item (field). The Solr schema used by Fess is defined in solr/core1/conf/schema.xml. It defines standard fields such as title and content, as well as dynamic fields whose field names can be defined freely. For details on the parameter values, see the Solr documentation.
+
+
+
Dynamic fields are most often used when registering data in database crawl or data store crawl settings. For example, in a database crawl you can register the data of the hoge column into the Solr dynamic field other_t by writing other_t=hoge in the script setting.
+
To retrieve data stored in a dynamic field, you next need to add the field to webapps/fess/WEB-INF/classes/app.dicon. Add other_t there.
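The original snippet is not preserved; as a rough sketch, the dynamic field is appended to the field list returned by the query helper. The component and property names below are assumptions, not confirmed by the original:
<![CDATA[
<component name="queryHelper" class="jp.sf.fess.helper.impl.QueryHelperImpl">
  <!-- assumed property name; append "other_t" to the fields returned in responses -->
  <property name="responseFields">new String[]{"id", "score", "boost",
    "title", "content", "other_t"}</property>
</component>
]]>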
With the settings above, the field is returned from Solr; edit the JSP file to display it on the page. Log in to the management screen and open the Design page. The search results are rendered by the search results JSP file (the content part), so edit that file. Where you want to show the other_t value, write ${f:h(doc.other_t)} to display the registered value.
Fess manages Solr servers in one or more groups. Fess keeps server and group status information and changes the status of a server or group when a Solr server becomes inaccessible.
+
Solr server status can be changed in the System Settings screen. maxErrorCount, maxRetryStatusCheckCount, maxRetryUpdateQueryCount and minActiveServer can be defined in webapps/fess/WEB-INF/classes/solrlib.dicon.
+
+
+
+
If the number of Solr servers in the valid state within a Solr group falls below minActiveServer, the Solr group is disabled.
+
If the group has not been disabled, Fess checks the status of disabled Solr servers up to maxRetryStatusCheckCount times; when a server can be accessed, its status is changed from disabled back to valid. If the server can be accessed but its status is not changed back to valid, it is put into the index-corrupted state.
+
A disabled Solr group cannot be used.
+
To enable a Solr group again, change the status of the Solr servers in the group to valid in the System Settings screen.
+
+
+
+
+
Search queries are sent only to Solr groups in the valid state.
+
Search queries are sent only to Solr servers in the valid state.
+
If multiple Solr servers are registered in a Solr group, search queries are sent to the least-used Solr server.
+
If sending search queries to a Solr server fails more than maxErrorCount times, that Solr server is changed to the disabled state.
+
+
+
+
+
Update queries are sent only to Solr groups in the valid state.
+
Update queries are sent only to Solr servers in the valid state.
+
If multiple Solr servers are registered in a Solr group, update queries are sent to every Solr server in the valid state.
+
If sending update queries to a Solr server fails more than maxRetryUpdateQueryCount times, that Solr server is changed to the index-corrupted state.
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/config/tokenizer.xml b/src/site/en/xdoc/9.1/config/tokenizer.xml
new file mode 100644
index 000000000..296a09c3b
--- /dev/null
+++ b/src/site/en/xdoc/9.1/config/tokenizer.xml
@@ -0,0 +1,47 @@
+
+
+
+ Settings for the index string extraction
+ Sone, Takaaki
+
+
+
+
+
When creating a search index, documents must be split into tokens before they can be registered in the index. The tokenizer is used for this.
+
Basically, a search for a unit smaller than those produced by the tokenizer does not hit. For example, suppose the sentence 'I live in Tokyo' (東京都に住んでいます) is split by the tokenizer into tokens such as '東京都' (Tokyo) and '住んで' (live). In this case, a search for the word '東京都' (Tokyo) hits, but a search for the word '京都' (Kyoto) does not. The choice of tokenizer is therefore important.
+
You can change the tokenizer by editing the analyzer section of schema.xml. By default, Fess uses StandardTokenizer with CJKBigramFilter.
+
+
+
StandardTokenizer with CJKBigramFilter creates a bi-gram index, that is, it indexes multibyte strings such as Japanese in units of two characters. In this case, single-character words cannot be found.
+
+
+
+
StandardTokenizer alone creates a uni-gram index, that is, it indexes multibyte strings such as Japanese one character at a time. This reduces missed matches, and single-character search queries that the bi-gram configuration cannot handle become searchable. Note, however, that the index size increases.
+
You can use StandardTokenizer by changing the analyzer section of solr/core1/conf/schema.xml as in the following example.
+
+<![CDATA[
+<!-- a sketch of the change; the fieldType name and remaining filter list are assumptions,
+     since only fragments of the original listing survive -->
+<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
+  <analyzer>
+    <tokenizer class="solr.StandardTokenizerFactory"/>
+    <!-- remove the CJKBigramFilterFactory line so uni-grams are indexed -->
+    :
+  </analyzer>
+</fieldType>
+  :
+]]>
+
Also, useBigram, which is enabled by default in webapps/fess/WEB-INF/classes/app.dicon, must be changed to false.
+
+ true
+ :
+]]>
+
After that, restart Fess.
+
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/config/use-libreoffice.xml b/src/site/en/xdoc/9.1/config/use-libreoffice.xml
new file mode 100644
index 000000000..e4d25514a
--- /dev/null
+++ b/src/site/en/xdoc/9.1/config/use-libreoffice.xml
@@ -0,0 +1,85 @@
+
+
+
+ Use of LibreOffice
+ Shinsuke Sugaya
+
+
+
+
+ In the standard Fess environment, MS Office documents are crawled using Apache POI.
+ By using LibreOffice or OpenOffice to crawl Office documents instead, you can extract text from them even more accurately.
+
+
+
Install JodConverter on the Fess server: download it from http://jodconverter.googlecode.com/jodconverter-core-3.0-Beta-4-Dist.zip, expand it, and copy the jar files to the Fess server.
+
+
Next, create s2robot_extractor.dicon.
+
+
Create s2robot_extractor.dicon with the following contents, which enable jodExtractor.
The subsequent settings for generating the index are the same as a normal crawl.
+
+
+
diff --git a/src/site/en/xdoc/9.1/config/windows-service.xml b/src/site/en/xdoc/9.1/config/windows-service.xml
new file mode 100644
index 000000000..1361b4a77
--- /dev/null
+++ b/src/site/en/xdoc/9.1/config/windows-service.xml
@@ -0,0 +1,54 @@
+
+
+
+ Register for the Windows service
+ Shinsuke Sugaya
+
+
+
+
You can register Fess as a Windows service in a Windows environment. Registering the service works the same way as for Tomcat.
+
+
When registered as a Windows service, the crawl process reads the Windows system environment variables. You must therefore register JAVA_HOME as a system environment variable, and also add %JAVA_HOME%\bin to Path.
+
+
+
Edit webapps\fess\WEB-INF\classes\fess.dicon and remove the -server option.
First, after installing Fess, run service.bat from the command prompt (on Vista and later you must launch it as administrator). In this example, Fess is installed in C:\Java\fess-server-9.1.0.
+ cd C:\Java\fess-server-9.1.0\bin
+> service.bat install fess
+...
+The service 'fess' has been installed.
+]]>
+
+
+
You can review the properties of the Fess service by running the following command, which opens the Tomcat properties window.
+ tomcat7w.exe //ES//fess
+]]>
+
+
+
In Control Panel > Administrative Tools > Services, you can configure automatic startup just like a normal Windows service.
+
+
+
+
+
The distributed Fess is built on 64-bit Windows binaries of Tomcat. If you use 32-bit Windows, download the 32-bit Windows zip from the Tomcat site and replace tomcat7.exe, tomcat7w.exe, and tcnative-1.dll.
+If you need commercial support, such as maintenance and technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the effectiveness of third-party Web sites described in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess project assumes no responsibility or obligation for any damage or loss caused, or alleged to be caused, by or in connection with the use of, or reliance on, any such content, advertising, products, services, or other materials available on or through such sites or resources.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and suggestions from readers.
+
+Expand the downloaded fess-server-x.y.zip.
+If you install in a UNIX environment, add execute permission to the scripts in bin (for example, chmod +x *.sh).
+
+
+
+
+The administrator account is managed by the application server. The standard Fess server uses Tomcat, so users are changed in the same way as in Tomcat.
+To change the password of the admin account, modify conf/tomcat-users.xml.
+<![CDATA[
+<!-- assumed shape of the lost snippet; the admin user logs in with the fess role -->
+<user username="admin" password="admin" roles="fess"/>
+]]>
+
To use a management method other than the tomcat-users.xml file, see the Tomcat documentation or the JAAS authentication specification.
+
+
+
+
+A password is required to access the Solr instance inside the Fess server.
+Change the default password in production environments.
+
+To change the password, first change the password attribute of the solradmin user in conf/tomcat-users.xml.
+
+<![CDATA[
+<!-- assumed shape of the lost snippet; change the password attribute of solradmin -->
+<user username="solradmin" password="solradmin" roles="solr"/>
+]]>
+
+Next, modify the following three files: webapps/fess/WEB-INF/classes/solrlib.dicon, fess_suggest.dicon, and solr/core1/conf/solrconfig.xml.
+Write the password you specified in tomcat-users.xml into the password setting in each file.
+
+Modify the following part of solrlib.dicon.
+
+Access http://localhost:8080/fess/ to confirm that Fess has started.
+
+
+
+The management UI is at http://localhost:8080/fess/admin.
+The default administrator account user name and password are admin/admin.
+The administrator account is managed by the application server.
+In the Fess management UI, users authenticated by the application server with the fess role can act as administrators.
+
+
+
+To stop a running Fess, run the shutdown script.
+
+
+
+
+If a crawl or index creation is in progress, it may take a while for Fess to stop completely.
+
+If you need commercial support, such as maintenance and technical support for this product, please consult N2SM, Inc.
+
+
+
+
+
+The Fess project assumes no responsibility for the effectiveness of third-party Web sites described in this document.
+The Fess project assumes no responsibility, obligation, or guarantee for content, advertising, products, services, or other materials available on or through such sites or resources.
+The Fess project assumes no responsibility or obligation for any damage or loss caused, or alleged to be caused, by or in connection with the use of, or reliance on, any such content, advertising, products, services, or other materials available on or through such sites or resources.
+
+
+
+The Fess project is committed to improving this document and welcomes comments and suggestions from readers.
+
You can use the additional parameter when you want to attach specific search criteria without showing them in the search string on the screen. The additional value is retained across paging screens.
+
+
By appending the additional value in a hidden form (for example, in the search form), the search runs with conditions that are not shown on the screen, and the conditions are also preserved across paging transitions.
Use AND search if you want to find documents that contain all of multiple search words. When multiple words are written in the search box separated by spaces, they are treated as an AND search.
+
+
To use AND search explicitly, write AND between the search words. AND must be written in capital letters with spaces before and after it. AND itself can be omitted.
+
For example, if you want to find documents that contain both search term 1 and search term 2, type the following into the search form.
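A plausible form of the lost example, with placeholder terms:
search term1 AND search term2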
Use boost search if you want to give priority to specific search terms. Boost search lets you weight search words by their importance.
+
+
To use boost search, specify the boost value (weight) after the search term, in the format 'search term^boost value'.
+
For example, if you want to find pages containing apples or oranges, ranking pages that contain 'apples' higher, type the following into the search form.
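The lost example was likely of this form, weighting 'apples' with a boost of 100:
apples^100 oranges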
+
+
Specify an integer of 1 or more as the boost value.
+
+
+
+
diff --git a/src/site/en/xdoc/9.1/user/search-field.xml b/src/site/en/xdoc/9.1/user/search-field.xml
new file mode 100644
index 000000000..3c668053c
--- /dev/null
+++ b/src/site/en/xdoc/9.1/user/search-field.xml
@@ -0,0 +1,66 @@
+
+
+
+ Search by specifying a search field
+ Shinsuke Sugaya
+
+
+
+
The results crawled by Fess are saved in fields, such as the title and the full text. You can search against any of these fields. Specifying a field lets you search with criteria such as document type or size.
+
+
You can search the following fields by default.
+
+
List of available fields
+
+
+
Field name
+
Description
+
+
+
URL
+
The crawled URL
+
+
+
host
+
The host name part of the crawled URL
+
+
+
site
+
The site name part of the crawled URL
+
+
+
title
+
Title
+
+
+
content
+
The body text of the document
+
+
+
contentLength
+
The size of the crawled content
+
+
+
lastModified
+
The last modified time of the crawled content
+
+
+
mimetype
+
The MIME type of the content
+
+
+
+
If you do not specify a field, the content field is searched. Custom fields are also available as search fields by using Solr dynamic fields.
+
For an HTML file, the string in the title tag is registered in the title field, and the text below the body tag is registered in the content field.
+
+
+
To search a specific field, fill in the search form with the field name and the search word separated by a colon (:), as in 'field name:search word'.
+
For example, to search the title field for the term Fess, type the following.
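A plausible form of the lost example:
title:Fess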
+
+
With the search above, documents whose title field contains Fess appear in the search results.
Fuzzy search is available for cases where a search should match words that are similar but not identical. Fess supports fuzzy search based on the Levenshtein distance.
+
+
Append '~' to the search word to which you want to apply fuzzy search.
+
For example, if you want to find documents that contain words close to 'Solr' (such as 'Solar'), type the following into the search form.
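A plausible form of the lost example:
Solr~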
+
+
+
You can further refine the match by appending a number between 0 and 1 after '~'; the closer to 1, the stricter the match. For example, write 'Solr~0.8'. If no number is specified, the default value 0.5 is used.
Geolocation search becomes possible by adding latitude and longitude information to each document when generating the index.
+
+
The following parameters are available as standard.
+
+
Request parameter
+
+
+
geo.latitude
+
Specifies the latitude as a double value, in degrees.
By adding label information to documents, you can specify a label at search time and narrow the search like a category. Registering label information in the administration screen enables search by label in the search screen. The labels appear in a drop-down list at search time and allow multiple selections. If no labels are registered, the label drop-down box is not displayed.
+
+
You can select label information at search time in the search options dialog, which appears when you press the Options button.
+
+
Labels set on each document when the index is created are used for the search. A search without a label behaves as usual and returns all results. If you change label information, update the index.
You can pass arbitrary search criteria, which makes migration from third-party search engines easier. The passed search criteria are processed in QueryHelperImpl#buildOptionQuery, so implement your processing there.
+
+
The following parameters are available as standard.
+
+
Request parameter
+
+
+
options.q
+
Similar to the normal query. Multiple options.q parameters can be specified; when multiple are given, they are treated as an AND search. Pass the value URL-encoded.
+
+
+
options.cq
+
Treated as an exact-match (phrase) search query. For example, specifying Fess Project searches for "Fess Project". Pass the value URL-encoded.
+
+
+
options.oq
+
Treated as an OR search. For example, specifying Fess Project searches for Fess OR Project. Pass the value URL-encoded.
+
+
+
options.nq
+
Treated as a NOT search. For example, specifying Fess searches for NOT Fess. Pass the value URL-encoded.
Use OR search if you want to find documents that contain any of the search terms. When multiple words are written in the search box, Fess performs an AND search by default.
+
+
To use OR search, write OR between the search words. OR must be written in capital letters with spaces before and after it.
+
For example, if you want to find documents that contain either search term 1 or search term 2, type the following into the search form.
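A plausible form of the lost example, with placeholder terms:
search term1 OR search term2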
Range search is possible on fields that contain range-comparable data, such as numbers.
+
+
To specify a range, fill in the search form with 'field name:[value1 TO value2]'.
+
For example, to search for documents whose contentLength field is between 1 KB and 10 KB, type the following into the search form.
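The lost example was likely of this form, with sizes in bytes:
contentLength:[1000 TO 10000]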
+
+
For a time range search, fill in the search form with 'lastModified:[date1 TO date2]' (date1 <= date2).
+
Dates follow ISO 8601.
+
+
+
Date and time, down to seconds and fractional seconds
+
Relative to the current date and time
+
+
+
YYYY-MM-DDThh:mm:ss.sZ (example: 2013-08-02T10:45:23.5Z)
+
NOW (current date and time), YEAR (this year), MONTH (this month), DAY (today)
+
+
+
Relative values such as NOW and DAY can be combined with + and - (addition and subtraction) and with / (rounding).
+
For rounding, write / followed by the unit. For example, NOW-1DAY/DAY represents 00:00 of the previous day, regardless of the current time of day.
+
For example, to search the lastModified field for documents updated between 20:00 on 2012-02-21 and 30 days before the current date and time, type the following into the search form.
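A plausible reconstruction of the lost example (the exact bounds in the original are uncertain):
lastModified:[2012-02-21T20:00:00.000Z TO NOW-30DAYS]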
Sorting rearranges the search results by a field specified at search time.
+
+
You can sort by the following fields by default.
+
+
List of sort fields
+
+
+
Field name
+
Description
+
+
+
tstamp
+
Time of the crawl
+
+
+
contentLength
+
The size of the crawled content
+
+
+
lastModified
+
The last modified time of the crawled content
+
+
+
+
Custom fields can also be added as sort fields by customization.
+
+
+
You can select the sort criteria at search time in the search options dialog, which appears when you press the Options button.
+
+
You can also sort from the search form by writing 'sort:field name', with sort and the field name separated by a colon (:).
+
For example, to search for Fess and sort by content size in ascending order, type the following.
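A plausible form of the lost example:
Fess sort:contentLength.asc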
+
+
To sort in descending order, type the following.
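Likely:
Fess sort:contentLength.desc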
+
+
To sort by multiple fields, separate them with commas, as shown below.
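Likely:
Fess sort:contentLength.desc,lastModified.asc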
You can use single-character and multiple-character wildcards within search terms. ? is the single-character wildcard, and * is the multiple-character wildcard. A wildcard cannot be the first character. Wildcards apply to words; they cannot be used across a whole sentence.
+
+
A single-character wildcard is used as follows.
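The lost example, judging from the matches described below, was likely:
te?t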
+
+
In the above, ? is treated as a one-character wildcard, matching words such as text and test.
+
A multiple-character wildcard is used with * as follows.
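Likewise, the lost example was likely:
test*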
+
+
In the above, * is treated as a multiple-character wildcard, matching words such as test, tests and tester. An example of a wildcard inside a term follows.
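A plausible form of the lost example:
te*t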
+
+
As above, * can also be used within a search term.
+
+
+
Wildcards operate on the indexed strings. Therefore, if the index was created with bi-grams, Japanese text is indexed in fixed-length two-character units, and wildcards on Japanese words may not behave as expected. When using wildcards with Japanese, use a field indexed with morphological analysis.
+
+
+
diff --git a/src/site/en/xdoc/articles/article-1.xml b/src/site/en/xdoc/articles/article-1.xml
new file mode 100644
index 000000000..9b35f8dd1
--- /dev/null
+++ b/src/site/en/xdoc/articles/article-1.xml
@@ -0,0 +1,407 @@
+
+
+
+ Building an Apache Solr based search server with Fess: Introduction
+ Shinsuke Sugaya
+
+
+
+
+ The number of documents to manage grows daily, and organizations are expected to manage them effectively.
+ As the number of managed documents grows, it becomes difficult to find specific information among them.
+ One solution is to deploy a full-text search server that can search through this vast collection.
+
+ Fess is an easy-to-deploy, Java-based open-source full-text search server.
+ Fess uses Apache Solr as its search engine.
+ Solr is a very powerful search engine, said to be able to index as many as 200 million documents.
+ On the other hand, when building a search system with Apache Solr alone, you may need to implement parts such as the crawler yourself.
+ Fess uses S2Robot, a crawler offered by the Seasar Project, to collect various types of documents on the Web and file systems and make them searchable.
+
+ This article therefore introduces how to build a search server with Fess.
+
+
+
+
+
+
Those who want to build a search system
+
Those who are considering adding search functionality to existing systems
+
Those who are interested in Apache Solr
+
+
+
+
+
+ The content of this article was verified in the following environment.
+
+
+
+ Windows 7 (Service Pack1)
+
+
+ JDK 1.7.0_21
+
+
+
+
+
+
+ Fess is an open-source full-text search system for the Web and file systems. It is provided under the Apache license from the Fess site on SourceForge.jp.
+
+
+
+
+
+ Java-based search system
+
+
+ As shown in the following figure, Fess is built from various open source products.
+
+
+
Fess structure
+
+
+
+
+
+ In the distribution, the Fess and Solr war files are deployed on Tomcat.
+ The Fess war file provides the search and management screens.
+ Fess uses Seasar2 as its development framework and SAStruts in the presentation layer.
+ Therefore, screens can be customized easily by modifying the JSP files.
+
+ Settings and crawl data are saved in the embedded database H2 Database, which is accessed through the O/R mapper DBFlute.
+ S2Chronos, a scheduling framework provided by the Seasar project, is used to run crawls at the times specified in Fess.
+ Solr and S2Robot are discussed below.
+
+ Because Fess is built as a Java-based system, it runs on any platform.
+ It also provides a UI for easy configuration from a Web browser.
+
+
+ Apache Solr as the search engine
+
+
+ Apache Solr is an enterprise search server based on Lucene, available from the Apache Software Foundation.
+ It is characterized by support for faceted search, search result highlighting, and multiple output formats.
+ The number of searchable documents depends on the Solr server configuration, and it can scale out to a large-scale site search server handling several hundred million documents.
+ It is said to be one of the most widely used search engines in Japan.
+
+ Fess uses Apache Solr as its search engine.
+ Solr is included in the Fess distribution, but a Solr server separated from Fess can also be run on another server.
+ Fess can also manage multiple Solr servers as a group, forming a redundant configuration.
+ In this way, the design of Fess takes advantage of the scalability of Solr.
+
+
+ S2Robot as the crawling engine
+
+
+ S2Robot is a crawler framework provided by the Seasar project.
+ S2Robot can crawl and collect documents on the Web or on file systems.
+ It can also collect documents in multiple threads, processing many documents simultaneously and efficiently.
+ It handles not only HTML but numerous formats, including MS Office files such as Word and Excel, zip archive files, and image and audio files (for image and audio files, it extracts meta-information).
+
+ By using S2Robot, Fess crawls documents on the Web and file systems and collects their text.
+ Any file format that S2Robot can handle can be made searchable.
+ Parameters for crawling with S2Robot can be set from the Fess management UI.
+
+
+
+ Mobile support
+
+
+ Fess supports viewing on docomo, au, and Softbank mobile phones.
+ When indexing, you can specify which handsets each document can be viewed from in search results.
+ Mobile support is not covered in this article; it is described next time.
+
+
+
+
+
+ This section describes the steps to start Fess and perform a search.
+ The information here assumes Windows, but you can install and launch Fess with almost the same steps on Mac OS X and Linux.
+
+
+
+ Download the latest package from http://sourceforge.jp/projects/fess/releases/.
+ The most recent version at the time of writing (June 2013) is 8.1.0.
+ After the download finishes, unzip it into any directory.
+
+
Download Fess
+
+
+
+
+
+
+
+
+ Set the CATALINA_HOME and JAVA_HOME environment variables appropriately, and run %CATALINA_HOME%\bin\startup.bat.
+ For example, if you unzip fess-8.1.0.zip into C:\fess, CATALINA_HOME is C:\fess\fess-server-8.1.0.
+
+
Launch of the Fess
+
+C:\fess\fess-server-8.1.0> set "JAVA_HOME=C:\Program Files\Java\jdk1.7.0_21"
+C:\fess\fess-server-8.1.0> set CATALINA_HOME=C:\fess\fess-server-8.1.0
+C:\fess\fess-server-8.1.0> cd bin
+C:\fess\fess-server-8.1.0\bin> startup.bat
+
+
+
+ Access http://localhost:8080/fess/ in a browser; if Fess has started, the following screen appears.
+
+
+
Search top screen
+
+
+
+
+
+
+
+
+ To stop Fess, run shutdown.bat.
+
+
+
Stop Fess
+
+C:\fess\fess-server-8.1.0\bin> shutdown.bat
+
+
+
+
+
+
+ The directory structure looks like this.
+
+
+
Directory configuration
+
+fess-server-8.1.0/
+|-- LICENSE
+|-- NOTICE
+|-- RELEASE-NOTES
+|-- RUNNING.txt
+|-- bin/
+|-- conf/
+|-- extension/
+|-- lib/
+|-- logs/
+|-- solr/                    -- Solr data directory
+|   |-- contrib/
+|   |-- core1/
+|   |   |-- bin/             -- Solr executables
+|   |   |-- conf/            -- Solr settings files
+|   |   |-- data/            -- Solr index
+|   |   '-- txlog/
+|   |-- dist/
+|   '-- lib/
+|-- temp/
+|-- webapps/
+|   |-- fess/
+|   |   |-- META-INF/
+|   |   |-- WEB-INF/
+|   |   |   |-- cachedirs/   -- mobile image cache
+|   |   |   |-- classes/     -- classes and settings files
+|   |   |   |-- db/          -- DB data
+|   |   |   |-- cmd/
+|   |   |   |-- conf/
+|   |   |   |-- lib/
+|   |   |   |-- orig/
+|   |   |   |-- logs/        -- Fess log files
+|   |   |   |-- view/        -- JSP files for the UI
+|   |   |   |-- fe.tld
+|   |   |   |-- struts-config.xml
+|   |   |   |-- validator-rules.xml
+|   |   |   '-- web.xml
+|   |   |-- css/             -- CSS files
+|   |   |-- js/              -- JS files
+|   |   |-- images/          -- image files
+|   |   '-- jar/
+|   |-- fess.war
+|   |-- solr/                -- Solr Web app
+|   |-- solr.war
+|   |-- manager/
+|   '-- manager.war
+'-- work/
+
+
+ The layout directly under 'fess-server-8.1.0' is similar to Tomcat 7, with the Solr data directory 'solr' and the 'fess.war' and 'solr.war' files deployed in it.
+ 'fess.war' is deployed, and the JSP files for the search and management screens are placed in 'webapps/fess/WEB-INF/view'.
+ If you need to customize the screens, the CSS files are placed in 'webapps/fess/css', so edit those files.
+
+
+
+
+
+ Immediately after launch, nothing has been indexed yet, so a search returns no results.
+ You must first create an index. As an example, we create an index of the pages under http://fess.codelibs.org/ja/ and search them.
+
+
+
+ First, access the administration page at http://localhost:8080/fess/admin and log in.
+ By default, the user name and password are both admin.
+
+
+
Login to the management page
+
+
+
+
+
+
+
+
+ Next, register a crawl target.
+ Since this is a Web page, select [Web] from the menu on the left of the admin page.
+ Nothing is registered in the initial state, so select [Create New].
+
+
+
Select [Create New]
+
+
+
+
+
+
+ As the Web crawl settings, specify http://fess.codelibs.org/ja/ so that all pages below it are crawled.
+ In addition, select all browser types so that results are displayed when searching from either a PC or a mobile phone.
+
+
+
Web crawl settings
+
+
+
+
+
+
+ Then click [Create], and register the crawl settings on the confirmation screen.
+ After registration, they can be changed from [Edit].
+
+
+
Web crawl settings registration complete
+
+
+
+
+
+
+
+
+ Next, set the schedule for collecting documents.
+ Crawl schedules are set from [Crawl General] in the menu on the left of the admin page.
+
+ The format is similar to Unix cron.
+ From the left, the fields represent seconds, minutes, hours, day, month, and day of the week.
+ For example, to crawl daily at 12:10, write '0 10 12 * * ?'.
+
+
+
Crawl schedule
+
+
+
+
+
+
+ You can confirm from [Session Information] in the menu on the left whether the crawl has started and the index has been created.
+ When the crawl is complete, the session information shows the number of indexed documents (Web/file).
+
+
+
Check the crawl status of
+
+
+
+
+
+
+
Example after the crawl is complete
+
+
+
+
+
+
+
+
+ After the crawl, a search returns results like the image below.
+
+
+
+
Search example
+
+
+
+
+
+
+
+
+
+ Here we show how to customize the screens users see most: the search top screen and the search results list screen.
+
+
+ We show how to change the logo image file.
+ The screens are described in simple JSP files, so anyone with knowledge of HTML can change the design itself.
+
+ First, the search top screen is the 'webapps/fess/WEB-INF/view/index.jsp' file.
+
+ To change the image shown on the search top screen, replace the file named 'logo.gif' with the image you want.
+ Image files are placed in 'webapps/fess/images'.
+
+ Tags such as <s:form> and <bean:message> are JSP tags.
+ For example, <s:form> is converted to a form tag in the actual HTML view.
+ For details, see the SAStruts site or JSP references.
+
+ The search results list screen is the 'webapps/fess/WEB-INF/view/search.jsp' file.
+
+ To change the image that appears at the top of the results screen, change the 'logo-head.gif' file.
+ As with 'logo.gif', place it in 'webapps/fess/images'.
+
+ If you want to change the CSS used by the JSP files, edit 'style.css' located in 'webapps/fess/css'.
+
+
+
+
+
+ We discussed the Fess full-text search system, from installation through searching and simple customization.
+ As shown, you can easily build a search system with just a Java runtime environment and no special infrastructure.
+ Site search functionality can also be added to an existing system, so please give it a try.
+
+ Next time, we will introduce the mobile site search features of Fess.
+
+
+
+
diff --git a/src/site/en/xdoc/articles/article-2.xml b/src/site/en/xdoc/articles/article-2.xml
new file mode 100644
index 000000000..5abb5af93
--- /dev/null
+++ b/src/site/en/xdoc/articles/article-2.xml
@@ -0,0 +1,195 @@
+
+
+
+ Building an Apache Solr based search server with Fess: Mobile Edition
+ Shinsuke Sugaya
+
+
+
+
+
+ Last time, in the introduction chapter, we showed how to build an open-source full-text search server with Fess.
+ This time, we introduce how to make Fess searchable from docomo, au, and Softbank mobile phones.
+
+ This article covers Fess 8.1.0. For how to build Fess, see the introduction chapter.
+
+
+
+
+
+
Those who want to build a search system for mobile devices
+
Those who are considering adding search functionality to existing mobile sites
+
+
+
+
+
+ The content of this article was verified in the following environment.
+
+
+
+ Windows 7 (Service Pack1)
+
+
+ JDK 1.7.0_21
+
+
+
+
+
+
+ To make a full-text search system available to mobile devices, a system generally needs the following capabilities.
+
+
+
To detect the mobile device and render appropriately for the handset
+
To specify the user agent used when crawling
+
To include carrier information in the index
+
To convert PC Web sites for mobile viewing when they appear in search results
+
+
+ Fess supports all of the above. For detecting the mobile device, it adopts mobylet.
+ mobylet is an open-source Java framework for building mobile Web applications.
+ With mobylet, Fess identifies docomo, au, and Softbank handsets and renders results appropriately for each.
+
+ In Fess, you can set the user agent used for crawling in the Web crawl settings.
+ By crawling with a carrier's user agent, you can fetch sites intended for that carrier's mobile phones.
+ However, for mobile sites that use IP restrictions, you must allow the Fess server's IP address so the mobile site can be fetched and displayed.
+ Also, all carriers are selected by default in [Browser] in the Web crawl settings, but by selecting only specific carriers you can make the results viewable only on those carriers' handsets.
+
+ If a PC site appears in the search results, it normally cannot be viewed on a mobile device (unless a PC site viewer or the like is used).
+ For this, you can use the Google Wireless Transcoder in Fess.
+ Google Wireless Transcoder is a service provided by Google, Inc. that converts PC sites for various mobile phones.
+ With a simple setting, Fess routes search result links through the Google Wireless Transcoder, so PC sites in the results can be browsed smoothly on mobile phones.
+
+
+
+
+ This assumes Fess 8.1.0 is installed and running.
+
+
+
+
+ Create Web crawl settings whose results are displayed only when searching from docomo handsets.
+
+
+ Access the management page at http://localhost:8080/fess/admin and log in.
+ By default, the user name and password are both admin.
+ Select [Web] from the menu on the left of the admin page.
+ Nothing is registered in the initial state, so select [Create New].
+
+
Select [Create New]
+
+
+
+
+
+ This time, we crawl all pages under http://fess.codelibs.org/ja/, although it is not a mobile phone site.
+ If you have a mobile site that can be displayed on docomo handsets, specify that site's URL instead of http://fess.codelibs.org/ja/.
+
+ In [Browser], select only DoCoMo so that the results appear only on docomo handsets.
+ If you also want to display them on au and Softbank handsets, select those here.
+
+ Next, set the user agent to a docomo handset user agent.
+ This time, we enter DoCoMo/2.0 P903i.
+
+
Web crawl settings for DoCoMo
+
+
+
+
+
+ Then click [Create], and register the crawl settings on the confirmation screen.
+ After registration, they can be changed from [Edit].
+
+
+
+
+ Configure the Google Wireless Transcoder so that PC sites in the search results can be viewed.
+ If you search only mobile sites and exclude PC sites, this setting is not needed.
+
+ Select [Crawl General] from the menu on the left of the admin page.
+ In [Mobile Conversion], select Google Wireless Transcoder.
+
+
Mobile conversion configuration
+
+
+
+
+
+ Click [Update] to save the settings.
+
+
+
+
+
+ After finishing the mobile settings, start the crawl and create a searchable index.
+ Select [System Settings] from the menu on the left of the admin page.
+
+
System settings
+
+
+
+
+
+ Click [Start Crawling] to start crawling and indexing.
+ Wait a while for the crawl to complete.
+
+
+
+
+ First, try searching in a PC browser such as Internet Explorer.
+ Visit http://localhost:8080/fess and search for Fess.
+
+
Search in PC browser
+
+
+
+
+
+ You can confirm that, in a PC browser, the results from the docomo-only Web crawl settings are not displayed.
+
+
+ Next, access it from a docomo handset. This time, instead of a real handset, we use the FireMobileSimulator add-on for Firefox to check the results.
+ FireMobileSimulator is a Firefox add-on that simulates the mobile phone browsers of the three major carriers.
+ Install FireMobileSimulator in Firefox, then select the docomo terminal P903i from [Tools] > [FireMobileSimulator].
+ With this setting, Firefox simulates the P903i handset environment when accessing pages.
+ As with the PC browser, visit http://localhost:8080/fess and search for Fess.
+
+
Search in DoCoMo handsets
+
+
+
+
+
+ This time, the results from the Web crawl settings we configured are displayed.
+
+
+
+
+
+
+ We introduced how to support mobile handsets in the Fess full-text search system.
+ As shown, you can provide search functionality to the handsets of the three major carriers with simple settings.
+ New phone models are released regularly, but you can keep up by updating the terminal information files in 'webapps/fess/WEB-INF/classes/device'.
+ For how to update the device information files, see the README in that directory.
+
+ Next time, we will introduce the role feature, which switches the displayed search results depending on user authentication.
+
+
+
+
diff --git a/src/site/en/xdoc/articles/article-3.xml b/src/site/en/xdoc/articles/article-3.xml
new file mode 100644
index 000000000..c585a94d7
--- /dev/null
+++ b/src/site/en/xdoc/articles/article-3.xml
@@ -0,0 +1,303 @@
+
+
+
+ Building an Apache Solr based search server with Fess: Role-based Search
+ Shinsuke Sugaya
+
+
+
+
+
+ Last time, in the mobile edition, we introduced how to build a mobile-friendly search system with Fess.
+ This article introduces role-based search, one of the distinctive features of Fess.
+
+ This article covers Fess 8.2.0. For how to build Fess, see the introduction chapter.
+
+
+
+
+
Those building a search system with authentication, as seen in portal sites
+
Those who want to build an environment where each user searches within their viewing permissions
+
+
+
+
+
+ The content of this article was verified in the following environment.
+
+
+
+ CentOS 5.5
+
+
+ JDK 1.6.0_22
+
+
+
+
+
+
+ Role-based search in Fess partitions search results per user, based on authentication information from an arbitrary authentication system.
+ For example, when sales representative A, who has the sales department role, searches, documents carrying sales department role information appear in the results; when engineer B, who does not have the sales role, searches, they do not.
+ Using this feature in a portal or single sign-on environment where users log in, you can enable search scoped by department or job title.
+
+ Role-based search in Fess can retrieve role information from the following places.
+
+
+
Request parameter
+
Request header
+
Cookies
+
J2EE authentication information
+
+
+ As for how role information is passed to Fess: in a portal server or agent-based single sign-on system, authentication information is saved in cookies for the domain and path where Fess runs.
+ In a reverse-proxy-type single sign-on system, the proxy adds authentication information to request headers or request parameters when accessing Fess, and Fess retrieves the role information from them.
+ In this way, Fess can work with various authentication systems to partition search results.
+
+ If you are running your own authentication system, you can cope by providing a class that implements the jp.sf.fess.helper.RoleQueryHelper interface. Place the class somewhere on the classpath, such as 'webapps/fess/WEB-INF/classes', and specify it in 'webapps/fess/WEB-INF/classes/fess.dicon' in place of jp.sf.fess.helper.impl.RoleQueryHelperImpl.
+
+
+
+
+
+ Install Fess 8.2.0. If you have not installed it yet, refer to the introduction chapter.
+
+
+ This article describes role-based search using J2EE credentials (Tomcat authentication). Fess can work with various authentication systems, but Tomcat authentication provides an existing login screen in the Fess environment without building a separate authentication system, so we use it here.
+
+
+
+
+ First, add the Tomcat users to whom the partitioned search results will be shown.
+ This time, create two roles: sales (sales) and engineering (eng).
+ Then add two users: taro, who belongs to the sales role, and hanako, who belongs to the eng role. Write the user information in 'conf/tomcat-users.xml' as below.
+
+ This setting is not required if you use an existing authentication system.
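+The original listing is not preserved; a sketch of the entries described above (the passwords are placeholders):
+<![CDATA[
+<!-- assumed reconstruction of the roles and users described in the text -->
+<role rolename="sales"/>
+<role rolename="eng"/>
+<user username="taro" password="taro123" roles="sales"/>
+<user username="hanako" password="hanako123" roles="eng"/>
+]]>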
+
+
+
+
+
+ Next come the Fess settings. First, in 'webapps/fess/WEB-INF/classes/fess.dicon', the roleQueryHelper component sets the default roles and how authentication information is retrieved. Since we use J2EE authentication information, set roleQueryHelper in fess.dicon as follows.
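+The original listing is lost; based on the interface and default-role descriptions in this article, the setting is likely of this shape:
+<![CDATA[
+<component name="roleQueryHelper" class="jp.sf.fess.helper.impl.RoleQueryHelperImpl">
+  <!-- default role applied to users who are not logged in (see below) -->
+  <property name="defaultRoleList">{"default"}</property>
+</component>
+]]>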
+
+ The default role is set as above.
+ Users who are not logged in are treated as searching with the default role.
+ If you do not specify a default role, all search results are displayed when not logged in.
+
+ For reference, here is how to configure the cases where J2EE authentication information is not used.
+ To take the authentication information from a request parameter, configure as follows.
+ The request parameter key, fessRoles here, can pass role information as comma-separated values. For example, to search as a user with the sales and admin roles, fessRoles is appended to the URL, as in 'http://hostname/fess/search?fessRoles=sales,admin'.
+ encryptedParameterValue is set to false here, so the value is passed in plain text; setting it to true lets you encrypt the value part of fessRoles with Blowfish or AES.
+ If you encrypt the value, you need to configure a FessCipher component so that the value can be decrypted.
+
+ To take the authentication information from a request header, configure as shown below.
+
+ Log in as the admin user and click [Role] in the left menu to list the roles.
+ We create the following three roles.
+
+
Role list
+
+
+
+
Role name
+
Value
+
+
+
Default
+
default
+
+
+
Sales Department
+
sales
+
+
+
Technology Department
+
eng
+
+
+
+
+
+
+
+
+ Next, create the crawl settings. This time, users with the sales department role can search only http://www.n2sm.net/, and users with the technology department role can search only http://fess.codelibs.org/.
+ To create the crawl settings, click [Web] in the left menu to list the Web crawl settings.
+ Click [Create New] and create the Web crawl settings. First, for sales, create a crawl setting for http://www.n2sm.net/ and select Sales Department in the [Role] item. Next, create a crawl setting for http://fess.codelibs.org/ and select Technology Department in [Role].
+
+
+
The [Role] item in the Web crawl settings
+
+
+
+
+
+
+
+
+ After registering the crawl settings, click [System Settings] in the left menu, then click the Start button in the system settings screen to start the crawl.
+ Wait a while for the crawl to complete.
+
+
+
+
+ After the crawl, access http://localhost:8080/fess without logging in and search for a word such as 'fess' to confirm that no role-restricted results are displayed.
+ Then log in as taro and search the same way. Since the taro user has the sales role, only results from http://www.n2sm.net/ are displayed.
+
+
+
Search screen in the sales role
+
+
+
+
+
+ Log out the taro user and log in as hanako. Since the hanako user has the eng role, searching the same way displays only results from http://fess.codelibs.org/.
+
+
+
Search screen with the eng role
+
+
+
+
+
+
+
+
+
+
+ We introduced role-based search, a security feature of Fess.
+ This article mainly covered role-based search using J2EE authentication information, but passing authentication information to Fess is a generic mechanism, so various authentication systems can be accommodated.
+ This makes it possible to build systems that partition search results per user attribute, as required in corporate portal sites or in searches that respect shared folder browsing permissions.
+
+ Next time, we will introduce the Ajax features that Fess offers.
+
+
+
+
diff --git a/src/site/en/xdoc/articles/article-4.xml b/src/site/en/xdoc/articles/article-4.xml
new file mode 100644
index 000000000..9e7773477
--- /dev/null
+++ b/src/site/en/xdoc/articles/article-4.xml
@@ -0,0 +1,268 @@
+
+
+
+ Building an Apache Solr based search server with Fess: REST API
+ Shinsuke Sugaya
+
+
+
+
+
+ Last time, in the role-based search part, we introduced how to use Fess where user permissions are required.
+ This time, we introduce how to perform searches and display results on the client side (browser side) using the Fess REST API.
+ With the REST API, you can use Fess as a search server for an existing Web system and integrate it by changing only the HTML.
+
+ This article covers Fess 8.1.0. For how to build Fess, see the introduction chapter.
+
+
+
+
+
Those who want to add search functionality to existing Web systems
+
Those who want to build a search system with AJAX
+
+
+
+
+
+ The content of this article was verified in the following environment.
+
+
+
IE 10
+
Firefox 21
+
+
+
+
+
+ In addition to the normal HTML search screens, Fess can return search results as XML or JSON (including JSONP) responses through its REST API.
+ Using the REST API, you can set up a Fess server and have an existing system simply query it for search results.
+ The result formats, XML and JSON, are independent of any development language, so Fess is easy to integrate into non-Java systems as well.
+ XML and JSON are supported by JavaScript libraries, so they are also easy to handle with Ajax.
+
+ For details of what the Fess REST API returns, see the Fess site.
+
+ Fess uses Apache Solr as its internal search engine.
+ Solr also provides XML and JSON APIs, but they differ from the Fess API.
+ The benefit of the Fess API over the Solr API is that Fess-specific features, such as search log management and viewing-permission control, are available through it.
+ Using the Solr API directly may suit systems that implement crawling and document structure themselves, but the crawling and search features Fess adds can greatly reduce development cost.
+
+
+
+
+ This section describes how to build a site using the Fess REST API.
+ We use the JSON response for interaction with the Fess server.
+ The Fess server used this time is the demo server published by the Fess project.
+ If you want to use your own Fess server, install Fess 4.0.0 or later.
+ JSONP is supported from Fess 4.0.0.
+
+
+
+ When using Ajax, you must be aware of the same-origin policy, a browser security model.
+ You can use JSON if the HTML displayed in the browser and the Fess server are in the same domain, but you must use JSONP if they are in different domains.
+ You can use JSONP with the Fess REST API by passing a callback key and value in the request parameters of the JSON request.
+
+
Same-origin policy: when Fess, which returns the search results (JSON), is in another domain, JSONP must be used.
+
+
+
+
+
+ This article proceeds on the assumption that the HTML and the Fess server are in different domains.
+ The examples therefore use JSONP; if they are in the same domain, just remove the callback from the request parameters.
+
+
+
+
+ This time, the search processing is implemented in JavaScript within HTML.
+ jQuery is used as the JavaScript library.
+ jQuery makes processing such as Ajax easy to implement.
+ The files to create are the following.
+
+
'index.html': HTML file that displays the search form and search results
+
'fess.js': JS file that communicates with the Fess server
+
+
+ The example built here implements the following features.
+
+
+
Sending a search request with the search button
+
Displaying the search results list
+
Paging through the search results
+
+
+
+
+
+ Create the HTML for the search form and search results.
+ For clarity, this example keeps a simple tag structure without adjusting the design with CSS.
+ The single HTML file below is used.
+ Looking under the body tag: first, the div tag with the id attribute header holds the search input box and search button.
+ A hidden form also holds the display start position (start) and the display count (num).
+ The JavaScript updates the start and num values when a search request is submitted.
+ However, this sample code shows one fixed page size and has no ability to change the displayed count, so the value of num never changes.
+ Also, when JavaScript is enabled, submitting the search form does not send the form and no page transition occurs; search requests are communicated via Ajax.
+
+ The next div tag, subheader, displays information about the search, such as the number of hits.
+ The result div tag displays the search results and paging links.
+
+ Finally, load the jQuery JS file and the 'fess.js' file created next.
+ You could save the jQuery JS file in the same directory as index.html, but this time we fetch it via the Google CDN.
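+The original listing is not preserved; below is a minimal sketch of index.html matching the description above. The element ids and the jQuery version are assumptions:
+<![CDATA[
+<html>
+<head><title>Search</title></head>
+<body>
+  <!-- header: search input box and search button -->
+  <div id="header">
+    <form id="searchForm" action="#">
+      <input id="query" type="text" name="query"/>
+      <input id="searchButton" type="submit" value="Search"/>
+      <!-- hidden fields holding the display start position and count -->
+      <input id="start" type="hidden" name="start" value="0"/>
+      <input id="num" type="hidden" name="num" value="20"/>
+    </form>
+  </div>
+  <!-- subheader: information such as the number of hits -->
+  <div id="subheader"></div>
+  <!-- result: search results and paging links -->
+  <div id="result"></div>
+  <!-- load jQuery via the Google CDN, then fess.js -->
+  <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js"></script>
+  <script src="fess.js"></script>
+</body>
+</html>
+]]>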
+
+
+
+
+ Next, create 'fess.js', the JS file that communicates with the Fess server and displays the search results.
+ Create 'fess.js' with the following contents and put it in the same directory as index.html.
+ The processing in 'fess.js' runs after the DOM of the HTML file has been built.
+ First, (1) specifies the Fess server URL.
+ Here we specify the Fess public demo server.
+ JSONP is used to get the search result JSON data from the external server.
+ If you use JSON instead of JSONP, 'callback=?' is not required.
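+The original listing was not preserved in this translation. Below is a rough sketch reconstructed from the numbered descriptions that follow; the server URL, element ids, and JSON field names (response.status, response.result, recordCount) are assumptions, not the exact original code:
+<![CDATA[
+$(document).ready(function() {
+  // (1) Fess server URL (placeholder); callback=? enables JSONP, remove it for plain JSON.
+  var baseUrl = "http://localhost:8080/fess/json?callback=?";
+  // (2) keep the search button jQuery object in a variable for reuse.
+  var $searchButton = $("#searchButton");
+  // (3) the search processing function.
+  var doSearch = function(navi) {
+    // (4) get the start position and count from the hidden fields, with defaults.
+    var start = parseInt($("#start").val());
+    var num = parseInt($("#num").val());
+    if (isNaN(start) || start < 0) start = 0;
+    if (isNaN(num) || num < 1 || num > 100) num = 20;
+    // (5) adjust start: -1 previous page, 1 next page, anything else first page.
+    if (navi == -1) start -= num; else if (navi == 1) start += num; else start = 0;
+    if (start < 0) start = 0;
+    // (6) run the search only if a query was entered.
+    var query = $("#query").val();
+    if (query == "") return false;
+    // (7) disable the search button to prevent double submits.
+    $searchButton.attr("disabled", true);
+    // (8) assemble the request URL from (1), the query, start, and num.
+    var url = baseUrl + "&query=" + encodeURIComponent(query)
+      + "&start=" + start + "&num=" + num;
+    // (9) send the request; use dataType "json" instead of "jsonp" for same-domain JSON.
+    $.ajax({url: url, dataType: "jsonp",
+      success: function(data) {
+        // (10) status is 0 when the search request was processed successfully.
+        if (data.response.status == 0) {
+          if (data.response.recordCount == 0) {
+            // (11) no hits: clear the subheader and show a message.
+            $("#subheader").empty();
+            $("#result").text("No results.");
+          } else {
+            // (12)-(16) render the hit message, result list, paging links,
+            // and save start/num back into the hidden fields (omitted here).
+          }
+        }
+        // (17) move back to the top of the page, since no reload occurs.
+        $(document).scrollTop(0);
+      },
+      complete: function() {
+        // (18) re-enable the search button whether the request succeeded or failed.
+        $searchButton.attr("disabled", false);
+      }});
+    // (19) prevent the form submission or link navigation from proceeding.
+    return false;
+  };
+  // (20) search form submit event.
+  $("#searchForm").submit(function() { return doSearch(0); });
+  // (21)(22) paging links are added dynamically, so register with delegate.
+  $("#result").delegate(".prev", "click", function() { return doSearch(-1); });
+  $("#result").delegate(".next", "click", function() { return doSearch(1); });
+});
+]]>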
+
+ 2 save the jQuery object for the search button.
+ Maintain the variable to use the jQuery object search button several times and reuse.
+
+ 3 defines search functions.
+ Contents of this function is described in the following sections.
+
+ 20 registers events when the search form is submitted.
+ Search button press or when the decision was made in the search input field Enter key press occurs when registered at 20.
+ Search processing function doSearch call when the event occurs.
+ used for paging is the value passed when calling the search processing function is the value of the Navi.
+
+ Register the event at 21 and 22 be added paging links are clicked.
+ You need to register event by the delegate because these links are added dynamically.
+ 20 Similarly, these events call the search functions.
+
+
+
+
+ This section describes the search processing function doSearch defined at (3).
+
+
+ (4) gets the display start position and count.
+ These values are saved as hidden values in the search form in the header area.
+ Default values are set if the values are out of range, since the start position must be 0 or more and the count is expected to be between 1 and 100.
+
+ (5) examines the value of the navi parameter passed to doSearch at event registration and adjusts the display start position.
+ Here, -1 means moving to the previous page, 1 means moving to the next page, and any other value means moving to the first page.
+
+ (6) decides whether to run the search: if a value has been entered in the search input field the search runs, and if it is empty the process ends without doing anything.
+
+
+ (7) disables the search button while contacting the Fess server, to prevent double submits.
+
+
+ (8) assembles the URL for the Ajax request.
+ It joins the URL from (1) with the search words, start position, and count.
+
+ (9) sends the Ajax request.
+ Since JSONP is used, jsonp is specified for dataType.
+ Change it to json to use JSON.
+ When the request returns normally, the success function runs.
+ The search results returned from the Fess server are passed to success as an argument object.
+
+ First, (10) checks the status in the response.
+ It is set to 0 if the search request was processed successfully. For more information on the Fess JSON response, see the Fess site.
+
+
+ (11) is the conditional branch for when the search request was processed correctly but no results were hit: it empties the subheader area and displays a no-hits message in the result area.
+
+
+ (12) is the conditional branch for when search results were hit, and performs the result processing.
+ (13) sets a message with the execution time and hit count in the subheader area.
+ (14) adds the search results to the result area.
+ The search results are stored as the array data.response.result.
+ Accessing results[i].<field name> retrieves the field values of each result document.
+ (15) adds the current page number and links to the previous and next pages to the result area.
+ (16) saves the current start position and count into the hidden fields of the search form.
+ The start position and count are reused in the next search request.
+ The starting position and number of search requests at next reuse.
+
+ (17) changes the scroll position of the page.
+ Because the page itself is not reloaded when a next-page link is clicked, scrollTop is used to move to the top of the page.
+
+ (18) re-enables the search button after the search processing.
+ It is placed in complete, which is called whether the request succeeds or fails.
+
+ (19) returns false after the search function is called, to prevent the form submission or link navigation from being sent.
+ This prevents page transitions.
+
+
+
+
+ Access 'index.html' in a browser.
+ The search form is displayed:
+
+
Search form
+
+
+
+
+
+ Enter a suitable search term and press the search button to display the results.
+ The default display count is 20; if more results are hit, links to the following pages are displayed at the bottom of the list.
+
+
Search results
+
+
+
+
+
+
+
+
+
+
+ We tried building a jQuery-based client-side search site using the Fess REST API.
+ The REST API can be used not only from browser-based applications but also from other applications that call Fess.
+
+ Next time, we will show the database crawl functionality, which adds full-text search capabilities to an existing database.
+
+
+
+
diff --git a/src/site/en/xdoc/dev/getting-started.xml b/src/site/en/xdoc/dev/getting-started.xml
new file mode 100644
index 000000000..9e6bc5925
--- /dev/null
+++ b/src/site/en/xdoc/dev/getting-started.xml
@@ -0,0 +1,73 @@
+
+
+
+ Open source full-text search server - Fess development overview
+ Shinsuke Sugaya
+
+
+
+
+
+
+
+
This page summarizes the information needed to develop Fess.
+
+
+
Fess is developed as an application that runs on Java 7 or later. The following knowledge is necessary.
+
+
Java
+
Seasar 2
+
SAStruts (if you are developing a Web screen)
+
DBFlute (if you are developing around the DB)
+
SOLR (if you are developing around the search index)
+
S2Robot (if you are developing around the crawler)
+
+
This guide develops with Eclipse and Maven (Ant is additionally needed to build the fess-server release package). You also need a downloaded and installed Fess server for development and testing, so install one in advance.
+
+
+
This section summarizes how to develop the management and search screens, using Eclipse. Your Eclipse must include WTP and the like (install the J2EE edition).
+
+
Install Java, Eclipse, Maven 3.x, and Fess in preparation. Suppose the Fess zip file is expanded into the <FESS_HOME> directory.
+
Clone the Fess source code from GitHub.
+
+
+
Import it into Eclipse as a Maven project.
+
Display the Servers view. If it is not shown, open Window > Show View > Other..., select Server > Servers in the dialog, and press OK.
+
Add a new server in the Servers view. Select Tomcat v7.0 Server, set an appropriate server name, and press Next. Add fess to Configured and press Finish. The server appears in the Servers view; double-click it to display the settings (Overview).
+
In Server Locations, select Use Tomcat installation.
+
In Timeouts, change Start to 180 seconds and Stop to 60 seconds.
+
Click Open launch configuration in General Information, then open the Arguments tab. Add the following to VM arguments (<FESS_HOME> depends on your environment) and press OK; on Java 7, replace -XX:MaxMetaspaceSize=128m -XX:CompressedClassSpaceSize=32m with -XX:MaxPermSize=128m: '-Dsolr.solr.home=<FESS_HOME>/solr -Dfess.log.file=<FESS_HOME>/logs/fess.out -Dsolr.log.file=<FESS_HOME>/logs/solr.log -Djava.awt.headless=true -Xmx1g -XX:+UseTLAB -XX:+DisableExplicitGC -XX:MaxMetaspaceSize=128m -XX:CompressedClassSpaceSize=32m -XX:-UseGCOverheadLimit -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseParNewGC -XX:+OptimizeStringConcat'
+
Start the server from the Servers view.
+
+
If you want to develop with hot deploy, change src/main/resources/env.txt from 'product' to 'ct'. You can then change the source code without restarting Tomcat (Fess).
+
+
+
The crawl process is started as a separate process by Tomcat (Fess). If you want to step through it in a debugger, register it in Eclipse as a Java application for debugging, as follows.
+
+
Register it as a normal Java Application to debug it in Eclipse. The main class is jp.sf.fess.exec.Crawler.
+
In the launch settings, first pass an appropriate session ID, such as '-sessionId 20100101000000', in the program arguments. Then expand the contents of Fess's bin/setenv.sh into the VM arguments and register them.
+
Add /fess/src/main/webapp/WEB-INF/cmd and geronimo-servlet_2.4_spec-1.0.jar to the classpath settings.
+
Run it.
+
+
+
+
The Fess distribution consists of Tomcat with the Fess and Solr war files included. The distribution is built from fess-server in SVN; Ant is necessary for the build.
The project is managed with support from N2SM, Inc. We receive contributions for advertising to raise awareness of Fess, and we seek donations to support the project so that Fess reaches many people.
+
+
Money donated to support the project is used to pay for advertising such as AdWords. For project support matters such as receipts, please contact us here.
+
+
+
The Fess project sites display AdSense ads. Income generated through AdSense is invested as-is in AdWords advertising to raise awareness of and disseminate Fess.
+
Besides AdSense, if you have any good ideas for spreading Fess, please let us know.
Fess is a full-text search server that can be 'easily built in 5 minutes'. It runs on any operating system with a Java runtime environment. Fess is offered under the Apache license, free of charge.
+
+
+
+
+
+
Standard demo
+
+
+
+
Site search demo
+
+
+
+
Product search demo for EC sites
+
+
+
+
+
+
+
A full-text search server can be built easily in 5 minutes
+
Offered under the Apache license (free software, available at no cost)
+
OS-independent (Java based build)
+
Web, file system, Windows shared folder and database crawling.
+
Support for many file formats, such as MS Office and PDF
Fess is an Apache-licensed open-source product, freely available for both personal and commercial use.
+
+
If you need Fess customization, implementation, or support services, please see our commercial support. Commercial support also covers areas such as performance tuning, search quality, and slow crawling.
+
+
+
diff --git a/src/site/en/xdoc/news.xml b/src/site/en/xdoc/news.xml
new file mode 100644
index 000000000..5aaca88da
--- /dev/null
+++ b/src/site/en/xdoc/news.xml
@@ -0,0 +1,153 @@
+
+
+
+ Open source full-text search server - news list
+ Shinsuke Sugaya
+
+
+
+
+
+
+
+
* The steps are the same for other versions, but installing Java 7u25 is recommended. [2014/3/13]
+
Click 'Download JDK' for Java SE 7. (JavaScript must be enabled for the download to work.)
+
+
+
You can check whether JavaScript is enabled as follows (for Internet Explorer 9):
+
+
Click [tools] on the menu bar.
+
Click the [Internet Options].
+
Click the Security tab.
+
Click the [custom level].
+
Scroll until the [Scripting] section is displayed.
+
In the Active scripting section, confirm that the Enable radio button is selected.
+
If it is disabled, enable it and click OK.
+
+
+
+
+
Read 'The Oracle Binary Code License Agreement for Java SE' and, if you accept it, check Accept License Agreement.
+
+
+
+
Download the JDK for the OS you are installing on: select Windows x64 for 64-bit Windows and Windows x86 for 32-bit Windows (the example below uses the 64-bit version).
+
+
You can check which type your PC uses as follows (for Windows 7):
+
+
In Control Panel → System and Security → [System], the system type is shown.
+
+
+
+
+
+
Run the JDK installer (jdk-7uXX-windows-x64.exe, where XX is the update release you downloaded). The following is an example for the Windows 64-bit edition.
+
+
Depending on your Windows settings, a dialog asking 'Do you want to allow the following program to make changes to this computer?' may appear. If so, please click the [Yes] button.
+
+
+
The installer will launch. Press the [next] button.
+
+
You can change the destination folder. The default is fine; press the [next] button.
+
+
The JDK installation starts; wait for it to finish.
+
+
+
+
After the JDK is installed, the JRE installer runs and installs the JRE on your PC. As with the JDK, you can change the destination folder, but the default is fine. Press the [next] button.
+
+
The JRE installation starts; wait a while.
+
+
+
+
The installation complete message appears. Press the [close] button.
+
+
Installation is complete.
+
+
The installed components are the following two. You can check them as follows (for Windows 7):
+
+
They are listed under [Control Panel] → [Programs] → [Programs and Features].
+
+
Java SE Development Kit 7 Update XX (64-bit)
+
Java(TM) 7 Update XX (64-bit)
+
+
+
+
+
+
+
Environment variables are configuration information passed to programs. To run the JDK commands from a command prompt after installing Java, you must set environment variables.
+
On Windows 7, set them as follows. Select Control Panel → System and Security → [System] → Advanced system settings → Environment Variables.
+
+
Click system and security.
+
+
Click the 'system'.
+
+
Click 'Advanced system settings'.
+
+
Click environment variables.
+
+
Click the New button under System variables at the bottom of the screen.
+
+
Enter "JAVA_HOME" variable name.
+
+
Enter the directory where you installed the JDK as the variable value.
+
Open C:\Program Files\Java in Explorer and look for a folder whose name begins with 'jdk'.
+
For example, if you installed JDK version 1.7.0_XX, the value will be C:\Program Files\Java\jdk1.7.0_XX. (XX is the version portion.)
+
Enter the value and press 'OK'.
+
In the list of system environment variables, locate the row for the 'Path' variable.
+
+
Press the Edit button and append the string ';%JAVA_HOME%\bin' to the end of the variable value, then click 'OK'.
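The same settings can be tried out for the current command prompt session only; the GUI steps above are what make them permanent, and the JDK path below is an example that must match your installation:
+
set "JAVA_HOME=C:\Program Files\Java\jdk1.7.0_XX"
set "Path=%Path%;%JAVA_HOME%\bin"
java -version
+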
From the release file list at the linked URL, download 'fess-server-9.x.y.zip'.
+
+
+
+
Unzip the downloaded zip file. In a Windows environment, extract it with a zip decompression tool.
+
+
If you are installing in a UNIX environment, add execute permission to the scripts in bin.
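For example, from a UNIX shell, assuming the archive unpacks into a directory named after the release:
+
unzip fess-server-9.x.y.zip
cd fess-server-9.x.y
chmod +x bin/*.sh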
+
+
+
+
Open the unzipped folder by double-clicking.
+
+
Open the bin folder by double-clicking.
+
+
+
+
Double-click the startup.bat file in the bin folder to start Fess.
+
+
In UNIX environments, run startup.sh instead.
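For example, from the directory you unpacked into:
+
./bin/startup.sh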
+
+
+
+
A command prompt window appears as it starts. When 'Server startup...' appears as the last line, setup is complete.
+
+
+
+
+
Access http://localhost:8080/fess/ to confirm that it started.
+
The management UI is at http://localhost:8080/fess/admin. The default administrator account user name/password is admin/admin. The administrator account is managed by the application server; the Fess management UI treats a user authenticated by the application server with the fess role as an administrator.
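One way to confirm startup from a shell, assuming curl is available; this should print 200 once Fess is up:
+
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/fess/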
+
+
+
+
To stop Fess, double-click the shutdown.bat file in the bin folder.
+
+
In UNIX environments, run shutdown.sh instead.
+
+
+
+
+
The administrator account is managed by the application server. The Fess server ships with Tomcat as standard, so users are changed the same way as in Tomcat. To change the password of the admin account, modify conf/tomcat-users.xml.
+
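+A typical set of entries looks like the following sketch; the user names and roles here reflect the stock Fess configuration, and the password values shown are the defaults that should be changed:
+
+<tomcat-users>
+  <role rolename="fess"/>
+  <role rolename="solr"/>
+  <user username="admin" password="admin" roles="fess"/>
+  <user username="solradmin" password="solradmin" roles="solr"/>
+</tomcat-users>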
+
+
+
+To access the Solr instance inside the Fess server, a password is required.
+Change the default passwords in production environments.
+
+To change the password, first change the password attribute of the solradmin user in conf/tomcat-users.xml (see the example above).
+
+
+
+Then modify the following three files: webapps/fess/WEB-INF/classes/solrlib.dicon, fess_suggest.dicon, and solr/core1/conf/solrconfig.xml.
+Set the password you specified in tomcat-users.xml as the password in each of them.
+
+In solrlib.dicon, modify the following part:
+
+
+ "solradmin"
+ "solradmin"
+
+]]>
+
+In fess_suggest.dicon, the corresponding part is:
+
+
+ "solradmin"
+ "solradmin"
+
+]]>
+
+In solr/core1/conf/solrconfig.xml, make the same change in the corresponding place.
+
This is a list of the file formats that Fess can crawl and search.
+
+
Type             Extension
Text             txt
XML              xml, mm
HTML             html
MS Office        ppt, doc, pptx, xls, xlsx, docx
PDF              pdf
Source code      js, c, h, java, hpp, cpp
Compressed file  gz, tar, zip
+
+
+
Fess also attempts to extract text from various unknown file types, so files not listed above can still be crawled and searched. If you have files you would like checked, please send a pull request to the search system test data repository.
+
+
+
The following file formats are supported under commercial support.
+
+
Ichitaro
+
OASYS for Windows
+
DocuWorks
+
AutoCAD
+
+
+
+
diff --git a/src/site/tools/translatexml b/src/site/tools/translatexml
new file mode 100755
index 000000000..50d2744a7
--- /dev/null
+++ b/src/site/tools/translatexml
@@ -0,0 +1,26 @@
+#!/bin/bash
+# Uses bash-specific string substitution (${var/pat/repl}) below.
+CWD="${0%/*}"
+
+if [ $# -ne 2 ]; then
+ echo "Usage: $0 input_xml_dir output_xml_dir" 1>&2
+ exit 1
+fi
+
+in_dir=$1
+out_dir=$2
+
+find "$in_dir" -name "*.xml" -print | while read -r in; do
+  # Map each input path to the corresponding output path (bash substitution)
+  out=${in/${in_dir}/${out_dir}}
+  d=$(dirname "$out")
+  mkdir -p "$d"
+  echo "$out" 1>&2
+  # Translate, then repair a recurring mistranslation of "crawl"
+  ruby "${CWD}/translatexml.rb" -f ja -t en -i "$in" | \
+    sed -e '
+      s/Croll/Crawl/g
+      s/croll/crawl/g
+      s/Crolling/Crawling/g
+      s/crolling/crawling/g
+    ' \
+    > "$out"
+done
+
diff --git a/src/site/tools/translatexml.rb b/src/site/tools/translatexml.rb
new file mode 100644
index 000000000..8bd42c472
--- /dev/null
+++ b/src/site/tools/translatexml.rb
@@ -0,0 +1,183 @@
+# -*- coding:utf-8 -*-
+require 'net/http'
+require 'uri' # URI.escape is used in Translator#translate
+require 'rexml/document'
+require 'json'
+require 'rest_client'
+require 'optparse'
+
+#
+# parsing command line options
+#
+options = {}
+
+optparse = OptionParser.new do |parser|
+ parser.on('-f lang', '--from lang', 'Input language can be \'ja\', \'en\',.. (Required)') {|v| options[:from] = v}
+ parser.on('-t lang', '--to lang', 'Output language can be \'ja\', \'en\',.. (Required)') {|v| options[:to] = v}
+ parser.on('-i [file]', '--input [file]', 'Input file (Optional)') {|v| options[:input] = v}
+ parser.on('-o [file]', '--output [file]', 'Output file (Optional)') {|v| options[:output] = v}
+ parser.on('-v', '--verbose', 'Verbose message to STDERR (Optional)') {|v| options[:verbose] = v}
+ parser.on('-d', '--dryrun', 'Dry-run (don\'t translate) (Optional)') {|v| options[:dryrun] = v}
+end
+
+begin
+ optparse.parse!
+ mandatory = [:from, :to]
+ missing = mandatory.select{ |param| options[param].nil? }
+ if not missing.empty?
+ STDERR.puts "Missing options: #{missing.join(', ')}"
+ STDERR.puts optparse
+ exit
+ end
+rescue OptionParser::InvalidOption, OptionParser::MissingArgument
+ STDERR.puts $!.to_s
+ STDERR.puts optparse
+ exit
+end
+
+$from = options[:from]
+$to = options[:to]
+
+if (options[:input] == nil)
+ $input = STDIN
+else
+ $input = open(options[:input])
+end
+
+if (options[:output] == nil)
+ $output = STDOUT
+else
+ $output = open(options[:output], "w")
+end
+
+$verbose = options[:verbose]
+$dryrun = options[:dryrun]
+
+#
+# Translator class
+#
+class Translator
+ CLIENT_ID = 'Set Client ID'
+ CLIENT_SECRET = 'Set Client Secret Key'
+ AUTHORIZE_URL = 'https://datamarket.accesscontrol.windows.net/v2/OAuth2-13'
+ TRANSLATION_URL = 'http://api.microsofttranslator.com/V2/Http.svc/Translate'
+ SCOPE = 'http://api.microsofttranslator.com'
+
+ @@access_token = nil
+
+ def get_access_token
+ unless @@access_token == nil
+ return @@access_token
+ end
+ json = JSON.parse(
+ RestClient.post(AUTHORIZE_URL,
+ {
+ 'grant_type' => 'client_credentials',
+ 'client_id' => CLIENT_ID,
+ 'client_secret' => CLIENT_SECRET,
+ 'scope' => SCOPE,
+ },
+ :content_type => 'application/x-www-form-urlencoded'
+ )
+ )
+ @@access_token = json['access_token']
+ @@access_token
+ end
+ private :get_access_token
+
+ def translate(text, from, to)
+ access_token = get_access_token
+ unless $dryrun
+ xml = REXML::Document.new(
+ RestClient.get("#{TRANSLATION_URL}?from=#{from}&to=#{to}&text=#{URI.escape(text)}",
+ 'Authorization' => "Bearer #{access_token}"
+ )
+ )
+ xml.root.text
+ else
+ "..."
+ end
+ end
+end
+
+#
+# Extends REXML::Element class
+#
+class REXML::Element
+ def has_cdata?
+ self.cdatas.length > 0
+ end
+end
+
+#
+# translateNode
+#
+def translateNode(element)
+
+ translator = Translator.new
+
+ # Translate attributes
+ if (element.is_a?(REXML::Element))
+ if (element.has_attributes?)
+ $attributes.each do |attribute|
+ text = element.attributes[attribute]
+ if /\S+/ =~ text
+ unless (text.nil? || text.empty?)
+ STDERR.puts "attributes[#{attribute}]=#{text}" if $verbose
+ element.attributes[attribute] = translator.translate(text, $from, $to)
+ STDERR.print "." unless $verbose
+ STDERR.puts " =>#{element.attributes[attribute]}" if $verbose
+ end
+ end
+ end
+ end
+
+ # Translate recursively if has children
+ if (element.has_elements?)
+ element.map.each do |child|
+ translateNode(child)
+ end
+ return
+ end
+
+ # Nothing to do if this is a CDATA section
+ if (element.has_cdata?)
+ return
+ end
+ end
+
+ # Translate the text
+ if (element.is_a?(REXML::Text))
+ text = element.value
+ if /\S+/ =~ text
+ unless (text.nil? || text.empty?)
+ STDERR.puts "text=#{text}" if $verbose
+ element.value = translator.translate(text, $from, $to)
+ STDERR.print "." unless $verbose
+ STDERR.puts " =>#{element.value}" if $verbose
+ end
+ end
+ elsif (element.is_a?(REXML::Element) && element.has_text?)
+ text = element.text
+ if /\S+/ =~ text
+ unless (text.nil? || text.empty?)
+ STDERR.puts "text=#{text}" if $verbose
+ element.text = translator.translate(text, $from, $to)
+ STDERR.print "." unless $verbose
+ STDERR.puts " =>#{element.text}" if $verbose
+ end
+ end
+ end
+end
+
+
+#
+# parsing xml and translate (main)
+#
+$attributes = Array["name", "alt", "content"]
+
+doc = REXML::Document.new($input)
+translateNode(doc.root)
+$output.puts doc.to_s
+
+STDERR.print "\n" unless $verbose