Lucene indexer

Added by Grégory Joseph, last edited by Boris Kraft on Jun 23, 2008
magnolia-lucene indexer application

It is a standalone application that can be configured to index pages from different Magnolia instances.

config/indexer.xml
<Repository config="./config/repository_config_test.xml" id="website" logName="site-1 indexer" interval="3600000" indexDir="./index" domain="http://localhost:8082">
</Repository>

config is the repository config file, the same as the one in a default Magnolia instance (WEB-INF/config/config.xml)
id is the ID of a repository as defined in the above config file
logName is the name used by log4j (any string)
interval is the indexing interval in milliseconds (3600000 above is one hour)
indexDir is the start directory in which the Lucene index is created
domain is the base URL that will be indexed recursively
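
To make the attribute list above concrete, here is a minimal sketch of how such a Repository entry could be read with the JDK's built-in XML parser. The class IndexerConfigReader and its RepositoryEntry holder are hypothetical names invented for this illustration; they are not part of the indexer, and the indexer's own parsing code may look quite different.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of the indexer): reads <Repository> entries
// from the content of config/indexer.xml.
final class IndexerConfigReader {

    // Plain value holder for the attributes of one <Repository> element.
    static final class RepositoryEntry {
        final String config;
        final String id;
        final String logName;
        final long intervalMillis;
        final String indexDir;
        final String domain;

        RepositoryEntry(Element e) {
            config = e.getAttribute("config");
            id = e.getAttribute("id");
            logName = e.getAttribute("logName");
            // interval is given in milliseconds, so a long is the natural type
            intervalMillis = Long.parseLong(e.getAttribute("interval"));
            indexDir = e.getAttribute("indexDir");
            domain = e.getAttribute("domain");
        }
    }

    // Parses the XML text and returns one entry per <Repository> element.
    static List<RepositoryEntry> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("Repository");
            List<RepositoryEntry> entries = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                entries.add(new RepositoryEntry((Element) nodes.item(i)));
            }
            return entries;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse indexer config", ex);
        }
    }
}
```

Calling IndexerConfigReader.parse with the example entry above would yield a single RepositoryEntry whose intervalMillis is 3600000 and whose domain is http://localhost:8082.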

How it works

It reads the repository as configured and streams all pages one by one, using plain HTTP calls to the specified domain.
Each indexed document has two fields, handle and data: handle is the Magnolia path of the page, and data holds the HTML (or whatever else the server returned).
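
The flow described above can be sketched roughly as follows. This is an illustration only, not the indexer's code: the class PageStreamSketch, its methods, and the rule that a page URL is domain + handle + ".html" are all assumptions made for the example. The HTTP call is passed in as a function so that the sketch stays independent of any particular HTTP client.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only; all names here are hypothetical.
final class PageStreamSketch {

    // Assumption: a page is reachable at domain + handle + ".html".
    static String pageUrl(String domain, String handle) {
        String base = domain.endsWith("/")
                ? domain.substring(0, domain.length() - 1)
                : domain;
        return base + handle + ".html";
    }

    // Builds one two-field document (handle, data) per page handle.
    // 'fetch' stands in for the HTTP call: it maps a URL to the response body.
    static List<Map<String, String>> buildDocuments(
            String domain, List<String> handles, Function<String, String> fetch) {
        List<Map<String, String>> docs = new ArrayList<>();
        for (String handle : handles) {
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("handle", handle);                             // Magnolia path
            doc.put("data", fetch.apply(pageUrl(domain, handle))); // server response
            docs.add(doc);
        }
        return docs;
    }
}
```

In a real indexer the handles would come from walking the configured repository, and each map would become a Lucene document with the same two field names.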

You can program your templates so that, when a request comes from this indexer, they return plain text or the page without navigation, etc.

How to use from your search template

The index created is a pure Lucene index. Use the Lucene API in your template and point it to the same index directory (indexDir).

In its previous incarnation on JspWiki, this page was last edited on Feb 9, 2007 10:25:51 AM
