Tuesday, May 15, 2012

Lucene Search in Alfresco

Before jumping into Lucene searching, you first have to understand the basics of Alfresco.

The principle is: everything is a NODE!
The rule is: Alfresco provides services to manage nodes.

Now let's look at Lucene itself: what is Lucene?

> Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java.
> Although Lucene provides the ability to create your own queries through its API, it also provides a rich query language through the Query Parser, a lexer which interprets a string into a Lucene Query using JavaCC.

And with Alfresco… how does it work?

1: Collect
2: Preserve
3: Use

When a node is created, the following actions take place:

> All of the item's properties are indexed.
> The related content is indexed as plain text.

When a node is created, plain text indexing of the content only takes place if the content format is one of:
– Office (OpenOffice or Microsoft Office)
– XML/HTML
– PDF
– Emails
– Plain text

Have a look at the data model!


What does the « tokenize » principle mean?
Tokenising means splitting (or not) a value into one or more keywords.
> The search can only be done on those keywords!
> So don't forget to check how each property is indexed in your data model!
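
To make the difference concrete, here is a minimal sketch of what tokenising changes for a search. It is not Alfresco's or Lucene's real analyser: the `index_terms` function and its splitting rule are assumptions made up for illustration. In Alfresco content models this behaviour is typically driven by the `tokenised` setting of a property's index configuration.

```python
import re


def index_terms(value, tokenised):
    """Return the terms a search could match for one property value.

    Deliberately simplified: a real analyser may also apply stemming,
    stop words, locale rules and so on.
    """
    if tokenised:
        # Split on anything that is not a letter or a digit,
        # and lower-case every piece.
        return {t.lower() for t in re.split(r"[^0-9A-Za-z]+", value) if t}
    # Not tokenised: the whole value is kept as one single term.
    return {value}


name = "Folder Test 2012"
print(index_terms(name, tokenised=True))   # three separate keywords
print(index_terms(name, tokenised=False))  # one keyword: the full value
```

With a tokenised property, a search on « test » can find the node; with a non-tokenised one, only the exact full value matches.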


Now for the syntax of Lucene searching:

Lucene in Alfresco enables you to query on:
• The NodeRef (ID)
• The Type of a Node
• The Properties
• The Aspects
• The keywords (content)

To query on the NodeRef:


To query on the Type:


To query on a property:


To query on an Aspect:


To query on a key word included in the content of the node:
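
As a hedged sketch, the five kinds of query listed above can be built as plain strings using the field prefixes commonly used with Alfresco's Lucene search (`ID`, `TYPE`, `@property`, `ASPECT`, `TEXT`). The helper function names and the all-zero UUID are hypothetical placeholders, not part of any Alfresco API.

```python
def escape_qname(qname):
    """Escape the colon of a prefixed name: cm:name -> cm\\:name."""
    return qname.replace(":", "\\:")


def by_noderef(noderef):
    return f'ID:"{noderef}"'


def by_type(type_name):
    return f'TYPE:"{type_name}"'


def by_property(prop, value):
    # The property name is prefixed with @ and its colon is escaped.
    return f'@{escape_qname(prop)}:"{value}"'


def by_aspect(aspect):
    return f'ASPECT:"{aspect}"'


def by_content(word):
    return f'TEXT:"{word}"'


# Placeholder UUID, for illustration only.
print(by_noderef("workspace://SpacesStore/00000000-0000-0000-0000-000000000000"))
print(by_type("cm:folder"))
print(by_property("cm:name", "Folder Test"))
print(by_aspect("cm:generalclassifiable"))
print(by_content("lucene"))
```

Each printed line is a query string that could be pasted into the node browser.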


Now let's look at some practical examples of Lucene searching:

How to make a simple query with Lucene:
> Connect as « admin » to Alfresco.
> Click on
> Choose the node browser.
> Choose the store: workspace://SpacesStore
> In the drop-down list, choose Lucene.


Example:

I want the « Folder Test »



And the result is:




> To identify a node uniquely within a store, we use its UUID.
> The concatenation of the protocol, the name of the store and the UUID of the node is the NodeRef.
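
The structure of a NodeRef can be sketched as follows. This is an illustration of the `protocol://store/uuid` layout only; the `NodeRef` class and `parse_noderef` function are hypothetical helpers written for this post, not Alfresco classes, and the UUID is a placeholder.

```python
from typing import NamedTuple


class NodeRef(NamedTuple):
    protocol: str  # e.g. "workspace"
    store: str     # e.g. "SpacesStore"
    uuid: str      # unique identifier of the node inside the store

    def __str__(self):
        return f"{self.protocol}://{self.store}/{self.uuid}"


def parse_noderef(text):
    """Split 'protocol://store/uuid' into its three parts."""
    protocol, sep, rest = text.partition("://")
    store, slash, uuid = rest.partition("/")
    if not (sep and slash and protocol and store and uuid):
        raise ValueError(f"not a NodeRef: {text!r}")
    return NodeRef(protocol, store, uuid)


# Placeholder UUID, for illustration only.
ref = parse_noderef("workspace://SpacesStore/00000000-0000-0000-0000-000000000000")
print(ref.protocol, ref.store, ref.uuid)
print(str(ref))
```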



Operators:

+    The criterion that follows is required (it must match)
-    The criterion that follows is prohibited (it must not match)
AND  Both criteria must match
OR   At least one of the criteria must match
NOT  Excludes the criterion that follows

I want all the spaces with the name « space »:


RESULT:


I want all folders with the name « space » that have a category:




I want all spaces with the name « space » that do NOT have a category:
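
As a hedged sketch, the three requests above could be written by combining criteria with AND and NOT. It assumes that « has a category » means the node carries the `cm:generalclassifiable` aspect; that choice, and the `all_of` helper, are illustrative assumptions rather than the exact queries originally shown.

```python
def all_of(*criteria):
    """Join criteria so that every one of them must match."""
    return " AND ".join(criteria)


IS_FOLDER = 'TYPE:"cm:folder"'
NAMED_SPACE = r'@cm\:name:"space"'
# Assumption: a categorised node carries this aspect.
HAS_CATEGORY = 'ASPECT:"cm:generalclassifiable"'

# All spaces (folders) with the name "space".
named_space = all_of(IS_FOLDER, NAMED_SPACE)

# ... that have a category.
with_category = all_of(named_space, HAS_CATEGORY)

# ... that do NOT have a category.
without_category = named_space + " AND NOT " + HAS_CATEGORY

for query in (named_space, with_category, without_category):
    print(query)
```

The last query can also be expressed with the + and - operators: `+TYPE:"cm:folder" +@cm\:name:"space" -ASPECT:"cm:generalclassifiable"`.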




Special operators (wildcards):

?    Replaces exactly one character
*    Replaces zero or more characters

Example:

I want all spaces with a name ending with « ace »:

TYPE:"cm:folder" AND @cm\:name:"*ace"
+TYPE:"cm:folder" +@cm\:name:"*ace"
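
To see which names a pattern such as `*ace` selects, the wildcard rules can be imitated with a small sketch. It only mimics the meaning of `?` and `*` with a regular expression; it is not how Lucene evaluates wildcards internally, and the `wildcard_to_regex` helper and the sample names are made up for illustration.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ? (exactly one character) and * (zero or more) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


names = ["space", "Workspace", "spa", "ace", "spaces"]

ends_with_ace = wildcard_to_regex("*ace")
print([n for n in names if ends_with_ace.fullmatch(n)])
# -> ['space', 'Workspace', 'ace']

one_char = wildcard_to_regex("spa?e")
print([n for n in names if one_char.fullmatch(n)])
# -> ['space']
```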




6 comments:

  1. The last example posted.. the command must end with "spa" instead of "ace" - guess this is what is right. The document was really helpful.

    1. don't go with the words , just understand the syntax..

  2. Hi ,
    Thanks for sharing wonderful knowledge with all of us.
    Could you write something more about Lucence.

    Thank You
    Simant ji
