Non-SQL storage indexing
Data can also be streamed to the batch indexer in a simple XML format called xmlpipe2, or inserted directly into an incremental RT index.
To use xmlpipe2, configure the data source in your configuration file as follows:
source example_xmlpipe_source
{
    type = xmlpipe2
    # Keep exactly one xmlpipe_command active; the alternatives are commented out.
    # Perl
    xmlpipe_command = perl /www/mysite.com/bin/sphinxpipe.pl
    # PHP
    # xmlpipe_command = php /www/mysite.com/bin/sphinxpipe.php
    # Direct file
    # xmlpipe_command = cat /www/mysite.com/bin/sphinxpipe.xml
}
The indexer runs the command specified in xmlpipe_command, then reads, parses, and indexes the data that command prints to stdout. More formally, it opens a pipe to the given command and reads from that pipe.
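The pipe mechanism described above can be illustrated with a short Python sketch. This is not how the indexer is implemented (the indexer is a compiled program); it only shows the same idea of starting a command and reading what it prints to stdout. The function name read_from_command and the stand-in command are hypothetical.

```python
import subprocess
import sys


def read_from_command(command):
    """Run `command` and return everything it wrote to stdout, as bytes."""
    # stdout=PIPE connects the child's stdout to a pipe that we read from.
    proc = subprocess.Popen(command, stdout=subprocess.PIPE)
    output, _ = proc.communicate()  # read until the child closes stdout
    if proc.returncode != 0:
        raise RuntimeError("command failed with exit code %d" % proc.returncode)
    return output


if __name__ == "__main__":
    # Hypothetical stand-in for an xmlpipe_command: a child process that
    # prints an empty document set.
    command = [sys.executable, "-c", "print('<sphinx:docset></sphinx:docset>')"]
    print(read_from_command(command).decode("utf-8"))
```

Any program can play the role of the command, as long as it writes the XML to stdout and exits.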
The indexer expects one or more documents in a custom XML format.
The xmlpipe2 structure:
/****************XML File Format*************************************/
<?xml version="1.0" encoding="utf-8"?>
<sphinx:docset>

<sphinx:schema>
    <sphinx:field name="subject"/>
    <sphinx:field name="content"/>
    <sphinx:attr name="published" type="timestamp"/>
    <sphinx:attr name="author_id" type="int" bits="16" default="1"/>
</sphinx:schema>

<sphinx:document id="1234">
    <content>this is the main content</content>
    <published>1012325463</published>
    <subject>note how field/attr tags can be in <b class="red">randomized</b> order</subject>
    <misc>some undeclared element</misc>
</sphinx:document>

<!-- ... even more sphinx:document entries here ... -->

<sphinx:killlist>
    <id>1234</id>
</sphinx:killlist>

</sphinx:docset>
/*********************************************************************/
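A document set in this format could be produced by a small script whose output is then used as the xmlpipe_command. The following is a minimal Python sketch, assuming the schema from the example above; the function name build_docset and the sample records are hypothetical, and a real script would read its records from your own data store.

```python
import sys
from xml.sax.saxutils import escape, quoteattr


def build_docset(documents, killlist=()):
    """Return an xmlpipe2 document set as a string.

    `documents` is an iterable of dicts with an "id" key plus any of the
    schema names used in the example: subject, content, published, author_id.
    """
    lines = [
        '<?xml version="1.0" encoding="utf-8"?>',
        "<sphinx:docset>",
        "<sphinx:schema>",
        '<sphinx:field name="subject"/>',
        '<sphinx:field name="content"/>',
        '<sphinx:attr name="published" type="timestamp"/>',
        '<sphinx:attr name="author_id" type="int" bits="16" default="1"/>',
        "</sphinx:schema>",
    ]
    for doc in documents:
        lines.append("<sphinx:document id=%s>" % quoteattr(str(doc["id"])))
        for name in ("subject", "content", "published", "author_id"):
            if name in doc:
                # escape() protects &, < and > inside the text content
                lines.append("<%s>%s</%s>" % (name, escape(str(doc[name])), name))
        lines.append("</sphinx:document>")
    if killlist:
        lines.append("<sphinx:killlist>")
        for doc_id in killlist:
            lines.append("<id>%d</id>" % doc_id)
        lines.append("</sphinx:killlist>")
    lines.append("</sphinx:docset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        {"id": 1234, "subject": "first note", "content": "this is the main content",
         "published": 1012325463, "author_id": 1},
    ]
    sys.stdout.write(build_docset(sample))
```

Building the XML through escape() and quoteattr() rather than plain string concatenation keeps the output well-formed when the text contains characters such as & or <.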
Required Tools:
1. expat
Expat is an XML parser library written in C. It is a stream-oriented parser in which an application registers handlers for things the parser might find in the XML document (like start tags).
Packages:
- expat: expat-1.95.8-8.3.el5_5.3
- expat-devel: expat-devel-1.95.8-8.3.el5_5.3
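The handler-based, stream-oriented style described above can be seen in a few lines of Python, whose standard xml.parsers.expat module is a binding to Expat. This is only an illustration of how handlers are registered; it is not the indexer's own code, and the function name collect_document_ids is hypothetical.

```python
import xml.parsers.expat


def collect_document_ids(xml_text):
    """Return the id attribute of every sphinx:document element, in order."""
    ids = []

    def on_start(name, attrs):
        # Called by the parser each time it encounters a start tag.
        if name == "sphinx:document":
            ids.append(int(attrs["id"]))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start  # register the start-tag handler
    parser.Parse(xml_text, True)           # True marks the final chunk of input
    return ids


if __name__ == "__main__":
    sample = (
        "<sphinx:docset>"
        '<sphinx:document id="1234"><content>text</content></sphinx:document>'
        "</sphinx:docset>"
    )
    print(collect_document_ids(sample))
```

Because the parser reports elements as it reads them, the whole document set never has to be held in memory at once.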
Installing expat using a package manager:
1. Linux
CentOS/Redhat/Fedora - $ sudo yum install expat expat-devel
Ubuntu/Debian - $ sudo apt-get install libexpat1 libexpat1-dev
Manual installation:
Download either the tar.gz package or the RPM package.
Installation steps:
1. Extract the downloaded expat file
tar file: $ tar -xvf expat-2.0.1.tar.gz
rpm file: $ rpm -qlp ovpc-2.1.10.rpm (note that -qlp only lists the files contained in the package)
2. $ ./configure --prefix=/<installation_path>/
3. $ make
4. $ make install
Here is a complete example:
1. Sphinx configuration
2. Generate XML that conforms to the xmlpipe2 schema.
XML creation tools:
In PHP:
- Use the XMLWriter API.
In Java:
- Apache Xerces
- DOM XML parser
- JDOM XML parser
In Perl:
3. Run the indexer to create a full-text index from your data:
$ cd /usr/local/sphinx/etc
$ /usr/local/sphinx/bin/indexer --all
4. Search:
$ cd /usr/local/sphinx/etc
$ /usr/local/sphinx/bin/search promedik
5. The search returns document IDs.
6. Display the search results using any language (PHP, Java, Python, Perl).
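Since the search returns only document IDs, displaying results means looking up the original records in your own storage. A minimal Python sketch of that last step follows; obtaining the IDs from Sphinx is not shown, and the names render_results and documents_by_id, as well as the sample record, are hypothetical stand-ins for your own lookup code.

```python
def render_results(matched_ids, documents_by_id):
    """Turn a list of matched document IDs into printable result lines."""
    lines = []
    for doc_id in matched_ids:
        doc = documents_by_id.get(doc_id)
        if doc is None:
            continue  # the ID is in the index but no longer in the source data
        lines.append("%d: %s" % (doc_id, doc["subject"]))
    return lines


if __name__ == "__main__":
    # Hypothetical in-memory store standing in for a real data source.
    store = {1234: {"subject": "first note"}}
    for line in render_results([1234], store):
        print(line)
```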
-PAVANKUMAR JOSHI
