Installing and using Chinese word segmentation with sphinx

Published: 2014-08-08 15:34:10

Download sphinx
wget http://sphinxsearch.com/files/sphinx-2.0.5-release.tar.gz
tar zxvf sphinx-2.0.5-release.tar.gz
cd sphinx-2.0.5-release
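(The coreseek Chinese documentation told me to install and then enter the mmseg directory; I only found out later that what I should have downloaded was the coreseek package.)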
wget http://www.coreseek.cn/uploads/csft/4.0/coreseek-4.1-beta.tar.gz
tar zxvf coreseek-4.1-beta.tar.gz
cd coreseek-4.1-beta/
cd mmseg-3.2.14
./configure --prefix=/usr/local/mmsg
This fails with the error:
config.status: error: cannot find input file: src/Makefile.in
A search shows that these two commands need to be run first:
export ACLOCAL_FLAGS="-I /usr/share/aclocal"
./bootstrap
Then run configure again: ./configure --prefix=/usr/local/mmsg
make && make install
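The "cannot find input file: src/Makefile.in" error above can be checked for before running configure. This is only a minimal sketch, assuming it is run from the mmseg source directory; the function name needs_bootstrap is made up for illustration and is not part of mmseg or coreseek.

```shell
# Hypothetical helper (not part of mmseg): succeeds when the autotools
# template src/Makefile.in has not been generated yet, i.e. when
# ./bootstrap still has to be run before ./configure.
needs_bootstrap() {
    src_dir="${1:-.}"
    [ ! -f "$src_dir/src/Makefile.in" ]
}

# Usage, from inside the mmseg source directory:
#   if needs_bootstrap .; then ./bootstrap; fi
#   ./configure --prefix=/usr/local/mmsg
```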
//Test whether the word segmenter was installed successfully
/usr/local/mmsg/bin/mmseg -d /usr/local/mmsg/etc src/t1.txt

Result:
中文/x 分/x 词/x 测试/x
中国人/x 上海市/x

Word Splite took: 12 ms.
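To inspect the segmentation one token per line, the output shown above can be post-processed with standard tools. A rough sketch, assuming the output always has the "token/x" form seen here; mmseg_tokens is a hypothetical helper name, not something shipped with mmseg.

```shell
# Hypothetical helper: turn mmseg's "token/x token/x" output into one
# token per line. Assumes every token carries the "/x" tag shown above.
mmseg_tokens() {
    # Drop the timing line, split on spaces, strip the trailing "/x" tag,
    # and discard the empty lines left over from blank or padded input.
    grep -v '^Word Splite took' | tr ' ' '\n' | sed 's#/x$##' | grep -v '^$'
}

# Usage:
#   /usr/local/mmsg/bin/mmseg -d /usr/local/mmsg/etc src/t1.txt | mmseg_tokens
```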

Install csft

cd ../csft-4.1/
sh buildconf.sh
./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmsg/include/mmseg/ --with-mmseg-libs=/usr/local/mmsg/lib/ --with-mysql
make && make install
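The csft configure step depends on the mmseg include and lib directories being where the --with-mmseg-includes and --with-mmseg-libs flags say they are. A small pre-flight check can catch a wrong prefix early. This is a sketch only; check_mmseg_prefix is a hypothetical name, and the include/mmseg and lib layout is taken from the configure flags above.

```shell
# Hypothetical pre-flight check (not a coreseek tool): confirm that the
# directories handed to --with-mmseg-includes and --with-mmseg-libs exist
# under the prefix mmseg was installed to.
check_mmseg_prefix() {
    prefix="$1"
    for dir in "$prefix/include/mmseg" "$prefix/lib"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir" >&2
            return 1
        fi
    done
    return 0
}

# Usage, with the prefix given to mmseg's configure earlier:
#   check_mmseg_prefix /usr/local/mmsg && echo "mmseg prefix looks complete"
```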
#Test whether sphinx was installed successfully#
cd ../testpack/
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/sphinx-min.conf.dist
#If it fails because libmysqlclient.so.18 cannot be found, add a symlink for it#
ln -s /usr/local/mysql/lib/libmysqlclient.so.18 /usr/lib/
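The symlink step can be made safe to repeat by checking both ends first. A minimal sketch, assuming the library path is the one used above; link_lib_if_missing is a hypothetical helper, and writing to /usr/lib normally requires root.

```shell
# Hypothetical helper: symlink a shared library into a directory only when
# the source file exists and nothing with that name is already there.
link_lib_if_missing() {
    src="$1"
    dest_dir="$2"
    target="$dest_dir/$(basename "$src")"
    if [ ! -e "$src" ]; then
        echo "source library not found: $src" >&2
        return 1
    fi
    if [ -e "$target" ] || [ -L "$target" ]; then
        echo "already present: $target"
        return 0
    fi
    ln -s "$src" "$target"
}

# Usage:
#   link_lib_if_missing /usr/local/mysql/lib/libmysqlclient.so.18 /usr/lib
```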
/*
Install libsphinxclient
*/
/usr/local/coreseek/bin/indexer -c etc/csft.conf
/usr/local/coreseek/bin/indexer -c etc/csft.conf --all
/usr/local/coreseek/bin/indexer -c etc/csft.conf xml
/usr/local/coreseek/bin/search -c etc/csft.conf
/usr/local/coreseek/bin/search -c etc/csft.conf -a
/usr/local/coreseek/bin/search -c etc/csft.conf -a Twittter和Opera都提供了搜索服务