Tuesday, August 17, 2010

Setting up the Moses platform

The 52nlp site has a very detailed post on how to set up Moses on Ubuntu; see
http://www.52nlp.cn/ubuntu-moses-platform-build-process-record

Below are some problems I ran into while following that procedure, along with their solutions.

1. The command to download Moses via SVN (you have to install svn in advance):
svn co https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk moses

2. While compiling Moses, running "./regenerate-makefiles.sh" in the directory 'moses' fails with:

name@name:~/smt/moses$ ./regenerate-makefiles.sh
Calling /usr/bin/aclocal...
Calling /usr/bin/autoconf...
configure.in:12: error: possibly undefined macro: AC_DISABLE_SHARED
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
configure.in:13: error: possibly undefined macro: AC_PROG_LIBTOOL
autoconf failed

Running the command again gives:

Calling /usr/bin/aclocal...
Calling /usr/bin/autoconf...
Calling /usr/bin/automake...
moses/src/Makefile.am:1: Libtool library used but `LIBTOOL' is undefined
moses/src/Makefile.am:1: The usual way to define `LIBTOOL' is to add `AC_PROG_LIBTOOL'
moses/src/Makefile.am:1: to `configure.in' and run `aclocal' and `autoconf' again.
moses/src/Makefile.am:1: If `AC_PROG_LIBTOOL' is in `configure.in', make sure
moses/src/Makefile.am:1: its definition is in aclocal's search path.
automake failed


Fix: check that libtool is installed and that you are using recent versions
of the autoconf/automake tools (autoconf/automake should be version 1.9 or higher).
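A quick way to check this before rerunning the script is sketched below. It is only an illustration: missing_tools is a hypothetical helper name, and the list of program names (libtoolize, aclocal, autoconf, automake) is an assumption about what the corresponding packages put on your PATH.

```shell
#!/bin/sh
# missing_tools NAME...: print the names that are NOT found on PATH.
# (Hypothetical helper, not part of Moses.)
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space before printing.
    echo "${missing# }"
}

# Assumed program names; adjust for your distribution.
missing_tools libtoolize aclocal autoconf automake
```

An empty line of output means all four programs were found. Any name that is printed points to a package that still has to be installed.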


3. While compiling the Moses support scripts, running "make release" in the directory 'moses/scripts' fails with the following error:

phrasetable.h:17:27: error: boost/bimap.hpp: No such file or directory

Make sure Boost is installed (download it from http://www.boost.org).
(When installing Boost, use the command "sudo ./bjam install" and ignore errors such as "libs/iostreams/src/bzip2.cpp....")

Then, the following error happens:

Please specify a BINDIR.

The BINDIR directory must contain GIZA++, snt2cooc.out and mkcls executables.
These are available from http://www.fjoch.com/GIZA++.html and
http://www-i6.informatik.rwth-aachen.de/Colleagues/och/software/mkcls.html .

Copy GIZA++ and snt2cooc.out from the directory GIZA++-v2, and mkcls from the directory mkcls-v2, into the directory "$SMT/bin". Then run the command "make release".

OK, done. It creates a directory "scripts-yyyymmdd-hhmm".
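The copy step can be sketched as a small shell function that also verifies the three executables ended up where the scripts expect them. This is an illustration under assumptions: copy_giza_tools is a hypothetical helper, and the locations of the GIZA++-v2 and mkcls-v2 build directories in the usage comment are placeholders for wherever you built them.

```shell
#!/bin/sh
# copy_giza_tools GIZA_DIR MKCLS_DIR BIN_DIR
# Copy GIZA++, snt2cooc.out and mkcls into BIN_DIR and check that each
# one is present and executable there. (Hypothetical helper.)
copy_giza_tools() {
    giza_dir=$1
    mkcls_dir=$2
    bin_dir=$3
    mkdir -p "$bin_dir" || return 1
    cp "$giza_dir/GIZA++" "$giza_dir/snt2cooc.out" "$mkcls_dir/mkcls" "$bin_dir/" || return 1
    for exe in GIZA++ snt2cooc.out mkcls; do
        if [ ! -x "$bin_dir/$exe" ]; then
            echo "not executable: $bin_dir/$exe" >&2
            return 1
        fi
    done
}

# Usage (source paths are placeholders):
#   copy_giza_tools /path/to/GIZA++-v2 /path/to/mkcls-v2 "$SMT/bin"
```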


Other problems:
When installing srilm, if the build complains that tcl.h cannot be found, edit the file common/Makefile.machine.i686, find the Tcl support section, and change it to:
# Tcl support (standard in Linux)
TCL_INCLUDE = -I/usr/include/tcl8.5
TCL_LIBRARY = -L/usr/lib/tcl8.5
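If the Tcl headers live somewhere else on your machine, the directory to put in TCL_INCLUDE can be found with a search like the one sketched below. header_include_flag is a hypothetical helper, and /usr/include as the search root is an assumption.

```shell
#!/bin/sh
# header_include_flag HEADER ROOT
# Print a -I flag for the first directory under ROOT that contains HEADER.
# Returns non-zero if the header is not found. (Hypothetical helper.)
header_include_flag() {
    header=$1
    root=$2
    path=$(find "$root" -name "$header" -type f 2>/dev/null | head -n 1)
    [ -n "$path" ] || return 1
    echo "-I$(dirname "$path")"
}

header_include_flag tcl.h /usr/include || echo "tcl.h not found under /usr/include"
```

If several Tcl versions are installed, only the first match is printed, so check that it is the version you intend to build against.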

When installing irstlm (irstlm-5.60.02), running ./regenerate-makefiles.sh fails in autoconf:
configure.in:17: error: possibly undefined macro: AC_DISABLE_SHARED
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
configure.in:18: error: possibly undefined macro: AC_PROG_LIBTOOL
autoconf failed
Running ./regenerate-makefiles.sh again fails in automake:
src/Makefile.am:1: Libtool library used but `LIBTOOL' is undefined
src/Makefile.am:1: The usual way to define `LIBTOOL' is to add `AC_PROG_LIBTOOL'
src/Makefile.am:1: to `configure.in' and run `aclocal' and `autoconf' again.
src/Makefile.am:1: If `AC_PROG_LIBTOOL' is in `configure.in', make sure
src/Makefile.am:1: its definition is in aclocal's search path.
src/Makefile.am: installing `./depcomp'
src/Makefile.am: C source seen but `CC' is undefined
src/Makefile.am: The usual way to define `CC' is to add `AC_PROG_CC'
src/Makefile.am: to `configure.in' and run `autoconf' again.
automake failed

Make sure libtool is installed (sudo apt-get install libtool) and that autoconf/automake is version 1.9 or higher (sudo apt-get install automake1.10).
Run ./regenerate-makefiles.sh again. OK!!

Run ./configure --prefix=/home/..../irstlm --enable-caching
Run make. It reports gzfilebuf.h:8:18: error: zlib.h: No such file or directory. Install zlib (sudo apt-get install zlib1g-dev) and run make again. OK!!

Run make install. OK!!

2 comments:

ldd2000 said...

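Some additional problems when installing srilm
1) After changing the SRILM variable defined in srilm's Makefile to the srilm path, running make World gives the error make: /sbin/machine-type: Command not found
This does not actually mean that the file ./sbin/machine-type is missing. It probably means that /bin/csh, which is what machine-type is written in, is not available on your machine. The fix is to install csh with the command apt-get install tcsh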

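2) With that fixed, continuing with make World eventually produces another error: tclmain.cc:8:17: error: tcl.h: No such file or directory. The fix is to install the Tcl development package with the command apt-get install tcl8.5-dev (tcl8.5 was already installed, but after installing tcl8.5-dev there is a tcl8.5 directory under /usr/include/ that contains tcl.h), and then edit tclmain.cc: #include <tcl.h> --> #include <tcl8.5/tcl.h>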

3) Continuing with make World, yet another error appears near the end:
In file included from Ngram.h:14,
from NgramLM.cc:36:
/usr/include/stdio.h:754: error: declaration of C function 'int fseek(FILE*, __off64_t, int)' conflicts with
/usr/include/stdio.h:722: error: previous declaration 'int fseek(FILE*, long int, int)' here
/usr/include/stdio.h:757: error: declaration of C function '__off64_t ftell(FILE*)' conflicts with
/usr/include/stdio.h:727: error: previous declaration 'long int ftell(FILE*)' here
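This one is very strange. It seems to say that function declarations in stdio.h conflict with each other, and I had never seen it before when running gcc/g++.
Specifying MACHINE_TYPE when running make World (make MACHINE_TYPE=i686-gcc4 World) and rerunning works. (Reason unknown.)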
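For the first problem above, a "Command not found" error for a script that does exist, one way to confirm the diagnosis is to read the interpreter named on the script's first line and check whether that program is present. The sketch below is an illustration only: script_interpreter is a hypothetical helper, and it assumes the script starts with a "#!" line.

```shell
#!/bin/sh
# script_interpreter FILE
# Print the interpreter path from FILE's "#!" line.
# Returns non-zero if FILE has no "#!" line. (Hypothetical helper.)
script_interpreter() {
    first_line=$(head -n 1 "$1") || return 1
    case $first_line in
        '#!'*) ;;
        *) return 1 ;;
    esac
    # Drop the "#!" prefix, then keep only the first word (the interpreter).
    rest=${first_line#'#!'}
    set -- $rest
    echo "$1"
}

# Usage, from the srilm directory:
#   interp=$(script_interpreter sbin/machine-type) && ls -l "$interp"
```

If ls reports that the interpreter does not exist, installing the package that provides it is the fix, as described in 1).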

ldd2000 said...

When running the French-to-English translation training command from http://www.statmt.org/moses_steps.html, the following error appears:
*** buffer overflow detected ***: /home/ldd/Moses/tools/bin/GIZA++ terminated
======= Backtrace: =========
[0x81749ec]
[0x81749ad]
[0x81747c8]
[0x815ce9e]
[0x8181755]
[0x8174864]
[0x81747bd]
[0x806cd2d]
[0x806d7a3]
[0x80739d0]
[0x8147c6f]
[0x8048191]
======= Memory map: ========
007bb000-007bc000 r-xp 00000000 00:00 0 [vdso]
08048000-081f3000 r-xp 00000000 08:05 1573057 /home/ldd/Moses/tools/bin/GIZA++
081f3000-081f5000 rw-p 001ab000 08:05 1573057 /home/ldd/Moses/tools/bin/GIZA++
081f5000-081fd000 rw-p 00000000 00:00 0
08abf000-08b33000 rw-p 00000000 00:00 0 [heap]
bfb1d000-bfb32000 rw-p 00000000 00:00 0 [stack]

As for a fix, some say you need to switch from g++-4.3 to g++-4.1, and others say a small change to the code in file_spec.h is needed:
char time_stmp[17]; ---> time_stmp[19]
...
sprintf(time_stmp, "%02d-%02d-%02d.%02d%02d%02d.", local->tm_year,
(local->tm_mon + 1), local->tm_mday, local->tm_hour,
local->tm_min, local->tm_sec);
--->sprintf(time_stmp, "%04d-...", 1900+local->tm_year,....)
and then recompile.

See http://code.google.com/p/giza-pp/issues/detail?id=11#c0
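The arithmetic behind the file_spec.h change can be checked without compiling anything. tm_year counts years since 1900, so in 2010 it is 110 and prints as three digits even with %02d, which makes the formatted string one character longer than the 17-byte buffer allows once the terminating NUL is counted. The sketch below measures this with the shell's printf. The sample date and time, the helper name stamp_len, and the assumption that the patched format differs from the original only in its first field (%04d fed with 1900+tm_year) are all assumptions made for illustration; the patched format string is elided in the text above.

```shell
#!/bin/sh
# stamp_len FORMAT YEAR_FIELD
# Length of the formatted timestamp, not counting the terminating NUL
# that sprintf also writes. Uses a fixed sample date: month 8, day 17,
# 12:00:00. (Hypothetical helper.)
stamp_len() {
    s=$(printf "$1" "$2" 8 17 12 0 0)
    echo "${#s}"
}

old_len=$(stamp_len '%02d-%02d-%02d.%02d%02d%02d.' 110)   # tm_year in 2010
new_len=$(stamp_len '%04d-%02d-%02d.%02d%02d%02d.' 2010)  # 1900 + tm_year

echo "original format: $((old_len + 1)) bytes needed, buffer is 17"
echo "patched format:  $((new_len + 1)) bytes needed, buffer is 19"
```

With these inputs the original format needs 18 bytes and the patched one needs 19, so the enlarged buffer is exactly large enough. With a two-digit tm_year such as 99, the original format needs 17 bytes, which is why the code worked before the year 2000.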