Monday, June 20, 2011

Some implementation details of Moses


Training with the default parameters gives the following weights:

# distortion (reordering) weight
[weight-d]
0.010878 //distance weight
0.059869 //monotone weight (f-->e direction)
0.033513 //swap weight (f-->e direction)
0.216945 //discontinuous weight (f-->e direction)
0.057195 //monotone weight (e-->f direction)
0.037017 //swap weight (e-->f direction)
0.111842 //discontinuous weight (e-->f direction)

# distortion example
国际 ||| in the world ||| 0.016393 0.180328 0.803279 0.016393 0.016393 0.967213
The 6 feature values correspond to monotone, swap, discontinuous in the f-->e direction, followed by monotone, swap, discontinuous in the e-->f direction.
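To make the layout of such an entry concrete, here is a minimal sketch that splits the line into its fields and groups the six values by direction. The function name parse_reordering_entry is my own, not something from Moses; the only assumption about the format is the " ||| " separator visible in the example.

```python
def parse_reordering_entry(line):
    """Split 'src ||| tgt ||| p1 ... p6' into phrases and two orientation distributions."""
    source, target, scores = [field.strip() for field in line.split("|||")]
    values = [float(x) for x in scores.split()]
    if len(values) != 6:
        raise ValueError("expected 6 orientation scores, got %d" % len(values))
    names = ("monotone", "swap", "discontinuous")
    forward = dict(zip(names, values[:3]))   # f-->e direction
    backward = dict(zip(names, values[3:]))  # e-->f direction
    return source, target, forward, backward


entry = "国际 ||| in the world ||| 0.016393 0.180328 0.803279 0.016393 0.016393 0.967213"
source, target, forward, backward = parse_reordering_entry(entry)
print(forward)
print(backward)
```

Within each direction the three values sum to roughly 1, which is consistent with reading them as a probability distribution over the three orientations.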

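Decoding uses the stack decoding algorithm. When a phrase pair is formed, only its orientation in the f-->e direction can be inferred; its orientation in the e-->f direction has to wait until the next phrase pair is formed (for the last phrase pair of the sentence, the e-->f orientation is not computed). The orientation takes exactly one of the values monotone, swap, discontinuous, and the feature values for the other two are set to 0.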
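A rough sketch of the f-->e case, to show what "exactly one orientation is active" means in practice. This is not Moses code: the span convention (inclusive source positions), the function names, and the use of the natural log of the table probability as the active feature value are all assumptions of mine.

```python
import math


def orientation(prev_span, cur_span):
    """Classify the current source span relative to the previous one.

    Spans are inclusive (start, end) source positions (assumed convention).
    """
    prev_start, prev_end = prev_span
    cur_start, cur_end = cur_span
    if cur_start == prev_end + 1:
        return "monotone"       # continues directly to the right
    if cur_end == prev_start - 1:
        return "swap"           # sits directly to the left
    return "discontinuous"      # anything else


def orientation_features(probs, prev_span, cur_span):
    """Return [monotone, swap, discontinuous]; only the active entry is non-zero."""
    names = ("monotone", "swap", "discontinuous")
    active = orientation(prev_span, cur_span)
    # Assumption: the active feature is log(probability); the others stay 0.
    return [math.log(probs[n]) if n == active else 0.0 for n in names]


probs = {"monotone": 0.016393, "swap": 0.180328, "discontinuous": 0.803279}
print(orientation_features(probs, (0, 1), (4, 5)))
```

For the first phrase of a sentence one could pass a sentinel previous span such as (-1, -1), so that a phrase starting at position 0 counts as monotone.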

# translation model weights
[weight-t]
0.008391 //P(f|e) weight
0.017675 //Lexical(f|e) weight
0.043903 //P(e|f) weight
0.029038 //Lexical(e|f) weight
0.074607 //Phrase Penalty weight

phrase table example
国际 ||| in the world ||| 0.0174804 0.0214346 0.00197252 7.68173e-05 2.718
The 5 feature values correspond to the 5 items above, in the same order.
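As an illustration of how these values and the [weight-t] weights combine, the sketch below computes a weighted sum for this one entry. It assumes the decoder takes the natural log of each table value before weighting; that is my assumption here, not something shown in the output above. Note that the last value, 2.718, is approximately e, so its log is approximately 1.

```python
import math

# Weights from [weight-t], in the order listed above.
weights = [0.008391, 0.017675, 0.043903, 0.029038, 0.074607]
# Feature values of the example entry: 国际 ||| in the world
features = [0.0174804, 0.0214346, 0.00197252, 7.68173e-05, 2.718]

# Assumption: each table value is converted to a natural-log score first.
log_scores = [math.log(v) for v in features]
contributions = [w * s for w, s in zip(weights, log_scores)]
translation_score = sum(contributions)

print(log_scores)
print(translation_score)
```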


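Handling of unknown words
The unknown-word feature has weight 1.0 and feature value -100. An unknown word is translated as itself; its 6 distortion (reordering) feature values are 0 and its 5 translation model feature values are 0. The language model feature value is determined by the LM. Below is an example of the feature values and weights for an unknown word: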
score:0.000000 weight:0.010878 //distance feature
score:-1.000000 weight:-0.236574 //word penalty feature
score:-100.000000 weight:1.000000 //unknown word feature
score:0.000000 weight:0.059869 //monotone (f-->e)
score:0.000000 weight:0.033513 //swap (f-->e)
score:0.000000 weight:0.216945 //discontinuous (f-->e)
score:0.000000 weight:0.057195 //monotone (e-->f)
score:0.000000 weight:0.037017 //swap (e-->f)
score:0.000000 weight:0.111842 //discontinuous (e-->f)
score:-6.271241 weight:0.062555 //language model
score:0.000000 weight:0.008391 //translation (f|e)
score:0.000000 weight:0.017675 //lexical (f|e)
score:0.000000 weight:0.043903 //translation (e|f)
score:0.000000 weight:0.029038 //lexical (e|f)
score:0.000000 weight:0.074607 //phrase penalty
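The total score of this unknown-word option is just the weighted sum of the rows above. A small sketch of the arithmetic, using only the numbers from the listing (the variable names are mine):

```python
# (score, weight) pairs copied from the listing above, in the same order.
rows = [
    (0.0, 0.010878),         # distance
    (-1.0, -0.236574),       # word penalty
    (-100.0, 1.0),           # unknown word
    (0.0, 0.059869), (0.0, 0.033513), (0.0, 0.216945),  # reordering, f-->e
    (0.0, 0.057195), (0.0, 0.037017), (0.0, 0.111842),  # reordering, e-->f
    (-6.271241, 0.062555),   # language model
    (0.0, 0.008391), (0.0, 0.017675),                   # translation, lexical (f|e)
    (0.0, 0.043903), (0.0, 0.029038),                   # translation, lexical (e|f)
    (0.0, 0.074607),         # phrase penalty
]

total = sum(score * weight for score, weight in rows)
print(total)
```

Only three rows contribute: the word penalty (+0.236574), the unknown-word feature (-100) and the language model (about -0.3923), giving roughly -100.1557.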


Computing the future value
The features considered when computing the future value are the word penalty feature, the unknown word feature, the language model, and the 5 translation features; that is, the distance feature and the distortion (reordering) features are not included.
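For reference, the usual textbook way to turn per-span estimates into a future value table is a small dynamic program over source spans. The sketch below is that generic recipe, not a transcription of the Moses code; span_estimate is a hypothetical input holding, for each span that has a translation option, the best estimated log score built from the features listed above.

```python
def future_cost_table(n, span_estimate):
    """Best estimated log score for every inclusive source span (start, end).

    span_estimate maps (start, end) to the best score of a single phrase
    covering exactly that span; spans without an entry have no phrase.
    """
    neg_inf = float("-inf")
    cost = [[neg_inf] * n for _ in range(n)]
    for length in range(1, n + 1):
        for start in range(0, n - length + 1):
            end = start + length - 1
            best = span_estimate.get((start, end), neg_inf)
            # Try every way of splitting the span into two cheaper halves.
            for split in range(start, end):
                combined = cost[start][split] + cost[split + 1][end]
                if combined > best:
                    best = combined
            cost[start][end] = best
    return cost


# Toy estimates for a 3-word sentence (made-up numbers, for illustration only).
estimates = {(0, 0): -1.0, (1, 1): -2.0, (2, 2): -1.5, (0, 1): -2.5, (1, 2): -5.0}
table = future_cost_table(3, estimates)
print(table[0][2])
```

The future value of a hypothesis is then the sum of the table entries for its uncovered spans.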

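When expanding a hypothesis, take for example the following input sentence, with 0 words translated so far:
( 海啸 救灾 ) 中国 国际 广播 电台 与 中华 慈善 总会 将 联合 举办 广播 赈灾 义演
Here “中国 国际 广播” is not accepted as a source-language phrase. The explanation given in Moses is (see SearchNormal.cpp):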
// the basic idea is this: we would like to translate a phrase starting
// from a position further right than the left-most open gap. The
// distortion penalty for the following phrase will be computed relative
// to the ending position of the current extension, so we ask now what
// its maximum value will be (which will always be the value of the
// hypothesis starting at the left-most edge). If this value is less than
// the distortion limit, we don't allow this extension to be made.
The distortion value computed here is 7, which is greater than the default distortion limit of 6, so the extension is rejected.
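The value 7 can be reproduced with a few lines. The formula below is my reading of the check described in the comment (the distance from the end of the proposed extension back to the left-most uncovered word); the function name is hypothetical and positions are 0-based.

```python
def required_distortion(ext_end, first_gap):
    """Jump needed to go from the end of the extension back to the first gap."""
    return abs(ext_end + 1 - first_gap)


tokens = "( 海啸 救灾 ) 中国 国际 广播 电台 与 中华 慈善 总会 将 联合 举办 广播 赈灾 义演".split()
phrase = ["中国", "国际", "广播"]

ext_start = tokens.index(phrase[0])       # first occurrence of 中国
ext_end = ext_start + len(phrase) - 1     # last word of the candidate phrase
first_gap = 0                             # nothing translated yet, so the gap is at "("
distortion_limit = 6                      # default limit mentioned above

needed = required_distortion(ext_end, first_gap)
allowed = needed <= distortion_limit
print(needed, allowed)
```

With nothing translated yet, the left-most gap is the opening parenthesis at position 0 and the candidate phrase ends at position 6, so coming back afterwards would require a jump of 7.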

The beam width of the beam search is set to -11.5129251, i.e. ln(0.00001); the stack used to store hypotheses has no limit on its size.
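A quick check of the number, plus a sketch of what threshold pruning with such a beam width would look like. The pruning function is a generic illustration in log space, not the Moses implementation, and its name is my own.

```python
import math

BEAM_THRESHOLD = 0.00001
beam_width = math.log(BEAM_THRESHOLD)   # natural log, about -11.5129


def threshold_prune(scores, beam_width):
    """Keep hypotheses whose log score is within beam_width of the best one."""
    best = max(scores)
    return [s for s in scores if s >= best + beam_width]


kept = threshold_prune([-10.0, -15.0, -22.0], beam_width)
print(beam_width, kept)
```

In this toy example the cutoff is about -21.51, so the hypothesis with score -22.0 is dropped and the other two survive.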

The stack is implemented as an STL set container; it is not clear to me what the elements are ordered by.

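Recombination happens when a hypothesis is equal to some element already in the set container, but it is not clear to me what makes two elements equal. Possibly:
1) the translated words are the same;
2) the end position of the most recently translated f-side phrase is the same;
3) the most recently translated phrase pair has the same orientation in the f-->e direction, and the same feature value?
Is it unrelated to the n-gram?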
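To make the guess concrete, here is a sketch of recombination under exactly the three conditions listed above and nothing else. Everything in it is hypothetical (the Hyp record, the key function, the dict standing in for the set container); if the n-gram history does matter, it would simply become a further component of the key.

```python
from collections import namedtuple

# Hypothetical hypothesis record; field names are illustrative only.
Hyp = namedtuple("Hyp", "coverage last_src_end orientation score")


def recombination_key(hyp):
    """Two hypotheses with equal keys are treated as equivalent (guessed conditions)."""
    return (hyp.coverage, hyp.last_src_end, hyp.orientation)


def add_with_recombination(stack, hyp):
    """Insert hyp into the stack, keeping only the better of two equivalent hypotheses."""
    key = recombination_key(hyp)
    kept = stack.get(key)
    if kept is None or hyp.score > kept.score:
        stack[key] = hyp
    return stack


stack = {}
add_with_recombination(stack, Hyp(frozenset([0, 1]), 1, "monotone", -3.2))
add_with_recombination(stack, Hyp(frozenset([0, 1]), 1, "monotone", -2.7))
add_with_recombination(stack, Hyp(frozenset([0, 2]), 2, "discontinuous", -4.0))
print(len(stack))
```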

During rule extraction, the maximum phrase length defaults to 7 (on both the source and target sides); it can be changed with max-phrase-length.
