1) Get the directory that contains the running script
use Cwd qw( abs_path );
use File::Basename qw( dirname );
my $current_dir = dirname(abs_path($0));
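A typical use for this directory is locating a file that sits next to the script. The sketch below assumes a file named input.txt, which is a hypothetical name and not something from the original snippet. It also adds a fallback to File::Spec->rel2abs, on the assumption that abs_path may return undef on some platforms when $0 is not a real file.

```perl
use strict;
use warnings;
use Cwd qw( abs_path );
use File::Basename qw( dirname );
use File::Spec;

# Directory containing the running script, with symlinks resolved.
# Fallback (an assumption, not in the original): if abs_path gives undef,
# use a purely lexical absolute path instead.
my $script_path = abs_path($0) // File::Spec->rel2abs($0);
my $current_dir = dirname($script_path);

# 'input.txt' is a hypothetical file name used only for illustration.
my $input_path = File::Spec->catfile($current_dir, 'input.txt');
print "Looking for input at: $input_path\n";
```

Building the path with File::Spec->catfile avoids hard-coding a directory separator.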
2) Lowercase part of a string with a regex substitution
$line =~ s/( [^\(\)]+\))/lc($1)/ge;
((S (NP (DT The)(NN luxury)(NN auto)(NN maker))(NP (JJ last)(NN year))(VP (VBD sold)(NP (CD 1,214)(NNS cars))(PP (IN in)(NP (DT the)(NNP U.S.))))))
is replaced with
((S (NP (DT the)(NN luxury)(NN auto)(NN maker))(NP (JJ last)(NN year))(VP (VBD sold)(NP (CD 1,214)(NNS cars))(PP (IN in)(NP (DT the)(NNP u.s.))))))
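The substitution works because of the /e modifier, which evaluates the replacement as Perl code, so lc($1) is called on each match. Here is a small self-contained sketch on a shorter input; the helper name lowercase_leaves is hypothetical and only there to make the snippet runnable.

```perl
use strict;
use warnings;

# lowercase_leaves is a hypothetical helper name, not from the original post.
sub lowercase_leaves {
    my ($tree) = @_;
    # Match a space, a run of non-parenthesis characters, then ")".
    # The match starts after the POS tag, so tags such as DT stay uppercase.
    $tree =~ s/( [^()]+\))/lc($1)/ge;
    return $tree;
}

my $line = '(NP (DT The)(NNP U.S.))';
print lowercase_leaves($line), "\n";   # prints (NP (DT the)(NNP u.s.))
```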
3) Extract an array of strings with a regex
my @terms = ($line =~ /\([^\(\)]+\)/g);
((S (NP (DT The)(NN luxury)(NN auto)(NN maker))(NP (JJ last)(NN year))(VP (VBD sold)(NP (CD 1,214)(NNS cars))(PP (IN in)(NP (DT the)(NNP U.S.))))))
yields (DT The) (NN luxury) (NN auto) (NN maker) (JJ last) (NN year) .... (NNP U.S.)
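In list context, a match with /g returns all matches at once, which is what fills @terms. If the pattern contains capture groups, the captures are returned instead. The sketch below uses that to split each leaf into tag and word; the @pairs variable and the shortened input are my own additions for illustration.

```perl
use strict;
use warnings;

my $line = '((S (NP (DT The)(NN luxury))(VP (VBD sold)(NP (CD 1,214)(NNS cars)))))';

# List-context match with /g: every "(TAG word)" leaf, parentheses included.
my @terms = ($line =~ /\([^()]+\)/g);

# With capture groups, the same kind of match returns a flat list:
# tag1, word1, tag2, word2, ...
my @pairs = ($line =~ /\(([^\s()]+) ([^()]+)\)/g);

print scalar(@terms), " leaves\n";   # prints 5 leaves
for (my $i = 0; $i < @pairs; $i += 2) {
    print "$pairs[$i]\t$pairs[$i + 1]\n";
}
```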