江天一色: Notes: Learning Perl

Chapter 3: Lists and Arrays-Using List-Producing Expressions in Scalar Context

#In list context, reverse returns the list in reverse order.
#In scalar context, it returns a reversed string
#(when given a list, it concatenates all the strings and reverses the result):
@backwards = reverse qw/ yabba dabba doo /;
   # gives doo, dabba, yabba 

$backwards = reverse qw/ yabba dabba doo /;
   # gives oodabbadabbay

Chapter 4: Subroutines-Omitting the Ampersand
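User-defined subroutines that share a name with a built-in
Suppose you define a subroutine chomp, the same name as the built-in chomp. You must then call your own version as &chomp; a plain chomp call resolves to the built-in.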
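A minimal sketch of the rule above. The body of the user-defined chomp (wrapping its argument in "custom(...)") is made up purely so the two calls are easy to tell apart; it is not from the book.

```perl
use strict;
use warnings;
no warnings 'ambiguous';   # silence the "Ambiguous call resolved as CORE::chomp()" warning

# Hypothetical user-defined sub that shares its name with the built-in chomp.
sub chomp {
    my ($text) = @_;
    return "custom($text)";
}

# With the ampersand, our own sub is called.
my $from_sub = &chomp("hello");

# Without the ampersand, Perl calls the built-in, which strips the newline.
chomp(my $from_builtin = "hello\n");

print "$from_sub\n";       # custom(hello)
print "$from_builtin\n";   # hello
```

In practice it is simpler to pick a name that does not collide with a built-in at all.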
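Chapter 4: Subroutines-Persistent, Private Variables
The state keyword
A variable declared with state inside a subroutine keeps its value after the call returns.
This is like a local static variable in C: a so-called private, persistent variable.
For example: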
use 5.010;
sub foo {
state $n=0;
state @m;
#......
}
Perl 5.10 does not yet support initializing a state array or hash in list context (Perl 5.28 and later do).
For example:
state @array = qw(a b c);
is an error under 5.10.
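A small sketch of a state scalar and a state array declared without an initializer, which is the form 5.10 accepts. The sub name running_total and its arguments are invented for illustration.

```perl
use strict;
use warnings;
use feature 'state';   # also enabled by: use 5.010;

# Hypothetical example sub: keeps a running total across calls.
sub running_total {
    my ($amount) = @_;
    state $total = 0;   # initialized only on the first call
    state @history;     # no initializer, so this is fine on Perl 5.10
    $total += $amount;
    push @history, $amount;
    return ($total, scalar @history);
}

my @first  = running_total(5);
my @second = running_total(7);

print "@first\n";    # 5 1
print "@second\n";   # 12 2
```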
Chapter 5: Input and Output-The Invocation Arguments
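<> and @ARGV
<> checks @ARGV first:
If @ARGV is not empty, its elements are used as the names of the input files.
Only if @ARGV is empty does it read from STDIN.
So we can control what <> reads by assigning to @ARGV before using it: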
@ARGV = qw# larry moe curly #; # force these three files to be read
while (<>) {
    chomp;
    print "It was $_ that I saw in some stooge-like file!\n";
}
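To try this without having files named larry, moe and curly around, here is a sketch that creates two temporary files first. The use of File::Temp and the file contents are my own assumptions, not part of the book's example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two throwaway input files (contents are made up for the demo).
my @paths;
for my $content ("alpha\n", "beta\ngamma\n") {
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
    push @paths, $path;
}

# Force <> to read exactly these files, in this order.
@ARGV = @paths;

my @seen;
while (my $line = <>) {
    chomp $line;
    push @seen, $line;
}

print "@seen\n";   # alpha beta gamma
```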
Chapter 5: Input and Output-Changing the Default Output Filehandle
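Changing the filehandle that print writes to
By default print sends its output to STDOUT. After
select NEWFH;
print "some content\n";
the output goes to NEWFH instead.
Setting $|=1 makes the currently selected filehandle flush its buffer after every print.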
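A sketch of select and $| together. It uses an in-memory filehandle (opened on a scalar reference) in place of NEWFH so the result can be inspected; that substitution and the variable names are my own.

```perl
use strict;
use warnings;

# In-memory filehandle standing in for NEWFH.
my $captured = '';
open my $fh, '>', \$captured or die "Cannot open in-memory handle: $!";

my $previous = select $fh;   # select returns the previously selected handle
$| = 1;                      # autoflush applies to the handle selected right now
print "some content\n";      # goes to $fh, not STDOUT
select $previous;            # restore the old default

close $fh;
print "captured: $captured";   # captured: some content
```

Saving the return value of select and restoring it afterwards keeps the change from leaking into the rest of the program.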

Chapter 7 - Chapter 8

Chapter 7: In the World of Regular Expressions
\g{N} back references
Before:

$_ = "aa11bb"; 

if (/(.)\111/) { 

    print "It matched!\n"; 

}

Perl reads this as \111 (an octal escape), although what was meant is \1 followed by 11.
Now:

use 5.010; 

$_ = "aa11bb"; 

if (/(.)\g{1}11/) { 

    print "It matched!\n"; 

}

\g{N} refers to the Nth capture group.
A relative number -N can also be used to count backwards from the current position:

use 5.010; 

$_ = "aa11bb"; 

if (/(.)\g{-1}11/) { 

    print "It matched!\n"; 

}

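Whitespace character classes
\h matches horizontal whitespace; for ASCII text it is roughly [\t ]
\v matches vertical whitespace; roughly [\f\n\r] (it also matches the vertical tab)
\R matches a line break, whatever the operating system's line ending is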
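A quick sketch of the three classes on a string with mixed whitespace and a Windows-style line ending. The sample text is made up.

```perl
use strict;
use warnings;
use 5.010;

my $text = "key \t value\r\nnext line\n";

# \h+ spans the run of spaces and tabs but never touches \r or \n.
(my $collapsed = $text) =~ s/\h+/ /g;

# \R treats "\r\n" as one line break, so this yields two lines.
my @lines = split /\R/, $text;

# \v matches each vertical whitespace character individually: \r, \n, \n.
my $vertical = () = $text =~ /\v/g;

say scalar @lines;   # 2
say $vertical;       # 3
```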
Chapter 8: Matching with Regular Expressions
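Lifetime of match variables (The Persistence of Memory)
The match variables keep their values until the next successful match.
In other words, do not use $1, $2, ... unless you know the match really succeeded (so check the result every time).
Otherwise all you get is the values left over from the previous successful match.
That can cause trouble when the current match fails.
If the captured text is only needed some time after the match, copy $1 into a named variable; that also reads better.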
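A sketch of the stale-value problem, followed by the safer habit of checking the match and copying the capture at once. The helper first_number and the sample strings are hypothetical.

```perl
use strict;
use warnings;

my $first_ok  = 'fred 42' =~ /(\d+)/;          # succeeds, $1 is "42"
my $second_ok = 'no digits here' =~ /(\d+)/;   # fails, $1 is left alone

my $stale = $1;   # still "42", left over from the first match

# Safer: only look at $1 when the match succeeded, and return a copy.
sub first_number {
    my ($text) = @_;
    return $text =~ /(\d+)/ ? $1 : undef;
}

print "stale: $stale\n";                       # stale: 42
print defined first_number('no digits here')
    ? "matched\n"
    : "no match\n";                            # no match
```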
Named Captures
Use (?<LABEL>PATTERN) to give a capture a name; on a successful match the text is stored in the %+ hash, i.e. $+{LABEL}.

use 5.010;

my $names = 'Fred or Barney';

if( $names =~ m/((?<name1>\w+) (and|or) (?<name2>\w+))/ ) {

say "I saw $+{name1} and $+{name2}";

}

Back references can use labels too, with \g{LABEL}:

use 5.010;

my $names = 'Fred Flinstone and Wilma Flinstone';

if( $names =~ m/(?<last_name>\w+) and \w+ \g{last_name}/ ) {

say "I saw $+{last_name}";

}

\k{LABEL} does the same thing as \g{LABEL}; \g{N}, however, can also express relative back references.

Chapter 9 Processing Text with Regular Expressions

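Case conversion
\U uppercases everything that follows
\L lowercases everything that follows
\E turns case conversion off
\u uppercases the next character
\l lowercases the next character
Note: these escapes work in any double-quoted string.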
print "Hello, \L\u$name\E, would you like to play a game?\n";
The split function

@fields = split /:/, ":::a:b:c:::";  

# gives ("", "", "", "a", "b", "c")

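The three leading empty fields are kept, but all trailing empty fields are dropped.
With a bare

split;

(which splits $_ on whitespace) leading whitespace is skipped as well.
Example: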

$_=" test abc  bcd";   # note the two spaces between abc and bcd

print "split\n";

print "#$_" for split;

print "\nsplit / /, \$_\n";

print "#$_" for split / /,$_;

print "\nsplit /\\s+/, \$_\n";

print "#$_" for split /\s+/,$_;

print "\n";

----------
split
#test#abc#bcd
split / /, $_
##test#abc##bcd
split /\s+/, $_
##test#abc#bcd

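The join function
The first argument to join is always a string, never a pattern!
At least two items must follow the first argument for the glue to be used; with one item (or an empty list) nothing gets joined.
Example: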

$y = join "foo","abc"; 

$z = join "foo",@list; 

print "$y#$z#\n";

--------
abc##
Nongreedy Quantifiers
{3,5}? , {8,}? , ??

$_ = "aaaaaa";

if(/(a{3,5}?)/){

  print $1,"\n";

}

$_=  "abcd";

if(/(abc??)/){

  print $1,"\n";

}

-----------
aaa
ab
Exercise 3
Modify the previous program to change every Fred to Wilma and every Wilma to Fred. Now input like fred&wilma should look like Wilma&Fred in the output.

#!/usr/bin/perl

$^I='.out';

%list=('abc'=>'xyz','xyz'=>'abc');

while(<>){

  s/(abc|xyz)/$list{$1}/g;

  print;

}
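The program above swaps the placeholder strings abc and xyz. The same one-pass hash lookup can be sketched for the actual names in the exercise. The lowercase keys, the /i flag and the helper swap_names are my own assumptions about how to handle input such as fred&wilma.

```perl
use strict;
use warnings;

# Keys are lowercase so the lookup works regardless of the case matched.
my %swap = (fred => 'Wilma', wilma => 'Fred');

# Hypothetical helper: swaps both names in a single pass over the line.
sub swap_names {
    my ($line) = @_;
    $line =~ s/(fred|wilma)/$swap{lc $1}/gie;
    return $line;
}

print swap_names("fred&wilma\n");   # Wilma&Fred
```

Because both names are handled by one substitution, a Fred that has just become Wilma is never turned back, which is what two separate s/// passes would do.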

Chapter 12 - Chapter 14

Chapter 12: File Tests-Testing Several Attributes of the Same File
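Checking file attributes
When testing several attributes of the same file, instead of
-f $file and -w $file
you can write
-f $file and -w _
The first form calls stat twice; the second calls it only once.
Perl 5.10 lets you stack the tests, yielding true/false: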
use 5.010;
if(-w -r $file){
say "write and read!";
}
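The tests are applied starting with the one nearest $file, here -r.
When used on a symbolic link, stat returns information about what the link points to; to get information about the link itself, use lstat.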
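A sketch that runs both forms against a temporary file. File::Temp and the variable names are assumptions for the demo; it also assumes the temporary file is readable and writable by whoever runs the script.

```perl
use strict;
use warnings;
use 5.010;
use File::Temp qw(tempfile);

# Throwaway file to test against.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "data\n";
close $fh or die "Cannot close $file: $!";

# One stat call: -f does the stat, -w reuses the result through _.
my $plain_and_writable = (-f $file and -w _) ? 1 : 0;

# Stacked form (Perl 5.10+): -r is tested first, then -w.
my $stacked = (-w -r $file) ? 1 : 0;

say "plain and writable: $plain_and_writable";
say "readable and writable: $stacked";
```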
Chapter 13: Directory Operations-An Alternate Syntax for Globbing
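Finding filenames that match a pattern
A lot of code writes
my @all_files = <*>;
my @dir_files = <$dir/* $dir/.*>;
rather than glob("*"); the angle-bracket form is the older syntax and does the same job.
Note that <> can also denote a read from a filehandle.
If what sits between the angle brackets is a Perl identifier, it is taken to be a filehandle, e.g. my @lines = <NAME>; (with NAME standing for any filehandle name).
Otherwise it is treated as a glob that looks up filenames, e.g. my @files = <NAME/*>;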
Chapter 13: Directory Operations-Directory Handles
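Directory handles
Note that $dirname has to be prepended to each name (the "patch up the path" line below); otherwise the file tests would look for the file in the current directory.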

opendir SOMEDIR, $dirname or die "Cannot open $dirname: $!";

while (my $name = readdir SOMEDIR) {

next if $name =~ /^\./; # skip over dot files

$name = "$dirname/$name"; # patch up the path

next unless -f $name and -r $name; # only readable files   ...

}

Chapter 14: Strings and Sorting
Notes on strings and sorting
Where substr and index can do the job, prefer them to regular expressions, mainly because they are faster.
A sort example:
my @winners = sort by_score_and_name keys %score;

sub by_score_and_name {
    $score{$b} <=> $score{$a}   # by descending numeric score
        or
    $a cmp $b                   # ASCIIbetically by name
}
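To see the two-level comparison at work, here is the same sort with some sample data filled in. The names and scores in %score are made up; the tie at 195 is there to exercise the name comparison.

```perl
use strict;
use warnings;

# Sample data (invented): two players tie on 195.
my %score = (
    barney      => 195,
    fred        => 205,
    dino        => 30,
    'bamm-bamm' => 195,
);

sub by_score_and_name {
    $score{$b} <=> $score{$a}   # higher score first
        or
    $a cmp $b                   # ties broken by name, in string order
}

my @winners = sort by_score_and_name keys %score;

print "@winners\n";   # fred bamm-bamm barney dino
```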

江天一色

Monday, November 17, 2008
