| import re | 
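| . | Any single character (except a newline, unless re.S is set) | 
| a|b | The character a or the character b | 
| [afg] | One character that is a, f, or g | 
| [0-4] | One character in the range 0-4 | 
| [a-f] | One character in the range a-f | 
| [^a] | One character that is not a | 
| \s | One whitespace character (space, tab, newline, etc.) | 
| \S | One non-whitespace character | 
| \d | One digit; [0-9] for ASCII text | 
| \D | One non-digit; [^0-9] for ASCII text | 
| \w | One word character; [0-9a-zA-Z_] for ASCII text | 
| \W | One non-word character; [^0-9a-zA-Z_] for ASCII text | 
| \b | Matches a word boundary, i.e. the position between a word character and a non-word character (or the start or end of the string). For example, "er\b" matches the "er" in "never" but not the "er" in "verb" | 
| \B | Matches a non-word boundary. "er\B" matches the "er" in "verb" but not the "er" in "never" | 
| * | Repeat 0 or more times | 
| + | Repeat 1 or more times | 
| ? | Repeat 0 or 1 times | 
| {m} | Repeat exactly m times; e.g. [01]{2} matches the strings 00, 11, 01, or 10 | 
| {m,n} | Repeat m to n times; e.g. a{1,3} matches the strings a, aa, or aaa | 
| ^ | Start of the string (with re.M, also the start of each line) | 
| $ | End of the string (with re.M, also the end of each line) | 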
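To make the table more concrete, here is a minimal sketch that exercises a few of the entries above. The sample strings and variable names are made up for illustration and are not from the original article.

```python
import re

# Character sets and negated sets
in_set = re.findall(r"[afg]", "a big frog")       # every single a, f or g
not_a = re.findall(r"[^a]", "banana")             # every character that is not a

# Quantifiers
four_digits = re.findall(r"\d{4}", "2013-11-12")  # exactly four digits
one_to_three = re.findall(r"a{1,3}", "a aa aaaa") # runs of 1 to 3 a's
optional_u = re.findall(r"colou?r", "color colour")

# \w also matches the underscore
words = re.findall(r"\w+", "snake_case 42!")

# Word boundaries: words ending in "er" vs. words with "er" inside
ends_with_er = re.findall(r"\w*er\b", "never verb")
er_inside = re.findall(r"\w*er\B\w*", "never verb")

for name, value in [("in_set", in_set), ("not_a", not_a),
                    ("four_digits", four_digits), ("one_to_three", one_to_three),
                    ("optional_u", optional_u), ("words", words),
                    ("ends_with_er", ends_with_er), ("er_inside", er_inside)]:
    print(name, value)
```

Note that `a{1,3}` is greedy: on `aaaa` it first takes `aaa` and then matches the remaining `a` separately.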
| m = re.search(r"output_(\d{4}).*(\d{4})", "output_1986a.txt1233") | 
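As a rough illustration of what that call captures, the sketch below reads the groups back out. The pattern and input come from the line above; the guard against `None` is added here as a general precaution, since `re.search` returns `None` when nothing matches.

```python
import re

m = re.search(r"output_(\d{4}).*(\d{4})", "output_1986a.txt1233")

if m is not None:
    whole = m.group(0)   # the entire matched text
    first = m.group(1)   # first parenthesised group
    second = m.group(2)  # second parenthesised group
    both = m.groups()    # all groups as a tuple
    print(whole, first, second, both)
```

Because `.*` is greedy, it consumes as much as possible and then backtracks, so the second group ends up holding the last four digits of the string.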
| >>> match = re.search(r'(?P<first>\bt\w+)\W+(?P<second>\w+)', 'This is test for python group')
>>> print(match)
<re.Match object; span=(8, 16), match='test for'>
>>> print(match.group())
test for
>>> print(match.group(0))
test for
>>> print(match.group(1))
test
>>> print(match.group(2))
for
>>> print(match.groupdict())
{'first': 'test', 'second': 'for'}
>>> print(match.groupdict()['first'])
test
>>> print(match.groupdict()['second'])
for | 
| >>> help(re.compile)
Help on function compile in module re:

compile(pattern, flags=0)
    Compile a regular expression pattern, returning a pattern object. | 
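| 1) re.I (re.IGNORECASE): ignore case
2) re.M (re.MULTILINE): multi-line mode; changes the behavior of '^' and '$' so that they also match at the start and end of each line
3) re.S (re.DOTALL): dot-matches-all mode; changes the behavior of '.' so that it also matches a newline
4) re.L (re.LOCALE): makes the predefined character classes \w \W \b \B \s \S depend on the current locale (in Python 3 this flag is only allowed with bytes patterns)
5) re.U (re.UNICODE): makes the predefined character classes \w \W \b \B \s \S \d \D follow the Unicode character properties (already the default for str patterns in Python 3)
6) re.X (re.VERBOSE): verbose mode; the pattern may span multiple lines, whitespace is ignored, and comments may be added | 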
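The effect of the first three flags can be sketched as follows. The two-line sample text is invented for this example; flags are combined with `|`.

```python
import re

text = "First line\nsecond LINE"

# re.I: case-insensitive matching
ignore_case = re.findall(r"line", text, re.I)

# re.M: '^' also matches right after each newline
first_words_plain = re.findall(r"^\w+", text)
first_words_multi = re.findall(r"^\w+", text, re.M)

# re.S: '.' also matches the newline, so the pattern can span lines
without_dotall = re.search(r"First.*LINE", text)
with_dotall = re.search(r"First.*LINE", text, re.S)

# Flags can be combined with the bitwise OR operator
combined = re.findall(r"^s\w+", text, re.I | re.M)

print(ignore_case, first_words_plain, first_words_multi)
print(without_dotall, with_dotall is not None, combined)
```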
| import re
pattern = re.compile(r're')
pattern.match('This is re module of python')
re.match(r're', 'This is re module of python')
# The two calls above are equivalent
# The two patterns below are equivalent
pattern1 = re.compile(r"""\d + # integer part
                          \.   # decimal point
                          \d * # fractional part""", re.X)
pattern2 = re.compile(r'\d+\.\d*') | 
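One way to check the equivalences claimed above is to run each form on the same input and compare the results. This is only a sketch; the sample sentence and variable names are assumptions made for the example.

```python
import re

# Verbose form: whitespace and comments inside the pattern are ignored
verbose = re.compile(r"""\d +  # integer part
                         \.    # decimal point
                         \d *  # fractional part""", re.X)
# Compact form of the same pattern
compact = re.compile(r"\d+\.\d*")

sample = "pi is 3.14, two is 2. and ten is 10"

verbose_hits = verbose.findall(sample)
compact_hits = compact.findall(sample)
# Module-level function: compiles the pattern internally, then searches
module_hits = re.findall(r"\d+\.\d*", sample)

print(verbose_hits, compact_hits, module_hits)
```

Compiling once and reusing the pattern object is mostly a readability and convenience choice when the same pattern is applied many times.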
| >>> help(re.match)
Help on function match in module re:

match(pattern, string, flags=0)
    Try to apply the pattern at the start of the string, returning
    a match object, or None if no match was found. | 
| >>> match = re.match(r'This', 'This is re module of python')
>>> print(match)
<re.Match object; span=(0, 4), match='This'>
>>> print(match.group())
This
>>> match = re.match(r'python', 'This is re module of python')
>>> print(match)
None | 
| >>> help(re.search)
Help on function search in module re:

search(pattern, string, flags=0)
    Scan through string looking for a match to the pattern, returning
    a match object, or None if no match was found. | 
| >>> match = re.search(r'(?P<first>\bt\w+)\W+(?P<second>\w+)', 'This is test for python group')
>>> print(match)
<re.Match object; span=(8, 16), match='test for'>
>>> print(match.group())
test for
>>> print(match.group(0))
test for
>>> print(match.group(1))
test
>>> print(match.group(2))
for
>>> print(match.groupdict())
{'first': 'test', 'second': 'for'}
>>> print(match.groupdict()['first'])
test
>>> print(match.groupdict()['second'])
for | 
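The practical difference between `re.match` and `re.search` is where they are allowed to start. A small sketch using the same sample sentence as the `re.match` example (the variable names are illustrative):

```python
import re

text = "This is re module of python"

# match only tries the pattern at position 0
anchored = re.match(r"python", text)
# search scans forward until the pattern fits somewhere
scanned = re.search(r"python", text)

print(anchored)            # no match at the start of the string
if scanned is not None:
    found_span = scanned.span()   # (start, end) of the match
    print(scanned.group(), found_span)
```

Both functions return `None` on failure, so checking the result before calling `.group()` avoids an `AttributeError`.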
| >>> help(re.split)
Help on function split in module re:

split(pattern, string, maxsplit=0, flags=0)
    Split the source string by the occurrences of the pattern,
    returning a list containing the resulting substrings. | 
| >>> results = re.split(r'\d+', 'fasdf12fasdf4fasf1fasdf123')
>>> type(results)
<class 'list'>
>>> print(results)
['fasdf', 'fasdf', 'fasf', 'fasdf', '']
>>> results = re.split(r'-', '2013-11-12')
>>> print(results)
['2013', '11', '12'] | 
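A few variations on `re.split` that tend to come up in practice are sketched below: dropping the trailing empty string, limiting the number of splits with `maxsplit`, and keeping the separators by wrapping the pattern in a capturing group. The short input `a1b22c` is made up for this example.

```python
import re

parts = re.split(r"\d+", "fasdf12fasdf4fasf1fasdf123")
# The string ends with a separator, so the last item is ''; filter it out
non_empty = [p for p in parts if p]

# maxsplit=1 stops after the first separator
first_only = re.split(r"-", "2013-11-12", maxsplit=1)

# A capturing group makes split return the separators as well
with_separators = re.split(r"(\d+)", "a1b22c")

print(non_empty, first_only, with_separators)
```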
| >>> help(re.findall)
Help on function findall in module re:

findall(pattern, string, flags=0)
    Return a list of all non-overlapping matches in the string.

    If one or more groups are present in the pattern, return a
    list of groups; this will be a list of tuples if the pattern
    has more than one group.

    Empty matches are included in the result. | 
| >>> results = re.findall(r'\bt\w+\W+\w+', 'this is test for python findall')
>>> results
['this is', 'test for']
>>> results = re.findall(r'(\bt\w+)\W+(\w+)', 'this is test for python findall')
>>> results
[('this', 'is'), ('test', 'for')] | 
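The shape of the `findall` result depends on how many groups the pattern has, as the help text says. The sketch below puts the three cases side by side on the same sentence; turning the pairs into a `dict` is just one possible follow-up step, not something the original article does.

```python
import re

text = "this is test for python findall"

# No group: a list of whole matches
no_group = re.findall(r"\bt\w+", text)
# One group: a list of that group's text only
one_group = re.findall(r"\bt(\w+)", text)
# Two groups: a list of tuples, one tuple per match
two_groups = re.findall(r"(\bt\w+)\W+(\w+)", text)

# Tuples of length two convert directly into a dictionary
pairs = dict(two_groups)

print(no_group, one_group, two_groups, pairs)
```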
| sub(pattern, repl, string, count=0, flags=0)
subn(pattern, repl, string, count=0, flags=0) | 
| >>> print(re.sub(r'(\w+) (\w+)', r'\2 \1', 'i say, hello world!'))
say i, world hello! | 
| >>> print(re.subn(r'(\w+) (\w+)', r'\2 \1', 'i say, hello world!'))
('say i, world hello!', 2)
>>> print(re.subn(r'(\w+) (\w+)', r'\2 \1', 'i say, hello world!', count=1))
('say i, hello world!', 1) | 
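Besides a replacement string with `\1`-style backreferences, `repl` can also be a function that receives each match object, and named groups can be referenced with `\g<name>`. A minimal sketch, where the helper `double` and the sample strings are hypothetical:

```python
import re

def double(match):
    """Hypothetical helper: replace a number with twice its value."""
    return str(int(match.group()) * 2)

# repl as a function: called once per match
doubled = re.sub(r"\d+", double, "a1 b22 c333")

# repl as a string that refers to named groups
reordered = re.sub(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})",
    r"\g<day>/\g<month>/\g<year>",
    "2013-11-12",
)

# subn returns the new string together with the number of replacements
squeezed, count = re.subn(r"\s+", " ", "too   many    spaces")

print(doubled, reordered, squeezed, count)
```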
Original article: http://blog.csdn.net/henujyj/article/details/42915387