Using re.compile, re.match and re.search from the Python 3 re module

2020-02-15 21:47:39

Source: reprinted

Contributed by: site user

This article walks through examples of re.compile, re.match and re.search from the Python 3 re module. Details follow.

The re module: re.compile, re.match, re.search

Official documentation for the re module

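Regex patterns are usually written with an r prefix, which makes the literal a raw string: Python leaves the backslashes in it untouched, so special characters do not need a second layer of escaping.

For example, the regex escape \n can be written as r'\n', or without a raw string as '\\n'.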
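A small sketch of the point above. The sample string 'a1b22' is made up for illustration; the check shows that a raw string and a doubly escaped ordinary string give the regex engine the same pattern.

```python
import re

raw = r'\n'        # two characters: a backslash and the letter n
escaped = '\\n'    # the same two characters, written without a raw string
newline = '\n'     # one character: an actual line feed

same_pattern = (raw == escaped)
raw_len = len(raw)
newline_len = len(newline)

# Both spellings of \d+ behave identically as patterns.
digits_raw = re.findall(r'\d+', 'a1b22')
digits_escaped = re.findall('\\d+', 'a1b22')

print(same_pattern, raw_len, newline_len, digits_raw, digits_escaped)
```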

re.match is the recommended choice.

The re.compile() function

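Compiles a regular expression pattern and returns a pattern object. Compiling frequently used expressions into pattern objects makes them easier to reuse and can improve efficiency.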
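A minimal sketch of compiling once and reusing the object. The name word_re and the sample lines are illustrative assumptions. Note that the re module also caches recently used patterns internally, so the speed gain is usually modest; the main benefit is a named, reusable object.

```python
import re

# Compile once, then reuse the same pattern object for every line.
word_re = re.compile(r'\bwang\b', re.I)

lines = ['Citizen wang', 'no match here', 'neighbour WANG']
hits = [line for line in lines if word_re.search(line)]

print(hits)
```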

re.compile(pattern, flags=0)

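pattern: the regular expression string to compile.
flags: compilation flags that modify how the pattern matches. Several flags can be combined with |, for example re.I | re.M.

The flags parameter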
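As a hedged illustration of combining flags with |, the sketch below uses an invented three-line string. It pairs re.I with re.M rather than re.L, because Python 3 rejects re.L for str patterns.

```python
import re

text = 'first line\nWANG second\nwang third'

# re.I ignores case; re.M lets ^ match at the start of every line.
pattern = re.compile(r'^wang', re.I | re.M)
found = pattern.findall(text)

# With re.I alone, ^ only matches at the very start of the string.
found_without_m = re.findall(r'^wang', text, re.I)

print(found, found_without_m)
```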

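re.I (re.IGNORECASE)
Makes matching case-insensitive.

re.L (re.LOCALE)
Makes \w, \W, \b, \B and case-insensitive matching depend on the current locale. In Python 3 it can only be used with bytes patterns.

re.M (re.MULTILINE)
Multi-line matching: ^ and $ also match at the start and end of each line.

re.S (re.DOTALL)
Makes . match any character, including a newline.

re.U (re.UNICODE)
Interprets character classes according to the Unicode character set. This flag affects \w, \W, \b and \B. It is already the default for str patterns in Python 3, so it is redundant there.

re.X (re.VERBOSE)
Lets you write more readable patterns: whitespace in the pattern is ignored and # starts a comment.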
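Two of the flags above are easier to see in action. The following is a sketch with made-up inputs; date_re is a hypothetical name used only for this illustration.

```python
import re

text = 'start\nend'

# Without re.S, "." does not match the newline, so this fails.
no_dotall = re.match(r'start.end', text)
# With re.S, "." also matches the newline.
with_dotall = re.match(r'start.end', text, re.S)

# re.X allows whitespace and comments inside the pattern.
date_re = re.compile(r'''
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
''', re.X)
date_parts = date_re.match('2020-02-15').groups()

print(no_dotall, repr(with_dotall.group()), date_parts)
```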

Example:

import re

content = 'Citizen wang , always fall in love with neighbour，WANG'
rr = re.compile(r'wan\w', re.I)  # case-insensitive
print(type(rr))
a = rr.findall(content)
print(type(a))
print(a)

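findall returns a list object. The output below comes from Python 3.6 or earlier; on Python 3.7 and later the first line reads <class 're.Pattern'>.

<class '_sre.SRE_Pattern'>
<class 'list'>
['wang', 'WANG']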

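The re.match() function

Always matches from the beginning of the string. On success it returns a match object (<class '_sre.SRE_Match'>, shown as <class 're.Match'> on Python 3.7 and later); otherwise it returns None.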

re.match(pattern, string[, flags=0])

pattern: the pattern to match, either a pattern string or an object returned by re.compile.
string: the string to match against.
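The pattern argument accepts either a plain pattern string or a compiled object, and a compiled object also has its own match method. A short sketch of the three equivalent spellings, using an illustrative input:

```python
import re

compiled = re.compile(r'hello')

m1 = re.match(r'hello', 'hello world')   # pattern given as a string
m2 = re.match(compiled, 'hello world')   # pattern given as a compiled object
m3 = compiled.match('hello world')       # method on the compiled object

results = [m.group() for m in (m1, m2, m3)]
print(results)
```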

import re

pattern = re.compile(r'hello')
a = re.match(pattern, 'hello world')
b = re.match(pattern, 'world hello')
c = re.match(pattern, 'hell')
d = re.match(pattern, 'hello ')

# match returns None on failure, so test the result before calling group()
if a:
    print(a.group())
else:
    print('a failed')
if b:
    print(b.group())
else:
    print('b failed')
if c:
    print(c.group())
else:
    print('c failed')
if d:
    print(d.group())
else:
    print('d failed')

hello
b failed
c failed
hello
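Case b fails because re.match only looks at the start of the string. For contrast, here is a minimal sketch of re.search, which scans the whole string for the first match; the variable names are illustrative.

```python
import re

pattern = re.compile(r'hello')

at_start = pattern.match('world hello')    # None: 'hello' is not at index 0
anywhere = pattern.search('world hello')   # finds the first occurrence anywhere
span = anywhere.span()

print(at_start, anywhere.group(), span)
```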

Methods and attributes of the match object

import re

text = 'hello world! hello python'
# Three named groups. Group 0 is the whole match 'hello world!',
# group 1 is 'hello', group 2 is the space, group 3 is 'world!'.
pattern = re.compile(r'(?P<first>hell\w)(?P<symbol>\s)(?P<last>.*ld!)')
match = re.match(pattern, text)

print('group 0:', match.group(0))  # the whole match
print('group 1:', match.group(1))  # first group, 'hello'
print('group 2:', match.group(2))  # second group, the space
print('group 3:', match.group(3))  # third group, 'world!'
print('groups:', match.groups())   # tuple of all captured groups
print('start 0:', match.start(0), 'end 0:', match.end(0))  # start and end index of the whole match
print('start 1:', match.start(1), 'end 1:', match.end(1))  # start and end index of group 1
print('start 2:', match.start(2), 'end 2:', match.end(2))  # start and end index of group 2
print('pos:', match.pos)        # index where matching started
print('endpos:', match.endpos)  # index where matching stops; defaults to len(text)
print('lastgroup (name of the last matched group):', match.lastgroup)
print('lastindex (number of the last matched group):', match.lastindex)
print('string (text passed to match):', match.string)
print('re (Pattern object used for the match):', match.re)
print('span(2), i.e. (start(2), end(2)):', match.span(2))
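Since the groups in this pattern are named, they can also be read by name rather than by number. A brief sketch, reusing the same pattern (written with backslash escapes) and the same input:

```python
import re

text = 'hello world! hello python'
pattern = re.compile(r'(?P<first>hell\w)(?P<symbol>\s)(?P<last>.*ld!)')
match = pattern.match(text)

first = match.group('first')     # same value as match.group(1)
named = match.groupdict()        # dict mapping group names to matched text
last_span = match.span('last')   # (start, end) of the group named 'last'

print(first, named, last_span)
```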
