
Text Processing in Python Explained

2020-02-23 00:39:17
Source: reposted
Contributed by: a site user

Strings -- immutable sequences

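As in most high-level programming languages, variable-length strings are a basic type in Python. Python allocates the memory that holds strings (or any other value) "behind the scenes", so the programmer does not have to think about it. Python also has some string-handling features that many other high-level languages lack.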

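In Python, strings are "immutable sequences". A string cannot be modified "in place" (as a byte array could), but a program can refer to its elements or subsequences just as it would with any other sequence. Python refers to subsequences with a flexible "slice" operation, whose notation resembles a range of rows or columns in a spreadsheet. The following interactive session shows how strings and slices are used:
Strings and slices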

>>> s = "mary had a little lamb"
>>> s[0]          # index is zero-based
'm'
>>> s[3] = 'x'    # changing an element in place fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
>>> s[11:18]      # 'slice' a subsequence
'little '
>>> s[:4]         # empty slice-begin assumes zero
'mary'
>>> s[4]          # index 4 is not included in slice [:4]
' '
>>> s[5:-5]       # can use "from end" index with negatives
'had a little'
>>> s[:5]+s[5:]   # slice-begin & slice-end are complementary
'mary had a little lamb'
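Because a string cannot be changed in place, the usual workaround is to build a new string from slices of the old one. The following is a minimal sketch of that idea; `replace_at` is a hypothetical helper name introduced only for illustration, and it assumes a non-negative index.

```python
def replace_at(text, index, new_char):
    """Return a new string with text[index] replaced by new_char.

    Hypothetical helper for illustration; assumes index >= 0.
    """
    # Everything before index, the replacement, everything after index.
    return text[:index] + new_char + text[index + 1:]


s = "mary had a little lamb"
t = replace_at(s, 3, "x")
print(t)  # marx had a little lamb
print(s)  # mary had a little lamb (the original is unchanged)
```

The two slices `text[:index]` and `text[index + 1:]` skip exactly one character, which is why concatenating them around `new_char` gives a string of the same length.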

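Another powerful string operation is the simple in keyword. It provides two intuitive and useful constructs, iteration over the characters of a string and a membership test:
The in keyword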

>>> s = "mary had a little lamb"
>>> for c in s[11:18]: print(c, end=' ')   # print each char in slice
...
l i t t l e
>>> if 'x' in s: print('got x')   # test for char occurrence
...
>>> if 'y' in s: print('got y')   # test for char occurrence
...
got y
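Both uses of in can be combined, and in current Python versions the membership test also accepts multi-character substrings, not only single characters. A short sketch, with variable names chosen only for illustration:

```python
s = "mary had a little lamb"

# Iteration and membership together: keep only the vowels of s.
vowels = [c for c in s if c in "aeiou"]
print(vowels)  # ['a', 'a', 'a', 'i', 'e', 'a']

# Membership also works for whole substrings.
print("little" in s)   # True
print("big" not in s)  # True
```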

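Python offers several ways to write string literals. You can use either single or double quotes, as long as the opening and closing quotes match, and other quoting variants are also in common use. When a string contains line breaks or embedded quotes, triple quotes are a convenient way to define it, as the following example shows:
Using triple quotes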

>>> s2 = """Mary had a little lamb
... its fleece was white as snow
... and everywhere that Mary went
... the lamb was sure to go"""
>>> print(s2)
Mary had a little lamb
its fleece was white as snow
and everywhere that Mary went
the lamb was sure to go
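The quoting styles differ only in how the literal is written, not in the resulting value. The sketch below, using illustrative variable names, shows each quote character embedding the other, and a triple-quoted string comparing equal to a one-line literal with explicit newline escapes.

```python
# Each quote style can embed the other quote character without escaping.
a = 'she said "hello"'
b = "it's a lamb"

# A triple-quoted literal keeps its line breaks as newline characters.
verse = """Mary had a little lamb
its fleece was white as snow"""
same = "Mary had a little lamb\nits fleece was white as snow"

print(verse == same)       # True
print(verse.splitlines())  # ['Mary had a little lamb', 'its fleece was white as snow']
```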

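A single-quoted or triple-quoted string may be prefixed with the letter "r" to tell Python not to interpret backslash escape sequences in it. This is especially convenient for regular expressions, whose special characters rely heavily on backslashes. For example:
Using "r-strings"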

>>> s3 = "this \n and \n that"
>>> print(s3)
this
 and
 that
>>> s4 = r"this \n and \n that"
>>> print(s4)
this \n and \n that
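To see why raw strings matter for regular expressions, the following sketch compares the two literals by length and then uses the standard `re` module with the word-boundary pattern `\b`. The sample text and variable names are illustrative assumptions. In an ordinary string, `"\b"` is the backspace character, so the pattern silently means something else.

```python
import re

plain = "this \n and \n that"   # \n becomes a real newline character
raw = r"this \n and \n that"    # backslash and 'n' stay as two characters

print(len(plain))  # 17
print(len(raw))    # 19

text = "mary had a little lamb"
# Raw string: the regex engine receives \b and treats it as a word boundary.
print(re.search(r"\bhad\b", text) is not None)  # True
# Ordinary string: "\b" is a backspace character, which text does not contain.
print(re.search("\bhad\b", text) is not None)   # False
```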