How to Implement a Simple Markdown Converter in Python

2020-02-15 22:25:22

Source: reposted

Contributor: a reader

Today, on a whim, I wrote a Markdown converter.

import os
import re
import webbrowser

# Sample Markdown used as input throughout the article
text = '''# TextHeader
## Header1
List
- 1
- 2
- 3
> **quote**
》 quote2
## Header2
1. *italic*
2. [@以茄之名](https://www.jb51.net/people/e4f87c3476a926c1e2ef51b4fcd18fa3)
3、 ![](https://www.jb51.net/v2-8560440c136c746730a63813ed701f52_is.jpg)
## Header3
`*[article link](https://zhuanlan.zhihu.com/p/39742445)*`
·**code1**·
- [x]liked?
'''

The program starts with the inline syntax, such as code, strong, and i, which can be replaced directly with regular expressions:

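# Inline code, delimited by a backtick or a middle dot
text = re.sub(re.compile(r'([`·])([^`·]+)[`·]'), r'<code>\2</code>', text)
# Bold runs before italic so that "**" is not consumed as two single asterisks
text = re.sub(re.compile(r'\*\*([^*]+)\*\*'), r'<strong>\1</strong>', text)
text = re.sub(re.compile(r'([^*])\*([^*]+)\*'), r'\1<i>\2</i>', text)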
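To make the order of the substitutions easier to try out, here is a minimal sketch that wraps the same three patterns in a function. The name `render_inline` is hypothetical and not part of the original program.

```python
import re

def render_inline(text):
    """Apply the three inline substitutions in the same order as above."""
    # Inline code: delimited by a backtick or the middle dot "·".
    text = re.sub(r'[`·]([^`·]+)[`·]', r'<code>\1</code>', text)
    # Bold must run before italic, otherwise "**" would be read as two "*".
    text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
    # Italic: a "*" pair that is preceded by some non-asterisk character.
    text = re.sub(r'([^*])\*([^*]+)\*', r'\1<i>\2</i>', text)
    return text

print(render_inline('a `x` b'))
print(render_inline('say **bold** and *it*'))
```

One thing to keep in mind: the italic pattern requires a character before the opening asterisk, so an italic span at the very start of the text is left untouched.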

Next come images and links, which are slightly more complex:

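# Links: the leading ([^!]) keeps image syntax "![...](...)" from matching here
text = re.sub(re.compile(r'([^!])\[([^\]]+)\]\(([^)]+)\)'),
    r'\1<a href="\3" rel="external nofollow" target="_blank">\2</a>', text)
# Images: the alt text may be empty
text = re.sub(re.compile(r'!\[([^\]]*)\]\(([^)]+)\)'),
    r'<img src="\2" >', text)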
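The link pattern above needs one character in front of the opening bracket, so a link at the very start of the text would not be matched. As a possible variation (an assumption of this sketch, not what the original program does), a negative lookbehind excludes images without consuming a character. The function name `render_links` and the added `alt` attribute are likewise hypothetical.

```python
import re

# "(?<!!)" means: the "[" must not be preceded by "!", i.e. not an image.
LINK = re.compile(r'(?<!!)\[([^\]]+)\]\(([^)]+)\)')
IMAGE = re.compile(r'!\[([^\]]*)\]\(([^)]+)\)')

def render_links(text):
    """Convert Markdown links first, then images."""
    text = LINK.sub(r'<a href="\2" target="_blank">\1</a>', text)
    text = IMAGE.sub(r'<img src="\2" alt="\1">', text)
    return text

print(render_links('[a](http://x)'))
print(render_links('![pic](p.png)'))
```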

The remaining syntax is handled line by line, so first split the text into lines:

lines = text.split('\n')
html = ''
list_flag = ''

Handle lists and to-do items:

for line in lines:
    line = line.strip(' ')
    # To-do item: "- [ ]" (open) or "- [x]" (done)
    if re.match(r'- \[[ x]\]', line):
        print('matched')  # debug output
        p_html = ''
        if re.match(r'- \[x\]', line):
            p_html = ' checked="checked"'
        line = re.sub(r'^- \[[ x]\]', '', line)
        html += '''<label class="cssCheckbox">
        <input type="checkbox" %s />
        <span></span>%s
        </label>''' % (p_html, line)
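The to-do branch can also be tried in isolation. The following is a minimal sketch, assuming a single capture group is used to tell the checked and unchecked states apart; `render_todo` is a hypothetical helper name, and the markup is written on one line for brevity.

```python
import re

# Group 1 is " " or "x"; group 2 is the item text after the checkbox.
TODO = re.compile(r'- \[([ x])\]\s*(.*)')

def render_todo(line):
    """Return checkbox HTML for a to-do line, or None for any other line."""
    m = TODO.match(line.strip())
    if m is None:
        return None
    checked = ' checked="checked"' if m.group(1) == 'x' else ''
    return ('<label class="cssCheckbox"><input type="checkbox"%s />'
            '<span></span>%s</label>' % (checked, m.group(2)))

print(render_todo('- [x]done'))
print(render_todo('- [ ] todo'))
```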

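Ordered and unordered lists differ only in their opening and closing tags, ol versus ul, so a list_flag variable records which kind of list is currently open: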

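    # Unordered list item: "+", "-" or "*" followed by a space
    elif re.match(r'[+\-*] ', line):
        if list_flag == '':
            html += '<ul>\n'
            list_flag = 'ul'
        # Anchor at the start so only the leading marker is removed
        line = re.sub(r'^[+\-*] ', '', line)
        html += '<li>%s</li>\n' % (line)
    # Ordered list item: digits followed by "." or "、" and a space
    elif re.match(r'\d+[.、] ', line):
        if list_flag == '':
            list_flag = 'ol'
            html += '<ol>\n'
        line = re.sub(r'^\d+[.、] ', '', line)
        html += '<li>%s</li>\n' % (line)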
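The list_flag idea can be sketched as a small state machine. The version below goes slightly beyond the code above, and both additions are assumptions of this sketch: it switches tags when a ul is directly followed by an ol, and it closes a list that is still open when the input ends. The name `render_lists` is hypothetical.

```python
import re

BULLET = re.compile(r'[+\-*] ')
NUMBERED = re.compile(r'\d+[.、] ')

def render_lists(lines):
    """Wrap consecutive list items in <ul> or <ol>, tracking the open tag."""
    html = ''
    list_flag = ''  # '', 'ul' or 'ol': which list tag is currently open
    for line in lines:
        line = line.strip(' ')
        if BULLET.match(line):
            wanted, item = 'ul', BULLET.sub('', line, count=1)
        elif NUMBERED.match(line):
            wanted, item = 'ol', NUMBERED.sub('', line, count=1)
        else:
            wanted, item = '', line
        if list_flag != wanted:
            if list_flag:
                html += '</%s>\n' % list_flag   # close the previous list
            if wanted:
                html += '<%s>\n' % wanted       # open the new list
            list_flag = wanted
        if wanted:
            html += '<li>%s</li>\n' % item
        elif item:
            html += item + '\n'                 # non-list lines pass through
    if list_flag:
        html += '</%s>\n' % list_flag           # close a list left open at the end
    return html

print(render_lists(['- a', '- b', '1. c']))
```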

With lists out of the way, handle the remaining syntax:

    else:
        # Any non-list line closes the list that is currently open
        if list_flag != '':
            html += '</%s>\n' % list_flag
            list_flag = ''
        # Heading: the number of leading "#" gives the level
        if re.match(r'#+', line):
            well = re.match(r'#+', line).group().count('#')
            line = re.sub(r'^#+', '', line)
            html += '<h%i>%s</h%i>\n' % (well, line, well)
        # Blockquote: accepts both ">" and the full-width "》"
        elif re.match(r'[>》 ]', line):
            line = re.sub(r'^\s*[>》 ]', '', line)
            html += '<blockquote>%s</blockquote>\n' % (line)
        else:
            html += line
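The heading step can be illustrated on its own. This is a minimal sketch under two assumptions that the code above does not make: the level is capped at 6, since HTML defines only h1 through h6, and the whitespace after the hashes is dropped. `render_heading` is a hypothetical name.

```python
import re

# Group 1: the leading hashes; group 2: the heading text.
HEADING = re.compile(r'(#+)\s*(.*)')

def render_heading(line):
    """Return heading HTML for a line starting with "#", else None."""
    m = HEADING.match(line.strip())
    if m is None:
        return None
    level = min(len(m.group(1)), 6)  # HTML defines only h1 through h6
    return '<h%d>%s</h%d>\n' % (level, m.group(2), level)

print(render_heading('## Header1'))
print(render_heading('# C# notes'))
```

Because only the leading run of hashes is captured, a "#" inside the heading text, as in the second call, is preserved.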

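I made one small change here so that both > and 》 are converted to blockquotes, mainly because switching between Chinese and English punctuation while typing is too much of a hassle.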
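As a rough illustration of accepting both markers, the sketch below matches either character at the start of a line and drops the whitespace that follows it. The helper name `render_quote` is hypothetical, and trimming the whitespace is an assumption of this sketch.

```python
import re

# Either the ASCII ">" or the full-width "》" starts a quote.
QUOTE = re.compile(r'[>》]\s*(.*)')

def render_quote(line):
    """Return blockquote HTML for a quoted line, else None."""
    m = QUOTE.match(line.strip())
    if m is None:
        return None
    return '<blockquote>%s</blockquote>\n' % m.group(1)

print(render_quote('> quote'))
print(render_quote('》 quote2'))
```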
