[Python Study Notes-011] Handling HTML Special Characters

2021/9/19 14:06:08

HTML has a large number of special characters. To encode and decode them in Python, use the html module. For example:

>>> import html
>>> s = ' " '
>>> html.escape(s)
' &quot; '
>>> html.unescape(' &quot; ')
' " '
>>>

So, to encode special characters, use html.escape(); to decode them, use html.unescape().
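As a small illustrative sketch (the sample string `raw` is made up for demonstration and is not from the original notes), html.escape() also converts `&`, `<` and `>`, and its `quote` parameter controls whether quotation marks are converted too:

```python
import html

# Hypothetical sample text containing all five characters html.escape() handles.
raw = '<a href="x">Tom & Jerry\'s</a>'

# Default (quote=True): &, <, >, " and ' are all converted.
escaped = html.escape(raw)
print(escaped)

# quote=False: only &, < and > are converted; quotes are left as they are.
escaped_no_quotes = html.escape(raw, quote=False)
print(escaped_no_quotes)

# html.unescape() turns either form back into the original string.
restored = html.unescape(escaped)
print(restored == raw)
```

Leaving quotes unescaped is only reasonable when the result goes into element content; for text placed inside an attribute value, the default is the safer choice.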

A handy little script (foo.py)

#!/usr/bin/python3
""" Encode/Decode HTML special chars """

import sys
import html


def main(argc, argv):
    # Expect exactly two arguments: the operation flag and the text.
    if argc != 3 or argv[1] not in ('-e', '-d'):
        print("Usage: %s <-e|-d> <chars>" % argv[0], file=sys.stderr)
        return 1
    op = argv[1]
    chars = argv[2]
    if op == '-e':
        # Encode special characters as HTML character references.
        print(html.escape(chars))
    else:
        # Decode character references back to plain characters.
        print(html.unescape(chars))
    return 0


if __name__ == '__main__':
    sys.exit(main(len(sys.argv), sys.argv))

Sample:

$ python3 foo.py -e \"
&quot;
$ python3 foo.py -d '&quot;'
"

$ python3 foo.py -e \'
&#x27;
$ python3 foo.py -d '&#x27;'
'

$ python3 foo.py -d '&sum;'
∑
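The last sample decodes a named character reference. As a rough sketch of where such names come from (the variable names below are illustrative), the standard library's html.entities module exposes the lookup tables, and html.unescape() also accepts the decimal and hexadecimal numeric forms of the same character:

```python
import html
from html.entities import html5, name2codepoint

# Named reference, as in the sample above.
named = html.unescape('&sum;')

# The same character written as decimal and hexadecimal numeric references.
decimal_ref = html.unescape('&#8721;')
hex_ref = html.unescape('&#x2211;')
print(named, decimal_ref, hex_ref)

# html5 maps reference names (here with the trailing semicolon) to characters.
print(html5['sum;'])

# name2codepoint maps HTML 4 entity names to Unicode code points.
code_point = name2codepoint['sum']
print(hex(code_point))
```

All three forms decode to the same character, U+2211.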

References:

https://dev.w3.org/html5/html-author/charref
html.escape() in Python
HTML Named character references
