
A classic Python snippet for converting an English text into a word2id dict

2019-11-06 08:26:20
Source: reposted
Contributed by: a site user
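import collections

import tensorflow as tf


def _read_words(filename):
    # Read the whole file and split it into tokens, marking each line end with "<eos>".
    # tf.gfile is the TensorFlow 1.x API; in TensorFlow 2.x use tf.io.gfile.GFile.
    with tf.gfile.GFile(filename, "r") as f:
        # Python 3: read() already returns str. On Python 2, add .decode("utf-8") after read().
        return f.read().replace("\n", "<eos>").split()


def _build_vocab(filename):
    data = _read_words(filename)
    counter = collections.Counter(data)
    # Sort by descending frequency, breaking ties alphabetically.
    count_pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))
    words, _ = list(zip(*count_pairs))
    # The most frequent word gets id 0, the next id 1, and so on.
    word_to_id = dict(zip(words, range(len(words))))
    return word_to_id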

Excerpted from https://github.com/tensorflow/models/blob/master/tutorials/rnn/ptb/reader.py
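To see what the resulting mapping looks like without installing TensorFlow, here is a minimal sketch that applies the same frequency-ranking logic to an in-memory string. The names build_vocab_from_text and words_to_ids are hypothetical helpers introduced only for this illustration. The sketch also assumes newlines are replaced with " <eos> " (padded with spaces), which differs slightly from the snippet above so that it works on text without spaces around line breaks.

```python
import collections


def build_vocab_from_text(text):
    """Hypothetical helper: rank words by frequency, as _build_vocab does for a file."""
    # Assumption: pad "<eos>" with spaces so it becomes its own token.
    words = text.replace("\n", " <eos> ").split()
    counter = collections.Counter(words)
    # Descending count first, then alphabetical order to break ties.
    count_pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))
    return {word: idx for idx, (word, _) in enumerate(count_pairs)}


def words_to_ids(text, word_to_id):
    """Hypothetical helper: map each known word to its id, skipping unknown words."""
    words = text.replace("\n", " <eos> ").split()
    return [word_to_id[w] for w in words if w in word_to_id]


sample = "the cat sat\nthe dog sat\n"
vocab = build_vocab_from_text(sample)
print(vocab)
# {'<eos>': 0, 'sat': 1, 'the': 2, 'cat': 3, 'dog': 4}
print(words_to_ids(sample, vocab))
# [2, 3, 1, 0, 2, 4, 1, 0]
```

The three tokens that occur twice ("<eos>", "sat", "the") get the lowest ids, ordered alphabetically among themselves, which is why the tie-breaking key matters: it makes the mapping deterministic across runs.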

