Removing duplicate words in Python: counting repeated words
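I can see where you are going with the sort: once the words are sorted, you can reliably tell when you have hit a new word and keep a count for each unique word. However, what you really want is a hash (a dictionary) to track the counts, since dictionary keys are unique. For example: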
words = sentence.split()
counts = {}
for word in words:
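    if word not in counts:
        counts[word] = 0
    counts[word] += 1

That gives you a dictionary whose keys are the words and whose values are the number of times each word appears. With collections.defaultdict(int), a missing key starts at 0, so you can drop the membership check and just add to the value: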
import collections

counts = collections.defaultdict(int)
for word in words:
    counts[word] += 1

But there is something even better: collections.Counter turns your list of words into a dictionary of counts (it is a subclass of dict).
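counts = collections.Counter(words)

From there you need the words and their counts in sorted order so you can print them. items() gives you the (word, count) pairs as tuples, and sorted() orders tuples by their first item by default (here, the word), which is exactly what you want.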
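To see the default tuple ordering in isolation, here is a minimal sketch. The sample words and the names by_word and by_count are made up for illustration, and the sort by count is an optional variation that the original answer does not use.

```python
import collections

# Hypothetical sample input, chosen so the two orderings differ.
words = "pear apple pear fig pear apple".split()
word_counts = collections.Counter(words)

# Default: tuples compare element by element, so the word decides the order.
by_word = sorted(word_counts.items())

# Variation (an assumption, not part of the original answer):
# highest count first, ties broken alphabetically.
by_count = sorted(word_counts.items(), key=lambda item: (-item[1], item[0]))

print(by_word)   # [('apple', 2), ('fig', 1), ('pear', 3)]
print(by_count)  # [('pear', 3), ('apple', 2), ('fig', 1)]
```

The complete program that follows relies only on the default ordering.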
import collections
sentence = """As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"""
words = sentence.split()
word_counts = collections.Counter(words)
for word, count in sorted(word_counts.items()):
    print('"%s" is repeated %d time%s.' % (word, count, "s" if count > 1 else ""))

OUTPUT
"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.
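
Note that "As" and "as" are listed separately in the output, because split() and Counter are case-sensitive and uppercase letters sort before lowercase ones. If the two spellings should count as the same word (an assumption, since the original answer does not say so), one possible sketch is to lowercase each word before counting:

```python
import collections

sentence = ("As far as the laws of mathematics refer to reality they are "
            "not certain as far as they are certain they do not refer to reality")

# Assumption: upper- and lower-case spellings count as the same word.
words = [word.lower() for word in sentence.split()]
word_counts = collections.Counter(words)

for word, count in sorted(word_counts.items()):
    print('"%s" appears %d time%s.' % (word, count, "s" if count > 1 else ""))
```

With this change "as" is counted 4 times and there is no separate "As" entry; all other counts stay the same.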