Spam classification: add Laplace smoothing
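
For readers unfamiliar with the technique named in the commit title, here is a minimal sketch of Laplace (add-one) smoothing in a multinomial Naive Bayes classifier. It is not taken from `bayes-modify.py`; the function names `train_naive_bayes` and `classify` and the `alpha` parameter are assumptions made for illustration, and the repository's own smoothing may use different initial counts. The toy posts are the ones in the `loadDataSet` function shown in the diff.

```python
import math
from collections import Counter

# Toy posts and labels from loadDataSet in the diff (1 = abusive, 0 = not abusive).
posts = [['my', 'dog', 'has', 'flea', 'problems', 'help', 'please'],
         ['maybe', 'not', 'take', 'him', 'to', 'dog', 'park', 'stupid'],
         ['my', 'dalmation', 'is', 'so', 'cute', 'I', 'love', 'him'],
         ['stop', 'posting', 'stupid', 'worthless', 'garbage'],
         ['mr', 'licks', 'ate', 'my', 'steak', 'how', 'to', 'stop', 'him'],
         ['quit', 'buying', 'worthless', 'dog', 'food', 'stupid']]
labels = [0, 1, 0, 1, 0, 1]


def train_naive_bayes(docs, classes, alpha=1.0):
    """Hypothetical helper: estimate log P(class) and log P(word | class).

    alpha is added to every word count so that a word never seen in a
    class still gets a small non-zero probability (alpha=1 is add-one).
    """
    vocab = sorted({word for doc in docs for word in doc})
    log_prior, log_likelihood = {}, {}
    for c in sorted(set(classes)):
        class_docs = [doc for doc, y in zip(docs, classes) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(word for doc in class_docs for word in doc)
        # Smoothed denominator: total words in class + alpha * vocabulary size.
        denom = sum(counts.values()) + alpha * len(vocab)
        log_likelihood[c] = {w: math.log((counts[w] + alpha) / denom)
                             for w in vocab}
    return vocab, log_prior, log_likelihood


def classify(doc, log_prior, log_likelihood):
    """Hypothetical helper: pick the class with the highest log posterior.

    Words outside the training vocabulary are ignored.
    """
    scores = {c: log_prior[c] + sum(log_likelihood[c][w]
                                    for w in doc if w in log_likelihood[c])
              for c in log_prior}
    return max(scores, key=scores.get)


vocab, log_prior, log_likelihood = train_naive_bayes(posts, labels)
print(classify(['stupid', 'garbage'], log_prior, log_likelihood))
print(classify(['love', 'my', 'dalmation'], log_prior, log_likelihood))
```

Without smoothing, a word such as `garbage` that never appears in the non-abusive posts would get probability zero for that class, and the whole product (or its logarithm) would be undefined. Working in log space also avoids underflow when many small probabilities are multiplied.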
Jack-Cherish committed Oct 23, 2017
1 parent 4da1716 commit e11559b
Showing 1 changed file with 0 additions and 24 deletions.
24 changes: 0 additions & 24 deletions Naive Bayes/bayes-modify.py
@@ -3,30 +3,6 @@
import random
import re

"""
Function description: create the sample data set
Parameters:
Returns:
    postingList - sample posts split into tokens
    classVec - class label vector
Author:
    Jack Cui
Blog:
    http://blog.csdn.net/c406495762
Modify:
    2017-08-11
"""
def loadDataSet():
    postingList = [['my', 'dog', 'has', 'flea', 'problems', 'help', 'please'],  # tokenized posts
                   ['maybe', 'not', 'take', 'him', 'to', 'dog', 'park', 'stupid'],
                   ['my', 'dalmation', 'is', 'so', 'cute', 'I', 'love', 'him'],
                   ['stop', 'posting', 'stupid', 'worthless', 'garbage'],
                   ['mr', 'licks', 'ate', 'my', 'steak', 'how', 'to', 'stop', 'him'],
                   ['quit', 'buying', 'worthless', 'dog', 'food', 'stupid']]
    classVec = [0, 1, 0, 1, 0, 1]  # class label vector: 1 = abusive, 0 = not abusive
    return postingList, classVec  # return the tokenized posts and the class label vector

"""
Function description: collect the tokenized sample posts into a list of unique tokens, i.e. the vocabulary
