Book6_Ch04, Section 3: 4.3 Standardization: Z-Score #16

Open
virtualxiaoman opened this issue May 25, 2024 · 0 comments

Comments

@virtualxiaoman

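Original text: "Standardization usually refers to scaling data onto a standard normal distribution with mean 0 and standard deviation 1."
Concern: A mean of 0 and a standard deviation of 1 are certainly correct, but saying that the data is scaled onto a standard normal distribution seems inaccurate to me, or at least easy to misread.
Reason: Z-score standardization does not change the shape of the original distribution, so the result does not become normally distributed. For example, consider the following code: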

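```python
import matplotlib.pyplot as plt
from scipy import stats

# Generate a sample from a log-normal distribution
x = stats.lognorm.rvs(0.5, size=1000)
# Plot a histogram of the raw data
plt.hist(x, bins=50, density=True, alpha=0.6, color='g')
plt.show()
# Apply z-score standardization to x
x_zscore = stats.zscore(x)
# Plot a histogram of the standardized data
plt.hist(x_zscore, bins=50, density=True, alpha=0.6, color='g')
plt.show()
```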

[Image: output of the code above]
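The histograms give a visual impression only. As a complementary numeric check, here is a minimal sketch that compares summary statistics before and after standardization. The fixed seed, the variable names, and the use of `stats.skew` and `stats.normaltest` are my own additions for illustration and are not part of the book's code.

```python
import numpy as np
from scipy import stats

# Fixed seed so the run is reproducible (an assumption, not in the original snippet)
rng = np.random.default_rng(0)

# Same log-normal sample as in the issue, then z-score standardization
x = stats.lognorm.rvs(0.5, size=1000, random_state=rng)
z = stats.zscore(x)  # uses the population standard deviation (ddof=0) by default

mean_z = z.mean()          # should be 0 up to floating-point error
std_z = z.std()            # should be 1 up to floating-point error (ddof=0)
skew_x = stats.skew(x)     # sample skewness of the raw data
skew_z = stats.skew(z)     # sample skewness after standardization

# D'Agostino-Pearson normality test on the standardized data
_, p_value = stats.normaltest(z)

print(f"mean(z) = {mean_z:.3e}, std(z) = {std_z:.6f}")
print(f"skew(x) = {skew_x:.4f}, skew(z) = {skew_z:.4f}")
print(f"normality test p-value = {p_value:.3e}")
```

If the concern above is right, the mean and standard deviation of `z` should come out as 0 and 1, the skewness should be the same before and after, and the normality test should still reject normality for the standardized data.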
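Suggestion: Z-score standardization does not change the shape of the original distribution, so I think the sentence could be changed to: "so that the processed data has a fixed mean of 0 and standard deviation of 1 (though it is not necessarily normal)."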
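A short derivation may help justify this wording. The notation is my own: $\bar{x}$ is the sample mean and $s > 0$ the population-style sample standard deviation (dividing by $n$, as `stats.zscore` does by default). The z-score is an affine map of each data point:

$$
z_i = \frac{x_i - \bar{x}}{s}, \qquad
\bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i = \frac{1}{s}\left(\frac{1}{n}\sum_{i=1}^{n} x_i - \bar{x}\right) = 0, \qquad
\frac{1}{n}\sum_{i=1}^{n} (z_i - \bar{z})^2 = \frac{1}{s^2}\cdot\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = 1 .
$$

For any order $k$, the standardized moment is unchanged:

$$
\frac{1}{n}\sum_{i=1}^{n} z_i^{k} = \frac{1}{n}\sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s}\right)^{k} .
$$

The right-hand side is exactly the $k$-th standardized moment of the original data, so skewness ($k=3$) and kurtosis ($k=4$) are the same before and after. Only the location and scale change. A skewed sample therefore stays skewed, and only data that was already normal ends up standard normal.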
