置信区间,就是一种区间估计。
先来看看什么是点估计,什么是区间估计。
以前很流行一种刮刮卡:
游戏规则是(假设只有一个大奖):
大奖事先就固定好了,一定印在某一张刮刮卡上
买了刮刮卡之后,刮开就知道自己是否中奖
那么我们起码有两种策略来刮奖:
很显然区间估计的命中率会更高(当然费用会更高,因为风险降低了)。
接下来,我们看看置信区间是如何进行区间估计的。
我们通过对人类身高的估计来讲解什么是置信区间。
2.1 上帝视角
对于人类真实的平均身高,我们是没有办法知道的,因为几乎不可能把每个人都统计到。
但这个数据肯定是真实存在的,我们可以说,上帝知道。
在这里我们引入了上帝视角,即上帝看到的人类身高的真实分布。
假设人类的身高分布服从如下正态分布(
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3BC%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mo%22%20transform%3D%22translate(881%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-3D%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%20transform%3D%22translate(1937%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-31%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-34%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-35%22%20x%3D%221001%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mo%22%20transform%3D%22translate(3439%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-2C%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%20transform%3D%22translate(3884%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3C3%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mo%22%20transform%3D%22translate(4734%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-3D%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%20transform%3D%22translate(5790%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-31%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-2E%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-34%22%20x%3D%22779%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
):
也就是说全体人类的平均身高为145cm,为了表示只有上帝可以看到,我把真实分布用虚线来表示:
2.2 点估计
作为愚蠢的人类,我们只能在人群中抽样统计:
比如下面是一次抽样数据,我把算出来的样本均值(记作
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-texatom%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-munderover%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3BC%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mo%22%20transform%3D%22translate(79%2C20)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-5E%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
)画在图上(蓝色的点):
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-texatom%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-munderover%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3BC%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mo%22%20transform%3D%22translate(79%2C20)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-5E%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
就是对真实的
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3BC%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
的一次点估计。
通过一次次的抽样,我们可以算出不同的身高均值的点估计:
如果我们关闭上帝视角,我们分辨不出哪个点估计更好:
区间估计可以改进此问题。
2.3 置信区间
置信区间,提供了一种区间估计的方法。
下面采用
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-39%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-35%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%20transform%3D%22translate(1001%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-25%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
置信区间来构造区间估计(什么是
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-39%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-35%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%20transform%3D%22translate(1001%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-25%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
置信区间,这个我们后面解释):
通过
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-39%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-35%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%20transform%3D%22translate(1001%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-25%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
置信区间构造出来的区间,我们可以看到,基本上都包含了真实的
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3BC%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
,除了红色的那根。
关闭上帝视角,我们仍然不知道哪一个区间估计更好:
但是,和点估计比较:
点估计和区间估计,都不知道哪个点或者哪个区间更好
但是,按照
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-39%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-35%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%20transform%3D%22translate(1001%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-25%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
置信区间构造出来的区间,随便选一个区间,有
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-39%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-35%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%20transform%3D%22translate(1001%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-25%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
的概率会包含
这就好像用渔网捞鱼,我知道每一网下去有
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-39%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-35%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%20transform%3D%22translate(1001%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-25%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
的几率捞到想要的那条鱼,但是并不知道是不是现在这一网:
剩下的问题就是
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-39%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-35%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%20transform%3D%22translate(1001%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-25%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
置信区间是如何构造的。
3
置信区间假设人群的身高服从:
其中
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3BC%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
未知,
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3C3%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
已知。
我们不断对人群进行采样,样本的大小为
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-6E%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
,样本的均值:
根据大数定律和中心极限定律,
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-4D%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
服从:
我们可以算出以
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3BC%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
为中心,面积为0.95的区间,如下图:
即:
也就是,
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-4D%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
有
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-39%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-35%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%20transform%3D%22translate(1001%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-25%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
的几率落入此区间:
那么,只有一个问题了,我们不知道、并且永远都不会知道真实的
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3BC%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
是多少。
我们就只有用
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-texatom%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-munderover%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3BC%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mo%22%20transform%3D%22translate(79%2C20)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-5E%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
来代替
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMATHI-3BC%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
:
总结一下:
置信区间要求估计量是个常数
%22%20aria-hidden%3D%22true%22%3E%0A%3Cg%20class%3D%22mjx-svg-mrow%22%3E%0A%3Cg%20class%3D%22mjx-svg-mn%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-39%22%3E%3C%2Fuse%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-35%22%20x%3D%22500%22%20y%3D%220%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3Cg%20class%3D%22mjx-svg-mi%22%20transform%3D%22translate(1001%2C0)%22%3E%0A%20%3Cuse%20xlink%3Ahref%3D%22%23E1-MJMAIN-25%22%3E%3C%2Fuse%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fg%3E%0A%3C%2Fsvg%3E)
也被称为置信水平,是统计中的一个习惯,可以根据应用进行调整