Machine Learning Foundations Week 4 Notes (Coursera)


Feasibility of Learning

Learning is Impossible?

Probability to the Rescue

Inferring Something Unknown

in-sample $\rightarrow$ out-of-sample

Possible versus Probable

Hoeffding’s Inequality

In a big sample ($N$ large), $v$ is probably close to $u$ (within $\epsilon$)

$\mathbb{P}\big[\,|v-u|>\epsilon\,\big]\leq 2\exp(-2\epsilon^2 N)$

called Hoeffding's Inequality; it holds for marbles, coins, and polling

the statement '$v=u$' is probably approximately correct (PAC)

  • valid for all $N$ and $\epsilon$

  • does not depend on $u$; no need to know $u$

  • larger sample size $N$ or looser gap $\epsilon$ $\Rightarrow$ higher probability that '$v=u$'

    if $N$ is large, we can probably infer the unknown $u$ from the known $v$
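The inequality can be checked with a quick Monte Carlo sketch (the bin probability $u$, sample size $N$, and $\epsilon$ below are arbitrary toy choices, not values from the lecture):

```python
import math
import random

# Monte Carlo check of Hoeffding's inequality (toy settings).
# u: true orange-marble probability in the bin; v: orange fraction in a sample of size N.
u, N, eps, trials = 0.4, 100, 0.1, 20000
random.seed(0)

bad = 0  # number of samples with |v - u| > eps
for _ in range(trials):
    v = sum(random.random() < u for _ in range(N)) / N
    if abs(v - u) > eps:
        bad += 1

empirical = bad / trials
bound = 2 * math.exp(-2 * eps**2 * N)  # Hoeffding: 2 exp(-2 eps^2 N), about 0.27 here
print(empirical, bound)  # the empirical BAD-sample rate stays below the bound
```

Note that the bound is loose at these settings: the simulated rate of bad samples is well under $2\exp(-2\epsilon^2 N)$, which is exactly what a distribution-free bound has to tolerate.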

Connection to Learning

Added Components

for any fixed $h$, we can probably infer the unknown $E_{out}(h)=\mathbb{E}_{x\sim P}\big[h(x)\ne f(x)\big]$ from the known $E_{in}(h)=\frac{1}{N}\sum^N_{n=1}\big[h(x_n)\ne f(x_n)\big]$
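A small sketch of these two quantities, in an assumed toy setup where $E_{out}(h)$ is known exactly (uniform inputs, a threshold target $f$, and a fixed threshold $h$ — all illustrative choices, not from the lecture):

```python
import random

# Sketch: E_in for one fixed h vs. the exact E_out, in an assumed toy setup:
# x ~ Uniform[-1, 1], target f(x) = sign(x), fixed h(x) = sign(x - 0.1),
# so h and f disagree exactly on (0, 0.1] and E_out(h) = 0.1 / 2 = 0.05.
random.seed(1)
f = lambda x: 1 if x > 0 else -1
h = lambda x: 1 if x > 0.1 else -1

N = 5000
xs = [random.uniform(-1, 1) for _ in range(N)]
E_in = sum(h(x) != f(x) for x in xs) / N  # in-sample error on N points
E_out = 0.05                              # exact out-of-sample error
print(E_in, E_out)  # E_in lands close to E_out for large N
```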

The Formal Guarantee

if $E_{in}(h)\approx E_{out}(h)$ and $E_{in}(h)$ small $\Rightarrow$ $E_{out}(h)$ small $\Rightarrow$ $h\approx f$ with respect to $P$

Verification of One h

if $E_{in}(h)$ is small for the fixed $h$ and $\mathcal{A}$ picks this $h$ as $g$ $\Rightarrow$ '$g=f$' PAC

if $\mathcal{A}$ is forced to pick THE $h$ as $g$ $\Rightarrow$ $E_{in}(h)$ almost always not small $\Rightarrow$ '$g\ne f$' PAC

real learning:

$\mathcal{A}$ shall make choices within $\mathcal{H}$ (like PLA) rather than being forced to pick one $h$.

The ‘Verification’ Flow

Connection to Real Learning

BAD Sample and BAD Data

BAD Sample: $E_{out}=\frac{1}{2}$, but getting all heads ($E_{in}=0$)

BAD Data for One $h$: $E_{out}(h)$ and $E_{in}(h)$ far apart

BAD Data for Many $h$

BAD data for many $h$ $\Leftrightarrow$ no 'freedom of choice' by $\mathcal{A}$ $\Leftrightarrow$ there exists some $h$ such that $E_{out}(h)$ and $E_{in}(h)$ are far apart
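The coin analogy makes this concrete: with many fair coins (hypotheses), some coin will probably look perfect in sample. A simulation sketch (the counts below are illustrative):

```python
import random

# Sketch of the coin analogy: M fair coins, each tossed `flips` times. With many
# coins, some coin is likely to show all heads (E_in = 0 even though E_out = 1/2).
random.seed(2)
M, flips, trials = 150, 5, 10000

hits = 0  # trials in which at least one of the M coins came up all heads
for _ in range(trials):
    if any(all(random.random() < 0.5 for _ in range(flips)) for _ in range(M)):
        hits += 1

empirical = hits / trials
analytic = 1 - (1 - 0.5**flips) ** M  # 1 - (31/32)^150, close to 1
print(empirical, analytic)
```

So even though each individual coin is very unlikely to show all heads, with 150 coins some coin almost certainly will; picking that coin because its $E_{in}$ looks best is exactly the BAD-data trap.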

Bound of BAD Data
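The bound this section refers to follows from the union bound over the $M$ hypotheses together with Hoeffding applied to each fixed $h_m$:

```latex
\begin{aligned}
\mathbb{P}_{\mathcal{D}}[\text{BAD }\mathcal{D}]
&= \mathbb{P}_{\mathcal{D}}[\text{BAD }\mathcal{D}\text{ for }h_1
   \text{ or } \dots \text{ or BAD }\mathcal{D}\text{ for }h_M] \\
&\leq \mathbb{P}_{\mathcal{D}}[\text{BAD }\mathcal{D}\text{ for }h_1]
   + \dots + \mathbb{P}_{\mathcal{D}}[\text{BAD }\mathcal{D}\text{ for }h_M]
   && \text{(union bound)} \\
&\leq 2\exp(-2\epsilon^2 N) + \dots + 2\exp(-2\epsilon^2 N)
   && \text{(Hoeffding for each fixed }h_m\text{)} \\
&= 2M\exp(-2\epsilon^2 N)
\end{aligned}
```

The factor $M$ is the price of letting $\mathcal{A}$ choose freely among $M$ hypotheses.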

The Statistical Learning Flow

if $|\mathcal{H}| = M$ is finite and $N$ is large enough, then for whatever $g$ is picked by $\mathcal{A}$, $E_{out}(g)\approx E_{in}(g)$

if $\mathcal{A}$ finds one $g$ with $E_{in}(g)\approx 0$, the PAC guarantee gives $E_{out}(g)\approx 0$ $\Rightarrow$ learning is possible
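A minimal end-to-end sketch of this flow, assuming a toy finite $\mathcal{H}$ of threshold hypotheses (the target, the grid of thresholds, and all sizes below are illustrative choices):

```python
import random

# Sketch of the full flow with an assumed toy finite H: M = 21 threshold
# hypotheses h_t(x) = sign(x - t); A picks the g with the smallest E_in.
random.seed(3)
target_t = 0.3
f = lambda x: 1 if x > target_t else -1  # unknown target (here: a threshold at 0.3)

thresholds = [-1 + 2 * m / 20 for m in range(21)]  # |H| = M = 21, t in [-1, 1]
N = 200
xs = [random.uniform(-1, 1) for _ in range(N)]
ys = [f(x) for x in xs]

def e_in(t):
    """In-sample error of the threshold hypothesis h_t on the N data points."""
    return sum((1 if x > t else -1) != y for x, y in zip(xs, ys)) / N

g_t = min(thresholds, key=e_in)     # A: choose g freely within H by minimizing E_in
E_out_g = abs(g_t - target_t) / 2   # exact E_out: disagreement interval / total length
print(g_t, e_in(g_t), E_out_g)     # small E_in and, as the guarantee promises, small E_out
```

With $M=21$ and $N=200$, the bound $2M\exp(-2\epsilon^2 N)$ is already small for modest $\epsilon$, so the $g$ with $E_{in}(g)\approx 0$ also has small $E_{out}(g)$.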

$M=\infty$? See you in the next lectures~

Rant

This week's homework is really hard; it took me a full hour and a half to barely pass, especially the last few problems that implement PLA and the pocket algorithm.