watch: AML 02 | Learning Feasibility

3 components of using learning

  1. A pattern exists
  2. We cannot pin it down mathematically
  3. We have data (most important)

Learning setup

  1. Unknown target function $f$
  2. Dataset: $(x_1, y_1), \dots, (x_N, y_N)$ where $y_n = f(x_n)$
  3. Learning algorithm picks $g \approx f$ from hypothesis set $H$

Perceptron Learning Algorithm
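
A minimal NumPy sketch of PLA under this setup: an "unknown" linear target $f$ generates separable labels, and the algorithm picks $g$ by correcting one misclassified point at a time. The 2-D inputs, the random target, and the iteration cap are illustrative assumptions, not from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Unknown" target f: a random linear separator, used only to label the data.
w_target = rng.normal(size=3)

# Dataset: N points (with x0 = 1 as the bias coordinate), labels y_n = f(x_n).
N = 100
X = np.hstack([np.ones((N, 1)), rng.uniform(-1, 1, size=(N, 2))])
y = np.sign(X @ w_target)

# PLA: start at w = 0; while some point is misclassified, pick one and
# update w <- w + y_n * x_n. Converges if the data are linearly separable.
w = np.zeros(3)
for _ in range(10_000):  # iteration cap, an assumption for safety
    mis = np.flatnonzero(np.sign(X @ w) != y)
    if mis.size == 0:
        break  # g = sign(w·x) now agrees with f on every training point
    n = rng.choice(mis)
    w = w + y[n] * X[n]

print("misclassified points:", np.sum(np.sign(X @ w) != y))
```

The final hypothesis g is only an in-sample fit; whether it says anything about f outside the data is exactly the feasibility question below.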

Feasibility of learning

Having established the learning setup and the role of the data, we now ask whether learning is feasible. To answer this, we start with a specific probabilistic experiment, the bin model, quickly reviewed below.

Suppose we have a bin containing red and green marbles. The probability of picking a red marble is called $\mu$, and the probability of picking a green marble is $1-\mu$.

BIN Model

  • Bin with red and green marbles; pick a sample of N marbles independently

    • $\mu$: probability of picking a red marble from the bin (green: $1-\mu$)
    • $\nu$: fraction of red marbles in the sample
  • In a large sample (large N), ν is probably close to μ (within tolerance ε)

    Hoeffding’s Inequality:

    $$
    \begin{aligned}
    P[\text{Bad}] &= P[|\nu-\mu|>\epsilon] \le 2e^{-2\epsilon^2 N}, && \text{for any } \epsilon>0 \\
    P[\text{Good}] &= P[|\nu-\mu|\le\epsilon] \ge 1-2e^{-2\epsilon^2 N}, && \text{for any } \epsilon>0
    \end{aligned}
    $$

    • When N is large, the bound 2e^{-2ε²N} is small, so P[Bad] is small and ν is very likely within ε of μ; a larger ε also shrinks the bound, but only by loosening the tolerance. (The simulation sketch at the end of this section checks these numbers.)

      • N=1000, ε=0.05: bound 2e^{-5} ≈ 0.0135, so ν-ε ≤ μ ≤ ν+ε with probability ≥ 0.986
      • N=1000, ε=0.1: bound 2e^{-20} ≈ 4×10⁻⁹, so ν-ε ≤ μ ≤ ν+ε with probability ≥ 1-4×10⁻⁹ (essentially 1)
      • μ∈[ν-ε, ν+ε], i.e. the error bar on the estimate is ±ε
    • We learn from ν (the sample) and reach outside the data to μ (the whole bin). There is still some probability of drawing a misleading sample, but not often.

    • μ≈ν is probably approximately correct (PAC learning)

      “probably” : the statement fails with probability at most 2exp(-2ε²N)
      “approximately” : μ is pinned down only to within the tolerance ε
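
As a sanity check on the numbers above, here is a small Monte Carlo sketch of the bin model against the Hoeffding bound. The true μ, the sample size, and the number of trials are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, N, trials = 0.5, 1000, 100_000  # assumed: true red fraction, sample size, repetitions

# nu for each trial: fraction of red marbles in an independent sample of N draws.
nu = rng.binomial(N, mu, size=trials) / N

for eps in (0.05, 0.1):
    bad_freq = np.mean(np.abs(nu - mu) > eps)   # empirical P[|ν-μ| > ε]
    bound = 2 * np.exp(-2 * eps**2 * N)         # Hoeffding bound
    print(f"ε={eps}: empirical P[Bad] = {bad_freq:.2e} ≤ bound {bound:.2e}")
```

The empirical P[Bad] lands well below the bound: Hoeffding is loose, but it holds for any μ, which is what lets us reason about the bin without knowing it.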