By Johan A. K. Suykens

Advances in learning theory: methods, models, and applications

**Sample text**

This fact will play an important role in constructing new function estimation methods.

4 Distribution independent bounds for the rate of convergence of learning processes

Consider sets of functions which possess a finite VC-dimension h. We then distinguish between the following two cases:

1. The set of loss functions Q(z, α), α ∈ Λ, is a set of totally bounded functions.
2. The set of loss functions Q(z, α), α ∈ Λ, is not necessarily a set of totally bounded functions.
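For the first case (totally bounded losses, A ≤ Q(z, α) ≤ B), a commonly quoted form of the distribution-independent bound states that, with probability at least 1 − η, the risk is at most the empirical risk plus ((B − A)/2)·√E, where E = 4·(h·(ln(2n/h) + 1) − ln(η/4))/n. The excerpt breaks off before stating any bound, so this form, the symbols η, n, A, B, and the function names below are all assumptions made for illustration rather than the chapter's own statement. A minimal sketch:

```python
import math


def vc_capacity_term(h, n, eta):
    """Capacity term E = 4 * (h * (ln(2n/h) + 1) - ln(eta/4)) / n.

    h   : VC-dimension of the function set (assumed finite and positive)
    n   : number of training examples
    eta : allowed failure probability, 0 < eta < 1
    The formula is an assumed, commonly quoted form, not taken from the excerpt.
    """
    if h <= 0 or n <= 0:
        raise ValueError("h and n must be positive")
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    return 4.0 * (h * (math.log(2.0 * n / h) + 1.0) - math.log(eta / 4.0)) / n


def bounded_loss_risk_bound(emp_risk, h, n, eta, a=0.0, b=1.0):
    """Upper bound on the risk for losses taking values in [a, b]."""
    if b <= a:
        raise ValueError("b must exceed a")
    return emp_risk + 0.5 * (b - a) * math.sqrt(vc_capacity_term(h, n, eta))


if __name__ == "__main__":
    # Hypothetical numbers: the bound tightens as the sample size grows.
    for n in (1_000, 10_000, 100_000):
        print(n, bounded_loss_risk_bound(emp_risk=0.05, h=10, n=n, eta=0.05))
```

The bound depends on the data only through the empirical risk and on the function set only through h, which is what makes it distribution independent.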

Note that in the last inequality we replaced […] by its definition and used that √C_K < ([…] + γ). […] where c₀ = […], c₁ = ln(4[…]), and c₂ = […], the t-th power of […]. Note that one could fix, for example, t = 1. […] to obtain the equation

φ(v) = v^(t+2) − (c₁/c₀) v^t − c₂/c₀ = 0

and note that this equation has only one positive zero by Lemma 7. Let v*(m, δ) be this solution. Then, also by Lemma 7, […] and we can conclude by stating the following result.

Theorem 3. Given m > 1 and 0 < δ < 1, for all γ > 0, the expression […] bounds the sample error with confidence at least 1 − δ.
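The solution v* invoked above is defined only implicitly, as the single positive zero of an equation. As a hedged illustration, assume the equation has the shape x^s − a·x^q − b = 0 with a, b > 0 and s > q > 0. That shape and the coefficients used here are assumptions, not the chapter's constants. Such a function is negative at 0 and changes sign exactly once on the positive axis, so plain bisection locates the zero. The function name `positive_zero` is hypothetical.

```python
def positive_zero(s, q, a, b, tol=1e-12):
    """Approximate the unique positive zero of f(x) = x**s - a*x**q - b.

    Assumes s > q > 0 and a, b > 0 (an assumed equation shape, chosen for
    illustration). Under these conditions f(0) = -b < 0 and f crosses zero
    exactly once for x > 0.
    """
    if not (s > q > 0 and a > 0 and b > 0):
        raise ValueError("require s > q > 0 and a, b > 0")

    def f(x):
        return x ** s - a * x ** q - b

    # Grow the upper end until f is strictly positive there.
    lo, hi = 0.0, 1.0
    while f(hi) <= 0.0:
        hi *= 2.0

    # Bisection: keep f(lo) <= 0 < f(hi) and halve the bracket each step.
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical coefficients: x**3 - 3*x - 2 = (x - 2) * (x + 1)**2.
    print(positive_zero(s=3, q=1, a=3.0, b=2.0))
```

Bisection is used here because it needs only the sign change, not derivatives or a good starting point.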

Best Choices for Regularization Parameters in Learning Theory

□

Lemma 5. For all γ, ε > 0, […] < 2ε […].

PROOF. From Lemmas 3 and 4 it follows that, with probability at least […], for every t ∈ X, […] < 2ε. Note that, since max{M_ρ, ‖f_ρ‖_∞ + √C_K γ} < M + γ, the confidence above is at least […]. Applying this to t = x₁, …, x_m and writing the m resulting inequalities in matrix form we obtain that, with confidence at least the one in the statement, […] < 2ε. □

Lemma 6. For all γ, ε > 0, […]

PROOF.