Third lecture task
Use whichever platform you prefer to work on – we will give references and hints for R and for Python.
Do one of the following tasks:
Task
What is the most economical way you can store the entire distance matrix for a particular dataset?
Assume no storage overhead and 16-bit numbers: how large a dataset can you comfortably handle using a 4G memory allocation?
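As a hint for the arithmetic, here is a minimal Python sketch that compares two storage layouts. It rests on assumptions not stated in the task: that "4G" means 4 GiB (2^32 bytes), and that the distance is symmetric with a zero diagonal, so only the n(n-1)/2 entries above the diagonal need to be kept. The function name `max_points` is made up for this illustration.

```python
import math

def max_points(budget_bytes: int, bytes_per_entry: int = 2, condensed: bool = True) -> int:
    """Largest n whose distance matrix fits into budget_bytes (no overhead)."""
    entries = budget_bytes // bytes_per_entry
    if condensed:
        # Strict upper triangle only: n*(n-1)/2 entries.
        # Largest n with n*(n-1)/2 <= entries, solved with integer arithmetic.
        return (1 + math.isqrt(1 + 8 * entries)) // 2
    # Full n-by-n matrix: n*n entries.
    return math.isqrt(entries)

budget = 4 * 2**30  # assumption: "4G" is read as 4 GiB

print(max_points(budget, condensed=False))  # 46340
print(max_points(budget, condensed=True))   # 65536
```

These numbers are hard upper limits: they leave no room for the data itself or for any working memory, so a "comfortable" dataset size is smaller.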
Task
One commonly used kernel is the Gaussian or RBF kernel with positive bandwidth parameter \(\sigma\).
$$
K(x,y) = \exp\left[-\frac{\|x-y\|^2}{2\sigma^2}\right]
$$
- Show that this kernel is a kernel in the sense of the lecture slides.
- Is this kernel a metric? Prove it or provide a counterexample.
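Before attempting the proofs, it may help to evaluate the kernel on a few points. The sketch below is a plain Python implementation of the formula above, using only the standard library. The function name `rbf_kernel` and the sample points are chosen for illustration. Numerical output can suggest what to prove but does not replace the argument.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian/RBF kernel exp(-||x - y||^2 / (2 sigma^2)) for equal-length sequences."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Squared Euclidean distance between x and y.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma**2))

# Illustrative sample points; ||x - y||^2 = 2 here.
x, y = (0.0, 0.0), (1.0, 1.0)

print(rbf_kernel(x, x))  # value at identical points
print(rbf_kernel(x, y))  # value at distinct points
print(rbf_kernel(y, x))  # same pair with arguments swapped
```

Compare the printed values with what the axioms of a metric would require for the same pairs of points.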