Visual Knowledge Distillation

Low-dimensional embeddings need decision boundaries.

What can you do to visualize a probabilistic classifier? Well, if the data is in 2D, you'll probably visualize its decision boundary directly. But what if the data lives in a high-dimensional space?

Dimensionality reduction on the data points alone is the wrong answer! Why? Because for a classification task, low-dimensional embeddings are meaningless without the corresponding decision boundary. In fact, if one sees only the data embeddings, the final classification can be interpreted arbitrarily.

What should we do, instead?

The solution is to visualize both the data and the classifier in a lower-dimensional space. This can be done by jointly performing dimensionality reduction and knowledge distillation, as in DarkSight.
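To make the idea concrete, here is a minimal sketch of the joint objective, under simplifying assumptions: the "teacher" is represented only by its soft predictions, and the low-dimensional "student" is a plain softmax classifier over learned 2-D embeddings (DarkSight itself uses a richer student model, e.g. naive-Bayes-style, and a different divergence). We fit the embeddings and the student classifier together by gradient descent on the cross-entropy between teacher and student predictions, so each point's 2-D position is chosen jointly with a decision boundary that explains the teacher's output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical teacher: soft predictions for N points over C classes.
# In practice these would come from a trained high-dimensional classifier.
N, C = 60, 3
teacher_logits = 3.0 * rng.normal(size=(N, C))
teacher = np.exp(teacher_logits) / np.exp(teacher_logits).sum(1, keepdims=True)

# Student: 2-D embeddings Y (one per data point) plus a linear softmax
# classifier (W, b) that operates directly in the 2-D space.
Y = 0.1 * rng.normal(size=(N, 2))
W = 0.1 * rng.normal(size=(C, 2))
b = np.zeros(C)

def student_probs(Y, W, b):
    z = Y @ W.T + b
    z -= z.max(1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(1, keepdims=True)

def cross_entropy(teacher, student):
    return -np.mean(np.sum(teacher * np.log(student + 1e-12), axis=1))

# Jointly optimize embeddings AND classifier to match the teacher.
lr = 0.5
for step in range(2000):
    p = student_probs(Y, W, b)
    g = (p - teacher) / N      # gradient of mean cross-entropy w.r.t. logits
    W -= lr * (g.T @ Y)        # update the low-dimensional classifier
    b -= lr * g.sum(0)
    Y -= lr * (g @ W)          # update the embeddings themselves

final_ce = cross_entropy(teacher, student_probs(Y, W, b))
```

After training, `Y` can be scatter-plotted together with the student's linear decision boundaries, so the picture shows both the data layout and the classifier behavior rather than the embeddings alone.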