TensorLearn
Computer Vision Engineering
Module 9 of 11

9. Self-Supervised Learning

1. Learning without Labels

Labeling one million images can cost on the order of $100k, while the internet holds billions of unlabeled images. Self-Supervised Learning (SSL) creates "pseudo-labels" from the data itself, so no human annotation is needed.

2. MAE (Masked Autoencoders)

The "BERT" of Vision.

  1. Take an image.
  2. Hide 75% of the patches.
  3. Ask the model to reconstruct the missing pixels. To paint in a dog's missing tail, the model is forced to learn what a dog looks like.
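The masking step above can be sketched as a toy routine. The 75% ratio and the patch-then-reconstruct setup follow the text; the image size, patch size, and random data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mask(num_patches, mask_ratio=0.75):
    """Return indices of kept and hidden patches (MAE hides ~75%)."""
    num_keep = int(num_patches * (1 - mask_ratio))
    perm = rng.permutation(num_patches)
    return perm[:num_keep], perm[num_keep:]

# A 224x224 image cut into 16x16 patches -> 14 * 14 = 196 patches
# (random values stand in for real pixels in this sketch).
patches = rng.standard_normal((196, 16 * 16 * 3))  # flattened RGB patches
keep_idx, mask_idx = random_mask(len(patches))

visible = patches[keep_idx]   # the encoder only ever sees these 49 patches
targets = patches[mask_idx]   # the decoder must reconstruct these 147

# Training loss (not run here): mean squared error on the hidden pixels,
# i.e. pred = decoder(encoder(visible)); loss = ((pred - targets) ** 2).mean()
```

Because the encoder processes only the visible 25% of patches, pre-training is also much cheaper per image than running a full Vision Transformer.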

3. DINO (self-DIstillation with NO labels)

A Teacher and a Student network view different crops of the same image. The Student tries to output the same features as the Teacher; the Teacher is not trained directly, but updated as a slow-moving average of the Student, so no labels are ever needed.
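A minimal sketch of that loop, using plain vectors in place of real image crops and linear maps in place of real networks (the temperatures, momentum value, and all shapes are illustrative assumptions, not DINO's exact hyperparameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, temp):
    """Temperature-scaled softmax; lower temp = sharper distribution."""
    z = np.exp((x - x.max()) / temp)
    return z / z.sum()

dim = 8
W_student = rng.standard_normal((dim, dim)) * 0.1
W_teacher = W_student.copy()  # teacher starts as a copy of the student

# Two augmented crops of the "same image" (toy vectors in this sketch).
crop_a = rng.standard_normal(dim)
crop_b = rng.standard_normal(dim)

# Teacher sees one crop, student the other; the teacher's output is
# sharpened with a lower temperature to avoid collapse.
p_teacher = softmax(W_teacher @ crop_a, temp=0.04)
p_student = softmax(W_student @ crop_b, temp=0.1)

# Cross-entropy the student minimizes (no gradients flow to the teacher).
loss = -(p_teacher * np.log(p_student + 1e-9)).sum()

# Teacher weights: exponential moving average of the student's weights.
momentum = 0.996
W_teacher = momentum * W_teacher + (1 - momentum) * W_student
```

The asymmetry (different crops, sharper teacher, stop-gradient plus moving average) is what keeps both networks from collapsing to a constant output.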


TensorLearn - AI Engineering for Professionals