NLP Specialist: BERT & Beyond
Module 9 of 11
9. Topic Modeling (BERTopic)
1. Beyond LDA
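Old school: Latent Dirichlet Allocation (LDA) treats each document as a bag of words, so word order and context are ignored. New school: BERTopic starts from document embeddings, so documents with similar meaning can be grouped even when they share few words.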
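A small sketch of what "bag of words" throws away, using only the Python standard library. The helper name `bag_of_words` and the example sentences are made up for illustration; this is not LDA itself, only the input representation it works from.

```python
from collections import Counter


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens, ignoring their order."""
    return Counter(text.lower().split())


# Same words, opposite meaning: the bags are identical.
a = bag_of_words("dog bites man")
b = bag_of_words("man bites dog")
same_bag = a == b

# Similar meaning, different words: the bags share nothing.
c = bag_of_words("the film was great")
d = bag_of_words("an excellent movie")
shared_terms = set(c) & set(d)

print(same_bag)
print(shared_terms)
```

An embedding model is meant to handle both cases better: the first pair can receive different vectors, and the second pair can land close together despite having no words in common.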
2. The Pipeline
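- Embed documents (SBERT): turn each document into a dense vector.
- Reduce dimensions (UMAP): compress the embeddings before clustering.
- Cluster (HDBSCAN): group nearby documents into topics; documents that fit no cluster are marked as outliers.
- Extract keywords (c-TF-IDF): score terms per cluster to describe each topic.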
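The last step of the pipeline can be sketched in plain Python. This is a minimal illustration, not BERTopic's own code: it assumes the cluster labels have already been produced by the earlier steps (here they are hard-coded), uses naive whitespace tokenization, and scores each term with the class-based formula tf(t, c) * log(1 + A / tf(t)), where A is the average number of words per cluster. The function names `c_tf_idf` and `top_keywords` and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def c_tf_idf(docs, labels):
    """Score terms per cluster with a class-based TF-IDF.

    docs   -- list of strings
    labels -- one cluster id per document (hypothetical input standing in
              for the output of the clustering step)
    """
    # Merge each cluster's documents into a single bag of words.
    class_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        class_counts[label].update(doc.lower().split())

    # Frequency of each term across all clusters.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)

    # Average number of words per cluster (the "A" in the formula).
    avg_words = sum(total_counts.values()) / len(class_counts)

    scores = {}
    for label, counts in class_counts.items():
        n_words = sum(counts.values())
        scores[label] = {
            # Normalized in-cluster frequency times the class-based IDF.
            term: (count / n_words) * math.log(1 + avg_words / total_counts[term])
            for term, count in counts.items()
        }
    return scores


def top_keywords(scores, label, k=3):
    """Return the k highest-scoring terms for one cluster."""
    ranked = sorted(scores[label].items(), key=lambda kv: kv[1], reverse=True)
    return [term for term, _ in ranked[:k]]


docs = [
    "the cat chased the mouse",
    "the cat purred",
    "the stock market fell",
    "the market rallied on stock news",
]
labels = [0, 0, 1, 1]  # assumed cluster assignments

scores = c_tf_idf(docs, labels)
print(top_keywords(scores, 0, k=2))
print(top_keywords(scores, 1, k=2))
```

In this toy corpus "cat" outranks "the" in cluster 0 because "the" appears in both clusters, but "the" still scores fairly high. On real data a stopword list or a larger corpus would usually be needed to keep such words out of the topic descriptions.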