NLP Training Part 1
Python Text Analysis, Part 1: Bag-of-Words Representations
How do we convert text into a form that we can operate on computationally? Doing so requires a numerical representation of the text. In this part of the workshop, we study one of the foundational numerical representations of text data: the bag-of-words model. This model characterizes a text corpus by how often each word occurs, ignoring word order. We build bag-of-words models and their variations (e.g., TF-IDF), and use these representations to perform text classification. Chou Hall Spieker Forum (6th Floor). Hosted by D-Lab.
Skill Level: The workshop is geared toward those with a basic familiarity with Python, but people without any familiarity with Python should be able to follow the conceptual presentation of the materials.
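
For a preview of the idea behind the bag-of-words model and TF-IDF described above, here is a minimal sketch that uses only the Python standard library. It is not the workshop's own material: the function names (`tokenize`, `bag_of_words`, `tf_idf`), the whitespace tokenizer, and the three toy documents are all assumptions made for illustration, and the weighting uses the plain textbook form tf * log(N / df), which may differ from the smoothed variants used by libraries such as scikit-learn.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real tokenizers do more.
    return text.lower().split()


def bag_of_words(docs):
    # Build a sorted vocabulary, then count each vocabulary word per document.
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    counts = [Counter(tokenize(doc)) for doc in docs]
    matrix = [[c[word] for word in vocab] for c in counts]
    return vocab, matrix


def tf_idf(matrix):
    # Weight each raw count by log(N / df), where df is the number of
    # documents that contain the word. Words found in every document get 0.
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) for d in df]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in matrix]


# Hypothetical toy corpus, for illustration only.
docs = ["the cat sat", "the dog sat", "the cat chased the dog"]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)

for doc, row in zip(docs, counts):
    print(doc, "->", dict(zip(vocab, row)))
```

Each document becomes a row of word counts over a shared vocabulary, so word order is discarded. In this toy corpus the word "the" appears in every document, so its TF-IDF weight is zero, while "chased" appears in only one document and receives the largest weight per occurrence.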