Presented by

R-Rome is committed to promoting open-source tools, knowledge sharing, and accessible data science education. We are a community dedicated to creating opportunities to learn, connect, and grow.

Hosted By

89 Went

AI

From Dictionaries to LLMs: Text Analysis in R

R-Rome

Zoom

Past Event

About Event

This workshop walks through a complete text analysis pipeline in R, combining traditional methods with newer AI-powered tools.

In the first part, we will cover tokenization with tidytext, stopword removal (including custom stopwords), word frequency visualization with wordclouds and bar plots, dictionary-based sentiment analysis using three lexicons (AFINN, Bing, and NRC), topic modeling with LDA, and bigram analysis with word networks.

In the second part, we will explore how the mall package can be used to perform sentiment analysis with local LLMs (via Ollama), compare this approach with dictionary-based methods, and look at related tasks such as text classification and entity extraction.

Familiarity with R and the tidyverse is assumed.

Dariia Mykhailyshyna is a postdoctoral researcher in Economics at the Kyiv School of Economics, working in political economy, migration, and causal inference. She holds a PhD in Economics from the University of Bologna. She teaches statistics and data science courses and organizes "Workshops for Ukraine", a charity R workshop series in which registration fees support Ukrainian causes. She has over seven years of experience with R and regularly uses it in her applied research.
