Research

Cecilia Aragon

Spring 2021

Emotions and Relationship-Building in Online Fanfiction Communities

Led by HCDE PhD student Sourojit Ghosh, with guidance from Professor Cecilia Aragon

This research group will investigate the role played by shared or conflicting emotions in the process of relationship-building in online communities. We aim to explore that role through extensive qualitative coding of individual fanfiction reviews. This will be the final quarter of an ongoing research project on this topic, using subsets of a large dataset of fanfiction data collected by the Human-Centered Data Science Lab in previous years. Past explorations with this dataset have put forward the theory of distributed mentoring, a phenomenon in which people from all over the world and of all age groups collaboratively give and receive support through an informal yet substantive network of constructive advice. The goal for this DRG will be to finish our qualitative coding via a novel collaborative coding and visualization tool, and to contribute to a research paper to be submitted to CSCW this academic year.

Participants in this DRG will gain hands-on experience with large datasets, learning to qualitatively analyze each data point for its rich content while also looking at it in the larger context of the entire set.

This DRG is at capacity for Spring 2021 and is no longer accepting applications.


Spring 2021

Human-Centered Natural Language Processing and Text Visualization

This research group will apply human-centered techniques to the field of natural language processing (NLP) to study very large text corpora, with an additional focus on text visualization. We’re looking for students with experience in either (a) programming and analysis of large text datasets or (b) machine learning and data science. No NLP experience is required, as we will be reading seminal papers in the field and applying those techniques to a text dataset.

We plan to use a previously collected dataset of over 61.5 billion words (the largest fiction dataset outside of the Google Books corpus), consisting of stories, reviews, and associated metadata from fanfiction sites, as a test dataset for human-centered NLP techniques.

This DRG is at capacity for Spring 2021 and is no longer accepting applications.


Dr. Aragon's Research Group archive