Cecilia Aragon

Winter 2019

Games for Good: Designing a Data Science Ethics Game

Co-directed by Cecilia Aragon and data scientist Bernease Herman

This research group will explore the use of analog and digital games to introduce users to ethical and human-centered issues in data science and computing. Students will take a hands-on role: exploring examples of educational games, brainstorming game ideas, and creating prototypes using paper and/or a computer game engine such as Unity3D. Themes we will consider include data privacy, trust in algorithmic systems, predictive policing, and fairness.

By the end of ten weeks, we aim to produce a working prototype of the game that has gone through several rounds of playtesting.

We are looking for a relatively small group of people, each interested in earning between 2 and 5 credit hours (graded credit/no credit) in HCDE 496/596. Interested undergraduate and graduate students may apply. Graphic design or programming experience is recommended, but not required for motivated students. To apply, fill out the application form explaining your interest in the project, and attach a resume and an unofficial transcript.

Please send any questions to both Bernease Herman and Cecilia Aragon.

The group will meet on Thursdays from 3-4:20 p.m. in Sieg 427.

Winter 2019

Cultural differences in data privacy perspectives on social media

Note: This DRG is at capacity for Winter 2018

The Cambridge Analytica scandal has triggered a discussion about data privacy in social media. As news of the scandal has traveled around the world, a worldwide public discussion about data privacy has emerged. Motivated by this context, we aim to answer this research question: Does the public online debate reveal different perspectives on data privacy across countries and cultures? To do so, we have collected Twitter activity associated with data privacy and the Cambridge Analytica scandal in both English and Spanish. Our work will result in insights about the different aspects of data privacy that are emphasized by people in different countries; a characterization of how geography, time, and bots influence the worldwide online conversation on data privacy; and lessons learned about how best to apply human-centered data science techniques to support cross-cultural comparisons of social media data.

We have collected a large-scale Twitter dataset around this issue and are in the process of analyzing it through both qualitative coding and automated analysis. The research group takes a mixed-methods approach to understanding the data, and we are currently focused on the qualitative coding.

The group is open to both graduate and undergraduate students. Qualitative research experience in grounded theory and qualitative coding is desirable but not required. Bilingualism is a plus, particularly in Spanish. We strongly encourage interested undergrads to apply, even if you have little or no experience with this type of research. This is an excellent opportunity to be introduced to the methods of human-centered data science, as well as a chance to gain valuable insight into the way that research is carried out.


Winter 2019

Distributed mentoring and fanfiction data analytics

Co-directed by John Frens, PhD student; Cecilia Aragon, Professor

Note: This DRG is at capacity for Winter 2018

Are you interested in applying human-centered data science to study how people learn from online fandom?

This ongoing research project studies informal learning in online fanfiction communities. We are looking for students with experience in either (a) programming and analysis of large text datasets or (b) qualitative research in online fandoms, to join an existing research group. We have published multiple papers on our research and are in the process of submitting others.

We have found quantitative and qualitative evidence that distributed mentoring plays a positive role in fanfiction authors’ development as writers, and this quarter’s project continues our efforts with a specific focus on visual analytics of a large dataset. We’ve collected a vast, rich text dataset of over 61.5 billion words (the largest fiction dataset outside of the Google Books corpus) of stories, reviews, and associated metadata from fanfiction sites and have applied both qualitative (ethnography) and quantitative techniques (machine learning, statistical analysis, data visualization) to investigate the relationship between distributed mentoring and writing quality (e.g., grammar, reading level).


Dr. Aragon's Research Group archive