2016 Seminar Series
Why Humans Should Care About Data Science
Extraordinary advances in our ability to acquire and generate data are transforming the fundamental nature of discovery across domains. Much of the research in data science has focused on automated methods of analyzing data such as machine learning and new database techniques. Less attention has been directed to the human aspects of data science, including how to build interactive tools that maximize creativity and human insight, and the ethics and societal factors involved in the next generation of data science discoveries. In this talk, Dr. Aragon will argue for the importance of a human-centered approach to data science as necessary for the success of 21st-century discovery. Further, she contends that we need to go beyond well-designed user interfaces for data science software tools to consider the entire ecosystem of software development and use: we need to study people interacting with technology as socio-technical systems, where both technical and social approaches are interwoven. Aragon will discuss promising research in this area, introduce the new Master's Degree in Data Science at UW, and speculate upon future directions for data science.
2015 Seminar Series
Being Human in a Big Data World: Human-Centered Data Science
Small-scale, qualitative approaches to data collection and analysis offer researchers the opportunity to obtain rich, deep insights about specific phenomena, often in a bounded or limited context. Such studies often face challenges related to generalization, extension, verification, and validation. On the other hand, large-scale, quantitative approaches to data collection and analysis offer researchers broad assemblages of data, but such data is often much more shallow, missing the rich detail associated with deep study.
But what happens as qualitative data sets grow ever larger? With the ease of collecting qualitative data such as text, photos, and videos, such data sets are becoming increasingly difficult to analyze with the same level of detail and depth. How do we preserve the richness associated with traditional qualitative techniques in a world of such Big Data? How can we be sure not to lose the compelling and inspiring stories of individuals in the sea of aggregated data at scale?
Each perspective has clear advantages: one can choose methods and techniques that facilitate deep but narrow analysis, or one can be broad but shallow. In this talk, Cecilia Aragon will explore some of the particular problems and challenges sociotechnical researchers face with regard to this small-data versus big-data tension, and seek ways of overcoming these problems (or at least identifying potential solutions to them).
2014 Seminar Series
The Hearts and Minds of Data Science
Thanks in part to the recent popularity of the buzzword "big data," it is now generally understood that many important scientific breakthroughs are made by interdisciplinary collaborations of scientists working in geographically distributed locations, producing and analyzing vast and complex data sets. The extraordinary advances in our ability to acquire and generate data in physical, biological, and social sciences are transforming the fundamental nature of scientific discovery across domains. Much of the research in this area, which has become known as data science, has focused on automated methods of analyzing data such as machine learning and new database techniques. Less attention has been directed to the human aspects of data science, including how to build interactive tools that maximize scientific creativity and human insight, and how to train, support, motivate, and retain the individuals with the necessary skills to produce the next generation of scientific discoveries.
In this talk, Aragon will argue for the importance of a human-centered approach to data science as necessary for the success of 21st-century scientific discovery. Further, she contends that we need to go beyond well-designed user interfaces for data science software tools to consider the entire ecosystem of software development and use: we need to study scientific collaborations interacting with technology as socio-technical systems, where both computer science and social science approaches are interwoven. Aragon will discuss promising research in this area, describe opportunities to participate in the recently announced $37.8M Moore/Sloan Data Science Environment at UW, and speculate upon future directions for data science.
The Role of Emotion in Collaborative Work and Games
Professor Aragon will discuss various projects from the Scientific Collaboration and Creativity Lab involving both automated detection and qualitative analysis of the role of emotion in spontaneous text communication such as chat, forums, social media sites, and collaborative games. Projects include:
Aloe, an open source tool developed to train and test machine learning classifiers for automatically labeling chat messages with different emotion or affect categories.
Agave, open source software that enables research groups to explore large tweet data sets through interactive filtering, sentiment analysis, visualization of changes over time, and discussion.
MAX5, a collaborative bioinformatics learning game that focuses on sociotechnical challenges in scientific collaborations and uses emotional and affective experiences to engage high school students with biology and computer science concepts.
Lab members involved in these projects will present demos and walk-throughs.
Cecilia Aragon is an Associate Professor in the Department of Human Centered Design & Engineering and a member of the eScience Institute at the University of Washington. She directs the Scientific Collaboration & Creativity Laboratory. Previously, she was a computer scientist at Lawrence Berkeley National Laboratory for six years, after earning her PhD in Computer Science from UC Berkeley in 2004. She earned her BS in Mathematics from the California Institute of Technology. She and her students develop collaborative visual analytics tools to facilitate data science, and study current scientific practice around large and complex data sets. Her research interests span human-computer interaction, computer-supported cooperative work, visual analytics, information visualization, scientific collaborations, usability and sustainability, collaborative games, distributed creativity, distributed affect, social media, and new methods of computer-mediated communication. In 2009, she received the Presidential Early Career Award for Scientists and Engineers for her work in collaborative data-intensive science.