latest publications

Paper Plain: Making Medical Research Papers Approachable to Healthcare Consumers with Natural Language Processing

Tal August, Lucy Lu Wang, Jonathan Bragg, Marti A. Hearst, Andrew Head, and Kyle Lo
TOCHI 2023

When seeking information not covered in patient-friendly documents, healthcare consumers may turn to the research literature. Reading medical papers, however, can be a challenging experience. To improve access to medical papers, we explore four features enabled by natural language processing: definitions of unfamiliar terms, in-situ plain language section summaries, a collection of key questions that guides readers to answering passages, and plain language summaries of those passages. We embody these features into a prototype system, Paper Plain. We evaluate Paper Plain, finding that participants who used the prototype system had an easier time reading research papers without a loss in paper comprehension compared to those who used a typical PDF reader. Altogether, the study results suggest that guiding readers to relevant passages and providing plain language summaries alongside the original paper content can make reading medical papers easier and give readers more confidence to approach these papers.

Scim: Intelligent Skimming Support for Scientific Papers

Raymond Fok, Hita Kambhamettu, Luca Soldaini, Jonathan Bragg, Kyle Lo, Marti Hearst, Andrew Head, and Daniel S. Weld
IUI 2023

Scholars need to keep up with an exponentially increasing flood of scientific papers. To aid this challenge, we introduce Scim, a novel intelligent interface that helps experienced researchers skim – or rapidly review – a paper to attain a cursory understanding of its contents. Scim supports the skimming process by highlighting salient paper contents in order to direct a reader’s attention. The system’s highlights are faceted by content type, evenly distributed across a paper, and have a density configurable by readers at both the global and local level. We evaluate Scim with both an in-lab usability study and a longitudinal diary study, revealing how its highlights facilitate the more efficient construction of a conceptualization of a paper. We conclude by discussing design considerations and tensions for the design of future intelligent skimming tools.

Representation, Self-Determination, and Refusal: Queer People’s Experiences with Targeted Advertising

Princess Sampson, Ro Encarnación, and Danaë Metaxa
FAccT 2023

Targeted online advertising systems increasingly draw scrutiny for the surveillance underpinning their collection of people's private data, and subsequent automated categorization and inference. The experiences of LGBTQ+ people, whose identities call into question dominant assumptions about who is seen as “normal,” and deserving of privacy, autonomy, and the right to self-determination, are a fruitful site for exploring the impacts of ad targeting. We conducted semi-structured interviews with LGBTQ+ individuals (N=18) to understand their experiences with online advertising, their perceptions of ad targeting, and the interplay of these systems with their queerness and other identities. Our results reflect participants’ overall negative experiences with online ad content—they described it as stereotypical and tokenizing in its lack of diversity and nuance. But their desires for better ad content also clashed with their more fundamental distrust and rejection of the non-consensual and extractive nature of ad targeting. They voiced privacy concerns about continuous data aggregation and behavior tracking, a desire for greater control over their data and attention, and even the right to opt-out entirely. Drawing on scholarship from queer and feminist theory, we explore targeted ads’ homonormativity in their failure to represent multiply-marginalized queer people, the harms of automated inference and categorization to identity formation and self-determination, and the theory of refusal underlying participants’ queer visions for a better online experience.

Concept-Labeled Examples for Library Comparison

Litao Yan, Miryung Kim, Björn Hartmann, Tianyi Zhang, and Elena L. Glassman
UIST 2022

Programmers often rely on online resources—such as code examples, documentation, blogs, and Q&A forums—to compare similar libraries and select the one most suitable for their own tasks and contexts. However, this comparison task is often done in an ad-hoc manner, which may result in suboptimal choices. Inspired by Analogical Learning and Variation Theory, we hypothesize that rendering many concept-annotated code examples from different libraries side-by-side can help programmers (1) develop a more comprehensive understanding of the libraries' similarities and distinctions and (2) make more robust, appropriate library selections. We designed a novel interactive interface, ParaLib, and used it as a technical probe to explore to what extent many side-by-side concept-annotated examples can facilitate the library comparison and selection process. A within-subjects user study with 20 programmers shows that, when using ParaLib, participants made more consistent, suitable library selections and provided more comprehensive summaries of libraries' similarities and differences.

Visualizing Examples of Deep Neural Networks at Scale

Litao Yan, Tianyi Zhang, and Elena L. Glassman
CHI 2021

Many programmers want to use deep learning due to its superior accuracy in many challenging domains. Yet our formative study with ten programmers indicated that, when constructing their own deep neural networks (DNNs), they often had a difficult time choosing appropriate model structures and hyperparameter values. This paper presents ExampleNet—a novel interactive visualization system for exploring common and uncommon design choices in a large collection of open-source DNN projects. ExampleNet provides a holistic view of the distribution over model structures and hyperparameter settings in the corpus of DNNs, so users can easily filter the corpus down to projects tackling similar tasks and compare and contrast design choices made by others. We evaluated ExampleNet in a within-subjects study with sixteen participants. Compared with the control condition (i.e., online search), participants using ExampleNet were able to inspect more online examples, make more data-driven design decisions, and make fewer design mistakes.

Augmenting Scientific Papers with Just-in-Time, Position-Sensitive Definitions of Terms and Symbols

Andrew Head, Kyle Lo, Dongyeop Kang, Raymond Fok, Sam Skjonsberg, Daniel S. Weld, and Marti A. Hearst
CHI 2021

Despite the central importance of research papers to scientific progress, they can be difficult to read. Comprehension is often stymied when the information needed to understand a passage resides somewhere else—in another section, or in another paper. In this work, we envision how interfaces can bring definitions of technical terms and symbols to readers when and where they need them most. We introduce ScholarPhi, an augmented reading interface with four novel features: (1) tooltips that surface position-sensitive definitions from elsewhere in a paper, (2) a filter over the paper that “declutters” it to reveal how the term or symbol is used across the paper, (3) automatic equation diagrams that expose multiple definitions in parallel, and (4) an automatically generated glossary of important terms and symbols. A usability study showed that the tool helps researchers of all experience levels read papers. Furthermore, researchers were eager to have ScholarPhi’s definitions available to support their everyday reading.

An Image of Society: Gender and Racial Representation and Impact in Image Search Results for Occupations

Danaë Metaxa, Michelle A. Gan, Su Goh, Jeff Hancock, and James A. Landay
CSCW 2021

Algorithmically-mediated content is both a product and producer of dominant social narratives, and it has the potential to impact users’ beliefs and behaviors. We present two studies on the content and impact of gender and racial representation in image search results for common occupations. In Study 1, we compare 2020 workforce gender and racial composition to that reflected in image search. We find evidence of underrepresentation on both dimensions: women are underrepresented in search at a rate of 42% women for a field with 50% women; people of color are underrepresented with 16% in search compared to an occupation with 22% people of color (the latter being proportional to the U.S. workforce). We also compare our gender representation data with that collected in 2015 by Kay et al., finding little improvement in the last half-decade. In Study 2, we study people’s impressions of occupations and sense of belonging in a given field when shown search results with different proportions of women and people of color. We find that both axes of representation as well as people’s own racial and gender identities impact their experience of image search results. We conclude by emphasizing the need for designers and auditors of algorithms to consider the disparate impacts of algorithmic content on users of marginalized identities.

Composing Flexibly-Organized Step-by-Step Tutorials from Linked Source Code, Snippets, and Outputs

Andrew Head, Jason Jiang, James Smith, Marti A. Hearst, and Björn Hartmann
CHI 2020

Programming tutorials are a pervasive, versatile medium for teaching programming. In this paper, we report on the content and structure of programming tutorials, the pain points authors experience in writing them, and a design for a tool to help improve this process. An interview study with 12 experienced tutorial authors found that they construct documents by interleaving code snippets with text and illustrative outputs. It also revealed that authors must often keep the related artifacts of source programs, snippets, and outputs consistent as a program evolves. A content analysis of 200 frequently-referenced tutorials on the web also found that most tutorials contain related artifacts—duplicate code and outputs generated from snippets—that an author would need to keep consistent with each other. To address these needs, we designed a tool called Torii with novel authoring capabilities. An in-lab study showed that tutorial authors can successfully use the tool for the unique affordances identified, and provides guidance for designing future tools for tutorial authoring.