Reflections from CHIIR 2026 by Gavindya Jayawardena

Associate Professor Jacek Gwizdka, Dan Zhang, and I (Gavindya Jayawardena) attended the 2026 ACM Conference on Human Information Interaction and Retrieval (CHIIR), held March 22–26 at the University of Washington in Seattle, Washington. With cherry blossoms in full bloom, it was a beautiful spring backdrop for the conference.

Cherry blossoms in full bloom at the Quad, University of Washington.
On March 21, 2026, 29 Yoshino cherry trees lit up the UW Quad in pink and purple for one night to celebrate the bloom.

The IX Lab had a perfect record this year, with both of our submitted papers accepted. The conference opened with a Doctoral Consortium and Tutorials before moving into the main program. UT Austin had a strong showing overall, taking home two awards. The Best Short Paper award went to Yujin Choi and Professor Soo Young Rieh for Interest-Driven Search in AI-Mediated Information Environments: An Audio Diary Study. The conference also presented its first-ever Test-of-Time Awards, and one of the two went to Soo Young Rieh’s Assessing Learning Outcomes in Web Search: A Comparison of Tasks and Query Strategies, recognized for its lasting impact on the field.

Our Work

IX Lab had two papers in the opening session on cognitive factors and measurements. Our perspective paper, Attention! Rethinking What We Measure in CHIIR Studies, presented by Dan, took a closer look at how attention, a central but surprisingly underspecified construct in Interactive Information Retrieval, is defined and measured across the CHIIR literature. While interest in attention has grown, studies rarely define it explicitly, and tend to rely on behavioral proxies like eye fixations, dwell time, or interaction logs without a guiding conceptual framework, which makes it difficult to compare findings across studies. We systematically reviewed CHIIR publications from 2016 to 2025, ultimately identifying 19 papers for in-depth analysis. The review revealed considerable variation in how attention is interpreted and operationalized, pointing to a broader tension between cognitive theory and information interaction research. Our hope is that this work encourages researchers to be more deliberate in how they define and measure attention, ultimately strengthening rigor and reproducibility in IIR research.

Dan Zhang presenting the perspective paper, “Attention! Rethinking What We Measure in CHIIR Studies.”

I (Gavindya Jayawardena) presented our full paper, Effects of Working Memory Capacity and Search Task Complexity on Cognitive Load, which used a novel pupillometry-based algorithm to track cognitive load in near-real-time during search. We looked at how working memory capacity and task type (fact-checking vs. decision-making) shaped both cognitive load and emotional responses throughout the search process. Participants were split into high and low working memory groups based on N-back performance, and cognitive load was estimated from pupil diameter signals across three task phases. Decision-making tasks induced higher cognitive load overall, with low working memory individuals showing the highest cognitive load, while no such group differences emerged during fact-checking. Cognitive load tended to be highest at the start of tasks and declined over time, suggesting some degree of cognitive adaptation. Interestingly, the emotional profiles of the two groups diverged: high working memory individuals showed positive associations between cognitive load and engagement, joy, and valence, whereas low working memory individuals were more likely to experience confusion and surprise, particularly during decision-making. These findings show how working memory capacity shapes not just how hard people find a task, but how they feel while doing it.

Gavindya Jayawardena presenting the full paper, “Effects of Working Memory Capacity and Search Task Complexity on Cognitive Load.”

Papers That Stood Out

Across the sessions, a few papers were particularly memorable. One used eye tracking to characterize personality traits through a multimodal time-series model that integrated gaze data alongside gaze missingness (periods when the user’s gaze is not captured by the tracker). Rather than treating missingness as noise to be cleaned away, the authors used it as a meaningful signal to infer covert attention and characterize personality without any preprocessing. It was a smart reframing of what is usually considered a data quality problem. This work was conducted by Jiaman He and Marta Micheli, alongside Damiano Spina, Dana McKay, and Johanne R. Trippas from RMIT University, and Noriko Kando from the National Institute of Informatics (NII) in Tokyo.

On the cognitive side, Marcel Gohsen, Nicola Libera, Jan Ehlers, and Benno Stein from Bauhaus-Universität Weimar, together with Johannes Kiesel from GESIS – Leibniz Institute for the Social Sciences, asked whether cognitive load impairs people’s ability to identify voice-based deepfakes. Cognitive load was manipulated with a 1-back task and a secondary video condition. The 1-back task didn’t produce enough interference to affect performance, but the video condition actually increased detection accuracy, a counterintuitive result worth following up on.

In the search-as-learning space, Kelsey Urgo from the University of San Francisco, Yuan Li from the University of Alabama, and Jaime Arguello and Robert Capra from the University of North Carolina at Chapel Hill tested whether self-regulated learning frameworks, specifically goal-setting, transfer to generative AI systems. Participants with access to a sub-goal manager scored higher on post-task knowledge assessments and reported higher perceived task difficulty, but wrote fewer characters in their notes. Note quality wasn’t examined, which leaves an open question about what those shorter notes actually reflect.

A paper by Yash Prakash, Akshay Kolgar Nayak, Nithiya Venkatraman, Sampath Jayarathna, and Vikas Ashok from Old Dominion University, and Hae-Na Lee from Michigan State University, examined online shopping for blind users and found that navigation entropy increased on unfamiliar websites, as participants built mental maps of sites over time. It was a nice framing of familiarity as an accessibility variable rather than just a usability one.
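For readers who have not met the term, here is a rough illustration of what an entropy measure over navigation can look like. This is my own minimal sketch, not the authors' method: it assumes navigation entropy is computed as Shannon entropy over the page elements a user visits, and the function name `navigation_entropy` and the example element labels are hypothetical.

```python
import math
from collections import Counter


def navigation_entropy(visits):
    """Shannon entropy (in bits) of the distribution of visited elements.

    Assumption for illustration only: the paper's exact definition of
    navigation entropy is not given in this post.
    """
    if not visits:
        return 0.0
    counts = Counter(visits)
    total = len(visits)
    # Sum p * log2(1/p) over each distinct element's visit share p.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical traces: a focused path versus a scattered one.
focused = ["results", "results", "results", "product"]
scattered = ["menu", "footer", "results", "product"]

print(navigation_entropy(focused))    # about 0.81 bits
print(navigation_entropy(scattered))  # 2.0 bits: four elements, equally visited
```

Under this assumed definition, a higher value means visits are spread more evenly across elements, which is one way exploratory navigation on an unfamiliar site could show up in the data.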

The Industry Panel

The panel brought together practitioners from Microsoft, Google, Amazon, AMD, and LinkedIn for a candid conversation about AI. A recurring concern was the long-term consequences of training future models on AI-generated data. Panelists also pushed back on the idea that AI can address structural problems like literacy, and noted that there will always be communities without meaningful access to these tools. One exchange that stuck with me was about AI confidence scores: LLMs tend to express uniformly high confidence regardless of accuracy, and it’s genuinely unclear whether surfacing those scores would help or confuse the average user.

The industry panel at CHIIR 2026.

Second Keynote: Xin Luna Dong

In the second keynote, Xin Luna Dong argued that personal assistants should observe, understand, and provide. She walked through a series of systems built on egocentric video captured through Meta glasses. The core challenge is that a world-view camera captures the entire environment, not just what a user is actually attending to, so the system uses detected triggers like books and papers to infer relevance, enabling memory queries like “Hey Meta, where did I park my car?” She described Pensieve Memory, which trains a model to anticipate what questions users are likely to ask based on generated image descriptions, and aggregates contexts over time to answer questions like “What should I buy from the grocery store?” Another application, AssoMem, treats dense memory (what we see and hear) as associatively anchored, while VisualLens uses photos and zooming behavior as proxies for user interests to drive recommendations. The Q&A surfaced hard questions about privacy: what about bystanders who are recorded without consent, or memories a user would rather not have stored? These aren’t edge cases; they’re central design problems the field hasn’t resolved.

Closing Thoughts

The final paper of the conference, a perspective paper by Chirag Shah, one of the general chairs, provided a fitting conclusion. He opened by asking the audience about proprioception, using it as a metaphor for meaningful friction in human-AI interaction. His central question was how we can empower humans to do more with AI. He took the position that AI is an inevitable part of the future, and that the more productive response is to figure out how to engage with it well.

The short paper poster The Effects of Thinking Aloud on Participants during Search and Sensemaking got me thinking about eye-tracking methodology. If individuals with lower working memory capacity use thinking aloud as a way to cognitively offload to an auditory medium, there may be value in systematically integrating verbal protocols into eye-tracking paradigms, potentially capturing aspects of cognition that neither method reveals on its own. Overall, CHIIR 2026 was a stimulating few days, with a good mix of methodological work, applied systems, and bigger-picture questions about where the field is heading.

– Gavindya Jayawardena
