Phil 5.24.2023

Users choose to engage with more partisan news than they are exposed to on Google Search

  • If popular online platforms systematically expose their users to partisan and unreliable news, they could potentially contribute to societal issues such as rising political polarization [1,2]. This concern is central to the ‘echo chamber’ [3,4,5] and ‘filter bubble’ [6,7] debates, which critique the roles that user choice and algorithmic curation play in guiding users to different online information sources [8,9,10]. These roles can be measured as exposure, defined as the URLs shown to users by online platforms, and engagement, defined as the URLs selected by users. However, owing to the challenges of obtaining ecologically valid exposure data—what real users were shown during their typical platform use—research in this vein typically relies on engagement data [4,8,11,12,13,14,15,16] or estimates of hypothetical exposure [17,18,19,20,21,22,23]. Studies involving ecological exposure have therefore been rare, and largely limited to social media platforms [7,24], leaving open questions about web search engines. To address these gaps, we conducted a two-wave study pairing surveys with ecologically valid measures of both exposure and engagement on Google Search during the 2018 and 2020 US elections. In both waves, we found more identity-congruent and unreliable news sources in participants’ engagement choices, both within Google Search and overall, than they were exposed to in their Google Search results. These results indicate that exposure to and engagement with partisan or unreliable news on Google Search are driven not primarily by algorithmic curation but by users’ own choices.


  • The meeting went well yesterday, I think? Need to write up some thoughts on Stable Diffusion and general meeting notes.
  • Travel reimbursement – done
  • Slides!
  • Q5 Report – good progress
  • JSC kickoff – done

GPT Agents

  • 4:00 meeting
    • Good discussion. I was convinced to write a first draft of a paper on verifiable context prompting: source indexes are listed after each end punctuation in the context so that they can be searched for in the response and checked. Hallucinations should cite non-existent indices. Verify against some book part the GPT hasn’t read, and see how it does with the “how do I find a boyfriend/girlfriend” uber-prompt. Need to write a small experiment class and put it in a new experiment folder.
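    • A minimal sketch of the indexing and checking steps for the experiment class. The index format (`[n]` after end punctuation) and function names are my own placeholders, not settled design:

      ```python
      import re

      def index_context(sentences):
          """Annotate each context sentence with a source index after its
          end punctuation, e.g. "The sky is blue. [3]"."""
          return " ".join(f"{s} [{i}]" for i, s in enumerate(sentences))

      def check_citations(response, num_sources):
          """Pull cited indices out of a model response and split them into
          valid ones and non-existent ones (a hallucination signal)."""
          cited = {int(m) for m in re.findall(r"\[(\d+)\]", response)}
          valid = {i for i in cited if i < num_sources}
          hallucinated = cited - valid
          return valid, hallucinated
      ```

      Usage: build the prompt context with `index_context`, then run `check_citations` on the model output; any index in the hallucinated set points at a claim with no source in the context.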
  • Gene Set Summarization using Large Language Models
    • Molecular biologists frequently interpret gene lists derived from high-throughput experiments and computational analysis. This is typically done as a statistical enrichment analysis that measures the over- or under-representation of biological function terms associated with genes or their properties, based on curated assertions from a knowledge base (KB) such as the Gene Ontology (GO). Interpreting gene lists can also be framed as a textual summarization task, enabling the use of Large Language Models (LLMs), potentially utilizing scientific texts directly and avoiding reliance on a KB.
      We developed SPINDOCTOR (Structured Prompt Interpolation of Natural Language Descriptions of Controlled Terms for Ontology Reporting), a method that uses GPT models to perform gene set function summarization as a complement to standard enrichment analysis. This method can use different sources of gene functional information: (1) structured text derived from curated ontological KB annotations, (2) ontology-free narrative gene summaries, or (3) direct model retrieval.
      We demonstrate that these methods are able to generate plausible and biologically valid summary GO term lists for gene sets. However, GPT-based approaches are unable to deliver reliable scores or p-values and often return terms that are not statistically significant. Crucially, these methods were rarely able to recapitulate the most precise and informative term from standard enrichment, likely due to an inability to generalize and reason using an ontology. Results are highly nondeterministic, with minor variations in prompt resulting in radically different term lists. Our results show that at this point, LLM-based methods are unsuitable as a replacement for standard term enrichment analysis and that manual curation of ontological assertions remains necessary.
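      As a note to self on what “structured prompt interpolation” might look like mechanically, here is a toy sketch of source (2), interpolating ontology-free narrative gene summaries into a prompt. The template and gene summaries are illustrative placeholders, not the published SPINDOCTOR template:

      ```python
      def build_gene_set_prompt(gene_summaries):
          """Interpolate per-gene narrative summaries into a structured
          prompt asking an LLM for the gene set's shared function.
          gene_summaries: dict mapping gene symbol -> short description."""
          lines = [f"- {gene}: {summary}" for gene, summary in gene_summaries.items()]
          return (
              "For the following genes and their functional summaries,\n"
              "list the GO terms that best describe their shared function:\n"
              + "\n".join(lines)
          )
      ```

      Per the abstract, the catch is downstream of this step: the model’s returned term list carries no reliable p-values, so the prompt output can’t substitute for statistical enrichment.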
    • This is an interesting idea. If a language model can *do* biochemistry, then it is sophisticated enough to *be* biochemistry.