Literature-based Hypothesis Generation: Predicting the evolution of scientific literature to support scientists

AI4X, 2025
Tarek R Besold, Uchenna Akujuobi, Samy Badreddine, Jihun Choi, Hatem ElShazly, Frederick Gifford, Kana Maruyama, Kae Nagano, Pablo Sanchez Martin, Thiviyan Thanapalasingam, Alessandra Toniato, Christoph Wehner
Abstract

Science is advancing at an ever faster pace, as evidenced, for instance, by the exponential growth in the number of research articles published per year [1]. On the one hand, this poses an increasingly pressing challenge: navigating this ever-growing body of knowledge is tedious and time-consuming in the best of cases and, more often than not, infeasible for individual scientists. On the other hand, from an AI point of view, the scientific literature offers a great opportunity: the body of published research is a vast collection of high-quality (literally expert-reviewed) data about the relationships between concepts and the laws governing our physical world.

To use this opportunity to mitigate the challenge, computational systems have been introduced that aim to support human researchers in the initial phase of the scientific process by automatically extracting hypotheses from the knowledge contained in published resources, i.e., by performing automated hypothesis generation (HG). Famously, Swanson [2] systematically used a scientific literature database to find potential connections between previously disjoint bodies of research, and as a result hypothesized a (later confirmed) curative relationship between dietary fish oils and Raynaud's syndrome. Swanson and Smalheiser then automated the search and linking process in the ARROWSMITH system [3]. Their work and more recent examples such as [4, 5, 6] demonstrate the usefulness of computational methods for extracting latent information from the vast body of scientific publications.

In the following, we summarize the current state of our efforts to contribute to the development of a fit-for-use HG system. Our report covers recent developments in approaches to literature-based HG, as well as work that aims to make the resulting type of HG system fit for use by scientists by providing accompanying explanatory information.
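
To make the idea of linking previously disjoint bodies of research concrete, the following is a minimal sketch of co-occurrence-based linking in the spirit of Swanson's approach. It is not the ARROWSMITH implementation, nor the method developed in this work. The function name `abc_candidates`, the representation of each article as a set of terms, and the five-document toy corpus are all assumptions made up for illustration. Starting from a term A, the sketch collects intermediate terms B that appear together with A, and then ranks terms C that appear together with some B but never with A itself.

```python
from collections import defaultdict


def abc_candidates(documents, start_term):
    """Rank terms linked to start_term only through shared intermediate terms.

    documents: list of sets of terms (hypothetical stand-in for indexed articles).
    Returns (candidate, linking_terms) pairs, ordered by number of linking terms.
    """
    # Map each term to the indices of the documents it occurs in.
    postings = defaultdict(set)
    for doc_id, terms in enumerate(documents):
        for term in terms:
            postings[term].add(doc_id)

    # B terms: everything that co-occurs with the start term in some document.
    a_docs = postings.get(start_term, set())
    b_terms = {t for d in a_docs for t in documents[d]} - {start_term}

    # C terms: co-occur with some B term, but never directly with the start term.
    support = defaultdict(set)
    for b in b_terms:
        for d in postings[b]:
            for c in documents[d]:
                if c != start_term and c not in b_terms:
                    support[c].add(b)

    # More independent linking terms first; ties broken alphabetically.
    return sorted(support.items(), key=lambda kv: (-len(kv[1]), kv[0]))


# Toy corpus, invented for illustration only (not real bibliographic data).
documents = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "raynaud syndrome"},
    {"platelet aggregation", "raynaud syndrome"},
    {"platelet aggregation", "aspirin"},
]

ranked = abc_candidates(documents, "fish oil")
for candidate, links in ranked:
    print(candidate, "via", sorted(links))
```

Ranking by the number of distinct linking terms is only one simple choice. A practical system would need term normalization, filtering of overly generic terms, and a more careful scoring function.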

