Literature-based Hypothesis Generation: Predicting the evolution of scientific literature to support scientists
Abstract
Science is advancing at an increasingly quick pace, as evidenced, for instance, by the exponential growth in the number of published research articles per year [1]. On the one hand, this poses an increasingly pressing challenge: effectively navigating this ever-growing body of knowledge is tedious and time-consuming in the best of cases, and more often than not becomes infeasible for individual scientists. On the other hand, from an AI point of view, scientific literature offers a great opportunity: the body of published research constitutes a vast collection of high-quality (literally expert-reviewed) data about the relationships between concepts and the governing laws of our physical world.
To use this opportunity to mitigate the challenge, computational systems have been introduced that aim to support human researchers in the initial phase of the scientific process by automatically extracting hypotheses from the knowledge contained in published resources, i.e., by performing automated hypothesis generation (HG). Famously, Swanson [2] systematically used a scientific literature database to find potential connections between previously disjoint bodies of research, as a result hypothesizing a (later confirmed) curative relationship between dietary fish oils and Raynaud’s syndrome. Swanson and Smalheiser then automated the search and linking process in the ARROWSMITH system [3]. Their work and more recent examples such as [4, 5, 6] demonstrate the usefulness of computational methods in extracting latent information from the vast body of scientific publications. In the following, we summarize the current state of our efforts to contribute to the development of a fit-for-use HG system. Our report covers recent developments in approaches to literature-based HG, as well as work that aims to make the resulting HG systems fit for use by scientists by providing accompanying explanatory information.
Authors
- Tarek R Besold
- Uchenna Akujuobi
- Samy Badreddine
- Jihun Choi
- Hatem ElShazly
- Frederick Gifford
- Kana Maruyama
- Kae Nagano
- Pablo Sanchez Martin
- Thiviyan Thanapalasingam
- Alessandra Toniato
- Christoph Wehner
Venue
AI4X
Date
2025