Revisiting named entity recognition in food computing: enhancing performance and robustness

Uchenna Akujuobi

Shuhong Liu*

Tarek R Besold

* External authors

AIR

2024

Abstract

In the ever-evolving domain of food computing, named entity recognition (NER) presents transformative potential that extends far beyond mere word tagging in recipes. Its implications encompass intelligent recipe recommendations, health analysis, and personalization. Nevertheless, existing NER models in food computing encounter challenges stemming from variations in recipe input standards, limited annotations, and dataset quality. This article addresses the specific problem of ingredient NER and introduces two innovative models: SINERA, an efficient and robust model, and SINERAS, a semi-supervised variant that leverages a Gaussian Mixture Model (GMM) to learn from untagged ingredient list entries. To mitigate issues associated with data quality and availability in food computing, we introduce the ARTI dataset, a diverse and comprehensive repository of ingredient lines. Additionally, we identify and tackle a pervasive challenge—spurious correlations between entity positions and predictions. To address this, we propose a set of data augmentation rules tailored for food NER. Extensive evaluations conducted on the ARTI dataset and a revised TASTEset dataset underscore the performance of our models. They outperform several state-of-the-art benchmarks and rival the BERT model while maintaining smaller parameter sizes and reduced training times.
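
The abstract does not spell out the augmentation rules, so the following is only a minimal sketch of one plausible rule of this kind, not the authors' method: it reorders whole entity spans within a BIO-tagged ingredient line so that a tagger cannot rely on where an entity usually appears. The tag names (QUANTITY, UNIT, STATE, NAME) and the function names are hypothetical, chosen for illustration.

```python
import random
from typing import List, Tuple

Span = Tuple[List[str], List[str]]  # (tokens, tags) for one entity span


def split_into_spans(tokens: List[str], tags: List[str]) -> List[Span]:
    """Group tokens into spans using BIO tags (assumed tagging scheme)."""
    spans: List[Span] = []
    for token, tag in zip(tokens, tags):
        continues_previous = (
            tag.startswith("I-")
            and spans
            and spans[-1][1][-1][2:] == tag[2:]  # same entity type as previous tag
        )
        if continues_previous:
            spans[-1][0].append(token)
            spans[-1][1].append(tag)
        else:
            # "B-" tags, "O" tags, and orphan "I-" tags each start a new span.
            spans.append(([token], [tag]))
    return spans


def shuffle_entity_spans(
    tokens: List[str], tags: List[str], rng: random.Random
) -> Tuple[List[str], List[str]]:
    """Return a copy of the line with whole spans in a random order.

    Hypothetical augmentation rule: tokens inside a span keep their
    order and labels; only the position of each span changes.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    spans = split_into_spans(tokens, tags)
    rng.shuffle(spans)
    new_tokens = [tok for span_tokens, _ in spans for tok in span_tokens]
    new_tags = [tag for _, span_tags in spans for tag in span_tags]
    return new_tokens, new_tags


if __name__ == "__main__":
    line = ["2", "cups", "finely", "chopped", "onions"]
    labels = ["B-QUANTITY", "B-UNIT", "B-STATE", "I-STATE", "B-NAME"]
    print(shuffle_entity_spans(line, labels, random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible across training runs. Shuffling at the span level rather than the token level keeps multi-token entities such as "finely chopped" intact, so every token keeps its original label.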

Related Publications

From Neural Networks to Logical Theories: The Correspondence between Fibring Modal Logics and Fibring Neural Networks

ICLR, 2026
Ouns El Harzli, Bernardo Cuenca Grau, Artur d'Avila Garcez*, Ian Horrocks, Tarek R Besold

Fibring of modal logics is a well-established formalism for combining countable families of modal logics into a single fibred language with common semantics, characterized by fibred models. Inspired by this formalism, fibring of neural networks was introduced as a neurosymbo…

Bridging Perceptual Gaps in Food NLP: A Structured Approach Using Sensory Anchors

ACL, 2025
Kana Maruyama, Angel Hsing-Chi Hwang, Tarek R Besold

Understanding how humans perceive and describe food is essential for NLP applications such as semantic search, recommendation, and structured food communication. However, textual similarity often fails to reflect perceptual similarity, which is shaped by sensory experience, …

Literature-based Hypothesis Generation: Predicting the evolution of scientific literature to support scientists

AI4X, 2025
Tarek R Besold, Uchenna Akujuobi, Samy Badreddine, Jihun Choi, Hatem ElShazly, Frederick Gifford, Kana Maruyama, Kae Nagano, Pablo Sanchez Martin, Thiviyan Thanapalasingam, Alessandra Toniato, Christoph Wehner

Science is advancing at an increasingly quick pace, as evidenced, for instance, by the exponential growth in the number of published research articles per year [1]. On the one hand, this poses an increasingly pressing challenge: Effectively navigating this ever-growing body o…
