Unified Pipeline for Reproducible Benchmarking of Sequence Models on Biomedical Data
2 June 2026, by Viktoria Wrobel

This project aims to design and implement a unified, modular pipeline for training and evaluating sequence models in computational biology. Modern architectures such as Transformers, state-space models, and emerging alternatives are widely used, but comparing them is often hindered by inconsistent preprocessing, training procedures, and evaluation protocols. We propose a standardized framework that enables fair, reproducible, and efficient benchmarking across heterogeneous model classes on structured biomedical data. The pipeline will support interchangeable model components, configurable experiments, and consistent metrics, so that architectural trade-offs can be analyzed systematically under identical conditions. Its key contribution is to close the methodological gaps that currently separate model families, enabling transparent and scalable experimentation in real-world research settings.
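To make the idea of interchangeable components, configurable experiments, and consistent metrics concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the project's actual design: the names `MODEL_REGISTRY`, `ExperimentConfig`, `run_experiment`, the two toy models, and the synthetic GC-content task are all hypothetical stand-ins. A real pipeline would plug in Transformers or state-space models and real biomedical datasets behind the same interface.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol, Tuple


class SequenceModel(Protocol):
    """Hypothetical common interface every benchmarked model must satisfy."""

    def fit(self, xs: List[str], ys: List[int]) -> None: ...
    def predict(self, xs: List[str]) -> List[int]: ...


# Hypothetical registry: maps a config string to a model factory.
MODEL_REGISTRY: Dict[str, Callable[[], SequenceModel]] = {}


def register(name: str):
    """Decorator that adds a model class to the registry under `name`."""
    def wrap(factory):
        MODEL_REGISTRY[name] = factory
        return factory
    return wrap


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA string."""
    return (seq.count("G") + seq.count("C")) / len(seq)


@register("majority")
class MajorityBaseline:
    """Toy baseline: always predicts the most frequent training label."""

    def fit(self, xs, ys):
        self.label = max(set(ys), key=ys.count)

    def predict(self, xs):
        return [self.label for _ in xs]


@register("gc_threshold")
class GCThreshold:
    """Toy model: thresholds GC fraction at the midpoint of the class means."""

    def fit(self, xs, ys):
        pos = [gc_fraction(x) for x, y in zip(xs, ys) if y == 1]
        neg = [gc_fraction(x) for x, y in zip(xs, ys) if y == 0]
        if pos and neg:
            self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        else:
            self.threshold = 0.5  # fallback when only one class is present

    def predict(self, xs):
        return [int(gc_fraction(x) > self.threshold) for x in xs]


@dataclass(frozen=True)
class ExperimentConfig:
    """Everything that defines a run; identical configs give identical results."""
    model: str
    seed: int = 0
    n_train: int = 200
    n_test: int = 100
    seq_len: int = 50


def make_dataset(n: int, seq_len: int, rng: random.Random) -> Tuple[List[str], List[int]]:
    """Synthetic stand-in data: random DNA, labeled 1 if GC fraction > 0.5."""
    xs = ["".join(rng.choice("ACGT") for _ in range(seq_len)) for _ in range(n)]
    ys = [int(gc_fraction(x) > 0.5) for x in xs]
    return xs, ys


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Single shared metric, applied identically to every model."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def run_experiment(cfg: ExperimentConfig) -> dict:
    """Same data generation, training call, and metric for every model."""
    rng = random.Random(cfg.seed)  # seeded so each run is reproducible
    train_x, train_y = make_dataset(cfg.n_train, cfg.seq_len, rng)
    test_x, test_y = make_dataset(cfg.n_test, cfg.seq_len, rng)
    model = MODEL_REGISTRY[cfg.model]()
    model.fit(train_x, train_y)
    return {
        "model": cfg.model,
        "seed": cfg.seed,
        "accuracy": accuracy(test_y, model.predict(test_x)),
    }


if __name__ == "__main__":
    # Every registered model is evaluated on the same seeded data and metric.
    for name in MODEL_REGISTRY:
        print(run_experiment(ExperimentConfig(model=name, seed=42)))
```

The point of the sketch is the separation of concerns: models only implement `fit` and `predict`, the configuration object fully determines a run, and data generation and scoring live in one place, so any difference in the reported metric can be attributed to the model rather than to the surrounding procedure.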

