LibContinual: A Comprehensive Library towards Realistic Continual Learning
Abstract
A fundamental challenge in Continual Learning (CL) is catastrophic forgetting, where adapting to new tasks degrades performance on previous ones. While the field has evolved rapidly, the resulting surge of diverse methodologies has produced a fragmented research landscape. The absence of a unified framework, reflected in inconsistent implementations, conflicting dependencies, and varying evaluation protocols, makes fair comparison and reproducible research increasingly difficult. To address this challenge, we propose LibContinual, a comprehensive and reproducible library designed to serve as a foundational platform for realistic CL. Built on a high-cohesion, low-coupling modular architecture, LibContinual integrates 19 representative algorithms across five major methodological categories and provides a standardized execution environment. Leveraging this unified framework, we systematically identify and investigate three implicit assumptions prevalent in mainstream evaluation: (1) offline data accessibility, (2) unregulated memory resources, and (3) intra-task semantic homogeneity. We argue that these assumptions often overestimate the real-world applicability of CL methods. Through a comprehensive analysis using strict online CL settings, a novel unified memory budget protocol, and a proposed category-randomized setting, we reveal significant performance drops in many representative CL methods when they are subjected to these real-world constraints. Our study underscores the necessity of resource-aware and semantically robust CL strategies and offers LibContinual as a foundational toolkit for future research in realistic continual learning. The source code is available at https://github.com/RL-VIG/LibContinual.
Summary
This paper introduces LibContinual, a comprehensive and reproducible Python library for continual learning (CL) research. The authors address the fragmented landscape of CL methodologies by providing a unified framework built on PyTorch that integrates 19 representative algorithms across five major categories. LibContinual features a modular architecture with decoupled components such as Trainer, Model, Buffer, and DataModule, configured via YAML files. This allows researchers to mix and match different backbones, classifiers, and buffer strategies in a standardized environment, facilitating fair comparisons. Beyond serving as a repository of algorithms, LibContinual is used to critically examine three implicit assumptions prevalent in mainstream CL evaluation: offline data accessibility (multi-epoch training), unregulated memory resources (inconsistent accounting of storage costs), and intra-task semantic homogeneity (grouping semantically related classes into tasks). Through novel evaluation protocols, including strict online CL, a unified memory budget, and a category-randomized setting, the authors demonstrate significant performance drops for many CL methods when these assumptions are relaxed. The research highlights the need for resource-aware and semantically robust CL strategies, positioning LibContinual as a foundational toolkit for future research in realistic continual learning.
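To make the modular design concrete, the sketch below shows how such a decoupled, YAML-driven setup might look. All names here, including the configuration fields and how the pieces are wired, are illustrative assumptions for exposition, not LibContinual's actual API or schema.

```python
# Hypothetical sketch of a YAML-driven modular CL setup in the spirit of
# LibContinual's Trainer/Model/Buffer/DataModule decomposition. Field names
# and structure are assumptions for illustration, not the library's schema.
import yaml

config_text = """
backbone: resnet18        # feature extractor
classifier: linear        # classification head
buffer:
  strategy: herding       # exemplar selection policy
  size: 2000              # number of stored exemplars
optimizer:
  name: sgd
  lr: 0.1
"""

config = yaml.safe_load(config_text)

# Swapping the backbone, classifier, or buffer strategy only requires
# editing the YAML; the training code that consumes `config` stays fixed.
print(config["backbone"], config["buffer"]["strategy"])
```

The design choice this illustrates is the "high-cohesion, low-coupling" principle the paper describes: each component is selected declaratively, so a new method only needs to register its own module rather than modify the training loop.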
Key Insights
- LibContinual provides a unified and modular framework for CL research, integrating 19 algorithms and enabling fair comparisons by resolving conflicting dependencies and inconsistent evaluation protocols.
- The paper identifies and investigates three key implicit assumptions in CL evaluation: offline data accessibility, unregulated memory resources, and intra-task semantic homogeneity.
- A strict online CL setting, in which each batch is seen only once, reveals weaknesses in models designed for multi-epoch training and highlights the importance of learning efficiency and rapid adaptation (see the single-pass training sketch after this list).
- A novel unified memory budget protocol exposes the varying storage costs of different CL methods (Image-based, Feature-based, Model-based, Parameter-based, and Prompt-based), enabling more equitable cost-benefit analyses (see the budget-accounting sketch after this list).
- The category-randomized setting demonstrates that models can exploit intra-task semantic homogeneity as a shortcut, leading to performance drops when this homogeneity is removed (see the task-split sketch after this list).
- The authors reproduce the performance of various CL algorithms with LibContinual and compare it against the results reported in the original papers, demonstrating the framework's ability to closely replicate existing research (see Table II).
- The paper underscores the need for developing resource-aware and semantically robust CL strategies for realistic applications.
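As a minimal illustration of the strict online setting, the sketch below trains on a stream where every mini-batch is consumed exactly once. The model, input dimensions, and hyperparameters are arbitrary placeholders, not the paper's protocol.

```python
# Minimal sketch of strict online CL: one pass over the stream, so each
# mini-batch contributes exactly one gradient update and is then discarded.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

def train_online(task_stream):
    """task_stream yields (images, labels) batches, each seen only once."""
    model.train()
    for images, labels in task_stream:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()  # no second epoch over the same data

# Dummy two-batch stream standing in for one task's data.
stream = [(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)))
          for _ in range(2)]
train_online(stream)
```

The contrast with offline CL is simply the absence of an outer epoch loop: methods that rely on revisiting task data must instead learn from each batch on first sight.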
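For the unified memory budget, a simple way to see why the protocol matters is to express every kind of auxiliary storage in bytes. The sketch below does this back-of-the-envelope accounting; the constants (CIFAR-sized uint8 images, 512-dimensional float32 features) are illustrative assumptions, not the paper's protocol values.

```python
# Back-of-the-envelope accounting for a unified memory budget: image-,
# feature-, and parameter-based storage are all converted to bytes so
# their costs become directly comparable. Constants are assumptions.
BUDGET_BYTES = 2000 * 3 * 32 * 32            # e.g. 2000 raw 32x32 RGB images

def image_bytes(n, c=3, h=32, w=32):         # uint8 pixels
    return n * c * h * w

def feature_bytes(n, dim=512):               # float32 feature vectors
    return n * dim * 4

def param_bytes(n):                          # extra float32 parameters
    return n * 4

# Under the same budget, methods can afford very different item counts:
print(BUDGET_BYTES // image_bytes(1))        # raw images that fit
print(BUDGET_BYTES // feature_bytes(1))      # 512-d features that fit
print(BUDGET_BYTES // param_bytes(1))        # extra parameters that fit
```

Counting all five cost types against one byte budget is what makes cost-benefit comparisons across method families equitable: a prompt-based method's extra parameters and a replay method's exemplar buffer are charged against the same ledger.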
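Finally, the category-randomized setting can be sketched as a change in how classes are partitioned into tasks: shuffling the class order before splitting breaks the semantic grouping that standard split benchmarks otherwise preserve. The helper below is a hypothetical illustration, not LibContinual's splitter.

```python
# Sketch of the category-randomized task split: shuffling class IDs before
# partitioning removes the intra-task semantic homogeneity that models can
# otherwise exploit as a shortcut. Hypothetical helper, not the library API.
import random

def split_into_tasks(class_ids, num_tasks, randomize=True, seed=0):
    ids = list(class_ids)
    if randomize:
        random.Random(seed).shuffle(ids)     # break semantic grouping
    per_task = len(ids) // num_tasks
    return [ids[i * per_task:(i + 1) * per_task] for i in range(num_tasks)]

# 100 classes into 10 tasks; with randomize=False, contiguous (often
# semantically coherent) class blocks stay together within a task.
print(split_into_tasks(range(100), num_tasks=10)[0])
```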
Practical Implications
- LibContinual can be used by researchers and engineers to benchmark existing CL algorithms and develop new ones in a standardized and reproducible environment.
- The insights regarding implicit assumptions can guide the design of more realistic and robust CL evaluation protocols.
- The unified memory budget protocol can help practitioners choose CL methods that are appropriate for resource-constrained environments.
- The category-randomized setting can be used to evaluate the semantic robustness of CL methods and identify those that are less prone to exploiting task-level shortcuts.
- Future research should focus on developing CL methods that are resource-aware, semantically robust, and capable of learning from single-pass data streams.