Set-Input Trees: Discovering New Effective Tree-Based Multiple Instance Learning Algorithms

Keywords: Set-Input Trees, Multiple Instance Learning, Attention Mechanism, Regression, Classification, Interpretation

Abstract

We propose gradient-based Set-Input Trees, a novel tree-based architecture for Multiple Instance Learning (MIL) that addresses both classification and regression. Unlike conventional methods that rely on fixed aggregation (e.g., min/max pooling), the proposed architecture integrates gradient-boosted trees with an attention mechanism: instances are processed independently, and their leaf embeddings are pooled using learned attention weights. This preserves interpretability while capturing bag-level structure. For regression, we introduce a synthetic MIL formulation, feature-to-bag conversion, which enables evaluation on continuous targets. Experiments on standard MIL benchmarks show that the proposed algorithms outperform other tree-based models. The model’s tree-based design ensures scalability and transparency, bridging instance-level decisions with set-valued predictions. Code implementing the proposed algorithms is publicly available.
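As a rough illustration of the pooling step described in the abstract, the sketch below combines per-instance embeddings into one bag embedding using softmax attention weights. It is not the authors' implementation: the function name `attention_pool`, the linear scoring vector `w`, and the toy embeddings are all assumptions made for this example, and in the actual method the embeddings would come from tree leaves and the scoring parameters would be learned by gradient descent.

```python
import math

def attention_pool(embeddings, w):
    """Pool a bag of instance embeddings into a single bag embedding.

    embeddings: list of equal-length float lists, one per instance
                (hypothetical stand-ins for leaf embeddings).
    w: scoring vector (hypothetical; assumed to be learned elsewhere).
    Returns (bag_embedding, attention_weights).
    """
    # One scalar score per instance: dot product with the scoring vector.
    scores = [sum(e_k * w_k for e_k, w_k in zip(e, w)) for e in embeddings]
    # Numerically stable softmax over the instances of the bag.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Bag embedding is the attention-weighted sum of instance embeddings.
    dim = len(embeddings[0])
    pooled = [sum(a * e[k] for a, e in zip(weights, embeddings))
              for k in range(dim)]
    return pooled, weights

# Toy bag with three instances and two-dimensional embeddings.
bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A zero scoring vector gives uniform weights, i.e. plain mean pooling.
pooled_uniform, weights_uniform = attention_pool(bag, [0.0, 0.0])

# A non-zero scoring vector shifts weight toward higher-scoring instances.
pooled_skewed, weights_skewed = attention_pool(bag, [2.0, 0.0])
```

With a zero scoring vector the weights are uniform and the result reduces to mean pooling, which shows that fixed aggregation is a special case of this weighted form. The weights themselves can be inspected to see which instances contribute most to a bag-level prediction.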

Author Biography

Lev Utkin, Peter the Great St. Petersburg Polytechnic University

Doctor of Technical Sciences, Professor

Published
2025-10-24