Parameterized Algorithms and Complexity for Function Merging with Branch Reordering
Binary size reduction is an increasingly important optimization objective for compilers, especially in the context of mobile applications and resource-constrained embedded devices. In such domains, binary size often takes precedence over compilation time. One emerging technique that has been shown effective is function merging, where multiple similar functions are merged into one, thereby eliminating redundancy. The state-of-the-art approach to perform the merging, due to Rocha et al. [CGO 2019, PLDI 2020], is based on sequence alignment, where functions are viewed as linear sequences of instructions that are then matched in a way maximizing their alignment.
In this paper, we consider a significantly generalized formulation of the problem by allowing reordering of branches within each function, subsequently allowing for more flexible matching and better merging. We show that this makes the problem NP-hard, and thus we study it through the lens of parameterized algorithms and complexity, where we identify certain parameters of the input that govern its complexity. We look at two natural parameters: the branching factor and nesting depth of input functions.
Concretely, our input consists of two functions F_1, F_2 where each F_i has size n_i, branching factor b_i, and nesting depth d_i. Our task is to reorder the branches of F_1 and F_2 in a way that yields linearizations achieving the maximum sequence alignment. Let n = max(n_1, n_2), and define b, d similarly. Our results are as follows:
-
A simple algorithm running in time 2^{O(bd)} n^2, establishing that the problem is fixed-parameter tractable (FPT) with respect to all four parameters b_1, d_1, b_2, d_2.
-
An algorithm running in time 2^{O(bd_2)} n^7, showing that even when one of the functions has an unbounded nesting depth, the problem remains in FPT.
-
A hardness result showing that the problem is NP-hard even when constrained to constant d_1, b_2, d_2.
To the best of our knowledge, this is the first systematic study of function merging with branch reordering from an algorithmic or complexity-theoretic perspective.
Fri 19 JunDisplayed time zone: Mountain Time (US & Canada) change
11:00 - 12:40 | Compiler Optimization for AcceleratorsPLDI Research Papers at Flatirons 4 Chair(s): Tyler Sorensen Microsoft Research; University of California at Santa Cruz | ||
11:00 20mTalk | Compiling Strassen-like Matrix Multiplication Algorithms to Fast CUDA Kernels PLDI Research Papers Abhinav Jangda Microsoft Research DOI | ||
11:20 20mTalk | Parameterized Algorithms and Complexity for Function Merging with Branch Reordering PLDI Research Papers Amir K. Goharshady University of Oxford, Kerim Kochekov Hong Kong University of Science and Technology, Tian Shu Hong Kong University of Science and Technology, Ahmed Khaled Zaher Hong Kong University of Science and Technology DOI | ||
11:40 20mTalk | Neptune: Advanced ML Operator Fusion for Locality and Parallelism on GPUs PLDI Research Papers Yifan Zhao University of Illinois Urbana-Champaign, Egan Johnson University of Illinois Urbana-Champaign, Prasanth Chatarasi IBM Research, Vikram S. Adve University of Illinois Urbana-Champaign, Sasa Misailovic University of Illinois Urbana-Champaign DOI | ||
12:00 20mTalk | SparseZETA: Intelligent Auto-tuner for Designing High-Performance SpMV Programs PLDI Research Papers Zhen Du Institute of Computing Technology at Chinese Academy of Sciences, Ying Liu Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xionghui Chen Nanjing University, Yanbo Zhao North Carolina State University, Xiaobing Feng Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Huimin Cui Institute of Computing Technology at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Jiajia Li North Carolina State University DOI | ||