## László Kozma :: Research## Research interestsData structures and algorithms, combinatorics (permutations, graphs, set systems, posets), computational geometry, machine learning.## Publications**Analysis of smooth heaps and slim heaps** with Maria Hartmann, Corwin Sinnamon, Robert E. Tarjan, ICALP 2021. [pdf-soon] [smooth-heap.c] [slim-heap.c]**Splay trees on trees** with Benjamin Aram Berendsohn, Draft 2020. [pdf]**Geometric group testing** with Benjamin Aram Berendsohn, Draft 2020. [pdf]**Exact exponential algorithms for two poset problems** SWAT 2020. [pdf] [talk-notes]**Finding and counting permutations via CSPs** with Benjamin Aram Berendsohn, Dániel Marx, IPEC 2019. (Journal version: Algorithmica 2021, by invitation.) [pdf] [journal-link]**Hamiltonicity below Dirac's condition** with Bart Jansen, Jesper Nederlof, WG 2019. [pdf] [slides]**Time- and space-optimal algorithm for the many-visits TSP** with André Berger, Matthias Mnich, Roland Vincze, SODA 2019. (Journal version: ACM TALG 2020.) [pdf] [journal-link] [talk-notes]-
**Selection from heaps, row-sorted matrices and X + Y using soft heaps** with Haim Kaplan, Or Zamir, Uri Zwick, SOSA@SODA 2019. [pdf] [slides -- Uri Zwick] **Improved bounds for multipass pairing heaps and path-balanced binary search trees** with Dani Dorfman, Haim Kaplan, Seth Pettie, Uri Zwick, ESA 2018. [pdf]-
**Smooth heaps and a dual view of self-adjusting data structures** with Thatchaphol Saranurak, STOC 2018, HALG 2019. (Journal version: SICOMP 2019, by invitation.) [pdf] [slides] [poster -- Thatchaphol] [journal link] [smooth-heap.c] [slim-heap.c] -
**Multi-finger binary search trees** with Parinya Chalermsook, Mayank Goswami, Kurt Mehlhorn, Thatchaphol Saranurak, ISAAC 2018. [pdf] [slides] -
**Pairing heaps: the forward variant** with Dani Dorfman, Haim Kaplan, Uri Zwick, MFCS 2018. [pdf] [slides] -
**Maximum Scatter TSP in Doubling Metrics** with Tobias Mömke, SODA 2017. [pdf] [slides, extra slides]
Supersedes preliminary paper presented at EuroCG 2016. [pdf] -
**Binary search trees, rectangles and patterns**, PhD thesis, Saarland University, 2016. [pdf] [slides] -
**Binary search trees and rectangulations** with Thatchaphol Saranurak, Technical report (work-in-progress), 2016. [pdf] -
**Hitting Set for hypergraphs of low VC-dimension** with Karl Bringmann, Shay Moran, N.S. Narayanaswamy, ESA 2016. [pdf] -
**Pattern-avoiding access in binary search trees** with Parinya Chalermsook, Mayank Goswami, Kurt Mehlhorn, Thatchaphol Saranurak, FOCS 2015, HALG 2016. [pdf] [slides I -- Thatchaphol] [slides II] -
**Greedy Is an Almost Optimal Deque** with Parinya Chalermsook, Mayank Goswami, Kurt Mehlhorn, Thatchaphol Saranurak, WADS 2015. [pdf] [slides -- Thatchaphol] **Self-Adjusting Binary Search Trees: What Makes Them Tick?** with Parinya Chalermsook, Mayank Goswami, Kurt Mehlhorn, Thatchaphol Saranurak, ESA 2015. [pdf] [slides]-
**Streaming Algorithms for Partitioning Integer Sequences** with Christian Konrad, Manuscript, 2014. [pdf] **Shattering, Graph Orientations, and Connectivity** with Shay Moran, Electronic Journal of Combinatorics, Vol 20(3), 2013. [pdf] [slides]-
**Privacy by Fake Data: A Geometric Approach** with Victor Alvarez, Erin Chambers, CCCG 2013. [pdf] **Minimum Average Distance Triangulations** ESA 2012. [pdf] [slides] [poster]**Binary Principal Component Analysis in the Netflix Collaborative Filtering Task** with Alexander Ilin, Tapani Raiko, MLSP 2009. [pdf] [slides]
Summary: soonSummary: Search trees on trees (STTs) generalize binary search trees (BSTs): whereas BSTs search over a linear search space, SSTs search over a tree. We show the following: (1) a poly-time algorithm to find an (1+eps)-approximation of the optimal STT for a given search distribution, (2) linear rotation-distance for a broad class of STTs, (3) a generalization of Splay trees to STTs, (4) static optimality of this generalized Splay, asymptotically matching the optimum tree (without knowing the search distribution). Summary: Given m objects (points in d dimensional space), how many tests (axis-parallel boxes) are needed to identify t defective objects? A test reports whether at least one of the objects in it is defective, and all tests are run in parallel. This is a geometric version of the classical group testing problem. We give upper and lower bounds for various values of t and d. Summary: Exact exponential-time algorithms are given for the following problems: counting linear extensions when the input poset is two-dimensional, and the jump number problem in arbitrary posets (both running times: ~1.82^n). Summary: The permutation pattern problem asks whether a given permutation of length n contains a given pattern of length k. We give two new algorithms for the problem: one with running time n^{0.25k+o(k)}, and one with running time O(1.6181^n) or n^{0.5k+o(k)}. The results improve the earlier best bounds for the problem for patterns of size k≈log(n) or larger and reduce the problem to constraint-satisfaction. Our second algorithm uses polynomial space and is simpler than previous approaches for the problem. We include hardness results and improved bounds for special cases, and the counting version of the problem is also studied.
Berendsohn's thesis includes refinements of the hardness results with more details. The paper supersedes an earlier report [preprint]. Summary: Dirac's theorem (1952) is a classical result of graph theory, stating that an n-vertex graph is Hamiltonian if every vertex has degree at least n/2. We show that there are efficient algorithms to determine the Hamiltonicity of a graph, when the degree-requirements of Dirac's theorem are relaxed in certain ways. In 2020, Fomin, Golovach, Sagunov, and Simonov [preprint] provide a generalization. Summary: In the many-visits TSP, a salesperson must visit n cities, each of them precisely a prescribed number of times (in the cheapest possible way). We give a polynomial-space algorithm for this problem, with running time that depends single-exponentially on n. This improves on the previous best algorithm by Cosmadakis and Papadimitriou from 1984, that has superexponential time- and space-complexity.Follow-up: The running time of our algorithms is of the form c^n with c=16 (when polynomial space is used) and c=5 (with exponential space). In 2020, Kowalik, Li, Nadara, Smulewicz, and Wahlström [preprint] improve the first to c≈7.88 and refine the analysis of the second to c=4. Summary: Using soft heaps, we obtain simpler optimal algorithms for selecting the k-th smallest item from heap-ordered trees, from sets of sorted lists, and from sets of pairwise sums (X + Y), matching, and in some ways extending classical results of Frederickson (1993) and Frederickson and Johnson
(1982).Summary: We improve the analysis of a classical pairing heap variant (Fredman, Sedgewick, Sleator, Tarjan, 1985) and a classical self-adjusting binary search tree heuristic (Sleator, 198x), reducing the gap to the information-theoretic lower bound from roughly (loglog n) to less than (loglog...log n), where log(.) can be iterated any constant number of times. Summary: We describe a new connection ("duality") betweden self-adjusting binary search trees (BSTs) and heaps. This allows us to transfer results between the two settings, obtaining: (1) a broad class of "stable" heap algorithms, (2) instance-specific lower and upper bounds for stable heaps, (3) a new heap data structure called "smooth heap", which we show to be the heap-counterpart of a BST that is conjectured to be "instance-optimal", (4) new "offline" BST algorithms. Summary: We show that Binary Search Trees (BSTs) with multiple fingers can be efficiently simulated by standard (one-finger) BSTs with rotations. The results connect two prominent online problems: dynamic BSTs and k-server. As an application, we show that BSTs can take advantage of certain regularities in the input, for instance, when accessing items close to some recently accessed item.
The paper builds upon (and partially supersedes) the technical report The landscape of bounds for binary search trees. [pdf]Results found only in the report: a survey of BST properties; new connections/separations between BST properties; a new, more intuitive analysis of Splay trees; a technique for composing ("interleaving") sequences and combining BST properties; an observation about the Move-to-root heuristic.Summary: We give a new analysis of a classical pairing heap variant (Fredman, Sedgewick, Sleator, Tarjan, 1985), improving the bound on the cost of operations from O(n^0.5) to better than O(n^eps) for any constant eps>0. The main novelty is a potential function that captures rank-differences between nodes.Summary: We give a polynomial time (1+eps)-approximation for euclidean (and more general) cases of the Traveling-Salesman variant in which no hop may be too short, improving on the earlier best ratio of 2.Summary: Contains results related to binary search trees and the dynamic optimality conjecture from three papers, expanded with a survey of the problem, and some new observations.Summary: Serving a sequence of searches in a binary search tree with rotations is shown to be equivalent with the previously studied problem of finding a sequence of flips between two rectangulations. Connections of both problems to Manhattan networks are also explored. The material also appears (with small updates) as Chapter 5 of my thesis.Summary: The Hitting Set problem remains hard (in the parameterized sense), even in set systems of very small VC-dimension. If we restrict the structure even more, then Hitting Set is solvable in polynomial time. Our easy class is a generalization of Edge Cover.Summary: Searches in binary search trees (BSTs) take almost constant (amortized) time if the search sequence avoids an arbitrary fixed pattern. Moreover, this is attained by Greedy, a general-purpose BST algorithm that is conjectured optimal (for every sequence). Pattern avoiding sequences generalize some well-studied earlier examples, which can be seen as avoiding some particular pattern. Many of the arguments rely on results from forbidden submatrix theory. Part of the material appears (with some updates) as Chapter 4 of my thesis.Summary: We extend the "geometry of BST" model of Demaine et al. to handle insert and delete operations, and we show that the Greedy algorithm is almost optimal on deque sequences (where insert and delete happens only at the minimum and maximum). The results give circumstantial evidence that Greedy may be instance-optimal.Summary: We give combinatorial conditions that guarantee the efficiency of self-adjusting binary search tree algorithms, unifying the analysis of several known heuristics, and obtaining new, efficient heuristics based on depth-reduction. We also characterize all heuristics that are local. The paper supersedes an earlier draft note [pdf] which may nonetheless contain additional intuition. Part of the material appears (with some updates) as Chapter 3 of my thesis.Summary: We study the problem of partitioning a stream of integers into blocks with roughly equal total value, in one pass. We look at the trade-off between approximation-ratio and amount of memory used.Summary: We interpret results for set systems, shattering, and VC-dimension, in the context of graph orientations. We thus obtain a number of inequalities (some known, some new) involving distances, flows, connectivity, forbidden subgraphs, etc., in graphs. A warm-up example: the number of orientations of a graph G in which there is an s-to-t directed path equals the number of spanning subgraphs of G in which s and t are connected, for every G, s, t.Summary: Given n points in d dimensions, add as few extra points as possible, so that no point can be isolated within a unit box. The problem arises in the context of data privacy. We give approximation and hardness results.Summary: How to connect n points in the plane with a non-crossing network, so that the average distance between a pair of points is as small as possible. In general, the problem is shown to be hard, in the unit-weight case a non-trivial polynomial-time algorithm is found, and the metric/euclidean case is left open. Summary: We propose a PCA-like factorization algorithm for binary matrices with missing values, that scales well to very high dimensional and very sparse data. Together with a binarization method, the algorithm directly reconstructs integer entries within a small range, it is thus well-suited for predicting ratings in collaborative filtering.## Misc.-
**Useful inequalities cheat sheet**, 2012-... [link to page] [pdf] [ps.gz] - Book reviews for ACM SIGACT column: G. Navarro: Compact data structures, A. M. Hinz, S. Klavžar, U. Milutinović, C. Petr: The tower of Hanoi.
To my home page. |