László Kozma :: Research
Research interestsFundamental data structures and algorithms, combinatorics (permutations, graphs, set systems), computational geometry, machine learning.
Summary: Using soft heaps, we obtain simpler optimal algorithms for selecting the k-th smallest item from heap-ordered trees, from sets of sorted lists, and from sets of pairwise sums (X + Y), matching, and in some ways extending classical results of Frederickson (1993) and Frederickson and Johnson (1982).
Summary: We present a new connection ("duality") between self-adjusting binary search trees (BSTs) and heaps. This allows us to transfer results between the two problems, in particular we describe: (1) a broad class of "stable" heap algorithms, (2) instance-specific lower bounds for stable heaps, (3) a new heap data structure called "smooth heap", which we show to be the heap counterpart of a BST that is conjectured to be the best.
Summary: We show that Binary Search Trees (BSTs) with several fingers can be efficiently simulated by standard (one-finger) BSTs. The results connect two prominent online problems: dynamic BSTs and k-server. As an application we show that BSTs are efficient when accessing items close to some recently accessed item.
The paper builds upon (and partially supersedes) the technical report The landscape of bounds for binary search trees. [pdf]
Results found only in the report: a survey of BST properties; new connections/separations between BST properties; a new, more intuitive analysis of Splay trees; a technique for composing ("interleaving") sequences and combining BST properties; an observation about the Move-to-root heuristic.
Summary: We improve the analysis of a classical pairing heap variant (Fredman, Sedgewick, Sleator, Tarjan, 1985) from O(n^0.5) to better than O(n^eps) for any constant eps>0.
Summary: We give a polynomial time (1+eps)-approximation for euclidean (and more general) cases of the TSP variant in which neighboring points may not be too close to each other, improving on the earlier best ratio of 2.
Summary: Contains results related to binary search trees and the dynamic optimality conjecture from three separate papers, expanded, and with an additional broad survey of the problem, as well as some new observations.
Summary: Serving a sequence of searches in a binary search tree with rotations is (in some sense) the same problem as finding a sequence of flips between two rectangulations.
Summary: The Hitting Set problem remains hard (in the parameterized sense), even in set systems of very small VC-dimension. If we restrict the structure even more, then Hitting Set is solvable in polynomial time. The easy class is a generalization of Edge Cover.
Summary: Searches in binary search trees (BSTs) take almost constant (amortized) time if the search sequence avoids an arbitrary fixed pattern. Moreover, this is attained by Greedy, a conjectured optimal online BST algorithm. Pattern avoiding sequences generalize some well-studied earlier examples, which can be seen as avoiding some particular pattern. Some of the arguments rely on results from forbidden submatrix theory.
Summary: We extend the "geometry of BST" model of Demaine et al. to handle insert and delete operations, and we show that the Greedy algorithm is almost optimal on deque sequences (insert and delete happens only at the minimum and maximum). The results give evidence that Greedy may be instance-optimal.
Summary: We give combinatorial conditions that guarantee the efficiency of self-adjusting binary search tree algorithms, unifying the analysis of several known heuristics, and obtaining new, efficient heuristics based on depth-reduction.
Summary: We study the problem of partitioning a stream of integers into blocks with roughly equal total value, in one pass. We look at the trade-off between approximation-ratio and amount of memory used.
Summary: We interpret results for set systems, shattering, and VC-dimension, in the context of graph orientations, obtaining statements about distances, flows, connectivity, forbidden subgraphs, etc., some known, some new. An easy example: the number of orientations of G in which there is an s-to-t directed path equals the number of spanning subgraphs of G in which s and t are connected, for every G, s, t.
Summary: Given n points in d dimensions, add as few extra points as possible, such that no point can be isolated within a unit box. The problem arises in the context of data privacy. We give approximation and hardness results.
Summary: How to connect n points in the plane with a non-crossing network, such that the average distance between a pair of points is as small as possible. In general, the problem is shown to be hard, in the unit-weight case we give a polynomial-time algorithm, and we leave the metric/euclidean cases open.
Summary: We propose a PCA-like factorization algorithm for binary matrices with missing values, that scales well to very high dimensional and very sparse data. Together with a binarization method, the algorithm directly reconstructs small integer entries, it is thus applicable for predicting ratings in collaborative filtering.
To my home page.