32nd International Symposium on Theoretical Aspects of Computer Science

STACS’15, March 4–7, 2015, Garching, Germany

Edited by

Ernst W. Mayr

Nicolas Ollinger

Editors

Ernst W. Mayr
Fakultät für Informatik
Technische Universität München
mayr@in.tum.de

Nicolas Ollinger
LIFO
Université d’Orléans
nicolas.ollinger@univ-orleans.fr

ACM Classification 1998

F.1.1 Models of Computation, F.2.2 Nonnumerical Algorithms and Problems, F.4.1 Mathematical Logic, F.4.3 Formal Languages, G.2.1 Combinatorics, G.2.2 Graph Theory

ISBN 978-3-939897-78-1

Published online and open access by

Schloss Dagstuhl – Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing, Saarbrücken/Wadern, Germany. Online available at http://www.dagstuhl.de/dagpub/978-3-939897-78-1.

Publication date February, 2015

Bibliographic information published by the Deutsche Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

License

This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC-BY 3.0):

http://creativecommons.org/licenses/by/3.0/legalcode.

In brief, this license authorizes each and everybody to share (to copy, distribute and transmit) the work under the following conditions, without impairing or restricting the authors’ moral rights:

Attribution: The work must be attributed to its authors.

The copyright is retained by the corresponding authors.

Digital Object Identifier: 10.4230/LIPIcs.STACS.2015.i

ISBN 978-3-939897-78-1 ISSN 1868-8969 http://www.dagstuhl.de/lipics

LIPIcs – Leibniz International Proceedings in Informatics

LIPIcs is a series of high-quality conference proceedings across all fields in informatics. LIPIcs volumes are published according to the principle of Open Access, i.e., they are available online and free of charge.

Editorial Board

Susanne Albers (TU München)
Chris Hankin (Imperial College London)
Deepak Kapur (University of New Mexico)
Michael Mitzenmacher (Harvard University)
Madhavan Mukund (Chennai Mathematical Institute)
Catuscia Palamidessi (INRIA)
Wolfgang Thomas (RWTH Aachen)
Pascal Weil (Chair, CNRS and University Bordeaux)
Reinhard Wilhelm (Saarland University)

ISSN 1868-8969

http://www.dagstuhl.de/lipics

Foreword

The Symposium on Theoretical Aspects of Computer Science (STACS) conference series is an international forum for original research on theoretical aspects of computer science.

Typical areas are (cited from the call for papers for this year’s conference):

algorithms and data structures, including: parallel, distributed, approximation, and randomized algorithms, computational geometry, cryptography, algorithmic learning theory, algorithmic game theory, analysis of algorithms;
automata and formal languages;
computational complexity, parameterized complexity, randomness in computation;
logic in computer science, including: semantics, specification and verification, rewriting and deduction;
current challenges, for example: natural computing, quantum computing, mobile and net computing.

STACS is held alternately in France and in Germany. This year’s conference (taking place March 4–7 in Garching near Munich) is the 32nd in the series. Previous meetings took place in Paris (1984), Saarbrücken (1985), Orsay (1986), Passau (1987), Bordeaux (1988), Paderborn (1989), Rouen (1990), Hamburg (1991), Cachan (1992), Würzburg (1993), Caen (1994), München (1995), Grenoble (1996), Lübeck (1997), Paris (1998), Trier (1999), Lille (2000), Dresden (2001), Antibes (2002), Berlin (2003), Montpellier (2004), Stuttgart (2005), Marseille (2006), Aachen (2007), Bordeaux (2008), Freiburg (2009), Nancy (2010), Dortmund (2011), Paris (2012), Kiel (2013), and Lyon (2014).

The interest in STACS has remained at a high level over the past years. The STACS 2015 call for papers led to 235 submissions with authors from 39 countries. Each paper was assigned to three program committee members who, at their discretion, asked external reviewers for reports. The committee selected 55 papers during a three-week electronic meeting held in November/December. For the first time within the STACS conference series, there was also a rebuttal period during which authors could submit remarks to the PC concerning the reviews of their papers. As co-chairs of the program committee, we would like to sincerely thank all its members and the many external referees for their valuable work. In particular, there were intense and interesting discussions. The overall very high quality of the submissions made the selection a difficult task.

This year, the conference includes two tutorials. We would like to express our thanks to the speakers Felix Brandt (TUM) and Paul Goldberg (Oxford) for these tutorials, as well as to the invited speakers, Sanjeev Arora (Princeton), Manuel Bodirsky (Dresden), and Peter Sanders (Karlsruhe). Special thanks also go to Andrei Voronkov for his EasyChair software (http://www.easychair.org). Moreover, we would like to warmly thank Christine Lissner and Ernst Bayer for their continuous help throughout the conference organization.

We would also like to thank Marc Herbstritt and Michael Wagner from the Dagstuhl/LIPIcs team for assisting us in the publication process and the final production of the proceedings. These proceedings contain extended abstracts of the accepted contributions and abstracts of the invited talks and the tutorials. The authors retain their rights and make their work available under a Creative Commons license. The proceedings are published electronically by Schloss Dagstuhl – Leibniz-Center for Informatics within their LIPIcs series.

STACS 2015 has received funds and help from the Deutsche Forschungsgemeinschaft (DFG), for which we are very grateful.

Munich and Orléans, February 2015
Ernst W. Mayr and Nicolas Ollinger

Conference Organization

Program Committee

Andris Ambainis, FPM, U Riga
Hagit Attiya, CS, Technion, Haifa
Johannes Blömer, CS, U Paderborn
Mikołaj Bojańczyk, II, U Warsaw
Tomas Brazdil, Masaryk U, Brno
Niv Buchbinder, SOR, Tel Aviv U
Anuj Dawar, CL, U Cambridge
Adrian Dumitrescu, CS, U Wisconsin-Milwaukee
Matthias Englert, DIMAP/DCS, U Warwick
Funda Ergun, SCS SFU and SoIC Indiana U
Fedor Fomin, IN, U Bergen
Tobias Friedrich, FMI, FSU Jena
Christian Glaßer, I1, U Würzburg
Etienne Grandjean, GREYC, Caen
Tomasz Jurdzinski, U Wroclaw
Manfred Kufleitner, FMI, U Stuttgart
Jerome Leroux, CNRS, LaBRI, Bordeaux
Ernst W. Mayr, TUM, München (co-chair)
Peter Bro Miltersen, CS, U Aarhus
Nicolas Ollinger, LIFO, Orléans (co-chair)
Sylvain Perifel, LIAFA, U Paris Diderot
Jayalal Sarma, IIT, Madras
Nicolas Schabanel, CNRS, LIAFA, Paris 7
Lutz Schröder, FAU Erlangen-Nürnberg
Dimitrios M. Thilikos, CNRS, LIRMM and U Athens
Gerhard Woeginger, TUE, Eindhoven

Local Organization Committee

Ernst W. Mayr, TUM, München (chair)
Christine Lissner, TUM, München
Ernst Bayer, TUM, München

External Reviewers

Sebastian Abshoff Oswin Aichholzer Helmut Alt Vikraman Arvind James Aspnes

Mohamed Faouzi Atig Erfan Sadeqi Azer Golnaz Badkobeh Christel Baier Valeriy Balabanov János Balogh Evangelos Bampas Hideo Bannai Régis Barbanchon Rafael Barbosa Leonid Barenboim Laurent Bartholdi Surender Baswana Tugkan Batu Florent Becker Petra Berenbrink Attila Bernáth Valérie Berthé Dietmar Berwanger Randeep Bhatia Binay Bhattacharya Arnab Bhattacharyya Marcin Bieńkowski Olivier Bodini Manuel Bodirsky Andrej Bogdanov Bernard Boigelot Udi Boker

Guillaume Bonfante Paul Bonsma Adam Bouland Andreas Brandstadt Simina Brânzei Sascha Brauer Michael Bremner Karl Bringmann Joshua Brody Véronique Bruyère Kevin Buchin Kathrin Bujna Jannis Bulian Benjamin Burton

Jarosław Byrka Daniel Cabarcas Gruia Calinescu Olivier Carton Parinya Chalermsook Jérémie Chalopin Witold Charatonik Krishnendu Chatterjee Arkadev Chattopadhyay Dimitris Chatzidimitriou Kaustuv Chaudhuri Ke Chen

Shahar Chen Christine Cheng Otfried Cheong Mahdi Cheraghchi Ferdinando Cicalese Francisco Claude Lorenzo Clemente Ilan Cohen Dinu Coltuc Nadia Creignou Maxime Crochemore Marek Cygan Artur Czumaj Jurek Czyżowicz Shantanu Das Samir Datta Mark de Berg Bart de Keijzer

Nicolas de Rugy-Altherre Ronald de Wolf

Emmanuel Delucchi Dariusz Dereniowski Krishnammorthy Dinesh Michael Dinitz

Itai Dinur Shahar Dobzinski Laurent Doyen Anne Driemel Léo Ducas Fabien Durand Christoph Dürr Zdeňek Dvořák Stefan Dziembowski Rüdiger Ehlers Kord Eickmeyer

Martina Eikel Yuval Emek Alina Ene David Eppstein Leah Epstein Bruno Escoffier William Evans Yuri Faenza John Fearnley Moran Feldman Nathanaël Fijalkow Aris Filos-Ratsikas Samuel Fiorini Lila Fontes Vojtěch Forejt Mathew Francis Robert Fraser Anna Frid Oliver Friedmann Alan Frieze Travis Gagie Anahi Gajardo Jakub Gajarský Iftah Gamzu Robert Ganian Pierre Ganty Leszek Gąsieniec Serge Gaspers Philippe Gaucher Paweł Gawrychowski Guido Gherardi Anirban Ghosh Panos Giannopoulos Archontia Giannopoulou Hugo Gimbert

Tomasz Gogacz Stefan Göller Petr Golovach Daniel Gonçalves David Gosset

Dominique Gouyou-Beauchamps Sathish Govindarajan

Vineet Goyal Serge Grigorieff Joshua Grochow Peter Günther Heng Guo Rishi Gupta Shalmoli Gupta Christoph Haase

Magnús M. Halldórsson Sean Hallgren

Michal Hańćkowiak Kristoffer Arnsfelt Hansen Thomas Dueholm Hansen Sariel Har-Peled

Thomas Hayes Pinar Heggernes Lauri Hella

Benjamin Hellouin de Ménibus Monika Henzinger

Frédéric Herbreteau Ulrich Hertrampf Cameron Donnay Hill Jeff Hirst

Petr Hliněný Martin Hoefer Seok-Hee Hong Florian Horn Pavel Hrubes Andreas Hülsing Paul Hunter John Iacono

Rasmus Ibsen-Jensen Sungjin Im

Radu Iosif Rani Izsak Bart M. P. Jansen Emmanuel Jeandel Stacey Jeffery Anders Jensen Artur Jeż Łukasz Jeż Ajay Joneja Mark Jones Antoine Joux Jakob Juhnke Tomasz Jurkiewicz Mark Kaminski Marcin Kamiński

Mamadou Moustapha Kanté Michael Kapralov

Juhani Karhumaki Shiva Kasiviswanathan Jonathan Kausch Akitoshi Kawamura Edon Kelmendi Iordanis Kerenidis Eun Jung Kim Shelby Kimmel

Valerie King Hartmut Klauck Bartek Klin Peter Kling

Hirotada Kobayashi Yusuke Kobayashi Johannes Koebler Pascal Koiran Stavros Kolliopoulos Balagopal Komarath Eryk Kopczyński Swastik Kopparty Sajin Koroth Guy Kortsarz Adrian Kosowski Robin Kothari Lukasz Kowalik Daniel Kral Dieter Kratsch Jan Krčál Jan Křetínský Stephan Kreutzer Sebastian Krinninger R. Krithika

Anton Krohmer Antonín Kučera Ravi Kumar Piyush Kurur Eyal Kushilevitz Martin Kutrib Roman Kuznets Jakub Łącki Peter Lammich Michael Lampis Kasper Green Larsen Yanfang Le

Bastien Le Gloannec Thierry Lecroq Axel Legay Daniel Lemire Hendrik W. Lenstra Anthony Leverrier Asaf Levin Nutan Limaye Vincent Limouzy Gennadij Liske Maciej Liśkiewicz Xiao Liu

Daniel Lokshtanov Florian Lonsing

Krzysztof Loryś Michael Ludwig Frédéric Magniez Meena Mahajan Johann Makowsky Ritankar Mandal Spyridon Maniatis Rajsekar Manokaran Sabrina Mantaci Bodo Manthey

Alberto Marchetti-Spaccamela Jerzy Marcinkowski

Russell Martin Dániel Marx Tomas Masopust Kevin Matulef Elvira Mayordomo Arne Meier Daniel Meister Stefan Mengel George Mertzios Pierre-Étienne Meunier Friedhelm Meyer auf der Heide Othon Michail

Henryk Michalewski Matúš Mihalák Samuel Mimram Matteo Mio Neeldhara Misra Tal Mizrahi Matthias Mnich Morteza Monemizadeh Walter Morris

Benjamin Moseley Amer Mouawad Jean-Yves Moyen Yannis Moysoglou Marcin Mucha

Partha Mukhopadhyay Wolfgang Mulzer Daniel Nagaj

Viswanath Nagarajan Alberto Naibo N. S. Narayanaswamy Meghana Nasre Gonzalo Navarro Alantha Newman Calvin Newport Phong Nguyen Patrick K. Nicholson

Nicolas Nisse Petr Novotný Zeev Nutov Jan Obdržálek Alexander Okhotin Alberto Ordóñez Sebastian Ordyniak Sigal Oren

Rotem Oshman Yota Otachi Youssouf Oualhadj Kenta Ozeki Katarzyna Paluch Konstantinos Panagiotou Fahad Panolan

Evanthia Papadopoulou Charles Paperman Mike Paterson Boaz Patt-Shamir Christophe Paul Daniel Paulusma Arno Pauly

Emmanuel Paviot-Adet Ami Paz

Lehilton L. C. Pedrosa Andrzej Pelc

Pablo Pérez-Lantero Dominique Perrin Leonid Petrov Giovanni Pighizzini Michał Pilipczuk Chris Pinkau Marek Piotrów Thomas Place Sebastian Pokutta Valentin Polishchuk Natacha Portier Cristian Prisacariu Ariel Procaccia Kirk Pruhs Simon Puglisi Mikaël Rabie M. S. Ramanujan Narad Rampersad Ramyaa Ramyaa Mickael Randour B. V. Raghavendra Rao Baharak Rastegari Saurabh Ray

Jean-Florent Raymond

Oded Regev Vojtěch Řehák Eric Remila

Pierre-Alain Reynier Gaétan Richard David Richerby Liam Roditty Martin Roetteler Heiko Röglin Adi Rosen Günter Rote Sasanka Roy Alan Roytman Michał Różański Philipp Rümmer Ignaz Rutter Aleksi Saarela Benjamin Sach Sigve Hortemo Sæther Ville Salo

Arnaud Sangnier Piotr Sankowski Kanthi Sarpatwar Ignasi Sau Nitin Saurabh Saket Saurabh Guido Schaefer Marcus Schaefer Patrick Scharpfenecker Christian Scheffer Christian Scheideler Sven Schewe Maximilian Schlund Markus L. Schmid Dominique Schmitt Sylvain Schmitz Henning Schnoor Roy Schwartz Luc Segoufin Géraud Sénizergues Olivier Serre Jiří Sgall Paul Shafer Chintan Shah Mordechai Shalom Asaf Shapira John Shareshi Alexander Shen Arseny Shur

Anastasios Sidiropoulos

Laurent Simon Rakesh Sinha Naveen Sivadasan Alexander Skopalik Shakhar Smorodinsky Christian Sohler Shay Solomon Eric Sopena

Troels Bjerre Sørensen Jiří Srba

Srikanth Srinivasan Grzegorz Stachowiak Gawiejnowicz Stanislaw Daniel Stefankovic Eckhard Steffen Damien Stehlé Darren Strash Howard Straubing Hsin-Hao Su Scott Summers Grégoire Sutre Stefan Szeider Luis Tabera Avishay Tal Navid Talebanfard Tami Tamir Pingzhong Tang Till Tantau Gabor Tardos Hanjo Täubig Sébastien Tavenas Balder ten Cate Lidia Tendera Véronique Terrier Raghunath Tewari Abhradeep Thakurta Johan Thapper Guillaume Theyssier Erez Timnat

Alexander Tiskin Stefan Toman Jacobo Torán Eric Torng Dave Touchette

Craig Tovey Henry Towsner Ashutosh Trivedi Torsten Ueckerdt Seeun Umboh Pierre Valarcher Leo van Iersel

Erik Jan van Leeuwen Dieter van Melkebeek Rob van Stee

Anke van Zuylen Adi Vardi Shai Vardi

Sergei Vassilvitskii Sander Verdonschot José Verschae

Aravindan Vijayaraghavan Tobias Walter

Haitao Wang Justin Ward Xiangzhi Wei Jeremias Weihmann Armin Weiss

Matthias Westermann James Wilson

Maximilian Witek Philipp Woelfel Alexander Wolff Damien Woods James Worrell Thomas Worsch Zhilin Wu

Christian Wulff-Nilsen Tim Wylie

Mingyu Xiao G Xu Li Yan

Yuichi Yoshida Victor Zamaraev Meirav Zehavi Marc Zeitoun Jie Zhang Yuan Zhou

Invited talks

Overcoming Intractability in Unsupervised Learning
Sanjeev Arora

The Complexity of Constraint Satisfaction Problems
Manuel Bodirsky

Parallel Algorithms Reconsidered
Peter Sanders

Tutorials

Computational Social Choice
Felix Brandt

Algorithmic Game Theory
Paul Goldberg

Regular contributions

The Minimum Oracle Circuit Size Problem
Eric Allender, Dhiraj Holden, and Valentine Kabanets

Graph Searching Games and Width Measures for Directed Graphs
Saeed Akhoondian Amiri, Łukasz Kaiser, Stephan Kreutzer, Roman Rabinovich, and Sebastian Siebertz

Subset Sum in the Absence of Concentration
Per Austrin, Petteri Kaski, Mikko Koivisto, and Jesper Nederlof

On Sharing, Memoization, and Polynomial Time
Martin Avanzini and Ugo Dal Lago

Proof Complexity of Resolution-based QBF Calculi
Olaf Beyersdorff, Leroy Chew, and Mikoláš Janota

Welfare Maximization with Friends-of-Friends Network Externalities
Sayan Bhattacharya, Wolfgang Dvořák, Monika Henzinger, and Martin Starnberger

Markov Decision Processes and Stochastic Games with Total Effective Payoff
Endre Boros, Khaled Elbassioni, Vladimir Gurvich, and Kazuhisa Makino

Advice Complexity for a Class of Online Problems
Joan Boyar, Lene M. Favrholdt, Christian Kudahl, and Jesper W. Mikkelsen

Las Vegas Computability and Algorithmic Randomness
Vasco Brattka, Guido Gherardi, and Rupert Hölzl

Understanding Model Counting for β-acyclic CNF-formulas
Johann Brault-Baron, Florent Capelli, and Stefan Mengel

Parameterized Complexity Dichotomy for Steiner Multicut
Karl Bringmann, Danny Hermelin, Matthias Mnich, and Erik Jan van Leeuwen

Solving Totally Unimodular LPs with the Shadow Vertex Algorithm
Tobias Brunsch, Anna Großwendt, and Heiko Röglin

Improved Local Search for Geometric Hitting Set
Norbert Bus, Shashwat Garg, Nabil H. Mustafa, and Saurabh Ray

Arc Diagrams, Flip Distances, and Hamiltonian Triangulations
Jean Cardinal, Michael Hoffmann, Vincent Kusters, Csaba D. Tóth, and Manuel Wettstein

Tractable Probabilistic µ-Calculus That Expresses Probabilistic Temporal Logics
Pablo Castro, Cecilia Kilmurray, and Nir Piterman

Tribes Is Hard in the Message Passing Model
Arkadev Chattopadhyay and Sagnik Mukhopadhyay

Network Design Problems with Bounded Distances via Shallow-Light Steiner Trees
Markus Chimani and Joachim Spoerhase

Combinatorial Expressions and Lower Bounds
Thomas Colcombet and Amaldev Manuel

Construction of µ-Limit Sets of Two-dimensional Cellular Automata
Martin Delacourt and Benjamin Hellouin de Menibus

Derandomized Graph Product Results Using the Low Degree Long Code
Irit Dinur, Prahladh Harsha, Srikanth Srinivasan, and Girish Varma

Space-efficient Basic Graph Algorithms
Amr Elmasry, Torben Hagerup, and Frank Kammer

Pattern Matching with Variables: Fast Algorithms and New Hardness Results
Henning Fernau, Florin Manea, Robert Mercaş, and Markus L. Schmid

Approximating the Generalized Terminal Backup Problem via Half-integral Multiflow Relaxation
Takuro Fukunaga

On Matrix Powering in Low Dimensions
Esther Galby, Joël Ouaknine, and James Worrell

The Complexity of Recognizing Unique Sink Orientations
Bernd Gärtner and Antonis Thomas

New Geometric Representations and Domination Problems on Tolerance and Multitolerance Graphs
Archontia C. Giannopoulou and George B. Mertzios

Comparing 1D and 2D Real Time on Cellular Automata
Anaël Grandjean and Victor Poupet

Tropical Effective Primary and Dual Nullstellensätze
Dima Grigoriev and Vladimir V. Podolskii

Upper Tail Estimates with Combinatorial Proofs
Jan Hązła and Thomas Holenstein

Minimum Cost Flows in Graphs with Unit Capacities
Andrew V. Goldberg, Haim Kaplan, Sagi Hed, and Robert E. Tarjan

Inductive Inference and Reverse Mathematics
Rupert Hölzl, Sanjay Jain, and Frank Stephan

Dynamic Planar Embeddings of Dynamic Graphs
Jacob Holm and Eva Rotenberg

On the Information Carried by Programs about the Objects They Compute
Mathieu Hoyrup and Cristóbal Rojas

Communication Complexity of Approximate Matching in Distributed Graphs
Zengfeng Huang, Božidar Radunović, Milan Vojnović, and Qin Zhang

Stochastic Scheduling of Heavy-tailed Jobs
Sungjin Im, Benjamin Moseley, and Kirk Pruhs

On Finding the Adams Consensus Tree
Jesper Jansson, Zhaoxian Li, and Wing-Kin Sung

Flip Distance Is in FPT Time O(n + k·c^k)
Iyad Kanj and Ge Xia

New Pairwise Spanners
Telikepalli Kavitha

Multi-k-ic Depth Three Circuit Lower Bound
Neeraj Kayal and Chandan Saha

Automorphism Groups of Geometrically Represented Graphs
Pavel Klavík and Peter Zeman

Correlation Clustering and Two-edge-connected Augmentation for Planar Graphs
Philip N. Klein, Claire Mathieu, and Hang Zhou

Extended Formulation Lower Bounds via Hypergraph Coloring?
Stavros G. Kolliopoulos and Yannis Moysoglou

Lempel-Ziv Factorization May Be Harder Than Computing All Runs
Dmitry Kosolobov

Visibly Counter Languages and Constant Depth Circuits
Andreas Krebs, Klaus-Jörn Lange, and Michael Ludwig

Optimal Decremental Connectivity in Planar Graphs
Jakub Łącki and Piotr Sankowski

Testing Small Set Expansion in General Graphs
Angsheng Li and Pan Peng

Paid Exchanges are Worth the Price
Alejandro López-Ortiz, Marc P. Renault, and Adi Rosén

Undecidability in Binary Tag Systems and the Post Correspondence Problem for Five Pairs of Words
Turlough Neary

Separation and the Successor Relation
Thomas Place and Marc Zeitoun

Computing 2-Walks in Polynomial Time
Andreas Schmid and Jens M. Schmidt

Towards an Isomorphism Dichotomy for Hereditary Graph Classes
Pascal Schweitzer

Existential Second-order Logic over Graphs: A Complete Complexity-theoretic Classification
Till Tantau

The Returning Secretary
Shai Vardi

Homomorphism Reconfiguration via Homotopy
Marcin Wrochna

Computing Downward Closures for Stacked Counter Automata
Georg Zetzsche

Overcoming Intractability in Unsupervised Learning

Sanjeev Arora

Computer Science Department, Princeton University
arora@princeton.edu

Abstract

Unsupervised learning – i.e., learning with unlabeled data – is increasingly important given today’s data deluge. Most natural problems in this domain – e.g. for models such as mixture models, HMMs, graphical models, topic models and sparse coding/dictionary learning, deep learning – are NP-hard. Therefore researchers in practice use either heuristics or convex relaxations with no concrete approximation bounds. Several nonconvex heuristics work well in practice, which is also a mystery.

The talk will describe a sequence of recent results whereby rigorous approaches leading to polynomial running time are possible for several problems in unsupervised learning. The proof of polynomial running time usually relies upon nondegeneracy assumptions on the data and the model parameters, and often also on stochastic properties of the data (average-case analysis). We describe results for topic models, sparse coding, and deep learning. Some of these new algorithms are very efficient and practical – e.g. for topic modeling.

1998 ACM Subject Classification F.2 Analysis of Algorithms and Problem Complexity, I.2 Artificial Intelligence

Keywords and phrases machine learning, unsupervised learning, intractability, NP-hardness

Digital Object Identifier 10.4230/LIPIcs.STACS.2015.1

Category Invited Talk

The Complexity of Constraint Satisfaction Problems∗

Manuel Bodirsky

Institut für Algebra, Technische Universität Dresden, Germany
Manuel.Bodirsky@tu-dresden.de

Abstract

The tractability conjecture for constraint satisfaction problems (CSPs) describes the constraint languages over a finite domain whose CSP can be solved in polynomial time. The precise formulation of the conjecture uses basic notions from universal algebra. In this talk, we give a short introduction to the universal-algebraic approach to the study of the complexity of CSPs.

Finally, we discuss attempts to generalise the tractability conjecture to large classes of constraint languages over infinite domains, in particular for constraint languages that arise in qualitative temporal and spatial reasoning.

1998 ACM Subject Classification F.4.1 Mathematical Logic

Keywords and phrases constraint satisfaction, universal algebra, model theory, clones, temporal and spatial reasoning

Digital Object Identifier 10.4230/LIPIcs.STACS.2015.2

Category Invited Talk

1 The Constraint Satisfaction Problem

Constraint satisfaction problems are computational problems that can be formalised in several equivalent ways. A mathematically convenient way is to view CSPs as structural homomorphism problems, as follows. Fix a structure Γ with a finite relational signature τ. The domain of Γ need not be finite for the following computational problem to be well-defined.

Definition 1 (CSP(Γ)). The constraint satisfaction problem for Γ, denoted by CSP(Γ), is the computational problem of deciding, for a given finite τ-structure A, whether there exists a homomorphism from A to Γ.

The fixed structure Γ is often referred to as the constraint language of the constraint satisfaction problem, since we choose from the relations in Γ to formulate our constraints in the input structure A. We give some concrete examples of CSPs.

1. Graph n-colorability can be formulated as CSP(K_n), where K_n is the complete loopless graph on n vertices.

2. The question whether a given finite digraph is acyclic, i.e., does not contain a directed cycle, can be formulated as CSP(Q;<).

3. The question whether a given directed graph has a vertex bipartition such that both parts are acyclic can be formulated as CSP(N;E) where

E := {(a, b) ∈ N² | a < b or (a − b) is odd}.

∗ The author has received funding from the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007-2013 Grant Agreement no. 257039).

licensed under Creative Commons License CC-BY

32nd Symposium on Theoretical Aspects of Computer Science (STACS 2015).

Editors: Ernst W. Mayr and Nicolas Ollinger; pp. 2–9

Leibniz International Proceedings in Informatics

Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany

4. CSP(R;≤, A, O) for A := {(a, b, c) ∈ R³ | a+b = c} and O := {1} is essentially the feasibility problem for linear programs (see [5]).

The list can be extended easily, and contains a variety of problems that have appeared in the literature throughout theoretical computer science.
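For a finite template, the homomorphism formulation of Definition 1 can be tried out directly. The following Python sketch is only a brute-force illustration (exponential in the size of the input), restricted to a single binary relation; the function names `find_homomorphism` and `complete_graph` and the edge-list encoding are assumptions made for this example and do not come from the text. It decides graph 3-colorability as CSP(K_3) on two small inputs.

```python
from itertools import product

def find_homomorphism(a_vertices, a_edges, g_vertices, g_edges):
    """Return a homomorphism from (a_vertices; a_edges) to
    (g_vertices; g_edges) as a dict, or None if there is none."""
    g_edge_set = set(g_edges)
    a_vertices = list(a_vertices)
    # Try every map from the input structure into the template.
    for images in product(g_vertices, repeat=len(a_vertices)):
        h = dict(zip(a_vertices, images))
        if all((h[u], h[v]) in g_edge_set for (u, v) in a_edges):
            return h
    return None

def complete_graph(n):
    """Vertices and (symmetric) edge relation of the loopless graph K_n."""
    vertices = list(range(n))
    edges = [(u, v) for u in vertices for v in vertices if u != v]
    return vertices, edges

k3_vertices, k3_edges = complete_graph(3)
k4_vertices, k4_edges = complete_graph(4)

# The 5-cycle, stored as a symmetric edge relation.
c5 = [(i, (i + 1) % 5) for i in range(5)]
c5_edges = c5 + [(v, u) for (u, v) in c5]

c5_coloring = find_homomorphism(range(5), c5_edges, k3_vertices, k3_edges)
k4_coloring = find_homomorphism(k4_vertices, k4_edges, k3_vertices, k3_edges)
```

Here `c5_coloring` is a proper 3-coloring of the 5-cycle, while `k4_coloring` is `None` because K_4 has no homomorphism to K_3.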

There is a great amount of work about the computational complexity of CSP(Γ) when Γ is a finite structure (i.e., has a finite domain), stimulated by the following dichotomy conjecture.

Conjecture 1 (Feder and Vardi [18]). Let Γ be a finite structure with a finite relational signature. Then CSP(Γ) is in P or NP-complete.

2 The Universal-Algebraic Approach

The central notion for the universal-algebraic approach is the notion of a polymorphism of a constraint language Γ. A polymorphism of Γ is a homomorphism h from a finite power of Γ into Γ. In other words, when h has arity k, we require for all relations R of Γ and all tuples (a¹₁, . . . , a¹_n) ∈ R, . . . , (a^k₁, . . . , a^k_n) ∈ R that (h(a¹₁, . . . , a^k₁), . . . , h(a¹_n, . . . , a^k_n)) ∈ R.

Unary polymorphisms are also known as endomorphisms. Thus, polymorphisms generalise endomorphisms, and endomorphisms generalise automorphisms of Γ. We write Pol(Γ) for the set of all polymorphisms of Γ, and Aut(Γ) for the set of all automorphisms of Γ.
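The componentwise condition in the definition of a polymorphism can be checked mechanically for a finite relation. The sketch below is a hedged illustration: the helper name `is_polymorphism` and the two example relations on {0, 1} are assumptions chosen for this example. It tests whether the binary operation min preserves the order relation ≤ and the disequality relation (the edge relation of K_2).

```python
from itertools import product

def is_polymorphism(op, arity, relation):
    """Check that applying op componentwise to any `arity` tuples
    of `relation` again yields a tuple of `relation`."""
    relation = set(relation)
    for rows in product(relation, repeat=arity):
        # zip(*rows) lists the columns (a^1_j, ..., a^k_j) of the chosen rows.
        image = tuple(op(*column) for column in zip(*rows))
        if image not in relation:
            return False
    return True

leq = {(0, 0), (0, 1), (1, 1)}   # the order relation on {0, 1}
neq = {(0, 1), (1, 0)}           # the edge relation of K_2

min_preserves_leq = is_polymorphism(min, 2, leq)
min_preserves_neq = is_polymorphism(min, 2, neq)
```

The first check succeeds, since min is monotone in both arguments. The second fails: applying min to the rows (0, 1) and (1, 0) gives (0, 0), which is not in the disequality relation.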

The following result for structures with a finite domain, which relies on a fundamental theorem in universal algebra [19, 16], hints at the relevance of polymorphisms for CSPs.

Theorem 2 ([23]). Let Γ₁ and Γ₂ be finite structures with the same domain and finite relational signatures such that Pol(Γ₁) ⊆ Pol(Γ₂). Then there is a deterministic linear-time many-one reduction from CSP(Γ₂) to CSP(Γ₁).

Theorem 2 has an important strengthening, Theorem 3 below, which is particularly useful when we want to reduce between CSPs whose constraint languages have different domains. Let us first mention that the set Pol(Γ) is a function clone. A function clone is a set S of functions of finite arity that

is closed under composition: for k-ary g ∈ S and l-ary f₁, . . . , f_k ∈ S, the l-ary function g(f₁, . . . , f_k) given by (x₁, . . . , x_l) ↦ g(f₁(x₁, . . . , x_l), . . . , f_k(x₁, . . . , x_l)) is also in S, and

contains the projections π^k_i given by (x₁, . . . , x_k) ↦ x_i.

A map ξ : Pol(Γ₁) → Pol(Γ₂) is called a clone homomorphism if for all g, f₁, . . . , f_k ∈ Pol(Γ₁)

ξ(g(f₁, . . . , f_k)) = ξ(g)(ξ(f₁), . . . , ξ(f_k))

and ξ(π^k_i) = π^k_i for all 1 ≤ i ≤ k. A clone isomorphism is a bijective clone homomorphism.
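The two closure properties of a function clone, composition and projections, can be written out in a few lines of Python. This is a minimal sketch for intuition only; the names `compose`, `projection`, and `majority` are assumptions of the example, and the ternary majority operation on {0, 1} is used merely as a convenient test function.

```python
from itertools import product

def compose(g, fs):
    """Return (x_1, ..., x_l) -> g(f_1(x_1, ..., x_l), ..., f_k(x_1, ..., x_l))."""
    def composed(*xs):
        return g(*(f(*xs) for f in fs))
    return composed

def projection(i):
    """Projection onto the i-th argument (1-indexed), for any arity >= i."""
    return lambda *xs: xs[i - 1]

def majority(x, y, z):
    """Ternary majority operation on {0, 1}."""
    return 1 if x + y + z >= 2 else 0

# majority(pi_1, pi_1, pi_2) is a binary function; it coincides with pi_1.
h = compose(majority, [projection(1), projection(1), projection(2)])
h_is_first_projection = all(h(x, y) == x for x, y in product((0, 1), repeat=2))

# Composing majority with the three ternary projections gives majority back.
m = compose(majority, [projection(1), projection(2), projection(3)])
m_is_majority = all(m(*xs) == majority(*xs) for xs in product((0, 1), repeat=3))
```

Both checks evaluate to `True`. The second one illustrates why a clone homomorphism is required to fix the projections: they act as the identities for composition.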

Theorem 3. Suppose that Γ₁ and Γ₂ are finite structures with finite relational signatures such that there exists a clone isomorphism between Pol(Γ₁) and Pol(Γ₂). Then CSP(Γ₁) and CSP(Γ₂) are equivalent under deterministic linear-time many-one reductions.

3 The Finite Domain Tractability Conjecture

Theorem 3 from the previous section tells us that the computational complexity of CSP(Γ) is coded into the equations that hold on the polymorphisms. We even have a candidate equation that might characterise the CSPs in P.

Theorem 4 ([17, 28, 21]). Suppose that Γ is a finite structure. Then Γ has a Taylor¹ polymorphism f, that is, when f has arity n then it satisfies for every i ≤ n an equation of the form

∀x, y. f(x₁, . . . , x_n) = f(y₁, . . . , y_n),

where x₁, . . . , x_n, y₁, . . . , y_n ∈ {x, y} and x_i ≠ y_i, or there is a structure Γ′ obtained from Γ by dropping all but finitely many relations such that CSP(Γ′) is NP-complete.

The condition given in Theorem 4 has been improved recently: the existence of a Taylor polymorphism is equivalent to the existence of an operation that satisfies an equation that is much easier to grasp.

Theorem 5 ([1]). A finite structure Γ has a Taylor polymorphism if and only if it has a cyclic polymorphism f, that is, f has arity n ≥ 2 and satisfies

∀x₁, . . . , x_n. f(x₁, . . . , x_n) = f(x₂, . . . , x_n, x₁).
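For very small templates, cyclic polymorphisms of a fixed arity can be searched for exhaustively. The following Python sketch is an illustration under stated assumptions: all helper names are hypothetical, the search enumerates every operation of the given arity and is therefore only feasible for tiny domains, and a failed search at one arity says nothing about other arities. It looks for binary cyclic polymorphisms of K_3 and ternary cyclic polymorphisms of K_2.

```python
from itertools import product

def is_polymorphism(op, arity, relation):
    """Componentwise application of op to tuples of `relation` stays in it."""
    relation = set(relation)
    return all(
        tuple(op(*column) for column in zip(*rows)) in relation
        for rows in product(relation, repeat=arity)
    )

def is_cyclic(op, arity, domain):
    """Check f(x_1, ..., x_n) = f(x_2, ..., x_n, x_1) on the whole domain."""
    return all(
        op(*xs) == op(*(xs[1:] + xs[:1]))
        for xs in product(domain, repeat=arity)
    )

def operations(domain, arity):
    """Yield every operation domain^arity -> domain, backed by a lookup table."""
    inputs = list(product(domain, repeat=arity))
    for outputs in product(domain, repeat=len(inputs)):
        table = dict(zip(inputs, outputs))
        yield lambda *xs, table=table: table[xs]

def cyclic_polymorphisms(domain, relation, arity):
    return [
        op for op in operations(domain, arity)
        if is_cyclic(op, arity, domain) and is_polymorphism(op, arity, relation)
    ]

k2 = {(0, 1), (1, 0)}
k3 = {(u, v) for u in range(3) for v in range(3) if u != v}

k3_binary = cyclic_polymorphisms(range(3), k3, 2)
k2_ternary = cyclic_polymorphisms(range(2), k2, 3)
```

The list `k3_binary` is empty: a binary cyclic operation satisfies f(0, 1) = f(1, 0), whereas preserving the edges of K_3 forces these two values to differ. The list `k2_ternary` is non-empty; for instance, (x, y, z) ↦ x + y + z mod 2 is a cyclic polymorphism of K_2.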

The following conjecture has been made in a different form by Bulatov, Jeavons, and Krokhin [17]; the formulation given below is equivalent by well-known facts. The conjecture complements Theorem 4, and its truth would settle the dichotomy conjecture.

Conjecture 2 (Tractability Conjecture). Let Γ be a finite structure with finite relational signature and a Taylor (or, equivalently, cyclic) polymorphism. Then CSP(Γ) is in P.

4 Infinite Domains

The universal-algebraic approach can be generalised to constraint languages Γ over infinite domains. This generalisation is most straightforward when the automorphism group of Γ is large, in the following sense.

Definition 6. A permutation group G on a set X is called oligomorphic if the componentwise action of G on Xⁿ has only finitely many orbits, for all n ∈ N.

An example of an oligomorphic permutation group is the automorphism group of (Q;<).
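To see why this group is oligomorphic, one can count its orbits on n-tuples for small n. The sketch below relies on a standard fact that is not stated in the text: two n-tuples of rationals lie in the same orbit of Aut(Q; <) exactly when they have the same order type, that is, the same pattern of <, =, > between their coordinates. The function names `order_type` and `count_orbits` are assumptions of this example. Since an n-tuple has at most n distinct entries, tuples over {0, . . . , n−1} already realise every order type.

```python
from itertools import product

def order_type(t):
    """Replace each entry by its rank among the distinct entries of t."""
    ranks = {value: rank for rank, value in enumerate(sorted(set(t)))}
    return tuple(ranks[value] for value in t)

def count_orbits(n):
    """Number of order types of n-tuples, i.e. orbits of Aut(Q; <) on Q^n."""
    return len({order_type(t) for t in product(range(n), repeat=n)})

orbit_counts = [count_orbits(n) for n in range(1, 5)]
# orbit_counts == [1, 3, 13, 75]
```

The counts are finite for every n, as Definition 6 requires. By contrast, any two distinct natural numbers a < b are separated from other pairs by their difference b − a under the automorphisms of (N; <), so that group has infinitely many orbits already on pairs.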

Countable structures Γ with an oligomorphic automorphism group are well known to model theorists: by a theorem independently due to Ryll-Nardzewski, Engeler, and Svenonius (see, e.g., [22]), these are precisely the countable structures that are ω-categorical, that is, Γ has the property that all countable models of the first-order theory of Γ are isomorphic to Γ.

A versatile method to construct ω-categorical structures is via Fraïssé-limits, and taking reducts, which we briefly recall here. We need the standard notion of homogeneity (sometimes called ultrahomogeneity) from model theory. A structure Γ is called homogeneous if all isomorphisms between finite substructures can be extended to automorphisms of Γ.

Homogeneous structures with finite relational signature are ω-categorical [22]. Homogeneous structures are uniquely given by their age, which is the class of finite structures that embed into them. The age of a homogeneous structure must have the amalgamation property (we again refer to [22]), and every amalgamation class C gives rise to a homogeneous structure of age C. The fundamental model theory of homogeneous structures goes back to Fraïssé, and hence the unique homogeneous structure for a given amalgamation class is called the Fraïssé-limit of this class.

1 Note that, contrary to what can often be found in the literature, in our definition of Taylor operations, we do not insist on idempotency of f.

A reduct of a structure ∆ is a structure Γ on the same domain such that all relations of Γ are first-order definable (without parameters) in ∆. For example, the structure (Q; Betw), where Betw := {(x, y, z) | x < y < z ∨ z < y < x} (the so-called Betweenness relation), is a reduct of (Q; <). Reducts of homogeneous structures need not be homogeneous, but reducts of ω-categorical structures remain ω-categorical.
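A reduct can have strictly more symmetries than the structure it is defined in. As a small, hedged illustration, the Python sketch below checks on a finite sample of rationals that the order-reversing map x ↦ −x preserves Betw but not <. The sample, the name `flip`, and the use of `Fraction` are assumptions of the example, and a check on a finite sample is of course not a proof.

```python
from fractions import Fraction
from itertools import product

def betw(x, y, z):
    """The Betweenness relation: y lies strictly between x and z."""
    return x < y < z or z < y < x

def flip(x):
    """The order-reversing map x -> -x on Q."""
    return -x

# A small sample of rational numbers (duplicates are harmless).
sample = [Fraction(p, q) for p in range(-3, 4) for q in (1, 2, 3)]

flip_preserves_betw = all(
    betw(flip(x), flip(y), flip(z))
    for x, y, z in product(sample, repeat=3)
    if betw(x, y, z)
)
flip_preserves_less = all(
    flip(x) < flip(y)
    for x, y in product(sample, repeat=2)
    if x < y
)
```

On this sample `flip_preserves_betw` is `True` and `flip_preserves_less` is `False`, consistent with x ↦ −x being an automorphism of (Q; Betw) but not of (Q; <).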

When Γ is ω-categorical, then the complexity of Γ is still coded into the polymorphisms.

Theorem 7 ([8]). Let Γ₁ and Γ₂ be ω-categorical structures with the same domain and finite relational signatures such that Pol(Γ₁) = Pol(Γ₂). Then CSP(Γ₁) and CSP(Γ₂) are equivalent under deterministic linear-time many-one reductions.

An example of a permutation group which is not oligomorphic is the automorphism group of the structure (N; E) discussed in the introduction: it has infinitely many orbits in its componentwise action on N². However, in this case it is easy to come up with a structure that has precisely the same CSP, but whose automorphism group is oligomorphic: let Q₁, Q₂ be a partition of Q such that both Q₁ and Q₂ are dense in Q, and consider the structure (Q; E′) where

E′ := {(a, b) ∈ Q² | a < b or a ∈ Q₁ ⇔ b ∈ Q₂}.

This is a frequent phenomenon: many computational problems in temporal and spatial reasoning can be formulated as CSPs, but often some extra care is needed to show that they can be formulated with ω-categorical constraint languages. A necessary and sufficient Myhill-Nerode-type condition that characterises the CSPs that can be formulated with an ω-categorical constraint language can be found in [3]. An example of a structure that does not satisfy the mentioned Myhill-Nerode-type condition is the one from Example 4 in the introduction.

Hence, CSP(R; A, O) (which is essentially the feasibility problem for linear programs) cannot be formulated as CSP(Γ) with an ω-categorical constraint language.

We do not know whether Theorem 3 remains valid for ω-categorical structures Γ, that is, whether the isomorphism type of the polymorphism clone of Γ determines the complexity of CSP(Γ). However, the theorem can be rescued by a slight modification.

Theorem 8 ([12]). Suppose that Γ₁ and Γ₂ are ω-categorical structures with finite relational signature such that there exists a clone isomorphism between Pol(Γ₁) and Pol(Γ₂) which is also a homeomorphism. Then CSP(Γ₁) and CSP(Γ₂) are equivalent under deterministic linear-time many-one reductions.

The homeomorphicity requirement in Theorem 8 is with respect to the topology of pointwise convergence on the space of all functions of finite arity, which is defined as follows.

For elements a, b₁, . . . , b_k of the domain D, define F_{a,b₁,...,b_k} := {f | f(b₁, . . . , b_k) = a}. Then the topology of pointwise convergence is the smallest topology where the open sets include {F_{a,b₁,...,b_k} | k ∈ N, a, b₁, . . . , b_k ∈ D}. It is a basic fact that a clone C is closed in this space, C = C̄, if and only if it is the polymorphism clone of a structure.

5 A general tractability conjecture

Cyclic polymorphisms do not characterise the tractability of the CSP for ω-categorical structures: a simple counterexample is the structure (N; ≠, I₄) where I₄ is the quaternary relation defined as I₄ := {(a, b, c, d) ∈ N⁴ | a = b ⇒ c = d}. The automorphism group of this structure is the set of all permutations of N, which is clearly oligomorphic. The polymorphisms of this structure are precisely all functions that are composed from injective functions and projections. Hence, the clone does not contain cyclic operations. But CSP(N; ≠, I₄) is easily seen to be in P; see [6].

The structure (N; ≠, I₄) has polymorphisms that are almost as good as cyclic operations: every binary injective operation f will be a polymorphism, and we can always pick an injection i from N → N such that the following holds:

∀x₁, x₂. f(x₁, x₂) = i(f(x₂, x₁)).
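Both claims can be checked mechanically on a finite sample, which illustrates them without proving them. The sketch below uses the Cantor pairing function as one concrete binary injection f (this choice is an assumption made for the example, not taken from the source), tests that it preserves I₄ on all tuples over {0, 1, 2}, and builds an injection i satisfying the identity above. All function names are illustrative.

```python
from itertools import product
from math import isqrt

def pair(x, y):
    """Cantor pairing: a bijection N x N -> N, hence an injective binary operation."""
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):
    """Inverse of the Cantor pairing."""
    w = (isqrt(8 * z + 1) - 1) // 2
    y = z - w * (w + 1) // 2
    return w - y, y

def in_I4(t):
    """Membership in I4: a = b implies c = d."""
    a, b, c, d = t
    return a != b or c == d

def swap(z):
    """The injection i with pair(x1, x2) == i(pair(x2, x1))."""
    x, y = unpair(z)
    return pair(y, x)

def preserves_I4(f, values):
    """Check that applying f componentwise to two tuples of I4 stays in I4."""
    tuples = [t for t in product(values, repeat=4) if in_I4(t)]
    return all(
        in_I4(tuple(f(s[k], t[k]) for k in range(4)))
        for s in tuples
        for t in tuples
    )

if __name__ == "__main__":
    print(preserves_I4(pair, range(3)))
    print(preserves_I4(lambda x, y: x + y, range(3)))
```

Addition, which is not injective, fails the test: applied to (0, 1, 0, 1) and (1, 0, 0, 0) it yields (1, 1, 0, 1), which is not in I₄.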

We also have to describe an obstruction to general algorithmic results for the class of all ω-categorical structures. Henson [20] constructed uncountably many homogeneous directed graphs Γ, and all of these directed graphs have distinct CSPs. Since there are only countably many algorithms, there must be directed graphs in this class with an undecidable CSP. There are also CSPs of various intermediate complexities [2]. All of Henson’s digraphs have a binary polymorphism f and endomorphisms e₁, e₂ satisfying

∀x₁, x₂. e₁(f(x₁, x₂)) = e₂(f(x₂, x₁)),

that is, from a universal-algebraic perspective, they all ‘look like easy CSPs’, but they are not.

Henson’s directed homogeneous graphs are based on forbidding infinite families of finite structures. On the other hand, the ω-categorical structures that appear ‘in nature’ (either in mathematics or to formulate computational problems as CSPs) can typically be described by forbidding only finitely many finite structures. More formally, we say that a homogeneous structure Γ is finitely bounded if there exists a finite set F of finite structures such that the age of Γ is given as the class of all finite structures that do not embed any of the structures from F. We now generalise the tractability conjecture by modifying the idea of Taylor polymorphisms so that it involves outside applications of endomorphisms, as follows.

Conjecture 3. Let Γ be the reduct of a finitely bounded homogeneous structure. If Γ has a polymorphism f of arity n ≥ 2 such that for every i ≤ n there are endomorphisms e₁, e₂ and x₁, . . . , x_n, y₁, . . . , y_n ∈ {x, y} with x_i ≠ y_i such that f satisfies

e₁(f(x₁, . . . , x_n)) = e₂(f(y₁, . . . , y_n))

then CSP(Γ) is in P. Otherwise, CSP(Γ) is NP-complete.

The conjecture has been verified for several classes of ω-categorical structures:

All reducts of (Q;<) in [7];

All reducts of the Random graph (the Fraïssé-limit of the class of all finite graphs) in [10];

All reducts of the homogeneous equivalence relation with infinitely many infinite classes in [15].

The strongest tool we have for attacking this conjecture will be introduced in the next section.

6 Ramsey Theory

The complexity classification results for ω-categorical structures mentioned in Section 5 rely on results from structural Ramsey theory. We say that a homogeneous structure Γ is Ramsey if for all finite substructures A and B of Γ and every colouring of the embeddings of A into Γ with finitely many colours, there exists an embedding e: B → Γ such that all embeddings of A into e(B) have the same colour. Examples of homogeneous Ramsey structures are


(Q;<);

the ordered Random graph, and other generically ordered Fraïssé-limits of so-called free amalgamation classes (examples are ordered versions of the Henson digraphs) [27];

the convexly ordered homogeneous binary branching C-relation, and other tree-like structures [25, 26];

the lexicographically ordered vector space over a finite field (see [24]);

the lexicographically ordered atomless Boolean algebra (see [24]).

The fact that a structure is Ramsey can be exploited when analysing its automorphism group, endomorphism monoid, or polymorphism clone. Our usage of Ramsey theory is almost exclusively via the concept of canonical functions. For simplicity, we explain this concept for unary functions only; however, the ideas generalise straightforwardly to finitary functions; see [9] for an in-depth introduction to the method of canonical functions. A function f: Γ → Γ is called canonical if for all β ∈ Aut(Γ) we have f ∘ β ∈ {αf | α ∈ Aut(Γ)}.

When Γ is an ordered Ramsey structure, then an arbitrary function ‘looks like a canonical function on large parts of the domain’: formally, for every function f over the domain of Γ, there exists a canonical function g in {αfβ | α, β ∈ Aut(Γ)} – the canonisation lemma.

In practice, we often use a generalisation of canonisation involving constants – we refer to [9] for details. Suppose now that Γ is homogeneous in a finite relational signature. Then there are only finitely many behaviours of canonical functions, and this is essential to break classification arguments dealing with endomorphisms (and polymorphisms) into finitely many cases. We hope that canonical functions and canonization can be used to reduce Conjecture 3 to Theorem 4 and Conjecture 2.

The method of canonical functions has been used extensively in [7, 13, 4, 10, 9, 15, 11], in two contexts: complexity classification of CSPs and classification of reducts of homogeneous structures.

When is it possible to apply this method to analyse the endomorphisms (and polymorphisms) of C? We do not need C to be Ramsey; it suffices that C has a homogeneous expansion with finite relational signature which is Ramsey. The following question is therefore of essential importance.

Question 1 ([14]). Is it true that every homogeneous structure with finite relational signature has a homogeneous Ramsey expansion with finite relational signature?

Similar in spirit, we ask the following.

Question 2 ([14]). Can every ω-categorical structure be expanded to an ω-categorical structure which is Ramsey?

These questions are closely related to recent research in topological dynamics – we refer to a recent survey article for more on this connection [29]. A positive answer to Question 1 would imply that the method of Ramsey theory and canonical functions can be used to approach the tractability conjecture from Section 5 in general.

References

1 Libor Barto and Marcin Kozik. Absorbing subalgebras, cyclic terms and the constraint satisfaction problem. Logical Methods in Computer Science, 8/1(07):1–26, 2012.

2 Manuel Bodirsky and Martin Grohe. Non-dichotomies in constraint satisfaction complexity. In Luca Aceto, Ivan Damgard, Leslie Ann Goldberg, Magnús M. Halldórsson, Anna Ingólfsdóttir, and Igor Walukiewicz, editors, Proceedings of the International Colloquium on Automata, Languages and Programming (ICALP), Lecture Notes in Computer Science, pages 184–196. Springer Verlag, July 2008.

3 Manuel Bodirsky, Martin Hils, and Barnaby Martin. On the scope of the universal-algebraic approach to constraint satisfaction. Logical Methods in Computer Science (LMCS), 8(3:13), 2012. An extended abstract that announced some of the results appeared in the proceedings of Logic in Computer Science (LICS’10).

4 Manuel Bodirsky, Peter Jonsson, and Trung Van Pham. The reducts of the homogeneous binary branching C-relation. Preprint arXiv:1408.2554, 2014.

5 Manuel Bodirsky, Peter Jonsson, and Timo von Oertzen. Essential convexity and complexity of semi-algebraic constraints. Logical Methods in Computer Science, 8(4), 2012. An extended abstract about a subset of the results has been published under the title Semilinear Program Feasibility at ICALP’10.

6 Manuel Bodirsky and Jan Kára. The complexity of equality constraint languages. Theory of Computing Systems, 3(2):136–158, 2008. A conference version appeared in the proceedings of Computer Science Russia (CSR’06).

7 Manuel Bodirsky and Jan Kára. The complexity of temporal constraint satisfaction problems. Journal of the ACM, 57(2):1–41, 2009. An extended abstract appeared in the Proceedings of the Symposium on Theory of Computing (STOC’08).

8 Manuel Bodirsky and Jaroslav Nešetřil. Constraint satisfaction with countable homogeneous templates. Journal of Logic and Computation, 16(3):359–373, 2006.

9 Manuel Bodirsky and Michael Pinsker. Reducts of Ramsey structures. AMS Contemporary Mathematics, vol. 558 (Model Theoretic Methods in Finite Combinatorics), pages 489–519, 2011.

10 Manuel Bodirsky and Michael Pinsker. Schaefer’s theorem for graphs. In Proceedings of the Annual Symposium on Theory of Computing (STOC), pages 655–664, 2011. Preprint of the long version available at arxiv.org/abs/1011.2894.

11 Manuel Bodirsky and Michael Pinsker. Minimal functions on the random graph. Israel Journal of Mathematics, 200(1):251–296, 2014.

12 Manuel Bodirsky and Michael Pinsker. Topological Birkhoff. Transactions of the American Mathematical Society, 2014. To appear (electronic version is published). Preprint arxiv.org/abs/1203.1876.

13 Manuel Bodirsky, Michael Pinsker, and András Pongrácz. The 42 reducts of the random ordered graph. Preprint arXiv:1309.2165, 2013.

14 Manuel Bodirsky, Michael Pinsker, and Todor Tsankov. Decidability of definability. Journal of Symbolic Logic, 78(4):1036–1054, 2013. A conference version appeared in the Proceedings of LICS 2011, pages 321–328.

15 Manuel Bodirsky and Michał Wrona. Equivalence constraint satisfaction problems. In Proceedings of Computer Science Logic, volume 16 of LIPICS, pages 122–136. Dagstuhl Publishing, September 2012.

16 V. G. Bodnarčuk, L. A. Kalužnin, V. N. Kotov, and B. A. Romov. Galois theory for Post algebras, part I and II. Cybernetics, 5:243–539, 1969.

17 Andrei A. Bulatov, Andrei A. Krokhin, and Peter G. Jeavons. Classifying the complexity of constraints using finite algebras. SIAM Journal on Computing, 34:720–742, 2005.

18 Tomás Feder and Moshe Y. Vardi. The computational structure of monotone monadic SNP and constraint satisfaction: a study through Datalog and group theory. SIAM Journal on Computing, 28:57–104, 1999.

19 David Geiger. Closed systems of functions and predicates. Pacific Journal of Mathematics, 27:95–100, 1968.

20 C. Ward Henson. Countable homogeneous relational systems and categorical theories. Journal of Symbolic Logic, 37:494–500, 1972.


21 David Hobby and Ralph McKenzie. The structure of finite algebras, volume 76 of Contemporary Mathematics. American Mathematical Society, 1988.

22 Wilfrid Hodges. A shorter model theory. Cambridge University Press, Cambridge, 1997.

23 P. G. Jeavons. On the algebraic structure of combinatorial problems. Theoretical Computer Science, 200:185–204, 1998.

24 Alexander Kechris, Vladimir Pestov, and Stevo Todorcevic. Fraissé limits, Ramsey theory, and topological dynamics of automorphism groups. Geometric and Functional Analysis, 15(1):106–189, 2005.

25 Klaus Leeb. Vorlesungen über Pascaltheorie, volume 6 of Arbeitsberichte des Instituts für Mathematische Maschinen und Datenverarbeitung. Friedrich-Alexander-Universität Erlangen-Nürnberg, 1973.

26 Keith R. Milliken. A Ramsey theorem for trees. Journal of Combinatorial Theory, Series A, 26(3):215–237, 1979.

27 Jaroslav Nešetřil and Vojtěch Rödl. The partite construction and Ramsey set systems. Discrete Mathematics, 75(1-3):327–334, 1989.

28 Walter Taylor. Varieties obeying homotopy laws. Canadian Journal of Mathematics, 29:498–527, 1977.

29 Lionel Nguyen Van Thé. A survey on structural ramsey theory and topological dynamics with the Kechris-Pestov-Todorcevic correspondence in mind. Accepted for publication in Zb. Rad. (Beogr.), 2014. Preprint arXiv:1412.3254v2.


Parallel Algorithms Reconsidered

Peter Sanders

Karlsruhe Institute of Technology Karlsruhe, Germany

sanders@kit.edu

Abstract

Parallel algorithms were a subject of intensive algorithmic research in the 1980s. This research almost died out in the mid-1990s. In this paper we argue that it is high time to reconsider this subject since a lot of things have changed. First and foremost, parallel processing has moved from a niche application to something mandatory for any performance-critical computer application. We will also point out that even very fundamental results can still be obtained. We give examples and also formulate some open problems.

1998 ACM Subject Classification F.2 Analysis of Algorithms and Problem Complexity, F.1.2 Parallelism and Concurrency

Keywords and phrases parallel algorithm, algorithm engineering, communication efficient algorithm, polylogarithmic time algorithm, parallel machine model

Digital Object Identifier 10.4230/LIPIcs.STACS.2015.10

Category Invited Talk

1 Introduction

Parallel algorithms were a hot topic in the 1980s, but then the subject almost died. For example, a quick, subjective count of the parallel algorithm papers in STOC 1985, 1990, 1995, 2000, 2005, 2010, and 2014 gave 13, 8, 11, 6, 1, 1, and 6 papers, respectively. The left column of the following table gives a number of interrelated, very strong reasons why this happened.

However, if you also look at the right column, you see that these reasons are not relevant any more.

Parallel computing was rarely used in practice because parallel computers were expensive and hard to program due to exotic hardware and software.

Today, parallel hardware is everywhere (see Figure 1). Even smart phones have quad-core processors. The latest Intel server processors support up to 18 cores. With multiple sockets and hardware multithreading, this already ranges into three-digit numbers of threads. Graphics processors increasingly used for general purpose computing (GPGPU) have thousands of cores. For example, the Nvidia GTX 980 card has 2048 cores and needs a number of hardware threads at least an order of magnitude larger to achieve full performance.

licensed under Creative Commons License CC-BY

32nd Symposium on Theoretical Aspects of Computer Science (STACS 2015).

Editors: Ernst W. Mayr and Nicolas Ollinger; pp. 10–18

Leibniz International Proceedings in Informatics

Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany


For most programmers, it was easier to wait until the microprocessor industry provided new processor designs that translated the additional transistor budget due to Moore’s law into higher clock frequency and higher instruction parallelism.

This stopped when processor design ran against the “power wall” – it is no longer feasible to significantly increase processor clock speeds since this increases the energy consumption to a point where energy costs are too high and where cooling becomes too expensive [16]. For example, in 2004, Intel presented the Pentium 4 Prescott microarchitecture that increased clock frequency and energy consumption but not benchmark performance compared to previous models. A short time later, the Netburst line of microarchitectures used for the Pentium 4 was discontinued and Intel started to design more conservative cores, putting more and more of them on the same chip.

The actual applications of parallel computers were mostly numerical simulations that needed little of the results developed by the algorithm theory community. Exceptions (that almost prove the rule) can be found for lower level aspects like network topologies, e.g. [21].

Now, we have to look for parallelization opportunities in every performance critical application since this is the only way to exploit the available hardware. Moreover, the Big Data boom has produced many new applications outside numerical simulations where massively parallel processing is crucial. Furthermore, the methodology of algorithm engineering (e.g. [29]) makes it easier to bridge gaps between theory and practice.

The machine models used by theorists, like the PRAM model, were widely criticized as too remote from practice.

Message passing models (e.g., [41, 33]) or memory hierarchies [4] avoid some of the pitfalls of PRAMs. Moreover, PRAM algorithms are often not that impractical if we avoid the fallacy of speeding up computations at the cost of highly inefficient computing.

Most companies specializing in parallel computers went bankrupt.

Now, parallel computers are mainstream products of the big players. Big data and the cloud are at the core of the business of some of the world’s most profitable companies. Computer games proved to be a killer application (almost literally), catalyzing the use of architectures (GPUs) that would otherwise have been dismissed as too exotic and cumbersome.

In the late 1990s, the Internet boom (aka Dotcom bubble) drew parallel algorithms researchers into startups (e.g., Akamai) and into new research fields related to emerging internet applications (e.g., algorithmic game theory).

Some of these people and a new generation of researchers now look at parallel algorithms from a fresh Big Data perspective. Indeed, in 2014 there were 6 STOC papers on parallel algorithms again.

These observations indicate that parallel algorithms should be an even hotter topic than in the 1980s. It seems that today the theory community is lagging behind an important trend.

One way to explain this lack of enthusiasm is the hypothesis that, perhaps, researchers may have done a very thorough job in the past and discovered almost all the really interesting parallel algorithms that are there to discover. The main purpose of the remainder of this paper is to refute this hypothesis.

First, it should be noted that in the last two decades there have been important trends in computer science that have a largely unaddressed parallel processing aspect:

There has been a lot of work on processing large data sets in the presence of memory hierarchies (e.g., [23, 43]). Many of the techniques developed there, e.g., time forward processing (e.g., [10, 11]), do not readily translate to parallel processing and thus pose important open problems.


Figure 1 Number of processors (cores) in the world’s fastest supercomputers [40], Nvidia GPUs [44], and Intel Xeon server processors (single chip) [45]. [Plot omitted: cores (logarithmic axis, 1 to 10⁶) against year (1980 to 2015); series: fastest machine, Nvidia GPU, Intel Xeon.]

Data movement in memory hierarchies is vertical data movement between memory units at different levels. Horizontal data movement between processors in a distributed memory machine is an equally important related problem but has been studied much less. An important difference is that horizontal communication volume can be sublinear in the input size if we manage to solve problems by predominantly local computation. The resulting area of communication efficient algorithms is full of interesting open problems [33].

Streaming algorithms [18] have explicitly been developed to allow processing large data sets. However, the basic model for streaming algorithms is inherently sequential and needs parallel generalizations. See also [33].

Many succinct data structures (e.g. [17]) have been designed to handle large data sets. However, for many of them it is not clear how to construct them efficiently in parallel.

Smoothed analysis [37] is a sound way to explain why certain algorithms for hard problems are efficient in practice. However, few parallel algorithms have been analyzed in this framework.

Fixed-parameter algorithms (e.g., [25]) study efficient algorithms for “easy” instances of hard problems. Few of these algorithms have been parallelized so far.

There has been some early work on parallel approximation algorithms [22] but very little on parallelizing the vast number of approximation algorithms studied since then. It is particularly surprising that even the intensive work on scheduling parallel processors has seen very few algorithms for doing that in parallel [3, 31].

Application areas like bioinformatics or computational finance recently had a large impact on algorithmic research. Many of the investigated problems require parallel algorithms to be useful in practice.

The big data boom brought a large number of new applications into focus; in particular, algorithms for data analysis and machine learning have become important.

The energy consumption of computations is becoming more important than running time. This should become important for algorithm design, in particular for exascale computing, where the computer architects are already basing most of their design decisions on energy consumption (e.g. [20]).


Applications on exascale computers and Big data also require fault tolerance. Building that already into the algorithms is a promising research area. The algorithm theory community has done some work on resilient algorithms [14] that can survive certain memory corruptions, but it has not embraced fault tolerant parallel algorithms. This is surprising because fault tolerance is actually easier to achieve in a parallel system since intact processors may step in for faulty ones.

2 Examples from our Work

In order to illustrate that there is a bonanza of quite fundamental results on parallel algorithms still to be found, we describe a selection of our results on parallel algorithms published since 2013.

2.1 Sorting

Sorting is one of the most intensively studied algorithmic problems. It is of particular interest to parallel computing since sorting is often used to bring data together that has to be processed together. We were able to obtain several quite fundamental new results on sorting.

String sorting. This problem is practically important since many big data applications have variable length keys. The theoretical challenge here is to exploit that only distinguishing prefixes need to be inspected – indeed, sequential string sorting needs work only linear in the total distinguishing prefix size. We found no previous work on parallel string sorting except some PRAM algorithms that always inspect the entire input and are thus work-inefficient. We adapted parallel sorting algorithms for atomic objects so that they only inspect the distinguishing prefixes [8, 7].
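To make the notion of total distinguishing prefix size concrete, here is a small sequential Python sketch. It assumes the usual convention that the distinguishing prefix of a string ends at the first character that separates it from every other input string (or at the end of the string, if no such character exists). The function names are illustrative, and the code is not the parallel algorithm of [8, 7].

```python
def lcp(a, b):
    """Length of the longest common prefix of strings a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def distinguishing_prefix_size(strings):
    """Total number of characters in the distinguishing prefixes.

    After sorting, the longest common prefix of a string with any other
    input string is attained at one of its two neighbours.
    """
    s = sorted(strings)
    total = 0
    for i, cur in enumerate(s):
        longest = 0
        if i > 0:
            longest = max(longest, lcp(s[i - 1], cur))
        if i + 1 < len(s):
            longest = max(longest, lcp(cur, s[i + 1]))
        # one extra character separates cur from its closest neighbour
        total += min(len(cur), longest + 1)
    return total

if __name__ == "__main__":
    words = ["algorithm", "algebra", "parallel", "sort"]
    print(distinguishing_prefix_size(words), sum(map(len, words)))
```

For the four sample words the distinguishing prefixes are "algo", "alge", "p" and "s", 10 characters in total, compared with 28 characters of input.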

Malleable sorting. In practice, parallel programs have to share resources (e.g. processors) with other programs. Therefore, the amount of available resources for a particular program may vary over time in an online fashion. Thus parallel algorithms should be able to dynamically adapt to the amount of available resources. We have studied this phenomenon for the example of sorting and show that this yields advantages over leaving this adaptivity to the operating system [15].

Massively parallel sorting. There are many asymptotically efficient sorting algorithms running in polylogarithmic time. However, all these algorithms require the data to be moved at least a logarithmic number of times. On the other hand, there are algorithms that need to move data only once, which makes them much more practical for sorting large inputs on distributed memory machines. However, these algorithms need a linear number of message startups on the critical path, which makes them impractical for large machines. We have designed algorithms that interpolate between these two extremes – moving the data k times reduces the critical path length to k·p^(1/k) [5]. There were similar algorithms but none with a comparable worst case guarantee.
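To get a feeling for this trade-off, the following sketch evaluates the expression k·p^(1/k) for a few values of k. It assumes that p denotes the number of processors and ignores constant factors; the machine size 2²⁰ is a made-up example, not a figure from [5].

```python
def critical_path_startups(p, k):
    """Evaluate k * p**(1/k), ignoring constant factors.

    p: number of processors (assumed meaning of p)
    k: number of times the data is moved
    """
    return k * p ** (1.0 / k)

if __name__ == "__main__":
    p = 2 ** 20  # hypothetical machine size
    for k in (1, 2, 4, 5):
        print(k, round(critical_path_startups(p, k)))
```

With p = 2²⁰, moving the data once (k = 1) corresponds to about a million startups on the critical path, k = 2 to about 2048, and k = 4 to about 128.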

2.2 Data Structures

There has been a lot of work on concurrent data structures (e.g. [19]). However, much of this is very slow in the worst case since contention of operations competing for the same place in the data structure can occur. It turns out that these problems can sometimes be avoided by relaxing the data structure semantics or by considering bulk operations.

Relaxed Priority Queues. These support concurrent insertions and deletions of elements that have near minimum values. We have designed a very simple such data structure (MultiQueue) based on multiple sequential priority queues. Insertions go to random queues and deletions take the minimum from two randomly sampled queues [28]. This data structure considerably outperforms much more complicated previous data structures.
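The insertion and deletion rules just described can be sketched in a few lines of Python. This is a single-threaded illustration only: a concurrent version would need synchronisation for each queue, which is omitted here, and the class name, the seed parameter, and the fallback scan over all queues are assumptions of this sketch rather than details taken from [28].

```python
import heapq
import random

class MultiQueue:
    """Sequential sketch of a relaxed priority queue over several heaps."""

    def __init__(self, num_queues, seed=None):
        if num_queues < 2:
            raise ValueError("need at least two queues to sample from")
        self._queues = [[] for _ in range(num_queues)]
        self._rng = random.Random(seed)

    def insert(self, key):
        # insertions go to a uniformly random queue
        heapq.heappush(self._rng.choice(self._queues), key)

    def delete_min(self):
        """Remove and return a small (not necessarily the smallest) key.

        Returns None if the whole structure is empty.
        """
        sampled = self._rng.sample(self._queues, 2)
        candidates = [q for q in sampled if q]
        if not candidates:
            # simplification: scan all queues if both sampled ones are empty
            candidates = [q for q in self._queues if q]
            if not candidates:
                return None
        best = min(candidates, key=lambda q: q[0])
        return heapq.heappop(best)

if __name__ == "__main__":
    mq = MultiQueue(4, seed=1)
    for x in range(20):
        mq.insert(x)
    print([mq.delete_min() for _ in range(20)])
```

The output is close to sorted but usually not exactly sorted, which is the relaxation in semantics that the text refers to.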

Approximate Membership. Bloom filters save communication volume by providing a space efficient data structure for approximate membership queries. However, there is surprisingly little work on distributed Bloom filters. For example, a recent survey on Bloom filters in distributed systems [39] mentions no less than 23 variants but none that truly distributes the data structure over multiple processors and thus scales to the largest data sets. We have designed such a structure and apply it for communication efficient duplicate detection and database join [33].
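For readers unfamiliar with the basic data structure, here is a minimal sequential Bloom filter in Python. It is not the distributed structure of [33]; the class name, the use of SHA-256 to derive bit positions, and the parameters are assumptions made for illustration.

```python
import hashlib

class BloomFilter:
    """Approximate set membership: no false negatives, some false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # derive num_hashes bit positions by salting the item with an index
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

if __name__ == "__main__":
    bf = BloomFilter(num_bits=1024, num_hashes=3)
    bf.add("alice")
    print("alice" in bf)
```

In a duplicate detection setting, a processor could send such a compact bit array instead of the full keys, at the price of occasionally reporting a key as present when it is not.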

2.3 Graph Algorithms

Multi-objective Shortest Paths. This is an intensively studied problem of high practical relevance where parallelization is attractive since it is computationally much more expensive than the standard single-objective case. While the latter problem is difficult to parallelize in the worst case, we have shown that the additional work due to the added objectives is easy to parallelize. Indeed, a very simple generalization of Dijkstra’s well known single-objective algorithm turns out to be a scalable parallel algorithm requiring the same number n of iterations [32]. This algorithm also works well in practice [13]. Another interesting aspect of this problem is that it combines graphs and computational geometry.

Maximal matchings. We give a simple linear work polylogarithmic time algorithm for computing maximal matchings in [9]. The algorithm also computes 1/2-approximations of weighted matchings and works well in practice.
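As a point of reference for the problem statement (this is not the parallel algorithm of [9]), the classical sequential greedy method scans the edges by decreasing weight and keeps every edge whose endpoints are both still free. The result is a maximal matching whose weight is at least half of the optimum. Function and variable names below are illustrative.

```python
def greedy_matching(edges):
    """Sequential greedy matching.

    edges: iterable of (u, v, weight) triples.
    Returns the list of chosen edges, heaviest first.
    """
    matched = set()
    matching = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        # keep the edge only if both endpoints are still unmatched
        if u != v and u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
    return matching

if __name__ == "__main__":
    path = [("a", "b", 3), ("b", "c", 4), ("c", "d", 3)]
    print(greedy_matching(path))
```

On the sample path, greedy keeps only the middle edge of weight 4, while the optimum takes the two outer edges with total weight 6, so the factor 1/2 is not far from tight even on tiny inputs.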

Graph partitioning. The task is to partition the vertex set of a graph into k pieces of about equal size such that the number of cut edges is small. This is a frequently needed (NP-hard) problem that is particularly important for processing graphs in parallel. Our partitioner KaHIP [34] yields the highest quality worldwide for a large spectrum of inputs including some of the largest inputs considered so far [1, 24]. The algorithms used are complex heuristics combining many techniques. What is interesting from an algorithm theory point of view is that the practical quality improvements we achieve are in large parts due to integrating solvers for graph theoretical subproblems for which polynomial time algorithms are known. For example, this includes maximum flows, strongly connected components, negative cycle detection, or edge coloring.

2.4 Linear Algebra

One criticism of classical PRAM algorithms is not so much founded in the machine model but in the striving for polylogarithmic execution time even at the cost of inefficient algorithms.

Examples are algorithms for matrix inversion and related problems. Theoretical research has found polylogarithmic time inefficient algorithms whereas the algorithms used
