10/7/2014 Constrainedness of Search Toby Walsh NICTA and UNSW http://cse.unsw.edu.au/~tw

Motivation
Will a problem be satisfiable or unsatisfiable?
Will it be hard or easy?
How can we develop heuristics for a new problem?

Take home messages
Hard problems are often associated with a phase transition
  • Under-constrained: easy
  • Critically constrained: hard
  • Over-constrained: easier

Provide a definition of constrainedness
Predict location of such phase transitions
Can be measured during search
  • Observe beautiful “knife-edge”
Build heuristics to get off this knife-edge

Let’s start with the mother of all NP-complete problems!

3-SAT
Where are the hard 3-SAT problems?
Sample randomly generated 3-SAT
  • Fix number of clauses, l
  • Number of variables, n
  • By definition, each clause has 3 variables
  • Generate all possible clauses with uniform probability
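The sampling scheme on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the generator used for the experiments in the talk: the function names `random_3sat`, `satisfiable` and `fraction_sat` are hypothetical, clauses are assumed to use three distinct variables with each literal negated with probability 1/2, and satisfiability is checked by brute force, so it is only usable for small n.

```python
import random
from itertools import product

def random_3sat(n, l, rng=None):
    """Sample l clauses over variables 1..n.

    Each clause picks 3 distinct variables uniformly at random and negates
    each with probability 1/2, so every possible 3-clause is equally likely.
    A literal is a signed integer: +v for v, -v for "not v".
    """
    rng = rng or random.Random()
    clauses = []
    for _ in range(l):
        variables = rng.sample(range(1, n + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return clauses

def satisfiable(n, clauses):
    """Brute-force satisfiability test over all 2^n assignments (small n only)."""
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

def fraction_sat(n, ratio, samples=50, rng=None):
    """Estimate prob(SAT) at a given clause-to-variable ratio l/n."""
    rng = rng or random.Random()
    l = round(ratio * n)
    hits = sum(satisfiable(n, random_3sat(n, l, rng)) for _ in range(samples))
    return hits / samples
```

Calling `fraction_sat(10, r)` for a range of ratios `r` gives a rough, small-n picture of how the satisfiable fraction falls as l/n grows.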

Random 3-SAT
Which are the hard instances?
  • Around l/n = 4.3
What happens with larger problems?
Why are some dots red and others blue?
This is a so-called “phase transition”

Random 3-SAT
Varying problem size, n
Complexity peak appears to be largely invariant of algorithm
  • Complete algorithms like Davis-Putnam
  • Incomplete methods like local search
What’s so special about 4.3?

Random 3-SAT
Complexity peak coincides with satisfiability transition
  • l/n < 4.3: problems under-constrained and SAT
  • l/n > 4.3: problems over-constrained and UNSAT
  • l/n = 4.3: problems on “knife-edge” between SAT and UNSAT

Where did this all start?
At least as far back as the 60s with Erdős & Rényi
  • Thresholds in random graphs
Late 80s
  • Pioneering work by Karp, Purdom, Kirkpatrick, Huberman, Hogg …
Flood gates burst
  • Cheeseman, Kanefsky & Taylor’s IJCAI-91 paper

What do we know about this phase transition?
Its shape
  • Step function in limit [Friedgut 98]
Its location
  • Theory puts it in interval: 3.42 < l/n < 4.506
  • Experiment puts it at: l/n = 4.2

3SAT phase transition
Lower bounds (hard)
  • Analyse an algorithm that almost always solves the problem
  • Backtracking is hard to reason about, so typically analyse algorithms without backtracking
Complex branching heuristics needed to ensure success
  • But these are complex to reason about

3SAT phase transition
Upper bounds (easier)
  • Typically by estimating count of solutions
  • E.g. Markov (or 1st moment) method
For any statistic X, prob(X>=1) <= E[X]
  • No assumptions about the distribution of X except that it is non-negative!
Let X be the number of satisfying assignments for a 3SAT problem
  • The expected value of X can be easily calculated
  • E[X] = 2^n * (7/8)^l
  • If E[X] < 1, then prob(X>=1) = prob(SAT) < 1
  • E[X] < 1 means 2^n * (7/8)^l < 1
  • n + l log2(7/8) < 0
  • l/n > 1/log2(8/7) = 5.19…
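The first-moment arithmetic above is easy to check numerically. The following sketch only evaluates the formulas already on the slide; the function name `log2_expected_solutions` is a hypothetical helper introduced for illustration.

```python
import math

def log2_expected_solutions(n, l):
    """log2 of E[X] = 2^n * (7/8)^l for random 3-SAT with n variables, l clauses."""
    return n + l * math.log2(7 / 8)

# E[X] drops below 1 exactly when l/n exceeds this ratio.
first_moment_bound = 1 / math.log2(8 / 7)

if __name__ == "__main__":
    print(f"upper bound on threshold: l/n = {first_moment_bound:.4f}")
    for ratio in (4.3, 5.0, 5.3):
        n = 100
        l = round(ratio * n)
        print(ratio, log2_expected_solutions(n, l))
```

Below the bound the logarithm is positive (E[X] > 1, so Markov says nothing); above it the logarithm is negative and prob(SAT) < 1.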

3SAT phase transition
Upper bounds (easier)
  • Typically by estimating count of solutions
  • To get tighter bounds than 5.19, can refine the counting argument
  • E.g. not count all solutions but just those maximal under some ordering

Random 2-SAT
2-SAT is P
  • Linear time algorithm
Random 2-SAT displays “classic” phase transition
  • l/n < 1, almost surely SAT
  • l/n > 1, almost surely UNSAT
  • Complexity peaks around l/n = 1
Example clauses: x1 v x2, -x2 v x3, -x1 v x3, …
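To make the claim that 2-SAT is in P concrete, here is a small sketch based on the standard implication graph: a clause (a v b) yields the implications -a -> b and -b -> a, and the formula is unsatisfiable exactly when some variable x and its negation each reach the other. Note that this is deliberately not the linear-time algorithm the slide refers to; it uses a plain reachability search per variable, which is polynomial but slower. The function name `two_sat_satisfiable` is hypothetical.

```python
from collections import defaultdict, deque

def two_sat_satisfiable(n, clauses):
    """Decide a 2-SAT formula over variables 1..n.

    Each clause is a pair of signed integers, e.g. (1, -2) means x1 v -x2.
    """
    graph = defaultdict(list)
    for a, b in clauses:
        graph[-a].append(b)   # if a is false, b must be true
        graph[-b].append(a)   # if b is false, a must be true

    def reaches(src, dst):
        # Breadth-first search in the implication graph.
        seen = {src}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if u == dst:
                return True
            for w in graph[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        return False

    # Unsatisfiable iff some x implies -x and -x implies x.
    return not any(reaches(x, -x) and reaches(-x, x) for x in range(1, n + 1))
```

For the example clauses on the slide, `two_sat_satisfiable(3, [(1, 2), (-2, 3), (-1, 3)])` returns True.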

Phase transitions in P
2-SAT
  • l/n = 1
Horn SAT
  • Transition not “sharp”
Arc-consistency
  • Rapid transition in whether problem can be made AC
  • Peak in (median) checks

Phase transitions above NP
PSpace
  • QSAT (SAT of QBF)
  • E.g. a formula quantified over x1, x2, x3 with matrix (x1 v x2) & (-x1 v x3)

Phase transitions above NP
PSpace-complete
  • QSAT (SAT of QBF)
  • Stochastic SAT
  • Modal SAT
PP-complete
  • Polynomial-time probabilistic Turing machines
  • Counting problems
  • #SAT(>= 2^n/2)

Exact phase boundaries in NP
Random 3-SAT is only known within bounds
  • 3.42 < l/n < 4.506
Are there any NP phase boundaries known exactly?
  • Yes: 1-in-k SAT at l/n = 2/(k(k-1))

Backbone
Variables which take fixed values in all solutions
  • Alias unit prime implicates
Let f_k be the fraction of variables in the backbone
  • In random 3-SAT:
  • l/n < 4.3: f_k vanishing (otherwise adding a clause could make the problem unsat)
  • l/n > 4.3: f_k > 0
  • Discontinuity at phase boundary!
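The definition of the backbone can be illustrated by brute force on a tiny formula. This is a sketch for intuition only: it enumerates all 2^n assignments, which is not how backbones are measured on instances of realistic size, and the function name `backbone` is hypothetical.

```python
from itertools import product

def backbone(n, clauses):
    """Return {variable: value} for variables fixed in every solution.

    Clauses are tuples of signed integers over variables 1..n.
    Returns None if the formula has no solution at all.
    """
    fixed = None
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            if fixed is None:
                # First solution seen: every variable is a candidate.
                fixed = {v: bits[v - 1] for v in range(1, n + 1)}
            else:
                # Keep only variables that agree with this solution too.
                fixed = {v: val for v, val in fixed.items() if bits[v - 1] == val}
    return fixed

if __name__ == "__main__":
    # x1 is forced true, which forces x2 true; x3 stays free.
    b = backbone(3, [(1,), (-1, 2), (2, 3)])
    print(b, "fraction in backbone:", len(b) / 3)
```

The backbone fraction f_k is then `len(backbone(n, clauses)) / n` for a satisfiable instance.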

Backbone
Search cost correlated with backbone size
  • If f_k is non-zero, then can easily assign a variable the “wrong” value
  • Such mistakes are costly if at top of search tree
One source of “thrashing” behaviour
  • Can tackle with randomization and rapid restarts
Can we adapt algorithms to offer more robust performance guarantees?

Backbone
Backbones observed in structured problems
  • Quasigroup completion problems (QCP)
Backbones also observed in optimization and approximation problems
  • Coloring, TSP, blocks world planning …
Can we adapt algorithms to identify and exploit the backbone structure of a problem?

2+p-SAT
Morph between 2-SAT and 3-SAT
  • Fraction p of 3-clauses
  • Fraction (1-p) of 2-clauses
2-SAT is polynomial (linear)
  • Phase boundary at l/n = 1
  • But no backbone discontinuity here!
2+p-SAT maps from P to NP
  • p > 0, 2+p-SAT is NP-complete

2+p-SAT phase transition
[Figure: phase boundary, l/n against p]

2+p-SAT phase transition
Lower bound
  • Are the 2-clauses (on their own) UNSAT?
  • N.b. 2-clauses are much more constraining than 3-clauses
p <= 0.4
  • Transition occurs at lower bound
  • 3-clauses are not contributing!

2+p-SAT backbone
f_k becomes discontinuous for p > 0.4
  • But NP-complete for p > 0!
Search cost shifts from linear to exponential at p = 0.4
Similar behavior seen with local search algorithms
[Figure: search cost against n]

2+p-SAT trajectories
Input 3-SAT to a SAT solver like Davis Putnam
  • REPEAT assign variable
  • Simplify all unit clauses
  • Leaving subproblem with a mixture of 2 and 3-clauses
For a number of branching heuristics (e.g. random, ..)
  • Assume subproblems sample uniformly from 2+p-SAT space
  • Can use to estimate runtimes!

2+p-SAT trajectories
[Figure: trajectories through the UNSAT and SAT regions]

Beyond 2+p-SAT
Optimization
  • MAX-SAT
Other decision problems
  • 2-COL to 3-COL
  • Horn-SAT to 3-SAT
  • XOR-SAT to 3-SAT
  • 1-in-2-SAT to 1-in-3-SAT
  • NAE-2-SAT to NAE-3-SAT
  • ..

COL
Graph colouring
  • Can we colour graph so that neighbouring nodes have different colours?
In k-COL, only allowed k colours
  • 3-COL is NP-complete
  • 2-COL is P

Random COL
Sample graphs uniformly
  • n nodes and e edges
Observe colourability phase transition
  • Random 3-COL is "sharp", e/n = approx 2.3
  • BUT random 2-COL is not "sharp"
As n->oo
  • prob(2-COL @ e/n=0) = 1
  • prob(2-COL @ e/n=0.45) = approx 0.5
  • prob(2-COL @ e/n=1) = 0

2+p-COL
Morph from 2-COL to 3-COL
  • Fraction p of 3 colourable nodes
  • Fraction (1-p) of 2 colourable nodes
Like 2+p-SAT
  • Maps from P to NP
  • NP-complete for any fixed p > 0
Unlike 2+p-SAT
  • Maps from coarse to sharp transition

2+p-COL
[Figure]

2+p-COL sharpness
[Figure: p = 0.8]

2+p-COL search cost
[Figure]

2+p-COL
Sharp transition for p > 0.8
Transition has coarse and sharp regions for 0 < p < 0.8
Problem hardness appears to increase from polynomial to exponential at p = 0.8
2+p-COL behaves like 2-COL for p < 0.8
  • NB sharpness alone is not the cause of complexity, since 2-SAT has a sharp transition!

Location of phase boundary
For sharp transitions, like 2+p-SAT:
  • As n->oo, if l/n = c+epsilon, then UNSAT
  • If l/n = c-epsilon, then SAT
For transitions like 2+p-COL that may be coarse, we identify the start and finish:
  • delta_2+p = sup{e/n | prob(2+p-colourable) = 1}
  • gamma_2+p = inf{e/n | prob(2+p-colourable) = 0}

Basic properties
  • Monotonicity: delta <= gamma
  • Sharp transition iff delta = gamma
  • Simple bounds: delta_2+p = 0 for all p < 1
  • gamma_2 <= gamma_2+p <= min(gamma_3, gamma_2/(1-p))

2+p-COL phase boundary
[Figure]

XOR-SAT
  • Replace or by xor
  • XOR k-SAT is in P for all k
Phase transition
  • XOR 3-SAT has sharp transition
  • 0.8894 <= l/n <= 0.9278 [Creignou et al 2001]
  • Statistical mechanics gives l/n = 0.918 [Franz et al 2001]

XOR-SAT to SAT
Morph from XOR-SAT to SAT
  • Fraction (1-p) of XOR clauses
  • Fraction p of OR clauses
NP-complete for all p > 0
  • Phase transition occurs at: 0.92 <= l/n <= min(0.92/(1-p), 4.3)
Upper bound appears loose for all p > 0
  • Polynomial subproblem does not dominate!
  • 3-SAT contributes (cf 2+p-SAT, 2+p-COL)

Other morphs between P and NP
NAE 2+p-SAT
  • NAE = not all equal
  • NAE 2-SAT is P, NAE 3-SAT is NP-complete
1-in-2+p-SAT
  • 1-in-k SAT = exactly one in k literals true
  • 1-in-2 SAT is P, 1-in-3 SAT is NP-complete
…

NAE to SAT
Morph between two NP-complete problems
  • Fraction (1-p) of NAE 3-SAT clauses
  • Fraction p of 3-SAT clauses
Each NAE 3-SAT clause is equivalent to two 3-SAT clauses
  • NAE(a,b,c) = or(a,b,c) & or(-a,-b,-c)
  • NAE 3-SAT phase transition occurs around l/n = 2.1
  • Tantalisingly close to half of 4.2
Can we ignore many of the correlations that this encoding of NAE SAT into SAT introduces?

NAE to SAT
Compute “effective” clause size
  • Consider (1-p)l NAE 3-SAT clauses and pl 3-SAT clauses
  • These behave like 2(1-p)l 3-SAT clauses and pl 3-SAT clauses
  • That is, (2-p)l 3-SAT clauses
  • Hence, effective clause to variable ratio is (2-p)l/n
Plot prob(satisfiable) and search cost against (2-p)l/n

NAE to SAT
[Figure]

The real world isn’t random?
Very true!
Can we identify structural features common in real world problems?
Consider graphs met in real world situations
  • Social networks
  • Electricity grids
  • Neural networks ...

Real versus Random
Real graphs tend to be sparse
  • Dense random graphs contain lots of (rare?) structure
Real graphs tend to have short path lengths
  • As do random graphs
Real graphs tend to be clustered
  • Unlike sparse random graphs
L, average path length
C, clustering coefficient (fraction of neighbours connected to each other, cliqueness measure)
mu, proximity ratio: C/L normalized by that of a random graph of the same size and density
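The three measures L, C and mu can be computed directly from an adjacency structure. The sketch below assumes an undirected, connected graph stored as a dict mapping each node to the set of its neighbours; the function names are hypothetical, and nodes with fewer than two neighbours are assumed to contribute 0 to C, which is one common convention rather than something the slide specifies.

```python
from collections import deque
from itertools import combinations

def clustering_coefficient(adj):
    """C: average, over nodes, of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # assumed convention: such nodes contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)

def average_path_length(adj):
    """L: mean shortest-path length over ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def proximity_ratio(c, l, c_random, l_random):
    """mu: C/L normalized by C/L of a random graph of the same size and density."""
    return (c / l) / (c_random / l_random)
```

For a triangle, `clustering_coefficient` and `average_path_length` both return 1; for a three-node path they return 0 and 4/3.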

Small world graphs
Sparse, clustered, short path lengths
Six degrees of separation
  • Stanley Milgram’s famous 1967 postal experiment
  • Recently revived by Watts & Strogatz
  • Shown to apply to: actors database, US electricity grid, neural net of a worm ...

An example
1994 exam timetable at Edinburgh University
  • 59 nodes, 594 edges so relatively sparse
  • But contains 10-clique
Less than 10^-10 chance in a random graph
  • Assuming same size and density
Clique totally dominated cost to solve problem

Small world graphs
To construct an ensemble of small world graphs
  • Morph between regular graph (like ring lattice) and random graph
  • With prob p include edge from ring lattice, 1-p from random graph
Real problems often contain similar structure and stochastic components?

Small world graphs
Ring lattice is clustered but has long paths
Random edges provide shortcuts without destroying clustering

Small world graphs
[Figure]

Colouring small world graphs
[Figure]

Small world graphs
Other bad news
  • Disease spreads more rapidly in a small world
Good news
  • Cooperation breaks out quicker in iterated Prisoner’s dilemma

Other structural features
It’s not just small world graphs that have been studied
High degree graphs
  • Barabási et al’s power-law model
Ultrametric graphs
  • Hogg’s tree based model
Numbers following Benford’s Law
  • 1 is much more common than 9 as a leading digit!
  • prob(leading digit = i) = log10(1+1/i)
  • Such clustering makes number partitioning much easier
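The Benford's Law formula is quick to tabulate, which makes the "1 is much more common than 9" remark concrete. This sketch just evaluates the formula with a base-10 logarithm; the function name `benford` is a hypothetical helper.

```python
import math

def benford(i):
    """Probability that the leading decimal digit is i (1..9) under Benford's Law."""
    return math.log10(1 + 1 / i)

if __name__ == "__main__":
    for digit in range(1, 10):
        print(digit, round(benford(digit), 4))
```

The nine probabilities sum to 1, since the product of (1 + 1/i) for i = 1..9 telescopes to 10.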

High degree graphs
Degree = number of edges connected to node
Directed graph
  • Edges have a direction
  • E.g. web pages = nodes, links = directed edges
In-degree, out-degree
  • In-degree = links pointing to page
  • Out-degree = links pointing out of page

In-degree of World Wide Web
Power law distribution
  • Pr(in-degree = k) = ak^-2.1
Some nodes of very high in-degree
  • E.g. google.com, …

Out-degree of World Wide Web
Power law distribution
  • Pr(out-degree = k) = ak^-2.7
Some nodes of very high out-degree
  • E.g. people in SAT

High degree graphs
World Wide Web
Electricity grid
Citation graph
  • 633,391 out of 783,339 papers have < 10 citations
  • 64 have > 1000 citations
  • 1 has 8907 citations
Actors graph
  • Robert Wagner, Donald Sutherland, …

High degree graphs
Power law in degree distribution
  • Pr(degree = k) = ak^-b where b typically around 3
Compare this to random graphs
  • Gnm model: n nodes, m edges chosen uniformly at random
  • Gnp model: n nodes, each edge included with probability p
  • In both, Pr(degree = k) is a Poisson distribution tightly clustered around mean

Random v high degree graphs
[Figure]

Generating high-degree graphs
Grow graph
Preferentially attach new nodes to old nodes according to their degree
  • Prob(attach to node j) proportional to degree of node j
  • Gives Prob(degree = k) = ak^-3
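A minimal sketch of the growth process follows. The details are assumptions made for illustration: the graph starts from a single edge between nodes 0 and 1, each new node attaches to `m` distinct existing nodes, and degree-proportional sampling is done by picking uniformly from a list in which every node appears once per incident edge. The function names are hypothetical.

```python
import random
from collections import Counter

def preferential_attachment(n, m=1, rng=None):
    """Grow an undirected graph on nodes 0..n-1 (n >= 2) and return its edge list."""
    rng = rng or random.Random()
    edges = [(0, 1)]
    # Each node appears in this list once per incident edge, so a uniform
    # draw from it selects node j with probability proportional to degree(j).
    endpoints = [0, 1]
    for new in range(2, n):
        targets = set()
        while len(targets) < min(m, new):
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend([new, t])
    return edges

def degree_counts(edges):
    """Map each node to its degree."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts
```

Tabulating `Counter(degree_counts(edges).values())` for a large n gives an empirical degree distribution to compare against the power law on the slide.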

High-degree = small world?
Preferential attachment model
  • n=16, mu=1
  • n=64, mu=1.35
  • n=256, mu=2.12
  • …
Small world topology thus for large n!

Search on high degree graphs
Random
  • Uniformly hard
Small world
  • A few long runs
High degree
  • More uniform
  • Easier than random

What about numbers?
So far, we’ve looked at structural features of graphs
Many problems contain numbers
  • Do we see phase transitions here too?

Number partitioning
What’s the problem?
  • Dividing a bag of numbers into two so their sums are as balanced as possible
What problem instances?
  • n numbers, each uniformly chosen from (0, l]
  • Other distributions work (Poisson, …)

Number partitioning
Identify a measure of constrainedness
  • More numbers => less constrained
  • Larger numbers => more constrained
  • Could try some measures out at random (l/n, log(l)/n, log(l)/sqrt(n), …)
Better still, use kappa!
  • (Approximate) theory about constrainedness
  • Based upon some simplifying assumptions, e.g. ignores structural features that cluster solutions together

Theory of constrainedness
Consider state space searched
  • See 10-d hypercube opposite of 2^10 possible partitions of 10 numbers into 2 bags
Compute expected number of solutions, <Sol>
  • Independence assumptions often useful and harmless!

Theory of constrainedness
Constrainedness given by:
  kappa = 1 - log2(<Sol>)/n
where n is dimension of state space
kappa lies in range [0, infty)
  • kappa = 0, <Sol> = 2^n, under-constrained
  • kappa = infty, <Sol> = 0, over-constrained
  • kappa = 1, <Sol> = 1, critically constrained phase boundary

Phase boundary
Markov inequality
  • prob(Sol) <= <Sol>
  • Now, kappa > 1 implies <Sol> < 1
  • Hence, kappa > 1 implies prob(Sol) < 1
Phase boundary typically at values of kappa slightly smaller than kappa = 1
  • Skew in distribution of solutions (e.g. 3-SAT)
  • Non-independence

Examples of kappa
3-SAT
  • kappa = l/(5.2n)
  • Phase boundary at kappa = 0.82
3-COL
  • kappa = e/(2.7n)
  • Phase boundary at kappa = 0.84
Number partitioning
  • kappa = log2(l)/n
  • Phase boundary at kappa = 0.96
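These examples follow from the general definition kappa = 1 - log2(<Sol>)/n once an estimate of <Sol> is plugged in. The sketch below checks two of them. For 3-SAT it uses <Sol> = 2^n * (7/8)^l from the first-moment slide; for number partitioning it assumes <Sol> is roughly 2^n / l, which is an assumption chosen here so that the result matches the slide's kappa = log2(l)/n. All function names are hypothetical.

```python
import math

def kappa(log2_expected_solutions, n):
    """Constrainedness: kappa = 1 - log2(<Sol>)/n, given log2(<Sol>) directly."""
    return 1 - log2_expected_solutions / n

def kappa_3sat(n, l):
    """3-SAT with n variables and l clauses: <Sol> = 2^n * (7/8)^l."""
    return kappa(n + l * math.log2(7 / 8), n)

def kappa_partition(n, l):
    """Partitioning n numbers drawn from (0, l], assuming <Sol> is about 2^n / l."""
    return kappa(n - math.log2(l), n)

if __name__ == "__main__":
    # l/n = 4.3 gives kappa close to the 0.82 boundary quoted on the slide.
    print(kappa_3sat(100, 430))
    print(kappa_partition(20, 2 ** 19))
```

Working with log2(<Sol>) rather than <Sol> itself avoids overflow and underflow for large n.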

Number partition phase transition
[Figure: prob(perfect partition) against kappa]

Finite-size scaling
Simple “trick” from statistical physics
  • Around critical point, problems indistinguishable except for change of scale given by simple power-law
Define rescaled parameter
  • gamma = ((kappa - kappa_c)/kappa_c) * n^(1/v)
  • Estimate kappa_c and v empirically
  • E.g. for number partitioning, kappa_c = 0.96, v = 1
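The rescaling is a one-line transformation. The sketch below assumes the standard finite-size scaling form gamma = ((kappa - kappa_c)/kappa_c) * n^(1/v), with the number partitioning values from the slide as defaults; the function name `rescale` is hypothetical.

```python
def rescale(kappa, n, kappa_c=0.96, v=1.0):
    """Rescaled parameter gamma = ((kappa - kappa_c) / kappa_c) * n^(1/v).

    Defaults are the empirical number partitioning values quoted on the slide.
    """
    return (kappa - kappa_c) / kappa_c * n ** (1 / v)

if __name__ == "__main__":
    # The same kappa maps to a larger |gamma| as n grows, which is what
    # lets curves for different n collapse onto one when plotted against gamma.
    for n in (10, 20, 40):
        print(n, rescale(1.0, n))
```

Plotting prob(perfect partition) against `rescale(kappa, n)` instead of kappa is what produces the single rescaled curve.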

Rescaled phase transition
[Figure: prob(perfect partition) against gamma]

Rescaled search cost
[Figure: optimization cost against gamma]

Easy-Hard-Easy?
Search cost only easy-hard here?
  • Optimization not decision search cost!
  • Easy if (large number of) perfect partitions
  • Otherwise little pruning (search scales as 2^(0.85n))
Phase transition behaviour less well understood for optimization than for decision
  • Sometimes optimization = sequence of decision problems (e.g. branch & bound)
  • BUT lots of subtle issues lurking?

Looking inside search
[Figure: clauses/variables down search branch]

Looking inside search
[Figure: clause length down search branch]

Constrainedness knife-edge
[Figure: kappa down search branch]

Constrainedness knife-edge
[Figure: real world register allocation graph colouring problem]

Constrainedness knife-edge
[Figure: optimisation problems too (number partitioning)]

Exploiting the knife-edge
Get off the knife-edge asap
  • Aka minimize constrainedness
Many existing heuristics can be viewed in this light
  • E.g. fail first heuristic in CSPs
  • E.g. KK heuristic for number partitioning
  • …
Good way to design new heuristics
  • Branch into subproblem with minimal kappa
  • Challenge: to compute this efficiently!
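As a concrete instance of one heuristic named above, here is a sketch of the KK (Karmarkar-Karp) differencing heuristic for number partitioning: repeatedly take the two largest numbers and replace them by their difference, which commits them to opposite bags. This is the basic greedy version only, returning the final difference rather than the partition itself, and the function name is hypothetical.

```python
import heapq

def karmarkar_karp(numbers):
    """Return the partition difference found by the KK differencing heuristic."""
    # heapq is a min-heap, so store negated values to pop the largest first.
    heap = [-x for x in numbers]
    heapq.heapify(heap)
    while len(heap) > 1:
        largest = -heapq.heappop(heap)
        second = -heapq.heappop(heap)
        # Placing the two numbers in opposite bags leaves their difference.
        heapq.heappush(heap, -(largest - second))
    return -heap[0] if heap else 0

if __name__ == "__main__":
    # KK reaches difference 2 here although a perfect partition exists
    # (8 + 7 = 6 + 5 + 4), so the heuristic is not always optimal.
    print(karmarkar_karp([8, 7, 6, 5, 4]))
```

Differencing the two largest numbers shrinks the numbers that remain, which is one way to read it as moving towards a less constrained subproblem.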

The future?
What open questions remain?
Where to next?

Open questions
Prove random 3-SAT transition occurs at l/n = 4.3
  • Random 2-SAT proved to be at l/n = 1
  • Random 3-SAT transition proved to be in range 3.42 < l/n < 4.506
2+p-COL
  • Prove problem changes around p = 0.8
  • What happens to colouring backbone?

Open questions
Does phase transition behaviour give insights to help answer P=NP?
  • It certainly identifies hard problems!
  • Problems like 2+p-SAT and ideas like backbone also show promise
But problems away from phase boundary can be hard to solve
  • Over-constrained 3-SAT region has exponential resolution proofs
  • Under-constrained 3-SAT region can throw up occasional hard problems (early mistakes?)

Summary
That’s nearly all from me!

Conclusions
Phase transition behaviour ubiquitous
  • Decision/optimization/...
  • NP/PSpace/P/…
  • Random/real
Phase transition behaviour/constrainedness gives insight into problem hardness
  • Suggests new branching heuristics
  • Ideas like the backbone help understand branching mistakes

Conclusions
AI becoming more of an experimental science?
  • Theory and experiment complement each other well
  • Increasing use of approximate/heuristic theories to keep theory in touch with rapid experimentation
Phase transition behaviour is FUN
  • Lots of nice graphs as promised
  • And it is teaching us lots about complexity and algorithms!

Very partial bibliography
  • Cheeseman, Kanefsky, Taylor, Where the really hard problems are, Proc. of IJCAI-91
  • Gent et al, The Constrainedness of Search, Proc. of AAAI-96
  • Gent & Walsh, The TSP Phase Transition, Artificial Intelligence, 88:349-358, 1996
  • Gent & Walsh, Analysis of Heuristics for Number Partitioning, Computational Intelligence, 14 (3), 1998
  • Gent & Walsh, Beyond NP: The QSAT Phase Transition, Proc. of AAAI-99
  • Gent et al, Morphing: combining structure and randomness, Proc. of AAAI-99
  • Hogg & Williams (eds), special issue of Artificial Intelligence, 88 (1-2), 1996
  • Mitchell, Selman, Levesque, Hard and Easy Distributions of SAT problems, Proc. of AAAI-92
  • Monasson et al, Determining computational complexity from characteristic ‘phase transitions’, Nature, 400, 1999
  • Walsh, Search in a Small World, Proc. of IJCAI-99
  • Walsh, Search on High Degree Graphs, Proc. of IJCAI-2001
  • Walsh, From P to NP: COL, XOR, NAE, 1-in-k, and Horn SAT, Proc. of AAAI-2001
  • Watts & Strogatz, Collective dynamics of small world networks, Nature, 393, 1998

Some blatant adverts!
2nd International Optimisation Summer School
  • Jan 12th to 18th, Kioloa, NSW
  • Will also cover local search
  • http://go.to/optschool
NICTA Optimisation Research Group
  • We love having visitors stop by to give talks, or for longer (week, month or sabbaticals!)
