Stream: Coq users

Topic: Proving log_2(n!) bound for sorting


view this post on Zulip Lessness (Oct 14 2020 at 19:27):

Is there some good literature about this bound (that's helpful for formalizing the proof in Coq)? Or maybe some developments in Coq already?

view this post on Zulip Yishuai Li (Oct 19 2020 at 17:46):

Did you mean https://en.wikipedia.org/wiki/Stirling%27s_approximation ?

view this post on Zulip Li-yao (Oct 19 2020 at 18:24):

@Yishuai Li that's a much stronger result than you need to prove the lower bound of the complexity of sorting by comparison.

view this post on Zulip Karl Palmskog (Oct 19 2020 at 18:25):

in any case, I think this is the one meant? https://github.com/coq-community/coqtail-math/blob/master/Reals/RStirling.v - so if no hammer is available, a wrecking ball may suffice

view this post on Zulip Li-yao (Oct 19 2020 at 18:30):

Formalizing "comparison sorting algorithms" sounds like a fun rabbit hole.

view this post on Zulip Cody Roux (Oct 19 2020 at 19:37):

In Isabelle, there is this: https://www.isa-afp.org/entries/Comparison_Sort_Lower_Bound.html

view this post on Zulip Paolo Giarrusso (Oct 19 2020 at 20:21):

(deleted)

view this post on Zulip Paolo Giarrusso (Oct 19 2020 at 20:34):

(deleted)

view this post on Zulip Paolo Giarrusso (Oct 19 2020 at 20:37):

ignore what I wrote; Cormen’s proof obtains the lg (n!) lower bound first, and _then_ uses Stirling to get n lg n.

view this post on Zulip Lessness (Oct 20 2020 at 17:51):

Li-yao said:

Formalizing "comparison sorting algorithms" sounds like a fun rabbit hole.

It is. :) At least I learn a lot doing that, even when I'm not close to being successful yet.

view this post on Zulip Cody Roux (Oct 20 2020 at 20:09):

I think it's not that crazy, depending on what your model of computation is. If it's an abstract machine that can do 3 operations:

  1. Compare 2 elements
  2. Swap the elements.
  3. Recurse.

Then I don't think things are hopeless. In particular the Isabelle formalization I linked to earlier seems pretty tractable.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 20:13):

from my understanding, it's quite easy to "cheat" in the Isabelle model. You only count the comparisons (or other magically ordained operations), but you can do exponential amounts of other computation. For radix sort and the like, this is a no go.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 20:14):

this seems to be a more fundamental approach, not to say this solves all kinds of cheating issues: https://github.com/uds-psl/coq-library-complexity

view this post on Zulip Cody Roux (Oct 20 2020 at 20:18):

Ooof, that repo does not suffer from an overabundance of comments.

I honestly don't know if we can even prove any lower bounds on sorting in a concrete TM model. what would an appropriate model for this be? I think that only counting the comparisons seems reasonable...

view this post on Zulip Karl Palmskog (Oct 20 2020 at 20:21):

to be honest, count-the-comparisons is a bad tradeoff between two things:

view this post on Zulip Karl Palmskog (Oct 20 2020 at 20:23):

one very practical alternative is to work in a deep embedding of a language with a cost model. Then, the language implementers can ensure the cost model is realized in practice, and you can run your complexity-verified program using their compiler/runtime.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 20:24):

my understanding of coq-library-complexity is that they are working towards having something a complexity theorist would sign off on, and there could also be the possibility of a runtime which is verified to implement computation-model-steps in constant time

view this post on Zulip Karl Palmskog (Oct 20 2020 at 20:32):

to completely avoid cheating, I don't see any other option than reasoning about hardware-level timings

view this post on Zulip Lessness (Oct 20 2020 at 20:34):

My model is like this (I skipped the part where it is executed). For now it's only for 8 numbers, I know.
It allowed me to implement merge-insertion sort and to prove that it works with 16 comparisons.

Inductive value := x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7.
Inductive variable := aux | num: value -> variable.

Inductive assignment := assign: variable -> variable -> assignment.
Inductive comparison := GT: forall (more less: value), comparison.

Inductive step :=
| assignments: forall (L: list assignment), step
| conditional: forall (c: comparison) (positive negative: step), step.

Definition algorithm := list step.

view this post on Zulip Lessness (Oct 20 2020 at 20:37):

But it makes hard to prove things about it. At least for me.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 20:38):

basically, the next step in abstraction is to work in a monad, like they did here: https://github.com/coq-contribs/quicksort-complexity/tree/v8.9

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:47):

from my understanding, it's quite easy to "cheat" in the Isabelle model. You only count the comparisons (or other magically ordained operations), but you can do exponential amounts of other computation. For radix sort and the like, this is a no go.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:47):

@Karl Palmskog but I think that’s the correct result from Cormen.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:48):

this is a _lower bound_ on the number of comparisons.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:48):

“exponential amount of other computation” seems irrelevant

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:49):

unless you’re asking whether this is a generally valid model of algorithms, but I don’t think that was the question.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:49):

I’m also not sure why you bring up radix sort — we know the theorem does not apply to radix sort, and that radix sort indeed beats the bound.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 21:50):

do you think a complexity theorist would accept the Isabelle model as valid for establishing the lower bound?

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:51):

Depends. Do they accept the Cormen proof?

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:51):

also, if they reject it, it depends on why. as mentioned, I don’t think that argument is relevant.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 21:52):

the Cormen "proof" is obviously just a sketch, which they will likely interpret in a realm of Turing machines

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:53):

double-checking, and I think not.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:53):

“The decision tree model. [...] Control, data movement, and all other aspects of the algorithm are ignored.”

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:54):

I want to re-skim it a moment, and I agree it’s a sketch, but it doesn’t seem a sketch about Turing machines.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 21:56):

but if you want it to be a result about what is computable (non-existence of computable functions), how do you avoid Turing machines?

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:56):

but radix sort exists, so I don’t think that can be the result.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:57):

actually let me riddle you a question here.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:57):

to me, this theorem relies on some parametricity argument.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:57):

to a PL person like me, a “comparison sort” is a parametric function forall T, list T -> (T -> T -> bool) -> list T

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 21:59):

I can imagine trying to formalize this without knowing programming languages, but I’m not sure you can do it without formalizing an ad-hoc version of parametricity anyway: your algorithm has to behave the same on any two input lists where the “comparison results” are the same.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:00):

and BTW, I’m not sure that’s enough for Turing machines, since in that contexts functions are _data_ not codata so they can be inspected.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:01):

from hanging out with complexity theorists, I think they will interpret the lower bound as a way to disprove anyone who says they have a Turing machine that performs comparison-based sorting in O(n) computation steps

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:02):

you haven’t said what is “comparison-based sorting”

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:02):

but I remembered the keyword they’d use: “oracle”

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:02):

a sorting algorithm is a recursive datatype containing either a HOL function or a query of a comparison oracle with a continuation containing the remaining computation. This makes it possible to force the algorithm to use only comparisons and to track the number of comparisons made.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:02):

yes, you can define comparison based sorting at the Turing machine level as a TM that has access to an oracle for comparisons

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:03):

oracles are as opaque as we need (they’re codata, like functions)

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:03):

and models where all that counts are queries to oracles are fine

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:04):

but okay, I do have an objection.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:04):

but why restrict to counting only oracle queries?

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:04):

because that’s the theorem statement?

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:05):

OTOH, HOL functions need not be computable. However, that means the Isabelle result is _stronger_ than needed.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:05):

it's not how you would use it for anything, you can talk about all other computation that is done as well

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:05):

Theorem 8.1. “Any comparison sort algorithm requires Omega (n lg n) comparisons in the worst case.”

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:06):

“The decision tree model. [...] Control, data movement, and all other aspects of the algorithm are ignored.”

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:06):

so I totally agree that model is not suitable for computability in general, but it seems perfectly fine for this theorem.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:07):

I argue that if you want to apply that theorem for anything useful, you have to put it in a general Turing machine context

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:07):

in fact, the Isabelle theorem remains applicable in models of “hypercomputation” beyond Turing machines

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:08):

I can record that’s your opinion, but I’m afraid I don’t see an actual argument anywhere.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:08):

to the extent that I see it, I would concede one hole:

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:08):

you’d have to prove that Turing machines can be turned into HOL functions.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:09):

so, why would I care about some result in the decision tree model? What implements this model? How do I apply something proved about it elsewhere?

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:11):

AFAIK from Cormen, the main application is “to beat n log n, go outside that model”

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:12):

it’s a bit like working on external sorting algorithms or B-trees

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:12):

but even a discussion of "beating n log n" presumes some underlying model of computation

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:13):

just to delimit discussion a moment — the thread is about formalizing a certain textbook result.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:13):

and basically, there has to be a theorem somewhere relating the decision tree model to your general computation model, for the discussion about "beating" to make sense

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:14):

the textbook result in question is about the decision tree model, and the Isabelle work satisfies that.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:14):

your questions seem relevant to going _beyond_ that result and filling in the dots surrounding it.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:15):

and I agree that for that, you should at least prove that Turing functions can be turned into HOL functions

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:15):

so that you can turn the result into a result about Turing machines + comparison oracles.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:16):

however, the comments I objected to were about a different problem — why are complexity theorists using such oracle models ignoring other computations.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:16):

and ditto for > it’s a bit like working on external sorting algorithms or B-trees

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:17):

that work is even worse here: historically (and up to Cormen), the assumption is that disks are so slow that in-memory computation is free.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:18):

and those results aren’t used just for lower bounds.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:19):

And indeed, nowadays those results aren’t so useful anymore, and their applications (relational DBMSes) have been challenged, and often replaced by other designs, on technical grounds.

view this post on Zulip Paolo Giarrusso (Oct 20 2020 at 22:21):

I _am_ curious, however, whether the original literature would do something else, but I can’t see why you’d care for a lower bound.

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:29):

by Church's law (or so-called "Church-Turing thesis"), any notion of computation is reducible to Turing machines

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:30):

lower bounds are interesting because they rule out that (a class of) computable functions exist

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:31):

to perform this ruling out formally, the lower bound must be expressed directly about Turing machines or something equivalent

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:35):

Cormen seemingly leaves to the reader the connection from decision trees to actual Turing machines, and to me this means it's not clear whether the Isabelle proofs actually talk about lower bounds of computable functions as generally understood

view this post on Zulip Karl Palmskog (Oct 20 2020 at 22:40):

in particular, I would indeed expect HOL functions to give them a hard time if someone would attempt to bridge the gap

view this post on Zulip Cody Roux (Oct 20 2020 at 22:43):

I think because it is a lower bound, there is a way to associate each run of an algorithm on a concrete random access machine with a decision tree: follow the path of decisions taken by the algorithm.

I think this translates into a lower bound for the machine. Obviously for upper bounds this is bogus, but that's ok, upper bounds are much much easier.

I don't want to take away from the coq-library-complexity, which, other than its lack of comments, is really cool, and useful.

view this post on Zulip Li-yao (Oct 20 2020 at 22:48):

It seems very hard to turn that lower bound into a statement about Turing machines, because comparison sorting assumes a very specific interface, and there is no notion of "interface" at the level of Turing machines.

view this post on Zulip Paolo Giarrusso (Oct 21 2020 at 00:05):

Re "interface", I think that's exactly what oracles are for.

view this post on Zulip Paolo Giarrusso (Oct 21 2020 at 00:09):

@Karl Palmskog AFAICS (and I’m not an expert), since Isabelle/HOL functions aren’t necessarily computable, and because of Isabelle’s result, I’m not sure that computable functions should show up in the result.

view this post on Zulip Paolo Giarrusso (Oct 21 2020 at 00:11):

that’s arguably nitpicking, but the bigger problem is that this result is _not_ a lower bound about sorting on Turing machines or computable functions.

view this post on Zulip Paolo Giarrusso (Oct 21 2020 at 00:13):

it’s about (hyper)computation with oracles, and about comparison sorting, which is a really different problem.

view this post on Zulip Paolo Giarrusso (Oct 21 2020 at 00:16):

@Li-yao I see you’re technically correct: Turing machines with oracles are _not_ Turing machines (since oracles aren’t required to be computable)

view this post on Zulip Paolo Giarrusso (Oct 21 2020 at 00:30):

(BTW, does coq-library-complexity define TMs with oracles? at least https://github.com/uds-psl/coq-library-undecidability/search?q=oracle doesn’t find that)

view this post on Zulip Yannick Forster (Oct 21 2020 at 09:32):

It doesn't (yet). The undecidability library does not, because it is based on a synthetic approach and often tries to be constructive, whereas reasoning based on oracles is inherently non-constructive. Of course it hasn't to be that way: Once you agree to implement every function you write as a Turing machine, it is in principle possible to mechanise computability based on and around oracles (e.g. Turing reducibility). I would pick the lambda-calculus for that and not Turing machines because that makes mechanisation a lot easier, but we also don't have oracles for the lambda-calculus

view this post on Zulip Yannick Forster (Oct 21 2020 at 09:34):

There is also https://github.com/uds-psl/coq-library-complexity, and because complexity theory can not be carried out synthetically, there we need to base everything on a model of computation. It would definitely be an interesting continuation of this line of work to include oracles (and non-determinism), but it's not on the immediate agenda

view this post on Zulip Lessness (Oct 21 2020 at 19:11):

Be damned the sorting! :D That's rabbit hole too deep for me, for now.
I can't even prove that each step of sorting algorithm (or more generally, permuting algorithm) is permuting.

Apologies, I just needed to vent a bit.

view this post on Zulip Lessness (Oct 21 2020 at 19:14):

It seems I need to find some easier project to do parallelly, one that causes more happiness. :)

view this post on Zulip Fabian Kunze (Oct 29 2020 at 17:20):

Cody Roux said:

Ooof, that repo does not suffer from an overabundance of comments.

Yes, sorry, I'm still in the process of polishing it and it is very work-in-progress.

If you know where to look, we have an Np-completeness proof of SAT (Cook's theorem) and the time Hierarchy theorem as results in synthetic complexity theory:

https://github.com/uds-psl/coq-library-complexity/blob/coq-8.12/theories/L/Complexity/CookLevin.v

https://github.com/uds-psl/coq-library-complexity/blob/307d840a13d86236afac57b014f8b174dc78e531/theories/L/Complexity/TimeHierarchyTheorem.v#L132

But for an outsider, the complexity-theory related stuff is not really stable nor usable yet.
Although, just verifying the time complexity of functional programs is already possible and stable, as described in our ITP-paper:
https://github.com/uds-psl/certifying-extraction-with-time-bounds

If anyone is interested in doing this, I'm, happy to answer questions.


Last updated: Apr 19 2024 at 08:01 UTC