Is there some good literature about this bound (that's helpful for formalizing the proof in Coq)? Or maybe some developments in Coq already?

Did you mean https://en.wikipedia.org/wiki/Stirling%27s_approximation ?

@Yishuai Li that's a much stronger result than you need to prove the lower bound of the complexity of sorting by comparison.

in any case, I think this is the one meant? https://github.com/coq-community/coqtail-math/blob/master/Reals/RStirling.v - so if no hammer is available, a wrecking ball may suffice

Formalizing "comparison sorting algorithms" sounds like a fun rabbit hole.

In Isabelle, there is this: https://www.isa-afp.org/entries/Comparison_Sort_Lower_Bound.html


ignore what I wrote; Cormen’s proof obtains the `lg (n!)` lower bound first, and _then_ uses Stirling to get `n lg n`.
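(side note: the step from `lg (n!)` to `n lg n` doesn’t need the full strength of Stirling; a crude bound already gives the asymptotics)

```
n!      >= (n/2)^(n/2)              -- the largest n/2 factors are each >= n/2
lg (n!) >= (n/2) * lg (n/2)  =  Omega(n lg n)
```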

Li-yao said:

Formalizing "comparison sorting algorithms" sounds like a fun rabbit hole.

It is. :) At least I learn a lot doing that, even when I'm not close to being successful yet.

I think it's not *that* crazy, depending on what your model of computation is. If it's an abstract machine that can do three operations:

- Compare two elements.
- Swap two elements.
- Recurse.

Then I don't think things are hopeless. In particular, the Isabelle formalization I linked to earlier seems pretty tractable.

from my understanding, it's quite easy to "cheat" in the Isabelle model. You only count the comparisons (or other magically ordained operations), but you can do exponential amounts of other computation. For radix sort and the like, this is a no-go.

this seems to be a more fundamental approach, not to say this solves all kinds of cheating issues: https://github.com/uds-psl/coq-library-complexity

Ooof, that repo does not suffer from an overabundance of comments.

I honestly don't know if we can even prove any lower bounds on sorting in a concrete TM model. what would an appropriate model for this be? I think that only counting the comparisons seems reasonable...

to be honest, count-the-comparisons is a bad tradeoff between two things:

- you have no explicit connection to running time on a real machine
- you have no explicit connection to a robust model of computation accepted by complexity theorists

one very practical alternative is to work in a deep embedding of a language with a cost model. Then, the language implementers can ensure the cost model is realized in practice, and you can run your complexity-verified program using their compiler/runtime.

my understanding of coq-library-complexity is that they are working towards having something a complexity theorist would sign off on, and there could also be the possibility of a runtime which is verified to implement computation-model-steps in constant time

to completely avoid cheating, I don't see any other option than reasoning about hardware-level timings

My model is like this (I skipped the part where it is executed). For now it's only for 8 numbers, I know.

It allowed me to implement merge-insertion sort and to prove that it works with 16 comparisons.

```
(* the eight values to be sorted *)
Inductive value := x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7.
(* storage: one auxiliary variable, plus one slot per value *)
Inductive variable := aux | num: value -> variable.
(* copy the contents of one variable into another *)
Inductive assignment := assign: variable -> variable -> assignment.
(* the only observation: a comparison [GT more less] between two values *)
Inductive comparison := GT: forall (more less: value), comparison.
(* a step is either a block of assignments or a branch on a comparison *)
Inductive step :=
| assignments: forall (L: list assignment), step
| conditional: forall (c: comparison) (positive negative: step), step.
Definition algorithm := list step.
```

But it makes it hard to prove things about it. At least for me.

basically, the next step in abstraction is to work in a monad, like they did here: https://github.com/coq-contribs/quicksort-complexity/tree/v8.9
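for a flavor of that approach, here is a minimal counting-monad sketch (my own, not the actual quicksort-complexity definitions): each computation carries the number of comparisons it performed, and only the comparison primitive has nonzero cost.

```
(* a sketch: a cost-counting monad over pairs (result, #comparisons) *)
Definition C (A : Type) : Type := (A * nat)%type.

Definition ret {A} (a : A) : C A := (a, 0).

Definition bind {A B} (m : C A) (f : A -> C B) : C B :=
  let (a, n) := m in
  let (b, k) := f a in
  (b, n + k).

(* the only primitive with nonzero cost: one comparison *)
Definition cmp (le : nat -> nat -> bool) (x y : nat) : C bool :=
  (le x y, 1).
```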

Karl Palmskog said:

from my understanding, it's quite easy to "cheat" in the Isabelle model. You only count the comparisons (or other magically ordained operations), but you can do exponential amounts of other computation. For radix sort and the like, this is a no-go.

@Karl Palmskog but I think that’s the correct result from Cormen.

this is a _lower bound_ on the number of comparisons.

“exponential amount of other computation” seems irrelevant

unless you’re asking whether this is a generally valid model of algorithms, but I don’t think that was the question.

I’m also not sure why you bring up radix sort — we know the theorem does not apply to radix sort, and that radix sort indeed beats the bound.

do you think a complexity theorist would accept the Isabelle model as valid for establishing the lower bound?

Depends. Do they accept the Cormen proof?

also, if they reject it, it depends on why. as mentioned, I don’t think that argument is relevant.

the Cormen "proof" is obviously just a sketch, which they will likely interpret in a realm of Turing machines

double-checking, and I think not.

“The decision tree model. [...] Control, data movement, and all other aspects of the algorithm are ignored.”

I want to re-skim it a moment, and I agree it’s a sketch, but it doesn’t seem a sketch about Turing machines.

but if you want it to be a result about *what is computable* (non-existence of computable functions), how do you avoid Turing machines?

but radix sort exists, so I don’t think that can be the result.

actually let me riddle you a question here.

to me, this theorem relies on some parametricity argument.

to a PL person like me, a “comparison sort” is a parametric function `forall T, list T -> (T -> T -> bool) -> list T`

I can imagine trying to formalize this without knowing programming languages, but I’m not sure you can do it without formalizing an ad-hoc version of parametricity anyway: your algorithm has to behave the same on any two input lists where the “comparison results” are the same.

and BTW, I’m not sure that’s enough for Turing machines, since in that context functions are _data_, not codata, so they can be inspected.
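concretely, that parametric type is inhabited by e.g. insertion sort (a sketch, with the comparison argument moved first for convenience):

```
Require Import List. Import ListNotations.

(* a "comparison sort" in the parametric sense:
   forall T, (T -> T -> bool) -> list T -> list T *)
Fixpoint insert {T} (le : T -> T -> bool) (x : T) (l : list T) : list T :=
  match l with
  | [] => [x]
  | y :: l' => if le x y then x :: y :: l' else y :: insert le x l'
  end.

Fixpoint isort {T} (le : T -> T -> bool) (l : list T) : list T :=
  match l with
  | [] => []
  | x :: l' => insert le x (isort le l')
  end.
```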

from hanging out with complexity theorists, I think they will interpret the lower bound as a way to disprove anyone who says they have a Turing machine that performs comparison-based sorting in O(n) computation steps

you haven’t said what “comparison-based sorting” is

but I remembered the keyword they’d use: “oracle”

a sorting algorithm is a recursive datatype containing either a HOL function or a query of a comparison oracle with a continuation containing the remaining computation. This makes it possible to force the algorithm to use only comparisons and to track the number of comparisons made.
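in Coq, that datatype could be transcribed roughly as follows (a hypothetical sketch, not the actual AFP definitions): an algorithm either returns, or queries the comparison oracle and continues with the answer, so counting comparisons is just counting the queries along a run.

```
Inductive sort_alg (A R : Type) : Type :=
| Return : R -> sort_alg A R
| Query  : A -> A -> (bool -> sort_alg A R) -> sort_alg A R.

Arguments Return {A R} _.
Arguments Query  {A R} _ _ _.

(* run against a concrete oracle, counting the comparisons made *)
Fixpoint run {A R} (le : A -> A -> bool) (m : sort_alg A R) : R * nat :=
  match m with
  | Return r => (r, 0)
  | Query x y k => let (r, n) := run le (k (le x y)) in (r, S n)
  end.
```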

yes, you can define comparison based sorting at the Turing machine level as a TM that has access to an oracle for comparisons

oracles are as opaque as we need (they’re codata, like functions)

and models where all that counts are queries to oracles are fine

but okay, I do have an objection.

but why restrict to counting only oracle queries?

because that’s the theorem statement?

OTOH, HOL functions need not be computable. However, that means the Isabelle result is _stronger_ than needed.

that's not how you would use it for anything; you can talk about all the other computation that is done as well

Theorem 8.1. “Any comparison sort algorithm requires Omega (n lg n) comparisons in the worst case.”

“The decision tree model. [...] Control, data movement, and all other aspects of the algorithm are ignored.”

so I totally agree that model is not suitable for computability in general, but it seems perfectly fine for this theorem.
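(for the record, the whole argument in that model is a counting one)

```
a decision tree making at most k comparisons has at most 2^k leaves;
correctness on all n! input orders forces at least n! distinct leaves;
hence 2^k >= n!, i.e. k >= lg (n!) = Omega(n lg n).
```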

I argue that if you want to apply that theorem for anything useful, you have to put it in a general Turing machine context

in fact, the Isabelle theorem remains applicable in models of “hypercomputation” beyond Turing machines

I can record that’s your opinion, but I’m afraid I don’t see an actual argument anywhere.

to the extent that I see it, I would concede one hole:

you’d have to prove that Turing machines can be turned into HOL functions.

so, why would I care about some result in the decision tree model? What implements this model? How do I apply something proved about it elsewhere?

AFAIK from Cormen, the main application is “to beat n log n, go outside that model”

it’s a bit like working on external sorting algorithms or B-trees

but even a discussion of "beating n log n" presumes some underlying model of computation

just to delimit discussion a moment — the thread is about formalizing a certain textbook result.

and basically, there has to be a theorem somewhere relating the decision tree model to your general computation model, for the discussion about "beating" to make sense

the textbook result in question is about the decision tree model, and the Isabelle work satisfies that.

your questions seem relevant to going _beyond_ that result and filling in the dots surrounding it.

and I agree that for that, you should at least prove that Turing machines can be turned into HOL functions

so that you can turn the result into a result about Turing machines + comparison oracles.

however, the comments I objected to were about a different problem: why complexity theorists use such oracle models that ignore other computation.

and ditto for:

> it’s a bit like working on external sorting algorithms or B-trees

that work is even worse here: historically (and up to Cormen), the assumption is that disks are so slow that in-memory computation is free.

and those results aren’t used just for lower bounds.

And indeed, nowadays those results aren’t so useful anymore, and their applications (relational DBMSes) have been challenged, and often replaced by other designs, on technical grounds.

I _am_ curious, however, whether the original literature would do something else, but I can’t see why you’d care for a lower bound.

by Church's law (or so-called "Church-Turing thesis"), any notion of computation is reducible to Turing machines

lower bounds are interesting because they rule out that (a class of) computable functions exist

to perform this *ruling out* formally, the lower bound must be expressed directly about Turing machines or something equivalent

Cormen seemingly leaves to the reader the connection from decision trees to actual Turing machines, and to me this means it's not clear whether the Isabelle proofs actually talk about lower bounds of computable functions as generally understood

in particular, I would indeed expect HOL functions to give them a hard time if someone would attempt to bridge the gap

I think because it is a *lower bound*, there is a way to associate each run of an algorithm on a concrete random access machine with a decision tree: follow the path of decisions taken by the algorithm.

I think this translates into a lower bound for the machine. Obviously for upper bounds this is bogus, but that's ok, upper bounds are much much easier.

I don't want to take away from coq-library-complexity, which, other than its lack of comments, is really cool and useful.

It seems very hard to turn that lower bound into a statement about Turing machines, because comparison sorting assumes a very specific interface, and there is no notion of "interface" at the level of Turing machines.

Re "interface", I think that's exactly what oracles are for.

@Karl Palmskog AFAICS (and I’m not an expert), since Isabelle/HOL functions aren’t necessarily computable, and because of Isabelle’s result, I’m not sure that computable functions should show up in the result.

that’s arguably nitpicking, but the bigger problem is that this result is _not_ a lower bound about sorting on Turing machines or computable functions.

it’s about (hyper)computation with oracles, and about comparison sorting, which is a really different problem.

@Li-yao I see you’re technically correct: Turing machines with oracles are _not_ Turing machines (since oracles aren’t required to be computable)

(BTW, does coq-library-complexity define TMs with oracles? at least https://github.com/uds-psl/coq-library-undecidability/search?q=oracle doesn’t find that)

It doesn't (yet). The undecidability library does not, because it is based on a synthetic approach and often tries to be constructive, whereas reasoning based on oracles is inherently non-constructive. Of course, it doesn't have to be that way: once you agree to implement every function you write as a Turing machine, it is in principle possible to mechanise computability based on and around oracles (e.g. Turing reducibility). I would pick the lambda-calculus for that rather than Turing machines, because that makes mechanisation a lot easier, but we also don't have oracles for the lambda-calculus.

There is also https://github.com/uds-psl/coq-library-complexity, and because complexity theory cannot be carried out synthetically, there we need to base everything on a model of computation. It would definitely be an interesting continuation of this line of work to include oracles (and non-determinism), but it's not on the immediate agenda.

Be damned the sorting! :D That's rabbit hole too deep for me, for now.

I can't even prove that each step of a sorting algorithm (or, more generally, a permuting algorithm) is permuting.

Apologies, I just needed to vent a bit.

It seems I need to find some easier project to do in parallel, one that brings more happiness. :)

Cody Roux said:

Ooof, that repo does not suffer from an overabundance of comments.

Yes, sorry, I'm still in the process of polishing it and it is very work-in-progress.

If you know where to look, we have an NP-completeness proof of SAT (Cook's theorem) and the time hierarchy theorem as results in synthetic complexity theory:

https://github.com/uds-psl/coq-library-complexity/blob/coq-8.12/theories/L/Complexity/CookLevin.v

But for an outsider, the complexity-theory-related stuff is not really stable or usable yet.

Although just verifying the time complexity of functional programs is already possible and stable, as described in our ITP paper:

https://github.com/uds-psl/certifying-extraction-with-time-bounds

If anyone is interested in doing this, I'm happy to answer questions.

Last updated: Jun 23 2024 at 01:02 UTC