Stream: Coq users

Topic: ✔ Simple question about removing duplicates from a list


view this post on Zulip David Landsberg (Feb 14 2022 at 09:17):

Hi, very new to Coq and still wrestling. Just want to know how to remove duplicates from a list. For example, if i have the following definition ...

Definition ml : list nat := [3;4;4;5;6].

... Then how would i remove the duplicates of this list, and also store the result in a new definition. I have tried a lot of tutorials, and have also looked at the library (in particular, the nodup functions), but to no avail. Any advice would be greatly appreciated!

view this post on Zulip David Landsberg (Feb 14 2022 at 09:32):

Addendum: It seems I can do something like: Definition mine : NoDup [3;4;4;5;6], but this seems to be in propositional form, and requires me to prove that the list has no duplicates. What I need is something that automatically computes the list without duplicates for me.

view this post on Zulip Ali Caglayan (Feb 14 2022 at 09:33):

Since you have a list of nats, we can do this for anything that has a boolean equality/decidable equality.

The idea I would go for is to have a function which destructs the list, checks that the new element isn't already in the list, and finally adds it to the list accordingly.

Something like this should work:

Require Import List.
Require Import Nat.
Import ListNotations.

Local Open Scope list_scope.
Local Open Scope nat_scope.


Fixpoint remove_duplicates_aux (new old : list nat) : list nat :=
  match old with
  | [] => new
  | x :: xs =>
    if existsb (fun n => x =? n) new then
      remove_duplicates_aux new xs
    else
      remove_duplicates_aux (x :: new) xs
  end.

Definition remove_duplicates (l : list nat) : list nat :=
  remove_duplicates_aux [] l.

Eval cbv in (remove_duplicates [3;4;4;5;6]).

view this post on Zulip Ali Caglayan (Feb 14 2022 at 09:34):

Of course when putting back the list together things are done backwards, so you might have to reverse the output of remove_duplicates_aux in the definition of remove_duplicates if you care about preserving that sort of thing.

view this post on Zulip Ali Caglayan (Feb 14 2022 at 09:35):

As a general rule of thumb, if the type you have has a decidable/boolean equality then it is typically better to work with that than the more general propositional one, since it computes.

view this post on Zulip Ali Caglayan (Feb 14 2022 at 09:36):

An exercise for you would be to prove that NoDup holds for this result. That is the proposition stating that there are no duplicates present in the list. The actual remove_duplicates function is your algorithm to do so.

view this post on Zulip David Landsberg (Feb 14 2022 at 09:37):

Really appreciated! That all compiles for me I'll have a closer look.

Thanks again for your help :+1:

view this post on Zulip Notification Bot (Feb 14 2022 at 09:45):

David Landsberg has marked this topic as resolved.

view this post on Zulip Notification Bot (Feb 14 2022 at 09:53):

David Landsberg has marked this topic as unresolved.

view this post on Zulip David Landsberg (Feb 14 2022 at 09:56):

There is a section "Effective computation of a list without duplicates", and a function (with lower case letters as opposed to NoDup), called nodup. Does this do something similar to your remove_duplicates function above? Can't get it to work. Any idea how to use it? If so, might save me having to prove the NoDup theorem for your function above! Thanks.

https://coq.inria.fr/library/Coq.Lists.List.html#nodup_fixed_point

view this post on Zulip David Landsberg (Feb 14 2022 at 09:57):

In particular, the nodup_fixed_point function looks quite similar to remove_duplicates

view this post on Zulip Michael Soegtrop (Feb 14 2022 at 09:57):

I would propose a different solution. First you already found nodup. When you do Check nodup you get:

nodup
     : forall A : Type, (forall x y : A, {x = y} + {x <> y}) -> list A -> list A

You see that nodup requires as argument a function which can decide the equality of your list elements. The {x = y} + {x <> y} means a way to compute if x=y holds or x<>y holds. You can search such a function with:

Search ({?n = ?m} + {?n <> ?m}).

(which gives many results, but it is still easy to find). With this you can do:

Eval cbv in nodup PeanoNat.Nat.eq_dec ml.

But this doesn't answer the question how to make a definition which has this value (neither does the answer of @Ali Caglayan). This is slightly tricky in Coq because when you define something, the definition is not evaluated. But you can force Coq to do so as follows:

Definition ml_nodup : list nat.
  let x:= eval cbv in (nodup PeanoNat.Nat.eq_dec ml) in exact x.
Defined.

Print ml_nodup.

which gives:

ml_nodup = [3; 4; 5; 6]
     : list nat

The trick is that you don't give a definition in Definition, so that Coq enters proof mode. There you can use tactics to construct the definition.

view this post on Zulip David Landsberg (Feb 14 2022 at 10:03):

Thanks @Michael Soegtrop , really helpful!

view this post on Zulip Notification Bot (Feb 14 2022 at 10:13):

David Landsberg has marked this topic as resolved.

view this post on Zulip Paolo Giarrusso (Feb 14 2022 at 10:33):

@Michael Soegtrop @David Landsberg in this case, you can also use
Definition ml_nodup : list nat := Eval cbv in nodup PeanoNat.Nat.eq_dec ml.

view this post on Zulip Michael Soegtrop (Feb 14 2022 at 10:46):

@Paolo Giarrusso : thanks I didn't know this - and I wouldn't expect it from the documentation of Eval in the reference manual. Should I file a documentation issue or did I miss this somehow?

view this post on Zulip Paolo Giarrusso (Feb 14 2022 at 11:36):

@Michael Soegtrop While https://coq.inria.fr/refman/language/core/definitions.html#coq:cmd.Definition mentions this syntax,
IMHO the reader (of docs) is always right, and indeed https://coq.inria.fr/refman/proofs/writing-proofs/equality.html#coq:cmd.Eval needs a link to this syntax. And maybe Eval needs another entry in the index under https://coq.inria.fr/refman/genindex.html#E.

view this post on Zulip Paolo Giarrusso (Feb 14 2022 at 11:36):

An issue is welcome, but given how these are prioritized it should probably come with "help wanted".

view this post on Zulip Paolo Giarrusso (Feb 14 2022 at 11:38):

and to avoid such issues lingering, maybe we can batch documentation issues together — looking up rst syntax takes longer than actually fixing this, but would amortize well over several doc issues

view this post on Zulip Michael Soegtrop (Feb 14 2022 at 12:11):

I created an issue. I have issues with rst syntax - more about Coq specific conventions than rst in general. I guess a Coq specific RST cheat sheet would make sense - looking at the complexity and size (=contributor count) of Coq documentation.

view this post on Zulip Pierre Roux (Feb 14 2022 at 12:28):

Do these two readmes help?

view this post on Zulip Pierre Roux (Feb 14 2022 at 12:28):

https://github.com/coq/coq/blob/master/doc/sphinx/README.rst

view this post on Zulip Pierre Roux (Feb 14 2022 at 12:28):

https://github.com/coq/coq/blob/master/doc/tools/docgram/README.md

view this post on Zulip Michael Soegtrop (Feb 14 2022 at 13:03):

@Pierre Roux : indeed the documentation looks much improved over the last time I had a look at (which is quite a while back). I guess I have to give it a try ...

view this post on Zulip zohaze (May 20 2023 at 11:47):

(deleted)


Last updated: Oct 13 2024 at 01:02 UTC