## Stream: Miscellaneous

### Topic: Hashmap formalization

#### walker (Oct 15 2022 at 11:41):

Is anyone aware of good formal model for hahsmap (paper or code .. coq or anything else)

The thing I really care about in the hashmap formalization is that it should have canonical form.

Google/scholar couldn't help me find anything so if anyon knows, please let me know.

#### Paolo Giarrusso (Oct 15 2022 at 12:36):

I assume what you need is canonical representations — two maps `m1`, `m2` with the same entries are Leibniz equal (`m1 = m2`)?
But do you need formal models for Coq hashmaps or a hashmaps in an imperative language? I'm not the expert but a few comments:

1. Why do you need a hashmap and not just a dictionary — why is stdpp's `gmap` not adequate? That's the "functional model" I'd typically use in general in separation logic. It uses a non-lossy encoding of elements into naturals, instead of a lossy hash...
2. If you want a Coq hashmap, it's going to be inefficient without mutable arrays :-|. Coq's new primitive arrays _might_ alleviate that, but functional languages often use structures closer to radix trees like gmap (https://infoscience.epfl.ch/record/213452).
3. If you solve the other challenges, how should collisions be resolved to ensure canonicity? Would you sort elements by "hash"?

#### Karl Palmskog (Oct 15 2022 at 12:50):

the Ynot project did a hashtable based on Hoare Type Teory for Coq 8.3: https://github.com/ynot-harvard/ynot/tree/master/examples/Data

However, I'm pretty sure most of the authors would call both the approach and code obsolete now. There is a new maintained version of Hoare Type Theory though, but without the hashtable: https://github.com/imdea-software/htt

#### walker (Oct 15 2022 at 13:40):

1- first yes leibniz equality is what I am looking for, why stdpp implementation is not fine?, I reached the conclusion that I might want to write proofs in the style of small step reflection to make them easily maintainable (by me) during all my experiments. But stdpp devs advised against combining both stdpp with ssreflect. So I think I will stop using stdpp and just use ssreflect and hope that my life gets easier once I re-implement all the stuff I need.

2, 3- that was why I was asking, I couldn't come up with trivial canonical form for collisions.

#### walker (Oct 15 2022 at 13:40):

I am not sure if that is wise decision (both in effort required for short term, and then effort required for long term).
Ofcourse you input will be appreciated.

#### Karl Palmskog (Oct 15 2022 at 13:41):

for ssreflect, you have finmap, which contains finite maps that are Leibniz equal when they have the same mappings: https://github.com/math-comp/finmap

#### walker (Oct 15 2022 at 13:45):

I am now excited! so I feel the urge to ask does it come also come with notion of finite sets or I need to create my own ?
( I know I can check it but I am too excited to be patient)

#### Karl Palmskog (Oct 15 2022 at 13:47):

there are multiple notions of what a finite set could mean. You have "finite set" in the sense of "a collection of elements inhabiting a finite type" already in basic MathComp. Then in finmap there are sets with finite number of elements drawn from a choiceType I believe (i.e., elements drawn from a type which has infinitely many inhabitants)

#### walker (Oct 15 2022 at 13:50):

I am not sure what the difference is, but from programming perspective, the sets I was imagining was map with key drawn from type T and value unit/true. :sweat_smile:

#### Karl Palmskog (Oct 15 2022 at 13:52):

a finite type can be formed out of the list `[1;2;3;4]`. Then a finset from MathComp is (decidable) membership predicate on that list.

On the other hand, you can have finmap finitely-supported sets even on `nat`, as long as each set has a finite number of members.

#### walker (Oct 15 2022 at 13:54):

I think I need the first one, thanks for all the pointers.

#### Karl Palmskog (Oct 15 2022 at 13:54):

https://math-comp.github.io/htmldoc_1_15_0/mathcomp.ssreflect.finset.html

#### walker (Oct 15 2022 at 13:55):

wow! real documentation!

thanks a lot!

#### Karl Palmskog (Oct 15 2022 at 13:56):

finset is not a map, however. if you need the map aspect: https://github.com/math-comp/finmap/blob/master/finmap.v (`{fmap K -> V} `)

#### walker (Oct 15 2022 at 13:57):

But I need both a collection that one can get from array and k/v pairs.

#### Paolo Giarrusso (Oct 15 2022 at 18:57):

Small clarification to your rationale: combining stdpp with the ssreflect tactics is fine (iris does it). The warning holds true for anything more advanced :-)

#### Paolo Giarrusso (Oct 15 2022 at 19:00):

(but of course the math-comp ecosystem is a very fine one, if you don't need Iris)

#### walker (Oct 15 2022 at 20:20):

It is not clear if I can get away without iris, given that I am interested in side effects:-

worst case I will have to rewrite the proofs for third time, which is not good, but not that bad, and I feel I really liked mathcomp ecosystem. .. at least the verbosity of the proofs, and ofc the documentation.

#### walker (Oct 15 2022 at 20:21):

but it is a good thing that I don't forget that I still need to learn iris, and I am patiently waiting for iris workshop next year.

#### Matthieu Sozeau (Oct 16 2022 at 07:59):

You might be interested otherwise on the canonical tries of Leroy and Appel (Google for the paper). They have perf benchmarks about them. They’re just maps though not hashmaps

#### Karl Palmskog (Oct 16 2022 at 10:15):

https://github.com/xavierleroy/canonical-binary-tries

#### Karl Palmskog (Oct 16 2022 at 10:16):

incidentally, we'd like to see this collection of finite maps become maintained....

#### walker (Oct 16 2022 at 12:38):

It certainly looks nice, and being AVL based is impressive, but my question here is, is there an advantage for using this implementation over mathcomp/finmap ?

#### Paolo Giarrusso (Oct 16 2022 at 14:00):

I was wondering how they abstract on the key type, but they don't, so I guess one would need to add that separately.

#### Paolo Giarrusso (Oct 16 2022 at 14:02):

Apart from that, https://hal.inria.fr/hal-03372247 describes the advantages. IIRC, what's nice is that they are canonical _and_ they do that without embedding canonicity proofs

#### Paolo Giarrusso (Oct 16 2022 at 14:03):

that relates to

Efficiency not only as extracted to OCaml (i.e., with proofs erased), but also as represented in Coq and computed within Coq (i.e., without proof erasure).

#### Karl Palmskog (Oct 16 2022 at 14:05):

the basic rule of thumb for choosing between mathcomp-finmap and triemap or similar: mathcomp-finmap is designed for proving, not computation inside Coq (I think most computation is locked), so unless you need computation, finmap may be the way to go

#### Paolo Giarrusso (Oct 16 2022 at 14:17):

stdpp's gmap is efficient after extraction, and acceptable for computation in Coq with `vm_compute`, but you don't want to print normalized maps because of the embedded proofs; that's exactly what the new paper should fix

#### Paolo Giarrusso (Oct 16 2022 at 14:21):

https://gitlab.mpi-sws.org/iris/stdpp/-/issues/64 discusses the issues...

#### Paolo Giarrusso (Oct 16 2022 at 14:23):

annoyingly, I'm not sure the new paper would remove _all_ proofs (at least, based on some skimming): it could probably replace the underlying storage (`stdpp.pmap`), but `gmap` needs another layer of proofs:

``````(** We pack a [Pmap] together with a proof that ensures that all keys correspond
to codes of actual elements of the countable type. *)
Definition gmap_wf K `{Countable K} {A} (m : Pmap A) : Prop :=
bool_decide (map_Forall (λ p _, encode (A:=K) <\$> decode p = Some p) m).
Record gmap K `{Countable K} A := GMap {
gmap_car : Pmap A;
gmap_prf : gmap_wf K gmap_car
}.
``````

#### Matthieu Sozeau (Oct 17 2022 at 12:38):

We found out in certicoq though that canonical tries can still be a bottleneck because of the conversion to positive. Having (hash) maps with bignum keys would certainly beat them

#### Alexander Gryzlov (Oct 21 2022 at 16:43):

Karl Palmskog said:

the Ynot project did a hashtable based on Hoare Type Teory for Coq 8.3: https://github.com/ynot-harvard/ynot/tree/master/examples/Data

However, I'm pretty sure most of the authors would call both the approach and code obsolete now. There is a new maintained version of Hoare Type Theory though, but without the hashtable: https://github.com/imdea-software/htt

Umm, we do have a hash table example in current HTT: https://github.com/imdea-software/htt/blob/master/examples/hashtab.v

Last updated: Dec 07 2023 at 17:01 UTC