Stream: Coq devs & plugin devs

Topic: memtrace


view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:29):

@Emilio Jesús Gallego Arias I don't see anything suspicious in Byte.v with memtrace?

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:32):

In particular, find_or.loop only allocates a bit more than 20M according to the tool.

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:38):

My results gave me 2GiB

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:38):

I dunno

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:38):

very suspiciou, you are in 4.11.1 right

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:38):

?

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:38):

Yeah I am getting very weird results

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:41):

4.11.1+fp

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:42):

I don't think that the +fp changes anything

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:42):

I'll upload a file

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:42):

allocations in List.v look innocuous as well

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:42):

can you upload the file for Strings/Byte.v ?

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:43):

the monad does allocate quite a bit but these allocations never reach the major heap

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:43):

let's see

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:43):

Byte.v.ctf

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:48):

Umm, what file are you compiling?

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:48):

is that Strings/Byte.v?

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:48):

Ah, sorry, I was compiling Init.Byte -_-

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:48):

misread your post :p

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:49):

Let's try again

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:50):

For example why is or_find allocating

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:50):

it should not allocate at all

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:51):

now it does look better (or not depending on your point of view)

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:51):

well it's a closure that captures variables

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:51):

so if it's not inlined, you have to allocate those variables

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:51):

in theory these shouldn't matter performancewise because they're very very short-lived

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:52):

Major heap only accounts for 320M in my trace for instance

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:52):

for a grand total of 16.6G

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:55):

Yeah I'm looking at the major heap now

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:55):

and going top down

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:55):

will switch to the simple compiler tho as to get a better understanding

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:55):

I confirm that repr_if_not_found never reaches major

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:55):

tho the pattern of allocation for quite simple proofs like in List.v seems a bit worrying

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:56):

yeah it doesn't

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:56):

I think that looking at fiat-crypto or VST is a good way to really start worrying

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:57):

Indeed let's see what happens there; but I'm a bit shocked that for example byte takes 17GiB of allocation

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 21:57):

tho most of it in the minor heap so likely harmless

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:57):

OCaml programs allocate a lot

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:57):

Coq is not specific in that regard

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:58):

The use of a two-generation GC is critical for perfs

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 21:58):

It's quite well-documented, I think that the "unraveling OCaml" book mentions this fact

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 22:01):

Sure, I mean more than it seems to me, that our proof engine is inherently leaky, in the sense that allocation grows with the number of tactics

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 22:01):

unless you call optimize proof I guess

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:01):

Heaps are inherently leaky :shrug:

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 22:02):

?

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 22:02):

Proof search is not

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:02):

If you have a global state and no GC, you're essentially leaky

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 22:02):

imagine prolog was implemeted this way

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:03):

I am afraid I do not believe in Prolog

view this post on Zulip Emilio Jesús Gallego Arias (Oct 08 2020 at 22:03):

not sure what that means

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:05):

Anyways, I'd be much more worried about the module system, which induces memory consumption by proxy

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:05):

(see e.g. my last PR https://github.com/coq/coq/pull/13156)

view this post on Zulip Paolo Giarrusso (Oct 08 2020 at 22:37):

can flambda remove allocations? I am missing context, but operations in the minor heap are _not_ free — they pollute the cache

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:38):

The minor heap has a good cache locality though

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:38):

But indeed flambda does remove allocations

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:38):

The find_or above is probably a good example

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:39):

if you inline the loop, you become allocation-free

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:39):

which is a typical flambda optimization

view this post on Zulip Paolo Giarrusso (Oct 08 2020 at 22:39):

sounds better

view this post on Zulip Paolo Giarrusso (Oct 08 2020 at 22:39):

I mean, doing map (+1) (map (+1) xs) without fusion for large xs means writing 2 copies of xs instead of 1

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:39):

But relying on flambda is fragile, I believe

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:40):

for tight loops we should prohibit closure creation by hand

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:40):

(I am a bit surprised that in the vanilla compiler the aforementioned function is not already inlined)

view this post on Zulip Paolo Giarrusso (Oct 08 2020 at 22:41):

you’re the expert, but is there any optimization in OCaml without flambda?

view this post on Zulip Paolo Giarrusso (Oct 08 2020 at 22:42):

as usual relying on optimizations has tradeoffs, and I don’t know how good flambda is

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:42):

Not much, but loop inlining is one of them when the compiler realizes that you wanted to write a pure FP while

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:42):

I am not too fond of flambda

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:42):

it does give a lot of optimizations, but it's not quite robust

view this post on Zulip Paolo Giarrusso (Oct 08 2020 at 22:42):

OTOH, some manual optimizations can require destroying modularity. So pick your poison.

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:43):

one advantage of the vanilla compiler is that it is quite predictable

view this post on Zulip Pierre-Marie Pédrot (Oct 08 2020 at 22:43):

yeah I know :/

view this post on Zulip Paolo Giarrusso (Oct 08 2020 at 22:45):

Anyway: allocations that die immediately are only free at GC time (because copying GC only copies live objects), not for caches (worst-case, you must throw away useful data from caches, load memory, zero it, then initialize the objects, and then stop using them and throw away other useful data and restart)

view this post on Zulip Paolo Giarrusso (Oct 08 2020 at 22:48):

Probably it’s not as bad, and maybe the young generation fits in L2, but the cache would help for other things as well :-P


Last updated: Oct 15 2021 at 20:02 UTC