Stream: Coq users

Topic: Coq tree-sitter grammar


view this post on Zulip Ayhon (Mar 04 2024 at 12:32):

I was wondering whether there is or has been any effort in creating a tree-sitter grammar for Coq. Looking in GitHub there seems to be no project called tree-sitter-coq, which according to their naming conventions is how I would have expected to find it.

view this post on Zulip Gaëtan Gilbert (Mar 04 2024 at 12:47):

the Coq grammar is dynamic, typically Notation changes it. Not sure it's possible to handle that with tree sitter.

view this post on Zulip Karl Palmskog (Mar 04 2024 at 12:51):

also, lots of plugins can and do change the grammar. The general consensus is that the only thing that can parse/lex Coq code reliably is Coq itself. See these projects for ways to obtain a Coq AST or other internal structure from Coq:

view this post on Zulip Ayhon (Mar 04 2024 at 19:58):

I see. My main motivation for building a tree-sitter grammar was to improve the syntax highlighting. I'm currently using vim's extension and it seems to be using some regex rules to perform highlighting. Would something like this be possible through coq-lsp?

view this post on Zulip Karl Palmskog (Mar 04 2024 at 20:28):

I think syntax highlighting in an editor like VS Code (and any other editor that supports LSP) can be done via coq-lsp, yes. You would annotate the syntax at the LSP (protocol) level, and then reflect graphically in VS Code or other LSP-based editor. But best to ask in #coq-lsp

view this post on Zulip Emilio Jesús Gallego Arias (Mar 05 2024 at 22:42):

Yes, syntax highlighting should be a nice project to do in coq-lsp.

Note that for plugins, you need to create a plugin that will be loaded with each plugin, but we have the infra for that, tho it is still pretty painful.

view this post on Zulip Emilio Jesús Gallego Arias (Mar 05 2024 at 22:42):

I'll be very happy to help get the project started.

view this post on Zulip Emilio Jesús Gallego Arias (Mar 05 2024 at 22:44):

@Ayhon see https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#textDocument_semanticTokens ; there are quite a few tricky things tho

view this post on Zulip Emilio Jesús Gallego Arias (Mar 05 2024 at 22:44):

But I'm familiar with them

view this post on Zulip Emilio Jesús Gallego Arias (Mar 06 2024 at 15:07):

Regarding the original quesiton, I think it should be possible to have a reasonable tree-sitter grammar for Coq, that would work with most plugins. For things like highligthing or indenting, that should be good enough, other applications IMO require exact parsing, then you need indeed to use Coq's parser

view this post on Zulip Ayhon (Mar 06 2024 at 16:12):

Emilio Jesús Gallego Arias said:

I'll be very happy to help get the project started.

I wouldn't mind giving it a try, although I'm not too knowledgeable in OCaml and coq and I fear that might hold me back. I went over the section in the website you linked and it seems like a fun project.

view this post on Zulip Ayhon (Mar 06 2024 at 16:15):

Emilio Jesús Gallego Arias said:

Regarding the original quesiton, I think it should be possible to have a reasonable tree-sitter grammar for Coq, that would work with most plugins. For things like highligthing or indenting, that should be good enough, other applications IMO require exact parsing, then you need indeed to use Coq's parser

I was thinking about perhaps creating a parser for the EBNF like syntax in orderedGrammar and use that to generate a tree-sitter grammar automatically. I could at least serve as a basis.

view this post on Zulip Emilio Jesús Gallego Arias (Mar 12 2024 at 15:16):

@Ayhon , indeed as of today, handling precisely Coq Ast requires OCaml, there are some projects like PyCoq that could open the door to handling stuff in python, but that's far from ready.

But indeed, I was going to point out to the grammar spec, that should give you a pretty good basic for an approximation; then if you test the grammar in some popular plugins, it should be feasible to get some decent tree-sitter support


Last updated: Jun 22 2024 at 15:01 UTC