## Stream: Coq users

### Topic: How to do simplification of expression correctly (?)

#### Lessness (Nov 06 2022 at 21:23):

Hello!

So... I have this.

``````Require Import ZArith QArith.
Set Implicit Arguments.

Definition linear_expression := list (Q * Q * nat).
``````

This is the type of linear expression for the "big M method" from Linear Programming. For example, the value `cons (1 # 2, 6 # 8, 5) (cons (-1, 0, 0) nil))` corresponds to expression `(1/2+6/8*M)*x_5 + (-1/1+0/1*M)*x_0` or, simplified, `(1/2+3/4*M)*x_5 - x_0`.

What kind of inductive datatype and other stuff would be the best for representing such simplified expression and for proving a property - given linear expression is simplified correctly?

I want to simplify any given linear expression as human would do that - coefficients "1" are removed, no additions of "0", no brackets around, for example, coefficient "2+0*M" or, simply, "2", fractions are simplified ("3/6" to "1/2", "12/3" to "4", etc), and etc.

Right now I have no adequate idea... If I find one while tinkering, I will post here, of course.

#### Paolo Giarrusso (Nov 06 2022 at 23:30):

given your examples, it sounds like you're interested in the theory of rewriting systems; simplified expressions would then be normal forms of a suitable rewriting system, with rules like `1 * x -> x`, `x * 1 -> x`

#### Paolo Giarrusso (Nov 06 2022 at 23:31):

at least, that's what you might want to represent expressions that might not yet be simplified

#### Paolo Giarrusso (Nov 06 2022 at 23:32):

no brackets around

Usually we represent expressions as abstract syntax trees, where you don't really write parentheses. Adding needed parentheses is the job of a pretty-printer

#### Paolo Giarrusso (Nov 06 2022 at 23:37):

What kind of inductive datatype and other stuff would be the best for representing such simplified expression and for proving a property - given linear expression is simplified correctly?

Actually, clarification: do you need to express predicates on `linear_expression`, or to handle simplification from arbitrary expressions into your format? My idea was about the latter because that's what your examples seem about, but maybe you meant the former...

#### Paolo Giarrusso (Nov 06 2022 at 23:38):

fractions are simplified ("3/6" to "1/2", "12/3" to "4", etc), and etc.

For that you can use `Qc` from `Coq.QArith.Qcanon.`

#### Lessness (Nov 07 2022 at 00:56):

For now I have such formalization of the idea.

``````Require Import ZArith Znumtheory QArith.
Set Implicit Arguments.

Definition linexp := list (Q * Q * nat).

Inductive exp :=
| empty: exp
| variable: nat -> exp
| bigM: exp
| plus: exp -> exp -> exp
| minus: exp -> exp -> exp
| mult: exp -> exp -> exp
| opp: exp -> exp
| integer: Z -> exp
| fraction: Z -> positive -> exp.

Fixpoint linexp_into_nonsimplified_exp (le: linexp): exp :=
match le with
| nil => integer 0
| cons (q1 # p1, q2 # p2, n) t =>
plus (mult (plus (fraction q1 p1) (mult (fraction q2 p2) bigM)) (variable n)) (linexp_into_nonsimplified_exp t)
end.

Fixpoint linexp_value (M_value: Q) (var_value: nat -> Q) (LE: linexp): Q :=
match LE with
| nil => 0#1
| cons (q1, q2, n) t => (q1 + q2 * M_value) * (var_value n) + linexp_value M_value var_value t
end.

Fixpoint exp_value (M_value: Q) (var_value: nat -> Q) (e: exp): Q :=
match e with
| empty => 0
| variable n => var_value n
| bigM => M_value
| plus e1 e2 => exp_value M_value var_value e1 + exp_value M_value var_value e2
| minus e1 e2 => exp_value M_value var_value e1 - exp_value M_value var_value e2
| mult e1 e2 => exp_value M_value var_value e1 * exp_value M_value var_value e2
| opp e1 => - exp_value M_value var_value e1
| integer z => z # 1
| fraction z p => z # p
end.

Definition exp_equiv (e1 e2: exp): Prop :=
forall (M_value: Q) (var_value: nat -> Q),
exp_value M_value var_value e1 == exp_value M_value var_value e2.

Definition linexp_exp_equiv (le: linexp) (e: exp): Prop :=
forall (M_value: Q) (var_value: nat -> Q),
linexp_value M_value var_value le == exp_value M_value var_value e.

Definition exp_has_value (e: exp) (q: Q): Prop := forall M_value var_value, exp_value M_value var_value e = q.

Fixpoint simplified (e: exp): Prop :=
match e with
| empty => True
| variable n => True
| bigM => True
| plus e1 e2 => ~ exp_has_value e1 0 /\ ~ exp_has_value e2 0 /\ simplified e1 /\ simplified e2
| minus e1 e2 => ~ exp_has_value e1 0 /\ ~ exp_has_value e2 0 /\ simplified e1 /\ simplified e2
| mult e1 e2 => ~ exp_has_value e1 0 /\ ~ exp_has_value e2 0 /\
~ exp_has_value e1 1 /\ ~ exp_has_value e2 1 /\ simplified e1 /\ simplified e2
| opp e1 => ~ exp_has_value e1 0 /\ (forall q, exp_has_value e q -> q > 0) /\ simplified e1
| integer z => (z > 0)%Z
| fraction z p => (z > 0)%Z /\ (p > 1)%positive /\ rel_prime z (Zpos p)
end.

Definition correct_simpl_algorithm (algo: linexp -> exp): Prop :=
(forall (le: linexp), linexp_exp_equiv le (algo le) /\ simplified (algo le)).
``````

Makes sense to me right now.

#### Paolo Giarrusso (Nov 07 2022 at 02:52):

`algo : linexp -> exp` is an alternative I didn't think of, and it seems surprising: some of the normalization constraints you (might) want are enforced automatically on `linexp`, while `exp` lets you write `(x0 - x1) * x2 * x3` — which passes `simplified` ?

#### Paolo Giarrusso (Nov 07 2022 at 02:55):

For `linexp`, you could demand entries are sorted by the 3rd component in strictly increasing order (to avoid multiple entries on the same variable), then forbid entries matching `(q1, q2, n)` where `q1 == 0` and `q2 == 0`

#### Lessness (Nov 07 2022 at 05:27):

Paolo Giarrusso said:

`algo : linexp -> exp` is an alternative I didn't think of, and it seems surprising: some of the normalization constraints you (might) want are enforced automatically on `linexp`, while `exp` lets you write `(x0 - x1) * x2 * x3` — which passes `simplified` ?

`linexp_exp_equiv le (algo le)` condition in the `correct_simpl_algorithm` should take care of it, I believe... If it's satisfied, linear expression `le` corresponds to the same function as the simplified expression `algo le` (and that disallows examples like `(x0-x1)*x2*x3`).

#### Paolo Giarrusso (Nov 07 2022 at 12:09):

That seems fragile: algo is allowed to return `x0^2 + linear_poly - x0^2`, which is semantically linear (as you point out) but not syntactically linear.

#### Lessness (Nov 07 2022 at 16:12):

You're completely right. Back to thinking....

#### Paolo Giarrusso (Nov 07 2022 at 16:20):

Something like https://en.wikipedia.org/wiki/Rewriting#Term_rewriting_systems seems potentially relevant (not sure _that_ is the best reference, but it's the right topic)

#### Paolo Giarrusso (Nov 07 2022 at 16:21):

In particular, here's what I meant earlier about "normal forms": https://en.wikipedia.org/wiki/Normal_form_(abstract_rewriting)

Last updated: Jun 18 2024 at 08:01 UTC