How to do simplification of expression correctly (?) · Coq users

This is the type of linear expression for the "big M method" from Linear Programming. For example, the value cons (1 # 2, 6 # 8, 5) (cons (-1, 0, 0) nil)) corresponds to expression (1/2+6/8*M)*x_5 + (-1/1+0/1*M)*x_0 or, simplified, (1/2+3/4*M)*x_5 - x_0.

What kind of inductive datatype and other stuff would be the best for representing such simplified expression and for proving a property - given linear expression is simplified correctly?

I want to simplify any given linear expression as human would do that - coefficients "1" are removed, no additions of "0", no brackets around, for example, coefficient "2+0*M" or, simply, "2", fractions are simplified ("3/6" to "1/2", "12/3" to "4", etc), and etc.

Right now I have no adequate idea... If I find one while tinkering, I will post here, of course.

Paolo Giarrusso (Nov 06 2022 at 23:30):

given your examples, it sounds like you're interested in the theory of rewriting systems; simplified expressions would then be normal forms of a suitable rewriting system, with rules like 1 * x -> x, x * 1 -> x

Paolo Giarrusso (Nov 06 2022 at 23:31):

at least, that's what you might want to represent expressions that might not yet be simplified

Paolo Giarrusso (Nov 06 2022 at 23:32):

Usually we represent expressions as abstract syntax trees, where you don't really write parentheses. Adding needed parentheses is the job of a pretty-printer

Paolo Giarrusso (Nov 06 2022 at 23:37):

Actually, clarification: do you need to express predicates on linear_expression, or to handle simplification from arbitrary expressions into your format? My idea was about the latter because that's what your examples seem about, but maybe you meant the former...

Paolo Giarrusso (Nov 06 2022 at 23:38):

Lessness (Nov 07 2022 at 00:56):

Require Import ZArith Znumtheory QArith.
Set Implicit Arguments.

Definition linexp := list (Q * Q * nat).

Inductive exp :=
  | empty: exp
  | variable: nat -> exp
  | bigM: exp
  | plus: exp -> exp -> exp
  | minus: exp -> exp -> exp
  | mult: exp -> exp -> exp
  | opp: exp -> exp
  | integer: Z -> exp
  | fraction: Z -> positive -> exp.

Fixpoint linexp_into_nonsimplified_exp (le: linexp): exp :=
  match le with
  | nil => integer 0
  | cons (q1 # p1, q2 # p2, n) t =>
      plus (mult (plus (fraction q1 p1) (mult (fraction q2 p2) bigM)) (variable n)) (linexp_into_nonsimplified_exp t)
  end.


Fixpoint linexp_value (M_value: Q) (var_value: nat -> Q) (LE: linexp): Q :=
  match LE with
  | nil => 0#1
  | cons (q1, q2, n) t => (q1 + q2 * M_value) * (var_value n) + linexp_value M_value var_value t
  end.

Fixpoint exp_value (M_value: Q) (var_value: nat -> Q) (e: exp): Q :=
  match e with
  | empty => 0
  | variable n => var_value n
  | bigM => M_value
  | plus e1 e2 => exp_value M_value var_value e1 + exp_value M_value var_value e2
  | minus e1 e2 => exp_value M_value var_value e1 - exp_value M_value var_value e2
  | mult e1 e2 => exp_value M_value var_value e1 * exp_value M_value var_value e2
  | opp e1 => - exp_value M_value var_value e1
  | integer z => z # 1
  | fraction z p => z # p
  end.


Definition exp_equiv (e1 e2: exp): Prop :=
  forall (M_value: Q) (var_value: nat -> Q),
  exp_value M_value var_value e1 == exp_value M_value var_value e2.

Definition linexp_exp_equiv (le: linexp) (e: exp): Prop :=
  forall (M_value: Q) (var_value: nat -> Q),
  linexp_value M_value var_value le == exp_value M_value var_value e.

Definition exp_has_value (e: exp) (q: Q): Prop := forall M_value var_value, exp_value M_value var_value e = q.

Fixpoint simplified (e: exp): Prop :=
  match e with
  | empty => True
  | variable n => True
  | bigM => True
  | plus e1 e2 => ~ exp_has_value e1 0 /\ ~ exp_has_value e2 0 /\ simplified e1 /\ simplified e2
  | minus e1 e2 => ~ exp_has_value e1 0 /\ ~ exp_has_value e2 0 /\ simplified e1 /\ simplified e2
  | mult e1 e2 => ~ exp_has_value e1 0 /\ ~ exp_has_value e2 0 /\
                  ~ exp_has_value e1 1 /\ ~ exp_has_value e2 1 /\ simplified e1 /\ simplified e2
  | opp e1 => ~ exp_has_value e1 0 /\ (forall q, exp_has_value e q -> q > 0) /\ simplified e1
  | integer z => (z > 0)%Z
  | fraction z p => (z > 0)%Z /\ (p > 1)%positive /\ rel_prime z (Zpos p)
  end.

Definition correct_simpl_algorithm (algo: linexp -> exp): Prop :=
  (forall (le: linexp), linexp_exp_equiv le (algo le) /\ simplified (algo le)).

Paolo Giarrusso (Nov 07 2022 at 02:52):

algo : linexp -> exp is an alternative I didn't think of, and it seems surprising: some of the normalization constraints you (might) want are enforced automatically on linexp, while exp lets you write (x0 - x1) * x2 * x3 — which passes simplified ?

Paolo Giarrusso (Nov 07 2022 at 02:55):

For linexp, you could demand entries are sorted by the 3rd component in strictly increasing order (to avoid multiple entries on the same variable), then forbid entries matching (q1, q2, n) where q1 == 0 and q2 == 0

Lessness (Nov 07 2022 at 05:27):

linexp_exp_equiv le (algo le) condition in the correct_simpl_algorithm should take care of it, I believe... If it's satisfied, linear expression le corresponds to the same function as the simplified expression algo le (and that disallows examples like (x0-x1)*x2*x3).

Paolo Giarrusso (Nov 07 2022 at 12:09):

That seems fragile: algo is allowed to return x0^2 + linear_poly - x0^2, which is semantically linear (as you point out) but not syntactically linear.

Stream: Coq users

Topic: How to do simplification of expression correctly (?)

Lessness (Nov 06 2022 at 21:23):