Stream: Coq users

Topic: How to do simplification of expression correctly (?)


view this post on Zulip Lessness (Nov 06 2022 at 21:23):

Hello!

So... I have this.

Require Import ZArith QArith.
Set Implicit Arguments.

Definition linear_expression := list (Q * Q * nat).

This is the type of linear expression for the "big M method" from Linear Programming. For example, the value cons (1 # 2, 6 # 8, 5) (cons (-1, 0, 0) nil)) corresponds to expression (1/2+6/8*M)*x_5 + (-1/1+0/1*M)*x_0 or, simplified, (1/2+3/4*M)*x_5 - x_0.

What kind of inductive datatype and other stuff would be the best for representing such simplified expression and for proving a property - given linear expression is simplified correctly?

I want to simplify any given linear expression as human would do that - coefficients "1" are removed, no additions of "0", no brackets around, for example, coefficient "2+0*M" or, simply, "2", fractions are simplified ("3/6" to "1/2", "12/3" to "4", etc), and etc.

Right now I have no adequate idea... If I find one while tinkering, I will post here, of course.

view this post on Zulip Paolo Giarrusso (Nov 06 2022 at 23:30):

given your examples, it sounds like you're interested in the theory of rewriting systems; simplified expressions would then be normal forms of a suitable rewriting system, with rules like 1 * x -> x, x * 1 -> x

view this post on Zulip Paolo Giarrusso (Nov 06 2022 at 23:31):

at least, that's what you might want to represent expressions that might not yet be simplified

view this post on Zulip Paolo Giarrusso (Nov 06 2022 at 23:32):

no brackets around

Usually we represent expressions as abstract syntax trees, where you don't really write parentheses. Adding needed parentheses is the job of a pretty-printer

view this post on Zulip Paolo Giarrusso (Nov 06 2022 at 23:37):

What kind of inductive datatype and other stuff would be the best for representing such simplified expression and for proving a property - given linear expression is simplified correctly?

Actually, clarification: do you need to express predicates on linear_expression, or to handle simplification from arbitrary expressions into your format? My idea was about the latter because that's what your examples seem about, but maybe you meant the former...

view this post on Zulip Paolo Giarrusso (Nov 06 2022 at 23:38):

fractions are simplified ("3/6" to "1/2", "12/3" to "4", etc), and etc.

For that you can use Qc from Coq.QArith.Qcanon.

view this post on Zulip Lessness (Nov 07 2022 at 00:56):

For now I have such formalization of the idea.

Require Import ZArith QArith.
Set Implicit Arguments.

Definition linexp := list (Q * Q * nat).

Inductive exp :=
  | empty: exp
  | variable: nat -> exp
  | bigM: exp
  | plus: exp -> exp -> exp
  | minus: exp -> exp -> exp
  | mult: exp -> exp -> exp
  | opp: exp -> exp
  | integer: Z -> exp
  | fraction: Z -> positive -> exp.

Fixpoint linexp_into_nonsimplified_exp (le: linexp): exp :=
  match le with
  | nil => integer 0
  | cons (q1 # p1, q2 # p2, n) t =>
      plus (mult (plus (fraction q1 p1) (mult (fraction q2 p2) bigM)) (variable n)) (linexp_into_nonsimplified_exp t)
  end.


Fixpoint linexp_value (M_value: Q) (var_value: nat -> Q) (LE: linexp): Q :=
  match LE with
  | nil => 0#1
  | cons (q1, q2, n) t => (q1 + q2 * M_value) * (var_value n) + linexp_value M_value var_value t
  end.

Fixpoint exp_value (M_value: Q) (var_value: nat -> Q) (e: exp): Q :=
  match e with
  | empty => 0
  | variable n => var_value n
  | bigM => M_value
  | plus e1 e2 => exp_value M_value var_value e1 + exp_value M_value var_value e2
  | minus e1 e2 => exp_value M_value var_value e1 - exp_value M_value var_value e2
  | mult e1 e2 => exp_value M_value var_value e1 * exp_value M_value var_value e2
  | opp e1 => - exp_value M_value var_value e1
  | integer z => z # 1
  | fraction z p => z # p
  end.


Definition exp_equiv (e1 e2: exp): Prop :=
  forall (M_value: Q) (var_value: nat -> Q),
  exp_value M_value var_value e1 == exp_value M_value var_value e2.

Definition linexp_exp_equiv (le: linexp) (e: exp): Prop :=
  forall (M_value: Q) (var_value: nat -> Q),
  linexp_value M_value var_value le == exp_value M_value var_value e.

Definition exp_has_value (e: exp) (q: Q): Prop := forall M_value var_value, exp_value M_value var_value e = q.


Fixpoint simplified (e: exp): Prop :=
  match e with
  | empty => True
  | variable n => True
  | bigM => True
  | plus e1 e2 => ~ exp_has_value e1 0 /\ ~ exp_has_value e2 0 /\ simplified e1 /\ simplified e2
  | minus e1 e2 => ~ exp_has_value e1 0 /\ ~ exp_has_value e2 0 /\ simplified e1 /\ simplified e2
  | mult e1 e2 => ~ exp_has_value e1 0 /\ ~ exp_has_value e2 0 /\
                  ~ exp_has_value e1 1 /\ ~ exp_has_value e2 1 /\ simplified e1 /\ simplified e2
  | opp e1 => ~ exp_has_value e1 0 /\ (forall q, exp_has_value e q -> q > 0) /\ simplified e1
  | integer z => (z > 0)%Z
  | fraction z p => (z > 0)%Z /\ (p > 1)%positive
  end.

Definition correct_simpl_algorithm (algo: linexp -> exp): Prop :=
  (forall (le: linexp), linexp_exp_equiv le (algo le) /\ simplified (algo le)).

Makes sense to me right now.

view this post on Zulip Paolo Giarrusso (Nov 07 2022 at 02:52):

algo : linexp -> exp is an alternative I didn't think of, and it seems surprising: some of the normalization constraints you (might) want are enforced automatically on linexp, while exp lets you write (x0 - x1) * x2 * x3 — which passes simplified ?

view this post on Zulip Paolo Giarrusso (Nov 07 2022 at 02:55):

For linexp, you could demand entries are sorted by the 3rd component in strictly increasing order (to avoid multiple entries on the same variable), then forbid entries matching (q1, q2, n) where q1 == 0 and q2 == 0

view this post on Zulip Lessness (Nov 07 2022 at 05:27):

Paolo Giarrusso said:

algo : linexp -> exp is an alternative I didn't think of, and it seems surprising: some of the normalization constraints you (might) want are enforced automatically on linexp, while exp lets you write (x0 - x1) * x2 * x3 — which passes simplified ?

linexp_exp_equiv le (algo le) condition in the correct_simpl_algorithm should take care of it, I believe... If it's satisfied, linear expression le corresponds to the same function as the simplified expression algo le (and that disallows examples like (x0-x1)*x2*x3).

view this post on Zulip Paolo Giarrusso (Nov 07 2022 at 12:09):

That seems fragile: algo is allowed to return x0^2 + linear_poly - x0^2, which is semantically linear (as you point out) but not syntactically linear.

view this post on Zulip Lessness (Nov 07 2022 at 16:12):

You're completely right. Back to thinking....

view this post on Zulip Paolo Giarrusso (Nov 07 2022 at 16:20):

Something like https://en.wikipedia.org/wiki/Rewriting#Term_rewriting_systems seems potentially relevant (not sure _that_ is the best reference, but it's the right topic)

view this post on Zulip Paolo Giarrusso (Nov 07 2022 at 16:21):

In particular, here's what I meant earlier about "normal forms": https://en.wikipedia.org/wiki/Normal_form_(abstract_rewriting)


Last updated: Feb 01 2023 at 11:04 UTC