trishullab / PutnamBench

An evaluation benchmark for undergraduate competition math in Lean4, Isabelle, Coq, and natural language.

Putnam 2009 B1 issues

LasseBlaauwbroek opened this issue

The Coq formalization of 2009 B1 (which is highlighted in the paper) has an error:

Theorem putnam_2009_b1
  (Factl := fix factl (l : list nat) : list nat :=
    match l with
    | nil => nil
    | h :: t => fact h :: t
    end)
  : forall (q: Q), q > 0 -> exists (n d: list nat), (forall x, (In x n \/ In x d)-> prime (Z.of_nat x)) /\
    inject_Z (Z.of_nat (fold_left Nat.mul (Factl n) 1%nat)) / inject_Z (Z.of_nat (fold_left Nat.mul (Factl d) 1%nat)) = q.

The helper function Factl never recurses: it applies fact to the head of the list and leaves the tail unchanged. More generally, this formalization looks rather non-idiomatic to me (which also makes me worry about the rest of them). Factl could be written using map, In should probably be replaced by Forall, and you might consider using some coercions. If you want to keep to the standard library, I'd go for something closer to this:

Require Import List QArith Znumtheory Reals Arith.
Open Scope Q.

Local Coercion inject_Z : Z >-> Q.
Local Coercion Z_of_nat : nat >-> Z.

Theorem putnam_2009_b1 :
  let fact_prod ls := fold_left Nat.mul (map fact ls) 1%nat in
  forall (q: Q), q > 0 -> exists (n d: list nat),
  @Forall nat prime (n ++ d) /\ fact_prod n / fact_prod d = q.
Proof.
Admitted.
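
To make the Factl problem concrete, here is a small stdlib-only sketch. The name factl_orig and the three example lemmas are hypothetical and not part of the benchmark; factl_orig simply restates the helper from the original theorem as a standalone definition so it can be compared against map fact on one input.

```coq
From Coq Require Import List Arith Factorial.
Import ListNotations.

(* Hypothetical standalone copy of the benchmark's helper:
   the cons branch never calls itself on the tail. *)
Definition factl_orig (l : list nat) : list nat :=
  match l with
  | nil => nil
  | h :: t => fact h :: t
  end.

(* Only the head is replaced by its factorial. *)
Example factl_orig_head_only : factl_orig [3; 3] = [6; 3].
Proof. reflexivity. Qed.

(* map applies fact to every element. *)
Example map_fact_all : map fact [3; 3] = [6; 6].
Proof. reflexivity. Qed.

(* The resulting products differ: 6 * 3 versus 6 * 6. *)
Example products_differ :
  fold_left Nat.mul (factl_orig [3; 3]) 1 = 18 /\
  fold_left Nat.mul (map fact [3; 3]) 1 = 36.
Proof. split; reflexivity. Qed.
```

With the original helper, the theorem therefore talks about p1! * p2 * ... * pk rather than a product of factorials.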

However, generally speaking, for these kinds of problems you are almost certainly better off using Mathcomp instead of the standard library. It has a much richer development around primes, such as a prime decomposition function that will very likely be needed to prove this theorem. The following is a formalization in Mathcomp. Hopefully it is reasonably idiomatic, but I'm not a Mathcomp expert. I strongly recommend that you avoid the stdlib for these kinds of problems.

From mathcomp Require Import ssrbool seq ssrnat prime rat ssralg ssrnum ssrint.
Local Open Scope ring_scope.

Theorem putnam_2009_b1 :
  let fact_prod ls := (\prod_(i <- ls) i`!)%:Q in
  forall q : rat, q > 0 -> exists n d,
  all prime (n ++ d) /\ fact_prod n / fact_prod d = q.
Proof.
Admitted.

Finally, the Lean version seems to be implemented using functions on finite numbers instead of lists. Why this difference?

theorem putnam_2009_b1
  (isquotprodprimefact : ℚ → Prop :=
    fun q => (∃ (k m : ℕ) (a : Fin k → ℕ) (b : Fin m → ℕ),
      (∀ i : Fin k, Nat.Prime (a i)) ∧ (∀ j : Fin m, Nat.Prime (b j))
      ∧ (q = (∏ i : Fin k, Nat.factorial (a i))/(∏ j : Fin m, Nat.factorial (b j)))
    ))
  : ∀ q : ℚ, q > 0 → isquotprodprimefact q :=

Isabelle uses yet another approach: functions from nat to nat, with the index bounds k and m encoded separately:

theorem putnam_2009_b1:
  fixes isquotprodprimefact :: "rat \<Rightarrow> bool"
  defines "isquotprodprimefact \<equiv> (\<lambda>q::rat. (\<exists>(k::nat)(m::nat)(a::nat\<Rightarrow>nat)(b::nat\<Rightarrow>nat).
    (\<forall>i::nat\<in>{0..(k-1)}. prime (a i)) \<and> (\<forall>j::nat\<in>{0..(m-1)}. prime (b j))
    \<and> q = (\<Prod>i::nat=0..(k-1). fact (a i)) / (\<Prod>j::nat=0..(m-1). fact (b j))))"
  shows "\<forall>q::rat. (q > 0 \<longrightarrow> isquotprodprimefact q)"

Thank you for pointing this out! I also agree that Factl can be written more succinctly using map. These kinds of definitions appear here and there in the benchmark because we didn't have a suitable analogue to sum_n for products in Coquelicot.Hierarchy, but we'll be modifying most formalizations to depend solely on mathcomp.

In Lean, we found it more natural to use the finite type Fin n to formalize the problem, which also follows the conventions we've seen in other benchmarks. We could have used lists, which for example have an access function get with type signature (as : List α) → Fin as.length → α. I would regard lists and functions on Fin n as mostly isomorphic, in the sense that I don't think producing a proof for one is much harder than producing a proof for the other (though this may be a bigger deal for the Coq formalizations). My feeling is that being robust to these book-keeping differences (or, more generally, to a statement being some $\epsilon$ away from perfectly idiomatic) is less difficult than producing the proper high-level argument to solve the problem.

In Isabelle, we found it less natural to introduce a finite type to formalize the problem, so we usually "boost" the formalization from Fin n in Lean to nat in Isabelle and introduce some extra qualifiers to constrain the relevant variables.
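
As a rough illustration of why these encodings carry the same data, the following Coq sketch (stdlib only) converts between a list and a bounded nat-indexed family, which is the style used in the Isabelle statement. The helpers list_of_fn and fn_of_list are hypothetical names introduced here for illustration, and the default value 0 for out-of-range indices is an arbitrary choice.

```coq
From Coq Require Import List Arith.
Import ListNotations.

(* Hypothetical helper: turn a bounded family (k, a) into
   the list [a 0; ...; a (k-1)]. For k = 0 this is the empty list. *)
Definition list_of_fn (k : nat) (a : nat -> nat) : list nat :=
  map a (seq 0 k).

(* Hypothetical helper: read a list back as a nat-indexed family,
   returning the arbitrary default 0 outside the list's range. *)
Definition fn_of_list (l : list nat) : nat -> nat :=
  fun i => nth i l 0.

(* Round trip on one concrete list: no information is lost. *)
Example roundtrip :
  list_of_fn (length [2; 5; 3]) (fn_of_list [2; 5; 3]) = [2; 5; 3].
Proof. reflexivity. Qed.
```

The sketch checks only one concrete list; a general round-trip lemma would need a short induction and is not attempted here.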

Generally we are trying to stay fairly consistent amongst the formalizations in each language. Perhaps we could do something like:

From mathcomp Require Import ssrbool seq ssrnat prime rat ssralg ssrnum ssrint fintype.
Local Open Scope ring_scope.

Theorem putnam_2009_b1' :
  let fact_prod (n : nat) (z : 'I_n -> nat) : rat := \prod_(i < n) ((z i)`!)%:Q in
  forall q : rat, q > 0 -> exists (n1 d1 : nat) (n : 'I_n1 -> nat) (d : 'I_d1 -> nat),
  (forall i, prime (n i)) /\ (forall i, prime (d i)) /\ fact_prod n1 n / fact_prod d1 d = q.
Proof.
Admitted.

But this is a bit longer and harder to read. Maybe it can be shortened with some notational tricks, but which do you think is more idiomatic?

I am not an expert on what is idiomatic in Lean. Perhaps @eric-wieser can comment here. However, generally as a functional programmer and theorem prover, the Lean statement seems rather cluttered to me. Would it not be easier if it were rewritten like this?

theorem putnam_2009_b1 :
  let fact_prod ls := (ls.map Nat.factorial).prod
  ∀ q : ℚ, q > 0 → ∃ (n d : List ℕ),
  (n ++ d).all Nat.Prime ∧ q = fact_prod n / fact_prod d := by sorry

This avoids having to quantify over the size of the finite maps, which to me is a significant simplification. Also, it just reads shorter and clearer to me. I don't know if it is easier or harder to prove... (But like you say, the extra difficulty wouldn't matter much in the grand scheme of things, so you should prefer clarity of statement over ease of proving.)

More generally speaking: The trade-off between consistency amongst proof assistants and an idiomatic statement is tricky. I'd say to prefer an idiomatic statement over consistency. I certainly wouldn't prefer the Coq formalization you give over mine...

I tend to agree with you. In general, for any new Coq formalizations we'll aim for conciseness over consistency amongst languages. If Lean experts have suggestions for making the Lean formalizations more idiomatic, we'll happily incorporate them.