Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure free symbols are declared in SMT2_conv after quantifier rewriting #8361

Merged
merged 2 commits into from
Jul 10, 2024

Conversation

qinheping
Copy link
Collaborator

@qinheping qinheping commented Jun 26, 2024

Resolve #8329 in which CBMC with SMT2 fails on nested loop contracts with quantifiers.

We only kept quantified symbols unique during symex_assert.
Once there are some assertion containing quantified symbols x following an assumption also containing x, symex_assume won't make the two x distinct.

This PR record and keep quantified symbols unique also in symex_assume.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

Copy link

codecov bot commented Jun 26, 2024

Codecov Report

Attention: Patch coverage is 95.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 78.35%. Comparing base (b3f1e71) to head (ab12de4).

Files Patch % Lines
src/solvers/smt2/smt2_conv.cpp 94.44% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #8361   +/-   ##
========================================
  Coverage    78.34%   78.35%           
========================================
  Files         1726     1726           
  Lines       188415   188424    +9     
  Branches     18248    18248           
========================================
+ Hits        147623   147631    +8     
- Misses       40792    40793    +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@qinheping qinheping changed the title Add a test on which SMT2 fail with quantifers in nested loop contracts Keep quantified symbols unique during symex_assume Jun 27, 2024
@rod-chapman
Copy link
Collaborator

I will pull this change, rebuild and retry on all my test cases this morning.

@kroening
Copy link
Member

Please note that the quantified symbols do not need to be unique. As in SMT-LIB, quantifiers introduce a local scope.

I hence suspect that the problem is something else, possibly after instantiating the quantifier expressions incorrectly.

@qinheping
Copy link
Collaborator Author

qinheping commented Jun 28, 2024

Please note that the quantified symbols do not need to be unique. As in SMT-LIB, quantifiers introduce a local scope.

I hence suspect that the problem is something else, possibly after instantiating the quantifier expressions incorrectly.

Sorry I wasn't clear in the description. The universal quantified expression in assertion will later be rewritten to non-quantified expression. If such bounded symbols (and later become unbounded after the rewriting) in assertions are not unique, i.e., have the same name as the bounded symbols in some previous assumptions, the smt2_conv won't correctly declare those symbols as they are already in identifier_map and has a non-nil type: https://github.com/diffblue/cbmc/blob/develop/src/solvers/smt2/smt2_conv.cpp#L4946.

@kroening
Copy link
Member

kroening commented Jul 1, 2024

This should be fixed when doing the rewriting, not before.

@tautschnig
Copy link
Collaborator

I believe the rewriting in goto-symex already takes care of introducing declarations when needed (via rewrite_quantifiers). This does not, however, preclude variables that appear both in a bound and in a free context from having the same name when they reach the SMT2 back-end. There, we fail to account for this scenario in find_symbols.

@tautschnig
Copy link
Collaborator

I'd propose the following patch instead (making the regression test pass as expected):

diff --git a/src/solvers/smt2/smt2_conv.cpp b/src/solvers/smt2/smt2_conv.cpp
index d132114cb2..45f525e3fb 100644
--- a/src/solvers/smt2/smt2_conv.cpp
+++ b/src/solvers/smt2/smt2_conv.cpp
@@ -4908,17 +4908,30 @@ void smt2_convt::find_symbols(const exprt &expr)

   if(expr.id() == ID_exists || expr.id() == ID_forall)
   {
+    std::unordered_map<irep_idt, std::optional<identifiert>> hidden_syms;
+
     // do not declare the quantified symbol, but record
     // as 'bound symbol'
     const auto &q_expr = to_quantifier_expr(expr);
     for(const auto &symbol : q_expr.variables())
     {
       const auto identifier = symbol.get_identifier();
-      identifiert &id = identifier_map[identifier];
-      id.type = symbol.type();
-      id.is_bound = true;
+      auto id_entry = identifier_map.insert({identifier, identifiert{}});
+      hidden_syms.insert(
+        {identifier,
+         id_entry.second ? std::nullopt
+                         : std::optional{id_entry.first->second}});
+      id_entry.first->second.is_bound = true;
     }
     find_symbols(q_expr.where());
+    for(const auto &[id, hidden_val] : hidden_syms)
+    {
+      auto previous_entry = identifier_map.find(id);
+      if(!hidden_val.has_value())
+        identifier_map.erase(previous_entry);
+      else
+        previous_entry->second = std::move(*hidden_val);
+    }
     return;
   }

@@ -4943,7 +4956,7 @@ void smt2_convt::find_symbols(const exprt &expr)

     identifiert &id=identifier_map[identifier];

-    if(id.type.is_nil())
+    if(!id.is_bound && id.type.is_nil())
     {
       id.type=expr.type();

@qinheping
Copy link
Collaborator Author

I'd propose the following patch instead (making the regression test pass as expected):

diff --git a/src/solvers/smt2/smt2_conv.cpp b/src/solvers/smt2/smt2_conv.cpp
index d132114cb2..45f525e3fb 100644
--- a/src/solvers/smt2/smt2_conv.cpp
+++ b/src/solvers/smt2/smt2_conv.cpp
@@ -4908,17 +4908,30 @@ void smt2_convt::find_symbols(const exprt &expr)

   if(expr.id() == ID_exists || expr.id() == ID_forall)
   {
+    std::unordered_map<irep_idt, std::optional<identifiert>> hidden_syms;
+
     // do not declare the quantified symbol, but record
     // as 'bound symbol'
     const auto &q_expr = to_quantifier_expr(expr);
     for(const auto &symbol : q_expr.variables())
     {
       const auto identifier = symbol.get_identifier();
-      identifiert &id = identifier_map[identifier];
-      id.type = symbol.type();
-      id.is_bound = true;
+      auto id_entry = identifier_map.insert({identifier, identifiert{}});
+      hidden_syms.insert(
+        {identifier,
+         id_entry.second ? std::nullopt
+                         : std::optional{id_entry.first->second}});
+      id_entry.first->second.is_bound = true;
     }
     find_symbols(q_expr.where());
+    for(const auto &[id, hidden_val] : hidden_syms)
+    {
+      auto previous_entry = identifier_map.find(id);
+      if(!hidden_val.has_value())
+        identifier_map.erase(previous_entry);
+      else
+        previous_entry->second = std::move(*hidden_val);
+    }
     return;
   }

@@ -4943,7 +4956,7 @@ void smt2_convt::find_symbols(const exprt &expr)

     identifiert &id=identifier_map[identifier];

-    if(id.type.is_nil())
+    if(!id.is_bound && id.type.is_nil())
     {
       id.type=expr.type();

Thank you Michael, I applied your patch and undo the changes in symex.

@qinheping qinheping changed the title Keep quantified symbols unique during symex_assume Keep quantified symbols unique during smt2_conv Jul 2, 2024
@kroening
Copy link
Member

kroening commented Jul 2, 2024

I believe the rewriting in goto-symex already takes care of introducing declarations when needed (via rewrite_quantifiers). This does not, however, preclude variables that appear both in a bound and in a free context from having the same name when they reach the SMT2 back-end. There, we fail to account for this scenario in find_symbols.

It's ok to have a symbol both bound and free, what's the issue?

@tautschnig
Copy link
Collaborator

It's ok to have a symbol both bound and free, what's the issue?

It should be ok, but SMT conversion's identifier_map can't have both at the same time, and so we'd (without the (updated) patch in this PR) fail to declare-fun the free variable when we saw the bound one first.

@qinheping qinheping enabled auto-merge July 3, 2024 16:46
@kroening
Copy link
Member

kroening commented Jul 4, 2024

"Unique quantified symbols" is a non goal!

@tautschnig
Copy link
Collaborator

"Unique quantified symbols" is a non goal!

That's also not what the code change actually does, it rather does the opposite.

@qinheping qinheping changed the title Keep quantified symbols unique during smt2_conv Make sure free symbols are declared in SMT2_conv after quantifier rewriting Jul 5, 2024
@qinheping
Copy link
Collaborator Author

"Unique quantified symbols" is a non goal!

I changed the commit message and the title of this PR to make them align with the actual change.

@qinheping qinheping merged commit 8ccc168 into diffblue:develop Jul 10, 2024
39 of 40 checks passed
@qinheping qinheping deleted the issue/8329 branch July 11, 2024 04:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CBMC 6.0.0-preview fails with mal-formed SMT
7 participants