
Update expression tests code generation #2419

Merged: 66 commits into develop from feature/varmat-signatures on Mar 19, 2021

Conversation

@bbbales2 (Member) commented Mar 10, 2021

Summary

I took the varmat test stuff out of this pull. It is now just a cleanup of the code generation for the expression tests (which will feed into the varmat work later), and that keeps the pull smaller.

Tests

None yet

Release notes

  • Updated code generation for expression tests

Checklist

  • Math issue How to add static matrix? #1805

  • Copyright holder: Columbia University

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

bbbales2 and others added 27 commits January 8, 2021 18:48
…t wouldn't have a varmat signature even if they were compatible) (Issue #2101)
@bbbales2 (Member, Author) commented Mar 10, 2021

@t4c1 can you give test/generateExpressionTests.py a look and see what you think? I tried to move a lot of the hairy ifs and stuff into a FunctionGenerator class here.
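
For orientation, the rough shape of such a generator class might look like the minimal sketch below; the names and methods are purely illustrative, not the actual API in this pull.

import textwrap

class FunctionGenerator:
    """Illustrative sketch only, not the class in this PR: accumulates the
    C++ statements for one generated test and renders them as a TEST body."""

    def __init__(self, test_name):
        self.test_name = test_name
        self.statements = []

    def add_statement(self, code):
        # Each call appends one generated C++ statement, indented for the body.
        self.statements.append("  " + code)

    def cpp(self):
        body = "\n".join(self.statements)
        return "TEST(ExpressionTest, %s) {\n%s\n}\n" % (self.test_name, body)

# Example usage:
gen = FunctionGenerator("is_cholesky_factor0")
gen.add_statement("auto arg_mat0 = stan::test::make_arg<Eigen::MatrixXd>();")
gen.add_statement("auto res_mat = stan::math::is_cholesky_factor(arg_mat0);")
print(gen.cpp())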

@rok-cesnovar the way we would run this is:

./test/varmat_compatibility_test.py results.json

And then this to print the function names that are compatible:

./test/varmat_compatibility_summary.py results.json --names --fully

It takes a long time so this is something we'd put in a GHA or something (definitely not run every merge -- it has to compile one file for every signature).
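
As a rough picture of the summary step (hypothetical only; the actual schema of results.json and the script's flags are not shown in this thread), it amounts to reading the per-signature compile results and printing the names of the functions that compiled cleanly:

import json
import sys

# Hypothetical sketch, not the real varmat_compatibility_summary.py: assumes
# results.json maps each signature string to a boolean saying whether its
# varmat instantiation compiled.
with open(sys.argv[1]) as f:
    results = json.load(f)

# Call a function "fully" compatible if every one of its signatures compiled.
by_function = {}
for signature, compiled in results.items():
    name = signature.split("(")[0]
    by_function.setdefault(name, []).append(compiled)

for name in sorted(by_function):
    if all(by_function[name]):
        print(name)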

Anyway I still gotta convert this all back to python2 and maybe shuffle the files a bit and add docs, so WIP (Edit: not really ready for full, careful review).

@bbbales2 (Member, Author) commented:

> jenkins user has access to install pip packages

Ooo, okay, that is convenient, thanks.

@stan-buildbot (Contributor) commented:


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.42 3.44 1.0 -0.48% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.98 -2.12% slower
eight_schools/eight_schools.stan 0.11 0.11 0.98 -1.93% slower
gp_regr/gp_regr.stan 0.17 0.16 1.06 5.35% faster
irt_2pl/irt_2pl.stan 5.24 5.2 1.01 0.82% faster
performance.compilation 91.19 88.67 1.03 2.77% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.95 8.9 1.01 0.53% faster
pkpd/one_comp_mm_elim_abs.stan 29.21 29.99 0.97 -2.69% slower
sir/sir.stan 131.79 138.23 0.95 -4.88% slower
gp_regr/gen_gp_data.stan 0.04 0.04 1.04 3.96% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.27 3.11 1.05 4.76% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.39 0.39 1.01 0.65% faster
arK/arK.stan 1.9 1.91 1.0 -0.43% slower
arma/arma.stan 0.64 0.64 1.01 0.51% faster
garch/garch.stan 0.51 0.51 0.99 -0.96% slower
Mean result: 1.0047084071

Jenkins Console Log
Blue Ocean
Commit hash: 0e906cf


Machine information:
ProductName: Mac OS X
ProductVersion: 10.11.6
BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@bbbales2 (Member, Author) commented:

@t4c1 This is ready for review again

@t4c1 (Contributor) left a review:


This new code generation stuff could also be used for generating benchmarks, right?

Also, the new code needs docs for function parameters.

Resolved review threads: test/statement_types.py, test/generate_expression_tests.py
@rok-cesnovar (Member) commented:

Any idea why the new expression tests take 2x the time of the previous expression tests? I am not saying it's a deal breaker as it's 10 minutes, just wondering what changed?

@bbbales2 (Member, Author) commented:

> Any idea why the new expression tests take 2x the time of the previous expression tests?

I think the code generation is less efficient (twice as many arguments get initialized). I'm not sure.

I did a test and the new code gen generates 52008 tests. The old code generates 52021 tests.
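
(One hypothetical way to get such a count; the directory and file pattern below are guesses for illustration, not the layout in this PR:)

import glob
import re

# Count TEST(...) blocks across the generated expression-test files.
total = 0
for path in glob.glob("test/expressions/*_test.cpp"):
    with open(path) as f:
        total += len(re.findall(r"^TEST\(", f.read(), re.MULTILINE))
print(total)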

@bbbales2 (Member, Author) commented:

@t4c1 I updated the comments.

Re: benchmarking, I want to merge this on its own, and then the next pull is a script to automatically detect which functions can compile with varmat variables or not (it is just 2 scripts).

I haven't looked at the benchmarking script yet. It probably wouldn't be too much trouble to clean it up (maybe a day to convert?) while we're on the topic of code generation.

@t4c1 (Contributor) commented Mar 18, 2021

> I did a test and the new code gen generates 52008 tests. The old code generates 52021 tests.

Any idea why there is a difference at all? I think we should understand what is happening here.

@bbbales2 (Member, Author) commented Mar 18, 2021

> Any idea why there is a difference at all?

Yeah I was curious too.

The functions affected are is_cholesky_factor, is_matching_dims, multinomial_log, and multinomial_lpmf.

Edit: Also multinomial_rng

It looks like the old code is doing codegen for the same signatures twice somehow.

For example, here are two tests it generates:

TEST(ExpressionTestPrim, is_cholesky_factor0) {
  Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> arg_mat0 = stan::test::make_arg<Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic>>();
  auto res_mat = stan::math::is_cholesky_factor(arg_mat0);
  Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> arg_expr0 = stan::test::make_arg<Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic>>();
  int counter0 = 0;
  stan::test::counterOp<double> counter_op0(&counter0);
  auto res_expr = stan::math::is_cholesky_factor(arg_expr0.block(0,0,1,1).unaryExpr(counter_op0));
  EXPECT_STAN_EQ(res_expr, res_mat);
  EXPECT_LE(counter0, 1);
}

TEST(ExpressionTestPrim, is_cholesky_factor1) {
  Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> arg_mat0 = stan::test::make_arg<Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic>>();
  auto res_mat = stan::math::is_cholesky_factor(arg_mat0);
  Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> arg_expr0 = stan::test::make_arg<Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic>>();
  int counter0 = 0;
  stan::test::counterOp<double> counter_op0(&counter0);
  auto res_expr = stan::math::is_cholesky_factor(arg_expr0.block(0,0,1,1).unaryExpr(counter_op0));
  EXPECT_STAN_EQ(res_expr, res_mat);
  EXPECT_LE(counter0, 1);
}

I assume this means that one of the sets of special tests is causing the functions to get added twice.

@bbbales2 (Member, Author) commented Mar 18, 2021

Looks like get_signatures isn't totally unique:

>>> signatures = get_signatures()
>>> len(signatures)
26982
>>> len(set(signatures))
26976

New code pushes everything through sets so it doesn't show this.

Edit: The duplicated signatures are:

multinomial_log(array[] int, vector) => real
gp_dot_prod_cov(array[] real, array[] real, real) => matrix
is_cholesky_factor(matrix) => int
is_matching_dims(matrix, matrix) => int
multinomial_rng(vector, int) => array[] int
multinomial_lpmf(array[] int, vector) => real

Edit: One of these things must be duplicated more than once for the difference in length to be 6
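
A minimal way to surface duplicates like these; the signatures list below is a stand-in for what get_signatures() returns in the snippet above:

from collections import Counter

# Stand-in data; in practice this would be signatures = get_signatures().
signatures = [
    "is_cholesky_factor(matrix) => int",
    "is_cholesky_factor(matrix) => int",
    "is_matching_dims(matrix, matrix) => int",
]

# Report every signature that appears more than once in the list.
for signature, count in Counter(signatures).items():
    if count > 1:
        print(count, signature)

# Deduplicate while keeping the original order (a plain set loses ordering).
unique_signatures = list(dict.fromkeys(signatures))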

@bbbales2 (Member, Author) commented:

Here's the new code vs. the old code. The new code is more lines. It isn't obvious to me why it takes so much longer, but I guess all the little decltypes and autos add up?

New code:

TEST(ExpressionTestRev, multinomial_lpmf3) {
/*
multinomial_lpmf(array[] int, vector) => real
 */
  int int0 = 1;
  std::vector<decltype(int0)> array0 = {int0};
  auto simplex1 = stan::test::make_simplex<Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1>>(0.4, 1);
  int simplex1_expr2_counter = 0;
  stan::test::counterOp<stan::math::var> simplex1_expr2_counter_op(&simplex1_expr2_counter);
  auto simplex1_expr2 = simplex1.segment(0,1).unaryExpr(simplex1_expr2_counter_op);

  int int3 = 1;
  std::vector<decltype(int3)> array3 = {int3};
  auto simplex4 = stan::test::make_simplex<Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1>>(0.4, 1);

  auto result5 = stan::math::eval(stan::math::multinomial_lpmf(array0,simplex1_expr2));
  auto result6 = stan::math::eval(stan::math::multinomial_lpmf(array3,simplex4));

  EXPECT_STAN_EQ(result5,result6);
  EXPECT_LEQ_ONE(simplex1_expr2_counter);

  auto summed_result7 = stan::math::eval(stan::test::recursive_sum(result5));
  auto summed_result8 = stan::math::eval(stan::test::recursive_sum(result6));
  auto sum_of_sums9 = stan::math::eval(stan::math::add(summed_result7,summed_result8));

  stan::test::grad(sum_of_sums9);
  stan::test::expect_adj_eq(array3,array0);
  stan::test::expect_adj_eq(simplex4,simplex1_expr2);
  stan::math::recover_memory();
}

Old code:

TEST(ExpressionTestRev, multinomial_log1) {
  std::vector<int> arg_mat0 = stan::test::make_arg<std::vector<int>>();
  Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1> arg_mat1 = stan::test::make_simplex<Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1>>();

  auto res_mat = stan::math::multinomial_log(arg_mat0, arg_mat1);

  std::vector<int> arg_expr0 = stan::test::make_arg<std::vector<int>>();
  Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1> arg_expr1 = stan::test::make_simplex<Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1>>();
  int counter1 = 0;
  stan::test::counterOp<stan::math::var> counter_op1(&counter1);

  auto res_expr = stan::math::multinomial_log(arg_expr0, arg_expr1.segment(0,1).unaryExpr(counter_op1));

  EXPECT_STAN_EQ(res_expr, res_mat);

  EXPECT_LE(counter1, 1);
  (stan::test::recursive_sum(res_mat) + stan::test::recursive_sum(res_expr)).grad();
  EXPECT_STAN_ADJ_EQ(arg_expr0,arg_mat0);
  EXPECT_STAN_ADJ_EQ(arg_expr1,arg_mat1);
}

@stan-buildbot (Contributor) commented:


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.47 3.45 1.01 0.66% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.94 -6.79% slower
eight_schools/eight_schools.stan 0.11 0.11 1.05 4.73% faster
gp_regr/gp_regr.stan 0.16 0.16 1.03 2.84% faster
irt_2pl/irt_2pl.stan 5.26 5.3 0.99 -0.82% slower
performance.compilation 90.4 88.61 1.02 1.98% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 9.01 8.93 1.01 0.86% faster
pkpd/one_comp_mm_elim_abs.stan 29.12 29.47 0.99 -1.22% slower
sir/sir.stan 130.57 126.42 1.03 3.18% faster
gp_regr/gen_gp_data.stan 0.04 0.04 0.99 -1.28% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.1 3.09 1.0 0.4% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.41 0.38 1.08 7.06% faster
arK/arK.stan 1.91 1.89 1.01 1.31% faster
arma/arma.stan 0.63 0.64 0.99 -1.25% slower
garch/garch.stan 0.51 0.52 0.99 -0.63% slower
Mean result: 1.00837698468

Jenkins Console Log
Blue Ocean
Commit hash: 277fbc6


Machine information:
ProductName: Mac OS X
ProductVersion: 10.11.6
BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@bbbales2 (Member, Author) commented:

@t4c1 ready for review again

@t4c1 (Contributor) commented Mar 19, 2021

I looked at the comparison of old and new tests. All those evals on scalars (summed_result7, summed_result8, sum_of_sums9) are redundant. Other than that, I don't see anything that could double the running time (other than the fact that you are comparing tests for different functions). Also, I don't see why you needed to invent a new macro, EXPECT_LEQ_ONE.

Resolved review thread: test/generate_expression_tests.py
@bbbales2 (Member, Author) commented:

@t4c1 Ready again

@bbbales2 (Member, Author) commented Mar 19, 2021

> All those evals on scalars (summed_result7, summed_result8, sum_of_sums9) are redundant.

When I call a function I evaluate the results to make sure none of the returns are expressions (which could do something unexpected). Since the Python here isn't generating performance-sensitive code, I preferred to hide expressions with evals (edited). This might change if we did the benchmarks, but it is easy enough to change.

> Also, I don't see why you needed to invent a new macro, EXPECT_LEQ_ONE.

I just did the first thing that came to mind. I could have done it as a function. I could also have initialized an IntVariable and called a two-argument macro, but this seemed to fit well enough with how we do other bits of code.

@t4c1 (Contributor) commented Mar 19, 2021

> When I call a function I evaluate the results to make sure none of the returns are expressions (which could do something unexpected). Since the Python here isn't generating performance-sensitive code, I preferred to hide expressions with evals (edited). This might change if we did the benchmarks, but it is easy enough to change.

We don't have scalar expressions. So these evals never do anything.

> I just did the first thing that came to mind. I could have done it as a function. I could also have initialized an IntVariable and called a two-argument macro, but this seemed to fit well enough with how we do other bits of code.

I don't like it. Calling a two-argument macro makes the most sense to me. I don't think we define macros anywhere else just to hide one argument.

@bbbales2 (Member, Author) commented:

> Calling a two-argument macro makes the most sense to me

Switched

> We don't have scalar expressions. So these evals never do anything.

Right, but I wanted CodeGenerator.function_call_assign to be simple to think about (not returning expressions). If the difference becomes important, we can change this easy-peasy.
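
A minimal sketch of that invariant (illustrative only, not the actual CodeGenerator code): the helper always wraps the generated call in stan::math::eval, so the assigned result is never an unevaluated expression, whatever the callee returns.

def function_call_assign(code_lines, result_name, function_name, arg_names):
    # Illustrative sketch: emit "auto <result> = stan::math::eval(<call>);"
    # so code built on top of the generator never has to reason about
    # whether the returned value is still an Eigen expression.
    call = "%s(%s)" % (function_name, ", ".join(arg_names))
    code_lines.append("  auto %s = stan::math::eval(%s);" % (result_name, call))
    return result_name

lines = []
function_call_assign(lines, "result5", "stan::math::multinomial_lpmf",
                     ["array0", "simplex1_expr2"])
print("\n".join(lines))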

@t4c1 (Contributor) commented Mar 19, 2021

Oh, I see. I just thought that might have been the reason for the slowdown of the expression tests.

@stan-buildbot (Contributor) commented:


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.46 3.46 1.0 -0.24% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.98 -1.65% slower
eight_schools/eight_schools.stan 0.11 0.11 1.01 1.0% faster
gp_regr/gp_regr.stan 0.16 0.16 1.02 2.06% faster
irt_2pl/irt_2pl.stan 5.24 5.27 0.99 -0.68% slower
performance.compilation 89.72 88.68 1.01 1.16% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.87 9.03 0.98 -1.81% slower
pkpd/one_comp_mm_elim_abs.stan 29.87 31.18 0.96 -4.38% slower
sir/sir.stan 131.64 127.86 1.03 2.87% faster
gp_regr/gen_gp_data.stan 0.04 0.03 1.05 4.93% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.11 3.13 0.99 -0.8% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.37 1.01 1.43% faster
arK/arK.stan 1.9 1.9 1.0 0.14% faster
arma/arma.stan 0.63 0.63 1.0 0.06% faster
garch/garch.stan 0.51 0.51 1.0 -0.28% slower
Mean result: 1.0030080899

Jenkins Console Log
Blue Ocean
Commit hash: 7318a0b


Machine information:
ProductName: Mac OS X
ProductVersion: 10.11.6
BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@rok-cesnovar merged commit 51669a6 into develop on Mar 19, 2021.
@rok-cesnovar deleted the feature/varmat-signatures branch on March 19, 2021 at 20:00.