Clang tidy cleanup and using std algorithms #1373

Collaborator

@SteveBronder SteveBronder commented Sep 30, 2019

Summary

This includes a few automated refactors and some hand-made ones, which I'll review below.

  1. Running the clang-tidy command below (that test just so happens to touch all the files in stan-math)
make clang-tidy-fix files=./test/unit/math/mix/mat/eigen_plugins_test* \
 tidy_checks=modernize-use-bool-literals,performance-for-range-copy,modernize-use-equals-default,readability-braces-around-statements,performance-unnecessary-value-param

Links below to what each of these do:

  1. I selectively ran the clang-tidy check for range based for loops and did a few manual tweaks so that

-. If the value type of the container is not primitive, we use a range-based for loop with a forwarding reference
(auto&& x_i : x)

-. If the value is primitive and never modified, we use (const auto x_i : x)
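
The two loop forms above can be sketched as follows. This is a minimal illustration rather than code from the PR, and the helper names `add_to_each` and `mean_of` are made up for the example. In a range-based for loop over an lvalue container, `auto&&` deduces to an lvalue reference, so each element is bound without a copy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: the element type (std::vector<double>) is not
// primitive, so bind each element with auto&& instead of copying it.
inline void add_to_each(std::vector<std::vector<double>>& x, double delta) {
  for (auto&& row : x) {      // row deduces to std::vector<double>&
    for (auto&& x_i : row) {  // x_i deduces to double&
      x_i += delta;
    }
  }
}

// Hypothetical helper: the element type is primitive and never modified,
// so each element is taken as a const copy.
inline double mean_of(const std::vector<double>& x) {
  assert(!x.empty());
  double total = 0.0;
  for (const auto x_i : x) {
    total += x_i;
  }
  return total / static_cast<double>(x.size());
}
```

For a primitive element type the const copy costs about as much as a reference, and it makes it obvious at the loop header that the element is read-only.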

  1. We use std::inner_product instead of a for loop for vector dot products
  2. The changes in (prim\rev)/arr/log_sum_exp.hpp should be looked over more thoroughly. Previously we did some logic that sort of confused me:
  // Loop over the values to get the max (defaulted to -inf)
  double max = -numeric_limits<double>::infinity();
  for (double xx : x) {
    if (xx > max) {
      max = xx;
    }
  }

  double sum = 0.0;
  // Accumulate those values excluding -inf values
  for (size_t ii = 0; ii < x.size(); ii++) {
    if (x[ii] != -numeric_limits<double>::infinity()) {
      sum += exp(x[ii] - max);
    }
  }
  // If any x is -inf this will return -inf?
  return max + log(sum);

Reading the above, it looks like if any value is -inf or +inf the end result will still be +-inf. If that's the only edge case we were focusing on with the above, I think the change below handles it a little more cleanly:

  double max_val = *std::max_element(x.begin(), x.end());
  double sum = std::accumulate(
      x.begin(), x.end(), 0.0,
      [&max_val](auto& acc, auto&& x_i) { return acc + exp(x_i - max_val); });
  return max_val + log(sum);
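
One way to check whether the two versions agree on the -inf edge cases is to run both on the same inputs. The sketch below assumes plain `std::vector<double>` inputs and IEEE floating point; the names `log_sum_exp_loop` and `log_sum_exp_algo` are made up for the comparison, and the lambda takes its arguments by value rather than as written in the PR. Under those assumptions the two agree whenever at least one element is finite, but for an input made up only of -inf the loop version returns -inf while the algorithm version evaluates exp(-inf - (-inf)), which is exp(NaN), and returns NaN. The algorithm version also needs a non-empty input.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <numeric>
#include <vector>

// Follows the loop version quoted above: -inf entries are skipped in the sum.
inline double log_sum_exp_loop(const std::vector<double>& x) {
  const double neg_inf = -std::numeric_limits<double>::infinity();
  double max = neg_inf;
  for (double xx : x) {
    if (xx > max) {
      max = xx;
    }
  }
  double sum = 0.0;
  for (double xx : x) {
    if (xx != neg_inf) {
      sum += std::exp(xx - max);
    }
  }
  return max + std::log(sum);
}

// Follows the algorithm version quoted above, with by-value lambda arguments.
// std::max_element returns end() for an empty range, so the input must be
// non-empty before the result is dereferenced.
inline double log_sum_exp_algo(const std::vector<double>& x) {
  assert(!x.empty());
  const double max_val = *std::max_element(x.begin(), x.end());
  const double sum = std::accumulate(
      x.begin(), x.end(), 0.0, [max_val](double acc, double x_i) {
        return acc + std::exp(x_i - max_val);
      });
  return max_val + std::log(sum);
}
```

If the all -inf and empty cases matter to callers, they would need explicit handling (or a test) before the loop version is replaced.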
  1. We have a bunch of constructors we are declaring that are just the default, so I set those to explicitly use = default. accumulator declares a destructor that's also the default. Should we just remove those and use the implicitly generated ones?

  2. promote_elements for vectors uses braced initializers to construct the output vector while promote_elements for Eigen uses a Mat.cast<T>

  3. sum now uses std::accumulate

  4. In a few places we now use x.coeffRef(i) to avoid the bounds checking done by operator[] on Eigen matrices

  5. log_sum_exp_test was running a test on an uninitialized vector so I set the vector to a size of 0.

  6. There were a few places we had the C++03 style of > > at the end of a template which got cleaned up.
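
As a rough illustration of the std::inner_product and std::accumulate items above, here is a minimal sketch on `std::vector<double>`. The helper names `dot_of` and `sum_of` are hypothetical and are not the Stan Math signatures.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: dot product of two equally sized vectors.
inline double dot_of(const std::vector<double>& a,
                     const std::vector<double>& b) {
  assert(a.size() == b.size());
  return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}

// Hypothetical helper: sum of all elements.
// The initial value 0.0 makes the accumulator a double.
inline double sum_of(const std::vector<double>& x) {
  return std::accumulate(x.begin(), x.end(), 0.0);
}
```

In both algorithms the type of the initial value fixes the accumulator type, so passing the integer literal 0 instead of 0.0 would accumulate in int and truncate each partial result.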

Tests

This is a refactor, so I don't think new tests are needed. Happy to add some if the existing code was missing tests.

Side Effects

I don't think so!

Checklist

  • Math issue Update internals to use more modern c++ #1308

  • Copyright holder: Steve Bronder

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@@ -34,7 +34,7 @@ class accumulator {
/**
* Destroy an accumulator.
*/
~accumulator() {}
Collaborator Author

Is there a reason for defining the accumulator destructor as empty here? To my knowledge, this still calls the destructors of all of the accumulator's members.

Contributor

This one is OK to leave as default. As is, it's not virtual and breaks the rule of 3 (5).
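
For context on the rule of 3 (5) point: members are destroyed whether or not a destructor is written, but declaring any destructor, even an empty one, suppresses the implicitly declared move constructor and move assignment. A minimal sketch, using made-up stand-in types rather than the actual accumulator class:

```cpp
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for a class like accumulator; not Stan Math code.
struct with_empty_dtor {
  std::vector<double> buf_;
  std::unique_ptr<int> owner_;
  ~with_empty_dtor() {}  // user-provided: members are still destroyed
};

struct with_no_dtor {
  std::vector<double> buf_;
  std::unique_ptr<int> owner_;
  // No destructor declared, so all special members are generated implicitly.
};

// The user-declared destructor suppresses the implicit move constructor, and
// the unique_ptr member makes the implicit copy constructor deleted, so the
// first type cannot be constructed from an rvalue at all.
static_assert(!std::is_move_constructible<with_empty_dtor>::value,
              "declaring a destructor suppresses implicit moves");
static_assert(std::is_move_constructible<with_no_dtor>::value,
              "without a declared destructor, implicit moves are generated");
```

For a class whose members are all copyable the effect is quieter: moves silently fall back to copies instead of failing to compile.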

@SteveBronder
Collaborator Author

@wds15 @rok-cesnovar this PR has a bunch of tbb stuff in it now (e.g. lib/tbb/libtbbmalloc_proxy.so.2). What do we need to add to the .gitignore so this stuff is not pushed?

@rok-cesnovar
Member

lib/tbb/* should be ignored altogether as it's a build folder. We should add that to the integrate PR.

@stan-buildbot
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 0.98)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.99)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 1.02)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 0.99)
(performance.compilation, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 0.98)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.01)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.0)
Result: 1.00110653246
Commit hash: d716461

@SteveBronder SteveBronder changed the title from "[wip] Clang tidy cleanups" to "Clang tidy cleanup and using std algorithms" Oct 8, 2019
@stan-buildbot
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 0.99)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 1.0)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 1.01)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.05)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 0.98)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 0.99)
(performance.compilation, 1.03)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 0.99)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 0.99)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.01)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.01)
Result: Regex did not match anything
Commit hash: d716461

Contributor

@bob-carpenter bob-carpenter left a comment

Cool! I really like seeing this kind of code cleanup.

The only thing I'm curious about is efficiency on log-sum-exp expanded as it is. And one request to capture by value. Everything else is a comment or optional. Of the optional stuff, it'd be particularly great to vectorize the checks so that they can deal with indexing in the error message and we can remove a lot of boilerplate.

double max_val = *std::max_element(x.begin(), x.end());
double sum = std::accumulate(
x.begin(), x.end(), 0.0,
[&max_val](auto& acc, auto&& x_i) { return acc + exp(x_i - max_val); });
Contributor

Will this generate code that's as efficient as before? It will come down to how efficiently it can compile that closure.

How do we test?

Collaborator Author

Just did this on godbolt. The left-hand side is the code: the bottom is the current version (labeled editor 1) and the top is the new one (editor 2). The middle is the output from the new code and the far right is the output from the current code. You can highlight certain instructions and it usually pops up a little 'here's what this does'. You can click 'add' in the top right to get a diff view of the two outputs, though it usually looks wonky at -O3. You can click and drag any of the tabs for each little block to move stuff. If you right-click the highlighted code on the left-hand side, it should have an option to take you to where that line is happening in whichever of the bottom two outputs, though it's not always exact.

I like to look at -O0 to see where stuff is and then look at -O3. Around lines 40-60 is where the loop and exp calculation happen. The code is super similar; the lambda version removes a compare and a few moves, but those are mostly because we don't do the if statement in there anymore. Tomorrow I can look at just removing that check from the old version.

https://godbolt.org/z/Xe8ev_

godbolt is pretty neat! I learned last night you can also get a real graph of the call graph!

https://godbolt.org/z/cCqIAH

There's a way to make a PR on their repo so we can get Stan up there, would like to find time for that in the next week or so

Collaborator Author

Another cool internet benchmark thing!

http://quick-bench.com/3Wdd56xscm20sShrc0xZx2qgdsE

double max_val = *std::max_element(x.begin(), x.end());
double sum = std::accumulate(
x.begin(), x.end(), 0.0,
[&max_val](auto& acc, auto&& x_i) { return acc + exp(x_i - max_val); });
Contributor

I think the rules for capture are like argument passing, so that primitives like max_val should be captured by value, not by reference.
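
The difference between the two capture modes can be seen in a small sketch. The function name `capture_example` is made up for illustration: a by-value capture copies the variable when the lambda is created, while a by-reference capture reads the variable at call time.

```cpp
#include <cassert>
#include <utility>

// Returns {result with by-value capture, result with by-reference capture},
// both evaluated after the captured variable has been changed.
inline std::pair<double, double> capture_example() {
  double max_val = 1.0;
  auto by_value = [max_val](double x) { return x - max_val; };
  auto by_ref = [&max_val](double x) { return x - max_val; };
  // While max_val is unchanged, both closures give the same answer.
  assert(by_value(3.0) == by_ref(3.0));
  max_val = 10.0;
  // by_value still uses its copy (1.0); by_ref sees the new value (10.0).
  return {by_value(3.0), by_ref(3.0)};
}
```

For a closure that is passed straight to std::accumulate and never outlives the variable, both modes compute the same result, so the choice is mostly about copying a double versus holding a reference to it.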

}

return max + log(sum);
double max_val = *std::max_element(x.begin(), x.end());
Contributor

This is very neat!

@@ -275,11 +275,11 @@ gp_exp_quad_cov(const std::vector<T_x1> &x1, const std::vector<T_x2> &x2,
return cov;
}

for (size_t i = 0; i < x1_size; ++i) {
check_not_nan(function_name, "x1", x1[i]);
for (auto &&x1_i : x1) {
Contributor

As is, I think these can be const.

These should be using a vectorized check_not_nan so that the index can also be printed and we don't have all this boilerplate looping.

Another alternative would be a for-each loop, which doesn't actually simplify things here, especially with explicit capture of the function name.

std::for_each(x1.begin(), x1.end(),
              [&function_name](double x) { return check_not_nan(function_name, "x", x); });
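
To make the "vectorized check that prints the index" idea concrete, here is a minimal sketch. The function `check_not_nan_indexed`, its signature, the zero-based index and the message format are all assumptions made for illustration; this is not the Stan Math check_not_nan.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not the Stan Math API: checks every element and puts
// the zero-based index of the first NaN into the error message.
inline void check_not_nan_indexed(const char* function, const char* name,
                                  const std::vector<double>& x) {
  assert(function != nullptr && name != nullptr);
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (std::isnan(x[i])) {
      throw std::domain_error(std::string(function) + ": " + name + "["
                              + std::to_string(i) + "] is nan");
    }
  }
}
```

With a helper of this shape, a call site would reduce to a single line such as check_not_nan_indexed(function_name, "x1", x1), with no loop at the caller.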

Collaborator Author

These should be using a vectorized check_not_nan so that the index can also be printed and we don't have all this boilerplate looping.

Agreed, this should use a vectorized check_not_nan, but the vectorized version of check_not_nan does not work for vectors of Eigen matrices atm :-(

After Andrew and I sort out the more generic templating discussion in #1425 then I'm going to come back to these check functions and clean them up so we can do that.

}
}
return max + log(sum);
double max_val = std::max_element(x.begin(), x.end())->val();
Contributor

[optional]
This is soooo close to the double version, the only difference being the ->val() pulling out the double based value. Could the (recursive?) value_of for max_val computation allow these to be combined into a single implementation? Maybe not worth it given again how complicated the indirection would be.

Collaborator Author

@SteveBronder SteveBronder Nov 3, 2019

Ack, it's so close! I think for a v v clean version of this we need a vectorized value_of. Then in the constructor for log_sum_exp_vector_vari we could just call op_vector_vari(log_sum_exp(value_of(x)), x).

I put a comment above log_sum_exp_as_double about this and can do those value_of's in a separate PR
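
A sketch of how a vectorized value_of could let the double and autodiff versions share one implementation. Everything here is a stand-in: `toy_var` is a made-up type with only a val() method, the value_of overloads are hypothetical rather than the Stan Math ones, and `log_sum_exp_value` is an illustrative name, not a function from this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical stand-in for an autodiff variable; only val() matters here.
struct toy_var {
  double val_;
  double val() const { return val_; }
};

// Hypothetical scalar value_of: identity for double, val() for toy_var.
inline double value_of(double x) { return x; }
inline double value_of(const toy_var& x) { return x.val(); }

// Hypothetical vectorized value_of: extracts the double values elementwise.
template <typename T>
inline std::vector<double> value_of(const std::vector<T>& x) {
  std::vector<double> vals(x.size());
  std::transform(x.begin(), x.end(), vals.begin(),
                 [](const T& x_i) { return value_of(x_i); });
  return vals;
}

// One double-valued implementation shared by both element types.
template <typename T>
inline double log_sum_exp_value(const std::vector<T>& x) {
  assert(!x.empty());
  const std::vector<double> vals = value_of(x);
  const double max_val = *std::max_element(vals.begin(), vals.end());
  const double sum = std::accumulate(
      vals.begin(), vals.end(), 0.0, [max_val](double acc, double v) {
        return acc + std::exp(v - max_val);
      });
  return max_val + std::log(sum);
}
```

The cost of this approach is one temporary vector of doubles per call, which is the kind of trade-off that would need benchmarking before replacing the separate implementations.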

@andrjohns
Collaborator

The arr definitions for log_sum_exp bring up a point that I've been thinking about for a while. If the eventual roadmap is to collapse the scal/mat/arr directories, would it be cleaner to just Eigen::Map the std::vector inputs and call the respective mat functions rather than writing a separate (Eigen-free) definition?

@SteveBronder
Collaborator Author

The arr definitions for log_sum_exp bring up a point that I've been thinking about for a while. If the eventual roadmap is to collapse the scal/mat/arr directories, would it be cleaner to just Eigen::Map the std::vector inputs and call the respective mat functions rather than writing a separate (Eigen-free) definition?

I like how the std algorithms look, but you make a good point. I wonder if we could even get away with a single, more general implementation.

@andrjohns
Collaborator

You've definitely done some neat work with the std code, so I'm not in a hurry to wipe that away! I don't think you should do anything to this pull - I'll have a look into this and create an issue with some ideas and performance testing

@wds15
Contributor

wds15 commented Oct 14, 2019

I would be cautious with going all Eigen... weren't these slower than the non-Eigen implementations due to memory layout stuff?

But harmonizing things is a good thought, of course.

@andrjohns
Collaborator

It wouldn't be a blind change, I'm planning on comparing performance with the perf-math repo to make sure things scale well - just to make sure there aren't any surprises

@syclik
Member

syclik commented Oct 31, 2019

@SteveBronder, there's a merge conflict. It should be a quick fix (I looked at it briefly and didn't know which direction to go on first glance).

@SteveBronder
Collaborator Author

Yes, apologies, I'm getting over a cold this week and back to the jobby job. I'll update this tonight.

I think I'm going to remove the changes to log_sum_exp since there's a lot of stuff going on there and probably needs a bigger discussion on refactoring (if it even needs to be)

@SteveBronder
Collaborator Author

@bob-carpenter I'm at work right now, but I have two PRs which don't touch subtract, and it looks like MathMixMatFun.subtract is failing for both of them. I can look when I get home at whether I goofed with something that touches subtract, but I don't think so.

@stan-buildbot
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 1.01)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 1.01)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.01)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.02)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 0.98)
(performance.compilation, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 0.99)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 0.92)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 0.99)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.99)
Result: 0.99709237878
Commit hash: c9fef6c

@syclik
Member

syclik commented Nov 28, 2019

@SteveBronder: there are code conflicts. Can you update your branch and reopen?

@syclik syclik closed this Nov 28, 2019
@serban-nicusor-toptal serban-nicusor-toptal modified the milestones: 3.0.0++, 3.1.0 Jan 24, 2020