Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use value_array_or_scalar in distributions #2352

Merged
merged 16 commits into from
Mar 7, 2021

Conversation

SteveBronder
Copy link
Collaborator

Summary

This just tidies up the distributions a little bit by using as_value_array_or_scalar() in places we would do things like

  const auto& y_col = as_column_vector_or_scalar(y_ref);
  const auto& alpha_col = as_column_vector_or_scalar(alpha_ref);
  const auto& sigma_col = as_column_vector_or_scalar(sigma_ref);

  const auto& y_arr = as_array_or_scalar(y_col);
  const auto& alpha_arr = as_array_or_scalar(alpha_col);
  const auto& sigma_arr = as_array_or_scalar(sigma_col);

  ref_type_t<decltype(value_of(y_arr))> y_val = value_of(y_arr);
  ref_type_t<decltype(value_of(alpha_arr))> alpha_val = value_of(alpha_arr);
  ref_type_t<decltype(value_of(sigma_arr))> sigma_val = value_of(sigma_arr);

So now it's just

  auto&& y_val = to_ref(as_value_column_array_or_scalar(y_ref));
  auto&& alpha_val = to_ref(as_value_column_array_or_scalar(alpha_ref));
  auto&& sigma_val = to_ref(as_value_column_array_or_scalar(sigma_ref));

Note auto&& here has a different meaning then we normally think. Assigning to an auto&& type accepts any type as a reference so it can be auto&, auto&&, or const auto&

Tests

No new tests

Side Effects

Release notes

Tidy up distributions

Checklist

  • Math issue #(issue number)

  • Copyright holder: Steve Bronder

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.44 3.42 1.01 0.61% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.98 -1.98% slower
eight_schools/eight_schools.stan 0.11 0.11 0.99 -0.94% slower
gp_regr/gp_regr.stan 0.15 0.15 1.0 -0.03% slower
irt_2pl/irt_2pl.stan 5.17 5.19 1.0 -0.41% slower
performance.compilation 90.7 88.24 1.03 2.71% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.67 8.58 1.01 1.07% faster
pkpd/one_comp_mm_elim_abs.stan 28.33 30.58 0.93 -7.96% slower
sir/sir.stan 136.32 134.67 1.01 1.21% faster
gp_regr/gen_gp_data.stan 0.04 0.04 0.99 -0.89% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.08 3.02 1.02 2.01% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.39 0.99 -1.16% slower
arK/arK.stan 2.54 1.87 1.36 26.52% faster
arma/arma.stan 0.6 0.77 0.79 -27.33% slower
garch/garch.stan 0.53 0.6 0.88 -13.0% slower
Mean result: 0.998772137262

Jenkins Console Log
Blue Ocean
Commit hash: 4f64dcd


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Copy link
Contributor

@t4c1 t4c1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cleanup!

I just think that auto&& warrants some discussion, since it is used everywhere. It is definitely OK, but it is not the only way. Previously we had const auto& in such cases, however decltype(auto) would also work. All these have subtle differences, but in this case all work the same. I am leaning slightly towards const auto& since it is the simplest option that works here. Unless you have a good argument for auto&&?

Comment on lines 78 to 81
ref_type_t<decltype(promote_scalar<double>(
as_array_or_scalar(as_column_vector_or_scalar(n_obs_ref))))>
n_obs_val = promote_scalar<double>(
as_array_or_scalar(as_column_vector_or_scalar(n_obs_ref)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be less code if you do it in two steps, just like it was done before.

@SteveBronder
Copy link
Collaborator Author

Nice cleanup!

ty!

I just think that auto&& warrants some discussion, since it is used everywhere. It is definitely OK, but it is not the only way. Previously we had const auto& in such cases, however decltype(auto) would also work. All these have subtle differences, but in this case all work the same. I am leaning slightly towards const auto& since it is the simplest option that works here. Unless you have a good argument for auto&&?

Yeah I was actually having difficulty thinking about this. I went with auto&& since it binds to all the reference types nicely. Though it's not as if we move these later so it provides no real benefit besides letting me be lazy. I'd be fine with const&. Just to sanity check, to_ref() here will either return an actual type holding memory for expressions, a view expression, or a reference to the input right? For that it almost feels like we want decltype(auto) so the object type is deduced as exactly the function return.

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.41 3.45 0.99 -1.08% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.07 6.19% faster
eight_schools/eight_schools.stan 0.11 0.12 0.99 -1.17% slower
gp_regr/gp_regr.stan 0.15 0.15 0.98 -1.91% slower
irt_2pl/irt_2pl.stan 5.21 5.22 1.0 -0.08% slower
performance.compilation 91.16 88.28 1.03 3.16% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 7.81 8.09 0.97 -3.53% slower
pkpd/one_comp_mm_elim_abs.stan 28.48 29.27 0.97 -2.77% slower
sir/sir.stan 135.51 138.99 0.97 -2.57% slower
gp_regr/gen_gp_data.stan 0.04 0.04 0.99 -0.91% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.05 2.95 1.03 3.13% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.38 1.0 0.34% faster
arK/arK.stan 2.53 2.52 1.0 0.42% faster
arma/arma.stan 0.7 0.6 1.16 14.01% faster
garch/garch.stan 0.65 0.66 0.98 -1.68% slower
Mean result: 1.00986462707

Jenkins Console Log
Blue Ocean
Commit hash: c8c5e79


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@t4c1
Copy link
Contributor

t4c1 commented Feb 11, 2021

I went with auto&& since it binds to all the reference types nicely.

So do other two options I mentioned. The difference between decltype(auto) and auto&& is only how they are deduced when assigned a rvalue reference. decltype(auto) will be a value, while auto&& will be a rvalue reference. And const auto& is obviously const and always a reference.

Just to sanity check, to_ref() here will either return an actual type holding memory for expressions, a view expression, or a reference to the input right?

Right. As I mentioned all three options work exactly the same. As we don't need to change these variables I'd prefer the const option.

@SteveBronder
Copy link
Collaborator Author

Would you mind if I used decltype(auto)? idk in my brain I know that const& is fine because it holds a constant reference to an object in the local scope. But then my brain is also like "No you are returning back an actual type use decltype(auto) so that the compiler knows it's not a reference

@t4c1
Copy link
Contributor

t4c1 commented Mar 4, 2021

I guess that logic makes just as much sense as what I suggested.

@bbbales2
Copy link
Member

bbbales2 commented Mar 4, 2021

I just like const & cause then I don't have to think. It's just a const reference.

Whenever I see decltype(auto) or && I know something possibly weird is happening and I need to stop and think about rvalues/lvalues and tricky things. const & is kinda no tricks.

Is there a reason a compiler would have trouble with a const &?

@bbbales2
Copy link
Member

bbbales2 commented Mar 4, 2021

But on average this pull simplifies the code that came before it, so if this is ready to merge and you prefer auto&& or decltype(auto), sounds good to me.

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.42 3.39 1.01 0.81% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.01 1.45% faster
eight_schools/eight_schools.stan 0.12 0.11 1.07 6.32% faster
gp_regr/gp_regr.stan 0.16 0.16 1.01 1.02% faster
irt_2pl/irt_2pl.stan 5.23 5.26 0.99 -0.61% slower
performance.compilation 91.4 88.86 1.03 2.78% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.63 8.58 1.01 0.66% faster
pkpd/one_comp_mm_elim_abs.stan 31.16 30.11 1.03 3.35% faster
sir/sir.stan 130.95 135.43 0.97 -3.42% slower
gp_regr/gen_gp_data.stan 0.04 0.04 1.0 -0.03% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.99 2.91 1.03 2.51% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.37 1.03 3.31% faster
arK/arK.stan 1.8 1.81 1.0 -0.47% slower
arma/arma.stan 0.76 0.6 1.28 21.73% faster
garch/garch.stan 0.54 0.57 0.96 -4.53% slower
Mean result: 1.02804549689

Jenkins Console Log
Blue Ocean
Commit hash: 16318f0


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@SteveBronder
Copy link
Collaborator Author

Is there a reason a compiler would have trouble with a const &?

@t4c1 do you know a reference somewhere I can read about the rules surrounding cont& as the object type? I'm just not sure what to google to read more about how that works

If yinz are fine with decltype then I think this is good to go

@bbbales2
Copy link
Member

bbbales2 commented Mar 6, 2021

Gaaah I wanna say yes and move on with the deserializer but then I went and read about decltype(auto). From here:

auto a = 1 + 2;            // type of a is int
...
decltype(auto) c1 = a;   // type of c1 is int, holding a copy of a
decltype(auto) c2 = (a); // type of c2 is int&, an alias of a

Ugh what a weird thing.

Can we just be plain const &? Everything comes into these functions as const & types so I don't think that would be a major problem. Also the other temporaries in these functions are const &.

I keep flipping flopping on this in the most useless way.

@SteveBronder
Copy link
Collaborator Author

SteveBronder commented Mar 7, 2021

These are used like

decltype(auto) x_ref = to_ref(as_value_array_or_scalar(x))

and since to_ref() and as_value_array_or_scalar() both return references in the case of matrices/vectors with scalars of double the above behavior is exactly what we want!

bbbales2
bbbales2 previously approved these changes Mar 7, 2021
Copy link
Member

@bbbales2 bbbales2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good

@bbbales2
Copy link
Member

bbbales2 commented Mar 7, 2021

Alright decltype it is. I scanned through this just to make sure everything was in order. There were some leftover things in beta_lpdf. I deleted those things and grepped and there were a couple other stragglers. Diff here.

@SteveBronder
Copy link
Collaborator Author

Nice! Yeah there's a few other places we could clean up like the glm functions but I wanted to make this PR just the layups that were obvious

Copy link
Member

@bbbales2 bbbales2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good (previous review got dismissed)

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.49 3.39 1.03 2.77% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.99 -1.06% slower
eight_schools/eight_schools.stan 0.11 0.12 0.94 -5.9% slower
gp_regr/gp_regr.stan 0.16 0.16 0.97 -3.13% slower
irt_2pl/irt_2pl.stan 5.19 5.16 1.0 0.43% faster
performance.compilation 90.62 88.76 1.02 2.05% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.68 8.6 1.01 0.89% faster
pkpd/one_comp_mm_elim_abs.stan 31.61 30.01 1.05 5.05% faster
sir/sir.stan 132.38 132.06 1.0 0.24% faster
gp_regr/gen_gp_data.stan 0.04 0.04 0.98 -1.88% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.98 2.93 1.02 1.91% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.4 0.38 1.04 4.07% faster
arK/arK.stan 1.8 1.78 1.01 0.88% faster
arma/arma.stan 0.75 0.59 1.26 20.41% faster
garch/garch.stan 0.54 0.57 0.94 -5.95% slower
Mean result: 1.01828940029

Jenkins Console Log
Blue Ocean
Commit hash: ae95fa0


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@bbbales2 bbbales2 merged commit c466ddf into develop Mar 7, 2021
@rok-cesnovar rok-cesnovar deleted the cleanup/dist-as-value-array branch March 7, 2021 12:54
@t4c1
Copy link
Contributor

t4c1 commented Mar 8, 2021

@SteveBronder You may want to look at how "lifetime extension" works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants