Expr.shuffle uses different order per column #18233
Labels
documentation
Improvements or additions to documentation
Description
If Expr.shuffle is used on multiple columns, by default it doesn't keep the rows intact:
Note that columns "a" and "b" were previously always the same in each row but are not after using Expr.shuffle.
The behavior I would expect (rows shuffled instead of each column) can be achieved by using a fixed seed:
Note that now column "a" always equals column "b" as in the original DataFrame.
Suggestion
to
Though maybe that doesn't quite cover all the different things you can do with an Expr, I don't know.
Other possibly affected functions
Expr.sort: currently says "Sort this column.", when "Sort each column in this expression" might be better (?).
Expr.sample: currently says "Sample from this expression.", when it might better say "Sample from each column in this expression" (?).
Used Versions
>>> pl.__version__
'0.20.18'
>>> sys.version
'3.11.8 (main, Feb 25 2024, 03:41:44) [MSC v.1929 64 bit (AMD64)]'
Link
https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.shuffle.html