Releases: pola-rs/polars

Python Polars 0.16.11

05 Mar 17:49
e558689

🚀 Performance improvements

  • optimize str.replace_all (#7353)
  • optimize str.replace ~2x improvement (#7347)
  • ensure utf8 apply preallocates memory (#7345)

✨ Enhancements

  • make LazyFrame.explode streamable. (#7341)
  • allow import of dtype groups from the top-level to improve discovery (#7339)

🐞 Bug fixes

  • make decimal types opt-in (#7348)
  • fix chunk_sizes in threading apply (#7351)
  • don't panic when writing NullArray values to python row tuple (#7346)

🛠️ Other improvements

  • add write_excel API docs link (#7338)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @ritchie46 and @s-banach

Python Polars 0.16.10

03 Mar 20:13
d5ca27e

🏆 Highlights

  • Excel export support via new write_excel IO method (#7251)
  • out of core sort on multiple columns (#7244)

🚀 Performance improvements

  • improve batched csv readers perf and memory perf (#7329)
  • use inlined strings for field and schema (#7272)
  • reuse groups in binary expressions (#7202)

✨ Enhancements

  • support creation of sparklines when exporting Excel tables (#7333)
  • support sqlalchemy/pandas backed write_database (#7322)
  • add adbc database reader and writer (DataFrame.write_database) (#7318)
  • make expr.apply streamable in selection context (#7316)
  • More ergonomic unnest args (#7310)
  • initial working version of Decimal Series (#7220)
  • Support explicit Binary dtype in constructor (#7305)
  • implement serde for literal datetime and series (#7301)
  • improve error message if mmap fails in ipc (#7300)
  • add multi-threaded apply (#7277)
  • add support for serializing categoricals to json (#7276)
  • Add Expr.arg_true (#7056)
  • don't require pyarrow for initialising Series with Python datetimes (#7273)
  • Excel export support via new write_excel IO method (#7251)
  • deprecate describe_(optimized)_plan in favor of explain (#7264)
  • enable min-max skipping for binary in parquet, enable min-max skipping for is_in exprs (#7169)
  • out of core sort on multiple columns (#7244)
  • support nulls_last for multi-column sort (#7242)
  • allow optimizations flags in describe_plan (#7233)
  • implement row encoding for boolean and binary (#7218)
  • allow passing utc=True when parsing time-zone-naive date strings (#7203)
  • Add **named_exprs input for struct (#7208)
  • add sql "ARRAY_AGG" (#7204)

🐞 Bug fixes

  • fix offset in threading apply (#7330)
  • fix projection pushdown on join with unused join key (#7326)
  • raise error on time -> datetime cast (#7325)
  • raise error if output of 'apply' cannot be determined (#7317)
  • make pl.struct mappable (#7299)
  • err on duplicate with_column names (#7296)
  • don't panic on str.parse_int (#7072)
  • improve concat_list with empty list error message (#7236)
  • fix groupby_dynamic's binning when index_column is time-zone-aware (#7278)
  • fix preservation of microseconds when converting Python datetime (#7271)
  • fix us precision of datetime to anyvalue conversion (#7268)
  • no panic on empty cross join (#7266)
  • raise error on ambiguous filter predicates (#7265)
  • handle concat_list with first lit value (#7235)
  • respect schema in DataFrame initialisation for time-zone-aware datetime (#7240)
  • ensure every type is properly normalised (for groupby_dynamic and groupby_rolling) (#7238)
  • add test of median function in lazy mode (#7224)
  • dont lose precision in pl.date_range due to floating point arithmetic (#7229)
  • Conversion of negative timedelta to polars duration (#7209)
  • ensure parametric testing cols=int definition respects allowed_dtypes (#7213)

🛠️ Other improvements

  • Fix read/write_database tests (#7327)
  • Rename scan_ds to scan_pyarrow_dataset (#7320)
  • don't run tests that write to disk by default (#7321)
  • rename read_sql to read_database (#7315)
  • Address git2 vulnerability (#7309)
  • Correctly deprecate DataFrame.pearson_corr (#7307)
  • Skip write_excel doctests (#7306)
  • Run pytest-xdist with worksteal (#7304)
  • Rename pearson_corr & spearman_rank_corr (#7014)
  • Split io module per type (#7295)
  • Move _html module to dataframe module (#7256)
  • Enable strict for ruff TCH lints (#7234)
  • better select on map_dict dtype (#7217)
  • add warning of mmap to ipc docstring (#7216)
  • exit non-zero on fix from ruff (#7215)
  • ensure that DataFrame and LazyFrame init params don't diverge (#7214)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @aldanor, @alexander-beedie, @coinflip112, @csko, @dependabot, @dependabot[bot], @ghuls, @josemasar, @josh, @mslapek, @nrebena, @ozgrakkurt, @papparapa, @ptiza, @rben01, @ritchie46, @sorhawell, @stinodego, @universalmind303, @xyning and @zundertj

Python Polars 0.16.9

26 Feb 10:31
cc487a1

🚀 Performance improvements

  • improve perf of multi-args exprs in groupby context (#7186)
  • optimize sequence_to_pydf (#7044)
  • improve single argument elementwise expression pe… (#7180)

✨ Enhancements

  • show column name if read_csv errors (#7177)
  • support direct LazyFrame init (same params as DataFrame) (#7122)
  • add a base_type method to DataType (#7166)
  • add explode for binary (#7159)
  • add binary apply (#7160)
  • Allow pl.Int32 Series as output in eager repeat. (#7152)
  • improve error message when read_csv fails (#7150)
  • Improve usability of Null type. (#7136)
  • add glob support to scan_ndjson (#7143)
  • add Expr.pipe (#7134)
  • streaming: scale chunk_size on table width (#7119)
  • additional read functions (#7102)
  • More ergonomic explode args (#7115)

🐞 Bug fixes

  • make list function 'map' and refactor multi-arg ex… (#7185)
  • Fix Series.argsort (#7183)
  • validate trees before inserting streaming node (#7179)
  • Raise ValueError for getitem when column indexes are out… (#7167)
  • fix list take logical types (#7163)
  • fix null cmp fast paths (#7157)
  • fix df division dispatch (#7155)
  • don't panic on unsupported arithmetic type (#7154)
  • don't let a cast unset agg_state and keep logical … (#7151)
  • expose sort expressions to stack-optimizer (#7148)
  • improve error message when read_csv fails (#7150)
  • make cast unknown a no-op (#7147)
  • fix panic on cum_prod (#7141)
  • respect f32 schema in deep expressions (#7146)
  • fix return type of _unpack_schema to prevent potential TypeError (#7128)
  • fix docstring in set_tbl_cols() (#7121)
  • fix deadlock in scan_csv()->sink_parquet() (#7118)
  • subtracting Series from date has wrong sign (#7107)
  • fix scan_ipc receiving storage_options (#7085)
  • nested sql exprs (#7112)

🛠️ Other improvements

  • ensure binary branches are executed in parall… (#7193)
  • Deprecate pl.get_dummies (#7055)
  • Add Series.cut, deprecate pl.cut (#7058)
  • examples functional programming (#7135)
  • fix docstring in set_tbl_cols() (#7121)
  • Build versioned API reference (#7114)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @Trippy3, @alexander-beedie, @foxcroftjn, @ghuls, @iamsmkr, @jakob-keller, @josh, @mslapek, @papparapa, @ritchie46, @romanovacca, @stinodego, @universalmind303 and @zundertj

Python Polars 0.16.8

22 Feb 14:56
55a2043

🚀 Performance improvements

  • optimize arr.sum for list array with inner nulls (#7053)
  • optimize arr.min/arr.max (#7050)
  • optimize arr.mean (#7048)
  • optimize arr.sum (#7047)
  • optimize 'arg_where' (#7039)
  • More efficient handling of *args/**kwargs (#7026)

✨ Enhancements

  • allow for simple creation of n-row empty frame/series via clear (#7095)
  • Make polars not copy data when importing from arrow (#7084)
  • More ergonomic drop args (#7063)
  • More ergonomic partition_by args (#7065)
  • More ergonomic exclude args (#7082)
  • allow inline expressions in asof_join (#7088)
  • add 'use_statistics' option to parquet readers (#7087)

🐞 Bug fixes

  • allow map_dict on categorical dtype (#7097)
  • fix logical types in arr.get (#7094)
  • allow fill_null in eager if type now known (#7092)
  • do projection just before concat to ensure same sizes (#7089)
  • fix 'filter' in groupby context when expression is… (#7041)
  • fix type hint of 'when->then->otherwise' (#7040)
  • accept more types in from_records (#7033)

🛠️ Other improvements

  • Rename pivot aggregate_fn to aggregate_function (#7059)
  • Add TYPE_CHECKING lints (#7070)
  • Deprecate more non-keyword arguments (#7030)
  • Rename kwarg reverse to descending (#6914)
  • Rename args f/func to function (#7032)
  • let read_csv take Sequence as columns, remove several type: ignore (#7028)
  • add example for arr.count_match() (#7029)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @coinflip112, @datapythonista, @jakob-keller, @moritzwilksch, @ritchie46, @stinodego, @universalmind303 and @zundertj

Python Polars 0.16.7

19 Feb 18:14
f3f8961

🚀 Performance improvements

  • add arr.count_match expression and optimize arr.sum for List<Boolean> (#7023)
  • optimize selection_to_pyexpr_list (#7020)
  • avoid unnecessary function calls in LazyFrame.with_columns() (#7019)
  • remove O(n^2) behavior in melt (#7003)
  • Improve performance of expr_to_lit_or_expr for arguments of type Expr by ~80% (#6967)
  • improve vec_hash perf for boolean and utf8 (#6963)
  • don't pack utf8 columns in grouptuples ~5-15% (#6959)
  • don't pack integer keys in determining ~8-18% group tuples. (#6956)
  • use fxhash for all integers (#6954)

✨ Enhancements

  • add arr.count_match expression and optimize arr.sum for List<Boolean> (#7023)
  • add sort for struct dtype (#7021)
  • More ergonomic coalesce args (#6989)
  • raise informative error if invalid datetime_format passed to write_csv (#7005)
  • Improve Series & Numpy arithmetic (#6983)
  • More ergonomic agg args (#6982)
  • rename parse_dates => try_parse_dates (#6987)
  • remove packaging and/or distutils dependency with a minimal version parser utility (#6972)
  • More ergonomic over args (#6986)
  • add upper_bound and lower_bound methods to Series (#6990)
  • More ergonomic col args (#6996)
  • More ergonomic sort args (#6896)
  • Make groupby agg shortcuts available in lazy (#6944)
  • add map_dict method for Series (#6946)
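
To give an idea of what a "minimal version parser utility" can mean, here is a standard-library sketch. It is not the code from #6972; the function name `parse_version` and its behaviour on suffixes such as "rc1" are assumptions made for illustration.

```python
from __future__ import annotations

import re


def parse_version(version: str) -> tuple[int, ...]:
    """Hypothetical helper: reduce '0.16.7' or '1.2.3rc1' to a tuple of ints."""
    parts = []
    for piece in version.split("."):
        # Keep only the leading digits of each dot-separated piece.
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first piece with no leading number
        parts.append(int(match.group()))
    return tuple(parts)


# Tuples compare element by element, so 1.10.0 sorts after 1.9.5.
newer = parse_version("1.10.0") > parse_version("1.9.5")
```

Comparing tuples of integers avoids the string-comparison trap where "1.10.0" sorts before "1.9.5", which is usually all a dependency check needs.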

🐞 Bug fixes

  • reflect time zone conversion in lazy dataframe schema (#7022)
  • ensure set_sorted never panics (#7013)
  • fix struct append 0 sliced (#7012)
  • fix dtype of diff for uint8 (#7010)
  • fix coalesce supertype (#7000)
  • if given, respect dtype time zone when instantiating pl.lit value (#6999)
  • fix fill_null for categoricals (#6998)
  • dtype of pow function (#6985)
  • fix is_duplicated for utf8 dtype (#6997)
  • Remove check for path to be non-directory if use_pyarrow (#6994)
  • if given, respect dtype timeunit when instantiating pl.lit value (#6991)
  • Add packaging to runtime dependencies (#6962)
  • fix temporal logical types in pivot (#6957)
  • typo in mean unit test - changed median -> mean (#6960)
  • ensure literals are expanded in streaming (#6952)
  • str.contains strict=False took no effect (#6950)

🛠️ Other improvements

  • date-time unit tests refactor (#7002)
  • test lit series arithmetic order (#7015)
  • More test restructure (#6961)
  • Properly deprecate .struct.to_frame (#6958)
  • Properly deprecate GroupBy.agg_list (#6943)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @MatveyF, @alexander-beedie, @jakob-keller, @mslapek, @ozgrakkurt, @papparapa, @ritchie46, @sorhawell, @stinodego, @xhochy and @zundertj

Python Polars 0.16.6

16 Feb 18:32
e705e3a

✨ Enhancements

  • add is_duplicated/is_unique for struct dtype (#6940)
  • add is_between method for Series (#6933)
  • support nested fixedsizebinary conversion (#6923)
  • raise error on invalid aggregation expressions (#6921)
  • provide better errors when failing to read CSV data from buffers that have advanced their read position (#6920)
  • truncate file path on error msg (#6917)
  • Parse JSON data in Utf8 to polars dtype (#6885)
  • More ergonomic groupby args (#6872)

🐞 Bug fixes

  • object to_dict (#6931)
  • respect maintain_order in groupby.apply (#6926)
  • add special fast path for elementwise expression o… (#6924)
  • fix anonymous list builder (#6916)
  • reject multithreading on excessive ',\n' fields (#6906)
  • fix regression with date => object typing in to_pandas method (#6902)
  • dispatch suffix to asof_join by (#6899)
  • improve recursive casting of nested data (#6897)
  • don't fast explode on null introducing take (#6890)
  • prevent external modules found on PYTHONPATH from bleeding into polars venv (#6888)
  • prevent conflation of unit.io tests directory with python io module (#6889)

🛠️ Other improvements

  • Bump ruff version (#6936)
  • add more nested construction tests (#6912)
  • Update Cargo.lock (#6893)
  • unify constructor logic when initialising from a sequence of dicts (#6887)
  • prevent conflation of unit.io tests directory with python io module (#6889)
  • refactor datelike as temporal, and support Time dtype in Series.to_numpy (#6881)
  • Consistently parse column name inputs (#6879)
  • Use Self type more consistently (#6882)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @adamgreg, @alexander-beedie, @josh, @jvdd, @ritchie46 and @stinodego

Python Polars 0.16.5

14 Feb 19:12
86182b8

🚀 Performance improvements

  • speedup quantile/median ~2x (#6861)
  • remove unneeded series allocations in groupby aggs (#6855)

✨ Enhancements

  • restore dataframe class (#6869)
  • add include_index option on init from pandas frames (#6847)
  • properly implement null array (#6817)
  • avoid panic error in strftime with invalid format (#6810)

🐞 Bug fixes

  • fix crash in write_csv when mixed tz-naive and tz-aware datetimes are present (#6828)
  • accept more types in groupby.agg (#6709)
  • Fix pl.from_dataframe() as pyarrow.interchange was not i… (#6844)
  • fix schema of functions: (#6845)
  • stabilize integer operation to minimal required dtype (#6841)
  • use explicit type-arg for PythonDataType (#6481)
  • fix numpy/datetime regression (#6835)
  • implement to_list for null dtype (#6834)
  • Raise ValueError on passing multiple expressions Numpy ufunc (#6821)
  • respect schema in ndjson (#6819)

🛠️ Other improvements

  • Fail tests on warning (#6868)
  • further improve struct expr docstrings (#6852)
  • Deprecate non-keyword args for some functions (#6851)
  • un-skip passing test (#6854)
  • parenthesise col type signature to improve hint interaction with PyCharm (#6850)
  • Deprecate positional join args (#6826)
  • Rename argsort/argsort_by to arg_sort/arg_sort_by (#6829)
  • Update dprint config excludes (#6822)
  • Fix some broken noqa comments (#6823)
  • Run mypy as part of the lint workflow (#6820)
  • various minor docstring rendering fixes (#6818)
  • fix lazy groupby docstring/rendering (#6816)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @ghuls, @josh, @kngwyu, @oysols, @ritchie46, @stinodego and @zundertj

Python Polars 0.16.4

11 Feb 10:33
c060bef

🚀 Performance improvements

  • remove PySequence downcast (#6803)
  • optimize arg_min/arg_max (#6799)

✨ Enhancements

  • boolean Series broadcast comparison (eq/neq) against scalar True/False (#6797)

🐞 Bug fixes

  • ensure join frame types are consistent (#6798)
  • enable empty DataFrame (and Series) init from List of Structs/Lists schema/dtype (#6795)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @igmriegel and @ritchie46

Rust Polars 0.27.0

10 Feb 17:43
c6a3df0

🏆 Highlights

  • Formalize list aggregation difference between groupbys, selection and window functions (#6487)

⚠️ Breaking changes

  • error on string <-> date cmp (#6498)
  • Formalize list aggregation difference between groupbys, selection and window functions (#6487)
  • show where error messages originated (#6482)
  • str.strip with multiple chars (#5929)

🚀 Performance improvements

  • update string replacement codepaths following new benchmarking (#6777)
  • improve dynamic groupby performance on sorted keys (#6599)
  • faster frame-init from list of dicts (when omitting fields), and ensure fields are read according to the declared schema (#6472)
  • Improve rechunk check (#6268)
  • reuse allocated scratches in ipc writer (#6287)
  • use dedicated writer thread for sink_parquet (#6285)
  • first check rev-map on categorical equality check (#6085)
  • ensure set_at_idx is O(1) (#5977)
  • use iterator instead of loop polars_io::csv::parser::skip_condition (#5157)

✨ Enhancements

  • accept separator for pivot and to_dummies (#6780)
  • rename 'tz' to 'time_zone' in convert_time_zone and replace_time_zone (#6784)
  • rename with_time_zone to convert_time_zone and cast_time_zone to replace_time_zone (#6768)
  • support timezone in csv writer (#6722)
  • implement series abstractions for Int128Type (#6679)
  • parse timezone from Datetime (#6766)
  • formally support duration division (#6758)
  • add argmin/max for utf8 data (#6746)
  • Support an ignore_nulls param for EWM calculations. (#5749) (#6742)
  • deprecate tz_localize (#6693)
  • guarantee schema-stable col(dtype) selection (#6674)
  • better-characterise NotFound exceptions (#6670)
  • disallow with_time_zone from/to tz-naive (#6659)
  • let cast_time_zone work on tz-naive and deprecate tz-localize (#6649)
  • implement fill_null for list data (#6635)
  • expression functions should be nullable (#6629)
  • add streamable udfs (#6614)
  • is_first for struct dtype (#6595)
  • Added from_str_radix method to StringNameSpace that allows to parse strings from any base to i32 (#6570)
  • improve predicate pushdown (#6579)
  • raise error on invalid binary cmp (#6564)
  • let cast_time_zone accept None (#6539)
  • add utc parameter to strptime (#6496)
  • add meta 'has_multiple_outputs', 'is_regex_projec… (#6500)
  • error on string <-> date cmp (#6498)
  • show where error messages originated (#6482)
  • faster frame-init from list of dicts (when omitting fields), and ensure fields are read according to the declared schema (#6472)
  • allow expr in str.contains (#6443)
  • add float formatting option (#6432)
  • allow expressions as arguments in str.ends_with (#6361)
  • accept expr in str.starts_with (#6355)
  • add strict parameter to decoding expressions (#6342)
  • allow unordered struct creating from anyvalues (#6321)
  • parse abbrev month name (#6314)
  • add dt.combine for combining date and time components (#6121)
  • add sink_ipc (#6286)
  • ensure ooc sort works ooc with all-constant values (#6235)
  • The 1 billion row sort (#6156)
  • optionally treat missing UTF8 values as the empty string at CSV parse-time (#6203)
  • When moving error out of LogicalPlan, leave behind String with error message instead of None (#6199)
  • generalize the cloud storage builders (#5972)
  • Implement DataFrame.unique(keep="none") (#6169)
  • add arr.take expression (#6116)
  • allow extend_constant to work with date literals (#6114)
  • allow nested categorical cast (#6113)
  • add a rounded_corners modifier to pl.Config.set_tbl_formatting (#6108)
  • Get polars to compile to wasm target (#6050)
  • add search_sorted for arrays and utf8 dtype (#6083)
  • improve error message when writing nested data to… (#6040)
  • updated default table format from "UTF8_FULL" to "UTF8_FULL_CONDENSED" (#5967)
  • str.strip with multiple chars (#5929)
  • support glob in parquet object_storage (#5928)
  • read decimal as f64 (#5938)
  • improve query plan scan formatting (#5937)
  • allow all null cast (#5933)
  • truncate by calendar weeks (#5759)
  • merge sorted dataframes (#5817)
  • impl hex and base64 for binary (#5892)
  • streaming parquet from object_stores (#5871)

🐞 Bug fixes

  • always rechunk if n_chunks > n_rows (#6786)
  • fix ndjson empty array parsing (#6785)
  • make some list expressions aware of groupby context (#6776)
  • use explicit drop function node (#6769)
  • don't set sorted flag if we reverse sort the left … (#6772)
  • handle edge-case with string-literal replacement when the replace value looks like a capture group (#6765)
  • respect skip_rows in glob parsing csv (#6754)
  • Improve error message in DataFrame constructor (#6715)
  • arrow map dtype conversion (#6732)
  • dedicated rename implementation. (#6688)
  • return correct display/repr names for NaN-related expressions (#6721)
  • strftime with time zone directive (#6673)
  • improve error message in date_range with invalid units (#6671)
  • remove uses of rayon global thread pool (#6682)
  • true-divide output type (#6665)
  • cast to and from fixed offsets (#6602)
  • raise error on string numeric arithmetic (#6601)
  • partially assert sortedness in groupby dynamic (#6593)
  • raise oob if negative index given to take (#6590)
  • fix predicate pushdown key check (#6577)
  • fix schema of apply with many inputs on empty df (#6571)
  • let lhs determine struct order in supertype (#6572)
  • validate utc, fmt, and tz-aware in strptime (#6550)
  • add strptime to filter boundary (#6560)
  • list eval all null array (#6545)
  • implement ser/de for BinaryChunked (#6543)
  • raise if tz_localize called on UTC-aware (#6526)
  • make concat_list group aware (#6527)
  • error on invalid expanding expression (#6521)
  • create from dicts directly as struct categorical (#6520)
  • fix oob in arr.get by expressions (#6519)
  • fix cse schema (#6518)
  • panic when max_len -1 is reached (#6494)
  • Formalize list aggregation difference between groupbys, selection and window functions (#6487)
  • validate tz in with_time_zone (#6417)
  • faster frame-init from list of dicts (when omitting fields), and ensure fields are read according to the declared schema (#6472)
  • use consistent floor division for floats/ints (#6460)
  • split semi/anti join optimization (#6459)
  • fix doc comment in ParallelStrategy (#6444)
  • fix projection pushdown on double semi join (#6440)
  • cumulative_eval ensure output dtype is respected (#6435)
  • auto-detect %+ as tz-aware (#6434)
  • correct error message in cast_time_zone (#6411)
  • only use float simd on specific alignment (#6427)
  • no early escape when window is equal to len in rolling_float (#6408)
  • raise error on invalid sort_by argument (#6382)
  • take offset into account with str.explode (#6384)
  • Return empty batch for pl.read_csv_batched().next_… (#6381)
  • implement ser/de for StructChunked (#6359)
  • series of empty structs (#6347)
  • don't cast nulls before trying normal cast (#6339)
  • expand all nested wildcards in functions (#6334)
  • fix groupby rolling by_key if groups are empty (#6333)
  • parse abbrev month name (#6314)
  • disallow alias in inline join expressions (#6312)
  • feature flag "get_sink" ipc (#6306)
  • block proj-pd and pred-pd on swapping rename (#6303)
  • convert nested dictionary with i64 keys (#6299)
  • fix panic dynamic_groupby on empty dataframe (#6294)
  • Parse negative dates with polars parser (#6256)
  • Add list inner dtype when printing Series (#6233)
  • fix when then otherwise with arity and aggregation… (#6224)
  • pass name to value counts in aggregation (#6221)
  • don't set fast_explode on list of structs (#6220)
  • explode of empty nullable list (#6190)
  • fix empty streaming joins (#6149)
  • fix streaming joins where the join order has been … (#6143)
  • write tz-aware datetimes to csv (#6135)
  • Print error message on mmap IPC file only in verbose mode (#6098)
  • fix invalid dtype in chunked array after struct cast (#6093)
  • don't run cse cache_states if no projections found (#6087)
  • Update read_csv error message (#6082)
  • propagate nulls in binary arithmetic/aggregation (#6076)
  • deal with unnest schema expansion in projection pd (#6063)
  • correct output dtype for cummin/cumsum/cummax (#6062)
  • block streaming on literal series/range (#6058)
  • ndjson struct inference (#6049)
  • deal with empty structs (#6039)
  • fix aggregation that filters out all data (#6036)
  • fix diff overflow (#6033)
  • keep column names in is_null/is_not_null (#6032)
  • keep name when sorting categorical in lexical order (#6029)
  • properly set null anyvalue if categorical is neste… (#6025)
  • make weekday tz-aware (#5989)
  • fix categorical in struct anyvalue issue (#5987)
  • fix invalid boolean simplification (#5976)
  • allow empty sort on any dtype (#5975)
  • properly deal with categoricals in streaming queries (#5974)
  • don't panic on ignored context (#5958)
  • don't allow named expression in arr.eval (#5957)
  • fix panic in join expressions (#5954)
  • block ordered predicates before explode (#5951)
  • adhere to schema in arr.eval of empty list (#5947)
  • fix arrow nested null conversion (#5946)
  • allow None in arr.slice length (#5934)
  • fix time to duration cast (#5932)
  • error on addition with datetime/time (#5931)
  • don't create categoricals in streaming (#5926)
  • object filter should keep single chunk (#5913)
  • csv, read escaped "" as missing (#5912)
  • fix pivot of signed integers (#5909)
  • fix latest oob in streaming conversion (#5902)
  • fix date + duration offsets outside of nanosecond datetime bounds (#5889)
  • adapt k to len in topk (#5888)

🛠️ Other improvements

  • propagate error in date_range with invalid time zone (#6759)
  • update arrow to 0.16 (#6748)
  • remove unreachable path in write_anyvalue (#6727)
  • add groupby_dynamic to docs (#6725)
  • chore(rust) disallow chunked d...

Python Polars 0.16.3

10 Feb 17:13
b5a1e74

✨ Enhancements

  • add update method to ldf/df (#6787)
  • accept separator for pivot and to_dummies (#6780)
  • rename 'tz' to 'time_zone' in convert_time_zone and replace_time_zone (#6784)
  • Allow other expressions for default arg in map_dict (#6781)
  • minor ergonomic affordance; allow pl.concat from generator expression (#6779)
  • rename with_time_zone to convert_time_zone and cast_time_zone to replace_time_zone (#6768)
  • Add map_dict expression. (#5899)
  • support timezone in csv writer (#6722)
  • default to 1d interval in date_range (#6771)
  • parse timezone from Datetime (#6766)
  • Add option to use PyArrow backed-extension arrays when … (#6756)
  • formally support duration division (#6758)
  • add argmin/max for utf8 data (#6746)
  • Improve numpy support: conversion of numpy arrays with … (#6738)
  • Improved assert equal messages (#6737)
  • Support an ignore_nulls param for EWM calculations. (#5749) (#6742)
  • scan_ds predicate pushdown for string cmp (#6734)
  • don't require pyarrow for utf8 -> numpy conversion (#6733)
  • More ergonomic with_columns args (#6686)
  • Add return_as_string arg to DF.glimpse; default=False (#6678)
  • better-characterise NotFound exceptions (#6670)
  • disallow with_time_zone from/to tz-naive (#6659)
  • More ergonomic select args (#6667)
  • let cast_time_zone work on tz-naive and deprecate tz-localize (#6649)
  • improved exceptions on attempt to use invalid schema/dtypes (#6653)

🐞 Bug fixes

  • always rechunk if n_chunks > n_rows (#6786)
  • fix ndjson empty array parsing (#6785)
  • make some list expressions aware of groupby context (#6776)
  • use explicit drop function node (#6769)
  • don't set sorted flag if we reverse sort the left … (#6772)
  • handle edge-case with string-literal replacement when the replace value looks like a capture group (#6765)
  • respect skip_rows in glob parsing csv (#6754)
  • Improve error message in DataFrame constructor (#6715)
  • arrow map dtype conversion (#6732)
  • respect 'None' in from_dicts (#6726)
  • dedicated rename implementation. (#6688)
  • return correct display/repr names for NaN-related expressions (#6721)
  • strftime with time zone directive (#6673)
  • typing for Series methods that can return None (#6690)
  • ensure that iter_rows always returns all values from all chunks/batches in accelerated codepath (#6708)
  • Support numpy ufunc when expression not first arg (#6675)
  • Raise ValueError on adding float to Series of dtype date (#6677)
  • remove uses of rayon global thread pool (#6682)
  • true-divide output type (#6665)
  • improve behaviour of dict-expansion (scalars) when mixed with numpy arrays (#6663)
  • Preserve Expr name in is_between (#6661)
  • Tiny improvement of Field repr (#6640)

🛠️ Other improvements

  • Update mypy to version 1.0.0 (#6744)
  • integrate ignore_nulls into EWM parametric tests (#6751)
  • redirect tz_localize (#6749)
  • Reorganize benchmark test folder (#6695)
  • Split long test modules (namespaces) (#6668)
  • Use pytest marker for slow tests (#6642)
  • unify nan_to_null and nan_to_none parameter names, expose to DataFrame init, add test coverage (#6637)
  • update extend_constant docs/typing (and test coverage) (#6646)

Thank you to all our contributors for making this release possible!
@AnatolyBuga, @MarcoGorelli, @MatveyF, @alexander-beedie, @ghuls, @jgmartin, @phaile2, @plaflamme, @ritchie46, @sorhawell, @stinodego, @yuntai and @zundertj