feat(python): Add a low-friction sql method for DataFrame and LazyFrame #15783

Merged
merged 5 commits into from
Apr 22, 2024

alexander-beedie
Collaborator

@alexander-beedie alexander-beedie commented Apr 19, 2024

Introduces a new sql() method for both DataFrame and LazyFrame. Makes use of SQLContext internally, streamlining the usage of ad-hoc SQL on individual frames.

Features

  • Automatically registers the calling frame as "self" in the SQL context.
  • Plenty of examples and docstring detail.

from datetime import date
import polars as pl

df = pl.DataFrame({
    "a": [1, 2, 3],
    "b": ["zz", "yy", "xx"],
    "c": [date(1999, 12, 31), date(2010, 10, 10), date(2077, 8, 8)],
})

df.sql("SELECT a::float4, b, EXTRACT(year FROM c) AS c_year FROM self WHERE a > 1")
# shape: (2, 3)
# ┌─────┬─────┬────────┐
# │ a   ┆ b   ┆ c_year │
# │ --- ┆ --- ┆ ---    │
# │ f32 ┆ str ┆ i32    │
# ╞═════╪═════╪════════╡
# │ 2.0 ┆ yy  ┆ 2010   │
# │ 3.0 ┆ xx  ┆ 2077   │
# └─────┴─────┴────────┘

Also

  • Recognises additional PostgreSQL float casts: ::float4 (32-bit), ::float8 (64-bit), and ::float(n) (adaptive).
  • Allows the full range of ::timestamp(n) precision modifiers (1-6).
  • Fixes an (unlikely) edge-case in table name registration.

@github-actions bot added the enhancement, python, and rust labels Apr 19, 2024
@alexander-beedie changed the title from "feat: Add a low-friction sql DataFrame/LazyFrame method" to "feat(python): Add a low-friction sql method for DataFrame and LazyFrame" Apr 19, 2024
@alexander-beedie removed the rust label Apr 19, 2024

codspeed-hq bot commented Apr 19, 2024

CodSpeed Performance Report

Merging #15783 will not alter performance

Comparing alexander-beedie:frame-level-sql-query (1bd92d3) with main (937fd46)

Summary

✅ 22 untouched benchmarks

@stinodego
Member

Makes a lot of sense to be able to execute SQL queries directly on a frame! Very happy with this addition.

I'll put on my nitpick hat and go through this a bit later, but definitely a good move in my opinion👍

@alexander-beedie
Collaborator Author

alexander-beedie commented Apr 19, 2024

Fixing the failing test and making float-type parsing more conformant with PostgreSQL. I've been reading their float documentation [1] in more detail and spotted some notes about the "float(n)" definition 👀

Footnotes

  1. https://www.postgresql.org/docs/current/datatype-numeric.html#DATATYPE-FLOAT

@alexander-beedie added the A-sql label Apr 19, 2024

codecov bot commented Apr 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.39%. Comparing base (d11da5e) to head (1bd92d3).
Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15783      +/-   ##
==========================================
- Coverage   81.37%   80.39%   -0.98%     
==========================================
  Files        1379     1264     -115     
  Lines      176843   165428   -11415     
  Branches     2543        0    -2543     
==========================================
- Hits       143908   133001   -10907     
+ Misses      32452    32427      -25     
+ Partials      483        0     -483     


Member

@ritchie46 ritchie46 left a comment

This looks good to me. I don't think it contains anything controversial, so I will go on and merge this. ;)

Love the functionality!

@ritchie46 ritchie46 merged commit a078d0c into pola-rs:main Apr 22, 2024
27 checks passed
@alexander-beedie alexander-beedie deleted the frame-level-sql-query branch April 22, 2024 07:09
@alexander-beedie added the highlight label Apr 28, 2024
Labels
A-sql, enhancement, highlight, python
5 participants