From 57a9346e10d89a1722d99b531b3e460302c98483 Mon Sep 17 00:00:00 2001
From: Charis <26616127+charislam@users.noreply.github.com>
Date: Thu, 1 Sep 2022 10:44:03 -0400
Subject: [PATCH] Revert "[TimescaleDB 2.8] Update `time_bucket` (#1524)" (#1533)

This reverts commit 08b6e80636a9bfd7dac3b4751a8cf36472d4488f.
---
 api/time_bucket.md                            |  42 +++---
 api/time_bucket_ng.md                         |  31 +++--
 .../time-buckets/about-time-buckets.md        | 121 +++++++++++++++---
 3 files changed, 145 insertions(+), 49 deletions(-)

diff --git a/api/time_bucket.md b/api/time_bucket.md
index 4f24e4dea125..f73220be5986 100644
--- a/api/time_bucket.md
+++ b/api/time_bucket.md
@@ -18,29 +18,38 @@ function. Unlike `date_trunc`, it allows for arbitrary time intervals instead of
 second, minute, and hour intervals. The return value is the bucket's start time.

+`TIMESTAMPTZ` arguments are bucketed by the time in UTC, so the alignment of
+buckets is on UTC time. One consequence of this is that daily buckets are
+aligned to midnight UTC, not local time. If you want buckets aligned by local
+time, cast the `TIMESTAMPTZ` input to `TIMESTAMP`, which converts the value to
+local time, before you pass it to `time_bucket`. For an example, see the sample
+use in this section.
+
 Note that daylight savings time boundaries means that the amount of data
 aggregated into a bucket after such a cast can be irregular. For example, if the
 `bucket_width` is 2 hours, the number of UTC hours bucketed by local time on
 daylight savings time boundaries can be either three hours or one hour.
+
+Month, year, and timezones are not supported by the `time_bucket`
+function. If you need to use month, year, or timezone arguments, try the
+experimental [`time_bucket_ng`](/api/latest/hyperfunctions/time_bucket_ng/)
+function instead.
+
+
 ## Required arguments for interval time inputs

 |Name|Type|Description|
 |-|-|-|
 |`bucket_width`|INTERVAL|A PostgreSQL time interval for how long each bucket is|
-|`ts`|TIMESTAMP or TIMESTAMPTZ|The timestamp to bucket|
-
-If you use months as an interval for `bucket_width`, you cannot combine it with
-a non-month component. For example, `1 month` and `3 months` are both valid
-bucket widths, but `1 month 1 day` and `3 months 2 weeks` are not.
+|`ts`|TIMESTAMP|The timestamp to bucket|

 ## Optional arguments for interval time inputs

 |Name|Type|Description|
 |-|-|-|
-|`timezone`|TEXT|The timezone for calculating bucket start and end times. Can only be used with `TIMESTAMPTZ`. Defaults to UTC.|
 |`offset`|INTERVAL|The time interval to offset all time buckets by. A positive value shifts bucket start and end times later. A negative value shifts bucket start and end times earlier. `offset` must be surrounded with double quotes when used as a named argument, because it is a reserved key word in PostgreSQL.|
-|`origin`|TIMESTAMP|Buckets are aligned relative to this timestamp. Defaults to midnight on January 3, 2000, for buckets that don't include a month or year interval, and to midnight on January 1, 2000, for month, year, and century buckets.|
+|`origin`|TIMESTAMP|Buckets are aligned relative to this timestamp|

 ## Required arguments for integer time inputs

@@ -119,14 +128,11 @@ GROUP BY five_min
 ORDER BY five_min DESC LIMIT 10;
 ```

-Bucket temperature values to calculate the average monthly temperature. Set the
-timezone to 'Europe/Berlin' so bucket start and end times are aligned to
-midnight in Berlin.
-
-```sql
-SELECT time_bucket('1 month', ts, 'Europe/Berlin') AS month_bucket,
-    avg(temperature) AS avg_temp
-FROM weather
-GROUP BY month_bucket
-ORDER BY month_bucket DESC LIMIT 10;
-```
+
+If you are upgrading from a version earlier than 1.0.0, the default origin is
+moved from 2000-01-01 (Saturday) to 2000-01-03 (Monday) between versions 0.12.1
+and 1.0.0. This change was made to make `time_bucket` compliant with the ISO
+standard for Monday as the start of a week. This should only affect multi-day
+calls to `time_bucket`. The old behavior can be reproduced by passing
+`2000-01-01` as the origin parameter to `time_bucket`.
+
diff --git a/api/time_bucket_ng.md b/api/time_bucket_ng.md
index 3bb9ae065e93..49c67c5afec0 100644
--- a/api/time_bucket_ng.md
+++ b/api/time_bucket_ng.md
@@ -8,21 +8,30 @@ api:
   license: apache
   type: function
   experimental: true
-  deprecated: true
   hyperfunction:
     type: bucket
 ---

-import DeprecationNotice from "versionContent/_partials/_deprecated.mdx";
-
 ## timescaledb_experimental.time_bucket_ng() Experimental

-The `time_bucket_ng()` function is an experimental version of the
-[`time_bucket()`][time_bucket] function. It introduced some new capabilities,
-such as monthly buckets and timezone support. Those features are now part of the
-regular `time_bucket()` function.
+The `time_bucket_ng()` (next generation) experimental function is an updated
+version of the original [`time_bucket()`][time_bucket] function. While
+`time_bucket` works only with small units of time, `time_bucket_ng()`
+supports years and months in addition to small units of time.

-
+
+Experimental features could have bugs! They might not be backwards compatible,
+and could be removed in future releases. When this function is no longer experimental,
+you will need to delete and rebuild any continuous aggregate that uses it.
+Use experimental features at your own risk and we do not recommend to use
+any experimental feature in a production environment.
+
+
+|Functionality|time_bucket()|time_bucket_ng()|
+|-|-|-|
+|Buckets by seconds, minutes, hours, days and weeks|YES|YES|
+|Buckets by months and years|NO|YES|
+|Timezones support|NO|YES|

 The `time_bucket()` and `time_bucket_ng()` functions are similar, but not
@@ -32,8 +41,10 @@ Firstly, `time_bucket_ng()` doesn't work with timestamps prior to `origin`,
 while `time_bucket()` does.
 Secondly, the default `origin` values differ. `time_bucket()` uses an origin
-date of January 3, 2000, for buckets shorter than a month. `time_bucket_ng()`
-uses an origin date of January 1, 2000, for all bucket sizes.
+date of January 3, 2000, because that date is a Monday. This works better with
+weekly buckets. `time_bucket_ng()` uses an origin date of January 1, 2000, because
+it is the first day of the month and the year. This works better with monthly
+or annual aggregates.

 ### Required arguments

diff --git a/timescaledb/how-to-guides/time-buckets/about-time-buckets.md b/timescaledb/how-to-guides/time-buckets/about-time-buckets.md
index 74b466052946..162fa8fe0600 100644
--- a/timescaledb/how-to-guides/time-buckets/about-time-buckets.md
+++ b/timescaledb/how-to-guides/time-buckets/about-time-buckets.md
@@ -4,8 +4,9 @@ excerpt: Learn how time buckets help you aggregate data by time interval
 keywords: [time buckets]
 ---

-# About time buckets
+import Experimental from 'versionContent/_partials/_experimental.mdx';

+# About time buckets
 The [`time_bucket`][time_bucket] function allows you to aggregate data into
 buckets of time, for example: 5 minutes, 1 hour, or 3 days. It's similar to
 PostgreSQL's [`date_trunc`][date_trunc] function, but it gives you more
@@ -17,14 +18,13 @@ roll up data for analysis or downsampling. For example, you can calculate
 rollups as needed, or pre-calculate them in [continuous aggregates][caggs].

 This section explains how time bucketing works. For examples of the
-`time_bucket` function, see the section on
+`time_bucket` function, see the section on
 [using time buckets][use-time-buckets].

 ## How time bucketing works
-
 Time bucketing groups data into time intervals. With `time_bucket`, the interval
 length can be any number of microseconds, milliseconds, seconds, minutes, hours,
-days, weeks, months, years, or centuries.
+days, or weeks.

 `time_bucket` is usually used in combination with `GROUP BY` to aggregate data.
 For example, you can calculate the average, maximum, minimum, or sum of values
@@ -35,8 +35,14 @@ within a bucket.
 alt="Diagram showing time-bucket aggregating data into daily buckets, and calculating the daily sum of a value"
 />

-### Origin
+
+`time_bucket` doesn't support months, years, or timezones. The experimental
+function `time_bucket_ng` adds support for these intervals and parameters. To
+learn more, see the section on
+[`time_bucket_ng`](#experimental-function-time-bucket-ng).
+

+### Origin
 The origin determines when time buckets start and end. By default, a time bucket
 doesn't start at the earliest timestamp in your data. There is often a more
 logical time. For example, you might collect your first data point at `00:37`,
@@ -57,44 +63,117 @@ for the beginning of the bucket.
 alt="Diagram showing how time buckets are calculated from the origin"
 />

+The default origin for `time_bucket` is January 3, 2000. For integer time
+values, the default origin is 0.
+
 For example, say that your data's earliest timestamp is April 24, 2020. If you
 bucket by an interval of two weeks, the first bucket doesn't start on April 24,
 which is a Friday. It doesn't start on April 20, which is the immediately
 preceding Monday. It starts on April 13, because you can get to April 13, 2020,
-by counting in two-week increments from January 3, 2000, which is the default
-origin in this case.
-
-#### Default origins
+by counting in two-week increments from January 3, 2000.

-For intervals that don't include months or years, the default origin is January
-3, 2000. For month, year, or century intervals, the default origin is January 1,
-2000. For integer time values, the default origin is 0.
+#### Choice of origin
+In TimescaleDB 1.0 and above, the default origin for `time_bucket` is January 3,
+2000. That date is a Monday, which allows week-based buckets to begin on Monday
+by default. This behavior is compliant with the ISO standard for Monday as the
+start of a week.
-These choices make the time ranges of time buckets more intuitive. Because
-January 3, 2000, is a Monday, weekly time buckets start on Monday. This is
-compliant with the ISO standard for calculating calendar weeks. By contrast,
-monthly and yearly time buckets use January 1, 2000, as an origin. This allows
-them to start on the first day of the calendar month or year.
+In prior versions, the default origin was January 1, 2000. `time_bucket_ng` also
+uses January 1, 2000. That date is more natural for counting months and years.

 If you prefer another origin, you can set it yourself using the [`origin`
 parameter][origin]. For example, to start weeks on Sunday, set the origin to
 Sunday, January 2, 2000.

 ### Timezones
-
 The origin time depends on the data type of your time values.

 If you use `TIMESTAMP`, by default, bucket start times are aligned with
 `00:00:00`. Daily and weekly buckets start at `00:00:00`. Shorter buckets start
 at a time that you can get to by counting in bucket increments from `00:00:00`
-on the origin date.
+on January 3, 2000.

 If you use `TIMESTAMPTZ`, by default, bucket start times are aligned with
-`00:00:00 UTC`. To align time buckets to another timezone, set the `timezone`
-parameter.
+`00:00:00 UTC`. To get buckets aligned to local time, cast the `TIMESTAMPTZ` to
+`TIMESTAMP` before passing it to `time_bucket`.
+
+
+Casting `TIMESTAMPTZ` to `TIMESTAMP` works outside of continuous aggregates. For
+example, you can use it in a stand-alone `SELECT` statement to perform a
+one-time calculation. It does not work within continuous aggregates. To learn
+more, see the section on [time in continuous aggregates](/timescaledb/latest/how-to-guides/continuous-aggregates/time/).
+
+
+### Time_bucket in continuous aggregates
+Time buckets are commonly used to create [continuous aggregates][caggs].
+Continuous aggregates add some limitations to what you can do with
+`time_bucket`.
+
+Continuous aggregates don't allow functions that depend on a local timezone
+setting. That is, you cannot cast `TIMESTAMPTZ` to `TIMESTAMP` within a
+continuous aggregate definition. To learn more and find a workaround, see the
+section on [time in continuous aggregates][time-cagg].
+
+Continuous aggregates also don't allow named parameters.
+
+## Experimental function: time_bucket_ng
+The experimental function [`time_bucket_ng`][time_bucket_ng] adds new features,
+including support for months, years, and timezones.
+
+
+
+### Months and years
+In addition to the time units supported by `time_bucket`, `time_bucket_ng` also
+supports months and years. For example, you can bucket data into 3-month or
+5-year intervals.
+
+### Origin
+By default, `time_bucket_ng` uses Saturday, January 1, 2000 for its origin. This
+differs from `time_bucket`. Because `time_bucket_ng` supports months and years,
+January 1 provides a more natural starting date for counting intervals.
+
+Unlike `time_bucket`, `time_bucket_ng` doesn't support dates before the origin.
+In other words, by default, you cannot use `time_bucket_ng` with data from
+before the year 2000. If you need to go farther back in time, you can change the
+origin by setting the [`origin` parameter][origin-ng].
+
+### Timezones
+`time_bucket_ng` adds support for timezones. By setting the `timezone`
+parameter, you can align bucket start times to local time, even if the time
+values are in `TIMESTAMPTZ` form. That means you can start daily buckets at
+midnight local time rather than UTC time.
+
+### Time_bucket_ng in continuous aggregates
+Time buckets are commonly used to create [continuous aggregates][caggs].
+Continuous aggregates add some limitations to what you can do with
+`time_bucket_ng`. For example, continuous aggregates don't allow named
+parameters.
+
+Here are the `time_bucket_ng` features supported by continuous aggregates:
+
+|Feature|Available in continuous aggregate|TimescaleDB version|
+|-|-|-|
+|Buckets by seconds, minutes, hours, days, and weeks|✅|2.4.0 and later|
+|Buckets by months and years|✅|2.6.0 and later|
+|Timezones|✅|2.6.0 and later|
+|Custom origin|✅|2.7.0 and later|
+
+## Time_bucket compared to time_bucket_ng
+There are several differences between `time_bucket` and `time_bucket_ng`:
+
+|Feature|`time_bucket`|`time_bucket_ng`|
+|-|-|-|
+|Bucket by microseconds, milliseconds, seconds, hours, minutes, days, and weeks|✅|✅|
+|Bucket by months and years|❌|✅|
+|Bucket `TIMESTAMPTZ` values according to local time using the `timezone` parameter|❌|✅|
+|Origin|January 3, 2000|January 1, 2000|
+|Bucket dates before the origin|✅|❌ Work around this by changing the origin.|

 [caggs]: /timescaledb/:currentVersion:/how-to-guides/continuous-aggregates/
 [date_trunc]: https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC
+[origin-ng]: /api/:currentVersion:/hyperfunctions/time_bucket_ng/#optional-arguments
 [origin]: /api/:currentVersion:/hyperfunctions/time_bucket/#optional-arguments-for-interval-time-inputs
+[time-cagg]: /timescaledb/:currentVersion:/how-to-guides/continuous-aggregates/time/
 [time_bucket]: /api/:currentVersion:/hyperfunctions/time_bucket/
+[time_bucket_ng]: /api/:currentVersion:/hyperfunctions/time_bucket_ng/
 [use-time-buckets]: /timescaledb/:currentVersion:/how-to-guides/time-buckets/use-time-buckets/
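
Reviewer note, not part of the commit: the origin and timezone behavior that this revert restores in `about-time-buckets.md` can be checked with a few stand-alone queries. The following is a minimal sketch, assuming a PostgreSQL session with the TimescaleDB extension installed. The literal timestamps and column aliases are illustrative only and do not come from the patch.

```sql
-- Sketch only: assumes the TimescaleDB extension is available in this database.
-- Aliases such as default_origin_bucket are hypothetical names for illustration.

-- Default origin (Monday, January 3, 2000): per the restored text, the two-week
-- bucket containing Friday, April 24, 2020 starts on Monday, April 13, 2020.
SELECT time_bucket(INTERVAL '2 weeks', TIMESTAMP '2020-04-24 00:00:00')
    AS default_origin_bucket;

-- Custom origin: anchoring on Sunday, January 2, 2000 makes weekly buckets
-- start on Sunday instead of Monday.
SELECT time_bucket(INTERVAL '1 week',
                   TIMESTAMP '2020-04-24 00:00:00',
                   origin => TIMESTAMP '2000-01-02 00:00:00')
    AS sunday_week_bucket;

-- Local-time alignment: casting TIMESTAMPTZ to TIMESTAMP converts the value to
-- the session time zone first, so the daily bucket starts at local midnight.
-- As the restored text notes, this cast is not allowed inside a continuous
-- aggregate definition.
SELECT time_bucket(INTERVAL '1 day', now()::timestamp) AS local_day_bucket;
```

The second query passes `origin` as a named argument only for readability. A positional third argument of type `TIMESTAMP` should resolve to the same overload.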