Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QL: Eliminate internal type DATETIME_NANOS #68220

Merged
merged 19 commits into from
Feb 10, 2021

Conversation

matriv
Copy link
Contributor

@matriv matriv commented Jan 29, 2021

Moving towards grouping of data types in the field caps API
the internal data type DATETIME_NANOS, introduced for date_nanos
support, is eliminated.

Relates: #67722
Follows: #67666

Moving towards grouping of data types in the field caps API
the internal data type `DATETIME_NANOS` introduced for `date_nanos`
supports is eliminated.

Relates: elastic#67722
Follows: elastic#67666
@elasticmachine elasticmachine added Team:QL (Deprecated) Meta label for query languages team labels Jan 29, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

@matriv matriv requested review from costin, astefan and bpintea and removed request for costin January 29, 2021 15:06
@elastic elastic deleted a comment from elasticmachine Jan 29, 2021
Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the DATE_NANOS type be removed as well, per the PR description?
The main feedback that I have is for this PR to land after adding the transition to fields API.

if (values instanceof String) {
return DateUtils.asDateTimeWithNanos(values.toString(), zoneId());
try {
return DateUtils.asDateTimeWithNanos(values.toString(), zoneId());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd wait for the fields API from @astefan to land before going for this one.
Hopefully that eliminates the internal handling of data format and thus this try/catch as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess fields API will make this uniform, but just in case, an alternative might be to check the presence of the T or . in the string and based on that call the right parser.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious to see the perf difference of try/catch and trying to check the string:

Benchmark                              Mode  Cnt        Score      Error  Units
BenchmarkDateParse.testWithContains   thrpt   25  1631774,366 ± 6016,080  ops/s
BenchmarkDateParse.testWithException  thrpt   25   629767,727 ± 3843,439  ops/s

and it's quite big!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will use the contains("-") since it's always there for Datetime stings and it's at the beginning of the string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and it's quite big!

I wonder if indexing the string wouldn't actually make a further difference: contains takes each character at a time, but the datetime format is fixed, so - should always be at .charAt(4). Either way, I think it's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, yeah, I rushed here, need to revisit:

  • a timestamp can start with - (negative millis).
  • .charAt(4) is also not safe, as there are possible dates like: +14953-02-.... or -134-05-....

@matriv
Copy link
Contributor Author

matriv commented Feb 2, 2021

Shouldn't the DATE_NANOS type be removed as well, per the PR description?
The main feedback that I have is for this PR to land after adding the transition to fields API.

No problem to wait until the fields API transition.

What do you mean DATE_NANOS? There's no such data type introduced in QL/SQL.

@mark-vieira
Copy link
Contributor

@elasticmachine update branch

@costin
Copy link
Member

costin commented Feb 3, 2021

What do you mean DATE_NANOS? There's no such data type introduced in QL/SQL.

I meant org.elasticsearch.xpack.ql.type.DataTypes.DATETIME_NANOS

@matriv
Copy link
Contributor Author

matriv commented Feb 3, 2021

What do you mean DATE_NANOS? There's no such data type introduced in QL/SQL.

I meant org.elasticsearch.xpack.ql.type.DataTypes.DATETIME_NANOS

https://github.com/elastic/elasticsearch/pull/68220/files#diff-f31cd8869d3c81accd160cec419b0b8ae9308c0dcb4b5c9d38afedc0de6321a6L43

@matriv matriv requested a review from costin February 5, 2021 12:36
@matriv
Copy link
Contributor Author

matriv commented Feb 10, 2021

@elasticmachine run elasticsearch-ci/2

@matriv
Copy link
Contributor Author

matriv commented Feb 10, 2021

We don't need to extract date_nanos for EQL as we don't have a project phase as in SQL.
Could be needed if we implement support for date_nanos for sequence matching: Opened #68812 to track this.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general. I've left one comment about a method that seems to have sneaked in at merge time.
Also, in ExpressionTests do we still need the testCastToDatetimeNanosNotSupported() test?

Comment on lines 67 to 70
private static boolean hasDocValues(DataType dataType) {
return dataType == KEYWORD || dataType == DATETIME;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed.

@matriv
Copy link
Contributor Author

matriv commented Feb 10, 2021

@astefan Nice catch, removing the test, and indeed I missed this merge mistake with the hasDocValues() method, removing.

@matriv matriv requested a review from astefan February 10, 2021 13:55
Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@astefan
Copy link
Contributor

astefan commented Feb 10, 2021

@elasticmachine run elasticsearch-ci/2

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More lines removed than added - that's nice.
Left a comment regarding the history in Github otherwise LGTM.

}
// We ask @timestamp (or the defined alternative field) to be returned as `epoch_millis`
// when matching sequence to avoid parsing into ZonedDateTime objects for performance reasons.
return parseEpochMillisAsString(values.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to this improvement.


private static NumberType numberType(DataType dataType) {
return NumberType.valueOf(dataType.esType().toUpperCase(Locale.ROOT));
return values;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the left side removed due to the use of fields API? Just double checking why the change appears to belong to this PR in Github.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a merge mistake, it's reverted already.

@matriv matriv merged commit 45677a3 into elastic:master Feb 10, 2021
@matriv matriv deleted the merge-datetime_nanos branch February 10, 2021 17:15
matriv added a commit that referenced this pull request Feb 10, 2021
Moving towards grouping of data types in the field caps API
the internal data type `DATETIME_NANOS` introduced for `date_nanos`
support is eliminated.

Relates: #67722
Follows: #67666

(cherry picked from commit 45677a3)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants