Add syntax support for PARTITIONED BY/CLUSTERED BY in INSERT queries #12163

Merged: 54 commits, Feb 8, 2022
Changes from 1 commit

Commits (54)
29969da
Initial commit, build working
LakshSingla Jan 4, 2022
6d3382f
Merge branch 'master' into calcite-ext-prototype
LakshSingla Jan 4, 2022
f70aad9
Add for SqlSelect and OrderBy
LakshSingla Jan 4, 2022
7b2c884
Integrate the changes such that set context works
LakshSingla Jan 6, 2022
f406334
Merge branch 'master' into calcite-ext-prototype
LakshSingla Jan 6, 2022
c9d9bc9
Fix nullptr in DruidPlanner
LakshSingla Jan 6, 2022
ab57bc0
Add files for SqlInsert
LakshSingla Jan 17, 2022
7a3e479
Revert setcontext changes, only insert changes being exposed
LakshSingla Jan 17, 2022
d791669
Cleanup code, ClusterBy will create a new SqlNode of type OrderBy
LakshSingla Jan 17, 2022
5b741a6
Cleanup code, revert unnesecary changes
LakshSingla Jan 17, 2022
2ddad66
Merge branch 'master' into calcite-ext-insert
LakshSingla Jan 17, 2022
1845ddf
Remove contextMap from ParsedNodes
LakshSingla Jan 17, 2022
2817a85
Add getter for partitionBy
LakshSingla Jan 17, 2022
7d63e44
Checkstyle fix
LakshSingla Jan 17, 2022
a22eb0e
Add generated files to ignorelist of forbiddenapis
LakshSingla Jan 17, 2022
1119dd0
indentation fix, add license to insert.ftl
LakshSingla Jan 18, 2022
50ce3fb
Add testcases which plan using modified syntax
LakshSingla Jan 18, 2022
472291c
Handle LIMIT/OFFSET clauses, add more testcases
LakshSingla Jan 18, 2022
6c2fcf2
Exclude generated classes from spotbug/forbiddenapis
LakshSingla Jan 19, 2022
e31ebcc
Merge branch 'master' into calcite-ext-insert
LakshSingla Jan 19, 2022
0b7dec6
Remove version from mvn-dep plugin, add sl4j dep
LakshSingla Jan 19, 2022
f78606c
Pass the partition by granularity in the querycontext
LakshSingla Jan 24, 2022
4768209
Merge branch 'master' into calcite-ext-insert
LakshSingla Jan 26, 2022
c290a63
Address review comments, check on the granularity passed
LakshSingla Jan 26, 2022
4f4cb77
Improve testcase for cluster by
LakshSingla Jan 26, 2022
8371400
Fix build, add CLUSTER to kwds as well
LakshSingla Jan 26, 2022
94d9175
fix format string
LakshSingla Jan 27, 2022
585cd61
Make CLUSTER reserved keyword
LakshSingla Jan 27, 2022
1e54241
POM changes - move plugin version to parent, scope=provided for slf4j
LakshSingla Jan 28, 2022
4330703
Rename cluster to clustered, partition to partitioned
LakshSingla Jan 28, 2022
0ae5570
Disallow ORDER BY in SELECT inside INSERT queries
LakshSingla Jan 28, 2022
37728f2
Indent ftl properly, try cleanup around Insert and DruidSqlInsert
LakshSingla Jan 28, 2022
4e63d27
checkstyle fix
LakshSingla Jan 28, 2022
1b62e6c
Update the partition by clause
LakshSingla Feb 2, 2022
960870a
Update the error messages in parsing class
LakshSingla Feb 2, 2022
8717d0b
semicolon fix, javadoc
LakshSingla Feb 2, 2022
72ae0bd
Add test cases for the util class
LakshSingla Feb 2, 2022
5cbcb35
checkstyle fix
LakshSingla Feb 2, 2022
38aa941
Partitioned by now a required clause
LakshSingla Feb 3, 2022
7df0610
Update testcases, minor review comments
LakshSingla Feb 3, 2022
d8d57f4
fix test cases
LakshSingla Feb 3, 2022
d36bc2b
add tests, cleanup error message
LakshSingla Feb 3, 2022
eaaad27
Change the value of insert segment granularity in querycontext
LakshSingla Feb 3, 2022
b90ab16
minor fixes
LakshSingla Feb 3, 2022
e335214
intellij inspections
LakshSingla Feb 4, 2022
2cef674
Exclude druidsqlinsert from codecov
LakshSingla Feb 4, 2022
8d58136
Exclude druidsqlinsert from codecov2
LakshSingla Feb 4, 2022
973d3fa
Missed testcase
LakshSingla Feb 7, 2022
dc2b45e
DruidSqlInsert.unparse working
LakshSingla Feb 7, 2022
32244a6
Fix error messages
LakshSingla Feb 7, 2022
5017257
Revert codecov exclude
LakshSingla Feb 7, 2022
5b79694
reduce severity for parsing error log
LakshSingla Feb 7, 2022
43b7388
intellij, test errors
LakshSingla Feb 8, 2022
ad78bba
unused import
LakshSingla Feb 8, 2022
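Based on the PR title and the commit history above (notably "Partitioned by now a required clause" and "Disallow ORDER BY in SELECT inside INSERT queries"), the syntax this PR adds looks roughly like the following query fragment. Table and column names here are hypothetical, chosen only for illustration.

```sql
-- Hypothetical table/column names; structure inferred from the PR title and tests.
INSERT INTO dst
SELECT __time, channel, page
FROM src
PARTITIONED BY DAY   -- also accepted: FLOOR(__time TO DAY), TIME_FLOOR(__time, 'P1D')
CLUSTERED BY channel, page
```

Per the Jan 28 commit, the inner SELECT may not carry its own ORDER BY; CLUSTERED BY takes over that role for ingestion.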
Fix error messages
LakshSingla committed Feb 7, 2022
commit 32244a6c4fa626369592dae6255da77aa7937a35
@@ -29,6 +29,7 @@
import org.apache.druid.java.util.common.StringUtils;
import org.apache.druid.java.util.common.granularity.Granularity;
import org.apache.druid.java.util.common.granularity.PeriodGranularity;
+import org.apache.druid.java.util.common.logger.Logger;
import org.apache.druid.segment.column.ColumnHolder;
import org.apache.druid.sql.calcite.expression.TimeUnits;
import org.apache.druid.sql.calcite.expression.builtin.TimeFloorOperatorConversion;
@@ -38,6 +39,9 @@

public class DruidSqlParserUtils
{

+private static final Logger log = new Logger(DruidSqlParserUtils.class);

/**
* Delegates to {@code convertSqlNodeToGranularity} and converts the exceptions to {@link ParseException}
* with the underlying message
@@ -48,6 +52,7 @@ public static Granularity convertSqlNodeToGranularityThrowingParseExceptions(Sql
return convertSqlNodeToGranularity(sqlNode);
}
catch (Exception e) {
+log.error(e, StringUtils.format("Unable to convert %s to a valid granularity.", sqlNode.toString()));
Contributor:
If this error occurs, it's due to user error on one particular query, so it shouldn't be an ERROR that appears in the server logs. That should be reserved for really important stuff that cluster operators need to worry about.

However, there should be some way to get the stack trace, and we generally don't want to return it directly to users of the SQL API.

So in a situation like this I'd do log.debug instead of log.error, so server admins can see the stack trace if they really need to, but won't see it by default. That balances all of the relevant desires: server admins shouldn't see "user error" stuff by default, and users shouldn't see stack traces as part of normal error messages.

Contributor Author:
Updated with the correct level, thanks for the explanation.

throw new ParseException(e.getMessage());
}
}
@@ -74,7 +79,8 @@ public static Granularity convertSqlNodeToGranularityThrowingParseExceptions(Sql
public static Granularity convertSqlNodeToGranularity(SqlNode sqlNode) throws ParseException
{

-final String genericParseFailedMessageFormatString = "Unable to parse the granularity from %s. Expected HOUR, DAY, MONTH, YEAR, ALL TIME, FLOOR function or %s function";
+final String genericParseFailedMessageFormatString = "Encountered %s after PARTITIONED BY. "
+    + "Expected HOUR, DAY, MONTH, YEAR, ALL TIME, FLOOR function or %s function";

if (!(sqlNode instanceof SqlCall)) {
throw new ParseException(StringUtils.format(
@@ -91,15 +97,15 @@ public static Granularity convertSqlNodeToGranularity(SqlNode sqlNode) throws Pa
"FLOOR".equalsIgnoreCase(operatorName)
|| TimeFloorOperatorConversion.SQL_FUNCTION_NAME.equalsIgnoreCase(operatorName),
StringUtils.format(
-"PARTITIONED BY clause can only parse FLOOR and %s functions.",
+"PARTITIONED BY clause only supports FLOOR(__time TO <unit> and %s(__time, period) functions",
TimeFloorOperatorConversion.SQL_FUNCTION_NAME
)
);

List<SqlNode> operandList = sqlCall.getOperandList();
Preconditions.checkArgument(
operandList.size() == 2,
-StringUtils.format("Invalid number of arguments passed to %s in PARTIITONED BY clause", operatorName)
+StringUtils.format("%s in PARTITIONED BY clause must have two arguments", operatorName)
);


@@ -120,15 +126,15 @@ public static Granularity convertSqlNodeToGranularity(SqlNode sqlNode) throws Pa
SqlNode granularitySqlNode = operandList.get(1);
Preconditions.checkArgument(
granularitySqlNode.getKind().equals(SqlKind.LITERAL),
-"Second argument to TIME_FLOOR in PARTITIONED BY clause should be string representing a period"
+"Second argument to TIME_FLOOR in PARTITIONED BY clause must be a period like 'PT1H'"
);
String granularityString = SqlLiteral.unchain(granularitySqlNode).toValue();
Period period;
try {
period = new Period(granularityString);
}
catch (IllegalArgumentException e) {
-throw new ParseException(StringUtils.format("Unable to create period from %s", granularitySqlNode.toString()));
+throw new ParseException(StringUtils.format("%s is an invalid period string", granularitySqlNode.toString()));
}
return new PeriodGranularity(period, null, null);

@@ -138,15 +144,16 @@ public static Granularity convertSqlNodeToGranularity(SqlNode sqlNode) throws Pa
// granularitySqlNode.getKind().equals(SqlKind.INTERVAL_QUALIFIER)
Preconditions.checkArgument(
granularitySqlNode instanceof SqlIntervalQualifier,
-"Second argument to the FLOOR function in PARTITIONED BY clause cannot be converted to a valid granularity"
+"Second argument to the FLOOR function in PARTITIONED BY clause is not a valid granularity. "
+    + "Please refer to the documentation of FLOOR function"
);
SqlIntervalQualifier granularityIntervalQualifier = (SqlIntervalQualifier) granularitySqlNode;

Period period = TimeUnits.toPeriod(granularityIntervalQualifier.timeUnitRange);
Preconditions.checkNotNull(
period,
StringUtils.format(
-"Unable to convert %s to valid granularity for ingestion",
+"%s is not a valid granularity for ingestion",
granularityIntervalQualifier.timeUnitRange.toString()
)
);
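The checks in convertSqlNodeToGranularity above can be sketched as a standalone program. This is not Druid code: plain strings stand in for Calcite SqlNodes, and java.time.Duration stands in for Joda-Time's Period (which Druid actually uses), so only time-based periods like "PT1H" parse in this version.

```java
// Simplified sketch of the PARTITIONED BY validation order, assuming the
// operator name, first operand, and second operand arrive as plain strings.
import java.time.Duration;
import java.time.format.DateTimeParseException;

public class PartitionedByValidation {
  static String validate(String function, String firstArg, String secondArg) {
    // 1. Only FLOOR and TIME_FLOOR are accepted as partitioning functions.
    if (!"FLOOR".equalsIgnoreCase(function) && !"TIME_FLOOR".equalsIgnoreCase(function)) {
      return "PARTITIONED BY clause only supports FLOOR and TIME_FLOOR functions";
    }
    // 2. The first argument must be the __time column.
    if (!"__time".equals(firstArg)) {
      return "First argument to " + function.toUpperCase() + " in PARTITIONED BY clause can only be __time";
    }
    // 3. For TIME_FLOOR, the second argument must parse as an ISO-8601 period.
    if ("TIME_FLOOR".equalsIgnoreCase(function)) {
      try {
        Duration.parse(secondArg); // Druid: new org.joda.time.Period(secondArg)
      } catch (DateTimeParseException e) {
        return "'" + secondArg + "' is an invalid period string";
      }
    }
    return "valid";
  }

  public static void main(String[] args) {
    System.out.println(validate("CEIL", "__time", "PT1H"));       // unsupported function
    System.out.println(validate("TIME_FLOOR", "ts", "PT1H"));     // wrong column
    System.out.println(validate("TIME_FLOOR", "__time", "abc"));  // bad period string
    System.out.println(validate("TIME_FLOOR", "__time", "PT1H")); // valid
  }
}
```

Each branch corresponds to one ParseException path exercised by the test cases in the second file of this diff.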
@@ -122,14 +122,6 @@ public void testGetGranularityFromFloor() throws ParseException

public static class FloorToGranularityConversionTestErrors
{
-/**
- * incorrect node
- * incorrect function name
- * Incorrect number of arguments
- * incorrect first argument
- * incorrect second argument
- */
-
/**
* Tests clause like "PARTITIONED BY 'day'"
*/
@@ -139,12 +131,10 @@ public void testConvertSqlNodeToGranularityWithIncorrectNode()
SqlNode sqlNode = SqlLiteral.createCharString("day", SqlParserPos.ZERO);
ParseException e = Assert.assertThrows(
ParseException.class,
-() -> {
-DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode);
-}
+() -> DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode)
);
Assert.assertEquals(
-"Unable to parse the granularity from 'day'. Expected HOUR, DAY, MONTH, YEAR, ALL TIME, FLOOR function or TIME_FLOOR function",
+"Encountered 'day' after PARTITIONED BY. Expected HOUR, DAY, MONTH, YEAR, ALL TIME, FLOOR function or TIME_FLOOR function",
e.getMessage()
);
}
@@ -161,13 +151,11 @@ public void testConvertSqlNodeToGranularityWithIncorrectFunctionCall()
final SqlNode sqlNode = SqlStdOperatorTable.CEIL.createCall(args);
ParseException e = Assert.assertThrows(
ParseException.class,
-() -> {
-DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode);
-}
+() -> DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode)
);
Assert.assertEquals(
StringUtils.format(
-"PARTITIONED BY clause can only parse FLOOR and %s functions.",
+"PARTITIONED BY clause only supports FLOOR(__time TO <unit> and TIME_FLOOR(__time, period) functions",
TimeFloorOperatorConversion.SQL_FUNCTION_NAME
),
e.getMessage()
@@ -185,30 +173,77 @@ public void testConvertSqlNodeToGranularityWithIncorrectNumberOfArguments()
final SqlNode sqlNode = SqlStdOperatorTable.FLOOR.createCall(args);
ParseException e = Assert.assertThrows(
ParseException.class,
-() -> {
-DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode);
-}
+() -> DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode)
);
-Assert.assertEquals("Invalid number of arguments passed to FLOOR in PARTIITONED BY clause", e.getMessage());
+Assert.assertEquals("FLOOR in PARTITIONED BY clause must have two arguments", e.getMessage());
}

/**
- * Tests clause like "PARTITIONED BY (timestamps TO DAY)"
+ * Tests clause like "PARTITIONED BY FLOOR(timestamps TO DAY)"
*/
@Test
-public void testConvertSqlNodeToGranularityWithWrongIdentifier()
+public void testConvertSqlNodeToGranularityWithWrongIdentifierInFloorFunction()
{
final SqlNodeList args = new SqlNodeList(SqlParserPos.ZERO);
args.add(new SqlIdentifier("timestamps", SqlParserPos.ZERO));
args.add(new SqlIntervalQualifier(TimeUnit.DAY, null, SqlParserPos.ZERO));
final SqlNode sqlNode = SqlStdOperatorTable.FLOOR.createCall(args);
ParseException e = Assert.assertThrows(
ParseException.class,
-() -> {
-DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode);
-}
+() -> DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode)
);
Assert.assertEquals("First argument to FLOOR in PARTITIONED BY clause can only be __time", e.getMessage());
}

+/**
+ * Tests clause like "PARTITIONED BY TIME_FLOOR(timestamps, 'PT1H')"
+ */
+@Test
+public void testConvertSqlNodeToGranularityWithWrongIdentifierInTimeFloorFunction()
+{
+final SqlNodeList args = new SqlNodeList(SqlParserPos.ZERO);
+args.add(new SqlIdentifier("timestamps", SqlParserPos.ZERO));
+args.add(SqlLiteral.createCharString("PT1H", SqlParserPos.ZERO));
+final SqlNode sqlNode = TimeFloorOperatorConversion.SQL_FUNCTION.createCall(args);
+ParseException e = Assert.assertThrows(
+ParseException.class,
+() -> DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode)
+);
+Assert.assertEquals("First argument to TIME_FLOOR in PARTITIONED BY clause can only be __time", e.getMessage());
+}
+
+/**
+ * Tests clause like "PARTITIONED BY FLOOR(__time to ISOYEAR)"
+ */
+@Test
+public void testConvertSqlNodeToGranularityWithIncorrectIngestionGranularityInFloorFunction()
+{
+final SqlNodeList args = new SqlNodeList(SqlParserPos.ZERO);
+args.add(new SqlIdentifier("__time", SqlParserPos.ZERO));
+args.add(new SqlIntervalQualifier(TimeUnit.ISOYEAR, null, SqlParserPos.ZERO));
+final SqlNode sqlNode = SqlStdOperatorTable.FLOOR.createCall(args);
+ParseException e = Assert.assertThrows(
+ParseException.class,
+() -> DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode)
+);
+Assert.assertEquals("ISOYEAR is not a valid granularity for ingestion", e.getMessage());
+}
+
+/**
+ * Tests clause like "PARTITIONED BY TIME_FLOOR(__time, 'abc')"
+ */
+@Test
+public void testConvertSqlNodeToGranularityWithIncorrectIngestionGranularityInTimeFloorFunction()
+{
+final SqlNodeList args = new SqlNodeList(SqlParserPos.ZERO);
+args.add(new SqlIdentifier("__time", SqlParserPos.ZERO));
+args.add(SqlLiteral.createCharString("abc", SqlParserPos.ZERO));
+final SqlNode sqlNode = TimeFloorOperatorConversion.SQL_FUNCTION.createCall(args);
+ParseException e = Assert.assertThrows(
+ParseException.class,
+() -> DruidSqlParserUtils.convertSqlNodeToGranularityThrowingParseExceptions(sqlNode)
+);
+Assert.assertEquals("'abc' is an invalid period string", e.getMessage());
+}
}
}
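The ISOYEAR test above exercises the path where TimeUnits.toPeriod returns null. A minimal sketch of that mapping step follows; the unit-to-period table below is an assumption for illustration, not Druid's actual table, but it shows why common units succeed while ISOYEAR is rejected.

```java
// Sketch of mapping a FLOOR time unit to an ingestion granularity period.
// UNIT_TO_ISO_PERIOD is a hypothetical stand-in for Druid's TimeUnits.toPeriod,
// which returns null for unsupported units such as ISOYEAR.
import java.util.Map;

public class TimeUnitToPeriod {
  static final Map<String, String> UNIT_TO_ISO_PERIOD = Map.of(
      "SECOND", "PT1S",
      "MINUTE", "PT1M",
      "HOUR", "PT1H",
      "DAY", "P1D",
      "WEEK", "P1W",
      "MONTH", "P1M",
      "QUARTER", "P3M",
      "YEAR", "P1Y"
  );

  // Mirrors the Preconditions.checkNotNull(period, ...) step in the diff:
  // a null mapping becomes "<unit> is not a valid granularity for ingestion".
  static String toPeriodOrError(String timeUnit) {
    String period = UNIT_TO_ISO_PERIOD.get(timeUnit);
    return period != null ? period : timeUnit + " is not a valid granularity for ingestion";
  }

  public static void main(String[] args) {
    System.out.println(toPeriodOrError("DAY"));     // P1D
    System.out.println(toPeriodOrError("ISOYEAR")); // rejected with an error message
  }
}
```

The null-check-then-format pattern matches the Preconditions.checkNotNull call shown in the first file of this diff.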