Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: fix union result when mix signed/unsigned columns #7112

Merged
merged 16 commits into from
Jul 30, 2018
Merged

*: fix union result when mix signed/unsigned columns #7112

merged 16 commits into from
Jul 30, 2018

Conversation

lysu
Copy link
Contributor

@lysu lysu commented Jul 19, 2018

What have you changed? (mandatory)

fix #7075

  • make logical_plan_builder infer unsigned_flag right
  • mark resultType need be zero if it's an unsigned num but < 0
  • union builder also consider unsigned_flag to determine whether add cast projection
  • make cast expression cast zero if type marked in step 2 for int/bigint/double/float/decimal

What is the type of the changes? (mandatory)

  • Bug fix (non-breaking change which fixes an issue)

How has this PR been tested? (mandatory)

  • unit tests
  • integration tests

Does this PR affect documentation (docs/docs-cn) update? (mandatory)

No

Does this PR affect tidb-ansible update? (mandatory)

No

Does this PR need to be added to the release notes? (mandatory)

fix union result when mix signed/unsigned columns

Refer to a related PR or issue link (optional)

#7075

Add a few positive/negative examples (optional)

see #7075 and unit test


This change is Reviewable

@lysu lysu added type/bugfix This PR fixes a bug. component/expression labels Jul 19, 2018
// This logic need be combined with buildProjection4Union.
if a.Tp == mysql.TypeNewDecimal {
// For Decimal result be unsigned when all union children be unsigned.
resultTp.Flag &= b.Flag & mysql.UnsignedFlag
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -39,7 +39,8 @@ type FieldType struct {
Charset string
Collate string
// Elems is the element list for enum and set type.
Elems []string
Elems []string
BelowZeroBeZero bool
Copy link
Contributor Author

@lysu lysu Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need this "black flag"... because we can not cast < 0 to zero directly in chunk_executor, orz

because select cast(-1 as unsigned) need be MAX_UINT64..
but for union or insert need be 0

and this flag is the most convenient way to get there, IMHO

@@ -39,7 +39,8 @@ type FieldType struct {
Charset string
Collate string
// Elems is the element list for enum and set type.
Elems []string
Elems []string
NegativeUnsignedBeZero bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. add a comment for this
  2. Do we really need this, this is set to true only when building union

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes~ because in union context cast(negative unsigned be zero) is differ to cast in normal(negative unsigned be int.Max)...so we need a flag.

we can add it in field_type or cast expression/executor, and it seems previous way's modification is less.

yes, it is not very comfortable for mee too, or have any other good way ~ ^ ^ ?

tk.MustQuery("select id, i, b, d, dd from t2 union select id, i, b, d, dd from t1 order by id").
Check(testkit.Rows("1 0 0 0 -1", "2 1 1 1.1 1"))
tk.MustQuery("select id, i from t2 union select id, cast(i as unsigned int) from t1 order by id").
Check(testkit.Rows("1 18446744073709551615", "2 1"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it the same as MySQL's?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes~ mysql, cast will got int.Max

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about unsigned double union int? we need a case to cover union different types.

Copy link
Contributor Author

@lysu lysu Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emma, added and find a bug....o.o~

PTAL~thx

Copy link
Member

@jackysp jackysp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -40,6 +40,9 @@ type FieldType struct {
Collate string
// Elems is the element list for enum and set type.
Elems []string
// NegativeUnsignedBeZero indicates that unsigned type with negative number will be cast 0 instead of Big number.
// it's only used in union implicitly cast.
NegativeUnsignedBeZero bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take more time to find a way to avoid adding this field.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should move this field to the cast function.

@lysu
Copy link
Contributor Author

lysu commented Jul 23, 2018

/run-all-tests

@lysu
Copy link
Contributor Author

lysu commented Jul 24, 2018

@coocood @zz-jason @XuHuaiyu PTAL thx~

@@ -1593,6 +1659,24 @@ func (b *builtinCastJSONAsDurationSig) evalDuration(row types.Row) (res types.Du
return
}

type castContext int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it used in BuildCastFunction4Insert, build a special tmp ctx to void modify BuildCastFunction, castXXFunctionClass or getFunction

@@ -647,7 +655,7 @@ func (b *planBuilder) buildProjection4Union(u *LogicalUnionAll) {
dstType := unionSchema.Columns[i].RetType
srcType := srcCol.RetType
if !srcType.Equal(dstType) {
exprs[i] = expression.BuildCastFunction(b.ctx, srcCol, dstType)
exprs[i] = expression.BuildCastFunction4Insert(b.ctx, srcCol, dstType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like BuildCastFunction4Insert is only used for Union, maybe name the function to BuildCastFunction4Union is better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emma, but maybe it will use for insert, insert cast logic is the same as this one, but now insert values is datum based cast, or we change to insert in future?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can change it in the future. I believe it will take us a long time to reach that future 😂

@@ -612,6 +612,14 @@ func (b *planBuilder) buildDistinct(child LogicalPlan, length int) LogicalPlan {
// joinFieldType finds the type which can carry the given types.
func joinFieldType(a, b *types.FieldType) *types.FieldType {
resultTp := types.NewFieldType(types.MergeFieldType(a.Tp, b.Tp))
// This logic need be combined with buildProjection4Union.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd better not embed this function to buildProjection4Union. this function is focused on deciding the data type for each column of the union.

Copy link
Contributor

@CaitinChen CaitinChen Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic will be intelligible when it is associated with the buildProjection4Union logic.

@@ -612,6 +612,14 @@ func (b *planBuilder) buildDistinct(child LogicalPlan, length int) LogicalPlan {
// joinFieldType finds the type which can carry the given types.
func joinFieldType(a, b *types.FieldType) *types.FieldType {
resultTp := types.NewFieldType(types.MergeFieldType(a.Tp, b.Tp))
// This logic need be combined with buildProjection4Union.
if a.Tp == mysql.TypeNewDecimal {
// For Decimal result be unsigned when all union children be unsigned.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about: "the decimal type could be unsigned only when all the decimals to be united are unsigned" @CaitinChen We need your help to refine this comment 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CaitinChen need help~~

Copy link
Contributor

@CaitinChen CaitinChen Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decimal type will be unsigned only when all the decimals to be united are unsigned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zz-jason @lysu I think Jason's may be right, but I don't know why you use "could be".

@@ -426,25 +426,34 @@ func (c *castAsJSONFunctionClass) getFunction(ctx sessionctx.Context, args []Exp

type builtinCastIntAsIntSig struct {
baseBuiltinFunc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about defining another struct to wrap baseBuiltinFunc and insertContext, and use this struct for every builtinCastXAsYSig:

type baseBuiltinCastFunc struct {
	baseBuiltinFunc
	inUnionAll bool
}

If so, we only need to change the base or bf when constructing builtinCastXAsYSigs, and the code can be cleaner.

@@ -456,20 +465,26 @@ func (b *builtinCastIntAsRealSig) evalReal(row types.Row) (res float64, isNull b
if !mysql.HasUnsignedFlag(b.args[0].GetType().Flag) {
res = float64(val)
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about:

if !mysql.HasUnsignedFlag(b.args[0].GetType().Flag) {
	...
} else if b.insertContext && val < 0 {
	...
} else {
	...
}

@@ -873,9 +914,13 @@ func (b *builtinCastDecimalAsIntSig) evalInt(row types.Row) (res int64, isNull b
}

if mysql.HasUnsignedFlag(b.tp.Flag) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about:

if !mysql.HasUnsignedFlag(b.tp.Flag) {
	...
} else if b.insertContext && val < 0 {
	...
} else {
	...
}

res = 0
} else {
res, err = types.StrToInt(sc, val)
if err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err != nil?

Copy link
Contributor Author

@lysu lysu Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this old logic is by design #3894 #645

Copy link
Contributor Author

@lysu lysu Jul 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.....it seems be a old bug.
select cast(-1 as signed) should not return a warning but master branch does.
I think it should be 05bf3bfd4d19705639a1350eacd5587bcb36fb47 ?

zz-jason
zz-jason previously approved these changes Jul 24, 2018
Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// the decimal type could be unsigned only when all the decimals to be united are unsigned
resultTp.Flag &= b.Flag & mysql.UnsignedFlag
} else {
// Other types result will be unsigned only if first child be unsigned.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-decimal results will be unsigned when the first SQL statement result in the union is unsigned.

resultTp := types.NewFieldType(types.MergeFieldType(a.Tp, b.Tp))
// This logic need be understood with buildProjection4Union logic.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic will be intelligible when it is associated with the buildProjection4Union logic.

resultTp := types.NewFieldType(types.MergeFieldType(a.Tp, b.Tp))
// This logic need be understood with buildProjection4Union logic.
if a.Tp == mysql.TypeNewDecimal {
// the decimal type could be unsigned only when all the decimals to be united are unsigned
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decimal type will be unsigned only when all the decimals to be united are unsigned.

@lysu lysu added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 24, 2018
@lysu
Copy link
Contributor Author

lysu commented Jul 25, 2018

@coocood @XuHuaiyu PTAL thx~

jackysp
jackysp previously approved these changes Jul 25, 2018
@lysu
Copy link
Contributor Author

lysu commented Jul 26, 2018

@coocood @XuHuaiyu PTAL thx~

// baseBuiltinCastFunc will be contained in every struct that implement cast builtinFunc
type baseBuiltinCastFunc struct {
baseBuiltinFunc
inUnionAll bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1593,6 +1626,26 @@ func (b *builtinCastJSONAsDurationSig) evalDuration(row types.Row) (res types.Du
return
}

// inCastContext is session key type that indicates whether executing in special cast context.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make a more detail explaination for special?

b.inUnion = from.inUnion
}

func newBaseBuiltinCastFunc(builtinFunc baseBuiltinFunc, inUnionAll bool) baseBuiltinCastFunc {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ inUnionAll/ inUnion

func newBaseBuiltinCastFunc(builtinFunc baseBuiltinFunc, inUnionAll bool) baseBuiltinCastFunc {
return baseBuiltinCastFunc{
baseBuiltinFunc: builtinFunc,
inUnion: inUnionAll,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

tk.MustQuery("select id, i, b, d, dd from t2 union select id, i, b, d, dd from t1 order by id").
Check(testkit.Rows("1 0 0 0 -1", "2 1 1 1.1 1"))
tk.MustQuery("select id, i from t2 union select id, cast(i as unsigned int) from t1 order by id").
Check(testkit.Rows("1 18446744073709551615", "2 1"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about unsigned double union int? we need a case to cover union different types.

@lysu
Copy link
Contributor Author

lysu commented Jul 27, 2018

/run-all-tests tidb-test=pr/594

@lysu
Copy link
Contributor Author

lysu commented Jul 30, 2018

/run-common-test tidb-test=pr/594
/run-integration-common-test tidb-test=pr/594
/run-mybatis-test tidb-test=pr/594

@lysu
Copy link
Contributor Author

lysu commented Jul 30, 2018

PTAL @coocood @XuHuaiyu

Copy link
Contributor

@XuHuaiyu XuHuaiyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lysu lysu added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 30, 2018
Copy link
Member

@coocood coocood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coocood coocood merged commit a801259 into pingcap:master Jul 30, 2018
@lysu lysu deleted the dev-fix-union-unsign branch September 27, 2018 04:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/expression status/LGT2 Indicates that a PR has LGTM 2. type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

union result is not differ from mysql
6 participants