Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: print an expensive query log when a query is out of memQuota #10799

Merged
merged 10 commits into from
Jun 18, 2019

Conversation

XuHuaiyu
Copy link
Contributor

What problem does this PR solve?

Print an expensive query log when a query is out of memory quota.

What is changed and how it works?

  1. Bind a hook for Action, and this hook will be triggered when a sql's memory usage is out of quota.
  2. Print an expensive query log no matter the action is log or cancel since expensive query log is more detailed.

Check List

Tests

  • Manual test (add detailed scripts or steps below)
  1. Change the MemQuotaQuery to 1<<15 (32KB), and run select * from t, then we get:
 [WARN] [expensivequery.go:150] [expensive_query] [cost_time=0.066961588s] [stats=t:409052062136664065] [conn_id=1] [user=root] [database=test] [table_ids="[44]"] [txn_start_ts=409052736881426432] [mem_max="32.4697265625 KB"] [sql="select * from t"]
  1. Change the OOMAction to cancel and MemQuotaQuery to 1<<15(32KB), and run select * from t, then we get:
[WARN] [expensivequery.go:150] [expensive_query] [cost_time=0.070405408s] [stats=t:409052763423506432] [conn_id=1] [user=root] [database=test] [table_ids="[44]"] [txn_start_ts=409052764014379008] [mem_max="32.4697265625 KB"] [sql="select * from t"]
[2019/06/13 15:54:20.453 +08:00] [ERROR] [coprocessor.go:616] ["copIteratorWork meet panic"] [r="\"Out Of Memory Quota!\\n\\\"select * from t\\\"{\\n  \\\"quota\\\": 32 KB\\n  \\\"consumed\\\": 32.4697265625 KB\\n  \\\"TableReaderDistSQLTracker\\\"{\\n    \\\"quota\\\": 32 GB\\n    \\\"consumed\\\": 32.4697265625 KB\\n  }\\n}\\n\""] ["stack trace"="github.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTask.func1\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:618\nruntime.gopanic\n\t/Users/xuhuaiyu/.gvm/gos/go1.12/src/runtime/panic.go:522\ngithub.com/pingcap/tidb/util/memory.(*PanicOnExceed).Action\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/util/memory/action.go:90\ngithub.com/pingcap/tidb/util/memory.(*Tracker).Consume\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/util/memory/tracker.go:161\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).sendToRespCh\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:553\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleCopResponse\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:824\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTaskOnce\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:683\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTask\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:626\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).run\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:465"] [stack="github.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTask.func1\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:616\nruntime.gopanic\n\t/Users/xuhuaiyu/.gvm/gos/go1.12/src/runtime/panic.go:522\ngithub.com/pingcap/tidb/util/memory.(*PanicOnExceed).Action\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/util/memory/action.go:90\ngithub.com/pingcap/tidb/util/memory.(*Tracker).Consume\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/util/memory/tracker.go:161\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).sendToRespCh\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:553\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleCopResponse\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:824\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTaskOnce\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:683\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTask\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:626\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).run\n\t/Users/xuhuaiyu/Development/GOPATH/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:465"]

Code changes

  • Has exported function/method change
  • Has exported variable/fields change

Side effects

N/A

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

@XuHuaiyu XuHuaiyu added type/enhancement The issue or PR belongs to an enhancement. sig/execution SIG execution labels Jun 13, 2019
@XuHuaiyu XuHuaiyu added the priority/P1 The issue has P1 priority. label Jun 13, 2019
@codecov
Copy link

codecov bot commented Jun 13, 2019

Codecov Report

Merging #10799 into master will increase coverage by 0.0187%.
The diff coverage is 25%.

@@               Coverage Diff                @@
##             master     #10799        +/-   ##
================================================
+ Coverage   80.8285%   80.8473%   +0.0187%     
================================================
  Files           419        419                
  Lines         88611      88635        +24     
================================================
+ Hits          71623      71659        +36     
+ Misses        11760      11750        -10     
+ Partials       5228       5226         -2

ddl/ddl_test.go Outdated
SetHook(h Callback)
// SetInterceptoror sets the interceptor.
SetInterceptoror(h Interceptor)
}

// SetHook implements DDL.SetHook interface.
// SetLogHook implements DDL.SetLogHook interface.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment and Function name is not consistent...
I think it can't pass this check #10787

Copy link
Contributor

@qw4990 qw4990 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@XuHuaiyu XuHuaiyu added the status/LGT1 Indicates that a PR has LGTM 1. label Jun 17, 2019
default:
sc.MemTracker.SetActionOnExceed(&memory.LogOnExceed{})
action := &memory.LogOnExceed{ConnID: ctx.GetSessionVars().ConnectionID}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we return an unsupported oom action error?

executor/executor.go Show resolved Hide resolved
@@ -490,8 +490,9 @@ func (s *Server) ShowProcessList() map[uint64]*util.ProcessInfo {
if atomic.LoadInt32(&client.status) == connStatusWaitShutdown {
continue
}
pi := client.ctx.ShowProcess()
rs[pi.ID] = pi
if pi := client.ctx.ShowProcess(); pi != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a bugfix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, #10740
But I'm not sure how to add a test case for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do release 2.1 and 3.0 have this issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, only master branch has this issue now.
Before refactoring the expensive log, this function will only be called when we use show processlist, the pi can be promised to be not null.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this PR needs to be cherry-picked to release-3.0, so we'd better use another PR to fix it?

Copy link
Contributor

@SunRunAway SunRunAway Jun 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add an UT about it?
And we should backport this UT to old releases.

util/memory/action.go Show resolved Hide resolved
@XuHuaiyu XuHuaiyu requested a review from zz-jason June 17, 2019 07:55
@@ -490,8 +490,9 @@ func (s *Server) ShowProcessList() map[uint64]*util.ProcessInfo {
if atomic.LoadInt32(&client.status) == connStatusWaitShutdown {
continue
}
pi := client.ctx.ShowProcess()
rs[pi.ID] = pi
if pi := client.ctx.ShowProcess(); pi != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do release 2.1 and 3.0 have this issue?

executor/executor.go Show resolved Hide resolved
@@ -1303,11 +1303,15 @@ func ResetContextOfStmt(ctx sessionctx.Context, s ast.StmtNode) (err error) {
}
switch config.GetGlobalConfig().OOMAction {
case config.OOMActionCancel:
sc.MemTracker.SetActionOnExceed(&memory.PanicOnExceed{})
action := &memory.PanicOnExceed{ConnID: ctx.GetSessionVars().ConnectionID}
action.SetLogHook(domain.GetDomain(ctx).ExpensiveQueryHandle().LogOnQueryExceedMemQuota)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to add this log. When the query runs out of memory quota, it panics. The upper layer will recover that goroutine and log the panic reason and sql, which is "out of memory quota".

Copy link
Contributor Author

@XuHuaiyu XuHuaiyu Jun 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they are two different logs.
Expensive query log can provide us more info about the query.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we change the panic message to the content we want to log?

@XuHuaiyu XuHuaiyu requested a review from zz-jason June 18, 2019 02:22
ddl/ddl_test.go Outdated
@@ -46,7 +46,7 @@ type DDLForTest interface {
SetInterceptoror(h Interceptor)
}

// SetHook implements DDL.SetHook interface.
// SetHook implements DDL.SetLogHook interface.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should not be changed.

@XuHuaiyu XuHuaiyu requested a review from alivxxx June 18, 2019 07:31
Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@XuHuaiyu
Copy link
Contributor Author

/run-all-tests

@@ -17,6 +17,7 @@ import (
"context"
"sync"

"fmt"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this upward.

Copy link
Contributor

@alivxxx alivxxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/P1 The issue has P1 priority. sig/execution SIG execution status/LGT2 Indicates that a PR has LGTM 2. type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants