memory_quota: some goroutines panic because of invalid memory address or nil pointer dereference #18547

Closed
ChenPeng2013 opened this issue Jul 14, 2020 · 9 comments
Labels: challenge-program, help wanted, severity/major, sig/execution, type/bug

Comments

@ChenPeng2013
Contributor

ChenPeng2013 commented Jul 14, 2020

Description

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

mem-quota-query = 1073741824
oom-action = "cancel"
oom-use-tmp-storage = false

tiup bench tpch --sf=1 prepare
tiup bench tpch --sf=1 --check=true run --threads 2 --time=300s

2. What did you expect to see? (Required)

3. What did you see instead (Required)

[2020/07/14 08:26:00.176 +00:00] [ERROR] [misc.go:90] ["panic in the recoverable goroutine"] [r="\"Out Of Memory Quota![conn_id=11]\""] ["stack trace"="github.com/pingcap/tidb/util.WithRecovery.func1\n\t/root/tidb/util/misc.go:92\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:967\ngithub.com/pingcap/tidb/util/memory.(*PanicOnExceed).Action\n\t/root/tidb/util/memory/action.go:96\ngithub.com/pingcap/tidb/util/memory.(*Tracker).Consume\n\t/root/tidb/util/memory/tracker.go:226\ngithub.com/pingcap/tidb/util/chunk.(*List).Add\n\t/root/tidb/util/chunk/list.go:120\ngithub.com/pingcap/tidb/executor.(*outerWorker).buildTask\n\t/root/tidb/executor/index_lookup_join.go:407\ngithub.com/pingcap/tidb/executor.(*indexHashJoinOuterWorker).buildTask\n\t/root/tidb/executor/index_lookup_hash_join.go:335\ngithub.com/pingcap/tidb/executor.(*indexHashJoinOuterWorker).run\n\t/root/tidb/executor/index_lookup_hash_join.go:314\ngithub.com/pingcap/tidb/executor.(*IndexNestedLoopHashJoin).startWorkers.func1\n\t/root/tidb/executor/index_lookup_hash_join.go:161\ngithub.com/pingcap/tidb/util.WithRecovery\n\t/root/tidb/util/misc.go:95"]
[2020/07/14 08:26:00.202 +00:00] [ERROR] [misc.go:90] ["panic in the recoverable goroutine"] [r="\"invalid memory address or nil pointer dereference\""] ["stack trace"="github.com/pingcap/tidb/util.WithRecovery.func1\n\t/root/tidb/util/misc.go:92\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:967\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:212\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:695\ngithub.com/pingcap/tidb/util/chunk.(*List).NumChunks\n\t/root/tidb/util/chunk/list.go:71\ngithub.com/pingcap/tidb/util/chunk.(*iterator4List).Begin\n\t/root/tidb/util/chunk/iterator.go:190\ngithub.com/pingcap/tidb/executor.(*indexHashJoinInnerWorker).doJoinUnordered\n\t/root/tidb/executor/index_lookup_hash_join.go:564\ngithub.com/pingcap/tidb/executor.(*indexHashJoinInnerWorker).handleTask\n\t/root/tidb/executor/index_lookup_hash_join.go:556\ngithub.com/pingcap/tidb/executor.(*indexHashJoinInnerWorker).run\n\t/root/tidb/executor/index_lookup_hash_join.go:452\ngithub.com/pingcap/tidb/executor.(*IndexNestedLoopHashJoin).startWorkers.func2\n\t/root/tidb/executor/index_lookup_hash_join.go:186\ngithub.com/pingcap/tidb/util.WithRecovery\n\t/root/tidb/util/misc.go:95"]
[2020/07/14 08:26:00.207 +00:00] [ERROR] [misc.go:90] ["panic in the recoverable goroutine"] [r="\"invalid memory address or nil pointer dereference\""] ["stack trace"="github.com/pingcap/tidb/util.WithRecovery.func1\n\t/root/tidb/util/misc.go:92\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:967\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:212\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:695\ngithub.com/pingcap/tidb/util/chunk.(*List).NumChunks\n\t/root/tidb/util/chunk/list.go:71\ngithub.com/pingcap/tidb/util/chunk.(*iterator4List).Begin\n\t/root/tidb/util/chunk/iterator.go:190\ngithub.com/pingcap/tidb/executor.(*indexHashJoinInnerWorker).doJoinUnordered\n\t/root/tidb/executor/index_lookup_hash_join.go:564\ngithub.com/pingcap/tidb/executor.(*indexHashJoinInnerWorker).handleTask\n\t/root/tidb/executor/index_lookup_hash_join.go:556\ngithub.com/pingcap/tidb/executor.(*indexHashJoinInnerWorker).run\n\t/root/tidb/executor/index_lookup_hash_join.go:452\ngithub.com/pingcap/tidb/executor.(*IndexNestedLoopHashJoin).startWorkers.func2\n\t/root/tidb/executor/index_lookup_hash_join.go:186\ngithub.com/pingcap/tidb/util.WithRecovery\n\t/root/tidb/util/misc.go:95"]
[2020/07/14 08:26:00.229 +00:00] [ERROR] [misc.go:90] ["panic in the recoverable goroutine"] [r="\"invalid memory address or nil pointer dereference\""] ["stack trace"="github.com/pingcap/tidb/util.WithRecovery.func1\n\t/root/tidb/util/misc.go:92\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:967\nruntime.panicmem\n\t/usr/local/go/src/runtime/panic.go:212\nruntime.sigpanic\n\t/usr/local/go/src/runtime/signal_unix.go:695\ngithub.com/pingcap/tidb/util/chunk.(*List).NumChunks\n\t/root/tidb/util/chunk/list.go:71\ngithub.com/pingcap/tidb/util/chunk.(*iterator4List).Begin\n\t/root/tidb/util/chunk/iterator.go:190\ngithub.com/pingcap/tidb/executor.(*indexHashJoinInnerWorker).doJoinUnordered\n\t/root/tidb/executor/index_lookup_hash_join.go:564\ngithub.com/pingcap/tidb/executor.(*indexHashJoinInnerWorker).handleTask\n\t/root/tidb/executor/index_lookup_hash_join.go:556\ngithub.com/pingcap/tidb/executor.(*indexHashJoinInnerWorker).run\n\t/root/tidb/executor/index_lookup_hash_join.go:452\ngithub.com/pingcap/tidb/executor.(*IndexNestedLoopHashJoin).startWorkers.func2\n\t/root/tidb/executor/index_lookup_hash_join.go:186\ngithub.com/pingcap/tidb/util.WithRecovery\n\t/root/tidb/util/misc.go:95"]

4. Affected version (Required)

Release Version: v4.0.2-26-g3c4a93226
Edition: Community
Git Commit Hash: 3c4a93226793b4b9cef0bd7579c33c42efda5b20
Git Branch: release-4.0
UTC Build Time: 2020-07-14 07:51:42
GoVersion: go1.14.1
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false

5. Root Cause Analysis

SIG slack channel

#sig-exec

Score

  • 900

Mentor

@ChenPeng2013 ChenPeng2013 added the type/bug label Jul 14, 2020
@fzhedu
Contributor

fzhedu commented Jul 15, 2020

/label component/executor

@ti-srebot ti-srebot added the sig/execution label Jul 15, 2020
@lzmhhh123 lzmhhh123 added the challenge-program and help wanted labels Oct 30, 2020
@hanlins
Contributor

hanlins commented Dec 1, 2020

/pick-up

@ti-challenge-bot

Pick up success.

@hanlins
Contributor

hanlins commented Dec 2, 2020

I reproduced the issue locally. It is caused by how context cancellation is handled: originally the code just returned nil when the task was cancelled, which caused the nil pointer panic later. This seems to have been fixed in f5fa3e7. If needed, I could add some tests to guard against regression or backport the fix to previous releases; otherwise the issue is technically fixed.

@hanlins
Contributor

hanlins commented Dec 2, 2020

Fixed by #19235 @lzmhhh123 .

@hanlins
Contributor

hanlins commented Dec 2, 2020

I've reproduced the issue locally, and with the fix mentioned above the issue is gone. @ChenPeng2013, can you verify whether this issue still occurs on the latest master? If not, we can resolve this issue and backport the fix if necessary.

@ti-challenge-bot

@hanlins You did not submit a PR within 7 days, so the pick-up has been given up automatically.

@ti-challenge-bot ti-challenge-bot bot removed the picked label Dec 8, 2020
@wshwsh12
Contributor

wshwsh12 commented Dec 8, 2020

The issue has been fixed by #19235, and the fix PR has been ported to release v4.0.4. I think we can close the issue now.

@wshwsh12 wshwsh12 closed this as completed Dec 8, 2020
@ti-srebot
Contributor

Please edit this comment or add a new comment to complete the following information

Not a bug

  1. Remove the 'type/bug' label
  2. Add notes to indicate why it is not a bug

Duplicate bug

  1. Add the 'type/duplicate' label
  2. Add the link to the original bug

Bug

Note: Make sure that the 'component' and 'severity' labels are added
Example for how to fill out the template: #20100

1. Root Cause Analysis (RCA) (optional)

2. Symptom (optional)

3. All Trigger Conditions (optional)

4. Workaround (optional)

5. Affected versions

6. Fixed versions
