long-time operation or many VMs may lead a meta db problem #1131

Closed
seokho-son opened this issue Jun 8, 2022 · 8 comments · Fixed by #1140
Labels
bug Something isn't working

Comments

@seokho-son
Member

What happened
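While controlling 8 MCISs with more than 100 VMs in total,

after an individual VM was deleted from an MCIS,

  • the value stored in the meta DB was wiped,
  • while the key itself remained.
http://localhost:1323/tumblebug/ns/ns01
Response body: {   "id": "",   "name": "",   "description": "" }

Some MCISs showed the same symptom.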

How to reproduce it (as minimally and precisely as possible)

  • Not reproduced yet
  • Need to set up a similar environment and run a variety of tests
  • Test deleting a specific VM from an MCIS

Anything else we need to know?
:

Environment

  • Source version or branch: Head
  • OS: ubuntu
  • Others:

Proposed solution
:

Any other context
:

@seokho-son seokho-son added the bug Something isn't working label Jun 8, 2022
@seokho-son
Member Author

// UpdateVmInfo is func to update VM Info
func UpdateVmInfo(nsId string, mcisId string, vmInfoData TbVmInfo) {
	key := common.GenMcisKey(nsId, mcisId, vmInfoData.Id)

	// Check existence of the key. If no key, no update.
	keyValue, err := common.CBStore.Get(key)
	if keyValue == nil || err != nil {
		return
	}

	vmTmp := TbVmInfo{}
	json.Unmarshal([]byte(keyValue.Value), &vmTmp)

	if !reflect.DeepEqual(vmTmp, vmInfoData) {
		val, _ := json.Marshal(vmInfoData)
		err = common.CBStore.Put(key, string(val))
		if err != nil {
			common.CBLog.Error(err)
		}
	}
}

The error handling here needs to be improved.

@seokho-son
Member Author

// GenMcisKey is func to generate a key used in keyValue store
func GenMcisKey(nsId string, mcisId string, vmId string) string {

	if vmId != "" {
		return "/ns/" + nsId + "/mcis/" + mcisId + "/vm/" + vmId
	} else if mcisId != "" {
		return "/ns/" + nsId + "/mcis/" + mcisId
	} else if nsId != "" {
		return "/ns/" + nsId
	} else {
		return ""
	}
}

The error handling here needs to be improved as well.
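
As quoted, `GenMcisKey("", "", "vm1")` would return `/ns//mcis//vm/vm1`, because only the last non-empty ID is checked. A possible stricter variant is sketched below. `genMcisKeyChecked` is a hypothetical name, and returning an error (rather than an empty string) is a design assumption, not the project's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// genMcisKeyChecked is a hypothetical variant of GenMcisKey that rejects
// keys whose parent IDs are missing instead of building a malformed path.
func genMcisKeyChecked(nsId, mcisId, vmId string) (string, error) {
	switch {
	case nsId == "":
		return "", errors.New("nsId must not be empty")
	case vmId != "" && mcisId == "":
		return "", errors.New("mcisId must not be empty when vmId is set")
	case vmId != "":
		return "/ns/" + nsId + "/mcis/" + mcisId + "/vm/" + vmId, nil
	case mcisId != "":
		return "/ns/" + nsId + "/mcis/" + mcisId, nil
	default:
		return "/ns/" + nsId, nil
	}
}

func main() {
	key, err := genMcisKeyChecked("ns01", "mcis01", "vm01")
	fmt.Println(key, err)

	_, err = genMcisKeyChecked("", "", "vm01")
	fmt.Println("malformed input rejected:", err != nil)
}
```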

@seokho-son
Member Author

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x47d44b]

goroutine 1033235 [running]:
github.com/cloud-barista/cb-tumblebug/src/core/mcis.GetVmStatusAsync(0xc000e5bc20, 0xc001c6fc12, 0x4, 0xc003a0810e, 0x12, 0xc00184c664, 0x16, 0xc0000dd440, 0x0, 0x0)
	/home/son/go/src/github.com/cloud-barista/cb-tumblebug/src/core/mcis/manageInfo.go:907 +0x28f
created by github.com/cloud-barista/cb-tumblebug/src/core/mcis.GetMcisStatus
	/home/son/go/src/github.com/cloud-barista/cb-tumblebug/src/core/mcis/manageInfo.go:533 +0x6a9
Makefile:12: recipe for target 'run' failed
make: *** [run] Error 2

@seokho-son
Member Author

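In an MCIS that contains many VMs,

deleting the information of one of its VMs

can cause an error in the MCIS status query.

The MCIS status query basically fetches the list of VMs in the MCIS,
queries the status of each VM, and then updates the MCIS object.

If the MCIS status query takes a long time,
the VM list fetched at the start is likely to change while the query is still running (for example, when a VM delete request arrives).

As a result, a request can be issued for a VM that no longer exists,
and the nil reference error and panic presumably occur in a code path where error handling is insufficient.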
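
The race described above can be sketched as follows. This is an illustration only, not the code in `manageInfo.go`: `vmStore`, `vmStatus` and `collectStatuses` are hypothetical names. The point is that each goroutine re-checks that its VM still exists and skips it if not, instead of dereferencing a missing record.

```go
package main

import (
	"fmt"
	"sync"
)

type vmStatus struct {
	Id     string
	Status string
}

// vmStore is a hypothetical stand-in for the meta DB, guarded by a mutex.
type vmStore struct {
	mu  sync.RWMutex
	vms map[string]vmStatus
}

func (s *vmStore) get(id string) (vmStatus, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	vm, ok := s.vms[id]
	return vm, ok
}

func (s *vmStore) remove(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.vms, id)
}

// collectStatuses queries each VM from a previously fetched ID list.
// IDs that disappeared after the list was fetched are reported as skipped.
func collectStatuses(s *vmStore, ids []string) (found []vmStatus, skipped []string) {
	var wg sync.WaitGroup
	var mu sync.Mutex // protects found and skipped
	for _, id := range ids {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			vm, ok := s.get(id)
			mu.Lock()
			defer mu.Unlock()
			if !ok {
				skipped = append(skipped, id)
				return
			}
			found = append(found, vm)
		}(id)
	}
	wg.Wait()
	return found, skipped
}

func main() {
	store := &vmStore{vms: map[string]vmStatus{
		"vm1": {Id: "vm1", Status: "Running"},
		"vm2": {Id: "vm2", Status: "Running"},
		"vm3": {Id: "vm3", Status: "Running"},
	}}
	ids := []string{"vm1", "vm2", "vm3"} // list fetched at the start of the query
	store.remove("vm2")                  // a delete request arrives in the meantime

	found, skipped := collectStatuses(store, ids)
	fmt.Println("found:", len(found), "skipped:", skipped)
}
```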

@seokho-son
Member Author

seokho-son commented Jun 16, 2022

When the number of MCISs is raised to 300,

a connection issue with Spider appears: the process hits the default limit of 1024 "open files".

Related solution:

https://blog.sensecodons.com/2022/04/golang-httpclient-and-too-many-open.html

Checking the sessions

echo "cb-tb,cb-sp" > ss-tb.log; for run in {1..10000}; do cbtb=$(ss -anp | grep tumblebug | wc -l); cbsp=$(ss -anp | grep spider | wc -l); echo "${cbtb},${cbsp}" >> ss-tb.log; sleep 1; done

@seokho-son
Member Author

seokho-son commented Jun 16, 2022

Stopgap: change max file descriptors

https://stackoverflow.com/questions/32325343/go-tcp-too-many-open-files-debug

The ulimit setting may not work correctly on Ubuntu 18.*
Fix: https://superuser.com/questions/1200539/cannot-increase-open-file-limit-past-4096-ubuntu/1200818#_=_

@seokho-son
Member Author

Using a buffered channel to bound the goroutines, or
rate limiting, could also be an option.

https://dev.to/godoylucase/rate-limiting-your-goroutines-1om1
