After rke2 v1.27.4 arm64 installed, cni pod init error #4737
Comments
The install failed, but you can run |
Do you know what the problem is? @brandond |
I'm not sure why iptables would segfault on your hardware; I suspect perhaps your processor model lacks something the binary expects. What is the output of |
I am using an arm64 virtual machine. Here is the CPU info:
# lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 2
Model: 0
BogoMIPS: 200.00
NUMA node0 CPU(s): 0,1
NUMA node1 CPU(s): 2,3
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
# cat /proc/cpuinfo
processor : 0
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
processor : 1
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
processor : 2
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
processor : 3
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
|
Which VM platform are you running it in? Can you provide steps to reproduce? This works for me on multiple physical arm64 platforms. |
My VM runs on OpenStack. The physical compute node runs CentOS 7.8 on a HUAWEI Kunpeng 920 5220 CPU.
# Physical node info
# arch
aarch64
# cat /etc/redhat-release
CentOS Linux release 7.8.2003 (AltArch)
# lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 1
Core(s) per socket: 32
Socket(s): 2
NUMA node(s): 2
Model: 0
CPU max MHz: 2600.0000
CPU min MHz: 200.0000
BogoMIPS: 200.00
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 32768K
NUMA node0 CPU(s): 0-31
NUMA node1 CPU(s): 32-63
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
# cat /proc/cpuinfo
processor : 0
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
...
# dmidecode -t processor
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.2.0 present.
Handle 0x001B, DMI type 4, 48 bytes
Processor Information
Socket Designation: CPU01
Type: Central Processor
Family: ARM
Manufacturer: HiSilicon
ID: 10 D0 1F 48 00 00 00 00
Signature: Implementor 0x48, Variant 0x1, Architecture 15, Part 0xd01, Revision 0
Version: HUAWEI Kunpeng 920 5220
Voltage: 0.9 V
External Clock: 100 MHz
Max Speed: 2600 MHz
Current Speed: 2600 MHz
Status: Populated, Enabled
Upgrade: Unknown
L1 Cache Handle: 0x0018
L2 Cache Handle: 0x0019
L3 Cache Handle: 0x001A
Serial Number: 6B73215401A03324
Asset Tag: To be filled by O.E.M.
Part Number: To be filled by O.E.M.
Core Count: 32
Core Enabled: 32
Thread Count: 32
Characteristics:
64-bit capable
Multi-Core
Execute Protection
Enhanced Virtualization
Power/Performance Control |
OpenStack is the Train release. The libvirt-related nova configuration:
# cat /etc/nova/nova.conf
[libvirt]
connection_uri = qemu:///system
cpu_mode = host-passthrough
virt_type = kvm
The following is the XML of the virtual machine:
# virsh list
Id Name State
-----------------------------------
30 ubuntu-arm running
38 instance-00001ce6 running
43 instance-00001ce5 running
52 instance-00001d33 running
54 instance-00001d4a running
# virsh dumpxml 54
<domain type='kvm' id='54'>
<name>instance-00001d4a</name>
<uuid>5db3a62c-bd34-4dce-b642-290e3df6db1f</uuid>
<metadata>
<nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
<nova:package version="0.0.0-1.el7"/>
<nova:name>centos78v.arm.bjat.qianxin-inc.cn</nova:name>
<nova:creationTime>2023-09-07 06:38:29</nova:creationTime>
<nova:flavor name="kc1.large.2">
<nova:memory>8192</nova:memory>
<nova:disk>0</nova:disk>
<nova:swap>0</nova:swap>
<nova:ephemeral>0</nova:ephemeral>
<nova:vcpus>4</nova:vcpus>
</nova:flavor>
<nova:owner>
<nova:user uuid="be803e337dbb423097ab049b5af4df95">admin</nova:user>
<nova:project uuid="e93293733175465bbc00ccdf40a6f7b0">polaris-dev</nova:project>
</nova:owner>
</nova:instance>
</metadata>
<memory unit='KiB'>8388608</memory>
<currentMemory unit='KiB'>8388608</currentMemory>
<vcpu placement='static'>4</vcpu>
<cputune>
<shares>4096</shares>
<vcpupin vcpu='0' cpuset='30'/>
<vcpupin vcpu='1' cpuset='5'/>
<vcpupin vcpu='2' cpuset='41'/>
<vcpupin vcpu='3' cpuset='43'/>
<emulatorpin cpuset='5,30,41,43'/>
</cputune>
<numatune>
<memory mode='strict' nodeset='0-1'/>
<memnode cellid='0' mode='strict' nodeset='0'/>
<memnode cellid='1' mode='strict' nodeset='1'/>
</numatune>
<resource>
<partition>/machine</partition>
</resource>
<sysinfo type='smbios'>
<system>
<entry name='manufacturer'>RDO</entry>
<entry name='product'>OpenStack Compute</entry>
<entry name='version'>0.0.0-1.el7</entry>
<entry name='serial'>5db3a62c-bd34-4dce-b642-290e3df6db1f</entry>
<entry name='uuid'>5db3a62c-bd34-4dce-b642-290e3df6db1f</entry>
<entry name='family'>Virtual Machine</entry>
</system>
</sysinfo>
<os>
<type arch='aarch64' machine='virt-rhel7.6.0'>hvm</type>
<loader readonly='yes' type='pflash'>/usr/share/AAVMF/AAVMF_CODE.fd</loader>
<nvram>/var/lib/libvirt/qemu/nvram/instance-00001d4a_VARS.fd</nvram>
<boot dev='hd'/>
<smbios mode='sysinfo'/>
</os>
<features>
<acpi/>
<apic/>
<gic version='3'/>
</features>
<cpu mode='host-passthrough' check='none'>
<topology sockets='2' cores='2' threads='1'/>
<numa>
<cell id='0' cpus='0-1' memory='4194304' unit='KiB'/>
<cell id='1' cpus='2-3' memory='4194304' unit='KiB'/>
</numa>
</cpu>
<clock offset='utc'>
<timer name='pit' tickpolicy='delay'/>
<timer name='rtc' tickpolicy='catchup'/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/libexec/qemu-kvm</emulator>
<disk type='network' device='disk'>
<driver name='qemu' type='raw' cache='none' discard='unmap'/>
<auth username='cinder'>
<secret type='ceph' uuid='fa197221-4a80-4976-a7c1-156b5fb7076e'/>
</auth>
<source protocol='rbd' name='cinder.volumes.hdd/volume-0806b304-dd14-44fc-a333-fec13a2e0826'>
<host name='10.57.37.52' port='6789'/>
<host name='10.57.37.53' port='6789'/>
</source>
<target dev='sda' bus='scsi'/>
<iotune>
<total_bytes_sec>60000000</total_bytes_sec>
<total_iops_sec>500</total_iops_sec>
</iotune>
<serial>0806b304-dd14-44fc-a333-fec13a2e0826</serial>
<alias name='scsi0-0-0-0'/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<controller type='scsi' index='0' model='virtio-scsi'>
<alias name='scsi0'/>
<address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</controller>
<controller type='usb' index='0' model='qemu-xhci'>
<alias name='usb'/>
<address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
</controller>
<controller type='pci' index='0' model='pcie-root'>
<alias name='pcie.0'/>
</controller>
<controller type='pci' index='1' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='1' port='0x8'/>
<alias name='pci.1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
</controller>
<controller type='pci' index='2' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='2' port='0x9'/>
<alias name='pci.2'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
</controller>
<controller type='pci' index='3' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='3' port='0xa'/>
<alias name='pci.3'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
</controller>
<controller type='pci' index='4' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='4' port='0xb'/>
<alias name='pci.4'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
</controller>
<controller type='pci' index='5' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='5' port='0xc'/>
<alias name='pci.5'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
</controller>
<controller type='pci' index='6' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='6' port='0xd'/>
<alias name='pci.6'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x5'/>
</controller>
<interface type='ethernet'>
<mac address='fa:16:3c:24:e3:7b'/>
<target dev='tap047da446-bd'/>
<model type='virtio'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>
<serial type='pty'>
<source path='/dev/pts/6'/>
<log file='/var/lib/nova/instances/5db3a62c-bd34-4dce-b642-290e3df6db1f/console.log' append='off'/>
<target type='system-serial' port='0'>
<model name='pl011'/>
</target>
<alias name='serial0'/>
</serial>
<console type='pty' tty='/dev/pts/6'>
<source path='/dev/pts/6'/>
<log file='/var/lib/nova/instances/5db3a62c-bd34-4dce-b642-290e3df6db1f/console.log' append='off'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
<input type='tablet' bus='usb'>
<alias name='input0'/>
<address type='usb' bus='0' port='1'/>
</input>
<input type='keyboard' bus='usb'>
<alias name='input1'/>
<address type='usb' bus='0' port='2'/>
</input>
<graphics type='vnc' port='5904' autoport='yes' listen='0.0.0.0'>
<listen type='address' address='0.0.0.0'/>
</graphics>
<video>
<model type='virtio' heads='1' primary='yes'/>
<alias name='video0'/>
<address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</video>
<memballoon model='virtio'>
<stats period='10'/>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</memballoon>
</devices>
<seclabel type='dynamic' model='dac' relabel='yes'>
<label>+0:+0</label>
<imagelabel>+0:+0</imagelabel>
</seclabel>
</domain> |
I faced the same issue in version v1.28.1+rke2r1 on arm64, but it works fine on an x86 machine. |
Which operating system is used? |
The issue seems related to the iptables installed on the machine. Could you check whether the iptables binary is built for arm64? |
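One way to answer that question without extra tooling is to read the ELF header directly. The following is a minimal sketch, not something posted in this thread: the function names `elf_machine` and `is_aarch64_elf` are hypothetical, and it assumes a little-endian host and a little-endian ELF file (true for both x86_64 and aarch64 Linux). The ELF `e_machine` field is a 16-bit value at byte offset 18, and the value 183 identifies AArch64.

```shell
#!/bin/sh
# elf_machine FILE: print the ELF e_machine field as a decimal number.
# e_machine is a 16-bit value stored at byte offset 18 of the ELF header.
# Assumption: host and file are both little-endian.
elf_machine() {
    od -An -tu2 -j18 -N2 "$1" | tr -d ' \n'
}

# is_aarch64_elf FILE: succeed only if FILE starts with the ELF magic
# (0x7f 'E' 'L' 'F') and its e_machine is 183 (EM_AARCH64).
is_aarch64_elf() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] && [ "$(elf_machine "$1")" = "183" ]
}
```

For example, `is_aarch64_elf "$(command -v iptables)" && echo "built for arm64"` checks the host binary. The same check can be pointed at a binary extracted from a container image, since the binary that matters may not be the one installed on the host.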
CENTOS_MANTISBT_PROJECT="CentOS-7" The version of iptables: |
I also reproduced this issue without kubernetes
|
@rbrtbnfgl our hardened-kubernetes image actually includes iptables binaries from k3s-root: https://github.com/rancher/image-build-kubernetes/blob/master/Dockerfile#L58-L61 I suspect we need to bump this to v0.12.2 or newer for the 64k page size fix. |
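The 64k page size mentioned here is the kernel's memory page size, which differs between arm64 distribution kernels. A minimal sketch for checking it on a node follows; the function names are hypothetical, and whether the affected CentOS 7 guest actually runs with 64 KiB pages is an assumption that was not confirmed in this thread.

```shell
#!/bin/sh
# page_size_kib: print the running kernel's memory page size in KiB.
page_size_kib() {
    echo $(( $(getconf PAGESIZE) / 1024 ))
}

# is_64k_page_size [KIB]: succeed if the given page size (in KiB) is 64.
# With no argument, the page size of the current system is used.
is_64k_page_size() {
    [ "${1:-$(page_size_kib)}" -eq 64 ]
}
```

Running `page_size_kib` on both the failing node and a working one (for example the Ubuntu arm64 host) would show whether the page size differs between them.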
We closed this in k3s after community validation, and the fix is the same here, so I am going to close it out here with the same reasoning per k3s-io/k3s#7335 (comment). If this does not resolve the issue, please let me know and we can work towards a better fix and getting an environment where we can reproduce it. Thank you! |
@rancher-max |
It's a Docker ARG, which is a variable passed in to the Dockerfile at build time... that is how Docker args work. Can you confirm which iptables binary specifically is segfaulting? I suspect there may be another binary embedded somewhere that is not usable on your platform. |
@brandond Here is a detailed binary comparison
It turns out that the md5sum of the xtables-nft-multi binary in k3s-root-arm (version v0.12.1) is the same as the one in the rke2 (version v1.28.2+rke2r1) image, and both crash with a segmentation fault. |
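The comparison described above can be reproduced with a small helper. This is a sketch under stated assumptions, not the commands the commenter ran: `same_md5` is a hypothetical name, and the two binaries are assumed to have already been extracted to local paths.

```shell
#!/bin/sh
# same_md5 FILE_A FILE_B: succeed if both files have the same MD5 checksum.
same_md5() {
    sum_a=$(md5sum "$1" | cut -d' ' -f1)
    sum_b=$(md5sum "$2" | cut -d' ' -f1)
    [ "$sum_a" = "$sum_b" ]
}
```

A call such as `same_md5 ./k3s-root/xtables-nft-multi ./rke2-image/xtables-nft-multi && echo identical` (both paths are placeholders) confirms whether the image ships the same binary as a given k3s-root release.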
Hmm. https://github.com/rancher/image-build-kubernetes/releases/tag/v1.28.2-rke2r1-build20230913 shows that it was built against rancher/image-build-kubernetes@c29ac4f which has the updated version... I'll have to see if that is perhaps also set elsewhere. |
oh, derp - we also define it here... and this one takes precedence |
Will need to be tested once we have hardened-kubernetes images tagged for 1.28.3 |
Validated on master branch with RC
Config.yaml:
Cluster Configuration:
Testing Steps Copy config.yaml
|
Environmental Info:
RKE2 Version:
# rke2 -v
rke2 version v1.25.9+rke2r1 (842d05e64bcbf78552f1db0b32700b8faea403a0)
go version go1.19.8 X:boringcrypto
OS info:
k8s Version:
Install result:
Pod log:
Describe the problem:
Steps To Reproduce:
systemctl stop firewalld.service
systemctl stop NetworkManager
INSTALL_RKE2_ARTIFACT_PATH=/root/rke2-artifacts sh install.sh
systemctl enable rke2-server.service
systemctl start rke2-server.service
Expected behavior:
The server node should become ready.
Other attempts:
I used the same install steps on ubuntu-arm64-22.04, and it works.