Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misc build updates #614

Merged
merged 4 commits into from
Nov 27, 2019
Merged

Misc build updates #614

merged 4 commits into from
Nov 27, 2019

Conversation

gal-pressman
Copy link
Contributor

Add EFA to redhat's udev rules and add an explicit udev trigger.
Cbuild is modified to support for Amazon Linux 1 & 2 packaging.

buildlib/cbuild Outdated
@@ -652,7 +652,9 @@ if "check_output" in dir(subprocess):
for ln in subprocess.check_output(["rpmspec","-P",{tspec_file!r}]).splitlines():
if ln.startswith(b"Source:"):
tarfn = ln.strip().partition(b' ')[2].strip();
os.symlink({tarfn!r},os.path.join(b"SOURCES",tarfn));
dst = os.path.join(b"SOURCES",tarfn);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this even possible? This is run inside the container in a clean filesystem under an empty tmpdir, something else has gone quite wrong if this can faile

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happens when using the centos6 spec file:
Source: rdma-core.tgz

It names the tar rdma-core.tgz, and then the script tries to create a symlink with the same name.
Other spec files have the version/git_ver added to the tar name so they work fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

centos6 doesn't even run this code since its python is so old it lacks 'check_output'

It this because the aws1 image has a newer python and uses the centos6 spec file? The commit message should be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, sorry for the confusion.
Do you prefer that I squash this fix with the AmazonLinux commit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved the commit message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be written more like

if tarfn != {tarfn!r}:
    os.symlink({tarfn!r}, dst)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

@@ -171,6 +171,65 @@ class centos7_epel(centos7):
res.lines.append("RUN ln -s /usr/bin/cmake3 /usr/local/bin/cmake && ln -sf /usr/bin/python3.4 /usr/bin/python3");
return res;

class amazonlinux1(YumEnvironment):
docker_parent = "amazonlinux:1";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really need all three? How different is this actually from Centos?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not part of the AmazonLinux distro team so I really can't say how different they are from CentOS.
Why not have all three? Is there a goal to reduce the number of images?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It takes a long time to build all these images for testing purposes, if they are pointless I prefer not to have them all. ie checking the same EPEL modified packages against amazon and c7 is probably redundant.

I'd suggest the latest image only..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll remove amazonlinux2 and keep amazonlinux1 and amazonlix2_epel (which will be renamed to amazonlinux2).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ended up removing amazonlinux2_epel.

@@ -10,3 +10,4 @@ SUBSYSTEM=="module", KERNEL=="mlx*", ACTION=="add", TAG+="systemd", ENV{SYSTEMD_
SUBSYSTEM=="module", KERNEL=="iw_*", ACTION=="add", TAG+="systemd", ENV{SYSTEMD_WANTS}="rdma.service"
SUBSYSTEM=="module", KERNEL=="be2net", ACTION=="add", TAG+="systemd", ENV{SYSTEMD_WANTS}="rdma.service"
SUBSYSTEM=="module", KERNEL=="enic", ACTION=="add", TAG+="systemd", ENV{SYSTEMD_WANTS}="rdma.service"
SUBSYSTEM=="module", KERNEL=="efa", ACTION=="add", TAG+="systemd", ENV{SYSTEMD_WANTS}="rdma.service"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does efa also need the stuff in kernel-boot/ to be updated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, this rules is just badly designed, it should look more like

ACTION=="remove", GOTO="rdma_ulp_modules_end"
SUBSYSTEM!="infiniband", GOTO="rdma_ulp_modules_end"TAG+="systemd", ENV{SYSTEMD_WANTS}+="rdma.service"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you prefer to handle it then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, this rules is just badly designed, it should look more like

ACTION=="remove", GOTO="rdma_ulp_modules_end"
SUBSYSTEM!="infiniband", GOTO="rdma_ulp_modules_end"TAG+="systemd", ENV{SYSTEMD_WANTS}+="rdma.service"

I'm not sure this will have the same functionality, as some of the modules listed in this file are part of the network stack.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left this change untouched for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason to start rdma.service is if there is a rdma device, so doing it just because a net related module is loaded is already a bug.

RH really needs to fix this stuff. Fedora should be using kernel-boot and it is very unfortunate that RH8 shipped using this junky stuff. @dledford

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll change this commit to your suggested fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to leave it broken and work with RH to get an acceptable outcome, ie use kernel boot or fixup this file.

Add EFA to the list of hardware associated kernel modules in order to
trigger rdma.service load in systemd.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Make sure to explicitly trigger the shipped udev rules when installing
the rdma-core package.

Signed-off-by: Gal Pressman <galpress@amazon.com>
'python',
'python-argparse',
'python-docutils',
'python27-docutils',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why both?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python-docutils didn't work so I added the python27 one. I'll test without the former one and change accordingly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed python-docutils.

buildlib/cbuild Outdated

class amazonlinux2(YumEnvironment):
docker_parent = "amazonlinux:2";
pkgs = amazonlinux1.pkgs | {'systemd-devel'};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be pkgs = centos7.pkgs ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

In case the tar already exists with the same name there's no need to
create a symbolic link for it. This resolves an issue where the symlink
call would fail due to FileExistsError exception.

This is observed when packaging AmazonLinux1 that uses centos6 spec file
which uses the name rdma-core.tgz.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Add Amazon Linux 1/2 to the cbuild script and the building section in
the README.

Signed-off-by: Gal Pressman <galpress@amazon.com>
@gal-pressman
Copy link
Contributor Author

Kind reminder

@rleon rleon merged commit ab93ae0 into linux-rdma:master Nov 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants