Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile error on gcc 11 (Fedora 34) #4604

Closed
fhuberts opened this issue Mar 8, 2021 · 27 comments · Fixed by #4611
Closed

Compile error on gcc 11 (Fedora 34) #4604

fhuberts opened this issue Mar 8, 2021 · 27 comments · Fixed by #4611

Comments

@fhuberts
Copy link
Contributor

fhuberts commented Mar 8, 2021

I'm on Fedora 33 x64, doing a Fedora 34 x64 mock build for john 1.9.0-Jumbo-1-464-gc9825e688.

F34 has gcc 11:

# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.0.0 20210210 (Red Hat 11.0.0-0) (GCC) 

The rpm spec file uses these build commands:

cd src
./configure --disable-rexgen --enable-asan
make clean
make %{?_smp_mflags}

The error:

gcc -DAC_BUILT -mavx2 -DJOHN_AVX2 -c -g  -Og -fsanitize=address -m64 -I/usr/local/include -DARCH_LITTLE_ENDIAN=1 -DPREFER_FLOCK  -Wall -fno-omit-frame-pointer -Wno-deprecated-declarations -Wunused-but-set-variable -std=gnu89 -Wdate-time -D_POSIX_SOURCE -D_GNU_SOURCE -D_XOPEN_SOURCE=600  -fopenmp  -pthread -I/usr/local/include -DCL_SILENCE_DEPRECATION -funroll-loops argon2_encoding_plug.c -o argon2_encoding_plug.o
In file included from argon2_core_plug.c:32:
blake2.h:112:5: error: size of array element is not a multiple of its alignment
  112 |     blake2b_state S[4][1];
      |     ^~~~~~~~~~~~~
blake2.h:113:5: error: size of array element is not a multiple of its alignment
  113 |     blake2b_state R[1];
      |     ^~~~~~~~~~~~~
make[1]: *** [Makefile:1592: argon2_core_plug.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory '/builddir/build/BUILD/john-1.9.0.jumbo.1/src'
make: *** [Makefile:190: default] Error 2

Steps to reproduce

  • Install a Fedora 33 machine.

  • Install mock

  • You can reproduce it by using the repo

https://gitlab.com/fhuberts/rpmsUpstream.git

at commit 315a54e.

  • This repo needs a checkout of
https://gitlab.com/fhuberts/rpmbuilder.git

alongside it.

  • You have rpmbuilder and rpmsUpstream in the same directory.

  • Next you do

cd ./rpmsUpstream/fedora/john
  • And then
../buildRPMs  -M /etc/mock/fedora-34-x86_64.cfg rpm.spec
@claudioandre-br claudioandre-br added the RFC / discussion Help or comments wanted label Mar 8, 2021
@claudioandre-br
Copy link
Member

I've been aware of this for a few months. One can reproduce it using a regular Ubuntu 21.04 + gcc-snapshot (in a PPA, I guess) + .configure/make.

To me it is not clear if the alignment is a C standard /requirement. In fact, I would say the opposite: it is up to the compiler. Anyway, let's hear others.

@magnumripper
Copy link
Member

magnumripper commented Mar 8, 2021

So this is the offending code (pre-processed a little)

  __attribute__ ((aligned(64))) typedef struct __blake2b_state {
    uint64_t h[8];
    uint64_t t[2];
    uint64_t f[2];
    uint8_t  buf[2 * BLAKE2B_BLOCKBYTES];
    size_t   buflen;
    uint8_t  last_node;
  } blake2b_state;

  __attribute__ ((aligned(64))) typedef struct __blake2bp_state {
    blake2b_state S[4][1];   // <--- fails here
    blake2b_state R[1];
    uint8_t buf[4 * BLAKE2B_BLOCKBYTES];
    size_t  buflen;
  } blake2bp_state;

I'm not sure I even understand the "error: size of array element is not a multiple of its alignment" at all? What the heck do they mean?

@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

Looks like a gcc bug to me, unless gcc developers seriously discussed and decided to impose this new rule for some good reason? This is a development version of gcc, so I'm not too surprised it's buggy. As I recall, there were also compile issues by gcc 10.0, which were gone in 10.1.

@fhuberts Are you Fedora package maintainer? Is it fine for you that the package doesn't build until gcc is fixed, or is a workaround needed meanwhile? Also, why do you enable ASan there? I assume all of this is just for testing, and will not be a production build, correct?

Thank you for reporting this!

@solardiz solardiz added notes/external issues portability and removed RFC / discussion Help or comments wanted labels Mar 9, 2021
@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

BTW, we have an open issue to update the Argon2 implementation to latest upstream, which will probably bring revised BLAKE2 code in.

Does upstream https://github.com/P-H-C/phc-winner-argon2 build with gcc 11 fine or does it fail similarly? With SIMD enabled?

@fhuberts
Copy link
Contributor Author

fhuberts commented Mar 9, 2021

I'm not a Fedora maintainer, I build this package for my own purposes.
The asan I will disable for production builds, tnx ;-)

@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

gcc version 11.0.0 20210210 (Red Hat 11.0.0-0) (GCC)

I doubt this is actually upstream gcc snapshot of that date. My guess is it might be some Fedora fork of gcc-11 started at an earlier date, with partial backports from later snapshots to fix specific bugs but not the bug we're running into here. Something similar to what we see here: libdfp/libdfp#158

BTW, there isn't an upstream snapshot dated 20210210, the nearest are 20210207 and 20210214. Of course, it is also entirely possible Fedora developers took a git snapshot of another date.

@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

FWIW, I cannot build gcc 11-20210307 (and the previous snapshot) on the same system and using the same build script that built 10.1 fine, now failing like this:

config.status: creating config.h
config.status: executing default commands
make[2]: Leaving directory `/home/gcc/build/gcc-11-20210307/objdir'
make[1]: *** [stage1-bubble] Error 2

There's nothing obviously causing this error in the preceding messages. I'm not going to spend more time on this just yet. It could be some change between 10 and 11, or it could be a difference between releases and snapshots. There isn't a gcc 11 release yet.

@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

FWIW, I've just built gcc 10.2.0 (latest release) and built jumbo with it (latest from this repo as of now) - no issues.

@fhuberts Meanwhile, you can try changing the 64 to 8 here: __attribute__ ((aligned(64))) typedef struct __blake2bp_state {. This should let the build complete. This will likely also result in some formats crashing at runtime.

@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

@claudioandre-br @magnumripper BTW, yes, we have gcc 10.2.0 on "super" now - feel free to use it (see /etc/motd).

@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

@fhuberts Can you perhaps report this bug against Fedora 34's gcc package? If they're cherry-picking patches from gcc-11 series, they'll probably want to pick whatever patch fixes this as well, or update to latest upstream snapshot.

@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

I've been aware of this for a few months. One can reproduce it using a regular Ubuntu 21.04 + gcc-snapshot (in a PPA, I guess) + .configure/make.

@claudioandre-br Is this still the case? I'd expect gcc upstream to fix this at some point in gcc-11 development. I doubt they'll make a gcc 11 release with that issue still in there. So maybe latest gcc-snapshot in PPA already has this fixed?

@claudioandre-br
Copy link
Member

Is this still the case?

Yes

gcc --version
gcc (Ubuntu 20210225-1ubuntu1) 11.0.0 20210225 (experimental) [master revision 4028d01a050:8aeb92daee4:a6baafcac5308be1a5d92c0b2a179495b7a24b52]
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

In file included from argon2_core_plug.c:32:
blake2.h:112:5: error: size of array element is not a multiple of its alignment
  112 |     blake2b_state S[4][1];
      |     ^~~~~~~~~~~~~
blake2.h:113:5: error: size of array element is not a multiple of its alignment
  113 |     blake2b_state R[1];
      |     ^~~~~~~~~~~~~

@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

Oh. Looks like this is an intentional change on part of gcc developers.

commit 50bc94898fac1bd9cc1dabf227208fb5d369c4c4
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Oct 23 10:05:17 2020 +0200

    stor-layout: Reject forming arrays with elt sizes not divisible by elt alignment [PR97164]
    
    As mentioned in the PR, since 2005 we reject if array elements are smaller
    than their alignment (i.e. overaligned elements), because such arrays don't
    make much sense, only their first element is guaranteed to be aligned as
    user requested, but the next element can't be.
    The following testcases show something we've been silent about but is
    equally bad, the 2005 case is just the most common special case of that
    the array element size is not divisible by the alignment.  In those arrays
    too only the first element is guaranteed to be properly aligned and the
    second one can't be.
    
    This patch rejects those cases too, but keeps the existing wording for the
    old common case.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97164
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43783
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36093
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24944

We'll probably need to adapt to the new rules, then.

@solardiz solardiz added this to the Definitely 1.9.0-jumbo-2 milestone Mar 9, 2021
@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

So, once again: Does upstream https://github.com/P-H-C/phc-winner-argon2 build with gcc 11 fine or does it fail similarly? With SIMD enabled?

@fhuberts
Copy link
Contributor Author

fhuberts commented Mar 9, 2021

Filed it also at Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=1937076

@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

Filed it also at Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=1937076

Oh, after my latest findings above, I wouldn't, but that's OK - maybe once this gcc change is known to break builds of too many packages, it'd end up getting reconsidered. I don't currently have a preference - I see reasons both for and against the change.

@fhuberts
Copy link
Contributor Author

fhuberts commented Mar 9, 2021

They already reported the same root cause :-(

@solardiz
Copy link
Member

solardiz commented Mar 9, 2021

They already reported the same root cause :-(

And that's a good thing.

@fhuberts
Copy link
Contributor Author

fhuberts commented Mar 9, 2021

There is a nice explanation on the Fedora bug, by the author of the root cause.

@claudioandre-br
Copy link
Member

FWIW, after the workaround, gcc 11 produces a lot of warnings. Basically at pkcs12 and ripemd128/256.

Listing LTS or recent distros, build is clean using gcc 5.4, 7.5, 9.3, 10.2.1, and clang version: 11.0.1 (it is a short list).

@fhuberts
Copy link
Contributor Author

thanks for merging!

@fhuberts
Copy link
Contributor Author

fhuberts commented Mar 16, 2021

AFAICS the warnings in ripemd don't seem related to this fix. the ones in pkcs do seem related.
gcc 11 is a lot stricter about a lot of things.
I haven't investigated.

@solardiz
Copy link
Member

AFAICS the warnings in ripemd don't seem related to this fix. the ones in pkcs do seem related.

Can someone post those warnings in here or to a new issue, please?

@fhuberts
Copy link
Contributor Author

f34-buildlog-john.txt

@solardiz
Copy link
Member

@fhuberts Thanks! What revision from this repo does this correspond to? Your build log mentions g77aab40c4, but there's no such commit here.

@fhuberts
Copy link
Contributor Author

@fhuberts Thanks! What revision from this repo does this correspond to? Your build log mentions g77aab40c4, but there's no such commit here.

I have cloned https://github.com/openwall/john.git and that commit is there and in my tree (strip the 'g')

@magnumripper
Copy link
Member

Your build log mentions g77aab40c4, but there's no such commit here.

Oh, but there is! 77aab40

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants