Hi,
I was trying to bump the espresso package to v4.1.1 and came across multiple issues in the %check phase of the builds on Rawhide, F31 and EPEL8.
On Rawhide and F31, some tests fail during MPI_Init with the error:
*** An error occurred in MPI_Init
*** on a NULL communicator
This happened with OpenMPI, but not with MPICH. I was able to work around these failures by retriggering the builds a couple of times, which isn't a great solution.
Upstream issues on that topic:
Rawhide: https://github.com/espressomd/espresso/issues/3306
F31: https://github.com/espressomd/espresso/issues/3305
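As a minimal sketch of the retry workaround described above, the flaky test step could be retried in place rather than retriggering the whole build. Nothing below comes from the espresso spec file: the helper name run_with_retries, its parameters, and the ctest invocation in the comment are assumptions made purely for illustration.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, attempts=3, delay=0.0):
    """Run cmd until it exits with 0 or the attempts are used up.

    Hypothetical helper: returns the exit code of the last attempt.
    """
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            break
        print(f"attempt {attempt}/{attempts} failed with exit code {returncode}",
              file=sys.stderr)
        if attempt < attempts:
            time.sleep(delay)
    return returncode


# Assumed usage inside a test driver (the command is only an example):
#   run_with_retries(["ctest", "--output-on-failure"], attempts=3, delay=5.0)
```

This only hides intermittent failures; it does not address whatever makes MPI_Init fail, so it is at best a stopgap while the upstream issues are open.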
On EPEL8, however, things are much worse: multiple tests fail in MPI_Init on each arch, including a couple of segfaults: https://github.com/espressomd/espresso/issues/3307
I wasn't able to reproduce these issues in a CentOS 8 container in Docker.
Any ideas and help would be appreciated.
Unrelated, but still annoying: on s390x (F31) I saw a "Fatal Python error: init_sys_streams: can't initialize sys standard streams", which was triggered by setting the seemingly unrelated CMAKE_SKIP_RPATH=ON option. Any suggestions on that would be more than welcome.
Upstream issue: https://github.com/espressomd/espresso/issues/3316
Christoph
On 11/15/19 3:07 PM, Christoph Junghans wrote:
> Hi,
> I was trying to bump the espresso package to v4.1.1 and came across multiple issues in the %check phase of the builds on Rawhide, F31 and EPEL8.
> On Rawhide and F31, some tests fail during MPI_Init with the error:
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> This happened with OpenMPI, but not with MPICH. I was able to work around these failures by retriggering the builds a couple of times, which isn't a great solution.
> Upstream issues on that topic:
> Rawhide: https://github.com/espressomd/espresso/issues/3306
> F31: https://github.com/espressomd/espresso/issues/3305

I would suggest bringing this up with the OpenMPI folks. They are very helpful.

> On EPEL8, however, things are much worse: multiple tests fail in MPI_Init on each arch, including a couple of segfaults: https://github.com/espressomd/espresso/issues/3307
> I wasn't able to reproduce these issues in a CentOS 8 container in Docker.

I commented on the issue as well, but in short, there are issues with UCX and OpenMPI in RHEL 8.1.