Hi Jon,

First of all, thanks for doing this.

Now on to some picky feedback:
- The only current difference between the ARMV6 and the ARMV7 versions is the use of cp15
  or DMB in the standalone barrier functions. Could this not be done through some sort of
  macro instead?
- If we're playing around in here, might as well add support for (32-bit) ARMv8?
- Can you not set ompi_cv_asm_arch to the target file name rather than copying files around?

Together, this could end up with a config patchset something like (freehanded, not tested):
---
        armv8*|armv7*)
            ompi_cv_asm_arch="ARM"
            OPAL_ASM_SUPPORT_64BIT=1
            OPAL_ASM_ARM_VERSION=7
            AC_DEFINE_UNQUOTED([OPAL_ASM_ARM_VERSION], [$OPAL_ASM_ARM_VERSION],
                               [What ARM assembly version to use])
            OMPI_GCC_INLINE_ASSIGN='"mov %0, #0" : "=&r"(ret)'
            ;;

        armv6*)
            ompi_cv_asm_arch="ARM"
            OPAL_ASM_SUPPORT_64BIT=0
            OPAL_ASM_ARM_VERSION=6
            AC_DEFINE_UNQUOTED([OPAL_ASM_ARM_VERSION], [$OPAL_ASM_ARM_VERSION],
                               [What ARM assembly version to use])
            OMPI_GCC_INLINE_ASSIGN='"mov %0, #0" : "=&r"(ret)'
            ;;

        armv5*linux*|armv4*linux*)
            # uses Linux kernel helpers for some atomic operations
            ompi_cv_asm_arch="ARMV5"
            OPAL_ASM_SUPPORT_64BIT=0
            OPAL_ASM_ARM_VERSION=5
            AC_DEFINE_UNQUOTED([OPAL_ASM_ARM_VERSION], [$OPAL_ASM_ARM_VERSION],
                               [What ARM assembly version to use])
            OMPI_GCC_INLINE_ASSIGN='"mov %0, #0" : "=&r"(ret)'
            ;;
---
whilst containing one fewer source file and not doing any copying as part of the configure step.

I realise this would require at least touching generate-asm.pl, but it might be possible
to get away with mostly gas macros and the end result would be a lot neater.
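
To illustrate the barrier point, here is a rough sketch of the selection done with the C preprocessor rather than gas macros (again freehanded). It assumes configure provides OPAL_ASM_ARM_VERSION as above; ARM_MB_INSN, arm_mb_insn() and arm_mb() are made-up names for illustration only, and the fallback for non-ARM hosts is just there so the snippet builds anywhere:

```c
/* Assumption: configure defines OPAL_ASM_ARM_VERSION (6 or 7).
 * Default to 7 here so the sketch compiles on its own. */
#ifndef OPAL_ASM_ARM_VERSION
#define OPAL_ASM_ARM_VERSION 7
#endif

#if OPAL_ASM_ARM_VERSION >= 7
/* ARMv7 and later: dedicated data memory barrier instruction */
#define ARM_MB_INSN "dmb"
#else
/* ARMv6: data memory barrier through a CP15 write (c7, c10, 5) */
#define ARM_MB_INSN "mcr p15, 0, %0, c7, c10, 5"
#endif

/* Hypothetical helper: returns the instruction text chosen at build time. */
static inline const char *arm_mb_insn(void)
{
    return ARM_MB_INSN;
}

/* Hypothetical full memory barrier built from the selected instruction. */
static inline void arm_mb(void)
{
#if defined(__arm__)
#if OPAL_ASM_ARM_VERSION >= 7
    __asm__ __volatile__ (ARM_MB_INSN : : : "memory");
#else
    /* the CP15 form needs a register operand, which should be zero */
    __asm__ __volatile__ (ARM_MB_INSN : : "r" (0) : "memory");
#endif
#else
    /* not building for ARM: compiler barrier only */
    __asm__ __volatile__ ("" : : : "memory");
#endif
}
```

The same single switch on the version number should carry over to the standalone .asm barrier functions, which is what would let the ARMV6 and ARMV7 files collapse into one.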

/
    Leif