On (11/08/15 15:04), Nikolai Kondrashov wrote:
Hi Pavel,
It's a good idea to have such a test, thank you! Please see my comments
regarding the script (as you requested) below.
On 08/11/2015 01:46 PM, Pavel Reichl wrote:
>diff --git a/src/tests/whitespace_test.sh b/src/tests/whitespace_test.sh
>new file mode 100755
We don't really need an extension in an executable file name. It will make it
that bit difficult to change the implementation language if we ever need that.
We don't put ".elf" on binary files, after all.
It's not a big deal, of course, just ranting a little.
>index
0000000000000000000000000000000000000000..4f34bdf981996440b89a3e7e5f99ad3b6b3eae34
>--- /dev/null
>+++ b/src/tests/whitespace_test.sh
>@@ -0,0 +1,13 @@
>+#!/bin/sh
Did you intend /bin/sh here? Most people expect Bash to run their scripts so
it might be safer to just specify it explicitly if you don't really need to
support the vanilla shell. It might prevent some unexpected failures on
systems where /bin/sh isn't Bash (e.g. Debian with Dash installed) in the
future, even though it works fine now.
>+
>+# example: EXCLUDE_FILES="path1\|path2"
>+EXCLUDE_FILES="po/ca.po"
The name of the variable is somewhat confusing: it's not just files, it's a
regex, so there's more syntax, and care must be taken to escape special
characters. How about "PATH_EXCLUDE_REGEX", or similar? E.g. the regex above
will also match "po/catpoo".
>+N=$(git grep -n -I -P '^.*[ ]+$' -- $(git rev-parse --show-toplevel) | grep
-v $EXCLUDE_FILES | wc -l)
The "^.*" combination in the regex is a noop and can be removed for
readability. There is no need to use '[]' around the space. Matching just
space would skip trailing tabs, the general whitespace pattern '\s' might be
better here. So the pattern can become '\s+$'.
We don't support running under a path with whitespace, but it's still better
to quote the "git rev-parse" command substitution. I.e. write this instead:
"$(git rev-parse --show-toplevel)"
The grep pattern is not anchored to anything, which means that we're likely to
have unintended matches. E.g. a Makefile.am line with trailing whitespace,
mentioning matching file will get ignored. You can anchor it like this:
grep -v "^\\($EXCLUDE_FILES\\):"
>+
>+if [ "$N" -ne 0 ]; then
>+ echo "Trailing whitespace found:"
>+ git grep -n -I -P '^.*[ ]+$' -- $(git rev-parse --show-toplevel) | grep
-v $EXCLUDE_FILES
>+ exit 1
>+fi
>+
>+exit 0
Nobody likes duplicated code, and in this case there are several ways to avoid
that. I like the one using awk, which combined with the above comments can
look like this:
#!/bin/bash
# An AWK regex matching tracked file paths to be excluded from the search.
# Example: '.*\.po|README'
PATH_EXCLUDE_REGEX='.*\.po'
git grep -n -I -P '\s+$' -- "$(git rev-parse --show-toplevel)" |
awk -- "
BEGIN {
found = 0
}
! /^($PATH_EXCLUDE_REGEX):/ {
if (!found) {
print \"Trailing whitespace found:\"
found = 1
}
print
}
END {
exit found
}
"
Note that you don't need the "exit" command or checking the exit status
after
this command, if it is the last one in the script. Its exit status will be
used as the script's exit status.
BTW, shouldn't we take care to not use git, or to not run this test in case
we're running outside git tree, e.g. from unpacked tarball?
+1
LS