fix a txt/dat file with soccer data using awk and sort

Jon LaBadie jonfu at jgcomp.com
Sun Oct 4 05:01:26 UTC 2015


On Sat, Oct 03, 2015 at 06:23:38PM -0800, Antonio Olivares wrote:
> Dear fedora users,
> 
> I have a file table.dat with team data, i.e. Wins, Losses, Draws, Goals For, Goals Against, Total Points, as follows:
> 
> $ cat table.dat 
> Team    W       L       D       GF      GA      DIF     PTS
> Team1   3       2       1       13      17
> Team2   2       3       1       14      13
> Team3   6       0       0       28      13
> Team4   0       6       0       5       23
> Team5   0       0       0       0       0
> $ awk '{print $1 "\t" $2 "\t" $3 "\t" $4 "\t" $5 "\t" $6 "\t" $7 "\t" $8 "\t" $5-$6, "\t" $2*3+$3*0+$4*1}' table.dat
> Team    W       L       D       GF      GA      DIF     PTS     0       0
> Team1   3       2       1       13      17                      -4      10
> Team2   2       3       1       14      13                      1       7
> Team3   6       0       0       28      13                      15      18
> Team4   0       6       0       5       23                      -18     0
> Team5   0       0       0       0       0                       0       0
> bash-4.3$
> 
> I can get the DIF (goal differential) by subtracting the 6th field from the 5th, and the points by multiplying the wins by 3, the losses by 0 and the draws by 1.  I am no expert, but instead of using a spreadsheet I would like to use awk as the example shows.  I would like the DIF to appear under DIF and the points under PTS; how can I accomplish this?  Also, if possible (and I do not see why not), how can I sort the teams so that the ones higher in the table come first?

Here is a shot at it.  I assumed the team names are longer than
shown, so I left room for up to 15 chars.  For the sorting to work
as I have it, the names cannot contain spaces.

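awk '
# Rows are piped to sort: numeric, descending, keyed on field 8 (PTS).
BEGIN { SortCmd = "sort -nr -k 8" }

# Replace the input header with an aligned one.
NR == 1 {
	printf "%-15s %5s %5s %5s %6s %6s %6s %6s\n",
	        "TEAM", "W", "L", "D", "GF", "GA", "DIF", "PTS"
}

NR > 1 {
	dif = $5 - $6		# goal difference: GF - GA
	pts = $2 * 3 + $4	# 3 points per win, 1 per draw
	printf "%-15s %5d %5d %5d %6d %6d %6d %6d\n",
		$1, $2, $3, $4, $5, $6, dif, pts | SortCmd
}
' table.dat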
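
A variation on the same idea, as a rough sketch rather than a
definitive version.  It assumes that teams level on points should be
ordered by goal difference (a common league convention, not something
stated in this thread), and it does the sorting outside awk so the
header line never passes through sort.  The function name "standings"
is made up for illustration; the sample rows are the ones from the
original message.

```shell
# Sketch only: "standings" is a hypothetical helper name.
# Reads a table like table.dat on stdin, prints it sorted by PTS, then DIF.
standings() {
    awk '
    NR == 1 {    # replace the input header with an aligned one
        printf "%-15s %5s %5s %5s %6s %6s %6s %6s\n",
               "TEAM", "W", "L", "D", "GF", "GA", "DIF", "PTS"
        next
    }
    {
        dif = $5 - $6        # goal difference: GF - GA
        pts = $2 * 3 + $4    # 3 points per win, 1 per draw
        printf "%-15s %5d %5d %5d %6d %6d %6d %6d\n",
               $1, $2, $3, $4, $5, $6, dif, pts
    }' | {
        # Pass the header through untouched, then sort only the data rows.
        IFS= read -r header
        printf "%s\n" "$header"
        # Assumed tie-break: equal PTS (field 8) are ordered by DIF (field 7).
        sort -k8,8nr -k7,7nr
    }
}

standings <<'EOF'
Team    W       L       D       GF      GA      DIF     PTS
Team1   3       2       1       13      17
Team2   2       3       1       14      13
Team3   6       0       0       28      13
Team4   0       6       0       5       23
Team5   0       0       0       0       0
EOF
```

With this data the rows should come out as Team3, Team1, Team2, Team5,
Team4; Team5 lands above Team4 only because of the assumed
goal-difference tie-break.  To run it on a real file, use
"standings < table.dat".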

> 
> For example, I would like to do something like:
> 
> http://www.premierleague.com/en-gb/matchday/league-table.html/
> 
> http://www.mlssoccer.com/standings
> 
> http://www.mediotiempo.com/tabla_general.php?id_liga=1
> 
> but only using awk/sed/sort no spreadsheet, no database only nice unix/linux/bsd tools
> 
> Also add a variation: if the teams tie in regulation, then overtime (extra time) and/or penalty kicks determine a winner.  If a team wins in overtime or penalty kicks, the winning team earns two points and the loser earns only one point.
> 
> Team    W       L       D       GF      GA      OT/PKS   DIF     PTS
> Team1   3       2       1       13      17         
> Team2   2       3       1       14      13
> Team3   6       0       0       28      13
> Team4   0       6       0       5       23
> Team5   0       0       0       0       0
> 
> Here Team1 ties with Team2; they go into overtime and remain tied.  After the overtime they go to penalty kicks.  Team2 beats Team1 in PKS and earns two points in the overall PTS.
> 
> Team1 earns 10 total pts, and Team2 should have 8 pts.  But the awk command above gives Team2 7 points because it does not take PKS into account.
> 
> awk '{print $1 "\t" $2 "\t" $3 "\t" $4 "\t" $5 "\t" $6 "\t" $7 "\t" $8 "\t" $5-$6, "\t" $2*3+$3*0+$4*1}' table.dat
> 
> How can it be done so that the table prints out correctly and, in the OT/PKS column, the team that wins gets a 1:0 and the losing team gets a 0:1?  Each time teams tie and go into OT, a running tally should be kept, e.g. 2:0 or 1:1 if they split the games.
> 
> Thank you in advance for suggestions and advice.  I am discovering that with awk one can do math on lists and tables; it is awesome.  I did not know this; until now I had just used sed -i 's|*|x|g' file to replace * with x.
> 

I'm unclear about what is needed for your second variation.
It seems that the input data should have numbers in the PKS
column indicating how many extra points they should receive.
In that case, simply adjust the terms calculating "pts"
(something like pts = $2 * 3 + $4 + $7) and the corresponding
arguments to printf.
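
A minimal sketch of that adjustment, under the assumption that input
column 7 already holds the extra points earned in OT/PKS as a plain
number (not the 1:0 style tally asked about).  The function name
"standings_ot" and the values in the OT/PKS column are invented for
illustration: Team2 gets 1 extra point for the shootout win, everyone
else 0.  Sorting on DIF when points are equal is also an assumption.

```shell
# Sketch only: "standings_ot" is a hypothetical helper name.
# Assumes input column 7 already holds the extra points earned in OT/PKS.
standings_ot() {
    awk '
    NR == 1 {    # aligned header, now with the OT/PKS column
        printf "%-15s %5s %5s %5s %6s %6s %6s %6s %6s\n",
               "TEAM", "W", "L", "D", "GF", "GA", "OT/PKS", "DIF", "PTS"
        next
    }
    {
        dif = $5 - $6
        pts = $2 * 3 + $4 + $7    # 3 per win, 1 per draw, plus OT/PKS bonus
        printf "%-15s %5d %5d %5d %6d %6d %6d %6d %6d\n",
               $1, $2, $3, $4, $5, $6, $7, dif, pts
    }' | {
        # Keep the header on top; sort only the data rows.
        IFS= read -r header
        printf "%s\n" "$header"
        sort -k9,9nr -k8,8nr    # PTS is now field 9, DIF field 8
    }
}

standings_ot <<'EOF'
Team    W  L  D  GF  GA  OT/PKS
Team1   3  2  1  13  17  0
Team2   2  3  1  14  13  1
Team3   6  0  0  28  13  0
Team4   0  6  0   5  23  0
Team5   0  0  0   0   0  0
EOF
```

With these made-up bonus values Team1 stays on 10 points and Team2
moves from 7 to 8, which matches the totals given in the question.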

-- 
Jon H. LaBadie                 jonfu at jgcomp.com

