Koji host selection/load balancing

Anthony Towns atowns at redhat.com
Tue Jun 7 08:05:30 UTC 2011


Hi,

The host selection algorithm in koji/daemon.py (TaskManager.checkRelAvail) is designed to choose a host that's in the top half of hosts by available capacity. I don't think it quite works as might be expected, though. Take the case where you have five hosts (A to E), all with a capacity of 3.0, and every job uses 1.0. Each line below shows the remaining capacities, sorted, after the job is taken; a host with nothing left drops off the list. You could have the following behaviour:

  A takes job, capacities: [3.0, 3.0, 3.0, 3.0, 2.0] (median = 3.0)
  B takes job, capacities: [3.0, 3.0, 3.0, 2.0, 2.0] (median = 3.0)
  C takes job, capacities: [3.0, 3.0, 2.0, 2.0, 2.0] (median = 2.0)
  A takes job, capacities: [3.0, 3.0, 2.0, 2.0, 1.0] (median = 2.0)
  B takes job, capacities: [3.0, 3.0, 2.0, 1.0, 1.0] (median = 2.0)
  C takes job, capacities: [3.0, 3.0, 1.0, 1.0, 1.0] (median = 1.0)
  A takes job, capacities: [3.0, 3.0, 1.0, 1.0] (median = 3.0)
  D takes job, capacities: [3.0, 2.0, 1.0, 1.0] (median = 2.0)
  D takes job, capacities: [3.0, 1.0, 1.0, 1.0] (median = 1.0)
  B takes job, capacities: [3.0, 1.0, 1.0] (median = 1.0)
  C takes job, capacities: [3.0, 1.0] (median = 3.0)

So eleven jobs get divided up as:

  3 on A
  3 on B
  3 on C
  2 on D
  0 on E
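
For anyone who wants to check the trace above, here is a rough replay in Python. It is only a sketch, not the koji code: the names (median_avail, may_take, replay) are made up for illustration, and it assumes every job has weight 1.0, that hosts with nothing left are ignored, and that "median" means element (n - 1) // 2 of the capacities sorted in descending order, which is the convention that reproduces the medians shown above.

```python
# Hypothetical replay of the trace above; NOT the actual koji code.
# Assumptions: each job has weight 1.0, hosts with no capacity left are
# ignored, and the median is element (n - 1) // 2 of the descending sort.

def median_avail(avail):
    """Median of the remaining capacities, using the convention above."""
    ranked = sorted((a for a in avail.values() if a > 0), reverse=True)
    return ranked[(len(ranked) - 1) // 2]

def may_take(host, avail):
    """The '>= median' test: accept if this host is no worse than the median."""
    return avail[host] > 0 and avail[host] >= median_avail(avail)

def replay(order, capacity=3.0, hosts="ABCDE"):
    """Let hosts ask for a job in the given order; count the jobs each one takes."""
    avail = {h: capacity for h in hosts}
    taken = {h: 0 for h in hosts}
    for host in order:
        if may_take(host, avail):
            avail[host] -= 1.0
            taken[host] += 1
    return taken

# The order in which hosts asked for work in the trace above.
jobs = replay("ABCABCADDBC")
print(jobs)
```

Every one of the eleven requests passes the test, so E stays idle while A, B and C fill up.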

This is because the >=median test doesn't actually ensure this host is in the top half, just that there's a host in the top half that's no better than this one. When several hosts tie at the median, all of them pass, even while less loaded hosts are available. I think a better test would be "==best or >median", in which case you'd get:

  A takes job, capacities: [3.0, 3.0, 3.0, 3.0, 2.0]
  B takes job, capacities: [3.0, 3.0, 3.0, 2.0, 2.0]
  C takes job, capacities: [3.0, 3.0, 2.0, 2.0, 2.0]
  D takes job, capacities: [3.0, 2.0, 2.0, 2.0, 2.0]
  E takes job, capacities: [2.0, 2.0, 2.0, 2.0, 2.0]
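
The same kind of sketch for the "==best or >median" test is below. Again this is illustrative only and is not the attached patch: the function names are hypothetical, jobs are assumed to weigh 1.0, and the median convention is the same assumed one (element (n - 1) // 2 of the descending sort).

```python
# Hypothetical sketch of the proposed test; NOT the attached patch.
# Assumptions: each job has weight 1.0, hosts with no capacity left are
# ignored, and the median is element (n - 1) // 2 of the descending sort.

def may_take(host, avail):
    """Accept only a host that ties for best, or is strictly above the median."""
    if avail[host] <= 0:
        return False
    ranked = sorted((a for a in avail.values() if a > 0), reverse=True)
    best = ranked[0]
    median = ranked[(len(ranked) - 1) // 2]
    return avail[host] == best or avail[host] > median

def replay(order, capacity=3.0, hosts="ABCDE"):
    """Let hosts ask for a job in the given order; return (jobs taken, capacity left)."""
    avail = {h: capacity for h in hosts}
    taken = {h: 0 for h in hosts}
    for host in order:
        if may_take(host, avail):
            avail[host] -= 1.0
            taken[host] += 1
    return taken, avail

# Each host asks once: all five requests are accepted.
taken_once, left_once = replay("ABCDE")

# A, B and C ask twice in a row: their second requests are refused,
# because D and E still have more capacity available.
taken_greedy, left_greedy = replay("ABCABC")
print(taken_once, taken_greedy)
```

With this test a host that has fallen to the median has to wait until the less loaded hosts have caught up.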

Patch attached for consideration.

Cheers,
aj

-- 
Anthony Towns <atowns at redhat.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: checkRelAvail-patch
Type: text/x-patch
Size: 1338 bytes
Desc: not available
Url : http://lists.fedoraproject.org/pipermail/buildsys/attachments/20110607/c761d1dc/attachment.bin 

