Well, regarding the "based on something", you can hand off a list of packages to
createrepo_c with --pkglist, and avoid the need to download files with --update +
--skip-stat. Unfortunately that doesn't help you with the package file management.
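
For illustration, here's a rough sketch of what that invocation might look like when driven from a script. The repo path, the list file name and the package names are made up, and the helper functions are hypothetical; only the createrepo_c flags come from the comment above. It just builds the command line and prints it rather than running anything.

```python
import shlex


def render_pkglist(packages):
    """Render the --pkglist file body: one package path per line, no globs."""
    return "".join(f"{name}\n" for name in packages)


def build_createrepo_command(repo_dir, pkglist_path):
    """Return the createrepo_c argv as a list (nothing is executed here)."""
    return [
        "createrepo_c",
        "--update",     # reuse entries from the existing repodata
        "--skip-stat",  # with --update, skip stat() and trust matching filenames
        "--pkglist", pkglist_path,
        repo_dir,
    ]


# Hypothetical package names and paths, for illustration only.
pkglist_text = render_pkglist([
    "Packages/foo-1.0-1.x86_64.rpm",
    "Packages/bar-2.3-4.noarch.rpm",
])
command = build_createrepo_command("/srv/repo", "/srv/repo/pkglist.txt")
print(shlex.join(command))
```

You'd write pkglist_text to the list file and hand the command to subprocess.run yourself. The list still has to come from somewhere, which is the file management part this doesn't solve.
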
In a vacuum, --baseurl would help here because you could have one root directory. In
reality, however, it breaks repository mirroring, because any mirror would be telling
clients to fetch the packages from the source of truth.
I'm not 100% sure how --basedir works; the description is a bit vague.
Another option is to use something like Pulp, which stores all the information required for
metadata generation inside PostgreSQL and thus can do so without ever touching the
packages / headers again. That approach isn't necessarily free of downsides either,
but it does abstract the whole file management problem.