On Wed, Dec 4, 2019 at 4:58 PM Andrew Engelbrecht
<andrew(a)fsf.org> wrote:
>
> Hello Pagure devel list,
>
> Sorry if this isn't quite the right place for this question.
>
> The FSF is looking into hosting our own Web based SCM service, possibly
> using Pagure. We would like to know how the pagure.io team handles spam,
> abuse, and overly large repos on your site. We expect that this could be
> an issue, and that it would be critical for keeping our new site alive
> and well. Any input on this topic would be very helpful.
>
There are two issues with "large" Pagure systems that you might be
concerned with:
* Gitolite configuration regeneration is slow with lots of
repositories. In the default configuration, Gitolite is the backend
used for managing Git ACLs. Pagure has an alternative integrated
backend that is much more performant. This will likely be better for
heavier instances. We are exploring changing Pagure's default from
Gitolite to the integrated backend to simplify
installation/configuration and improve performance. The Pagure servers
run by Fedora (src.fedoraproject.org, git.centos.org, pagure.io) don't
use Gitolite anymore for this reason. This is configurable in
/etc/pagure/pagure.cfg (as installed by the Fedora, Mageia, or
openSUSE packages) as documented[1].
* Very large repos tend to be mostly fine in Pagure. The slowest
actions are currently listing commits and computing the various stats.
This is because that data is more-or-less processed on demand, and
really large repos like the Linux kernel tend to be a struggle to
process on-demand. Even GitHub and GitLab struggle with some of this;
the difference is that we don't have any background tasks that do this
and cache the results regularly, so it's more obvious. It does work
and doesn't cause the backend to die; it just takes longer to compute
them than it does for your average repository.
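
As a rough sketch of the backend switch described in the first point:
pagure.cfg is a plain Python file, so the change would be a single
assignment along these lines. The option name GIT_AUTH_BACKEND and the
value "pagure" are assumed here, not taken from this message; check them
against the documentation[1] for your Pagure version before relying on
this.

```python
# /etc/pagure/pagure.cfg (excerpt; a sketch, not a complete configuration)

# Assumption: GIT_AUTH_BACKEND selects the Git ACL backend, and "pagure"
# names the integrated backend that replaces Gitolite. Verify both the
# key and the value in the Pagure configuration documentation.
GIT_AUTH_BACKEND = "pagure"
```

Other settings may need to change alongside this one, so treat it as a
starting point only.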
I also found this page about the forge evaluations on LibrePlanet[2],
and I think I can provide some answers to your questions there (as I
run a few Pagure instances myself and I'm the maintainer of the pagure
package in Fedora, Mageia, and openSUSE).
> There is only a top level namespace, so there might be some conflicts in terms of who
> reserves a repo name first. With the import/export functionality, at least a fork does not
> need to be stuck under /fork.
This is actually configurable. You can set this in
/etc/pagure/pagure.cfg with the USER_NAMESPACE option as
documented[3].
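
A minimal sketch of that setting, assuming it is a boolean that is off
by default (the value shown is an assumption; see the documentation[3]
for the exact semantics):

```python
# /etc/pagure/pagure.cfg (excerpt; a sketch, not a complete configuration)

# Assumption: setting USER_NAMESPACE to True places each user's projects
# under a namespace matching their username instead of the top level.
USER_NAMESPACE = True
```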
> Very bad usability with no js from site (gnu criteria A): login, logout, create issue
> etc. Would need to investigate adding functionality without js. Could package js into an
> extension, but this is not ideal for users.
At least for login and logout, this depends on the auth backend used.
Fedora's instances all leverage OpenID Connect through Ipsilon[4], and
our theme does include JavaScript. You could just as easily support a
JS-free login with Ipsilon. Or, if you want to use Pagure's local auth
backend, there is no JavaScript in the register/login/logout flow that
I recall. As for issues/PRs/etc., it would be possible to restore
JS-free usability. There is an RFE for this[5], but no work has been
done. Pull requests to improve this are welcome!
I tested /login and /logout on all auth backends during our CSP hardening
process last July and, IIRC, openid|fas is the only auth backend that
uses JS during the login process, to hide a redirection form. It is not
actually necessary: it's just a visual thing.
>
>> Ease of deployment, upgrading, debugging
>
> The basic setup of Pagure is pretty easy. I've documented quick-start
> processes in the Fedora, Mageia, and openSUSE packages[6][7][8] that
> do reliably work. These are just simplified versions of pingou's blog
> post about setting up Pagure on a Banana Pi[9], adapted for the distro
> and the newest versions of Pagure. The pagure documentation does a
> good job of explaining deployment strategies[10] in better detail.
> Upgrade procedures are mostly just a matter of updating the files and
> restarting the services. In cases where database migrations or other
> things are required, they are documented[11].
>
>> Resource requirements / performance under load
>
> A heavily loaded system will probably want to configure celery to have
> separate queues for processing various types of tasks[12]. This will
> help mitigate issues with the user experience. In addition, you will
> want to have the Redis server on a separate server[13] and a remote
> SQL server[14] to eliminate the contention between Pagure and the
> database servers. Fedora uses a separate PostgreSQL for its instances,
> while CentOS uses a separate MariaDB, so either works. I personally
> use PostgreSQL. You may also want to consider using
> repoSpanner[15][16] to have distributed storage for Git. This is a
> somewhat extreme option, but it is an option for scaling out Pagure.
>
>> How featureful are the administration interface/tools (GUI or CLI)
>
> The pagure-admin CLI tool is the primary administrative interface, and
> provides a relatively intuitive way to do administrative actions for a
> Pagure server. I honestly don't have to use it much, but the
> admin-centric functionality is present there.
>
> Hopefully this additional information helps!
>
>
> [1]: