Web application frameworks and the future

Toshio Kuratomi a.badger at gmail.com
Thu Jun 28 20:23:08 UTC 2012


"""
In the beginning there was cgi.  And everything was slow but simple.  And
lo, one day we began to crave faster speeds, MVC, and other features that
plain cgi did not provide.  And thus we entered the age of web
frameworks....
"""

At last week's infrastructure meeting, I brought up the fact that we seem to
have a proliferation of web application frameworks for the new apps that we
are creating.  In some ways this is good as it lets us experiment with new
technologies as a group and lets us fit the needs of a specific application
or programmer's style with the framework.  However, it has downsides as
well; mostly in the realm of ongoing maintenance of the apps.  We need to
take a moment to figure out where we want to go with this.

== Some issues ==

* Retaining group knowledge of many different application frameworks
  even when the original author stops being an active contributor
* Maintaining the packages in EPEL and Infrastructure for these
  * Maintaining some knowledge of the frameworks' code and involvement with
    their upstreams to fix bugs in the frameworks themselves.
* Deployment of multiple frameworks that may have conflicting deps.
* Deployment of multiple frameworks taking up more memory on the servers.

We think to some extent we currently have ways to manage the deployment
problems:

* Separate app servers for individual apps.  As long as we have an inflow of
  hardware resources we can continue to separate out applications onto
  different machines instead of running them all on app* as our first
  generation of apps was.  This would be an ongoing expense.  We should
  continue to allocate at least two servers to each application so that we
  can do things like reboots and updates transparently to the users.
* Openshift.  Hosting applications on a cloud service like openshift allows
  us to separate out applications and parcel out memory as a resource
  differently than if we're managing multiple apps on a single host.

While these options do change the game as far as hardware allocation is
concerned, they don't help our manpower resources.  As we spin up more hosts
for each web application, we need sysadmin time to spin those hosts up.  As
we deploy to openshift we need to figure out how we're going to integrate
configuration and deployment to those hosts into our existing puppet
configurations (I don't think that any of our current openshift deployed
services are puppet managed) and how we're going to manage load balancing
and failover.

== Where are we now? ==

.. note:: I would like this section to be an inventory of everything that
    we're deploying and writing but I don't have a complete picture.  If you
    have more things, feel free to update this on the wiki page:
    https://fedoraproject.org/wiki/Infrastructure_Services_Survey


TG1 => TurboGears1, SQLAlchemy and genshi/mako
Old TG1 => TurboGears1, SQLObject and kid
TG2 => TurboGears2
Pyramid => Current successor to TG2 but a break from the current TG1 style;
           may have a new layer built on top of it at a later date that is
           more TG-ish.
Flask => Easy to get started with and wrap your head around. Great for small
         projects.  Not a huge stack of deps.

Application        Host       Framework   Notes
-----------        ----       ---------   -----
bodhi              app*       old TG1     has a pyramid branch
bodhi              releng*    old TG1     has a pyramid branch
busmon             ?          TG2/moksha  Not yet deployed
copr(2)            ?          flask       not yet deployed. Loosely,
                                          "buildsys for fedorapeople repos"
datagrepper        ?          flask?      Not yet deployed
dataviewer         ?          flask?      Not yet deployed
dpsearch           ?          perl/C      Not yet deployed; testing on
                                          search01-dev
elections          app*       TG1         has a TG2 branch and ianweller
                                          trying a flask branch for
                                          comparison
fas                fas*       TG1
fedorabadges       ?          pyramid     Not yet deployed
fedoracommunity    app07?     TG2/moksha  Only runs on RHEL5.  We're
                                          retiring this once datanommer is
                                          deployed or once we get tired of
                                          keeping app07.  (Is the version
                                          of moksha here old as well?)
fedorahosted-reg   openshift? flask       not yet deployed
freemedia          app*       php         In Puppet. Looks like it would be
                                          very simple to port to something
                                          lightweight like Flask if we
                                          wanted to get away from PHP.
fudcon-reg         openshift  flask       registration application for
                                          fudcon.  Not currently configured
                                          in puppet, load balanced, etc.
koji               koji*      custom      was mod_python.  plans to move to
                                          mod_wsgi.  (Current status?)
mirrorlist-server  app*       custom      lightweight, mod_wsgi process.  No
                                          real framework
mirrormanager      app*       old TG1     has an older TG2 branch
packagedb          app*       TG1
packages           packages*  TG2
pager              app*, noc* CGI
raffle             app*       TG2         Disposable -- no promises to keep
                                          maintaining have been made
smolt              value*     TG1         We're planning to get rid of this
                                          in favor of census on openshift
                                          (Are we still running the process
                                          on app* even though it isn't
                                          actively serving pages?)
tagger             packages*  TG2



We deploy but do not code for:
Application        Host       Framework   Notes
-----------        ----       ---------   -----
askbot             ask*       django      Uses openid login
darkserver         darkserver django
insight            insight*   drupal/php  I'm not sure how much coding we
                                          do on this.
gitweb(-caching)   pkgs*      cgi?        thinking of replacing with cgit
                   hosted*
hg?                hosted*    cgi?
loggerhead         hosted*    mod_wsgi
mailman webui      hosted*    python cgi  mailman web frontend for
                   collab*                lists.fp.o and lists.fh.o
mediawiki          app*       php
reviewboard        hosted*    django      we've talked about moving this to
                                          openshift and/or app servers
trac               hosted*    mod_wsgi    genshi templates


Deployed but only for our sysadmins: collectd, nagios, awstats

== Some analysis ==

Right now we're deploying against the following frameworks for applications
in our critical path:

* TG1
* mod_wsgi/mod_python
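
For anyone who hasn't written directly against mod_wsgi: an app like
mirrorlist-server ("no real framework" in the table above) is essentially a
single WSGI callable.  Here is a minimal sketch of that shape.  It is not
taken from any of our apps; the paths and response text are made up for
illustration.  The one real convention is that mod_wsgi looks for a callable
named `application` by default.

```python
def application(environ, start_response):
    """Bare WSGI callable with no framework underneath it.

    The routing and response bodies here are hypothetical examples.
    """
    path = environ.get("PATH_INFO", "/")
    if path == "/":
        status, body = "200 OK", b"hello from a framework-free app\n"
    else:
        status, body = "404 Not Found", b"not found\n"

    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    # WSGI expects an iterable of byte strings.
    return [body]
```

Everything a framework normally provides (routing, templating, sessions,
database access) would have to be written by hand on top of this, which is
the tradeoff behind the "custom" entries in the inventory.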

We also have a few additional applications that are not currently critical
to creating Fedora but are value adds that we've worked hard on.  These
applications are written against:

* TG1
* TG2
* flask

The new applications that we're writing seem to be written against:

* TG2
* flask
* pyramid
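
To put rough numbers on the lists above, here is a small sketch that tallies
the Framework column of the "apps we code" table.  The grouping is my own
assumption: "old TG1" is folded into TG1, "TG2/moksha" into TG2, and the
uncertain "flask?" entries are counted as flask.  It includes apps that are
not yet deployed, and bodhi is counted once even though it runs on two sets
of hosts.

```python
from collections import Counter

# Framework column transcribed from the inventory table, one entry per app.
INVENTORY = {
    "bodhi": "old TG1",
    "busmon": "TG2/moksha",
    "copr": "flask",
    "datagrepper": "flask?",
    "dataviewer": "flask?",
    "dpsearch": "perl/C",
    "elections": "TG1",
    "fas": "TG1",
    "fedorabadges": "pyramid",
    "fedoracommunity": "TG2/moksha",
    "fedorahosted-reg": "flask",
    "freemedia": "php",
    "fudcon-reg": "flask",
    "koji": "custom",
    "mirrorlist-server": "custom",
    "mirrormanager": "old TG1",
    "packagedb": "TG1",
    "packages": "TG2",
    "pager": "CGI",
    "raffle": "TG2",
    "smolt": "TG1",
    "tagger": "TG2",
}


def family(framework):
    """Collapse a table label into a framework family (assumed grouping)."""
    label = framework.rstrip("?")          # "flask?" -> "flask"
    if label.startswith("old "):           # "old TG1" -> "TG1"
        label = label[len("old "):]
    return label.split("/moksha")[0]       # "TG2/moksha" -> "TG2"


counts = Counter(family(fw) for fw in INVENTORY.values())

if __name__ == "__main__":
    for name, total in counts.most_common():
        print(f"{name:8} {total}")
```

With the table as it stands this gives 6 apps on TG1, 5 on TG2 and 5 on
flask out of 22, with the rest spread over pyramid, php, perl/C, CGI and
custom code.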

== Some thoughts ==

=== Openshift ===

Although openshift is attractive from a hardware-provisioning perspective,
we haven't figured out how to manage configs on it for any of our currently
deployed services.  So, for instance, if there were evidence that one of our
openshift instances had been compromised we wouldn't have the benefit of
configs checked into puppet to refer to and to help us reconstruct that
instance.  We probably also don't have these hosts as part of our backups
(don't know if openshift manages backups for us).  We should figure out
disaster recovery for these hosts before we go too much further here.
We also don't currently have any openshift hosts working in a load balanced
fashion so, for instance, doing an update of an app could require user
visible downtime.

If we're going to use openshift for deploying production apps, we should
come up with answers to these problems (config management, backups and
disaster recovery, load balancing) first.

=== Getting rid of TG1 ===

At some point I want to get rid of the TG1 stack.  Upstream is in
maintenance-only mode for it and is increasingly moving to the somewhat
incompatible TG1.5.x stack for that maintenance while simultaneously
pushing people to write their apps for TG2 or pyramid.  While TG1.1 "just
works" for us right now, we're eventually going to run up against things
that upstream isn't handling (whether bugfixes in the TG1.1.x branch,
security fixes, or porting of the stack to new versions of dependent
libraries).  The maintenance burden of the TG1.1 stack is low at this time,
but it's only going to get higher.

In order to port away from the TG1 stack, I want to figure out what we
should be porting to.  Last year we thought that should be TG2 because
moksha was intrinsically linked to TG2 and we were deploying
fedoracommunity, which needed moksha.  Now, neither of those is true.
(moksha can now run on other frameworks besides TG2.  fedoracommunity is
going away in the future.)  However, there's no clear successor.

=== Plethora of frameworks ===

We're writing and deploying apps written against an ever expanding number of
frameworks.  I am a bit afraid of this.  While it is nice to know that we
have exactly the right tool for the job among the many choices of framework,
I think that maintaining apps written in a variety of frameworks is going to
cause us pain as frameworks die off or change radically and current
contributors move on to other things.  With that in mind I think we should
commit to using only a few frameworks in our coding for infrastructure.
Those frameworks would be where we concentrate our experience, what we
write new apps against, what we design our infrastructure to support, and
what we port our apps to as time goes on.

From browsing the list of frameworks we're currently deploying:

Django has a good track record of making new releases with clear porting
guides for changing your old code to run on the new versions.
However, it is conceptually something of an application server (like JBoss),
not a pure framework like TurboGears.  At the least, this would require some
thought on our part on how to deploy and code for it.

Flask seems to be lighter weight in terms of its deps and in terms of its
learning curve.  It's pretty easy to run a flask app in openshift.  If we
were to choose just two frameworks, it might make sense to choose flask as
an entry level framework for smaller applications and one other framework
with lots of bells and whistles for things that need those features.

TurboGears2 is still developed upstream.  Some of the main developers have
moved on to work on pyramid but others are continuing to work on TG2.
Upstream has committed to doing the necessary work to port TG2 to python3
but much of the TG2 underlying stack is in maintenance mode so the TG2 devs
have had to do some of that work themselves.

Pyramid is a merging of certain segments of the zope community and the
pylons community.  If pylons has a successor, this is it.  Since TG2 was
built on pylons, pyramid might be the next logical step (or a web framework
built on top of pyramid).

== Final thoughts ==

My primary goal is to decide what framework to port our old TG1 code to so
that we can stop maintaining the TG1 stack before upstream stops working on
it at all.  My secondary concern is that we stop growing the number of
other stacks that we're maintaining and concentrate on one or two, which
will make maintenance easier.  Can we choose two frameworks right now that
will suit our needs?  It seems that flask can serve a niche and maybe should
be one of them.  What should our bells and whistles framework be?  TG2 or
pyramid or something else entirely?

-Toshio