[fedmsg] Proposal on a replay mechanism

Ralph Bean rbean at redhat.com
Tue Jul 9 01:08:24 UTC 2013


On Mon, Jul 08, 2013 at 03:39:56PM +0200, Simon Chopin wrote:
> Hi,
> 
> As some of you might know, I am the student working on adapting fedmsg
> for Debian as part of the Google Summer of Code program.
> 
> One of the requirements for fedmsg to be part of Debian infrastructure
> is to be resilient in case a network link drops, as we have services
> dispatched all over the world. Currently, if a client drops out, it has
> no way of catching up on what happened when it was offline.
> 
> To solve this, I was thinking of the following: all the endpoints that
> must be able to replay some messages should provide two URLs, say
> tcp://foo.bar:3000 and tcp+pair://foo.bar:3001, the latter listening
> for PAIR-type[1] connections. The clients on the simple URL are like the
> current clients, but the PAIR socket allows the other clients to request
> the missing messages.
> 
> The query would come on the $prefix.replay.$topic topic (say,
> org.fedoraproject.dev.replay.buildsys.build.state.change), and specify
> the IDs to resend, or a time interval (for manual queries), and the
> answer(s) would come on the same topic.
> 
> To be able to detect a missing message, the "i" field would have to be
> topic-bound instead of being at the endpoint level.
> 
> Thoughts?
> 
> Cheers,
> Simon
> 
> [1] https://learning-0mq-with-pyzmq.readthedocs.org/en/latest/pyzmq/patterns/pair.html

Hi Simon, thanks for taking this up.

I like the idea of the special replay topic.  That makes for a pretty
clean API for requesting replay of messages.  FWIW, a patch was just
introduced in git that adds a "uuid" field to every message in
addition to "i".  That could be used to request specific messages.
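
To make the gap-detection part concrete, here is a rough sketch of what
a client could do if "i" were counted per topic.  Everything in it is
hypothetical: the GapDetector class, the build_replay_request helper and
the "seq_ids" key are names made up for illustration, not anything that
exists in fedmsg today.

```python
class GapDetector:
    """Remember the last per-topic sequence number and report gaps."""

    def __init__(self):
        self.last_seen = {}  # topic -> highest "i" value seen so far

    def observe(self, topic, seq):
        """Return the sequence numbers skipped just before `seq`."""
        previous = self.last_seen.get(topic)
        if previous is not None and seq <= previous:
            return []  # duplicate or late message; never move backwards
        self.last_seen[topic] = seq
        if previous is None:
            return []  # first message on this topic, nothing to compare
        return list(range(previous + 1, seq))


def build_replay_request(prefix, topic, missing):
    """Build a request on the proposed $prefix.replay.$topic topic.

    The message layout (a "seq_ids" list) is an assumption.
    """
    return {
        "topic": "%s.replay.%s" % (prefix, topic),
        "msg": {"seq_ids": missing},
    }
```

A consumer would call observe() on every message it receives and send a
replay request whenever the returned list is non-empty.  The same request
could carry "uuid" values instead of sequence numbers.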

One problem I see is in the implementation details.  How long is an
endpoint expected to hold on to its old messages before discarding
them?  Currently, an application that gets a fedmsg hook added doesn't
retain much extra state as a result.  This replay-request proposal, by
contrast, would require a bookkeeping mechanism to be added to every
endpoint (in our case, every mod_wsgi/httpd process, among others).
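
For a sense of the state involved, the simplest bookkeeping I can think
of is a bounded buffer that drops the oldest messages first.  This is
only a sketch under assumptions: ReplayBuffer and max_messages are
invented names, and it assumes every message carries the new "uuid"
field.

```python
from collections import OrderedDict


class ReplayBuffer:
    """Keep the most recent messages so they can be resent on request."""

    def __init__(self, max_messages=1000):
        self.max_messages = max_messages
        self._by_uuid = OrderedDict()  # insertion order == publish order

    def remember(self, message):
        """Store a published message, evicting the oldest if full."""
        self._by_uuid[message["uuid"]] = message
        while len(self._by_uuid) > self.max_messages:
            self._by_uuid.popitem(last=False)  # drop the oldest entry

    def lookup(self, uuids):
        """Return the requested messages that are still retained."""
        return [self._by_uuid[u] for u in uuids if u in self._by_uuid]
```

Note that a buffer like this lives in memory, so it is lost when the
process restarts, and each process would hold its own copy.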

Have you considered using the datagrepper API
(https://apps.fedoraproject.org/datagrepper) as a replay mechanism?
Although we haven't implemented this in practice yet, I have been
anticipating using it more in the future.  I.e., if a consumer crashes
and comes back online, it could request a list of every message
published during that timespan from the central store.
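
As a sketch of the catch-up query, a consumer could build a request for
the window it was offline.  I am assuming here that the query goes to a
"raw" endpoint under the URL above and that it takes "start", "end" and
"topic" parameters (timestamps in seconds since the epoch); please check
those names against the datagrepper documentation.  catch_up_url is a
made-up helper name.

```python
from urllib.parse import urlencode

# Assumed endpoint: "raw" under the datagrepper base URL.
DATAGREPPER_RAW = "https://apps.fedoraproject.org/datagrepper/raw"


def catch_up_url(went_down, came_back, topic=None):
    """Build a query URL for messages published while we were offline.

    `went_down` and `came_back` are Unix timestamps.  The parameter
    names ("start", "end", "topic") are assumptions about the API.
    """
    params = [("start", int(went_down)), ("end", int(came_back))]
    if topic is not None:
        params.append(("topic", topic))
    return DATAGREPPER_RAW + "?" + urlencode(params)
```

The consumer would fetch that URL on startup and feed the returned
messages through its normal handler before resuming live consumption.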

Cheers-
 -Ralph

