Web Page Watcher

Tony Nelson tonynelson at georgeanelson.com
Mon Oct 9 21:28:38 UTC 2006


At 11:59 AM -0700 10/9/06, Paul Lemmons wrote:

>From: Tim <ignored_mailbox at yahoo.com.au>
>Date: 10/07/2006 09:55 PM
>
>> On Fri, 2006-10-06 at 09:35 -0700, Paul Lemmons wrote:
>>> For anyone who is interested, the script will watch any number of
>>> pages and will report via email if any of them changes. It is run
>>> via cron on whatever period you wish. Daily is probably often
>>> enough. It has two modes of watching. The first simply compares the
>>> page you are seeing now with the one you saw the last time you
>>> looked. The second method only compares the links within the page.
>>
>> Have you considered just comparing HTTP headers?

>Ok, I have considered it now and it would be fairly easy to accomplish.
>I am not sure it would be valuable though. There is a significant amount
>of data in the headers that is different every time the page is called.
>Are there particular fields that are only updated when the content of
>the page has changed? Or were you looking for something else completely?

Comparing headers is easy to do in Python.  See ch. 11.3 in Mark Pilgrim's
book _Dive Into Python_ <http://diveintopython.org/toc/index.html>, which
covers the Last-Modified and ETag headers; those change only when the
server considers the page changed.  Note that the book is for "experienced
programmers"; if you aren't one, you might still benefit from its
discussion of the HTTP headers.  To learn Python, one would do well to
first read the Python Tutorial <http://docs.python.org/tut/>, which is a
classic on the order of K&R.
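
As a rough illustration of the header approach, here is a minimal sketch
of a conditional GET using only the standard library's urllib.  It is not
taken from Paul's script or from the book; the function names, the
"saved" dict, and the "opener" parameter are all made up for this
example.  It assumes the server actually sends an ETag or Last-Modified
header.  Many dynamically generated pages send neither, in which case
every fetch looks like a change and you would still need to compare the
page content.

```python
import urllib.error
import urllib.request


def build_conditional_headers(saved):
    """Turn validators saved from the previous fetch into request headers."""
    headers = {}
    if saved.get("etag"):
        headers["If-None-Match"] = saved["etag"]
    if saved.get("last_modified"):
        headers["If-Modified-Since"] = saved["last_modified"]
    return headers


def check_page(url, saved, opener=urllib.request.urlopen):
    """Return (changed, validators) for url.

    saved is the validators dict from the last run ({} on the first run).
    opener is a hypothetical hook, injectable only so the function can be
    exercised without a network connection.
    """
    request = urllib.request.Request(
        url, headers=build_conditional_headers(saved)
    )
    try:
        with opener(request) as response:
            # 200 OK: the server sent the page, so treat it as changed
            # and remember the new validators for the next run.
            return True, {
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
            }
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # 304 Not Modified: nothing new since the saved validators.
            return False, saved
        raise
```

A cron job would load the saved dict for each URL (from a pickle or a
small text file, say), call check_page(url, saved), send mail when the
first value returned is True, and write the returned validators back for
the next run.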
-- 
____________________________________________________________________
TonyN.:'    The Great Writ     <mailto:tonynelson at georgeanelson.com>
      '      is no more.             <http://www.georgeanelson.com/>



