Alright team, now that we're able to vacuum all of our Postgres tables
no matter how big, it's time for us to set up a plan for vacuuming all of
our tables on a regular basis. That way, when we do get a chance to
perform a VACUUM FULL on the tables, they won't fill back up with dead
tuples and allocated free space.
For those who have sudo access on db2, I've written a little program,
vacstat.py, that can help analyze our needs. The first command that
people should know about is:
sudo -u postgres vacstat.py schema
When run in this mode, vacstat will attempt to get a list of all
databases and the tables in those databases. It will compare those
lists against the copy from a previous run. If they are the same, then
vacstat will be happy. If they differ, vacstat will save the new data
in a file and print a message telling you to set up a vacuum policy for
the new databases or tables.
I've set up vacstat to run from cron on db2 in this mode. If we get
email from vacstat telling us that there's a new database or table,
we'll need to make sure to enter those databases or tables in our vacuum
script and then follow the directions to let vacstat know we're aware of
the new tables.
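For anyone curious, the schema check boils down to comparing a snapshot
of databases-and-their-tables against a pickled copy saved by the last
run. Here's a minimal sketch of that logic; the state-file path,
function name, and snapshot format are my illustration, not vacstat's
actual internals:

```python
import pickle
from pathlib import Path

STATE_FILE = Path("/var/lib/vacstat/schema.pickle")  # illustrative path

def schema_changed(current, state_file=STATE_FILE):
    """Compare the current {database: set_of_tables} snapshot against
    the saved copy.  Returns the set of new (database, table) pairs,
    and saves the new snapshot whenever anything changed."""
    try:
        previous = pickle.loads(state_file.read_bytes())
    except FileNotFoundError:
        previous = {}  # first run: everything counts as new
    new = {(db, tbl)
           for db, tables in current.items()
           for tbl in tables
           if tbl not in previous.get(db, set())}
    if current != previous:
        state_file.write_bytes(pickle.dumps(current))
    return new
```

The real script presumably builds the snapshot by querying pg_database
and then the tables in each database; the important part is that any
change to the snapshot is what triggers the "set up a vacuum policy"
mail from cron.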
The next major mode is:
sudo -u postgres vacstat.py stattuple-start --database DBNAME
In this mode, vacstat will vacuum the database and then take samples of
how dirty the tables have gotten over the course of a day. vacstat is
currently set to take a sample immediately after vacuuming, after one
hour, after six hours, and after a day. The information is recorded
into a pickle file under /var/lib/vacstat on db2. Once we have that
information, we can use it to see how quickly dead tuples accumulate
in each table, and from that come up with a plan for how frequently to
vacuum on a table-by-table basis.
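Once the samples are collected, turning them into a per-table schedule
is mostly arithmetic. Assuming each sample is a (hours since vacuum,
percent dead tuples) pair, which is roughly the shape of what
pgstattuple's dead_tuple_percent gives you, a sketch of deriving a
vacuum interval might look like this (the function and threshold are
my example, not something vacstat does today):

```python
def recommend_interval(samples, threshold_pct=20.0):
    """Given (hours_since_vacuum, dead_tuple_pct) samples for one
    table, fit a linear growth rate through the origin and suggest
    how many hours can pass before the table exceeds threshold_pct
    dead tuples.  Returns None when no growth is measurable."""
    # least-squares slope through the origin: pct ~= rate * hours
    num = sum(h * p for h, p in samples)
    den = sum(h * h for h, _ in samples)
    if den == 0 or num <= 0:
        return None  # table stays clean; schedule can be lax
    rate = num / den  # percent dead tuples per hour
    return threshold_pct / rate
```

A table that dirties faster gets a shorter recommended interval, which
is exactly the table-by-table plan described above.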
I'm currently testing the latter functionality with a run on some of our
smaller databases. Once that's working, I'll collect stats for all of
our databases.