I am having this exact problem with CentOS and have been banging my head against it on and off for months.
I was able to mitigate it somewhat by doing something similar to what Moez suggested:
- In CentOS, a tmpfs is a RAM-backed filesystem and is easy to mount with the right options
- On boot, my CentOS livecd creates the tmpfs and moves all of "/root"'s files over to it
- Then it sets the root user's home directory to the tmpfs mount, so operations run out of it instead.
This has reduced filesystem thrash on the overlay you mentioned above, and it extends the life of the system, but it doesn't fix the problem completely. It's kind of a band-aid.
You can track when the livecd is going to explode by running "dmsetup status" and watching the far-right "number / number" on the live-rw line. When the smaller number meets the larger number, your filesystem remounts as read-only.
I've been trying to figure out what these numbers represent, and posted this Superuser question, but no one seems to know.
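
If you'd rather not eyeball the output, here is a rough sketch of how the two numbers could be pulled out of a "dmsetup status" line and turned into a percentage. It is only an illustration: the function names are made up, the sample line in the comment is invented rather than copied from a real system, and it simply assumes the last "number/number" field on the line is the pair described above.

```python
import re
import subprocess

# Matches a whole "number/number" field, e.g. "42520/1048576".
PAIR = re.compile(r"^(\d+)/(\d+)$")


def parse_usage(status_line):
    """Return (smaller, larger) from the last number/number field, or None.

    Works on a line such as this invented sample:
        live-rw: 0 8388608 snapshot 42520/1048576
    """
    for field in reversed(status_line.split()):
        match = PAIR.match(field)
        if match:
            return int(match.group(1)), int(match.group(2))
    return None


def percent_used(status_line):
    """Return how far the first number is toward the second, in percent."""
    pair = parse_usage(status_line)
    if pair is None or pair[1] == 0:
        return None
    return 100.0 * pair[0] / pair[1]


def live_rw_status(device="live-rw"):
    """Run "dmsetup status <device>" (needs root) and return its output."""
    result = subprocess.run(
        ["dmsetup", "status", device],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    line = live_rw_status()
    pct = percent_used(line)
    if pct is None:
        print("no number/number field found in:", line)
    else:
        print("live-rw is at %.1f%% of its limit" % pct)
```

Run it as root from cron or a loop; once the percentage gets close to 100 you are about to hit the read-only remount.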