Should we include a section with each standard that includes basic problem resolution and troubleshooting?
I hesitate to include it because I'm not sure it's the type of thing that would ever be complete.
-Mike
Michael McGrath wrote:
Should we include a section with each standard that includes basic problem resolution and troubleshooting?
I hesitate to include it because I'm not sure it's the type of thing that would ever be complete.
I can imagine the Configuration Management HOWTO having a section for Common Problems & General Troubleshooting, if that's what you mean?
What were you thinking about?
Kind regards,
Jeroen van Meeuwen -kanarip
On Tue, 22 Jul 2008, Jeroen van Meeuwen wrote:
What were you thinking about?
Well, let's say apache is down.
1) Is apache running?
   no? run service httpd configtest
      error? Fix config
      no error? Start apache, look through logs for outage cause
   yes? Is the network running?
      yes? Is it routing?
         Yes? blah
         no? restart network / add default route
      no? start network
That's a skeleton view of it. Each step would include a test, "example.com" for example. You can see how the above could get absolutely huge.
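The skeleton above can be sketched as a small function. This is only an illustration under assumptions: the name `next_step`, its boolean parameters, and the returned strings are hypothetical, and the results of the checks (for instance the outcome of `service httpd configtest`) are passed in rather than run.

```python
# Hypothetical sketch of the skeleton decision tree. The parameter names
# and returned strings are illustrative, not part of any standard.
def next_step(apache_running, config_ok, network_running, routing_ok):
    """Return the next action for an "apache is down" report."""
    if not apache_running:
        # config_ok stands for the outcome of `service httpd configtest`.
        if not config_ok:
            return "fix config"
        return "start apache, then look through logs for outage cause"
    if not network_running:
        return "start network"
    if not routing_ok:
        return "restart network / add default route"
    # The thread leaves this branch open ("blah").
    return "continue with further checks"


if __name__ == "__main__":
    print(next_step(apache_running=False, config_ok=True,
                    network_running=True, routing_ok=True))
```

Even this toy version shows the growth problem: every extra check adds another parameter and another branch.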
-Mike
Mike McGrath wrote:
On Tue, 22 Jul 2008, Jeroen van Meeuwen wrote:
What were you thinking about?
Well, let's say apache is down.
- Is apache running?
   no? run service httpd configtest
      error? Fix config
      no error? Start apache, look through logs for outage cause
   yes? Is the network running?
      yes? Is it routing?
         Yes? blah
         no? restart network / add default route
      no? start network
That's a skeleton view of it. Each step would include a test, "example.com" for example. You can see how the above could get absolutely huge.
True. We're definitely not going to do the Dummy Guide to Troubleshooting (are we?). We can however document things as:
Problem: 503 Service Temporarily Unavailable after restart of the puppetmaster service
Solution: Restart the httpd service or wait.
I don't know how far this goes... Simple issues that might occur during the implementation (e.g. in parts described in a HOWTO, should there be any) should suffice.
Right?
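As a rough sketch of how such Problem/Solution entries could be kept in a searchable form: the `KnownIssue` class, the `KNOWN_ISSUES` list, and `find_solutions` are hypothetical names, and the only entry is the example given above.

```python
from dataclasses import dataclass


# Hypothetical record type for one documented issue.
@dataclass(frozen=True)
class KnownIssue:
    problem: str
    solution: str


# The single entry is the example from this thread; a real list would grow.
KNOWN_ISSUES = [
    KnownIssue(
        problem="503 Service Temporarily Unavailable after restart of the puppetmaster service",
        solution="Restart the httpd service or wait.",
    ),
]


def find_solutions(symptom, issues=KNOWN_ISSUES):
    """Return solutions whose problem text contains the symptom (case-insensitive)."""
    needle = symptom.lower()
    return [issue.solution for issue in issues if needle in issue.problem.lower()]
```

Calling `find_solutions("503")` would return the one documented solution; an unknown symptom returns an empty list.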
Kind regards,
Jeroen van Meeuwen -kanarip
On Tue, Jul 22, 2008 at 7:46 AM, Mike McGrath mmcgrath@redhat.com wrote:
On Tue, 22 Jul 2008, Jeroen van Meeuwen wrote:
What were you thinking about?
Well, let's say apache is down.
- Is apache running?
   no? run service httpd configtest
      error? Fix config
      no error? Start apache, look through logs for outage cause
   yes? Is the network running?
      yes? Is it routing?
         Yes? blah
         no? restart network / add default route
      no? start network
I would put this as a separate living document that ties into the main one. It would need to be living because, well, EL-2.1 problems are different from EL-5 problems, which are different from F-10 problems (why don't my changes to /etc/sysconfig/network-scripts seem to work?).
csi-devel mailing list csi-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/csi-devel
On Sat, 26 Jul 2008, Stephen John Smoogen wrote:
I would put this as a separate living document that ties into the main one. It would need to be living because, well, EL-2.1 problems are different from EL-5 problems, which are different from F-10 problems (why don't my changes to /etc/sysconfig/network-scripts seem to work?).
Yeah, seems like I'll have to think some more about where to put it and exactly what should be in there.
-Mike
Mike McGrath wrote:
On Sat, 26 Jul 2008, Stephen John Smoogen wrote:
I would put this as a separate living document that ties into the main one. It would need to be living because, well, EL-2.1 problems are different from EL-5 problems, which are different from F-10 problems (why don't my changes to /etc/sysconfig/network-scripts seem to work?).
Yeah, seems like I'll have to think some more about where to put it and exactly what should be in there.
We could distinguish between "common problems while running through this little HOWTO guide I've included in the Standard" and "(common) problems you should already be able to address", the latter being developed/maintained somewhere else, outside our Standards framework, because well... standards are standards, not dummy guides.
FWIW, I've just recently started developing course materials for students and teachers on fedorahosted[1], since I'm doing the work anyway (I teach a little Linux within the company I work for) and I really wanted it to be public, open and usable by anyone else. This is where "why don't these changes work?" could be addressed in the form of training/exercises, without cluttering up the standard, maybe?
-Jeroen
On Mon, 28 Jul 2008, Jeroen van Meeuwen wrote:
We could distinguish between "common problems while running through this little HOWTO guide I've included in the Standard" and "(common) problems you should already be able to address", the latter being developed/maintained somewhere else, outside our Standards framework, because well... standards are standards, not dummy guides.
FWIW, I've just recently started developing course materials for students and teachers on fedorahosted[1], since I'm doing the work anyway (I teach a little Linux within the company I work for) and I really wanted it to be public, open and usable by anyone else. This is where "why don't these changes work?" could be addressed in the form of training/exercises, without cluttering up the standard, maybe?
That WORKSFORME.
-Mike
On Mon, 2008-07-21 at 19:59 -0400, Michael McGrath wrote:
Should we include a section with each standard that includes basic problem resolution and troubleshooting?
I hesitate to include it because I'm not sure it's the type of thing that would ever be complete.
How about referencing a canonical place on the Wiki for each standard where people can collect that kind of thing, something like a FAQ about the standard?
David
On Sat, 26 Jul 2008, David Lutterkort wrote:
How about referencing a canonical place on the Wiki for each standard where people can collect that kind of thing, something like a FAQ about the standard?
That's not bad, and it fits the use case I had in mind, which was that most places (larger ones anyway) have a NOC as the first line of defense against outages. They'd be able to quickly make changes and update a wiki over the standard. Perhaps it'd be good to just continue focusing on the standard and keep problem resolution in the back of our minds; as we start to gain critical mass, we can start getting more people on that aspect.
-Mike