Hi Lukas,
I'll log a case for this issue. May I ask when the fix will be ready? BTW, I've also tested the Script Server (plugin) resource type to monitor mongoDB, and it doesn't work at all. Do you have any idea why? I'll start a separate thread to seek advice on that. Please post your feedback/comments as well.
Thank you.
Regards, Charles Leow
-----Original Message-----
From: rhq-users-bounces@lists.fedorahosted.org [mailto:rhq-users-bounces@lists.fedorahosted.org] On Behalf Of Lukas Krejci
Sent: Tuesday, July 03, 2012 10:41 PM
To: rhq-users@lists.fedorahosted.org
Subject: Re: Availability scan for Process Scan
I think the issue might lie in the fact that the ProcessInfo class that the ProcessComponent relies on when gathering info about the process is not used properly.
Some time ago I was dealing with a similar problem in the Apache plugin that needs accurate process info at any given time and it, too, was not getting accurate info when apache was restarted. Unlike the ProcessComponent, though, it was getting the info from ResourceContext.getNativeProcess() call.
I fixed that in the ResourceContext class by issuing ProcessInfo.refresh() before we check for ProcessInfo.isRunning() - because isRunning() might return stale data.
However, I only fixed that in the ResourceContext class, which the ProcessComponent (i.e. the component responsible for handling the Process resource type) doesn't use to determine the process state.
So I think we need to fix the ProcessComponent in the same way I fixed ResourceContext.getNativeProcess().
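To illustrate the stale-data pattern described above, here is a minimal sketch in Java. It does not use the real RHQ classes: CachedProcessInfo, staleCheck and freshCheck are hypothetical stand-ins, and only the refresh()-before-isRunning() ordering is taken from the description above. The actual ProcessComponent fix may look different.

```java
import java.util.function.BooleanSupplier;

public class RefreshBeforeCheck {

    /** Hypothetical stand-in for a process snapshot that caches its state. */
    static class CachedProcessInfo {
        private final BooleanSupplier probe; // asks the "system" whether the process is alive
        private boolean running;             // cached answer, may become stale

        CachedProcessInfo(BooleanSupplier probe) {
            this.probe = probe;
            this.running = probe.getAsBoolean(); // snapshot taken at construction
        }

        /** Re-reads the process state from the system. */
        void refresh() {
            running = probe.getAsBoolean();
        }

        /** Returns the cached state without touching the system. */
        boolean isRunning() {
            return running;
        }
    }

    /** Availability check that trusts the cached snapshot (the problematic pattern). */
    static boolean staleCheck(CachedProcessInfo info) {
        return info.isRunning();
    }

    /** Availability check that refreshes first (the pattern suggested above). */
    static boolean freshCheck(CachedProcessInfo info) {
        info.refresh();
        return info.isRunning();
    }

    public static void main(String[] args) {
        boolean[] alive = { true };                       // simulated process state
        CachedProcessInfo info = new CachedProcessInfo(() -> alive[0]);

        alive[0] = false;                                 // the process goes down

        System.out.println("stale check: " + staleCheck(info)); // prints "stale check: true"
        System.out.println("fresh check: " + freshCheck(info)); // prints "fresh check: false"
    }
}
```

In this sketch the stale check keeps reporting the process as up after it has gone down, which matches the symptom of availability not being reflected until something else happens to refresh the snapshot.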
Charles, would you mind creating a bugzilla for this so that we can track it?
https://bugzilla.redhat.com/enter_bug.cgi?product=RHQ%20Project
Thanks,
Lukas
On Tuesday, July 03, 2012 08:48:33 John Mazzitelli wrote:
What version of RHQ? (RHQ 4.4 introduced a lot of changes to the availability scanning stuff.)
After you get the initial availability scan, does it update fairly quickly thereafter? If so, could it be a startup issue? Maybe it took a very long time for your agents to start, register, download plugins, start the plugin container and begin avail scanning. If the boxes are heavily loaded, perhaps it takes a long time? I realize 30m would be extremely long (and I can't say I've ever heard of the agent taking 30m to do all that), but the question remains: does this only happen on startup of the agent, or does it take 30m to report ANY availability change while the agent is running?
I haven't looked at the Process resource type and its resource component code in a while. Look in the agent logs and see if there are any log messages regarding errors happening with that plugin (you should run the agent in debug mode to get more verbose output).
What about everything else about the agent? Is it working OK (all other resources respond quickly? All avail statuses and metrics coming in OK?)
----- Original Message -----
We’ve set up three nodes with RHEL5.3 and the same hardware specifications: one for the RHQ server and the remaining two for RHQ agents to monitor mongoDB.
mongoDB is monitored using the Process resource type with the Pid File and PIQL query types. Each query type is set up on a different node. However, we found that the availability scan for both agents does not respond in a decent time. In fact, it sometimes takes more than 30 minutes to reflect the actual availability of mongoDB. We’re using the default configuration in the agent. The value for rhq.agent.plugins.availability-scan.period-secs is also the default (30 seconds).
Has anyone encountered this problem before?
rhq-users mailing list rhq-users@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/rhq-users