Excuse my recent SPAM. I really dig RHQ and think it has a lot of potential, but there are a few things I would like it to do. OK, on to my suggestions.
PIQL:
-- A way to select a process based on user and group attributes? User, effective user, login user, group, effective group, login group, and by name or uid/gid? Maybe:
   process|user|foo
   process|uid|500
   process|group|foo
   process|gid|500
   process|euser|bar
   process|luid|500
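A rough sketch, in Python, of how queries in the suggested form might be evaluated. The `process|attribute|value` syntax is only the proposal above, not existing PIQL, and the `Proc` record, its field names, and the `parse_query` and `select` helpers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    """Hypothetical process record; the field names are illustrative only."""
    name: str
    user: str
    uid: int
    group: str
    gid: int
    euser: str
    luid: int


# Attributes from the proposal; the numeric ones are compared as integers.
NUMERIC_ATTRS = {"uid", "gid", "luid"}
TEXT_ATTRS = {"name", "user", "group", "euser"}


def parse_query(query):
    """Split 'process|<attribute>|<value>' into (attribute, typed value)."""
    parts = query.split("|")
    if len(parts) != 3 or parts[0] != "process":
        raise ValueError(f"unsupported query: {query!r}")
    _, attribute, value = parts
    if attribute in NUMERIC_ATTRS:
        return attribute, int(value)
    if attribute in TEXT_ATTRS:
        return attribute, value
    raise ValueError(f"unknown attribute: {attribute!r}")


def select(procs, query):
    """Return the processes whose attribute equals the queried value."""
    attribute, value = parse_query(query)
    return [p for p in procs if getattr(p, attribute) == value]
```

With a list of such records, `select(procs, "process|uid|500")` would return every record whose uid is 500.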
Process resource:
-- A way to return the number of processes matching a PIQL query as a trait? Or maybe a trait for the number of processes that match? This way an alert could be defined for a process count greater than, equal to, or less than some number. For example, we have a check that makes sure no more than a certain number of java processes are running; exceeding that number indicates that batch jobs are not completing.
-- What about returning information that could be retrieved using ulimit as traits, like max open file descriptors, max processes, etc.?
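As an illustration of the process-count idea, here is a minimal Python sketch that counts matching processes and compares the count to a threshold. The function names and the list of process names are assumptions made for the example; nothing here is RHQ API.

```python
import operator

# Comparators an alert condition might offer (an assumption for this sketch).
COMPARATORS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def count_matching(process_names, wanted):
    """Count how many running processes have the wanted name."""
    return sum(1 for name in process_names if name == wanted)


def should_alert(count, comparator, threshold):
    """Return True when 'count <comparator> threshold' holds."""
    return COMPARATORS[comparator](count, threshold)


# Hypothetical snapshot of running process names.
running = ["java", "java", "sshd", "java", "crond"]
java_count = count_matching(running, "java")
too_many_java = should_alert(java_count, ">", 2)
```

In the batch-job case above, `too_many_java` being true would mean more java processes are running than expected, which suggests jobs are not completing.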
Alert Engine:
-- The ability to compare one attribute to another, or an attribute to a trait? Or to a percentage of a trait?
   Alert when open files > 90% max open files
   Alert when Java Heap > 90% max heap size
-- Perhaps a way of defining recovery thresholds without having to define recovery alerts. Maybe a wizard to build a recovery alert that is created under the hood but hidden from the alert definition view. Maybe the conditions tab could have "recovery conditions" and "recovery damping" tabs, or there could be a recovery tab with "conditions" and "damping" subtabs? It is cumbersome to have to create a recovery alert and then its associated alert, or to create an alert, then its recovery alert, and then go back and edit the original to assign its recovery alert. (If that makes sense...)
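The two alert-engine ideas can be sketched together in Python: a condition that compares a value to a percentage of a limit, with a recovery threshold defined on the same alert. The `RelativeAlert` class, its field names, and the 90%/80% ratios are assumptions for illustration only; this is not how RHQ implements alerts.

```python
from dataclasses import dataclass


@dataclass
class RelativeAlert:
    """Hypothetical alert that fires and recovers relative to a limit."""
    fire_ratio: float      # fire when value > fire_ratio * limit
    recover_ratio: float   # recover when value < recover_ratio * limit
    active: bool = False

    def update(self, value, limit):
        """Process one sample; return 'fired', 'recovered', or None."""
        if not self.active and value > self.fire_ratio * limit:
            self.active = True
            return "fired"
        if self.active and value < self.recover_ratio * limit:
            self.active = False
            return "recovered"
        return None


# Open files measured against a maximum of 1024 (made-up numbers).
alert = RelativeAlert(fire_ratio=0.9, recover_ratio=0.8)
events = [alert.update(v, 1024) for v in (500, 950, 900, 700)]
```

Keeping the recovery ratio below the firing ratio stops the alert from flapping when the value hovers near the firing threshold, and both thresholds live in a single definition.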
I would love to hear other people's thoughts and whether or not these ideas might be useful to others.
Regards, -Alan
I am not an expert on this, but I think some of the resource tasks you want can be done through the RHQ CLI[0]. Like creating a new resource[1] and getting a list of resources by trait[2]. (I think. I emphasize -- I am not a developer. I am just learning all this stuff, too.)
The first alerting scenario you have can already be accomplished through the UI. Most resources have a lot of metrics defined already[3]. The metrics which are collected are defined in the agent plug-in, in the META-INF/rhq-plugin.xml file. If you need metrics for a resource that aren't currently collected, you may be able to edit the resource plug-in. Likewise, if you want a resource that isn't currently in RHQ, you can create your own plug-in. There's a generator you can use to build a skeleton plug-in, with details at http://rhq-project.org/display/RHQ/Plugins+-+Plugin+Generator.
As for the recovery alerts -- yeah, I agree. It would be really nice to define the recovery alert in the alert definition itself.
Cheers, Deon
[0] http://rhq-project.org/display/JOPR2/Running+the+RHQ+CLI
[1] http://docs.redhat.com/docs/en-US/JBoss_Operations_Network/100/html/API/remo...
[2] http://docs.redhat.com/docs/en-US/JBoss_Operations_Network/100/html/API/remo...
[3] http://rhq-project.org/display/JOPR2/Managed+Resources
In reply to Alan Evans's message of 12/12/2011 6:51 PM.