Hi,
I was looking at the ideas that Steve voiced yesterday for INI validation. To recap: consider using XML/XSLT/Relax NG/Schematron to do INI file validation.

I did some research on the matter and here is what I am thinking. What we want to accomplish is internal validation of the INI file when it is loaded. We do not want to fork a separate process to validate the INI file. The suggested approach generally makes sense, but I could not find an open source C implementation of Schematron. And even if one exists, it will take time to make sure it is part of the distributions we care about. Libxml supports schema validation, but I really do not see a big benefit in going down this path. To validate an INI file the library (embedded into the application) would have to:
1) Read the schema file
2) Convert the schema to a RelaxNG schema on the fly
3) Read the config file
4) Convert it to XML, probably using XSLT
5) Use libxml to implement the validation logic

This just sounds a bit too heavy. I had in mind a more customized approach:
1) Read the schema file
2) Read the config file
3) Validate

Step 3 here is the core of the work, but it is definitely comparable with the conversion work and validation logic in the XML case. The first approach at this point looks a bit too heavy from both the application and the development point of view. Another point is that with the second approach we can do the work gradually and increase the complexity of the validation as we go. And do not forget that option 1 is just schema validation without any conditions. I have some ideas about how we can use conditions more easily in the second approach. Also, the first approach would require some time to get fluent with XSLT, RelaxNG and libxml, while with the second approach development can start immediately.
Here is the design that I have in mind for the INI schema definition file and the logic related to it. The schema definition file, itself an INI file, will consist of descriptions of the expected configuration options in the INI file. It will contain sections that describe keys in the INI file and sections that describe rules. We will skip rules for now and talk about just the sections that define keys. I think that if we implement just that we will bring a lot of value.

Any section in the schema definition file will have a unique name. This is the name of the section, not the name of the key. This is an important difference. Think about the name of the section as the name of a variable that holds an object that describes some key. Each key definition section would have the following possible attributes that describe the properties of the key and its value in the configuration file.

Here is an example. Say we want to describe the key "provider" in the section "backend" of the INI file. Here is how it might look in the INI schema definition:
[backendprovier]
type = field       <- Optional; if omitted, type "field" is assumed. Possible values: field, rule. The "rule" value is not supported in v1.
section = backend  <- Optional; specifies what section the key belongs to. If omitted, the "secref" key should be defined. See below.
name = provider    <- Required; must be a unique name within the section.
required = yes     <- Optional; if omitted, "no" is assumed.
vtype = string     <- Type of the value. Can be: int32, uint32, long32, ulong32, double, bool, ...
list = no          <- Optional; if omitted, "no" is assumed. Specifies whether the value is a list.
sep = ,;           <- Potential other symbols that can be used as list item separators; can be added later.
And so on... Other interesting keys that came to mind:
min = 0    <- Min value for a numeric key
max = 1000 <- Max value for a numeric key
Alternatively we can support (later) something like:
ranges = 0, 10, 20, 30 <- Would mean that the value should be in the range from 0 to 10 or from 20 to 30. We might come up with some better syntax...
The following keys would be interesting too:
depends = backendprovier, backendsomthingelse <- This means that the key becomes required if the key backendprovier or the key backendsomthingelse is defined.
Notice that "backendsomthingelse" is used as a reference to another section in the schema.
So far all this can be handled by the INI parsing without modification. With some improvements and additions we can add something like this:
depends = backendprovier[foo, bar, baz] <- This would mean that the key becomes required only if the value of backendprovier is foo, bar or baz.
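To make the two "depends" forms concrete, here is a minimal Python sketch of how such a value might be parsed and evaluated. The grammar (comma-separated references, each with an optional bracketed value list) is read off the two examples above; the function names `parse_depends` and `is_required` are hypothetical and not part of any existing library.

```python
import re

# Assumed grammar: ref or ref[value, value, ...], entries separated by commas.
_DEP = re.compile(r"\s*(\w+)\s*(?:\[([^\]]*)\])?\s*(?:,|$)")

def parse_depends(spec):
    """Return a list of (schema_section, allowed_values_or_None)."""
    deps, pos = [], 0
    while pos < len(spec):
        m = _DEP.match(spec, pos)
        if m is None:
            raise ValueError("bad depends spec at offset %d" % pos)
        values = None
        if m.group(2) is not None:
            values = [v.strip() for v in m.group(2).split(",") if v.strip()]
        deps.append((m.group(1), values))
        pos = m.end()
    return deps

def is_required(deps, found):
    """found maps a schema section name to the value seen in the INI file.
    The key becomes required if any referenced key is present and, when a
    value list was given, its value is one of the listed values."""
    for ref, allowed in deps:
        if ref in found and (allowed is None or found[ref] in allowed):
            return True
    return False
```

With this, `parse_depends("backendprovier[foo, bar, baz]")` yields one dependency with three allowed values, and a malformed spec raises `ValueError` instead of being silently accepted.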
Another interesting key that we might want to define is seclist:
seclist = yes <- Would mean that the value of the key specifies sections in the INI file. Can be used in conjunction with "secprefix".
secprefix = sometext <- Would mean that the section names should be constructed by prepending this value.

Potentially we can also add "secprefixref", which will point to a key whose value should be added as a prefix to the section names. Here is an example. A configuration like this:

[backends]
domain = foo.com
be = bar, baz

[foo.com/bar]
one = 1

will be described with a schema definition like this:

[backend_domains]
section = backends
name = domain
required = yes
vtype = string

[backend_bs]
section = backends
name = be
required = yes
seclist = yes <- implies list = yes and vtype = string
secprefixref = backend_domains

[subsection_one]
secref = backend_bs
name = one
vtype = int32
We can get more complex as we go and find use cases we need to cover. As we go we will also create best practices about creating your INI files. For example one of the things that IMO should be prohibited in the INI files at least for now is multiple dependency on presence or value. I mean things like: value of the key X should be present if the key Y is present and the key Z has value "abc". I envision the "rules" type section for dependencies like this. I think that complexity can be added later if ever needed.
With a schema definition like this we would be able to validate the following things about the configuration:
1) The value is present when it should be
2) The value has the expected type and range
3) The dependencies are correct
I think that covers most of the issues we need to deal with in the INI files. Steve can we check that this grammar covers what we currently have in the sssd.ini?
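As an illustration of the first two checks (presence, then type and range), here is a minimal Python sketch. It assumes the schema section and the config have already been parsed into dictionaries; `validate_key`, the `CONVERTERS` table and the `timeout` key used in the usage note are hypothetical, only a few of the proposed vtype names are covered, and no 32-bit range check is done.

```python
# Hypothetical converters for a few of the vtype names from the proposal.
CONVERTERS = {"string": str, "int32": int, "double": float}

def validate_key(defn, config):
    """Check one key definition (dict of schema attributes) against a parsed
    config (dict of section name -> dict of key -> raw string).
    Returns a list of error strings; an empty list means the key is valid."""
    where = "%s/%s" % (defn["section"], defn["name"])
    raw = config.get(defn["section"], {}).get(defn["name"])
    if raw is None:
        # "required" defaults to "no", as in the proposal.
        if defn.get("required", "no") == "yes":
            return ["%s: required key is missing" % where]
        return []
    vtype = defn.get("vtype", "string")
    try:
        value = CONVERTERS[vtype](raw)
    except ValueError:
        return ["%s: value %r is not of type %s" % (where, raw, vtype)]
    errors = []
    # min/max only make sense for numeric keys.
    if isinstance(value, (int, float)):
        if "min" in defn and value < float(defn["min"]):
            errors.append("%s: value is below min" % where)
        if "max" in defn and value > float(defn["max"]):
            errors.append("%s: value is above max" % where)
    return errors
```

For example, a definition with `vtype = int32`, `min = 0` and `max = 1000` for a key `timeout` in section `backend` would reject `timeout = 5000` with an "above max" error and `timeout = abc` with a type error.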
On 04/07/2010 11:57 AM, Dmitri Pal wrote:
I did some research on the matter and here is what I am thinking. What we want to accomplish is internal validation of the INI file when it is loaded. We do not want to fork a separate process to validate the INI file. The suggested approach generally makes sense, but I could not find an open source C implementation of Schematron. And even if one exists, it will take time to make sure it is part of the distributions we care about.
I thought it was a part of libxml2, but upon further research I see I was mistaken.
Libxml supports schema validation, but I really do not see a big benefit in going down this path. To validate an INI file the library (embedded into the application) would have to:
- Read the schema file
- Convert schema to the RelaxNG schema on the fly
I think you missed the point that I was describing having the schema file be written directly in RelaxNG or XST, but that's irrelevant now.
- Read the config file
- Convert it to xml probably using XSLT
- Use libxml to implement validation logic
This just sounds a bit too heavy. I had in mind a more customized approach
- Read schema file
- Read config file
- Validate
This is essentially how the SSSDConfig API does things now (albeit with very limited validation: we only validate that the value matches the expected Python type (str, int, etc.)).
...
We really need to have regex = as an option. For example:
[ldap_uri]
type = field
vtype = string
regex = ldaps?://[\w.]*
list = true
sep = , #How would we do space-separation? Should we even bother?
Potentially we can also add "secprefixref" that will point to a key whose value should be added as prefix for the section names. Here is the example. The configuration like this:
[backends]
domain = foo.com
be = bar, baz

[foo.com/bar]
one = 1

will be described with a schema definition like this:

[backend_domains]
section = backends
name = domain
required = yes
vtype = string

[backend_bs]
section = backends
name = be
required = yes
seclist = yes <- implies list = yes and vtype = string
secprefixref = backend_domains

[subsection_one]
secref = backend_bs
name = one
vtype = int32
This example is completely unparseable to my brain. Could you please try representing it with a real-world example, such as ldap_uri? Also, I don't like seclist, secprefixref and secref. They are not descriptive and it will be impossible for anyone to keep them straight.
I think that covers most of the issues we need to deal with in the INI files. Steve can we check that this grammar covers what we currently have in the sssd.ini?
See above, but we need to add, at a minimum, support for regular-expression handling. I'd also strongly prefer it if we could add support for something like +=

In other words, I want to be able to say:
[id_provider]
type = field
vtype = multi
choices = local

in the core config file, but then be able to say:

[id_provider]
type = extend #implies vtype=multi
choices = ldap

in the sssd-ldap.conf and have the resulting combined INI be effectively:

[id_provider]
type = field
vtype = multi
choices = local,ldap

This way, for things like the providers, we can provide a selection list of options. Otherwise, our only option would be
[id_provider]
type = field
vtype = string
regex = .*

Naturally, that's error-prone.

-- 
Stephen Gallagher
RHCE 804006346421761
Delivering value year after year. Red Hat ranks #1 in value among software vendors. http://www.redhat.com/promo/vendor/
See below
We can look into regex validation. I do not see a big deal here. It will be a special function though. It can be added incrementally in a separate patch and handled as a separate task (within the v1 validation functionality).
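A minimal Python sketch of what such a "special function" might do for a list-valued key is below. It uses the pattern from the ldap_uri example; the name `check_regex` is hypothetical, and requiring the pattern to match the whole value (rather than any substring) is an assumption the thread does not settle.

```python
import re

def check_regex(pattern, values):
    """Return the values that the pattern does not match in full."""
    compiled = re.compile(pattern)
    return [v for v in values if compiled.fullmatch(v) is None]

# Pattern taken from the ldap_uri example; the URIs are made-up test data.
bad = check_regex(r"ldaps?://[\w.]*",
                  ["ldap://a.example.com", "ldaps://b.example.com", "http://c"])
```

Here `bad` ends up holding only the `http://c` entry, which a validator could then report against the key.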
As for separators, I mean that space is always a valid separator. The ones listed in "sep" would be additional separators we can check. Adding comma to the list, for example, would not generate an error and would parse the list
foo bar baz
the same way as
foo,bar,baz
or
foo, bar, baz

There was a bug recently about a similar issue. Support for multiple list separators is already baked into the parsing logic.
[snip]
This example is completely unparseable to my brain. Could you please try representing it with a real-world example, such as ldap_uri? Also, I don't like seclist, secprefixref and secref. They are not descriptive and it will be impossible for anyone to keep them straight.
But this is what we have in sssd.conf.

I am talking about:

services = nss, pam

The services attribute is a section list, meaning that the nss and pam sections need to be defined. Another example of a section list is the "domains" attribute.

domains = redhat.com

But the sections are actually named:

[domain/redhat.com]
...

So to describe it one would say:
[domains_def]
name = domains
section = sssd
seclist = yes
secprefix = domain

Now imagine that "domain" is actually not a constant but the value of another key in the INI file. Then we would have to use some kind of reference. That would be something like "secprefixref". But currently it is not used, so we can skip it for now.

So based on the definition of the "domains" key, the names of the sections that should be present in the INI file are constructed from the value of the key in the actual INI file, prepended with the word "domain". So how can I say that a key should belong to such a dynamically defined section? This is where "secref" is going to be used. See the example at the end of the mail.
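The section-list idea can be sketched in a few lines of Python. This assumes the config is already parsed into a dictionary of sections and that the prefix is joined to each list item with "/", as in [domain/redhat.com]; the function names are hypothetical.

```python
def expected_sections(config, section, name, secprefix=None):
    """Section names implied by a key marked 'seclist = yes'."""
    raw = config.get(section, {}).get(name, "")
    items = [item.strip() for item in raw.split(",") if item.strip()]
    if secprefix:
        # Assumption: prefix and item are joined with "/", as in sssd.conf.
        return ["%s/%s" % (secprefix, item) for item in items]
    return items

def missing_sections(config, section, name, secprefix=None):
    """Implied sections that are not present in the config."""
    wanted = expected_sections(config, section, name, secprefix)
    return [s for s in wanted if s not in config]
```

For a config with `services = nss, pam` and `domains = redhat.com` in [sssd], the first call style gives the sections nss and pam, and the prefixed form gives domain/redhat.com.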
We will need to come up with some better "implication" rules and describe it. For example: if vtype is defined it should be implied that this is a field. If the regex is defined then it should imply vtype = string and so on. But this we can do and build as we go. One lesson for this is to write good doc as you go.
[snip]
See above, but we need to add, at a minimum, support for regular-expression handling. I'd also prefer it strongly if we could add support for something like +=

In other words, I want to be able to say:
[id_provider]
type = field
vtype = multi
choices = local

in the core config file, but then be able to say:

[id_provider]
type = extend #implies vtype=multi
choices = ldap
I do not understand what you are trying to say here. I think I missed one other key that I wanted to include: values (or maybe "choices" as you suggested)

choices = local, ldap

would mean that the value of the key can only be one from the list. If list = yes for such a key, it would mean that each of the values of the key in the real INI file should be one of those. For example, in the schema:

[family]
name = family
section = building
list = yes
vtype = string
choices = Dad, Mom, Son, Daughter

In the config file:
family = Mom, Son, Son <- will be OK
family = Mom, Sister, Son <- will generate an error because "Sister" is not a valid value
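The "choices" check is small enough to sketch directly in Python. The function name `check_choices` is hypothetical, and comma is assumed as the only list separator to keep the example short.

```python
def check_choices(raw, choices, is_list=False):
    """Return the values in raw that are not among the allowed choices."""
    allowed = {c.strip() for c in choices.split(",")}
    values = [v.strip() for v in raw.split(",")] if is_list else [raw.strip()]
    return [v for v in values if v not in allowed]
```

Applied to the family example, `Mom, Son, Son` produces no offending values, while `Mom, Sister, Son` reports `Sister`.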
in the sssd-ldap.conf and have the resulting combined INI be effectively:

[id_provider]
type = field
vtype = multi
choices = local,ldap

This way, for things like the providers, we can provide a selection list of options. Since otherwise, our only option would be
[id_provider]
type = field
vtype = string
regex = .*
So I think what you are trying to say is:

(in schema)

[domains_def]
name = domains
section = sssd
seclist = yes
secprefix = domain

[id_provider]
name = provider
secref = domains_def
vtype = string
list = yes
choices = local, ldap

(will mean that)

The key "provider" will have a string value from the provided choices "local, ldap" and will belong to a section whose name is dynamically constructed from the value of the key "domains" in section "sssd" using the prefix "domain".

[sssd]
domains = somewhere.com
...

[domain/somewhere.com]
provider = ldap
...
On 04/07/2010 02:46 PM, Dmitri Pal wrote:
As for separators I imply that space is always a valid separator. Those would be additional separators we can check.
I think it makes sense to NEVER use space as a separator, so that the use of a separator then becomes explicit and unambiguous. (And it's possible to add human-readable spaces around a separator)
Adding comma to the list for example would not generate an error and would parse list: foo bar baz the same way as foo,bar,baz or foo, bar, baz
There was a bug recently about the similar issue. Multiple separators for a list is already baked into the parsing logic.
My concern is for the possibility of options where you specify foo bar, baz
Because with your proposal, this is always translated to: (foo, bar, baz), when what they really wanted was (foo bar, baz)
Right now in the SSSD there aren't any cases like this, but we're trying to build libini_config for general consumption. I don't think we're out of line requiring an explicit separator. Additionally, we can add another rule that whitespace surrounding a separator is not included in the value.
So name = value1\s\s\s,\svalue2 would become ('value1', 'value2') and not ('value1 ', ' value2')
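The explicit-separator rule described here (split only on the separator, then drop the whitespace around each item) can be sketched in Python as follows. The function name `split_explicit` is hypothetical.

```python
def split_explicit(raw, sep=","):
    """Split only on the explicit separator and strip surrounding whitespace.
    Spaces inside an item are preserved."""
    return [item.strip() for item in raw.split(sep)]
```

Under this rule `foo bar, baz` stays a two-item list whose first item is `foo bar`, which is the case the concern above is about.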
I think you missed my point. I was talking about the case where you want to have a limited set of choices for an option, but you want the set of choices to be extensible by a secondary schema file. For example, I want to have sssd.schema to handle all the options that are available for the SSSD itself, and then I want to have a directory (sssd.schema.d) containing files like sssd-local.schema, sssd-ldap.schema.

In sssd.schema, I'd have:
[id_provider]
type = field
vtype = multi
choices =

In sssd.schema.d/sssd-local.schema, I'd have:
[id_provider]
type = extend
choices = local

In sssd.schema.d/sssd-ldap.schema, I'd have:
[id_provider]
type = extend
choices = ldap

Then, in sssd.conf I would be able to provide:
[domain/mydomain]
id_provider = ldap

But if I provided:
[domain/mydomain]
id_provider = nosuchprovider

I should get a parse error because that's not one of the acceptable choices.
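The extension mechanism can be sketched in Python as a merge step that runs before validation. This assumes each schema file has already been parsed into a dictionary of sections; the function name `merge_choices` and the decision to reject an "extend" section that has no base definition are assumptions, not something the thread specifies.

```python
def merge_choices(base, extensions):
    """base and each extension map schema section name -> dict of attributes.
    Sections with type = extend append their choices to the base section."""
    merged = {name: dict(attrs) for name, attrs in base.items()}
    for ext in extensions:
        for name, attrs in ext.items():
            if attrs.get("type") != "extend":
                continue
            if name not in merged:
                raise KeyError("cannot extend unknown section %r" % name)
            existing = merged[name].get("choices", "")
            current = [c.strip() for c in existing.split(",") if c.strip()]
            for choice in attrs.get("choices", "").split(","):
                choice = choice.strip()
                if choice and choice not in current:
                    current.append(choice)
            merged[name]["choices"] = ",".join(current)
    return merged
```

Merging the sssd.schema definition with the two sssd.schema.d files in order gives `choices = local,ldap`, after which `id_provider = nosuchprovider` fails an ordinary choices check.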
So I think what you are trying to say is:
(in schema)
[domains_def]
name = domains
section = sssd
seclist = yes
secprefix = domain

[id_provider]
name = provider
secref = domains_def
vtype = string
list = yes
choices = local, ldap
Hopefully I made myself clear above, but just in case: 'list = yes' here would be wrong. I don't want id_provider to accept a list of values, I want it to have a single value chosen from a list that may have been constructed and extended from multiple different files.
- -- Stephen Gallagher RHCE 804006346421761
Stephen Gallagher wrote:
My concern is for the possibility of options where you specify foo bar, baz
Because with your proposal, this is always translated to: (foo, bar, baz), when what they really wanted was (foo bar, baz)
Right now in the SSSD there aren't any cases like this, but we're trying to build libini_config for general consumption. I don't think we're out of line requiring an explicit separator. Additionally, we can add another rule that whitespace surrounding a separator is not included in the value.
So name = value1\s\s\s,\svalue2 would become ('value1', 'value2') and not ('value1 ', ' value2')
I kind of disagree with the whole approach. If you want spaces inside a value you need to quote the string.

Something like

value = "first value", "second value"

This is currently not supported but would be easy to add when needed. But again, we are talking about INI files and explicitly about multi-value lists. It is not good practice to have spaces in multi-value lists anyway, and I do not want to encourage people to do it. In the case of a single value the inner spaces are preserved, but leading and trailing ones are trimmed.

By default right now spaces around separators are treated as separators and trimmed. This is how it already works.

That means that the value
foo , , bar
will be processed as a list of two items. However, not long ago I added a patch that allows one to treat this case as a list of three items: foo, <empty string> and bar. And we can add a specifier for that in the schema.

My point is that the rule of thumb is:
* Spaces are always a separator
* Spaces are always trimmed
* Sequences are treated in one of two ways:
  * Any sequence of separators is counted as one separator (option 1)
  * Non-space separators are counted as separate separators, and the value between them is treated as an empty value (option 2)
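The two options in the rule of thumb can be sketched in Python as below. This is only an illustration of the rules as stated, not the actual libini_config parsing code; the function name `split_list` and the `keep_empty` flag are hypothetical.

```python
import re

def split_list(raw, seps=",", keep_empty=False):
    """Split a list value. Spaces always separate and are always trimmed."""
    raw = raw.strip()
    if not keep_empty:
        # Option 1: any run of spaces and separators is a single separator.
        parts = re.split("[\\s" + re.escape(seps) + "]+", raw)
        return [p for p in parts if p]
    # Option 2: each non-space separator delimits a field; a field that
    # holds nothing but spaces becomes an empty value.
    items = []
    for field in re.split("[" + re.escape(seps) + "]", raw):
        words = field.split()
        items.extend(words if words else [""])
    return items
```

With option 1, `foo , , bar` gives two items; with `keep_empty=True` it gives three, the middle one being the empty string. In both modes `foo bar baz`, `foo,bar,baz` and `foo, bar, baz` produce the same three items.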
I think you missed my point. I was talking about the case where you want to have a limited set of choices for an option, but you want the set of choices to be extensible by a secondary schema file. For example, I want to have sssd.schema to handle all the options that are available for the SSSD itself, and then I want to have a directory (sssd.schema.d) containing files like sssd-local.schema, sssd-ldap.schema.

In sssd.schema, I'd have:
[id_provider]
type = field
vtype = multi
choices =

In sssd.schema.d/sssd-local.schema, I'd have:
[id_provider]
type = extend
choices = local

In sssd.schema.d/sssd-ldap.schema, I'd have:
[id_provider]
type = extend
choices = ldap

Then, in sssd.conf I would be able to provide:
[domain/mydomain]
id_provider = ldap

But if I provided:
[domain/mydomain]
id_provider = nosuchprovider

I should get a parse error because that's not one of the acceptable choices.
Wow! I think you are adding complexity out of proportion here. Something like this can be added for sure, but I do not see a big use case. Why can't the main schema be modified with the extensions? I think you are going way beyond what is needed at the beginning. You are probably thinking about third-party back ends adding their configuration to sssd.conf. I think we should start by just requiring modifications to the schema file and then, if that does not work out, add extensibility as you suggest.

IMO too much for a v1 of the INI validation. Sounds more like a v3 feature...
in the sssd-ldap.conf and have the resulting combined INI be
effectively:
[id_provider] type = field vtype = multi choices = local,ldap
This way, for things like the providers, we can provide a selection
list
of options. Since otherwise, our only option would be [id_provider] type = field vtype = string regex = .*
So I think what you are trying to say is:
(in schema)
[domains_def] name = domains section = sssd seclist = yes secprefix = domain
[id_provider] name = provider secref = domains_def vtype = string list = yes choices = local, ldap
Hopefully I made myself clear above, but just in case: 'list = yes' here would be wrong. I don't want id_provider to accept a list of values, I want it to have a single value chosen from a list that may have been constructed and extended from multiple different files.
Then it is just
[id_provider] name = provider secref = domains_def vtype = string choices = local, ldap
And if there is a third party provider he would have to add things to the same schema file and modify it.
choices = local, ldap, rsa <- Example of the rsa adding itself into the schema
IMO it is a trivial scripting exercise (using augeas or just plain sed+grep) to check if the rsa schema data is blended or not and blend it into the file if it is not yet.
As I said - nice idea but too much complexity out of box.
(will mean that)
The key "provider" will have a string value from the provided choices "local, ldap" and will belong to the section that is dynamically constructed from the value of the key "domains" in section "sssd" using the prefix "domain"
[sssd] domains = somewhere.com ...
[domain/somewhere.com] provider = ldap ...
_______________________________________________ sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/sssd-devel
On 04/07/2010 04:02 PM, Dmitri Pal wrote:
Stephen Gallagher wrote:
On 04/07/2010 02:46 PM, Dmitri Pal wrote:
See below
We really need to have regex = as an option. For example: [ldap_uri] type = field vtype = string regex = ldaps?://[\w.]* list = true sep = , #How would we do space-separation? Should we even bother?
We can look into regex validation. I do not see a big deal here. It will be a special function, though. It can be added incrementally in a separate patch and handled as a separate task (within the v1 validation functionality).
As for separators, I assume that space is always a valid separator. The ones listed in sep would be additional separators we can check.
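For illustration only, a regex check for the ldap_uri example could look like the Python sketch below. The helper name validate_regex_list is hypothetical, and anchoring the pattern to the whole value (fullmatch) is an assumption; the thread does not say whether the regex should match the full value or just part of it.

```python
import re


def validate_regex_list(raw_value, pattern, sep=","):
    """Return the list items that do not fully match the pattern."""
    compiled = re.compile(pattern)
    values = [v.strip() for v in raw_value.split(sep)]
    return [v for v in values if not compiled.fullmatch(v)]


# Pattern taken from the ldap_uri example in the thread
ldap_uri_pattern = r"ldaps?://[\w.]*"

print(validate_regex_list("ldap://ldap.example.com, ldaps://10.0.0.1", ldap_uri_pattern))
print(validate_regex_list("ldap://ok.example.com, http://bad.example.com", ldap_uri_pattern))
```

Note that this particular pattern would reject a URI with a port such as ":389", which is one reason a real schema would probably need a more careful expression.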
I think it makes sense to NEVER use space as a separator, so that the use of a separator then becomes explicit and unambiguous. (And it's possible to add human-readable spaces around a separator)
Adding a comma to the list of separators, for example, would not generate an error, and the list "foo bar baz" would be parsed the same way as "foo,bar,baz" or "foo, bar, baz".
There was a bug recently about a similar issue. Support for multiple separators in a list is already baked into the parsing logic.
My concern is for the possibility of options where you specify foo bar, baz
Because with your proposal, this is always translated to: (foo, bar, baz), when what they really wanted was (foo bar, baz)
Right now in the SSSD there aren't any cases like this, but we're trying to build libini_config for general consumption. I don't think we're out of line requiring an explicit separator. Additionally, we can add another rule that whitespace surrounding a separator is not included in the value.
So name = value1\s\s\s,\svalue2 would become ('value1', 'value2') and not ('value1 ', ' value2')
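The explicit-separator rule proposed here can be sketched in a few lines of Python. This is just an illustration of the proposal, with a made-up function name; it is not how libini_config currently parses lists.

```python
def split_explicit(raw_value, sep=","):
    """Split only on the explicit separator and trim whitespace around each item."""
    return [item.strip() for item in raw_value.split(sep)]


print(split_explicit("value1   , value2"))
print(split_explicit("foo bar, baz"))
```

With this rule, inner spaces survive, so "foo bar, baz" yields two items rather than three.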
I kind of disagree with the whole approach. If you want spaces inside the value, you need to quote the string.
Something like
value = "first value", "second value"
This is currently not supported but would be easy to add when needed. But again, we are talking about ini files and explicitly about multi-value lists. It is not good practice to have spaces in multi-value lists anyway, and I do not want to encourage people to do it. In the case of a single value, the inner spaces are preserved but leading and trailing ones are trimmed.
By default right now spaces around separators are treated as separators and trimmed. This is how it already works.
That means that the value "foo , , bar" will be processed as a list of two items. However, not long ago I added a patch that allows one to treat this case as a list of three items: foo, <empty string> and bar. And we can add a specifier for this in the schema.
My point is that the rule of thumb is:
- Spaces are always a separator
- Spaces are always trimmed
- Sequences are treated as:
- Any sequence of separators is counted as one separator (option 1)
- Non-space separators are counted as separate separators and the value between them is treated as an empty value (option 2)
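The two options in this rule of thumb can be compared with a small Python sketch. It reflects one reading of the rules above (spaces always separate and are trimmed; only the handling of repeated non-space separators differs). The function names are hypothetical and the code is not taken from the existing parser.

```python
import re


def split_collapsing(raw_value, separators=","):
    """Option 1: any run of spaces and separators counts as one separator."""
    pattern = "[" + re.escape(separators) + r"\s]+"
    return [item for item in re.split(pattern, raw_value) if item]


def split_keeping_empty(raw_value, separators=","):
    """Option 2: each non-space separator counts; nothing between two gives ''."""
    pattern = "[" + re.escape(separators) + "]"
    items = []
    for chunk in re.split(pattern, raw_value):
        words = chunk.split()  # spaces still separate values inside a chunk
        if words:
            items.extend(words)
        else:
            items.append("")  # empty value between two non-space separators
    return items


print(split_collapsing("foo , , bar"))
print(split_keeping_empty("foo , , bar"))
```

Under both options "foo bar, baz" becomes three items, which is exactly the behaviour being disputed in this thread.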
[snip]
This example is completely unparseable to my brain. Could you please try representing it with a real-world example, such as ldap_uri? Also, I don't like seclist, secprefixref and secref. They are not descriptive and it will be impossible for anyone to keep them straight.
But this is what we have in sssd.conf
I am talking about the:
services = nss, pam
Services attribute is a section list meaning that the nss and pam sections need to be defined. Another example of the section list is the "domains" attribute.
domains = redhat.com
But the sections are actually named:
[domain/redhat.com] ...
So to describe it one would say: [domains_def] name = domains section = sssd seclist = yes secprefix = domain
Now imagine that "domain" is actually not a constant but a value of another key in the ini file. Then we would have to use some kind of the reference. That would be something like "secprefref". But currently it is not used so we can skip it for now.
So based on the definition of the "domains" key, the sections that should be present in the INI file are constructed from the value of the key in the actual ini file, prepended with the word "domain". So how can I say that a key should belong to such a dynamically defined section? This is where "secref" is going to be used. See the example at the end of the mail.
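As a rough sketch of the section-list idea, the Python below derives the expected section names from a key such as "domains" and an optional prefix, and reports the ones missing from the config. The config is modelled as a dictionary, the helper names are invented for this example, and joining prefix and name with "/" is taken from the [domain/redhat.com] naming shown above.

```python
def expected_sections(config, section, key, prefix=None):
    """Build the section names implied by a section-list key (seclist = yes)."""
    raw = config.get(section, {}).get(key, "")
    names = [n.strip() for n in raw.split(",") if n.strip()]
    return [f"{prefix}/{n}" if prefix else n for n in names]


def missing_sections(config, section, key, prefix=None):
    """Return the implied sections that are not present in the config."""
    return [s for s in expected_sections(config, section, key, prefix)
            if s not in config]


# Stand-in for a parsed sssd.conf; the [pam] section is deliberately absent
config = {
    "sssd": {"services": "nss, pam", "domains": "redhat.com"},
    "nss": {},
    "domain/redhat.com": {"provider": "ldap"},
}

print(expected_sections(config, "sssd", "domains", prefix="domain"))
print(missing_sections(config, "sssd", "services"))
```

A key declared with secref would then be looked up in each of the sections returned by expected_sections.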
We will need to come up with some better "implication" rules and describe it. For example: if vtype is defined it should be implied that this is a field. If the regex is defined then it should imply vtype = string and so on. But this we can do and build as we go. One lesson for this is to write good doc as you go.
[snip]
See above, but we need to add, at a minimum, support for regular-expression handling. I'd also prefer it strongly if we could add support for something like +=
In other words, I want to be able to say: [id_provider] type = field vtype = multi choices = local
in the core config file, but then be able to say:
[id_provider] type = extend #implies vtype=multi choices = ldap
I do not understand what you are trying to say here. I think I missed one other key that I wanted to include: values (or may be "choices" as you suggested)
choices = local, ldap
would mean that the value for the key can be only one from the list. If list = yes is set for such a key, it would mean that each of the values for the key in the real ini file should be one of those. For example, in the schema:
[family] name = family section = building list = yes vtype = string choices = Dad, Mom, Son, Daughter
In the config file: family = Mom, Son, Son <- will be Ok
family = Mom, Sister, Son <- will generate an error because "Sister" is not a valid value
I think you missed my point. I was talking about the case where you want to have a limited set of choices for an option, but you want the set of choices to be extensible by a secondary schema file. For example, I want to have sssd.schema to handle all the options that are available for the SSSD itself, and then I want to have a directory (sssd.schema.d) containing files like sssd-local.schema, sssd-ldap.schema.
In sssd.schema, I'd have: [id_provider] type = field vtype = multi choices =
In sssd.schema.d/sssd-local.schema, I'd have: [id_provider] type = extend choices = local
In sssd.schema.d/sssd-ldap.schema, I'd have: [id_provider] type = extend choices = ldap
Then, in sssd.conf I would be able to provide: [domain/mydomain] id_provider = ldap
But if I provided: [domain/mydomain] id_provider = nosuchprovider
I should get a parse error because that's not one of the acceptable choices.
Wow! I think you are adding complexity out of proportion here. Something like this can be added for sure, but I do not see a big use case. Why can't the main schema be modified with the extensions? I think you are going way beyond what is needed at the beginning. You are probably thinking about third-party back ends adding their configuration to sssd.conf. I think we should start by just requiring modifications to the schema file and then, if that does not work out, add extensibility as you suggest.
IMO too much for a v1 of the INI validation. Sounds more like a v3 feature...
Dmitri, the ENTIRE POINT of doing this is because we need to support third-party backends, and we have no advance knowledge of what configurations they will need. If this is not a v1 feature, it's not even a v1 solution.
in the sssd-ldap.conf and have the resulting combined INI be effectively:
[id_provider] type = field vtype = multi choices = local,ldap
This way, for things like the providers, we can provide a selection list of options. Since otherwise, our only option would be [id_provider] type = field vtype = string regex = .*
So I think what you are trying to say is:
(in schema)
[domains_def] name = domains section = sssd seclist = yes secprefix = domain
[id_provider] name = provider secref = domains_def vtype = string list = yes choices = local, ldap
Hopefully I made myself clear above, but just in case: 'list = yes' here would be wrong. I don't want id_provider to accept a list of values, I want it to have a single value chosen from a list that may have been constructed and extended from multiple different files.
Then it is just
[id_provider] name = provider secref = domains_def vtype = string choices = local, ldap
And if there is a third party provider he would have to add things to the same schema file and modify it.
choices = local, ldap, rsa <- Example of the rsa adding itself into the schema
IMO it is a trivial scripting exercise (using augeas or just plain sed+grep) to check if the rsa schema data is blended or not and blend it into the file if it is not yet.
As I said - nice idea but too much complexity out of box.
(will mean that)
The key "provider" will have a string value from the provided choices "local, ldap" and will belong to the section that is dynamically constructed from the value of the key "domains" in section "sssd" using the prefix "domain"
[sssd] domains = somewhere.com ...
[domain/somewhere.com] provider = ldap ...
That is completely unacceptable on many, many levels. Not the least of which is that this would be completely forbidden by RPM packaging rules. Any third-party backend would have its own package, and packaging rules forbid two files owning or modifying the same config file. That's the lion's share of the reason for drop-directory configurations: so that plugins don't need to modify the global configuration.
Furthermore, it's impossible to guarantee that if you installed two third-party backends, you could uninstall them and get back to a clean state.
No, we absolutely need to have the ability for third-party backends to have their own completely self-sufficient configuration.
-- Stephen Gallagher RHCE 804006346421761
Stephen Gallagher wrote:
No, we absolutely need to have the ability for third-party backends to have their own completely self-sufficient configuration.
Then we will just not have it at all for now. Sorry. I can't commit to doing something of this complexity in a reasonable time.
After some more thinking and a bit of cooling down: we do not add the ability to handle third-party back ends on day one. Why? One of the reasons is that it is too big a task to chew. We have a plan for how to get there. Slowly! We tried several times, though I asked for it to be thought through on day one. It was not, and thus we had more things to change.
Here we can treat it as a design goal and even spec it a bit, but delay the actual implementation. Why do we have to deliver the config validation complexity on day one? Why can't we follow the same steps as with the actual back ends?
The whole packaging guidelines argument seems like dark magic in this case. Anyone can edit a config file: an admin manually, a puppet manifest automatically, etc.
I do not see how the schema file is different in this case and why it can't be edited in the same way by different sources. What you are saying really seems like an artificial requirement for v1. Please explain how the schema file is different from any config file that anyone can edit.
On Wed, Apr 7, 2010 at 1:58 PM, Dmitri Pal dpal@redhat.com wrote:
I do not see how the schema file is different in this case and why it can't be edited in the same way by different sources. What you are saying really seems like an artificial requirement for v1. Please explain how the schema file is different from any config file that anyone can edit.
And what is wrong with a .d directory like every other piece of 'nix software in existence? The sssd.api.d directory seems like the proper way, whereas one big schema file is a step backwards. For one, it would break both Debian Policy and the Fedora Packaging Guidelines. Additionally, it would make it more difficult for third-party backends to integrate.
Jeff Schroeder wrote:
And what is wrong with a .d directory like every other piece of 'nix software in existence? The sssd.api.d directory seems like the proper way, whereas one big schema file is a step backwards. For one, it would break both Debian Policy and the Fedora Packaging Guidelines. Additionally, it would make it more difficult for third-party backends to integrate.
I am not arguing against it. I am arguing against it as a v1 requirement. pam.conf and nsswitch.conf are files that are touched by multiple different components and by admins. So how is it different? Plus, the back ends are not possible now and will not be for some time, and there will be some time after that until someone actually starts writing a back end. So why should this complexity be supported on day one? I think it is the right feature, but it does not need to be delivered now. It can easily be delivered 9 months from now. This is my only point. You know, the "start small" sort of thing. Do not try to boil the ocean :-) It is a question of prioritization, that is all.
On Wed, Apr 7, 2010 at 2:48 PM, Dmitri Pal dpal@redhat.com wrote: ... snip ...
I am not arguing against it. I am arguing against it as a v1 requirement. pam.conf and nsswitch.conf are files that are touched by multiple different components and by admins. So how is it different? Plus, the back ends are not possible now and will not be for some time, and there will be some time after that until someone actually starts writing a back end. So why should this complexity be supported on day one? I think it is the right feature, but it does not need to be delivered now. It can easily be delivered 9 months from now. This is my only point. You know, the "start small" sort of thing. Do not try to boil the ocean :-) It is a question of prioritization, that is all.
The timeline stuff is on you and your team. It isn't necessary in the technical discussion about the future. I was just saying you're going about it wrong by thinking that running grep/sed/whatever nifty shiny tool on a schema file is the best way forward.
What tools do you know of that automatically touch nsswitch.conf? If they exist, they break Fedora Packaging Policy and general packaging best practices. Sure, on Fedora/RHEL you could say anaconda edits nsswitch.conf at install time, or it is edited manually when authconfig is run by the admin, but nothing should magically edit critical system configuration. We're talking about something doing this at install time, completely unattended. If there is a package that edits your nsswitch.conf or changes things with augeas in %post, it is a bug. That's what *.rpmnew files are for. Ditto for the pam config. Nothing should magically edit the existing pam config, ever. The pam config also seems like a perfect example for your software, as the functionality for, e.g., an rsa backend is similar to a secureid pam file under /etc/pam.d. However, that argument is the opposite of what you're proposing, so it seems strange that you brought it up.
Something says we might not be comparing apples to apples here. I value your input on this but think this conversation might be better off restarted when the technical aspects start being implemented.
Jeff Schroeder wrote:
Something says we might not be comparing apples to apples here. I value your input on this but think this conversation might be better off restarted when the technical aspects start being implemented.
Jeff, I value your input too. But for me it is just a non-starter, even if it is generally the right thing to do in the long run. I have some time to do some dev work on the project. I can't commit myself to doing it if I do not see that I can actually do it in a reasonable amount of time. This is the constraint on my contributions to the project.
But back to the technical part of the discussion. Here is my thinking about the whole file editing situation. The editing of the files is prohibited by the Fedora guidelines at the install time. But did I say anything about the install time?
Would it be fine if an admin comes and manually edits the file? The answer is yes, this is what admins do. It can be sssd.conf, pam.conf or any other file. They can edit it (assuming they know what they are doing). Ok but if I am a smart admin I will probably create a script to do so. If I also have some sort of the central config management solution I will create a puppet or cfengine module that would do this editing centrally, right? Have I violated anything? Even if I did this is a natural thing people do and if it is against any guidelines then the guidelines should be considered for a re-review.
So as a third party vendor (and I worked for one for 10 years) I would just include a puppet module into my solution and say: * Install this package * Edit the files this way, here are the scripts an modules that would help you to do so. And this would fly fine with everybody.
So the whole discussion boils down to making things work nicer together and letting 3rd party vendors to do less work. Since there are no third party vendors lined up for their life to be made easier I come to conclusion that the requirement can be deferred in the first implementation of the INI validation library and can be added as we get 3rd party vendors lined up to build the plugins.
I agree that there is no sense to continue this discussion.
On 04/07/2010 08:10 PM, Dmitri Pal wrote:
I agree that there is no sense to continue this discussion.
I disagree.
Dmitri, you're missing a fundamental piece of information here. You're confusing "configuration files" with "data files".
Configuration files, like sssd.conf, are files that an admin is expected to modify for their environment to function. The proper approach is to have configuration files handle values that will be site-specific. In RPM terminology, these are config files marked as noreplace (in other words, RPM upgrades won't overwrite local changes; they will simply drop sssd.conf.rpmnew in place so an admin can manually merge differences if that is necessary, which in most cases it won't be).
Data files, on the other hand, are files that by necessity MUST be the same for all deployments. A schema file is one such. In order for the SSSD to be stable on all deployments, the bare minimum set of options must always be reliable. These are the files that RPM upgrades would INTENTIONALLY overwrite on an upgrade, to ensure that they were always in-sync with the project.
Right now, with the SSSDConfig API, we are already following this approach. We have a schema file /etc/sssd.api.conf and a drop-directory, sssd.api.d. The sssd.api.conf contains the common options that all services and domains must support. The sssd.api.d/ contains separate files for all of the internal backends that we support: sssd-local.conf, sssd-ldap.conf, sssd-simple.conf, sssd-krb5.conf. Each of these subsequent files contains the set of options available for use by that particular backend.
In theory, for a v1 approach, we could continue using this exact schema format, though it's extremely limited. Right now, it cannot describe the proper format of the value; it will only guarantee that it is an integral type, a string type, or a list of [int|str].
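To make the limitation concrete, here is a minimal sketch of what a type-only check amounts to. It is written in Python for brevity and is not the SSSDConfig or libini_config code; the function name `check_type`, the type labels `"int"`/`"str"`, and the comma default for `sep` are assumptions made for illustration.

```python
def check_type(raw, vtype, is_list=False, sep=","):
    """Return True if raw (a value string from the config file) fits vtype.

    Hypothetical helper: only the declared type is checked, never the
    format of the value itself.
    """
    if is_list:
        items = [part.strip() for part in raw.split(sep)]
    else:
        items = [raw.strip()]
    for item in items:
        if vtype == "int":
            try:
                int(item)
            except ValueError:
                return False
        elif vtype == "str":
            continue  # any text is an acceptable string
        else:
            raise ValueError("unknown type: %s" % vtype)
    return True
```

With a check like this, any text at all passes as a string, so a malformed URI or an out-of-range number cannot be caught; that is the gap the range/regex checks are meant to close.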
I could write a module using libini_config in C to follow this exact behavior in about two days, without adding an actual schema validation into libini_config, but my real goal here is to come up with a solution that can be packaged with libini_config and be available for other projects. We can make this a later goal, if time is constrained.
However, I would REALLY like to come up with the complete format now. I can write a custom validator JUST for the SSSD, but we need to agree on how the format should look.
I think it is very important to get the format down right from the start. We can always opt to ignore certain things at first. (For example, maybe in the first pass we will ignore the dependency checks, but we'll absolutely handle the range/regex checks)
Having a complete format doesn't mandate that the parser has to actually implement everything all at once, but it's VERY hard to change the format later on.
So at this time, I'm recommending that we solve this format question immediately, and defer implementation into the libini_config itself.
-- Stephen Gallagher RHCE 804006346421761
Delivering value year after year. Red Hat ranks #1 in value among software vendors. http://www.redhat.com/promo/vendor/
Stephen Gallagher wrote:
So at this time, I'm recommending that we solve this format question immediately, and defer implementation into the libini_config itself.
I would not argue about the "fundamental" things. I just know how vendors bend the rules if they need to. IMO it is like building a house of cards in front of Godzilla. :-) But this is more a philosophical discussion. We can table it for now.
Defining the format - sure. We can and should do it now so that we know how it would work and avoid problems in the future. But the actual implementation should be deferred, or at least done in stages.
But with Jakub's finding, should we reconsider the XML approach, or do the cons of the XML approach still outweigh the pros?
On 04/08/2010 08:44 AM, Dmitri Pal wrote:
But with Jakub's finding, should we reconsider the XML approach, or do the cons of the XML approach still outweigh the pros?
The more I have been thinking about it, the more I think that XML would be a bad idea for the schema. It's a difficult format to use correctly, and I think it would introduce more headaches than it solves.
-- Stephen Gallagher RHCE 804006346421761
On Wed, Apr 7, 2010 at 1:02 PM, Dmitri Pal dpal@redhat.com wrote:
Stephen Gallagher wrote:
On 04/07/2010 02:46 PM, Dmitri Pal wrote:
See below
We really need to have regex = as an option. For example:
[ldap_uri]
type = field
vtype = string
regex = ldaps?://[\w.]*
list = true
sep = , #How would we do space-separation? Should we even bother?
We can look into regex validation. I do not see a big deal here. It will be a special function, though. It can be added incrementally in a separate patch and handled as a separate task (within the v1 validation functionality).
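As a rough illustration of how small such a "special function" could be, the sketch below applies the regex from the proposed [ldap_uri] entry to each item of a list value. It is Python rather than C and is not part of libini_config; the dictionary `LDAP_URI_RULE`, the function `validate`, and the decision to require the regex to match the whole item are all assumptions, since the thread does not say whether the pattern is anchored.

```python
import re

# Hypothetical in-memory form of the proposed [ldap_uri] schema entry.
LDAP_URI_RULE = {
    "vtype": "string",
    "regex": r"ldaps?://[\w.]*",
    "list": True,
    "sep": ",",
}


def validate(raw, rule):
    """Return the items of raw that do not match rule["regex"]."""
    if rule.get("list"):
        # Split on the explicit separator and trim surrounding whitespace.
        items = [part.strip() for part in raw.split(rule["sep"])]
    else:
        items = [raw.strip()]
    pattern = re.compile(rule["regex"])
    # fullmatch: the whole item must match (an assumption, see above).
    return [item for item in items if pattern.fullmatch(item) is None]
```

An empty result means the value passed. Under the whole-item assumption, a URI with a port such as ldap://host:389 would be rejected by this particular pattern, because ':' is not in [\w.]; that kind of detail is worth settling when the format is defined.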
As for separators, I assume that space is always a valid separator. The ones listed in the schema would be additional separators we can check.
I think it makes sense to NEVER use space as a separator, so that the use of a separator then becomes explicit and unambiguous. (And it's possible to add human-readable spaces around a separator)
Adding a comma to the list of separators, for example, would not generate an error, and the list foo bar baz would parse the same way as foo,bar,baz or foo, bar, baz
There was a bug recently about a similar issue. Support for multiple list separators is already baked into the parsing logic.
My concern is for the possibility of options where you specify foo bar, baz
Because with your proposal, this is always translated to: (foo, bar, baz), when what they really wanted was (foo bar, baz)
Right now in the SSSD there aren't any cases like this, but we're trying to build libini_config for general consumption. I don't think we're out of line requiring an explicit separator. Additionally, we can add another rule that whitespace surrounding a separator is not included in the value.
So name = value1\s\s\s,\svalue2 would become ('value1', 'value2') and not ('value1 ', ' value2')
I kind of disagree with the whole approach. If you want spaces inside the value you need to make the string quoted.
Something like
value = "first value", "second value"
JSON != ini. As a format, ini allows spaces just like Stephen posted. If you want something like JSON perhaps the config format should indeed be JSON or YAML. However, I think Stephen is correct in that: key = value1, value two, valueTHREE
should evaluate to this in python syntax: key = ("value1", "value two", "valueTHREE")
What you proposed seems counter-intuitive.
This is currently not supported but would be easy to add when needed. But again, we are talking about ini files and explicitly about multi-value lists. It is not good practice to have spaces in multi-value lists anyway, and I do not want to encourage people to do it. In the case of a single value, the inner spaces are preserved but leading and trailing ones are trimmed.
Explicit is always better than implicit. The reverse is poor design in most cases. Take a look at places where this caused problems in the past *I'm looking at you php's magic quotes*.
Jeff Schroeder wrote:
What you proposed seems counter-intuitive.
Ok I did some tests and looked at the code. I was wrong. My memory failed me :-)
Here is how it works now.
* Space is not counted as a separator
* Spaces (and tabs) are trimmed around separators
* Inner spaces are preserved
* There is no way to preserve leading or trailing spaces in a multi value list
So I think we are OK on this point. Note: the function does not prevent you from specifying a space as a separator, so if you want it to be a separator you can.
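The behaviour described above can be sketched in a few lines. This is a Python illustration of the stated rules only, not the libini_config function; the name `split_list`, its `seps` parameter, and the choice to drop empty items are assumptions.

```python
def split_list(raw, seps=","):
    """Split raw on any character in seps, trimming spaces and tabs.

    Hypothetical sketch: whitespace around separators is dropped, inner
    whitespace is kept, and empty items are discarded (an assumption).
    """
    items = []
    current = []
    for ch in raw:
        if ch in seps:
            items.append("".join(current).strip(" \t"))
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip(" \t"))
    return [item for item in items if item]
```

Called with the default separator, `split_list("foo bar, baz")` keeps "foo bar" as one item; passing `seps=", "` opts in to space as a separator, in which case the same input yields three items.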
On 04/07/2010 07:36 PM, Stephen Gallagher wrote:
I thought it was a part of libxml2, but upon further research I see I was mistaken.
Sorry for coming late to the discussion, but I think there /is/ support for Schematron in libxml2. Check out the --schematron parameter to xmllint and http://xmlsoft.org/html/libxml-schematron.html
For simple validation, the datatypes W3C module can also be very useful.
sssd-devel@lists.fedorahosted.org