New version of selog.h and observations


Dmitri Pal

27 Mar 2012

11:30 p.m.

Hi,

1) I added more types. I think the more types the API supports out of the box, the more convenient it is for the user. The whole point is that the caller, i.e. the application developer, would not need to convert the data or cast it. Maybe we should eventually add some well-known structures, sockaddr for example, as types. It is always a pain to convert addresses; the library can do it itself. But this can be added as we go. The list of types in this header is good enough for starters.

2) I added arrays. An array can contain values of different types for the same attribute. Keith, does the XML schema with types take this into account?

3) Since we say we do not need LSON, I removed it from the examples. An interesting observation: it would not have worked the way I proposed it, with subtypes in names, anyway because of the arrays. It would have limited the arrays to single-type arrays, which is generally not the case, for example in the proposed XML for auditd.

4) Some thoughts about arrays in the event. How are the arrays created? Does the developer know all the values at the same time, or build the array gradually? How should array elements be referenced? What does it mean that there are many KVPs with the same key and different values? Is it one key with many values, or one key with one value that should be overwritten?

Here is my take:

a) If the developer knows all values in advance, he can just specify them in one call, as in the example on line 176. In the auditd case, for example, this will be convenient because the interpreted and raw values are known at the same time.

b) The data is always added to the event and never modified unless the key is explicitly deleted. This means that the event will logically treat each key as an array of one and will be able to add another value if a KVP with the same key is specified again. This would allow building part of the event in a loop.

c) Events are generally not editable; they are added to. In some cases, though, it might be beneficial to drop some KVP from an already existing set. This can be done by specifying a special value, SELOG_DEL_ATTR. See lines 48 and 179. If the key has more than one value, the whole array is deleted.

d) There is no need to provide a way to access elements of the array, as this would create too much complexity for a corner case. Instead I would suggest (if people ask) adding a way to delete a specific element from an already existing KVP array. This can be done by yet another special value, SELOG_DEL_ATTRELEM, N, where N is the index. This, however, can be added later if needed.
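To make points b) and c) concrete, here is a minimal sketch of an append-only event in which repeating a key adds a value and deleting a key drops every value stored under it. The names demo_event, demo_add, demo_del and demo_count are hypothetical and are not part of the posted selog.h; the fixed-size storage is only there to keep the sketch short.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_KVPS 32

/* Hypothetical in-memory event: a flat list of (key, value) string pairs. */
struct demo_event {
    const char *keys[DEMO_MAX_KVPS];
    const char *vals[DEMO_MAX_KVPS];
    size_t count;
};

/* Point b): adding a key that already exists appends another value;
 * nothing is ever overwritten. Returns 0, or -1 when the event is full. */
static int demo_add(struct demo_event *ev, const char *key, const char *val)
{
    assert(ev != NULL && key != NULL);
    if (ev->count == DEMO_MAX_KVPS)
        return -1;
    ev->keys[ev->count] = key;
    ev->vals[ev->count] = val;
    ev->count++;
    return 0;
}

/* Point c): deleting a key drops every value stored under it.
 * Returns the number of values removed. */
static size_t demo_del(struct demo_event *ev, const char *key)
{
    size_t kept = 0, removed = 0;

    assert(ev != NULL && key != NULL);
    for (size_t i = 0; i < ev->count; i++) {
        if (strcmp(ev->keys[i], key) == 0) {
            removed++;
            continue;
        }
        ev->keys[kept] = ev->keys[i];
        ev->vals[kept] = ev->vals[i];
        kept++;
    }
    ev->count = kept;
    return removed;
}

/* Number of values currently stored under a key (1 = scalar, >1 = array). */
static size_t demo_count(const struct demo_event *ev, const char *key)
{
    size_t n = 0;

    assert(ev != NULL && key != NULL);
    for (size_t i = 0; i < ev->count; i++)
        if (strcmp(ev->keys[i], key) == 0)
            n++;
    return n;
}
```

With this shape, a loop that calls demo_add with the same key builds the array gradually, and no per-element access is needed.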

If you agree with my approach to arrays, I can start on the implementation. I can implement this interface and produce results pretty quickly. Any objections? I can't start, though, unless I hear that it really makes sense to do. We can definitely iterate and polish, but I can't afford to do the work and then throw it away, sorry.

-- Thank you, Dmitri Pal Sr. Engineering Manager IPA project, Red Hat Inc. ------------------------------- Looking to carve out IT costs? www.redhat.com/carveoutcosts/

Attachments:

selog.h (text/plain — 7.3 KB)


David Lang

28 Mar 2012

1:29 a.m.

On Tue, 27 Mar 2012, Dmitri Pal wrote:

> Hi,
>
> I added more types. I think the more types the API supports out of box the more convenient it is to the user. The whole point is that the caller i.e. application developer would not need to convert the data or cast it. May be we should eventually add some well known structures like sockaddr for example as a type. It is always a pain to convert addresses. Library can do it itself. But this can be added as we go. The list of types in this header is good enough for starters.

Yes, this is exactly the type of thing that the library should make easy. Passing the well known structure in has the added advantage that it makes it easier for us to optimize a binary format later.
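As an illustration of what "the library can do it itself" might look like for sockaddr, here is a small sketch that renders an IPv4 or IPv6 socket address as text with the standard inet_ntop call. The helper name demo_sockaddr_to_str and the example function are hypothetical; nothing like them appears in the posted header.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: render an IPv4/IPv6 sockaddr as text.
 * Returns 0 on success, -1 for an unsupported family or a short buffer. */
static int demo_sockaddr_to_str(const struct sockaddr *sa, char *buf, size_t len)
{
    const void *addr;

    assert(sa != NULL && buf != NULL);
    switch (sa->sa_family) {
    case AF_INET:
        addr = &((const struct sockaddr_in *)sa)->sin_addr;
        break;
    case AF_INET6:
        addr = &((const struct sockaddr_in6 *)sa)->sin6_addr;
        break;
    default:
        return -1;
    }
    return inet_ntop(sa->sa_family, addr, buf, (socklen_t)len) ? 0 : -1;
}

/* Usage sketch: returns 0 when 192.168.0.1 survives the round trip. */
static int demo_sockaddr_example(void)
{
    struct sockaddr_in sin;
    char buf[INET6_ADDRSTRLEN];

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    if (inet_pton(AF_INET, "192.168.0.1", &sin.sin_addr) != 1)
        return -1;
    if (demo_sockaddr_to_str((const struct sockaddr *)&sin, buf, sizeof(buf)) != 0)
        return -1;
    return strcmp(buf, "192.168.0.1");
}
```

The caller hands over the structure it already has, and the choice between a string and a binary encoding stays inside the library.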

> I added arrays. Array can contain values of different types for the same attribute. Keith, does XML schema with types take this into the account?

Arrays with different types sound strange to me (especially from a C point of view).

What is the advantage of an array like this, as opposed to simply adding many different elements with the same name? I.e.:

<parent>
  <tag>value</tag>
  <tag>value</tag>
  <tag>value</tag>
  <tag>value</tag>
</parent>

> Some thoughts about arrays in the event. How the arrays are created? Does the developer know all the values at the same time or builds array gradually? How to reference array elements? What does it mean that there are many KVPs with the same key and different values? Is it one key with many values or one key with one value and the value should be overwritten?
>
> Here is my take:
> a) If the developer knows all values in advance he can just specify them in one call like in the example on line 176. For example in the auditd case it will be convenient as the interpreted and raw values are known at the same time.
> b) The data is always added to the event and never modified unless the key is explicitly deleted. This means that the event will logically treat each key as an array of 1 and would be able to add another value if the KVP with the same key is specified again. This would allow building part of the event in a loop.
> c) Events are generally not editable - they are added to but in some cases it might be beneficial to drop some KVP from already existing set. This can be done by specifying a special value SELOG_DEL_ATTR. See lines 48 and 179. If the key has more than one value the whole array is deleted.
> d) There is no need to provide a way to access elements of the array as this would create too much of the complexity for a corner case. Rather than that I would suggest (if people ask) add a way to delete a special element from the already existing KVP array. This can be done by yet another special value SELOG_DEL_ATTRELEM, N when N is the index. This however can be added later if needed.

In general, editing of events feels like unneeded complexity.

I can see doing something like filling in values for an event created from a template, but I can also see it being a reasonable requirement to pass in all the possible values at event creation time.

David Lang

/*
    SELOG - Structured Event Logging Interface

    Aggregated header file for the ELAPI interface.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 3 of the License, or
    (at your option) any later version.
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.
    You should have received a copy of the GNU General Public License
    along with this program. If not, see http://www.gnu.org/licenses/.
*/

#ifndef SELOG_H
#define SELOG_H

/* Main types */
#define SELOG_TYPE_STRING    0  /* Null terminated strings */
#define SELOG_TYPE_ARRAY     1  /* Array of elements */
#define SELOG_TYPE_INT64     2  /* 64-bit signed number */
#define SELOG_TYPE_BIN       3  /* Binary sequence of bytes */
#define SELOG_TYPE_DOUBLE    4  /* Double */
#define SELOG_TYPE_BOOL      5  /* Boolean as a number */
#define SELOG_TYPE_BOOLSTR   6  /* Boolean as string */
#define SELOG_TYPE_IPV4      7  /* IPv4 as binary */
#define SELOG_TYPE_IPV4STR   8  /* IPv4 as string */
#define SELOG_TYPE_IPV6      9  /* IPv6 as binary */
#define SELOG_TYPE_IPV6STR   10 /* IPv6 as string */
#define SELOG_TYPE_UUID      11 /* UUID string in a canonical form */
#define SELOG_TYPE_EMAIL     12 /* Email address in a canonical form */
#define SELOG_TYPE_FQDN      13 /* Host name in a canonical form */
#define SELOG_TYPE_MAC       14 /* Six byte MAC address array */
#define SELOG_TYPE_MACSTR    15 /* MAC address as string */

/* Additional types added for completeness */
#define SELOG_TYPE_I32       50 /* 32-bit signed number */
#define SELOG_TYPE_U32       51 /* 32-bit unsigned number */
#define SELOG_TYPE_U64       52 /* 64-bit unsigned number */
/* More numeric types can be added later if needed */

/* Special argument to remove an attribute */
#define SELOG_DEL_ATTR "-"

/* Initialization Flags - TBD */

/******************************************************************************/
/* Initialization                                                             */
/******************************************************************************/

/* Init the interface */
int selog_init(const char *appname,
               int option,
               int facility,
               int flags);

/* Init the interface with defaults */
int selog_init_simple(const char *appname);

/* Close interface */
void selog_close(void);

/******************************************************************************/
/* Logging                                                                    */
/******************************************************************************/

/* Log key value pairs optionally using precreated event */
int selog_event(int priority, selog_e event, ...);

/* Function to log key value pairs - a wrapper around selog_event */
int selog_kvp(int priority, ...);

/******************************************************************************/
/* Event object manipulation                                                  */
/******************************************************************************/

/* Function to create event out of key value pairs and another event if any */
int selog_event_create(selog_e *new_event, selog_e *event, ...);

/* Function to modify an event, can be used to remove or update values */
int selog_event_modify(selog_e event, ...);

/* Function to destroy event */
int selog_event_destroy(selog_e event);

/******************************************************************************/
/* Interface usage examples                                                   */
/******************************************************************************/
/*
int main()
{
    int error = 0;
    char bindata[10] = { 1,2,3,4,5,6,7,8,9,10 };

    error = selog_init_simple("MyExample");
    if (error) {
        printf("Failed to init log %d\n", error);
        return error;
    }

    error = selog_kvp(LOG_INFO,
                      "field1", SELOG_TYPE_STRING, "my string",
                      "field2", SELOG_TYPE_INT64, -5,
                      "peer_ip", SELOG_TYPE_IPV4STR, "192.168.0.1",
                      "subset!a", SELOG_TYPE_BOOLSTR, "yes",
                      "subset!b", SELOG_TYPE_BIN, 10, bindata,
                      NULL);
    if (error) {
        printf("Failed to log KVPs %d\n", error);
        selog_close();
        return error;
    }

    selog_close();
    return 0;
}

Timestamp is added automatically by the library.

Expected message contents in the syslog message is
(spaces and new lines are added for readability):

@json {
    "timestamp": "2001-12-31T12:00:00",
    "app": "MyExample",
    "level": "INFO",
    "field1": "my string",
    "field2": -5,
    "peer_ip": ["192.168.0.1", "10.100.6.26"],
    "subset": { "a": true, "b": "'0102030405060708090A'" }
}

Other serialization examples will be added in unit tests.

Same data but example on how the event can be built gradually.

int main()
{
    int error = 0;
    char bindata[10] = { 1,2,3,4,5,6,7,8,9,10 };
    selog_e event = NULL;
    selog_e common = NULL;

    error = selog_init_simple("MyExample");
    if (error) {
        printf("Failed to init log %d\n", error);
        return error;
    }

    error = selog_event_create(&common, NULL,
                               "field1", SELOG_TYPE_STRING, "my string",
                               NULL);
    if (error) {
        printf("Failed to create an event %d\n", error);
        selog_close();
        return error;
    }

    error = selog_event_create(&event, common,
                               "field2", SELOG_TYPE_INT64, -5,
                               "field3", SELOG_TYPE_EMAIL, "me@mydomain.org",
                               NULL);
    if (error) {
        printf("Failed to create an event %d\n", error);
        selog_event_destroy(common);
        selog_close();
        return error;
    }

    error = selog_event_modify(event,
                               "peer_ip", SELOG_TYPE_ARRAY, 2,
                                   SELOG_TYPE_IPV4STR, "192.168.0.1",
                                   SELOG_TYPE_IPV4STR, "10.100.6.26",
                               SELOG_DEL_ATTR, "field3",
                               "subset!a", SELOG_TYPE_BOOLSTR, "yes",
                               NULL);
    if (error) {
        printf("Failed to modify event %d\n", error);
        selog_event_destroy(common);
        selog_event_destroy(event);
        selog_close();
        return error;
    }

    error = selog_event(LOG_INFO, event,
                        "subset!b", SELOG_TYPE_BIN, 10, bindata,
                        NULL);
    if (error) {
        printf("Failed to log event %d\n", error);
        selog_event_destroy(common);
        selog_event_destroy(event);
        selog_close();
        return error;
    }

    selog_event_destroy(common);
    selog_event_destroy(event);
    selog_close();
    return 0;
}
*/

#endif
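The examples in the header pass (key, type, value) triples terminated by NULL through a variadic call. The sketch below shows one way such an argument list could be walked with va_arg. It is not the selog implementation: demo_kvp, its output format and the DEMO_TYPE_* tags are hypothetical, and only two of the proposed types are handled. One assumption worth stating is that a 64-bit value has to be passed with an explicit int64_t cast, because a bare -5 travels through the variadic call as an int.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Illustrative type tags; the values mirror two entries of the posted header. */
#define DEMO_TYPE_STRING 0
#define DEMO_TYPE_INT64  2

/* Hypothetical walker over (key, type, value) triples ending with a NULL key.
 * Writes "key=value " pairs into out; returns the number of pairs, or -1 on
 * an unknown type tag or when out is too small. */
static int demo_kvp(char *out, size_t outlen, ...)
{
    va_list ap;
    const char *key;
    size_t used = 0;
    int pairs = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    va_start(ap, outlen);
    while ((key = va_arg(ap, const char *)) != NULL) {
        int type = va_arg(ap, int);
        int n;

        if (type == DEMO_TYPE_STRING)
            n = snprintf(out + used, outlen - used, "%s=%s ",
                         key, va_arg(ap, const char *));
        else if (type == DEMO_TYPE_INT64)
            n = snprintf(out + used, outlen - used, "%s=%lld ",
                         key, (long long)va_arg(ap, int64_t));
        else
            n = -1; /* unknown tag: the value width is unknowable, so stop */
        if (n < 0 || (size_t)n >= outlen - used) {
            va_end(ap);
            return -1;
        }
        used += (size_t)n;
        pairs++;
    }
    va_end(ap);
    return pairs;
}

/* Usage sketch: returns 0 when both pairs are rendered as expected. */
static int demo_kvp_example(void)
{
    char buf[64];
    int pairs = demo_kvp(buf, sizeof(buf),
                         "field1", DEMO_TYPE_STRING, "my string",
                         "field2", DEMO_TYPE_INT64, (int64_t)-5,
                         (const char *)NULL);

    return (pairs == 2 && strcmp(buf, "field1=my string field2=-5 ") == 0) ? 0 : -1;
}
```

Because the walker cannot know the width of a value it does not recognize, it has to stop at the first unknown tag, which is one argument for keeping the list of type tags small and fixed.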

_______________________________________________ lumberjack-developers mailing list lumberjack-developers@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/lumberjack-developers

Rainer Gerhards

6:05 a.m.


-----Original Message----- From: lumberjack-developers-bounces@lists.fedorahosted.org [mailto:lumberjack-developers-bounces@lists.fedorahosted.org] On Behalf Of david@lang.hm Sent: Wednesday, March 28, 2012 7:29 AM To: dpal@redhat.com; lumberjack logging Subject: Re: [lumberjack] New version of selog.h and observations

On Tue, 27 Mar 2012, Dmitri Pal wrote:


> > I added arrays. Array can contain values of different types for the same attribute. Keith, does XML schema with types take this into the account?
>
> arrays with different types sounds strange to me (especially from a C point of view)

++1 - don't like that idea... At least it should not be an "array" but something like a "set" (?). Array IMHO implies to most programmers same data type (even an array of objects has object as the base type, no matter what the actual object is...).


> in general editing of events feels like unneeded complexity.

But it's probably good to have for augmenting relays.

Rainer

Dmitri Pal

11:18 a.m.

On 03/28/2012 01:29 AM, david@lang.hm wrote:

> On Tue, 27 Mar 2012, Dmitri Pal wrote:

> > I added arrays. Array can contain values of different types for the same attribute. Keith, does XML schema with types take this into the account?
>
> arrays with different types sounds strange to me (especially from a C point of view)

Yes, but this is what was proposed for processing interpreted and uninterpreted auditd output. I think it is reasonable to assume that other use cases will emerge.


Rainer Gerhards

11:21 a.m.

> Yes but this is what was proposed for processing interpreted and uninterpreted auditd output. I think it is reasonable to assume that other use cases would emerge.

I guess I missed that mail. Is it really an array? I'd still say it is a set...

Rainer

Dmitri Pal

11:58 a.m.

On 03/28/2012 11:21 AM, Rainer Gerhards wrote:

> > Yes but this is what was proposed for processing interpreted and uninterpreted auditd output. I think it is reasonable to assume that other use cases would emerge.
>
> I guess I missed that mail. Is it really an array? I'd still say it is a set...
>
> Rainer

This is how it was in JSON:

1 2012-03-16T20:37:04 localhost.localdomain cee-plugin 9011 - - @cee:{"time":"2012-03-09T14:56:32.359-05:00","serial":"192","id":"1100","p_host":[],"p_app":"auditd","file":"stdin","line":"4810","auditd.type":"USER_AUTH","auditd.pid":"0","auditd.uid":["0","root"],"auditd.auid":["1000","bockel"],"auditd.ses":"1","auditd.subj":"unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023","auditd.op":"PAM:authentication","auditd.acct":"root","auditd.exe":"/bin/su","auditd.hostname":[],"auditd.addr":[],"auditd.terminal":"pts/1","auditd.res":"success"}

Maybe I misread, but the encoding can be:

["0", "root"]

and

[0,"root"]

Both should be possible. So it seems that I need to support both arrays and sets. Not a biggie; it can be done.
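For illustration, the two encodings differ only in whether the first element is carried as a number or as a string. A tagged value is one way to let the caller pass the uid without sprintf-ing it first. The sketch below is an assumption about how that could look; demo_value and demo_array_to_json are hypothetical names, and the string branch assumes the text needs no JSON escaping.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum demo_kind { DEMO_INT, DEMO_STR };

/* Hypothetical tagged value: one array element, either a number or a string. */
struct demo_value {
    enum demo_kind kind;
    union {
        int64_t i;
        const char *s;
    } u;
};

/* Render the elements as a JSON array. Strings are emitted verbatim, so this
 * sketch assumes they contain nothing that needs JSON escaping.
 * Returns 0 on success, -1 when out is too small. */
static int demo_array_to_json(const struct demo_value *v, size_t n,
                              char *out, size_t outlen)
{
    size_t used = 0;

    assert(v != NULL && out != NULL && outlen >= 3); /* room for "[]" */
    out[used++] = '[';
    for (size_t i = 0; i < n; i++) {
        const char *sep = (i == 0) ? "" : ",";
        int w;

        if (v[i].kind == DEMO_INT)
            w = snprintf(out + used, outlen - used, "%s%lld",
                         sep, (long long)v[i].u.i);
        else
            w = snprintf(out + used, outlen - used, "%s\"%s\"",
                         sep, v[i].u.s);
        if (w < 0 || (size_t)w >= outlen - used - 1) /* keep a byte for ']' */
            return -1;
        used += (size_t)w;
    }
    out[used++] = ']';
    out[used] = '\0';
    return 0;
}
```

Passing { DEMO_INT 0, DEMO_STR "root" } renders the mixed form, and passing two DEMO_STR elements renders the all-string form that appears in the auditd example.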


Rainer Gerhards

12:02 p.m.


-----Original Message----- From: Dmitri Pal [mailto:dpal@redhat.com] Sent: Wednesday, March 28, 2012 5:58 PM To: Rainer Gerhards Cc: lumberjack logging; david@lang.hm Subject: Re: [lumberjack] New version of selog.h and observations

On 03/28/2012 11:21 AM, Rainer Gerhards wrote:


> May be I misread but the encoding can be:
>
> ["0", "root"]
>
> and
>
> [0,"root"]
>
> Both should be possible. So it seems that I need to support arrays and sets. Not a biggie can be done.

This even looks like an ordered set, where the index matters. Or in other words: it's values with implicit names, where the name is provided by the index? I mean we get position-dependent here, right?

If I am right, this is not necessarily bad, but do we intentionally want to do that? If so, we should spell it out explicitly. I prefer explicit names.

Rainer

Dmitri Pal

12:10 p.m.

On 03/28/2012 12:02 PM, Rainer Gerhards wrote:


> This even looks like an ordered set, where the index matters. Or in other words: its values with implicit names, where the name is provided by the index? I mean we get position-dependent here, right?
>
> If I am right, this is not necessarily bad, but do we intentionally want to do that? If so, we should spell it out explicitly. I prefer explicit names.

I am not saying it should be ordered. I am saying we should not force a developer to sprintf the uid into a string before passing it into the library, and we should allow the developer to pass the two values in any order. However, I agree the spec should explicitly allow or restrict mixed types.

Keith, what is your take?


Rainer Gerhards

12:13 p.m.

> I am not saying it should be ordered. I am saying we should not force a developer to sprintf uid into string before passing it into the library and should allow developer to pass the two values in any order.

If not spec'ed, that could quickly boil down (again) to not knowing exactly what information you are dealing with.

> However I agree the spec should explicitly allow or restrict mixed types.
>
> Keith, what is your take?

Rainer

William Heinbockel

1 Apr 2012

4:11 p.m.

On Mar 28, 2012 12:10 PM, "Dmitri Pal" dpal@redhat.com wrote:

> I am not saying it should be ordered. I am saying we should not force a developer to sprintf uid into string before passing it into the library and should allow developer to pass the two values in any order. However I agree the spec should explicitly allow or restrict mixed types.
>
> Keith, what is your take?

The easiest way to support this in XML is to follow David's proposal from above. In that case, each value has its own element. In any case, there is no way to "enforce" that all values are of the same type. The same goes for JSON: there is no restriction on mixing types in an array.

Whether or not we support mixed types influences the field names. For example, do we have a "src" that has IP and string values? Or do we split them into separate fields based on the different types?

Personally, I agree with Rainer, though I am undecided as to which would work better for our use case.


David Lang

10:05 p.m.

On Sun, 1 Apr 2012, William Heinbockel wrote:


> The easiest way to support this in xml is to follow David's proposal from above. In that case, each value has its own element. In any case, there is no way to "enforce" that all values are of the same type. Same with json, there is no restriction on the mixing of types in an array.
>
> Whether or not we support mixed types influences the field names. For example, to we have a "src" that has ip and string values? Or do we split them into separate fields based on the different types.
>
> Personally, I agree with Rainer, though I am mixed as to which would work better for our use case

I think we are better off with a source that contains multiple possible items. These items may be of different types, and depending on the source, some items may not be there.

Possible items that I can think of (for different contexts):

IP
hostname
FQDN
port
interface

David Lang
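One way to picture a "source" with several optional, differently typed items is a record in which each item can be absent. The sketch below is only an assumption about how that might look in C; demo_source, its field names and the "name=value" rendering are hypothetical and are not taken from selog.h.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical "source" record: every item is optional. A NULL pointer or a
 * negative port means the item was not available for this event. */
struct demo_source {
    const char *ip;
    const char *hostname;
    const char *fqdn;
    int port;
    const char *iface;
};

/* Append one "name=value " pair if the value is present. Returns 0 or -1. */
static int demo_put(char *out, size_t outlen, const char *name, const char *val)
{
    size_t used = strlen(out);
    int w;

    if (val == NULL)
        return 0;
    w = snprintf(out + used, outlen - used, "%s=%s ", name, val);
    return (w < 0 || (size_t)w >= outlen - used) ? -1 : 0;
}

/* Render only the items that are present. Returns 0, or -1 if out is too small. */
static int demo_source_render(const struct demo_source *src, char *out, size_t outlen)
{
    char port[16];

    assert(src != NULL && out != NULL && outlen > 0);
    out[0] = '\0';
    if (src->port >= 0)
        snprintf(port, sizeof(port), "%d", src->port);
    if (demo_put(out, outlen, "ip", src->ip) ||
        demo_put(out, outlen, "hostname", src->hostname) ||
        demo_put(out, outlen, "fqdn", src->fqdn) ||
        demo_put(out, outlen, "port", src->port >= 0 ? port : NULL) ||
        demo_put(out, outlen, "interface", src->iface))
        return -1;
    return 0;
}
```

An event that only knows the IP address and port would fill in those two members and leave the rest unset, so each item keeps its own name and type.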


William Heinbockel

4:03 p.m.

On Mar 28, 2012 11:58 AM, "Dmitri Pal" dpal@redhat.com wrote:


> May be I misread but the encoding can be:
>
> ["0", "root"]
>
> and
>
> [0,"root"]

Do not get stuck on this format. The one thing that auditd does that we should think about is resolution.

For example, uid 0 gets "resolved" to "root". So there is an equivalence between 0 and root. Similar applies to ip and dns names. Some events will have one or the other, some may have both.

Should we have some way of expressing this?
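One possible way to express this equivalence, sketched here purely as an assumption, is to keep the raw identifier and its resolved name side by side under explicit names, with either half allowed to be missing. The demo_resolved structure and the "raw"/"name" member names are hypothetical; the name string is emitted without JSON escaping.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical pairing of a raw identifier with its resolved name.
 * Either half may be missing: some events carry one, some the other. */
struct demo_resolved {
    int has_raw;        /* non-zero when raw holds a value */
    int64_t raw;        /* e.g. uid 0 */
    const char *name;   /* e.g. "root"; NULL when no resolved form is known */
};

/* Render as "key":{...} with explicit member names instead of array positions.
 * Returns 0 on success, -1 when nothing is present or out is too small. */
static int demo_resolved_render(const struct demo_resolved *r, const char *key,
                                char *out, size_t outlen)
{
    int w;

    assert(r != NULL && key != NULL && out != NULL && outlen > 0);
    if (r->has_raw && r->name != NULL)
        w = snprintf(out, outlen, "\"%s\":{\"raw\":%lld,\"name\":\"%s\"}",
                     key, (long long)r->raw, r->name);
    else if (r->has_raw)
        w = snprintf(out, outlen, "\"%s\":{\"raw\":%lld}",
                     key, (long long)r->raw);
    else if (r->name != NULL)
        w = snprintf(out, outlen, "\"%s\":{\"name\":\"%s\"}", key, r->name);
    else
        return -1;
    return (w < 0 || (size_t)w >= outlen) ? -1 : 0;
}
```

Compared with a positional pair such as ["0", "root"], this makes it explicit which value is the raw form and which is the resolution, at the cost of a more verbose encoding.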


Rainer Gerhards

28 Mar 2012

6:07 a.m.

In addition to the feedback in reply to David...

> If you agree with my approach to arrays I can start on the implementation. I really can do an implementation of this interface and produce the results pretty quickly. Any objections?

This sounds good enough for me with my current understanding and scope. But talking about evolution, I cannot guarantee that I will not change my mind later. I simply do not know enough (and only time will tell IMHO)...

Sorry for not having a more definite answer.

Rainer
