Re: [lumberjack] how many steps to do - - lumberjack-developers


Rainer Gerhards

28 Mar 2012

7:05 a.m.

+1 Well said :-) Rainer

Gergely Nagy algernon@balabit.hu wrote: "Rainer Gerhards" rgerhards@hq.adiscon.com writes:

...

...
With the current plan we ask application developers to change things twice: first to the new syslog umberlog interface, and then later to a better interface. IMO this is wrong. I tried to raise concerns about this several times but have not been heard. IMO we need to define one interface that application developers would use and start to migrate to, and then evolve the infrastructure under it.

We should take selog or the like, polish it, wrap it around ul_log or pure syslog, and have it as the interface. While developers migrate to it, we will evolve the library and syslog implementation under it to serialize and format things into JSON, XML, etc., but developers will have to change things only once.

This definitely is a valid argument, but I think we need to weigh the pros and cons. Too much change in a single instant is very hard to sell. Too small steps don't make sense, either.

My personal opinion (and experience) is that smaller steps work better than larger ones. There are obviously people in the absolute opposite camp. I still think that evolution works better than revolution.

There's one more thing I would like to add, regarding the discussion about handshake and how to change/replace/extend/whatever /dev/log: why care?

I don't think there must be a single interface to get logs from one application to another, nor should we care about the transportation issue, unless in context of libc (but more on that later, in another thread).

Why? Because there's no single size that fits all. We want to be able to *work with* logs, so we need a format, or representation that makes sense, is easy to work with, and is pretty much a standard. How it arrives from the producer to the consumer, is none of our business, in my opinion.

Like Rainer, I'm not against anyone trying to go down this route, but that's a route that I, personally, do not wish to tread.

-- |8]

_______________________________________________
lumberjack-developers mailing list
lumberjack-developers@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/lumberjack-developers


Brian Knox

28 Mar

7:20 a.m.

New subject: how many steps to do -

A nice, fast call I can use to generate the structured logs in a CEE serialization format is important to me. Additional features such as handling the transport are not as much. If people want to add in the ability to handle the transport I can't find a reason to object, as long as there is a transport agnostic call that given a set of data from an application on a host, hands me back a serialized CEE log line ready to use.

Brian
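A minimal sketch of the kind of transport-agnostic call described above: it takes name-value pairs and hands back one serialized line, leaving delivery to the caller. The function name `serialize_cee` is hypothetical, and the `@cee:` cookie followed by a JSON object is an assumption about the serialization, not something this thread specifies.

```python
import json

# Assumption: structured payloads are marked with an "@cee:" cookie.
CEE_COOKIE = "@cee:"


def serialize_cee(fields):
    """Return one serialized log line for a dict of name-value pairs.

    No socket, file, or syslog call happens here; the caller decides
    how (and whether) the line is transported.
    """
    payload = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return f"{CEE_COOKIE} {payload}"


if __name__ == "__main__":
    # The caller picks the transport; printing is just one option.
    print(serialize_cee({"msg": "user logged in", "uid": 1000}))
```

The returned string could then be passed to syslog(), written to a file, or sent over any other channel without the serializer knowing about it.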


William Heinbockel

11:27 a.m.

New subject: how many steps to do -

Right.

My thought was to provide options:

1. umberlog is a fast, easy way to get minimal support via syslog.h

2. More full featured log API options will be produced (e.g., selog/ELAPI)

Some users will do the fast, easy option (i.e., umberlog). Others will utilize the more advanced API. Some folks may first transition to the umberlog, then to a more advanced API.

The importance is in the underlying consistency of the event model. That is, that I can use the same name-value constructs from umberlog and use the same/similar model with selog.


David Lang

4:31 p.m.

New subject: how many steps to do -

On Wed, 28 Mar 2012, William Heinbockel wrote:

...

Right.

My thought was to provide options:

umberlog is a fast, easy way to get minimal support via syslog.h

More full featured log API options will be produced (e.g., selog/ELAPI)

Some users will do the fast, easy option (i.e., umberlog). Others will utilize the more advanced API. Some folks may first transition to the umberlog, then to a more advanced API.

The importance is in the underlying consistency of the event model. That is, that I can use the same name-value constructs from umberlog and use the same/similar model with selog.

Same serialization formats, yes.

Same names in the same structure is something that we cannot force; that is going to be entirely up to the application programmer. They can follow the CEE suggestions, or they can opt to ignore them and do their own thing.

Unfortunately, I expect the latter to be the common case. But even in that case, the log will be structured so that the data fields can be extracted from it unambiguously.

David Lang
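To illustrate the point about unambiguous extraction, here is a small consumer-side sketch. It assumes a JSON payload marked by an `@cee:` cookie (an assumption, as is the helper name `extract_fields`); the field names in the example are deliberately not taken from any dictionary, yet they still parse cleanly.

```python
import json

# Assumption: structured payloads are marked with an "@cee:" cookie.
COOKIE = "@cee:"


def extract_fields(line):
    """Return the name-value pairs of a structured line, or None.

    None is returned for lines with no cookie or with a payload that
    is not a valid JSON object, so callers can fall back to treating
    the line as free text.
    """
    idx = line.find(COOKIE)
    if idx == -1:
        return None
    try:
        data = json.loads(line[idx + len(COOKIE):])
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


if __name__ == "__main__":
    # Application-specific names, not CEE-suggested ones.
    print(extract_fields('@cee: {"usr": "bob", "my_errno": 13}'))
    print(extract_fields("plain old syslog text"))
```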


David Lang

4:34 p.m.

New subject: how many steps to do -

On Wed, 28 Mar 2012, Brian Knox wrote:

...

A nice, fast call I can use to generate the structured logs in a CEE serialization format is important to me. Additional features such as handling the transport are not as much. If people want to add in the ability to handle the transport I can't find a reason to object, as long as there is a transport agnostic call that given a set of data from an application on a host, hands me back a serialized CEE log line ready to use.

This is a slightly different requirement from the one we've been talking about.

This is like a sprintf version of the syslog() call, but it would need an added parameter (or global variable) to define which serialization options to use.

David Lang
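A rough sketch of what such a "sprintf-style" call with an explicit serialization parameter might look like. Everything here is hypothetical: the names `Serialization` and `format_event`, and in particular the XML layout, which is an illustrative shape and not the CEE XML encoding.

```python
import json
from enum import Enum
from xml.sax.saxutils import escape, quoteattr


class Serialization(Enum):
    JSON = "json"
    XML = "xml"


def format_event(fields, serialization=Serialization.JSON):
    """Return the event as a string; nothing is sent anywhere."""
    if serialization is Serialization.JSON:
        return json.dumps(fields, sort_keys=True)
    if serialization is Serialization.XML:
        # Hypothetical layout: one <field> element per name-value pair.
        parts = [
            f"<field name={quoteattr(str(name))}>{escape(str(value))}</field>"
            for name, value in sorted(fields.items())
        ]
        return "<event>" + "".join(parts) + "</event>"
    raise ValueError(f"unsupported serialization: {serialization!r}")


if __name__ == "__main__":
    event = {"msg": "disk usage > 90%", "host": "web1"}
    print(format_event(event))
    print(format_event(event, Serialization.XML))
```

Passing the option per call avoids a global variable, at the cost of every caller having to know which format the consumer expects.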


Dmitri Pal

11:40 a.m.

New subject: how many steps to do -


So please explain to me how you think the library should decide what to use for serialization: JSON, XML, XML with types, or something else we add later? The file approach that Rainer suggested is close to a config file for a library. I thought we wanted to avoid that. Are we back to ENV vars? That does not look appealing to me. I do not think the application cares about the serialization format. Only the consumer should (in this case syslog implementations, or other log destinations if we ever add them). So I fail to understand why you try to stay away from driving the format preference for the library.

-- Thank you, Dmitri Pal Sr. Engineering Manager IPA project, Red Hat Inc. ------------------------------- Looking to carve out IT costs? www.redhat.com/carveoutcosts/

Rainer Gerhards

11:44 a.m.

New subject: how many steps to do -

...

So please explain to me how you think the library should decide what to use for serialization: JSON, XML, XML with types, or something else we add later? The file approach that Rainer suggested is close to a config file for a library. I thought we wanted to avoid that. Are we back to ENV vars? That does not look appealing to me. I do not think the application cares about the serialization format. Only the consumer should (in this case syslog implementations, or other log destinations if we ever add them). So I fail to understand why you try to stay away from driving the format preference for the library.

I guess the answer to this question is already buried in some thread, but please bear with me.

Do the JSON, XML, ... representations convey the same or different information?

(In other words: do we intend to have a single standard or multiple slightly different ones?)

Rainer

Dmitri Pal

12:03 p.m.

New subject: how many steps to do -

On 03/28/2012 11:44 AM, Rainer Gerhards wrote:


I guess the answer to this question is already buried in some thread, but please bear with me.

Do the JSON, XML, ... representations convey the same or different information?

Same but because some of them do not pass all the types (JSON) the information is a bit tarnished.

...

(In other words: do we intend to have a single standard or multiple slightly different ones?)

Information is the same but lack of type support creates an opportunity for something to be lost in translation.

...

Rainer

-- Thank you, Dmitri Pal Sr. Engineering Manager IPA project, Red Hat Inc. ------------------------------- Looking to carve out IT costs? www.redhat.com/carveoutcosts/

Rainer Gerhards

12:05 p.m.

New subject: how many steps to do -

...

-----Original Message----- From: Dmitri Pal [mailto:dpal@redhat.com] Sent: Wednesday, March 28, 2012 6:03 PM To: Rainer Gerhards Cc: lumberjack logging Subject: Re: [lumberjack] how many steps to do -

On 03/28/2012 11:44 AM, Rainer Gerhards wrote:


I guess the answer to this question is already buried in some thread, but please bear with me.

Do the JSON, XML, ... representations convey the same or different information?

Same but because some of them do not pass all the types (JSON) the information is a bit tarnished.

...
(In other words: do we intend to have a single standard or multiple slightly different ones?)

Information is the same but lack of type support creates an opportunity for something to be lost in translation.

I know this has been discussed at length, but if we really see value in the types, we need to preserve them somehow in JSON. Information loss just because of different encoding is a really bad thing IMHO. You create a second-class citizen. One of two things will happen:

a) nobody uses JSON
b) nobody uses types

I'd assume b) will happen.

Rainer
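One conceivable way to keep type information in JSON is to carry an explicit type tag next to each value. The sketch below is only an illustration of that idea; the `{"t": ..., "v": ...}` envelope and the function names are hypothetical and were not proposed in this thread.

```python
import ipaddress
import json


def encode_typed(fields):
    """Serialize fields to JSON, wrapping each value with a type tag."""
    out = {}
    for name, value in fields.items():
        if isinstance(value, (ipaddress.IPv4Address, ipaddress.IPv6Address)):
            out[name] = {"t": "ip", "v": str(value)}
        elif isinstance(value, int):
            out[name] = {"t": "int", "v": value}
        else:
            out[name] = {"t": "string", "v": str(value)}
    return json.dumps(out, sort_keys=True)


def decode_typed(text):
    """Rebuild typed values from the tagged JSON produced above."""
    out = {}
    for name, item in json.loads(text).items():
        if item["t"] == "ip":
            out[name] = ipaddress.ip_address(item["v"])
        else:
            out[name] = item["v"]
    return out


if __name__ == "__main__":
    event = {"src": ipaddress.ip_address("192.0.2.1"), "port": 514}
    # Plain JSON: the address survives only as an ordinary string.
    print(json.dumps({"src": str(event["src"]), "port": event["port"]}))
    # Tagged JSON: the consumer can tell it was an IP address.
    print(decode_typed(encode_typed(event)) == event)
```

The trade-off is a more verbose payload and a tag vocabulary that producers and consumers have to agree on.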

Dmitri Pal

12:22 p.m.

New subject: how many steps to do -

On 03/28/2012 12:05 PM, Rainer Gerhards wrote:


I know this has been discussed at length, but if we really see value in the types, we need to preserve them somehow in JSON. Information loss just because of different encoding is a really bad thing IMHO. You create a second-class citizen. One of two things will happen:

a) nobody uses JSON b) nobody uses types

I'd assume b) will happen.

Rainer

a) JSON
b) XML

DATA -> XML: no loss, but heavy
DATA -> JSON: potential loss, but good enough

The library supports both; syslog supports both. Which should be used? Who defines it?

Application developer
Library developer
Application admin
System admin
Syslog developer

IMO it is the application or system admin's responsibility to configure an instance of syslog to suit the needs of the application. So the main focus of his configuration will be syslog. To avoid double effort, and to avoid requiring an environment variable that selects a library configuration matching syslog's, the syslog configuration should be detectable by the library. This is what I am arguing for.

-- Thank you, Dmitri Pal Sr. Engineering Manager IPA project, Red Hat Inc. ------------------------------- Looking to carve out IT costs? www.redhat.com/carveoutcosts/

Rainer Gerhards

12:36 p.m.

New subject: how many steps to do -

...

-----Original Message----- From: Dmitri Pal [mailto:dpal@redhat.com] Sent: Wednesday, March 28, 2012 6:22 PM To: Rainer Gerhards Cc: lumberjack logging Subject: Re: [lumberjack] how many steps to do -


a) JSON b) XML

DATA -> XML: no loss, but heavy
DATA -> JSON: potential loss, but good enough

The library supports both; syslog supports both.

I strongly doubt that XML over syslog will fly.

Balabit guys: will you support that? I had no plans for it in rsyslog (but that can change). IMO it's way too heavy for syslog, especially if in a legacy relay chain you have some old UDP stuff...

Rainer

Dmitri Pal

12:54 p.m.

New subject: how many steps to do -

On 03/28/2012 12:36 PM, Rainer Gerhards wrote:


I strongly doubt that XML over syslog will fly.

Balabit guys: will you support that? I had no plans for it in rsyslog (but that can change). IMO it's way too heavy for syslog, especially if in a legacy relay chain you have some old UDP stuff...

Rainer

Here are my basic assumptions:

1) The library (ELAPI/selog) is for use by applications on top of syslog.
2) The library would emit structured data into syslog via old or new interfaces, and that should be its sole purpose.

The whole discussion about the types was about passing information about very specific types application knows about down the stack. We agreed I think to evolve the precision and start with JSON and XML and later add other formats if needed. But if syslog would not support XML why bother. Even less work for me.

-- Thank you, Dmitri Pal Sr. Engineering Manager IPA project, Red Hat Inc. ------------------------------- Looking to carve out IT costs? www.redhat.com/carveoutcosts/

Rainer Gerhards

12:56 p.m.

New subject: how many steps to do -

...

The whole discussion about the types was about passing information about very specific types application knows about down the stack. We agreed I think to evolve the precision and start with JSON and XML and later add other formats if needed. But if syslog would not support XML why bother.

I have not yet said that ;) Let's wait for Balabit to chime in... (nxlog as far as I understood would probably not do syslog but its own protocol, thus no question up to there).

Rainer

Botond Botyanszki

1:13 p.m.

New subject: how many steps to do -

On Wed, 28 Mar 2012 18:56:24 +0200 "Rainer Gerhards" rgerhards@hq.adiscon.com wrote:

...

...
The whole discussion about the types was about passing information about very specific types application knows about down the stack. We agreed I think to evolve the precision and start with JSON and XML and later add other formats if needed. But if syslog would not support XML why bother.

I have not yet said that ;) Let's wait for Balabit to chime in... (nxlog as far as I understood would probably not do syslog but its own protocol, thus no question up to there).

Nxlog can already do (parse and generate) XML and JSON over syslog (both legacy and IETF). It has its own binary protocol to be able to transfer structured logs with type information over the network, but this does not mean you cannot use something else (the aforementioned formats, CSV etc).

Binary protocols are more efficient because you do not need to parse/escape the data and in some cases you can even avoid byte copy. In this regard I don't think there is much difference between XML and JSON. Both need escaping. Just because the XML tags make the result somewhat larger I wouldn't claim that JSON is superior.

Regards, Botond
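To make the escaping point concrete, here is a minimal sketch in Python. It is not nxlog's wire format: the 4-byte length prefix and the function names are assumptions, chosen only to contrast a text encoding, which must escape the payload, with a length-prefixed binary frame, which copies it through untouched.

```python
import json
import struct


def encode_json(message: str) -> bytes:
    # Text encoding: quotes, backslashes and control characters get escaped.
    return json.dumps({"msg": message}).encode("utf-8")


def encode_framed(message: str) -> bytes:
    # Hypothetical binary frame: 4-byte big-endian length, then the raw bytes.
    payload = message.encode("utf-8")
    return struct.pack("!I", len(payload)) + payload


def decode_framed(frame: bytes) -> str:
    # The reader needs no parsing or unescaping, only the length prefix.
    (length,) = struct.unpack_from("!I", frame)
    return frame[4:4 + length].decode("utf-8")


message = 'open("C:\\tmp\\app.log") failed\n'
print(encode_json(message))    # escaped text form
print(encode_framed(message))  # length prefix followed by the raw payload
```

The same escaping cost applies to XML (entity references instead of backslashes), which is why the sketch says nothing about JSON versus XML, only about text versus binary.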

William Heinbockel

1:56 p.m.

New subject: how many steps to do -

On Wed, Mar 28, 2012 at 12:56 PM, Rainer Gerhards rgerhards@hq.adiscon.com wrote:

...

...
The whole discussion about the types was about passing information about very specific types application knows about down the stack. We agreed I think to evolve the precision and start with JSON and XML and later add other formats if needed. But if syslog would not support XML why bother.

I have not yet said that ;) Let's wait for Balabit to chime in... (nxlog as far as I understood would probably not do syslog but its own protocol, thus no question up to there).

Rainer

JSON is definitely the priority over XML. As for whether all lumberjack-based APIs should support the creation of XML logs, I think that is a resounding *no*. They should support JSON, however.

As long as everyone supports JSON, translation from XML to JSON is trivial to provide (assuming the representation of common data types is consistent, especially for timestamps and IPv4/IPv6 addresses).

David Lang

4:40 p.m.

New subject: how many steps to do -

On Wed, 28 Mar 2012, William Heinbockel wrote:

...

On Wed, Mar 28, 2012 at 12:56 PM, Rainer Gerhards rgerhards@hq.adiscon.com wrote:

...
...
The whole discussion about the types was about passing information about very specific types application knows about down the stack. We agreed I think to evolve the precision and start with JSON and XML and later add other formats if needed. But if syslog would not support XML why bother.

I have not yet said that ;) Let's wait for Balabit to chime in... (nxlog as far as I understood would probably not do syslog but its own protocol, thus no question up to there).

Rainer

JSON is definitely the priority over XML. As for whether all lumberjack-based APIs should support the creation of XML logs, I think that is a resounding *no*. They should support JSON, however.

why should the libraries not support creating XML? If they shouldn't, then why are we even talking about XML?

As I see it, the current fad/'everybody knows' is swinging from "anything structured should be in XML" to "anything structured should be in JSON because XML is too slow". some people buy into this, others are still in the XML camp, and yet others dislike both.

given a choice, I personally would not do both, but XML does have some advantages (the schema support can be very useful in the right place for example)

David Lang

William Heinbockel

12:55 p.m.

New subject: how many steps to do -

On Wed, Mar 28, 2012 at 12:36 PM, Rainer Gerhards rgerhards@hq.adiscon.com wrote:

...

...
-----Original Message-----
From: Dmitri Pal [mailto:dpal@redhat.com]
Sent: Wednesday, March 28, 2012 6:22 PM
To: Rainer Gerhards
Cc: lumberjack logging
Subject: Re: [lumberjack] how many steps to do -

On 03/28/2012 12:05 PM, Rainer Gerhards wrote:

...
...
-----Original Message-----
From: Dmitri Pal [mailto:dpal@redhat.com]
Sent: Wednesday, March 28, 2012 6:03 PM
To: Rainer Gerhards
Cc: lumberjack logging
Subject: Re: [lumberjack] how many steps to do -

On 03/28/2012 11:44 AM, Rainer Gerhards wrote:

...
...
So please explain to me how you think the library should decide what to use for serialization: JSON, XML, XML with types, or something else we add later? The file approach that Rainer suggested is close to a config file for a library. I thought we wanted to avoid that. Are we back to ENV vars? Does not look appealing to me. I do not think the application cares about the serialization format. Only the consumer (in this case syslog implementations, or other log destinations if we ever add them) should. So I fail to understand why you try to stay away from driving the format preference for the library.

I guess the answer to this question is already buried in some thread, but please bear with me.

Do the JSON, XML, ... representations convey the same or different information?

Same but because some of them do not pass all the types (JSON) the information is a bit tarnished.

...
(In other words: do we intend to have a single standard or multiple slightly different ones?)

Information is the same but lack of type support creates an opportunity for something to be lost in translation.

I know this has been discussed at length, but if we really see value in the types, we need to preserve them somehow in JSON. Information loss just because of different encoding is a really bad thing IMHO. You create a second-class citizen. One of two things will happen:

a) nobody uses JSON b) nobody uses types

I'd assume b) will happen.

Rainer

a) JSON b) XML

DATA -> XML: no loss, but heavy
DATA -> JSON: potential loss, but good enough

Library supports both; syslog supports both.

I strongly doubt that XML over syslog will fly.

Balabit guys: will you support that? I had no plans in rsyslog (but that can change). IMO it's way too heavy for syslog, especially if in a legacy relay chain you have some old UDP stuff...

Rainer

I don't see any need/reason for Syslogd to support more than JSON.

My goal was to have the representations be compatible for data. The only incompatibilities would be with types.

To lessen this burden, we declare a couple of necessary types and conventions. For the few types that JSON supports, utilize them. Otherwise, define specific syntaxes for the string subtypes: time, ipv4, ipv6

That way, if there is no type information in JSON, we can at least see that it is a string. Smarter applications can check to see if the string format matches that of one of the string subtypes and type cast it.

In XML, the type information could be provided via the @type attribute, inferred through the XML Schema (via JAXB or similar), or stored as a ducktyped string.
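The "check whether the string matches a subtype and cast it" idea for JSON can be sketched as follows. This is only an illustration under assumptions: the thread does not fix the string syntaxes, so the sketch assumes ISO 8601 style timestamps and the usual textual IPv4/IPv6 notations, and the function and field names are made up for the example.

```python
import ipaddress
import json
from datetime import datetime

# Candidate string subtypes, tried in order. The exact syntaxes are an
# assumption; the thread only names the subtypes time, ipv4 and ipv6.
SUBTYPES = (
    ("ipv4", ipaddress.IPv4Address),
    ("ipv6", ipaddress.IPv6Address),
    ("time", datetime.fromisoformat),
)


def infer_subtype(value):
    """Return (type_name, typed_value) for one decoded JSON value."""
    if not isinstance(value, str):
        # Types JSON carries natively (numbers, booleans) are used as-is.
        return type(value).__name__, value
    for name, parse in SUBTYPES:
        try:
            return name, parse(value)
        except ValueError:
            continue
    # No subtype matched: we at least know it is a string.
    return "string", value


record = json.loads(
    '{"src": "192.0.2.10", "dst": "2001:db8::1", '
    '"when": "2012-03-28T18:56:24+02:00", "msg": "login failed", "port": 514}'
)
for field, value in record.items():
    print(field, infer_subtype(value)[0])
```

A consumer that does not care about types can skip `infer_subtype` entirely and still read every field as a plain JSON value, which is the compatibility property argued for above.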

David Lang

4:54 p.m.

New subject: how many steps to do -

On Wed, 28 Mar 2012, William Heinbockel wrote:

...

I don't see any need/reason for Syslogd to support more than JSON.

if the transport only supports JSON, where would XML ever be involved?

David Lang

...

My goal was to have the representations be compatible for data. The only incompatibilities would be with types.

To lessen this burden, we declare a couple of necessary types and conventions. For the few types that JSON supports, utilize them. Otherwise, define specific syntaxes for the string subtypes: time, ipv4, ipv6

That way, if there is no type information in JSON, we can at least see that it is a string. Smarter applications can check to see if the string format matches that of one of the string subtypes and type cast it.

In XML, the type information could be provided via the @type attribute, inferred through the XML Schema (via JAXB or similar), or stored as a ducktyped string.

_______________________________________________
lumberjack-developers mailing list
lumberjack-developers@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/lumberjack-developers

William Heinbockel

9:46 p.m.

New subject: how many steps to do -

Let me try to briefly explain:

My vision for lumberjack is for structured logs, but goes beyond just syslog (the message format and protocol)

The first step is to provide structured logging with existing syslog api and compatible with the syslog protocol. This should be simple and minimal.

...

From there, we need to look at more advanced logging apis and protocols, hence some of the other discussions. The only way for this to work is to offer several different, interchangeable formats to encode structured log messages.

Syslog+json is just the first step in log world domination.

I just do not see the value of sending xml over syslog. And sending something like BSON in syslog messages is an abuse of the protocol and will break many implementations. However this does not preclude us from defining web service or other log interfaces to complement syslog.

On Mar 28, 2012 4:54 PM, david@lang.hm wrote:

...

On Wed, 28 Mar 2012, William Heinbockel wrote:

I don't see any need/reason for Syslogd to support more than JSON.

...
if the transport only supports JSON, where would XML ever be involved?

David Lang

...
My goal was to have the representations be compatible for data. The only incompatibilities would be with types.

To lessen this burden, we declare a couple of necessary types and conventions. For the few types that JSON supports, utilize them. Otherwise, define specific syntaxes for the string subtypes: time, ipv4, ipv6

That way, if there is no type information in JSON, we can at least see that it is a string. Smarter applications can check to see if the string format matches that of one of the string subtypes and type cast it.

In XML, the type information could be provided via the @type attribute, inferred through the XML Schema (via JAXB or similar), or stored as a ducktyped string.


David Lang

11:36 p.m.

New subject: how many steps to do -

On Wed, 28 Mar 2012, William Heinbockel wrote:

...

Let me try to briefly explain:

My vision for lumberjack is for structured logs, but goes beyond just syslog (the message format and protocol)

The first step is to provide structured logging with existing syslog api and compatible with the syslog protocol. This should be simple and minimal.

From there, we need to look at more advanced logging apis and protocols, hence some of the other discussions. The only way for this to work is to offer several different, interchangeable formats to encode structured log messages.

Syslog+json is just the first step in log world domination.

I just do not see the value of sending xml over syslog. And sending something like BSON in syslog messages is an abuse of the protocol and will break many implementations. However this does not preclude us from defining web service or other log interfaces to complement syslog.

I think you are vastly underestimating the capabilities of modern syslog daemons.

Currently they operate on text formatted data, but internally they are very powerful routing and filtering engines that operate on fairly complicated in-memory structures containing the log data.

It would not take a very large modification to have them accept or output data in a binary format. Per this e-mail discussion, nxlog already offers a binary format for its network communication. Rsyslog can already send logs to web services servers. Rsyslog supports binary log transport via the 0MQ interfaces.

I'm not saying that there isn't a place for a web services logging option, but I think you need to accept that a syslog daemon (or something that is a fairly small modification of an existing syslog daemon) is probably going to be a core part of routing logs for a long time, if not forever.

Your reaction is one of the downsides of these being called 'syslog daemons' because you assume that they are far more limited than they really are.

Just about every statement that has been made that "syslog can't do X" has resulted in either "syslog has been doing that for years", or "that would be fairly easy to add"

Just think of the syslog daemon as a highly optimized routing, filtering, and formatting engine and don't limit yourself by the traditional formats (either in message or transport protocols)

David Lang

...

On Mar 28, 2012 4:54 PM, david@lang.hm wrote:

...
On Wed, 28 Mar 2012, William Heinbockel wrote:

I don't see any need/reason for Syslogd to support more than JSON.

...
if the transport only supports JSON, where would XML ever be involved?

David Lang

...
My goal was to have the representations be compatible for data. The only incompatibilities would be with types.

To lessen this burden, we declare a couple of necessary types and conventions. For the few types that JSON supports, utilize them. Otherwise, define specific syntaxes for the string subtypes: time, ipv4, ipv6

That way, if there is no type information in JSON, we can at least see that it is a string. Smarter applications can check to see if the string format matches that of one of the string subtypes and type cast it.

In XML, the type information could be provided via the @type attribute, inferred through the XML Schema (via JAXB or similar), or stored as a ducktyped string.


David Lang

4:51 p.m.

New subject: how many steps to do -

On Wed, 28 Mar 2012, Rainer Gerhards wrote:

...

...
-----Original Message----- From: Dmitri Pal [mailto:dpal@redhat.com]

a) JSON b) XML

DATA -> XML: no loss, but heavy
DATA -> JSON: potential loss, but good enough

Library supports both; syslog supports both.

I strongly doubt that XML over syslog will fly.

Balabit guys: will you support that? I had no plans in rsyslog (but that can change). IMO it's way too heavy for syslog, especially if in a legacy relay chain you have some old UDP stuff...

I would not think that XML over syslog would be the common case, but I can sure see situations where some people may want to use it.

My expectation is that support on the syslog side would go something like this.

1. JSON/XML transport

available today, it doesn't parse the message, just transports it. If the message is too long for the transport, it will get truncated (which can be a serious problem for a verbose serialization like XML)

2. full JSON support

the syslog daemons would be able to understand the JSON fields to make delivery decisions

the syslog daemons would be able to accept JSON and output traditional logs

the syslog daemons would be able to accept traditional logs and output JSON

3. full XML support

same as JSON support, but with XML input and output (with the twist of being able to convert between JSON and XML)

4. binary logging format (BSON??)

the syslog daemons would be able to accept a structured binary format as input and be able to output the various supported formats. It's up to the particular syslog daemon to decide if they are going to have a binary transport or not (nxlog does, currently rsyslog and syslog-ng do not)

in RFC speak, I would say that the ability to output in JSON is REQUIRED, XML is RECOMMENDED, BSON is SUGGESTED
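As an illustration of step 2 (accept traditional logs and output JSON, and the reverse), here is a small Python sketch. It assumes the classic BSD-style line `<PRI>timestamp host tag[pid]: message`, where PRI is facility * 8 + severity. The JSON field names are hypothetical, not something lumberjack or CEE has defined, and a real daemon would handle many more input variations.

```python
import json
import re

# Classic BSD-style syslog line; anything that does not match is kept
# verbatim in "msg" rather than being dropped.
LINE = re.compile(
    r"<(?P<pri>\d{1,3})>"
    r"(?P<time>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<msg>.*)"
)


def traditional_to_json(line: str) -> str:
    m = LINE.fullmatch(line)
    if m is None:
        return json.dumps({"msg": line})
    pri = int(m["pri"])
    fields = {
        "facility": pri // 8,   # PRI = facility * 8 + severity
        "severity": pri % 8,
        "time": m["time"],
        "host": m["host"],
        "program": m["program"],
        "msg": m["msg"],
    }
    if m["pid"] is not None:
        fields["pid"] = int(m["pid"])
    return json.dumps(fields)


def json_to_traditional(text: str) -> str:
    f = json.loads(text)
    pri = f["facility"] * 8 + f["severity"]
    pid = "[%d]" % f["pid"] if "pid" in f else ""
    return "<%d>%s %s %s%s: %s" % (
        pri, f["time"], f["host"], f["program"], pid, f["msg"])


line = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
as_json = traditional_to_json(line)
print(as_json)
print(json_to_traditional(as_json))
```

Once the fields are available as JSON, routing on them (the first bullet of step 2) is a matter of inspecting the decoded dictionary instead of pattern-matching the message text.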

David Lang

Balazs Scheidler

29 Mar

5:43 p.m.

New subject: how many steps to do -

----- Original message -----

...

On Wed, 28 Mar 2012, Rainer Gerhards wrote:

...
...
-----Original Message----- From: Dmitri Pal [mailto:dpal@redhat.com]

a) JSON b) XML

DATA -> XML: no loss, but heavy
DATA -> JSON: potential loss, but good enough

Library supports both; syslog supports both.

I strongly doubt that XML over syslog will fly.

Balabit guys: will you support that? I had no plans in rsyslog (but that can change). IMO it's way too heavy for syslog, especially if in a legacy relay chain you have some old UDP stuff...

I would not think that XML over syslog would be the common case, but I can sure see situations where some people may want to use it.

My expectation is that support on the syslog side would go something like this.

1. JSON/XML transport

available today, it doesn't parse the message, just transports it. If the message is too long for the transport, it will get truncated (which can be a serious problem for a verbose serialization like XML)

2. full JSON support

the syslog daemons would be able to understand the JSON fields to make delivery decisions

the syslog daemons would be able to accept JSON and output traditional logs

the syslog daemons would be able to accept traditional logs and output JSON

fully agree. syslogd has to be smart enough to do both conversions.

...

3. full XML support

same as JSON support, but with XML input and output (with the twist of being able to convert between JSON and XML)

4. binary logging format (BSON??)

the syslog daemons would be able to accept a structured binary format as input and be able to output the various supported formats. It's up to the particular syslog daemon to decide if they are going to have a binary transport or not (nxlog does, currently rsyslog and syslog-ng do not)

syslog-ng has a binary serialization format, but that's not used over the network.

...

in RFC speak, I would say that the ability to output in JSON is REQUIRED, XML is RECOMMENDED, BSON is SUGGESTED

I'm not sure we need xml/bson though, but I guess time will tell.

David Lang

28 Mar

4:53 p.m.

New subject: how many steps to do -

On Wed, 28 Mar 2012, Dmitri Pal wrote:

...

a) JSON b) XML

DATA -> XML: no loss, but heavy
DATA -> JSON: potential loss, but good enough

Library supports both; syslog supports both. Which should be used? Who defines it: the application developer, library developer, application admin, system admin, or syslog developer?

IMO it is the application or system admin's responsibility to configure an instance of syslog to suit the needs of the application in the way he needs. So the main focus of his configuration will be syslog. To avoid double effort, and to avoid requiring an environment variable just to make the library match the syslog configuration, the syslog configuration should be detectable by the library. This is what I am arguing for.

I agree with this. The question is how to make this detectable.

David Lang

4:59 p.m.

New subject: how many steps to do -

On Wed, 28 Mar 2012, Rainer Gerhards wrote:

...

...
Information is the same but lack of type support creates an opportunity for something to be lost in translation.

I know this has been discussed at length, but if we really see value in the types, we need to preserve them somehow in JSON. Information loss just because of different encoding is a really bad thing IMHO. You create a second-class citizen. One of two things will happen:

a) nobody uses JSON b) nobody uses types

I'd assume b) will happen.

The problem is that it's not always going to be possible to retrofit types into the existing protocols.

So we either need to completely abandon types, or we need to accept that in some cases type data will be lost.

I primarily see types as having two advantages.

1. input validation by the logging library

2. transport optimization (binary serialization formats, compression, etc)

for #1, if the type data is lost in transport, no big deal

for #2 it's an efficiency thing, not something fundamental.

Note that in addition to this, the cee specs may specify that specific fields in specific messages are a given type. If this is defined, then the type information doesn't need to be passed along either, it's like an XML schema 'known' to both source and endpoint.
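A rough sketch of that last point, in Python: if both ends know which type each field has, the library can validate on input (advantage #1) and the receiver can restore the types from an untyped JSON transport. The field names and the `SCHEMA` table are hypothetical stand-ins for whatever a CEE-style specification would define.

```python
import ipaddress
import json

# Hypothetical shared schema, "known" to both source and endpoint.
SCHEMA = {
    "src_ip": ipaddress.ip_address,
    "port": int,
    "user": str,
}


def emit(fields: dict) -> str:
    """Producer side: validate against the schema, then send plain strings."""
    for name, value in fields.items():
        if name not in SCHEMA:
            raise KeyError("field not in schema: " + name)
        SCHEMA[name](value)  # raises ValueError on malformed input
    return json.dumps({name: str(value) for name, value in fields.items()})


def receive(text: str) -> dict:
    """Consumer side: no type data on the wire, the schema restores it."""
    return {name: SCHEMA[name](value) for name, value in json.loads(text).items()}


wire = emit({"src_ip": "192.0.2.7", "port": 514, "user": "root"})
print(wire)
print(receive(wire))
```

Nothing type-related travels in the message itself, so the approach works over any transport that can carry JSON text; the cost is that both sides must agree on the schema out of band.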

David Lang

lumberjack-developers@lists.fedorahosted.org

participants (7)

Balazs Scheidler
Botond Botyanszki
Brian Knox
David Lang
Dmitri Pal
Rainer Gerhards
William Heinbockel