~rumpelsepp/homepage

ref: 2a0829d7d5c82198e590e856efc864471be97a18 homepage/_specs/penlog.7.adoc -rw-r--r-- 8.1 KiB
2a0829d7Stefan Tatschner add degoogle | A huge list of alternatives to Google products. Privacy tips, tricks, and links. 26 days ago
                                                                                
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
= penlog(7)
:doctype:    manpage
:man source: penlog

== Name

penlog - An abstract and machine readable logging format

== Specification

The PENLog logging format is intended to be used as a generic and reusable data format for measurement data.
The penlog format specifies an abstract data format consisting of various fields with data and metadata.
The abstract penlog format can be mapped to multiple output formats, for instance `json`, or `hr`, …
All available output formats are explained below.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

=== Abstract Logging Format

The penlog structured logging format consists of the following fields.
Unset fields which are considered optional MUST be absent.

`component` (string, OPTIONAL)::
    The component, e.g. software module, which has issued the log message.
    In absence, an implementation SHOULD pull the content of the environment variable `PENLOG_COMPONENT` and MUST set it to `root` as a fallback.

`data` (string, REQUIRED)::
    The log message as an UTF-8 string.

`host` (string, OPTIONAL)::
    The hostname of the machine who generated the messages.
    This field is OPTIONAL, since it is missing in the human readable format.
    It is RECOMMENDED that implementations include this field, as it increases reproducability of logging data.

`id` (string, OPTIONAL)::
    A unique message identifier.

`line` (string, OPTIONAL)::
    Information about the file and line number where this log entry was generated.
    The information MUST be in the form `filename:number`.
    `filename` can be an absolute or relative path, or a filename.

`priority` (int, OPTIONAL)::
    This field can be used to optionally set the priority.
    For priorities, the syslog priorities are used as defined by RFC5424.
    Implementations can indicate priorities by e.g. a separate color.

`stacktrace` (string, OPTIONAL)::
    Implementations can optionally include a stacktrace.
    This could be useful for debugging if fatal errors occur.
    Stacktraces are very specific to the used programming language, e.g. python or go.
    Thus, this field is just an unstructured string.

`tags` (list[string], OPTIONAL)::
    To each log entry a custom list of tags can be applied.
    For instance: `["autogenerated", "pre-test", "post-test", …]`.
    Tags MAY be key value pairs, separated by `=`.

`timestamp` (string, REQUIRED)::
    ISO8601 string of the current date.

`type` (string, REQUIRED)::
    The type field is a free field which can be used to assign a particular message type.

Custom fields can be added freely, in other words, additional custom fields are OPTIONAL.
Their post-processing and tooling around these custom fields is up to the developer and MUST be ignored by generic converters.

=== JSON Format (json)

A penlog log file stored on disk is typically stored in the `json` output format. 
The tool `hr(1)` is intended to be used -- similar to `cat(1)` -- for viewing penlog data in the `json` output format.
If encoding of a log message fails, the component MUST be set to `JSON`, the type to `ERROR`, and the error message MUST be included in `data`.

The `json` format consists of a verbatim sequence of the described JSON objects.
Each JSON object MUST be present at one line, separated by `\n` (ascii 0x0a).
In order to keep decoding simple and line based, no JSON arrays or virtual, endless JSON structures are employed.

=== JSON pretty (json-pretty)

The `json` format forces every JSON object to appear in a single line.
The `json-pretty` format provides an indented, more readable json form for debugging purposes.
The actual content of `json` and `json-pretty` is the same.
It is adviced to use `json` for data processing pipelines due to less overhead.

=== Human Readable Format (hr)

The syntax of the human readable format looks like the following.
Curly braces indicate a field from the JSON format.
If a field is empty it expands to an zero length string; if `id`, `line`, `tags`, or `stacktrace` are not availabe, the whole line is omitted.
A verbatim curly brace brace is expressed with two ones: `{{` means `{`.

    {timestamp} {{{component}}} [{type}]: {prio-prefix} {data}
       -> id  : {id}
       -> line: {line}
       -> tags: {tags}
       -> stacktrace:
       | {stacktrace}

`timestamp`::
    The RECOMMENDED timestamp format is Go's `StampMilli` format as defined to `Jan _2 15:04:05.000`.

`component`::
`type`::
    The `component` and `type` fields MUST be padded or truncated that the colons, `:`, in every single line are perfectly aligned.

`data`::
    The actual log message.
    It MAY be truncated to fit in the current terminal size.
    When it is truncated an ellipsis character (`…`) MUST be appended to indicate the truncation for the user,

`id`::
    The optional unique message identifier.

`line`::
    The optional filename and line number where this log entry origins from.

`stacktrace`::
    The optional stacktrace where this log entry origins from.

`prio-prefix`::
    An optional priority prefix.
    It is RECOMMENDED to indicate message priorities via colors.
    If colors are not available it MAY be desireable to indicate the priority via a short prefix.
    The prefixes are enclosed by brackets `[` and `]`: `E` `A`, `C`, `e`, `w`, `n`, `i`, `d`.
    These letters stand for: emergency, alert, critical, error, warning, notice, info, debug.

`tags`::
    The optional tags as comma separated values.


=== Tiny Human Readable Format (hr-tiny)

The `hr-tiny` format is the same as `hr` except that `component` and `type` are omitted.

    Apr  2 12:48:08.906: Starting tshark with
    Apr  2 12:48:09.583: Doing stuff

=== Example

    Apr  2 12:48:08.906 {scanner } [message]: Starting tshark with
    Apr  2 12:48:09.583 {moncay  } [message]: Doing stuff

If a JSON line cannot be decoded, the faulty text MUST be included in messages of type `ERROR` and component `JSON`:

    $ python -c "import foo" 2>&1 | hr
    Jun 16 08:19:01.305 {JSON    } [ERROR   ]: Traceback (most recent call last):
    Jun 16 08:19:01.305 {JSON    } [ERROR   ]:   File "<string>", line 1, in <module>
    Jun 16 08:19:01.305 {JSON    } [ERROR   ]: ModuleNotFoundError: No module named 'foo'

== Environment Variables

The following environment variables MAY be understood by penlog implementations.
The supported datatypes are `string` and `bool`.
A `bool` is a special string consisting of either `t, T, true, TRUE, 1` or `f, F, false, FALSE, 0`.

`PENLOG_COMPONENT` (string)::
    If no component is set, the `component` field MAY be set via the `PENLOG_COMPONENT` variable at the scope of an operating system process.

`PENLOG_CAPTURE_LINES` (bool)::
    If this environment variable is set, implementations SHOULD emit filenames with line numbers via the `line` field.

`PENLOG_CAPTURE_STACKTRACES` (bool)::
    If this environment variable is set, implementations SHOULD provide stacktraces via the `stacktrace` field.

`PENLOG_OUTPUT` (string)::
    A switch for implementations to choose from several output forms.
    Available are: `hr`, `hr-tiny`, `json`, `json-pretty`, `systemd`.

`PENLOG_LOGLEVEL` (string)::
    In order to limit the emitted logging messages, loglevels MAY be supported.
    If a library supports filtering based on loglevels, it MUST check this environment variable.
    The supported values are `critical`, `error`, `warning`, `notice`, `info`, `debug`.
    The default MUST be `debug`.
    A message MUST omitted if its `priority` field contains a value greater than `PENLOG_LOGLEVEL`.
    A mapping between these strings and integer values is availabe in RFC5424.

== See Also

hr(1), penlog-best-practice(7)

== Bugs

This project is maintained on Github: https://github.com/Fraunhofer-AISEC/penlog.

== Authors

Current maintainers are:

* Stefan Tatschner <stefan@rumpelsepp.org>
* Tobias Specht

== License

This document is published under the Apache-2.0 license.
The license of the code can be obtained from the Git repository.