Introduction
This article covers several methods that can be used to evaluate PowerQuery results with substrings of the log timestamp
. Although PowerQueries display timestamps in a string format (ex. "June 3 2:28:00.000 pm"), timestamp
is a reserved field that is stored as a number (Unix epoch, nanosecond). Please note that at this time, it's not possible to directly extract time intervals (ex. strftime) as strings from the timestamp
field.
Use the timebucket
function
The timebucket
function converts log timestamps into more generic intervals, such as day, hour, minute, etc. This can also streamline queries by reducing the quantity of log events which are returned. Here is an example query that groups the count of total access log requests and errors by minutes.
dataset = "accesslog"
| group requests = count(), errors = count(status == 404) by timestamp = timebucket("1m")
More information on the timebucket
function can be found here (scroll down to "Time Functions")
Perform string manipulation on the message
field
Although the timebucket
function is straightforward, it has some drawbacks.
For example, timebucket
won't work when searching for an extended time period (> 500 timebuckets). If you anticipate that the query will be applied to extended time periods, or a specific part of the (log based) timestamp string (ex. date) is needed, the parse
function could be used.
The message
field is a reserved field that contains the entire original log event that was sent to DataSet.
The following example demonstrates how to extract the date (ex. "Jun/1") from the message
field with the parse
function and a regex:
$dataset = "accesslog"
| parse ".* \\[$date$\/[\\d:]+ \\+0000\\] .*" from message
| group requests = count(), errors = count(status == 404) by date
| filter date != null
Use a parser!
Although slightly beyond the original scope of PowerQueries, a parser can extract the month/day substring from these log events. Using a parser has the following additional benefits:
- Improved PowerQuery performance, as extraction has already been performed
- Attribute only has to be extracted once and can be used within other PowerQueries, searches, alerts, graphs, and dashboards
- Regular expression syntax is usually identical (I chose a different one in this example to illustrate the use of
patterns
)
Given the following log events:
231.202.158.42 - - [17/Nov/2022:23:48:18 +0000] "POST /reportAbuse?source=image&account=voldermort@hogwarts.edu HTTP 1.1" 200 511 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" 105
33.201.177.40 - - [17/Nov/2022:23:48:19 +0000] "POST /reportAbuse?source=review&account=overshare@facebook.com HTTP 1.1" 200 511 "" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" 154
33.201.177.40 - - [17/Nov/2022:23:48:19 +0000] "POST /reportAbuse?source=image&account=evil@spammers.org HTTP 1.1" 200 511 "" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.94 Safari/537.36" 170
192.186.196.230 - - [17/Nov/2022:23:48:19 +0000] "POST /reportAbuse?source=image&account=anonymous_email@foobar.com HTTP 1.1" 200 511 "" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" 144
We can use the following parser to retrieve the timestamp and month/date substring:
{
patterns: {
tsPattern: "\\d{2}\/\\w{3}\/\\d{4}:\\d{2}:\\d{2}:\\d{2} [\\-\\+]\\d{4}",
monthDayPattern: "\\d{2}\/\\w{3}"
},
formats: [
{
// get timestamp
format: ".* \\[$timestamp=tsPattern$\\]"
},
{
// get month/day substring
format: ".* \\[$monthDay=monthDayPattern$\/"
}
]
}
Results
231.202.158.42 - - [17/Nov/2022:23:48:18 +0000] "POST /reportAbuse?source=image&account=voldermort@hogwarts.edu HTTP 1.1" 200 511 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" 105
message: 231.202.158.42 - - [17/Nov/2022:23:48:18 +0000] "POST /reportAbuse?source=image&account=voldermort@hogwarts.edu HTTP 1.1" 200 511 "" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" 105
monthDay: 17/Nov
timestamp: 17/Nov/2022:23:48:18 +0000 (parsed as: Thu Nov 17, 2022 11:48:18 PM GMT, i.e. 10 minutes ago)
33.201.177.40 - - [17/Nov/2022:23:48:19 +0000] "POST /reportAbuse?source=review&account=overshare@facebook.com HTTP 1.1" 200 511 "" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" 154
message: 33.201.177.40 - - [17/Nov/2022:23:48:19 +0000] "POST /reportAbuse?source=review&account=overshare@facebook.com HTTP 1.1" 200 511 "" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" 154
monthDay: 17/Nov
timestamp: 17/Nov/2022:23:48:19 +0000 (parsed as: Thu Nov 17, 2022 11:48:19 PM GMT, i.e. 10 minutes ago)
33.201.177.40 - - [17/Nov/2022:23:48:19 +0000] "POST /reportAbuse?source=image&account=evil@spammers.org HTTP 1.1" 200 511 "" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.94 Safari/537.36" 170
message: 33.201.177.40 - - [17/Nov/2022:23:48:19 +0000] "POST /reportAbuse?source=image&account=evil@spammers.org HTTP 1.1" 200 511 "" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.94 Safari/537.36" 170
monthDay: 17/Nov
timestamp: 17/Nov/2022:23:48:19 +0000 (parsed as: Thu Nov 17, 2022 11:48:19 PM GMT, i.e. 10 minutes ago)
192.186.196.230 - - [17/Nov/2022:23:48:19 +0000] "POST /reportAbuse?source=image&account=anonymous_email@foobar.com HTTP 1.1" 200 511 "" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" 144
message: 192.186.196.230 - - [17/Nov/2022:23:48:19 +0000] "POST /reportAbuse?source=image&account=anonymous_email@foobar.com HTTP 1.1" 200 511 "" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36" 144
monthDay: 17/Nov
timestamp: 17/Nov/2022:23:48:19 +0000 (parsed as: Thu Nov 17, 2022 11:48:19 PM GMT, i.e. 10 minutes ago)
Comments
0 comments
Please sign in to leave a comment.