Introduction
DataSet bills on ingested log volume, so it's important for customers to understand how that volume is calculated. The items below are intended to help you streamline the logs that your applications and platform emit, and to optimize the Agent (or other) upload mechanisms.
Counted Toward Log Volume
- Raw log messages are counted.
- Attribute values that originate in the Agent or at the DataSet API are counted.
- Attribute name lengths aren't counted, but each attribute costs 1 byte plus the length of its value.
In theory, a log event with no attributes would cost 1 byte.
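The rules above can be sketched as a rough estimator. This is only one reading of the list, not DataSet's actual accounting: it assumes the raw message is charged like any other attribute (1 byte plus its value length), that lengths are measured in UTF-8 bytes, and that non-string values are measured by their string form. The function name `estimate_event_bytes` is hypothetical.

```python
from typing import Mapping, Optional


def estimate_event_bytes(message: str,
                         attributes: Optional[Mapping[str, object]] = None) -> int:
    """Roughly estimate how many bytes one event adds to log volume.

    Assumptions (not confirmed by DataSet): the message is charged as
    1 byte + its length, and values are measured as UTF-8 bytes.
    """
    def value_len(value: object) -> int:
        return len(str(value).encode("utf-8"))

    # The raw message: 1 byte of overhead plus its length.
    total = 1 + value_len(message)

    # Each additional attribute: 1 byte plus the value length.
    # Attribute names are free, so keys are ignored.
    for value in (attributes or {}).values():
        total += 1 + value_len(value)
    return total


print(estimate_event_bytes(""))                                        # 1
print(estimate_event_bytes("hello", {"status": "200", "path": "/a"}))  # 13
```

Under these assumptions, renaming an attribute to something shorter saves nothing, while trimming long attribute values or dropping unneeded attributes reduces the estimate directly.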
Not Counted
- Metalog events are not counted.
- `sessionInfo`/`serverInfo` fields and server attributes (`app`, `launchTime`, `severity`, `session`, `machine`, `serverType`, `serverScope`, `sessionType`, `serverHost`) are not counted.
- The `logfile` attribute, as well as internal K8s attributes (`containerName`, `containerId`, `pod_name`, `pod_namespace`, `namespace`, `pod_uid`, `k8s_container_name`, `original_file`, `scalyr-category`, `k8s_node`, `container_id`), are not counted.
- There is no separate charge for parsed attributes from the `message` attribute. You are only billed once for the complete log event.
  - Note: Parsers can only be applied to the original log event, which is contained in the `message` attribute.
  - Hence, attributes that are created by the parser are not counted.
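To see which attributes of an event would still count after these exclusions, a small filter like the following may help. It is a sketch only: the attribute names are copied from the list above, while the function `counted_attributes` and its `parsed_names` parameter are hypothetical, and you have to supply yourself the names of attributes your parsers create.

```python
# Names taken from the "Not Counted" list; adjust if your setup differs.
SERVER_ATTRIBUTES = {
    "app", "launchTime", "severity", "session", "machine",
    "serverType", "serverScope", "sessionType", "serverHost",
}
K8S_AND_FILE_ATTRIBUTES = {
    "logfile", "containerName", "containerId", "pod_name", "pod_namespace",
    "namespace", "pod_uid", "k8s_container_name", "original_file",
    "scalyr-category", "k8s_node", "container_id",
}
UNCOUNTED = SERVER_ATTRIBUTES | K8S_AND_FILE_ATTRIBUTES


def counted_attributes(attributes, parsed_names=()):
    """Return only the attributes that would count toward log volume.

    parsed_names: names of attributes created by a parser from `message`
    (hypothetical input; these are not billed separately).
    """
    skip = UNCOUNTED | set(parsed_names)
    return {name: value for name, value in attributes.items()
            if name not in skip}


event = {
    "message": "GET /health 200",
    "serverHost": "web-1",    # server attribute: not counted
    "pod_name": "api-7f9c",   # internal K8s attribute: not counted
    "status": "200",          # created by a parser: not counted
    "team": "payments",       # custom attribute: counted
}
print(counted_attributes(event, parsed_names={"status"}))
# {'message': 'GET /health 200', 'team': 'payments'}
```

In this example only the raw message and the custom `team` attribute remain, which suggests that custom attributes set at the Agent or API are the main lever for reducing volume beyond the log lines themselves.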
Comments
Hi Mark!
A bit unclear on attribute costs.
Was this meant to be "each attribute costs at least 1 byte"?
Cool, so if from a log event that looks like this:
[29/Apr/2022:13:49:15 +0000] "POST /auth/api/15/feature-flags/context HTTP/1.1" 200 200 "-"
We parse out the following part into the attribute called `msg`
"POST /auth/api/15/feature-flags/context HTTP/1.1" 200
We will not pay extra for `msg`? (apart maybe for the `index:length` that you will have to store for each event to seek into the raw log to get the `msg` part)
And yeah another question was meant to be - do I understand correctly that the way you're storing attribute values is basically `index:length` pairs with which you can find the attribute's value in the raw message, and you only retrieve that value when needed and never store it separately? Do you then also have to index the attribute for search? I'd imagine indexing taking some space as well - if it does can we opt out of indexing for some attributes?