Introduction
The Scalyr Agent can be configured with redaction rules, which enable logs to be modified with regular expressions prior to being uploaded to DataSet. This greatly simplifies operations like:
- Removing or replacing security tokens, passwords, etc.
- Restructuring complex logs for parsing on ingestion
- Reducing log volume by removing unneeded / unwanted log segments
Redaction rules have the advantage of being applied by the Scalyr Agent before log data is uploaded. Since they are configured with regular expressions at this level, they can modify log segments with pinpoint accuracy.
Since redaction rules are handled within the Python-based Scalyr Agent, Python's regular expression syntax applies. Furthermore, differences in the re library between Python v2.6, v2.7, and v3.5+ may result in processing variations. Consequently, we strongly recommend testing any updates made to your Scalyr Agent's redaction rules before deploying the configuration to a live environment.
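One low-cost way to act on this recommendation is to compile each pattern with the same Python version that runs the agent. The sketch below is an illustration only and is not part of the agent: the check_patterns helper and the sample patterns are hypothetical, and it relies on nothing beyond the standard re module.

```python
import re
import sys


def check_patterns(patterns):
    """Return (pattern, error) pairs; error is None when the pattern compiles."""
    results = []
    for pattern in patterns:
        try:
            re.compile(pattern)
            results.append((pattern, None))
        except re.error as exc:
            results.append((pattern, str(exc)))
    return results


if __name__ == "__main__":
    # Hypothetical patterns for illustration. Substitute your own
    # match_expression values, written without the JSON escaping.
    samples = [r'"password":"[^"]*"', r'("token":"[^"]*"']
    print("Python", sys.version.split()[0])
    for pattern, error in check_patterns(samples):
        status = "OK " if error is None else "BAD"
        print(status, pattern, error or "")
```

The second sample pattern has an unbalanced parenthesis, so it is reported as bad. A pattern that compiles here can still behave differently on another Python version, so this check complements testing with a real agent rather than replacing it.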
More information on redaction rules can be found here: https://app.scalyr.com/help/scalyr-agent#redaction
Alternatives
The following methods can be employed to remove / obscure sections of log events once they have been ingested by DataSet:
- Parsers - Log sections can be modified via the rewrites statement. Impacted log lines will not count against your log volume.
- Scrubbing via "Log Processing" - Applied to raw log lines before they are parsed.
If you want to remove the sections altogether, you can use:
- Cost Management - The "Discard" feature will remove unwanted log lines, and you will not be billed for them.
Implementation
Redaction rules are configured in the Scalyr Agent configuration file (agent.json) on a per-log basis. For example, to remove the accessKeyId and sessionToken parameters from the following log example:
{"awsRegion":"us-west-1","eventID":"b84ed668-0e65-4fd4-97a3-ee066fc878ea","eventName":"AssumeRole","eventSource":"sts.amazonaws.com","eventTime":"2020-09-09T17:57:45Z","eventType":"AwsApiCall","eventVersion":"1.05","recipientAccountId":"526942720160","requestID":"b8413714-d95f-42c6-8af4-ec8747d43aec","requestParameters":{"durationSeconds":1800,"roleArn":"arn:aws:iam::526942720160:role\/aws-service-role\/autoscaling.amazonaws.com\/AWSServiceRoleForAutoScaling","roleSessionName":"AutoScaling"},"resources":[{"ARN":"arn:aws:iam::526942720160:role\/aws-service-role\/autoscaling.amazonaws.com\/AWSServiceRoleForAutoScaling","accountId":"526942720160","type":"AWS::IAM::Role"}],"responseElements":{"assumedRoleUser":{"arn":"arn:aws:sts::526942720160:assumed-role\/AWSServiceRoleForAutoScaling\/AutoScaling","assumedRoleId":"AROAI7CCJHZPKCOC77O7U:AutoScaling"},"credentials":{"accessKeyId":"ASIAXVMB32CQOYMWYHX5","expiration":"Sep 9, 2020 6:27:45 PM","sessionToken":"IQoJb3JpZ2luX2VjEEIaCXVzLXdlc3QtMSJHMEUCIQD1QM9fjBYSkIeqopNF7prvkyhevWgwQN86w3bvNKWu4QIgZ39ECV1ehc4tGqWOhm0X2m73kj7V5FGJ7tiWsu0gj98q2AIISxABGgw1MjY5NDI3MjAxNjAiDOvWX\/vjyD9lztBVyCq1Ajn\/XjUY87hkGZwCLsb\/zF7z\/ws7uk6WIzMIpPcEaXgrSh6EalelmCcxsKQ3Ssn1uLkNTIk9lsRezlKVB4LvtXSTBJXEy9rQGjLsNIiruYkQNj7UGuqqpOu4xiiYNiwrkYmaz+1pf+wRncYte99Yno8mrhRKOwYERK8VTsy13oD+EU9JM7Ie\/UYhzx6YZiKfRSn5PmILWWxtAxqp7l7l3WV4zKQRZB5GmOro51MnDOF2pT4T3Rd+NTnqFbUUiout09gcyyUPBZJkixAdMQNufdrvovBqUUiwuaN4DzRPuUaxpo8ustYO85\/gDsscu5B4Q7L4mUSpee0\/cAUf4qQkiWJQl\/ZusFpW7O\/y5yQYzDCmDFNWmes6P9JJ9N9D2qtaMBckvjneY6t07fUwfTSEuIk1CdPf8DCZr+T6BTq\/AVM\/j08P+Q3eTBXYVisvCuek08PdeD65uj5cFr7UOl3KN1Hby6xX3JRpaSlxCz3m9pLIHlC8WPoDlBGbVyRStvE\/EY0D28WyO0QRyM3xpswy+BPxIPaYc17zD+H3QVmUJcJspv+1xf6jw0HV28XZB4ADxCPaff2Guk1iaAELHHV1rsc+Xn88VYQ7trF9KW65ufyqYVMl3CA\/4bbYNwUeBmqG3\/m4o9ApjPxRK09gCZUG78h8nNIn\/YSndaRjrn1W"}},"sharedEventID":"89d33e12-8796-44b9-a3d3-12f6a519d590","sourceIPAddress":"autoscaling.amazonaws.com","userAgent":"autoscaling.amazonaws.com","userIdentity"
:{"invokedBy":"autoscaling.amazonaws.com","type":"AWSService"}}
{"awsRegion":"us-west-1","eventID":"6c2a3d51-d5fc-4589-ad2d-5ee150f0f386","eventName":"AssumeRole","eventSource":"sts.amazonaws.com","eventTime":"2020-09-09T18:00:45Z","eventType":"AwsApiCall","eventVersion":"1.05","recipientAccountId":"526942720160","requestID":"Lux-88defdf6-478b-42a3-a2b5-67734c8518a4","requestParameters":{"roleArn":"arn:aws:iam::526942720160:role\/ecsInstanceRole","roleSessionName":"i-007a61d4d8d1333b5"},"resources":[{"ARN":"arn:aws:iam::526942720160:role\/ecsInstanceRole","accountId":"526942720160","type":"AWS::IAM::Role"}],"responseElements":{"credentials":{"accessKeyId":"ASIAXVMB32CQPV55HQF2","expiration":"Sep 10, 2020, 12:05:03 AM","sessionToken":"IQoJb3JpZ2luX2VjEEIaCXVzLXdlc3QtMSJIMEYCIQCabDWJvVAdSRGuDm\/nmJMR9Idf+UyqmGJSj2IKht\/eBgIhALL3QzBrtnShtmzkT+nIDdoozYNaXnk7AA+hwQynOtkwKrQDCEsQARoMNTI2OTQyNzIwMTYwIgwvSRCAp4d7PhtLAhkqkQO23\/2Sf5YiOdgmgQOANxRpsk6rfEtpHIuH7gM6otD2LJZPsKeDGJtnJJ30uZczWgGtC1Z7JNHtyn6psRlNU9UJ2Apy3C1gsjH9557hScDpLu2bTLyadooCORUeJr0H96ueaokbAr8eKeFEGc\/rpqA3lx8bT1IxBiDExnxLqoF8X6qsF\/JT5I18ImlHhDHoyzIX8PbYOWIWr9osN0c6fIaOt50TUBtkuPQD69ys55PhSgu7\/56cUWf6D+qNJ0IFE6W7UA\/+0XJwa1u5s1dOly2\/nXa\/GkegdYugTCDp6aZGej49Xjbt0gh+aqXW1aVlmN89JVvXarpTwWwquPumNLUOnR8OCaaKZXCmkhQFFAzCtATGulKahdQM87k9raSVS3s7FJxuOw2bY1iWkPg6EZBccWtJhECEEH9X7l6LKKNey4ge6gH9S7OjSivgepqMPyB5D7z1ChXWMZ8p+ePMHL7MvDRqz3rJzO\/8\/gnI+c2lrrEkNzTgq2WnBy4rsW1ng5rtbmuw7U2IZ2+4j2UZtzPIGzDNsOT6BTrqAZjhDaghTPSD8y3Dqy\/b1jutFYJT+fmyPHI1aoG0rXN0pseleTzdxhK8YZmzVpIhXct9km6j2YmFXr0LAGPf3R5NDIvkojCDbgzCW5+08DuOAHYJT+o4NQDILEwHqh9xKS5GbFFu1LSE2hwzka6VGu393DBi51HVCVMX64XTBlQoF7ryUx8RFz+jlMCoGyEx3l1VqpXstlDv2F50OPdbxTlsIoj3CpoXoxsDQzc\/c\/h2aO4jTnVADya1wH5pYmCufhm7LoNLdyQy03jtQ4XB3glr\/xrPCYB6Yt\/ckqqzQ7qyOa2FMZ6oGCejoQ=="}},"sharedEventID":"fde465cc-c953-40ec-99e7-93499de78f39","sourceIPAddress":"ec2.amazonaws.com","userAgent":"ec2.amazonaws.com","userIdentity":{"invokedBy":"ec2.amazonaws.com","type":"AWSService"}}
{"awsRegion":"us-west-1","eventID":"e95778bf-0c43-49a5-9208-703d80555fed","eventName":"AssumeRole","eventSource":"sts.amazonaws.com","eventTime":"2020-09-09T18:00:45Z","eventType":"AwsApiCall","eventVersion":"1.05","recipientAccountId":"526942720160","requestID":"Lux-88defdf6-478b-42a3-a2b5-67734c8518a4","requestParameters":{"roleArn":"arn:aws:iam::526942720160:role\/ecsInstanceRole","roleSessionName":"i-007a61d4d8d1333b5"},"resources":[{"ARN":"arn:aws:iam::526942720160:role\/ecsInstanceRole","accountId":"526942720160","type":"AWS::IAM::Role"}],"responseElements":{"credentials":{"accessKeyId":"ASIAXVMB32CQCQGWGPUG","expiration":"Sep 10, 2020, 12:12:02 AM","sessionToken":"IQoJb3JpZ2luX2VjEEIaCXVzLXdlc3QtMSJHMEUCIQDbdtX6HOUUXmSmFJoErEUWKWg53LarQBf0WtM7gwz\/+gIgH0na+NjxnVKyZjidJ75FDAH7jQ+\/s5zFGGvTdAuXVk0qtAMISxABGgw1MjY5NDI3MjAxNjAiDFOoPEX5UD78TOj9tSqRAxqjbhrlB+QCgbq7gPat5F9oqHM+V7pNQwSw+dwQfSjaOgmTVYRQnGl4CmN6w5NEhBGQIr0m7llKt9+44upg\/C2JjwcyxJCIZlCrIHVCp9UiO6PB4v+qyiq37PR8eCXtg8Foy+83Tyojdef0bvHDG3sCdbhI7ADN\/Nyhp65\/9GiapOR4swNlokw6WH6iywJewUfxDLY1y5ZegYJMLXU29Y9f1s3pJoMiYM5eupmxb2hQ53+l\/sF16AqXPFuMhbeCFaTI8KA5cUUhSmiRj\/OvPqK91bihCuvrOBMrR6cXGITxhDoUc0FNeIH9Tcxz03OG7xFdKtjpVB9Njzik1HbqI0nW73LFp5Luh4amuF6yZn5zJXQMZ0TXKnzWjrtsP1OnB+byk4Of\/OoAV65RpccesaPVq2MTDbzylUGcutmheORUgyeOaoDQNSVWoIfB9rs1zrNAdgdArfxrkXMq34AFTE9PF2vmphC8wg1OtIxlqLmrRe9ooYnWMzjvJ2bf1e0PJHLPVrLb0K1qqMzGWi530VfPMM2w5PoFOusBxbhCY8lU3wlVg4nyFhgNLeSgc90VccT1qQ8fapytT64dUfAxhyAmfoq4UwC12GmvOAX70IdHSnKFlnn\/B0519f7XRn6XKgh3ro8OyCQaLEcWmRVbTsqsDlCrlTgB6QYBW1sXQYDRp6ZMfp8dfA5W7qXAY8kY9yGzou18KBUrMLrH\/cph8IYKAqjDNwGjYZQHdS+q47EjvgowI6\/eZmWPzBh2Wf7t6ATKH7LL6NMN37SFVrP3zXVJSTtxrPIF+ffflEDHFGeZBSWZwKIruo4\/OiF1J6KNwGijOZ62t2bI\/Cd3ikngMOM\/\/XApRw=="}},"sharedEventID":"23cf51fc-f7a2-432f-8dd1-c0f1c9a2ee7e","sourceIPAddress":"ec2.amazonaws.com","userAgent":"ec2.amazonaws.com","userIdentity":{"invokedBy":"ec2.amazonaws.com","type":"AWSService"}}
The following redaction_rules are implemented:
logs: [
  ...
  {
    path: "/var/log/whee.log",
    attributes: {parser: "whee"},
    redaction_rules: [
      { match_expression: "\"accessKeyId\":\"[a-zA-Z0-9]+\",?", replacement: "" },
      { match_expression: "(\"sessionToken\":)\"[a-zA-Z0-9=\\+\/\\\\]+\"", replacement: "\\1\"\"" },
    ]
  },
  ...
]
The accessKeyId parameter and its value are completely removed, while the sessionToken parameter is replaced with an empty value field:
{"awsRegion":"us-west-1","eventID":"b84ed668-0e65-4fd4-97a3-ee066fc878ea","eventName":"AssumeRole","eventSource":"sts.amazonaws.com","eventTime":"2020-09-09T17:57:45Z","eventType":"AwsApiCall","eventVersion":"1.05","recipientAccountId":"526942720160","requestID":"b8413714-d95f-42c6-8af4-ec8747d43aec","requestParameters":{"durationSeconds":1800,"roleArn":"arn:aws:iam::526942720160:role\/aws-service-role\/autoscaling.amazonaws.com\/AWSServiceRoleForAutoScaling","roleSessionName":"AutoScaling"},"resources":[{"ARN":"arn:aws:iam::526942720160:role\/aws-service-role\/autoscaling.amazonaws.com\/AWSServiceRoleForAutoScaling","accountId":"526942720160","type":"AWS::IAM::Role"}],"responseElements":{"assumedRoleUser":{"arn":"arn:aws:sts::526942720160:assumed-role\/AWSServiceRoleForAutoScaling\/AutoScaling","assumedRoleId":"AROAI7CCJHZPKCOC77O7U:AutoScaling"},"credentials":{"expiration":"Sep 9, 2020 6:27:45 PM","sessionToken":""}},"sharedEventID":"89d33e12-8796-44b9-a3d3-12f6a519d590","sourceIPAddress":"autoscaling.amazonaws.com","userAgent":"autoscaling.amazonaws.com","userIdentity":{"invokedBy":"autoscaling.amazonaws.com","type":"AWSService"}}
{"awsRegion":"us-west-1","eventID":"6c2a3d51-d5fc-4589-ad2d-5ee150f0f386","eventName":"AssumeRole","eventSource":"sts.amazonaws.com","eventTime":"2020-09-09T18:00:45Z","eventType":"AwsApiCall","eventVersion":"1.05","recipientAccountId":"526942720160","requestID":"Lux-88defdf6-478b-42a3-a2b5-67734c8518a4","requestParameters":{"roleArn":"arn:aws:iam::526942720160:role\/ecsInstanceRole","roleSessionName":"i-007a61d4d8d1333b5"},"resources":[{"ARN":"arn:aws:iam::526942720160:role\/ecsInstanceRole","accountId":"526942720160","type":"AWS::IAM::Role"}],"responseElements":{"credentials":{"expiration":"Sep 10, 2020, 12:05:03 AM","sessionToken":""}},"sharedEventID":"fde465cc-c953-40ec-99e7-93499de78f39","sourceIPAddress":"ec2.amazonaws.com","userAgent":"ec2.amazonaws.com","userIdentity":{"invokedBy":"ec2.amazonaws.com","type":"AWSService"}}
{"awsRegion":"us-west-1","eventID":"e95778bf-0c43-49a5-9208-703d80555fed","eventName":"AssumeRole","eventSource":"sts.amazonaws.com","eventTime":"2020-09-09T18:00:45Z","eventType":"AwsApiCall","eventVersion":"1.05","recipientAccountId":"526942720160","requestID":"Lux-88defdf6-478b-42a3-a2b5-67734c8518a4","requestParameters":{"roleArn":"arn:aws:iam::526942720160:role\/ecsInstanceRole","roleSessionName":"i-007a61d4d8d1333b5"},"resources":[{"ARN":"arn:aws:iam::526942720160:role\/ecsInstanceRole","accountId":"526942720160","type":"AWS::IAM::Role"}],"responseElements":{"credentials":{"expiration":"Sep 10, 2020, 12:12:02 AM","sessionToken":""}},"sharedEventID":"23cf51fc-f7a2-432f-8dd1-c0f1c9a2ee7e","sourceIPAddress":"ec2.amazonaws.com","userAgent":"ec2.amazonaws.com","userIdentity":{"invokedBy":"ec2.amazonaws.com","type":"AWSService"}}
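To reason about rules like these before touching an agent, the same substitutions can be tried locally with Python's re.sub. This is a sketch under the assumption that the agent's replacement behaves like re.sub. The patterns are the two match_expression values above with the JSON escaping removed, the redact helper is hypothetical, and the shortened sample line and its values are made up for illustration.

```python
import json
import re

# The two rules from agent.json as Python raw strings (JSON escapes removed).
RULES = [
    (r'"accessKeyId":"[a-zA-Z0-9]+",?', ""),
    (r'("sessionToken":)"[a-zA-Z0-9=\+/\\]+"', r'\1""'),
]


def redact(line, rules=RULES):
    """Apply each (pattern, replacement) pair in order to one log line."""
    for pattern, replacement in rules:
        line = re.sub(pattern, replacement, line)
    return line


# Shortened, made-up sample in the same shape as the log lines above.
sample = (
    '{"eventName":"AssumeRole","responseElements":{"credentials":'
    '{"accessKeyId":"ASIAEXAMPLEKEY0001",'
    '"expiration":"Sep 9, 2020 6:27:45 PM",'
    '"sessionToken":"abc\\/DEF+123=="}}}'
)

redacted = redact(sample)
print(redacted)

# The redacted line should still parse as JSON.
credentials = json.loads(redacted)["responseElements"]["credentials"]
print(credentials)
```

Note the ,? at the end of the first pattern: it also removes the trailing comma so the object stays valid JSON. If accessKeyId were ever the last key in its object, the comma before it would be left behind, so it is worth including such a line in your test data.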
Since the JSON formatting of these logs remains intact, we now apply a JSON format statement (via the "whee" parser we defined in this example) as follows:
{
  formats: [
    {
      format: "$=json{parse=json}$",
      repeat: true,
      rewrites: [
        {
          input: "eventTime",
          output: "timestamp",
          match: "(.+)",
          replace: "$1"
        }
      ]
    }
  ]
}
Note: The rewrites rule is used to establish the timestamp. timestamp is a reserved parameter, which is associated (via parser) with when the log event originally occurred on your platform.
Parser Output (first log event only):
{"awsRegion":"us-west-1","eventID":"b84ed668-0e65-4fd4-97a3-ee066fc878ea","eventName":"AssumeRole","eventSource":"sts.amazonaws.com","eventTime":"2020-09-09T17:57:45Z","eventType":"AwsApiCall","eventVersion":"1.05","recipientAccountId":"526942720160","requestID":"b8413714-d95f-42c6-8af4-ec8747d43aec","requestParameters":{"durationSeconds":1800,"roleArn":"arn:aws:iam::526942720160:role\/aws-service-role\/autoscaling.amazonaws.com\/AWSServiceRoleForAutoScaling","roleSessionName":"AutoScaling"},"resources":[{"ARN":"arn:aws:iam::526942720160:role\/aws-service-role\/autoscaling.amazonaws.com\/AWSServiceRoleForAutoScaling","accountId":"526942720160","type":"AWS::IAM::Role"}],"responseElements":{"assumedRoleUser":{"arn":"arn:aws:sts::526942720160:assumed-role\/AWSServiceRoleForAutoScaling\/AutoScaling","assumedRoleId":"AROAI7CCJHZPKCOC77O7U:AutoScaling"},"credentials":{"expiration":"Sep 9, 2020 6:27:45 PM","sessionToken":""}},"sharedEventID":"89d33e12-8796-44b9-a3d3-12f6a519d590","sourceIPAddress":"autoscaling.amazonaws.com","userAgent":"autoscaling.amazonaws.com","userIdentity":{"invokedBy":"autoscaling.amazonaws.com","type":"AWSService"}}
awsRegion: us-west-1
eventID: b84ed668-0e65-4fd4-97a3-ee066fc878ea
eventName: AssumeRole
eventSource: sts.amazonaws.com
eventTime: 2020-09-09T17:57:45Z
eventType: AwsApiCall
eventVersion: 1.05
message: ... Truncated ...
recipientAccountId: 526942720160
requestID: b8413714-d95f-42c6-8af4-ec8747d43aec
requestParametersDurationSeconds: 1800
requestParametersRoleArn: arn:aws:iam::526942720160:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling
requestParametersRoleSessionName: AutoScaling
resources: [{"accountId":"526942720160","type":"AWS::IAM::Role","ARN":"arn:aws:iam::526942720160:role\/aws-service-role\/autoscaling.amazonaws.com\/AWSServiceRoleForAutoScaling"}]
responseElementsAssumedRoleUserArn: arn:aws:sts::526942720160:assumed-role/AWSServiceRoleForAutoScaling/AutoScaling
responseElementsAssumedRoleUserAssumedRoleId: AROAI7CCJHZPKCOC77O7U:AutoScaling
responseElementsCredentialsExpiration: Sep 9, 2020 6:27:45 PM
responseElementsCredentialsSessionToken:
sharedEventID: 89d33e12-8796-44b9-a3d3-12f6a519d590
sourceIPAddress: autoscaling.amazonaws.com
timestamp: 2020-09-09T17:57:45Z (parsed as: Wed Sep 9, 2020 5:57:45 PM GMT, i.e. 230 minutes ago)
userAgent: autoscaling.amazonaws.com
userIdentityInvokedBy: autoscaling.amazonaws.com
userIdentityType: AWSService
...
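The field names in this output suggest that nested JSON objects are flattened by joining the key path and capitalizing the first letter of each nested key, while arrays are kept as JSON text. The sketch below only mimics the naming observed in this example so you can predict field names for your own logs. It is not DataSet's parser, and the flatten helper and the trimmed sample event are hypothetical.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into camel-cased key paths; keep lists as JSON text."""
    fields = {}
    for key, value in obj.items():
        # Nested keys get their first letter capitalized, e.g.
        # requestParameters + roleArn -> requestParametersRoleArn.
        name = (prefix + key[:1].upper() + key[1:]) if prefix else key
        if isinstance(value, dict):
            fields.update(flatten(value, name))
        elif isinstance(value, list):
            fields[name] = json.dumps(value, separators=(",", ":"))
        else:
            fields[name] = value
    return fields


# Trimmed version of the first redacted event above.
event = {
    "eventName": "AssumeRole",
    "requestParameters": {"durationSeconds": 1800, "roleSessionName": "AutoScaling"},
    "responseElements": {"credentials": {"sessionToken": ""}},
    "userIdentity": {"invokedBy": "autoscaling.amazonaws.com", "type": "AWSService"},
}

fields = flatten(event)
for name in sorted(fields):
    print(name, "=", fields[name])
```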
Testing
As previously mentioned, we strongly recommend testing any redaction rules before deploying them to a live environment. Since replacements occur at the Agent level, it's important to perform any debugging in advance. When testing:
- Configure a Scalyr Agent with a bare-bones config file. This simplifies the implementation and makes issues easier to track.
- Choose a unique log file for testing. It should be readily distinguishable from live logs that are in use.
- Copy real-world data. Identify potential worst-case scenarios from your platform / application logs.
- Test your redaction rules by pasting that log data into the test log file you chose above.
- Review the results in the DataSet search UI. Confirm that only the fields you redacted have been removed.
- If other fields have been impacted, double-check your regular expressions and how they applied to your data.
- You can speed up the testing process by restarting the "test-only" Scalyr Agent as needed.
- Note that the agent.json file is checked for changes once every 30s, so if you'd like your changes to be recognized faster, you can simply restart the Agent.
- The Scalyr Agent will fall back to its last good configuration if there is an error. This can be confirmed by running scalyr-agent-2 status -v and checking the "Agent Configuration" section. If this is the case, your most recent changes may not be applied.
Example of an invalid configuration file error:
Agent configuration:
====================
Configuration file: /etc/scalyr-agent-2/agent.json
Status: Bad (could not parse, using last good version)
Last checked: Wed Sep 9 18:55:17 2020 UTC
Last changed observed: Wed Sep 9 18:55:17 2020 UTC
Parsing error: The value for required field "match_expression" has a value that cannot be parsed as string regular expression (using python syntax). Error is in the entry with index=1 in the "redaction_rules" array in the entry for "/var/log/whee.log" in the "logs" array in configuration file "/etc/scalyr-agent-2/agent.json" [[badField="match_expression" errorCode="notRegexp"]]
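The review step in the checklist above (confirming that only the intended fields changed) can be partly automated by comparing a raw line with its redacted counterpart, for example one line copied from the test log file and the matching line from the search results. This is a minimal sketch: leaf_paths and changed_paths are hypothetical helpers, and the sample lines are made up.

```python
import json

_MISSING = object()  # Marks a key path that is absent on one side.


def leaf_paths(obj, prefix=""):
    """Map dotted key paths to values for nested dicts (lists count as leaves)."""
    paths = {}
    for key, value in obj.items():
        path = prefix + "." + key if prefix else key
        if isinstance(value, dict):
            paths.update(leaf_paths(value, path))
        else:
            paths[path] = value
    return paths


def changed_paths(original_line, redacted_line):
    """Return sorted key paths that differ or are missing after redaction."""
    before = leaf_paths(json.loads(original_line))
    after = leaf_paths(json.loads(redacted_line))
    return sorted(
        path
        for path in set(before) | set(after)
        if before.get(path, _MISSING) != after.get(path, _MISSING)
    )


original = (
    '{"eventName":"AssumeRole","credentials":'
    '{"accessKeyId":"ASIAEXAMPLE","sessionToken":"abc=="}}'
)
redacted = '{"eventName":"AssumeRole","credentials":{"sessionToken":""}}'

print(changed_paths(original, redacted))
```

Any path in the result that you did not intend to redact points to a pattern that matches more than it should.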
Upcoming Features
There are plans to make redaction rules applicable to log files that are ingested via other methods (such as the syslog monitor). Stay tuned!