Mapping JSON data to DataSet
The DataSet Kafka Connector supports sending custom application messages to DataSet. The mapping of application message fields to DataSet event attributes is specified in the custom_app_event_mapping DataSet connector config. This mapping is optional, but it lets you take full advantage of the DataSet UI.
1. Go to the sink config directory and open the connector config.
cd $KAFKA_SCALYR_SINK_CONFIG
vim connect-scalyr-sink-custom-app.json
{
  "name": "scalyr-sink-connector",
  "config": {
    "connector.class": "com.scalyr.integrations.kafka.ScalyrSinkConnector",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "tasks.max": "3",
    "topics": "logs",
    "api_key": "<SCALYR LOG WRITE API TOKEN>",
    "event_enrichment": "tag=kafka",
    "custom_app_event_mapping": "[{\"matcher\": {\"attribute\": \"app.name\", \"value\": \"myapp\"}, \"eventMapping\": {\"message\": \"message\", \"logfile\": \"log.file.path\", \"serverHost\": \"host.hostname\", \"parser\": \"fields.parser\", \"version\": \"app.version\", \"appField1\": \"appField1\", \"appField2\": \"nested.appField2\"}, \"delimiter\": \"\\\\.\"}]"
  }
}
Note: JSON does not allow comments, so keep the file comment-free. tasks.max (set to "3" here) should match the number of partitions in the topic.
2. Modify the custom_app_event_mapping (or add one) to match your application's fields.
[{
  "matcher": {
    "attribute": "app.name",
    "value": "myapp"
  },
  "eventMapping": {
    "message": "message",
    "logfile": "log.file.path",
    "serverHost": "host.hostname",
    "parser": "fields.parser",
    "version": "app.version",
    "appField1": "appField1",
    "appField2": "nested.appField2"
  },
  "delimiter": "\\."
}]
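To illustrate how a mapping like the one above is applied, here is a minimal Python sketch of the documented behavior (an illustration, not the connector's actual implementation): the matcher attribute is resolved against the incoming message; if its value matches, each eventMapping entry is resolved the same way to build the flat DataSet event.

```python
import re

def resolve(record, path, delimiter=r"\."):
    """Resolve a nested field like 'host.hostname' against a dict."""
    for key in re.split(delimiter, path):
        if not isinstance(record, dict) or key not in record:
            return None
        record = record[key]
    return record

def apply_mapping(record, mapping):
    """Return a flat DataSet event if the matcher matches, else None."""
    delimiter = mapping.get("delimiter", r"\.")
    matcher = mapping["matcher"]
    if resolve(record, matcher["attribute"], delimiter) != matcher["value"]:
        return None
    return {attr: resolve(record, path, delimiter)
            for attr, path in mapping["eventMapping"].items()}

# Hypothetical application message, trimmed to two mapped fields.
msg = {"app": {"name": "myapp"}, "message": "hello",
       "host": {"hostname": "Test-Host"}}
mapping = {"matcher": {"attribute": "app.name", "value": "myapp"},
           "eventMapping": {"message": "message",
                            "serverHost": "host.hostname"},
           "delimiter": r"\."}
event = apply_mapping(msg, mapping)
# event == {"message": "hello", "serverHost": "Test-Host"}
```

A message whose app.name is not "myapp" produces no event, since the matcher acts as a filter.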
3. Map fields to their proper DataSet attribute according to the table in Appendix A. Nested fields are flattened using "." syntax. For example, given the Filebeat log record below, to map the file path to logfile in DataSet you would set "logfile": "log.file.path".
{
  "@timestamp": "2020-02-15T01:59:18.429Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.6.0",
    "pipeline": "filebeat-7.6.0-system-syslog-pipeline"
  },
  "message": "Feb 14 17:53:46 user com.apple.xpc.launchd[1] (com.apple.xpc.launchd.domain.pid.WebContent.22691): Path not allowed in target domain: type = pid, path = /System/Library/StagedFrameworks/Safari/SafariShared.framework/Versions/A/XPCServices/com.apple.Safari.SearchHelper.xpc/Contents/MacOS/com.apple.Safari.SearchHelper error = 147: The specified service did not ship in the requestor's bundle, origin = /System/Library/StagedFrameworks/Safari/WebKit.framework/Versions/A/XPCServices/com.apple.WebKit.WebContent.xpc",
  "input": {
    "type": "log"
  },
  "event": {
    "module": "system",
    "dataset": "system.syslog",
    "timezone": "-08:00"
  },
  "host": {
    "hostname": "Test-Host",
    "architecture": "x86_64",
    "os": {
      "platform": "darwin",
      "version": "10.14.6",
      "family": "darwin",
      "name": "Mac OS X",
      "kernel": "18.7.0",
      "build": "18G95"
    },
    "id": "4C372C34-DFFD-5B38-B575-DCF17623AD29",
    "name": "Test-Host"
  },
  "log": {
    "file": {
      "path": "/var/log/system.log"
    },
    "offset": 176072
  },
  "fileset": {
    "name": "syslog"
  },
  "service": {
    "type": "system"
  },
  "ecs": {
    "version": "1.4.0"
  },
  "agent": {
    "hostname": "Test-Host",
    "id": "7714b3df-79af-4c7a-8c47-d45b134bbd24",
    "version": "7.6.0",
    "type": "filebeat",
    "ephemeral_id": "a65e0775-fb06-409f-a366-c04a2157603c"
  }
}
4. Convert the mapping to an escaped JSON string and set it as the custom_app_event_mapping value.
"custom_app_event_mapping": "[{\"matcher\": {\"attribute\": \"app.name\", \"value\": \"myapp\"}, \"eventMapping\": {\"message\": \"message\", \"logfile\": \"log.file.path\", \"serverHost\": \"host.hostname\", \"parser\": \"fields.parser\", \"version\": \"app.version\", \"appField1\": \"appField1\", \"appField2\": \"nested.appField2\"}, \"delimiter\": \"\\\\.\"}]"
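Rather than escaping by hand, you can generate the escaped string programmatically. A small Python sketch (the mapping is the step 2 example): calling json.dumps once produces the JSON array, and calling it a second time produces the escaped string form to paste into the connector config.

```python
import json

# The mapping from step 2, as a normal Python structure.
mapping = [{
    "matcher": {"attribute": "app.name", "value": "myapp"},
    "eventMapping": {
        "message": "message",
        "logfile": "log.file.path",
        "serverHost": "host.hostname",
        "parser": "fields.parser",
        "version": "app.version",
        "appField1": "appField1",
        "appField2": "nested.appField2",
    },
    "delimiter": "\\.",
}]

# First dumps: the JSON array. Second dumps: that array as an
# escaped JSON string, ready to paste into the config file.
escaped = json.dumps(json.dumps(mapping))
print('"custom_app_event_mapping": ' + escaped)
```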
5. Start the connector by POSTing the config to the Kafka Connect REST API (port 8083 by default).
curl -X POST -H "Content-Type: application/json" -d @connect-scalyr-sink-custom-app.json http://localhost:8083/connectors
Appendix A
If you are configuring a custom connector, the documentation below explains how to map fields to DataSet and what to look for.
Here is a table of the required and special fields within DataSet.
| Field | Requirement | Description |
| --- | --- | --- |
| message | required | The log message body. It is interpreted by the parser and enables free-text search. |
| serverHost | recommended | Allows for the proper display of the home page dashboard and is used to report on log volume. |
| logfile | optional | Used to report on log volume; an attribute in the home page dashboard and the log volume dashboard. |
| timestamp | optional | Should be in the message or in a field. Important unless you can tolerate the ingestion time being used as the timestamp. Keeps your logs in order and ensures higher accuracy in searches. |
| severity | optional | Converted to an int in the range 1 - 6; can be used to easily search for errors. |
| parser | recommended | The value of this field creates a parser with that name if one does not already exist. It is important to use one parser per logical data grouping. You can specify prebuilt parsers from this list, or create your own. |
| serverIP | optional | Optional field that will show up on the home page. |
An application may have nested fields, while DataSet events use a flat key/value structure. Nested fields are specified in the format field1.field2.field3, where the fields are separated by a delimiter. By default, the delimiter is "." (written as the escaped regular expression "\\." in the mapping above).
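As an illustration of delimiter handling (a sketch under the assumption that the delimiter is treated as a split pattern, not the connector's actual code): if a hypothetical application emitted attribute paths joined with underscores instead of dots, the mapping could set "delimiter": "_" and the same split-and-descend lookup would apply.

```python
import re

def lookup(record, path, delimiter):
    """Split the path on the delimiter (a regex) and walk the nested dict."""
    value = record
    for key in re.split(delimiter, path):
        value = value[key]
    return value

record = {"host": {"hostname": "Test-Host"}}

# Default delimiter: a literal dot, escaped for the regex engine.
assert lookup(record, "host.hostname", r"\.") == "Test-Host"

# Hypothetical custom delimiter "_", i.e. "delimiter": "_" in the mapping.
assert lookup(record, "host_hostname", "_") == "Test-Host"
```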