Introduction
This article covers how to import log data from your Splunk instance to DataSet while retaining context. There are two methods, depending on how much data you are moving.
Important
The parser provided below is configured to extract the reserved DataSet timestamp parameter from Splunk's result_indextime field. Consequently, logs imported to DataSet will be associated with the same timestamp they had in Splunk, and you will need to expand your search window to see older entries. Note that logs with a timestamp older than your retention period will not be stored in DataSet. For example, a log event from 32 days ago will not be stored if your retention period is 30 days.
Upload > 1GB of logs
If you are sending more than 1GB of logs, we recommend using the DataSet S3 import routine.
Prerequisites
- Splunk account
- Access to Splunk CLI
- AWS account with access to S3 and SQS
- DataSet account with full access
Instructions
Please follow these instructions to establish an S3 bucket and SQS queue capable of importing logs to your DataSet account: https://app.scalyr.com/solutions/import-logs-from-s3
- Set up an S3 Bucket
- Create an SQS Queue and configure your S3 bucket to publish new object notifications to it
- Configure IAM permissions (See this section for DataSet-specific permissions)
- In DataSet, set up an S3 Monitor (click the User Menu -> "Monitors" -> "Edit JSON" button)
Your monitor config will be similar to the following example:
monitors: [
  {
    type: "s3Bucket",
    region: "us-east-1",
    roleToAssume: "arn:aws:iam::account-id:role/role-name-with-path",
    queueUrl: "https://sqs.us-east-1.amazonaws.com/nnnnnnnnnnnn/scalyr-s3-bucket-foo",
    fileFormat: "text_gzip",
    hostname: "foo",
    parser: "foo"
  }
]
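If you prefer to script steps 1 and 2, here is a minimal sketch using the AWS CLI; the bucket name, queue name, region, and account ID are illustrative placeholders and should match your monitor config:
# Create the S3 bucket (step 1)
aws s3api create-bucket --bucket scalyr-s3-bucket-foo --region us-east-1
# Create the SQS queue (step 2)
aws sqs create-queue --queue-name scalyr-s3-bucket-foo
# Publish new-object notifications from the bucket to the queue.
# The queue's access policy must also allow S3 to send it messages;
# see the linked instructions for the IAM details.
aws s3api put-bucket-notification-configuration \
  --bucket scalyr-s3-bucket-foo \
  --notification-configuration '{"QueueConfigurations": [{"QueueArn": "arn:aws:sqs:us-east-1:nnnnnnnnnnnn:scalyr-s3-bucket-foo", "Events": ["s3:ObjectCreated:*"]}]}'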
- Configure the parser (click the User Menu -> "Manage Logs" -> "Parser" -> "Add Parser" -> Enter "splunk-json"). You can use the following example parser or create your own:
// Parser for log files containing JSON records.
{
attributes: {
// Tag all events parsed with this parser so we can easily select them in queries.
dataset: "splunk"
},
formats: [
{
format: "${parse=json}$",
repeat: true,
rewrites: [
{
input: "result_indextime",
output: "timestamp",
match: ".*",
replace: "$0"
},
{
input: "resultSource",
output: "logfile",
match: ".*",
replace: "$0"
},
{
input: "resultHost",
output: "serverHost",
match: ".*",
replace: "$0"
}
]
}
]
}
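Because the parser tags every event with dataset: "splunk", you can later select all the imported events in DataSet with a query such as the following (a simple filter on the attribute set by the parser above):
dataset == "splunk"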
- Export the Splunk log file
cd /opt/splunk/bin
./splunk search "query" -output json -maxout 0 > scalyr_testuat_s3.log
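For example, to export the last 30 days from a hypothetical index named main:
./splunk search "index=main earliest=-30d" -output json -maxout 0 > scalyr_testuat_s3.log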
- Split the log file into chunks and compress each one
split -l 40000 "scalyr_testuat_s3.log" "splunk3.gz.part-" && gzip -9 splunk3.gz.part*
- Add the compressed log files to your S3 bucket, for example with the AWS CLI as sketched below
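A minimal upload sketch using the AWS CLI, assuming the bucket from the monitor config above and the compressed part files from the previous step:
aws s3 cp . s3://scalyr-s3-bucket-foo/ --recursive --exclude "*" --include "splunk3.gz.part-*"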
Upload < 1GB of logs
If you are uploading less than 1GB of logs, you can export them from the Splunk CLI or UI as JSON objects and configure the Scalyr Agent to stream them to DataSet.
Prerequisites
- Splunk account
- Access to Splunk CLI
- Scalyr Agent
- DataSet account with full access
Objectives
Migrate data from Splunk to DataSet
Splunk Configuration
- Define the query in Splunk for the data you want to export. For example,
index=*
- Navigate to your Splunk instance and invoke the CLI
cd /opt/splunk/bin
./splunk search "query" -output json -maxout 0
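Before running a full export, you can sanity-check the query by previewing a handful of events (the head command limits the result count; the query is illustrative):
./splunk search "index=* | head 5" -output json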
DataSet Configuration
- Install the Agent
curl -sO https://www.scalyr.com/install-agent.sh
You can access your API key by clicking the User Menu -> "API Keys" and selecting a key with "Write" log access
sudo bash ./install-agent.sh --set-api-key "api-key"
sudo scalyr-agent-2 start
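To confirm the agent is running and able to upload, you can check its detailed status (output varies by agent version):
sudo scalyr-agent-2 status -v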
- Configure the Agent
Note: Instead of vim, feel free to use a text editor of your choice
vim /etc/scalyr-agent-2/agent.json
Configure the agent to watch the export directory and specify the name of the parser you will be using ("splunk-json" in this example). The entry belongs in the logs array of agent.json:
logs: [
  {
    path: "/opt/splunk_export/*.log",
    attributes: {parser: "splunk-json"}
  }
]
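The agent periodically re-reads agent.json, but you can also restart it to pick up the change immediately:
sudo scalyr-agent-2 restart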
- Configure the parser (click the User Menu -> "Manage Logs" -> "Parser" -> "Add Parser" -> Enter "splunk-json"). You can use the following example parser or create your own:
// Parser for log files containing JSON records.
{
attributes: {
// Tag all events parsed with this parser so we can easily select them in queries.
dataset: "splunk"
},
formats: [
{
format: "${parse=json}$",
repeat: true,
rewrites: [
{
input: "result_indextime",
output: "timestamp",
match: ".*",
replace: "$0"
},
{
input: "resultSource",
output: "logfile",
match: ".*",
replace: "$0"
},
{
input: "resultHost",
output: "serverHost",
match: ".*",
replace: "$0"
}
]
}
]
}
- Once the agent and parser are installed and configured, we are ready to export.
Export logs from Splunk
- Export data from Splunk to DataSet by running the search command from above and directing its output to the file we just configured. Note: For the Splunk search command, set the -maxout value to 0 for unlimited output. (If there are issues with the dataset, you can use the dump command.)
./splunk search "index=_internal earliest=mm/dd/yyyy:hh:mm:ss latest=mm/dd/yyyy:hh:mm:ss" -output json -maxout 0 > /opt/splunk_export/output.log
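For example, to export one day of internal Splunk logs (the timestamps are illustrative):
./splunk search "index=_internal earliest=01/01/2023:00:00:00 latest=01/02/2023:00:00:00" -output json -maxout 0 > /opt/splunk_export/output.log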
- Since we configured the Agent to look for files in the /opt/splunk_export/ directory ending in .log, it will upload the logs as they are written to the output.log file
- Uploaded logs should now be visible in your DataSet account.