Parsing is an important step in any log aggregation tool. Scalyr parsers are relatively simple to build, and several predefined parsers work out of the box. You can now also use Logstash, and the work of the Logstash community, to manipulate your data instead of or in addition to Scalyr parsers. If you've already created Grok patterns or other transforms within Logstash, you can reuse them when bringing logs into DataSet.
Prereqs
Logs (Nginx/Node in this example)
Filebeat
Logstash
DataSet Account
Docker-compose (optional)
Docker (optional)
Setup
Option 1: docker-compose
- You can spin this up with docker-compose easily. Check it out on GitHub. Don't forget to specify an API key in ./logstash/pipeline/nginx.conf
- Skip to "Configure Logstash" below.
Option 2: Manual Setup
0. Install Logstash and Filebeat.
1. Install the DataSet output plugin.
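cd /usr/share/logstash
bin/logstash-plugin install logstash-output-scalyr
(/usr/share/logstash is the default location for package installs; adjust the path if Logstash is installed elsewhere.)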
2. Add the DataSet output plugin configuration to your Logstash config file (`logstash-simple.conf`). The configuration needs a DataSet API key with write access.
output {
  scalyr {
    api_write_token => "<your API token here>"
  }
}
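The output block above only covers the sending side; Logstash also needs an input to receive events from Filebeat. A minimal sketch, assuming Filebeat ships to Logstash over the Beats protocol on port 5044 (the conventional port; this article does not specify one):

```conf
# Assumed input block: listen for connections from Filebeat.
input {
  beats {
    port => 5044   # must match the port in Filebeat's output.logstash hosts setting
  }
}
```

On the Filebeat side this would pair with an `output.logstash` section pointing at the same host and port, for example `hosts: ["localhost:5044"]`.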
Configure Logstash
3. Grok the data. This parses each log line into key-value pairs and replaces the DataSet parser functionality. The grok block belongs inside the filter section of the pipeline.
grok {
  match => [ "message", "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}" ]
  overwrite => [ "message" ]
}
4. Transform and map data
filter {
  mutate {
    add_field => { "serverHost" => "my hostname" }
    rename => {
      "path" => "logfile"
      "data" => "message"
    }
  }
}
5. View Data in DataSet.
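Before checking in DataSet, it can help to confirm locally that the grok and mutate steps produce the fields you expect. The sketch below places steps 3 and 4 in a single filter block and adds a temporary `stdout` output with the `rubydebug` codec, which prints each parsed event to the Logstash console. The `stdout` output is an addition for debugging and is not part of the original setup; remove it once the events look right.

```conf
filter {
  # Step 3: parse the raw line into fields.
  grok {
    match => [ "message", "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}" ]
    overwrite => [ "message" ]
  }
  # Step 4: add and rename fields for DataSet.
  mutate {
    add_field => { "serverHost" => "my hostname" }
    rename => {
      "path" => "logfile"
      "data" => "message"
    }
  }
}

output {
  # Temporary: print each event so the parsed fields can be inspected.
  stdout { codec => rubydebug }
  scalyr {
    api_write_token => "<your API token here>"
  }
}
```

Events that grok cannot match are tagged `_grokparsefailure`, which makes mismatched lines easy to spot in this output.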
DataSet equivalent parser that is being bypassed
// Parser for standard Apache-format access logs.
{
  attributes: {
    // Tag all events parsed with this parser so we can easily select them in queries.
    dataset: "accesslog"
  },

  formats: [
    // Extended format including referrer, user-agent, and response time.
    {
      format: "$ip$ $user$ $authUser$ \\[$timestamp$\\] \"$method$ $uri{parse=uri}$ $protocol$\" $status$ $bytes$ $referrer=quotable$ $agent=quotable$ $time=number$",
      halt: true
    },

    // Format including referrer and user-agent (but no response time)
    {
      format: "$ip$ $user$ $authUser$ \\[$timestamp$\\] \"$method$ $uri{parse=uri}$ $protocol$\" $status$ $bytes$ $referrer=quotable$ $agent=quotable$",
      halt: true
    },

    // Including referrer and user-agent, but with no separate method, uri, and protocol. Sometimes
    // observed for invalid or incomplete requests.
    {
      format: "$ip$ $user$ $authUser$ \\[$timestamp$\\] \"$header$\" $status$ $bytes$ $referrer=quotable$ $agent=quotable$",
      halt: true
    },

    // Basic format with no referrer or user-agent
    {
      format: "$ip$ $user$ $authUser$ \\[$timestamp$\\] \"$method$ $uri{parse=uri}$ $protocol$\" $status$ $bytes$",
      halt: true
    },

    // Basic format, with no separate method, uri, and protocol.
    {
      format: "$ip$ $user$ $authUser$ \\[$timestamp$\\] \"$header$\" $status$ $bytes$",
      halt: true
    }
  ]
}