This article was written to assist customers who have noticed a sudden, unexpected decline in log activity for a particular log. It includes a number of best practices that can be used to isolate the potential cause(s).
Step 1 - Verify the log file path exists and is accessible to the Scalyr Agent
This step is fairly self-explanatory, and will vary by OS and host configuration.
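As a minimal sketch of this check on a Linux host, the helper below reports whether a path exists, is a regular file, and is readable. The function name check_log_path is hypothetical and is not part of the Scalyr Agent. It only reflects the permissions of the user who runs it, so run it as the same user the Agent runs as (root can read almost any file, which would mask a permissions problem).

```shell
#!/usr/bin/env bash
# check_log_path PATH (hypothetical helper, not part of the Scalyr Agent)
# Prints one of: missing, not-a-regular-file, unreadable, ok.
# Readability is evaluated for the user running this script.
check_log_path() {
    local path="$1"
    if [ ! -e "$path" ]; then
        echo "missing"
    elif [ ! -f "$path" ]; then
        echo "not-a-regular-file"
    elif [ ! -r "$path" ]; then
        echo "unreadable"
    else
        echo "ok"
    fi
}
```

For example, check_log_path /var/log/myapp/app.log (an illustrative path) should print ok when the file is in place and readable.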
Linux hosts only: If you're running logrotate with compress, be sure that you are also using delaycompress and a reasonable size (>= 200).
delaycompress prevents logrotate from immediately compressing logs once a rotate cycle is initiated. Instead, logrotate will wait one cycle before compressing, which gives the Agent enough time to complete its upload to DataSet.
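One rough way to spot a risky configuration is to look for a logrotate file that enables compress but never mentions delaycompress. The sketch below does that with grep. The function name compress_without_delay is hypothetical, and the check covers the file as a whole rather than each stanza, so treat a match as a prompt to read the file, not as a definitive diagnosis.

```shell
#!/usr/bin/env bash
# compress_without_delay FILE (hypothetical helper)
# Exit status 0 if FILE contains a "compress" directive but no
# "delaycompress" directive anywhere; non-zero otherwise.
# The patterns match whole directives only, so "nocompress" and
# "compresscmd" do not count as "compress".
compress_without_delay() {
    local conf="$1"
    grep -Eq '^[[:space:]]*compress([[:space:]]|$)' "$conf" &&
        ! grep -Eq '^[[:space:]]*delaycompress([[:space:]]|$)' "$conf"
}
```

A possible use, assuming your logrotate rules live in /etc/logrotate.d: for f in /etc/logrotate.d/*; do compress_without_delay "$f" && echo "review: $f"; done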
Step 2 - Verify timestamps
Note: This only applies if timestamps within the log are parsed (and recognized). If not, log events are associated with the time of ingestion.
If the log contains timestamps, confirm that any searches you perform to verify upload activity are for the same time period and timezone. For example, a log event with a timestamp from 2 weeks ago will be associated with that date when uploaded to DataSet, rather than the time of ingestion.
You can confirm this by looking for the sca:ingestTime attribute. Click a log event and check the "Inspect Log Line" dialog. If present, sca:ingestTime is the epoch time at which the log event was ingested, and its presence indicates that the event's parsed timestamp is earlier than its actual time of ingestion.
Furthermore, if the timestamp falls outside your retention period (e.g., you have 30-day retention and the log's timestamp is from 32 days ago), the log event won't be included in any search results.
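The retention arithmetic can be sketched as follows. The function name within_retention is hypothetical, and the sketch assumes that both timestamps are epoch seconds and that retention is counted in whole 24-hour days back from the present; your account's actual retention boundaries may be computed differently.

```shell
#!/usr/bin/env bash
# within_retention EVENT_EPOCH NOW_EPOCH RETENTION_DAYS (hypothetical helper)
# Exit status 0 if the event timestamp is no older than RETENTION_DAYS
# before NOW_EPOCH. All times are assumed to be epoch seconds.
within_retention() {
    local event="$1" now="$2" days="$3"
    local cutoff=$(( now - days * 86400 ))   # 86400 seconds per day
    [ "$event" -ge "$cutoff" ]
}
```

With 30-day retention, an event stamped 32 days before the present fails this check, which matches the case described above where the event is absent from search results.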
Step 3 - Check discard rules
Before proceeding with any other steps, verify that the logs in question are not being discarded by a discard rule on the "Cost Management" page. The following query makes the identification process easier, especially if many discard rules are configured.
- Identify when the last log event arrived with the "Search" function
- Go back roughly 10 minutes from that point
- Run the following Search query after adjusting the time range. Note: "Full" permissions are required in order to access the audit logs.
tag='audit' action='updateFilter' filterRequest.disabledUntil contains "1970"
This query will list any "Cost Management" discard rules that were activated during the timeframe when the logs went missing. You will still need to confirm that a listed discard rule actually affects the logs in question. Discards performed by a parser or by the Agent are not returned.
Step 4 - Confirm that the log is being uploaded
If you're using the Agent, run sudo scalyr-agent-2 status -v to confirm that the log in question is being monitored for changes and uploaded to DataSet. In particular, refer to the "Log Transmission" section.
Review the /var/log/scalyr-agent-2/agent.log file for errors and/or reported issues that are associated with the missing log events. If implicit_agent_log_collection = true, you can run logfile='/var/log/scalyr-agent-2/agent.log' severity > 3 from the DataSet search. Otherwise, a cursory search can be performed on the local file.
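For the local-file case, a cursory search might look like the sketch below. The function name agent_log_issues is hypothetical, and the pattern assumes that the Agent writes severity labels such as WARNING, ERROR, or CRITICAL into each line; adjust it if your agent.log uses different labels.

```shell
#!/usr/bin/env bash
# agent_log_issues FILE [COUNT] (hypothetical helper)
# Prints the last COUNT lines (default 20) of FILE that contain a
# WARNING, ERROR, or CRITICAL label. The labels are an assumption
# about the agent.log line format.
agent_log_issues() {
    local file="$1" count="${2:-20}"
    grep -E 'WARNING|ERROR|CRITICAL' "$file" | tail -n "$count"
}
```

For example: sudo bash -c 'source ./agent_log_issues.sh; agent_log_issues /var/log/scalyr-agent-2/agent.log 50', where agent_log_issues.sh is whatever file you saved the function in.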
From your DataSet account, run tag='logVolume' forlogfile!='none' for the timeframe determined in Step 3. Review the resultant forlogfile values and check whether the log in question is present.
Step 5 - Contact DataSet Support
If you have completed the above steps and have any questions, please contact the DataSet Support team by signing into the Support Portal and submitting a ticket. When doing so, kindly include the following in a .tar.gz file:
- The output of sudo scalyr-agent-2 status -v
- The /etc/scalyr-agent-2/* directory, with any API keys redacted
- The /var/log/scalyr-agent-2/agent.log file
- The last 20 lines or so of the log file in question (tail -n 20 <path> > logfile_snippet.txt)
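The items above could be gathered with a small script along these lines. This is a sketch under stated assumptions, not an official DataSet tool: the function name make_support_bundle and the output file names are hypothetical, and the redaction pattern assumes API keys appear as a double-quoted value following api_key. Inspect the resulting archive for leftover secrets before attaching it to a ticket.

```shell
#!/usr/bin/env bash
# make_support_bundle CONFIG_DIR AGENT_LOG TARGET_LOG OUT_TARBALL
# (hypothetical helper) Stages the requested files in a temporary
# directory, redacts api_key values, and writes a .tar.gz archive.
make_support_bundle() {
    local config_dir="$1" agent_log="$2" target_log="$3" out="$4"
    local stage
    stage="$(mktemp -d)" || return 1

    # Agent status output; skipped when the agent CLI is not on PATH.
    if command -v scalyr-agent-2 >/dev/null 2>&1; then
        scalyr-agent-2 status -v > "$stage/agent_status.txt" 2>&1 || true
    fi

    # Copy the configuration, then replace quoted api_key values.
    # Assumption: keys look like  api_key: "..."  or  "api_key": "..."
    cp -R "$config_dir" "$stage/config" || return 1
    find "$stage/config" -type f -exec \
        sed -i.bak -E 's/(api_key"?[[:space:]]*[:=][[:space:]]*")[^"]*/\1REDACTED/' {} +
    # Remove sed's backup copies, which still hold the original keys.
    find "$stage/config" -name '*.bak' -delete

    cp "$agent_log" "$stage/agent.log" || return 1
    tail -n 20 "$target_log" > "$stage/logfile_snippet.txt" || return 1

    tar -czf "$out" -C "$stage" . || return 1
    rm -rf "$stage"
}
```

Run as root so the status command and log files are readable, for example: make_support_bundle /etc/scalyr-agent-2 /var/log/scalyr-agent-2/agent.log <path> support_bundle.tar.gz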