Log Aggregation

Logs from the various component services that make up PNDA, and the applications that run on PNDA, are collected and stored on the logserver node.

Logstash clients on each node monitor the various log files and push data in near real-time to a Logstash server. The Logstash server writes the logs to raw text files under /var/log/pnda and also sends them to Elasticsearch, which indexes the logs and makes them available to Kibana.


Kibana is linked from the main console under Metrics ► Quick Links ► PNDA Logserver. You can search for specific logs, or create graphs on various aspects of the log data. A basic dashboard for PNDA logs is provided out-of-the-box.

Raw logs

Plain text log files are written under /var/log/pnda on the logserver node.

They are rotated with logrotate, limiting each file to 10 MB with five prior versions retained. The logrotate config is in /etc/logrotate.d/pnda. When debugging a particular problem, it can be useful to adjust these settings so that more logs are retained for specific components.
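As a rough illustration of what these settings imply for disk usage, the following sketch computes the nominal upper bound for a single log: the live file plus its retained rotations, each at the size limit. The function name is hypothetical, and the calculation assumes every file is rotated as soon as it reaches the limit, which is not guaranteed in practice.

```python
MB = 1024 * 1024


def max_retained_bytes(size_limit_bytes: int, rotations_kept: int) -> int:
    """Nominal upper bound for one log: the live file plus its rotated copies.

    Assumes each file is rotated exactly when it reaches the size limit.
    """
    return size_limit_bytes * (1 + rotations_kept)


# Values taken from the text: 10 MB per file, five prior versions retained.
per_log = max_retained_bytes(10 * MB, 5)
print(per_log // MB)  # 60
```

Raising the number of retained versions for one component scales this bound linearly, which may help when deciding how far the settings can be relaxed during debugging.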

Be aware that logrotate only runs every 15 minutes. If a rogue application writes a lot of log data in under 15 minutes, the logserver could run out of disk space. This is not as hard as it sounds with big data: consider per-message debug output.
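To get a feel for the risk, the sketch below estimates how much a single application could write between two logrotate runs and compares it with the free space on a volume. The message rate and line size are made-up illustrative figures, and the function names are hypothetical; only the 15 minute interval comes from the text.

```python
import shutil

ROTATION_INTERVAL_SECONDS = 15 * 60  # logrotate runs every 15 minutes


def bytes_between_rotations(messages_per_second: int, bytes_per_message: int,
                            interval_seconds: int = ROTATION_INTERVAL_SECONDS) -> int:
    """Log volume one application can write before logrotate next runs."""
    return messages_per_second * bytes_per_message * interval_seconds


def free_bytes(path: str) -> int:
    """Free space on the volume holding `path` (for example /var/log/pnda)."""
    return shutil.disk_usage(path).free


# Illustrative assumption: 10,000 messages/s, one 200-byte debug line each.
estimate = bytes_between_rotations(10_000, 200)
print(estimate)  # 1800000000
```

On the logserver, comparing `estimate` with `free_bytes("/var/log/pnda")` gives a quick indication of whether a burst of that size would fit before the next rotation.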

Application Logs

For YARN applications, logs are aggregated from the log files named stdout, stderr and spark.log. The aggregated log files are named yarn_applicationId.log, where applicationId is the YARN application ID.

The YARN application ID (or IDs, in the case of a PNDA application that makes use of multiple YARN applications) can be obtained programmatically by querying the Deployment Manager (/applications/application_id/detail), or manually by looking in the YARN Resource Manager UI. This will be integrated into the PNDA Console in a future release.
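The programmatic lookup could be sketched as follows. Only the /applications/application_id/detail path comes from the text; the base URL, the function names, and the example IDs are hypothetical. The response schema is not described here, so the sketch returns the parsed JSON as-is rather than assuming any field names. The log file name helper reflects one reading of the yarn_applicationId.log pattern and should be checked against the files actually present on the logserver.

```python
import json
import urllib.request
from urllib.parse import quote


def application_detail_url(base_url: str, application_id: str) -> str:
    """Build the Deployment Manager detail URL for a PNDA application."""
    return f"{base_url.rstrip('/')}/applications/{quote(application_id, safe='')}/detail"


def fetch_application_detail(base_url: str, application_id: str, timeout: float = 10.0):
    """Fetch the detail document and return it as parsed JSON.

    No field names are assumed: inspect the result to locate the YARN IDs.
    """
    url = application_detail_url(base_url, application_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def yarn_log_filename(yarn_application_id: str) -> str:
    """Assumed mapping from a YARN application ID to its aggregated log file name."""
    return f"yarn_{yarn_application_id}.log"


# Hypothetical host and application name, for illustration only.
print(application_detail_url("http://deployment-manager.example:5000", "my-app"))
```

Once the YARN application ID is known, `yarn_log_filename` gives the name of the file to look for on the logserver, under the naming assumption stated above.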
