Overview

What are logs?

Logs are a key source of information for developers wanting to understand how their applications are running. Logs can help developers troubleshoot issues with applications and services, and can aid in the detection of problems or opportunities for improvement.

Logs are a key component of observability, and log files hold a wealth of valuable information. Log management has traditionally been challenging, though: the volume of data to sift through is enormous, and the vast majority of log entries can look nearly identical. Storing and analyzing logs - especially at scale - is rarely easy or fast, and it can be difficult to extract meaning and insights from raw log data.

What is Loki?

Loki is a service running on the platform that collects, aggregates, and stores logs. Loki is an open-source, centralized logging solution from Grafana Labs (the makers of the Grafana dashboard platform and major contributors to the Prometheus monitoring ecosystem).

Loki aims to solve some of the traditional challenges of log management by leveraging powerful technologies developed in other Grafana products. For more background and technical information on why we think Loki is so awesome, see here.

And how do I view and consume logs collected with Loki?

Loki integrates seamlessly with Grafana as a frontend user interface that can be used to view, search, query, parse, and analyze logs.

Additionally, Loki comes with a powerful query language - LogQL - which offers advanced syntax and regular expressions for looking deeply into the content of logs to find exactly what you are looking for.

Loki can also be queried directly through its HTTP API, allowing developers to build automation and scripts that extract further value from log data.
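As a minimal sketch of what direct API access can look like, the snippet below builds a request URL for Loki's query_range endpoint using only the Python standard library. The base URL here is a placeholder, not the real platform endpoint, and authentication is environment-specific:

```python
# Minimal sketch: build a Loki query_range request URL with the Python
# standard library. LOKI_BASE is a hypothetical placeholder address.
import time
import urllib.parse

LOKI_BASE = "http://localhost:3100"  # substitute your environment's Loki URL

def build_query_url(logql: str, minutes: int = 60, limit: int = 100) -> str:
    """Return a /loki/api/v1/query_range URL for the given LogQL query."""
    now_ns = int(time.time() * 1e9)  # Loki timestamps are in nanoseconds
    params = {
        "query": logql,
        "start": str(now_ns - minutes * 60 * 10**9),
        "end": str(now_ns),
        "limit": str(limit),
    }
    return LOKI_BASE + "/loki/api/v1/query_range?" + urllib.parse.urlencode(params)

url = build_query_url('{app="revproxy"} |= "Error"')
# Fetching `url` (e.g. with urllib.request) returns a JSON body whose
# data.result field is a list of matching log streams; omitted here
# since it requires a live Loki instance.
```

The query_range endpoint and its query/start/end/limit parameters are part of Loki's documented HTTP API; the host and any auth in front of it will depend on your environment.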

How to…

View Loki logs in Grafana

The Grafana UI is used as the visualization interface for logs aggregated by Loki.

To view logs collected by Loki through Grafana, simply:

  1. Log in to the VFS Grafana instance 🧦 (SOCKS needed; use GitHub account for auth; see image below)

  2. Go to Explore (the little compass icon on the left-hand navigation; see image below)

  3. Select the Loki environment that you’re interested in (from the drop-down near the top-left of the page; see image below)

  4. You’ll now see a query interface for searching and analyzing log files:

  5. Many developers will be interested in vets-api logs:

    1. To use this use case as an example, click “Log browser”

    2. Make sure “app” is highlighted

    3. Scroll down and highlight “vets-api-server” or “vets-api-worker” depending on your need

    4. Click “Show logs”
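The Log browser clicks above simply build a LogQL stream selector for you. The equivalent query, typed directly into the query box, is:

```
{app="vets-api-server"}
```

(or `{app="vets-api-worker"}` for worker logs).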

Query and analyze Loki log data in Grafana

Step 1: Get familiar with your data

In order to search and analyze the logs that you are interested in, you'll first need to understand the data that's contained within the log files. This includes WHAT information is being sent, HOW it looks, and HOW it is being tagged and labeled. The labeling aspect is especially important since labels are how you will facet/group/slice/segment the log data.

To do this, try the following:

  • Log into Grafana (remember SOCKS needed)

  • Go to Explore (the little compass icon)

  • Select the respective Loki data source

  • Click “Log labels” to see how ALL the data is structured and organized

    • NOTE: Most people will find the app label to be the most useful

  • Now you should see some data:

    • The image above is showing ALL of the log data for the app (or label) that has been selected

    • The bar graph at the top is showing the rate of log messages for a particular label

    • The lines below (at the top of the image) are individual log messages

    • Try clicking on a specific message to see additional details, including extracted fields and other labels that are tagged onto that specific message

Step 2: Try some basic data manipulation
Let’s see how we can manipulate this data (without going too deeply):

  • Try clicking the time picker at the top, to select a range of dates/times or a specific time window

  • Try clicking the split icon to get two windows side-by-side which can be helpful for manual correlation

  • Try clicking the “Live” button to see ingestion and processing of log data in real-time (whoa - cool!)

  • Try clicking the various options below the graph to see how it changes the detail view.

Step 3: Try some simple queries
The power of Loki is in labeling the log messages, which enables you to quickly sort, query, and slice & dice the log data. Selecting an app label will show all logs for a given app (for the chosen environment).

As an example, if we want to see only the web-server logs for the reverse proxy aka “revproxy”:

  • The web server logs are generated by scraping the Nginx access logs, i.e. /var/log/nginx/access.log

  • This can be found by looking for a web-server log entry, clicking it, and seeing the labels.

  • So, we can update our query to only show logs with BOTH labels (app, and specific log file): {app="revproxy",filename="/var/log/nginx/access.log"}

  • Pro-tip: after entering the first label, you can start typing and the filters will auto-complete with (only) valid options

  • Pro-tip: you can hit “shift-enter” as a keyboard short-cut to execute your updated query

As another example, if we want to look for something a bit more specific:

  • Errors are a common thing to look for, so…

  • By appending a simple filter operator to our query (|= "thingy"), we can search for errors in the revproxy's access logs: {app="revproxy",filename="/var/log/nginx/access.log"} |= "Error"


    To recap, the above image is showing us:

  • all of the web-server logs (Nginx access.log files)

  • from all revproxy instances, that…

  • contain "Error" over the past 12 hours
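|= is just one of LogQL's line filter operators. Using the same stream selector, you can also exclude lines or match by regular expression (these are standard LogQL operators; the specific patterns are illustrative):

```
{app="revproxy",filename="/var/log/nginx/access.log"} != "Error"
{app="revproxy",filename="/var/log/nginx/access.log"} |~ "5\\d\\d"
{app="revproxy",filename="/var/log/nginx/access.log"} !~ "GET|HEAD"
```

`!=` keeps only lines that do NOT contain the string; `|~` and `!~` include or exclude lines by regular expression (Go syntax).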

Using LogQL’s built in parsers

LogQL can be used to parse data out of certain log formats, such as JSON or traditional Apache-style access logs. This produces searchable labels that reflect the log content. The parsing happens at query time, so the stored log lines themselves are unchanged.

For instance, if you are trying to work with vets-api logs, which come from a Docker container, you can use a query like {app="vets-api-server"} and you will see logs such as:

20:15:55 web.1       | {"host":"fe34c4f5d79d","application":"vets-api-server","environment":"production","timestamp":"2021-09-16T20:15:55.533505Z","level":"info","level_index":2,"pid":36,"thread":"puma server threadpool 013","duration_ms":1.9736800004466204,"duration":"1.974ms","named_tags":{"request_id":"4f839e23-869c-48e2-ba4e-7fb7e0069796","remote_ip":"52.61.71.205","user_agent":"node-fetch/1.0 (+https://github.com/bitinn/node-fetch)","ref":"226118959f22385843b359d0ac9f6adc9e863cd0","csrf_token":"alN6UVi+AntpUZ12wM3GND7ytxwNKvrfeZSrB110SjH4irQCeBsp7G5rcFG9mjxpt/6hLbaKDQhJbKyoWYlcCg=="},"name":"V0::Profile::TransactionsController","message":"Completed #statuses","payload":{"controller":"V0::Profile::TransactionsController","action":"statuses","format":"JSON","method":"GET","path":"/v0/profile/status","status":401,"view_runtime":0.47,"db_runtime":0.0,"allocations":856,"status_message":"Unauthorized"}}

As you can see, the latter part of this log line is JSON. However, if you were to change your query to {app="vets-api-server"} | json, you would get JSON parsing errors, because the JSON data is prefixed with a timestamp, a process name, and a pipe symbol. To get the JSON data out, you must first use a regular expression to extract the JSON blob, and then you can parse it with the built-in JSON parser.

To start, we will use a regexp stage to extract the timestamp. LogQL regular expressions use Go's regexp (RE2) syntax.

{app="vets-api-server",filename=~".+json.log"} | regexp "(?P<time>\\d\\d:\\d\\d:\\d\\d)"

Note when you run this query that you now have a label in the log called time that contains the timestamp.

Next we will take care of the container name and the pipe symbol, and stash the JSON part of the log into a named matcher.

{app="vets-api-server", filename=~".+json.log"} | regexp "(?P<time>\\d\\d:\\d\\d:\\d\\d) (?P<process_name>\\w+.\\d) \\s+ (\\|) (?P<json>.+)"

Please note the new labels in the screenshot below

Next we will use the line_format operator, and then finally, pass the rest to the JSON parser.

{app="vets-api-server", filename=~".+json.log"} | regexp "(?P<time>\\d\\d:\\d\\d:\\d\\d) (?P<process_name>\\w+.\\d) \\s+ (\\|) (?P<json>.+)" | line_format "{{.json}}" | json

Note that all the fields in the JSON blob are now added to the list of labels.
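Conceptually, the three-stage pipeline above (regexp, line_format, json) does the following, sketched here in Python against an abridged version of the sample log line (the regex is simplified slightly):

```python
# Conceptual sketch of the LogQL regexp -> line_format -> json pipeline,
# run on an abridged version of the sample vets-api log line.
import json
import re

line = ('20:15:55 web.1       | {"level":"info",'
        '"name":"V0::Profile::TransactionsController",'
        '"payload":{"status":401,"status_message":"Unauthorized"}}')

# Like the LogQL regexp stage: named groups become labels.
match = re.match(
    r'(?P<time>\d\d:\d\d:\d\d) (?P<process_name>\S+)\s+\| (?P<json>.+)', line)
labels = match.groupdict()  # {"time": ..., "process_name": ..., "json": ...}

# Like | line_format "{{.json}}" | json: keep only the JSON part, parse it.
parsed = json.loads(labels["json"])

print(labels["time"], parsed["payload"]["status"])  # → 20:15:55 401
```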

You may now use those labels to continue to refine your log search. Please note that clicking the magnifying glass in the UI to add one of these parsed labels to your query may not work correctly; refine your query in the query input box instead.
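For example, once | json has run, you can filter on the extracted labels directly with a label filter expression. Note that Loki's JSON parser flattens nested keys using underscores, so payload.status from the sample log becomes payload_status (a sketch, continuing the query from this section):

```
{app="vets-api-server", filename=~".+json.log"}
  | regexp "(?P<time>\\d\\d:\\d\\d:\\d\\d) (?P<process_name>\\w+.\\d) \\s+ (\\|) (?P<json>.+)"
  | line_format "{{.json}}"
  | json
  | payload_status = 401
```

This would show only requests that completed with an HTTP 401 status.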

Also keep in mind that you can match multiple apps in your initial stream selector, like {app=~"vets-api-.+"} to choose both vets-api-server and vets-api-worker logs, or even two different applications if you are trying to correlate logs between services.
