Logging and how it can help with testing

I’m putting together a presentation on logging, and how it can help with testing.

Can anyone provide some examples of this? Can anyone share some articles that might be of interest?

3 Likes

Hello @lgibbs!

Great topic! Logging is one of many methods to improve testability by providing some transparency to an application’s behavior.

Logging has a spectrum of implementations and uses. At one end are tools like Splunk that can provide run-time information about the machine and the application as well as error messages. At the other end of the spectrum are simple files that can store the same information, but they may be more challenging to organize and review.

I’ve used logging as a journal for transactions or workflows. That is, log entries help tell the story of the transaction and sometimes include the values of relevant variables. I would not recommend placing sensitive information in any log. An example might be:

  • collect an address
  • store the address
  • collect vehicle information
  • store vehicle information

In this manner, a tester can verify the path through the code and identify locations where there may be an issue. This is also valuable for developers!
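Something like this minimal sketch, assuming a Python service using the standard logging module (collect_address, store_address and the field names are hypothetical):

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("quote-workflow")

def collect_address(form_data):
    # Hypothetical step: pull the address out of the submitted form.
    log.info("collect address: fields received=%s", sorted(form_data))
    return form_data["address"]

def store_address(address):
    # Hypothetical step: persist the address (storage itself omitted).
    # Note the journal records the step and a harmless detail,
    # not the sensitive value itself.
    log.info("store address: length=%d", len(address))

form = {"address": "221B Baker Street", "name": "J. Watson"}
store_address(collect_address(form))
```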

I’ve also used logging to record detailed error information. I’ve used error information to locate calculation errors, logic errors, and navigation errors.
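For the error side, a small sketch of recording enough detail to locate a calculation error, again assuming Python’s standard logging module (calculate_premium is hypothetical):

```python
import logging

log = logging.getLogger("premium-calculator")

def calculate_premium(base, factor):
    try:
        return base / factor  # hypothetical calculation
    except ZeroDivisionError:
        # .exception() logs at ERROR level and appends the traceback,
        # which is usually enough to pinpoint calculation or logic errors.
        log.exception("premium calculation failed: base=%s factor=%s", base, factor)
        raise
```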

Joe

1 Like

Pretty much what Joe has put.

Logging does tell a story: it’s the story of how the application behaves.

  • What happens when the services start up?
  • In what order do they start?
  • If a service fails, why?
  • If it succeeds, what information is logged?

It’s also a good reference to check when you are testing. Logging has a cost, though: verbose logs take up disk space, so do you really need your log level to be DEBUG?

Only log enough so that it brings value and gives you the information that you need. It is also wise to check for security concerns: are you showing more information than you need to?
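A minimal sketch of one way to keep that balance, assuming a Python service using the standard logging module (the LOG_LEVEL variable and the checkout logger are hypothetical):

```python
import logging
import os

# Read the level from the environment so production can run at INFO
# while a test environment can turn on DEBUG without a code change.
level_name = os.getenv("LOG_LEVEL", "INFO")
logging.basicConfig(
    level=getattr(logging, level_name, logging.INFO),
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)

log = logging.getLogger("checkout")

# Log the minimum needed to follow the flow: identifiers, not personal data.
log.info("checkout started: basket_id=%s items=%d", "b-1042", 3)
log.debug("pricing breakdown: %s", {"subtotal": 40, "vat": 8})  # only emitted when DEBUG is on
```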

As a QA engineer, I find logs useful for aiding developers: you can investigate the issue yourself, raise the relevant bug or start the conversation, and shortcut the investigation time.

1 Like

I forgot about Logs in production.

Logging is an immutable, timestamped record of discrete events that happened over time. In a microservices architecture, there can be a logging mechanism to track what is happening across the multiple services. Logging, done right, can give you a holistic view of the status of the system.

When DevOps engineers look at the logs, they want to ensure that they have the whole picture that enables them to get to the root cause of a problem quickly. It is better to have too much information than too little, but on the other hand you should avoid logging every single event that occurs within the production environment, as that creates too much noise.
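As one illustration, here is a minimal sketch of structured, timestamped log entries that a central tool could collect and index, assuming Python’s standard logging module (the payments service name is hypothetical):

```python
import json
import logging
import time

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per line so a central tool can parse and index it."""
    def format(self, record):
        return json.dumps({
            "ts": time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(record.created)),
            "level": record.levelname,
            "service": "payments",  # hypothetical service name
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("payments")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("payment authorised: order_id=%s", "o-778")
```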

Here are some test ideas

  • Run a production-like environment for a week continuously and put moderate load on the system. Observe the system. Are the services working without issues? Was there any failure? What is being reported to the logs?

  • Within a production environment, isolate a sub-section of the environment. Force a failure where all of the information is known (i.e. what caused the failure and what it should report). What information is present within the logs? Is the information provided too verbose or too sparse? Does the information presented lead the end user to the same expected conclusion? (See the sketch after this list for one way to automate this check.)

  • Run a production-like environment for a week continuously and put moderate load on the system, with selected stress periods up to the system’s performance threshold. Observe the system. Are the services working without issues even under the periods of stress? Was there any failure? What is being reported to the logs?

  • When errors occur and are reported to the logs, what is the log level? Is it appropriate for a production system? Can a user obtain the relevant information to diagnose the problem?

  • What happens to the logs when the storage is approaching the limit? Does this get noted/alerted to a user in a reasonable timescale to resolve the problem?
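For the forced-failure idea above, a minimal sketch of how the log check could be automated, assuming pytest and its caplog fixture (reserve_stock and its messages are hypothetical):

```python
import logging
import pytest

def reserve_stock(item_id, warehouse):
    # Hypothetical unit under test: fails when the warehouse is offline
    # and records the known cause in the log.
    if warehouse is None:
        logging.getLogger("stock").error(
            "reservation failed: item=%s reason=warehouse offline", item_id)
        raise RuntimeError("warehouse offline")
    return True

def test_forced_failure_is_logged(caplog):
    with caplog.at_level(logging.ERROR, logger="stock"):
        with pytest.raises(RuntimeError):
            reserve_stock("sku-42", warehouse=None)
    # The log should name the item and the known cause, no more and no less.
    assert "reservation failed: item=sku-42 reason=warehouse offline" in caplog.text
```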

1 Like

Some great answers so far. I would add that many modern enterprise systems have many moving parts. You can have multiple services, databases, APIs, frontend, etc. contributing to what a user sees as a single application.

Logs have fairly standard formats now. They almost all start with a timestamp. In a good enterprise system you can use things like Splunk or Kibana to collect logs from various places and combine them using the timestamp of each log. In this way, you can literally see actions from the incoming user request, through the backend and back out to the user response.

When testing, this makes it easy to see integration points and when things actually fail. For example, I do something on a web application, a backend system gets into a bad state but I don’t see anything on the frontend. After 5 minutes of interacting with the frontend, as a tester, I see something go wrong. Using something like Kibana to scroll through all the different systems as one big log, I can scroll back from what I saw go wrong on the frontend to the actual failure 5 minutes earlier.

This sort of logging across multiple components doesn’t come out of the box in many cases, however. There might be 100 users interacting with the system. You want to filter down to just one user. Tagging transactions with a UUID or something else unique allows us to look at the logs for one user.
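A minimal sketch of that tagging idea, assuming a Python service using the standard logging module with a contextvars-based correlation id (the webapp logger and handle_request are hypothetical; a real system would also propagate the id between services, e.g. via an HTTP header):

```python
import contextvars
import logging
import uuid

# One id per incoming request, carried across the code handling that request.
request_id = contextvars.ContextVar("request_id", default="-")

class RequestIdFilter(logging.Filter):
    """Copy the current request id onto every record so it appears in the log line."""
    def filter(self, record):
        record.request_id = request_id.get()
        return True

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s [%(request_id)s] %(message)s"))
handler.addFilter(RequestIdFilter())
log = logging.getLogger("webapp")
log.addHandler(handler)
log.setLevel(logging.INFO)

def handle_request(user_action):
    # Hypothetical entry point: tag everything done for this request with one UUID.
    request_id.set(str(uuid.uuid4()))
    log.info("request received: %s", user_action)
    log.info("request completed")

handle_request("add vehicle")
```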

Maybe talk about how that works and how it makes debugging these complex enterprise systems so much easier.

Darrell