How To Search Log Files: 3 Approaches To Extract Data | Scalyr (2024)

Searching log files can be a tedious process. It’s not an easy task to sift through large amounts of log data. However, log files can tell you what happened in your application. Therefore, it’s an important skill for a developer to be able to quickly search log files to solve time-critical problems.

There are many reasons you might want to search logs. Perhaps you want to better understand a certain problem. Log files provide a lot of valuable information that can help you nail down the root cause of your issue.

Some possible use cases where you want to search log files include:

  • Finding a specific log level, such as error or fatal
  • Finding logs for events with a specific timestamp or that occurred between two timestamps
  • Searching for a specific keyword in your log data
  • Removing unnecessary information, such as a computer name or user ID

In this post, we’ll show you three ways to extract data from your log files. To accomplish this, we’ll be using the Bash Unix shell to filter, search, and pipe log data. Now, let’s take a look at what the Bash shell can do.

Understanding the Bash Unix Shell

When you start your terminal, the default shell is most frequently the Bash Unix shell for Mac and Linux users. For Windows users, it’s possible to install the Bash shell using the Windows Subsystem for Linux. The Bash shell allows you to run programs, also known as commands.

Luckily, the Bash Unix shell provides us with a lot of different commands that we can use to search and filter data. Furthermore, the Bash shell provides you with the ability to pipe data. This means we can chain multiple commands and pass the output of one command to the next command in a single action.

Let’s say we have a file containing 100 lines of log data from which we want to extract all the error-level logs and sort the matches by timestamp. We don’t need to write any code for this. We can use a filtering command to pull out the error-level logs and then pipe the filtered result to the sort command to order them by timestamp. Below, you see a pseudocode example of how this might work:

read all logs -> find 'error' -> sort by timestamp
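In actual Bash, that pipeline might look like the sketch below. The file name and log lines are made up for illustration; the sketch assumes a format where the date and time are the first two whitespace-separated fields.

```shell
# Create a tiny sample log for illustration (hypothetical data)
printf '%s\n' \
  '2015-12-03 17:09:57 ERROR svc :: lookup failed' \
  '2015-12-03 17:08:36 DEBUG svc :: started' \
  '2015-12-03 17:08:51 ERROR svc :: cache miss' > sample.log

# Keep only ERROR lines, then sort by date (field 1) and time (field 2)
grep -w 'ERROR' sample.log | sort -k1,1 -k2,2
```

Here grep -w extracts the error-level lines, and sort -k1,1 -k2,2 orders them by the date and time fields.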

Now that we have the foundation, it’s time to get practical. Let’s take a look at several commands you can use to filter logs and an example use case for each.

Bash Commands To Extract Data From Log Files

For the examples in this post, let’s use the below dataset. You can also download the dataset from GitHub to try out the commands yourself.

Each log line contains the following information:

  1. Date
  2. Timestamp
  3. Log level
  4. Service or application name
  5. Username
  6. Event description
2015-12-03 17:08:36 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Attempting to add item to cache: Jimmy.Fallon.2015.12.02.Brett.Favre.720p.HDTV.x264-CROOKS[rartv]
2015-12-03 17:08:36 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Unable to parse the filename Jimmy.Fallon.2015.12.02.Brett.Favre.720p.HDTV.x264-CROOKS[rartv] into a valid show
2015-12-03 17:08:36 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Attempting to add item to cache: Moonbeam.City.S01E09.The.Legend.of.Circuit.Lake.720p.CC.WEBRip.AAC2.0.x264-BTW[rartv]
2015-12-03 17:08:38 DEBUG SEARCHQUEUE-WEEKLY-MOVIE :: [User1] :: Unable to parse the filename Moonbeam.City.S01E09.The.Legend.of.Circuit.Lake.720p.CC.WEBRip.AAC2.0.x264-BTW[rartv] into a valid show
2015-12-03 17:08:51 ERROR SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Failed to find item in cache: Black-ish.S02E09.Man.At.Work.720p.EXTENDED.HULU.WEBRip.AAC2.0.H264-NTb[rartv]
2015-12-03 17:08:51 FATAL SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Search service crashed lost connection: ERRORS.PUBKEYERR.service.logger
2015-12-03 17:08:53 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Unable to parse the filename Christmas.Through.the.Decades.Part1.The.60s.HDTV.x264-W4F[rartv] into a valid show
2015-12-03 17:08:59 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Attempting to add item to cache: The.League.S07E12.The.13.Stages.of.Grief.720p.WEB-DL.DD5.1.H264-NTb[rartv]
2015-12-03 17:09:01 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Unable to parse the filename The.League.S07E12.The.13.Stages.of.Grief.720p.WEB-DL.DD5.1.H264-NTb[rartv] into a valid show
2015-12-03 17:09:29 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [admin] :: Unable to parse the filename Dan.Cruickshank.Resurrecting.History.Warsaw.HDTV.x264-C4TV[rartv] into a valid show
2015-12-03 17:09:57 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Unable to parse the filename This.Is.Tottenham.720p.HDTV.x264-C4TV[rartv] into a valid show
2015-12-03 17:09:57 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Transaction with 2 queries executed
2015-12-03 17:09:57 INFO SEARCHQUEUE-DAILY-SEARCH :: [admin] :: Skipping Blindspot.S01E10.nl because we don't want an episode that's Unknown
2015-12-03 17:09:57 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [admin] :: None of the conditions were met, ignoring found episode
2015-12-03 17:09:57 INFO SEARCHQUEUE-DAILY-SEARCH :: [admin] :: Skipping Arrow.S04E08.720p.FASTSUB.VOSTFR.720p.HDTV.x264-ZT.mkv because we don't want an episode that's 720p HDTV
2015-12-03 17:09:58 DEBUG SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Using cached parse result for: Arrow.S04E08.1080p.WEB-DL.DD5.1.H264-RARBG
2015-12-03 17:09:58 INFO SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Skipping Arrow.S04E08.720p.WEB-DL.DD5.1.H264-RARBG because we don't want an episode that's 720p WEB-DL

Before we can explore different commands, we need to know how we can read log data from log files. The simplest solution is to use the cat command, which allows you to read the contents of a file. Then, we can pipe the log data to other commands. However, for some commands, such as grep, you can directly pass a file as input.
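As a quick illustration (using a made-up demo file), both forms below feed the same data to grep; passing the file directly simply skips the extra cat process:

```shell
# Create a tiny demo file (hypothetical data)
printf 'starting up\nERROR: disk full\nshutting down\n' > demo.log

# Two equivalent ways to feed the file to grep
cat demo.log | grep 'ERROR'   # pipe the file contents in
grep 'ERROR' demo.log         # pass the file directly
```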

Let’s get started!

Command #1: Grep

The first command in our list is the grep command. The Linux manual defines the grep command as follows:

grep searches for PATTERNS in each FILE. PATTERNS is one or more patterns separated by newline characters, and grep prints each line that matches a pattern.

Grep Use Case: Search for Log Level

Let’s start by searching for the error log level. We need to pass the word “ERROR” to the grep command. Note that the grep command is case-sensitive by default. We can use the piping symbol | to pass the log data to the grep command.

cat log.txt | grep "ERROR"

This returns the following two results:

2015-12-03 17:08:51 ERROR SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Failed to find item in cache: Black-ish.S02E09.Man.At.Work.720p.EXTENDED.HULU.WEBRip.AAC2.0.H264-NTb[rartv]
2015-12-03 17:08:51 FATAL SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Search service crashed lost connection: ERRORS.PUBKEYERR.service.logger

However, note that this also returned a fatal log level because the description field of that log line contains the word “ERRORS.” Let’s modify our grep command to match only the exact word “ERROR” and not variants of it. We can use the -w option to tell the grep command to match the exact word.

cat log.txt | grep -w "ERROR"

And what if we want to filter for both error and info log levels? Luckily, the grep command can accept multiple patterns separated by the piping symbol. Importantly, use a backslash to escape the piping symbol.

cat log.txt | grep -w "ERROR\|INFO"

However, for large log files, the results can contain hundreds of matches. Let’s apply a simple hack to count the number of results quickly. For this, the Bash shell provides us with the wc command, which counts the number of lines when given the -l option.

cat log.txt | grep -w "ERROR\|INFO" | wc -l

This pipeline prints 4, the number of error- and info-level lines in our dataset.
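As a side note, grep can also count matches on its own with its -c flag, which avoids the extra wc process. A quick sketch with made-up data (the \| alternation is a GNU grep basic-regex feature):

```shell
# -c prints the number of matching lines instead of the lines themselves
printf '%s\n' 'ERROR one' 'INFO two' 'DEBUG three' | grep -cw 'ERROR\|INFO'   # prints 2
```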

Cool, right? Next, let’s learn how to find logs between two timestamps using the sed command.

Command #2: Sed

Next, let’s explore the sed command. From the GNU manual pages, we can read the following definition:

sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed’s ability to filter text in a pipeline which particularly distinguishes it from other types of editors.

To explore the sed command, let’s look for logs that occurred between two timestamps.

Sed Use Case: Find Logs Between Two Timestamps

You often want to look at the logs between two specific timestamps, but scrolling through a huge log file to find the exact timestamps by hand isn’t practical.

Therefore, let’s use the sed command to find all logs that happened during the minute 2015-12-03 17:08. In other words, we want to find all logs between 2015-12-03 17:08:00 and 2015-12-03 17:08:59. The below command uses the -n flag and the p command to print only the matched results:

sed -n '/2015-12-03 17:08/p' log.txt

Moreover, this command still works when the date/time field isn’t the first element in your log line. You can try this out by switching the date/time with the log level.

Next, let’s search between two timestamps that span different minutes: we want to retrieve all logs that occurred between 2015-12-03 17:08:00 and 2015-12-03 17:10:00. Here, the sed command accepts a second pattern; use a comma to separate the two. Because every entry in our dataset falls inside that window, piping the result to wc -l should count every line of the log file (note that when the end pattern never matches, as here, sed prints through to the end of the file):

sed -n '/2015-12-03 17:08/,/2015-12-03 17:10/p' log.txt | wc -l

However, we can accomplish the same thing using the grep command and a regular expression. Let’s say we only want to return results that happened between 2015-12-03 17:09:50 and 2015-12-03 17:09:59. We can simply pass a pattern that matches the seconds 50 through 59: 17:09:5[0-9].

grep '17:09:5[0-9]' log.txt
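Another option worth knowing about is awk, which can compare the time field as a string, so the range doesn’t have to line up with a regex-friendly boundary. This is a sketch that assumes the time is the second whitespace-separated field, as in our dataset; the sample file and range are made up:

```shell
# Sample lines in the dataset's format (date, time, level, ...)
printf '%s\n' \
  '2015-12-03 17:08:36 DEBUG start' \
  '2015-12-03 17:08:51 ERROR cache miss' \
  '2015-12-03 17:09:57 INFO skip' > sample.log

# String comparison on the time field (field 2) keeps an arbitrary range
awk '$2 >= "17:08:40" && $2 <= "17:09:00"' sample.log
```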

As you can see, there are always many possibilities to accomplish the same task or reach similar outcomes.

Command #3: Cut

Last, let’s learn how you can use the cut command to transform log files.

The Wikibooks documentation provides the following definition of cut: “Cut is a Unix command-line tool used to extract fields and the like from lines of input, available on many platforms.”

We’ll use the cut command to transform log file data.

Cut Use Case: Transform Log Files

As mentioned in the introduction, we often want to keep only certain log levels, in this case error and info. Besides that, we don’t want to store the username of whoever accessed the service. Therefore, let’s remove :: [User1] :: from each log line.

Here’s the explanation of the full command using cut:

  • -d ' ' allows us to delimit our log line based on spaces. Each space-delimited snippet of text is identified as a column.
  • -f-4,8- keeps columns 1 through 4 and columns 8 through the end of the line. This removes the :: [User1] :: part. Note that each :: is also treated as a column since it’s surrounded by spaces.
cat log.txt | grep -w "ERROR\|INFO" | cut -d ' ' -f-4,8-

Here’s the first line of the result, with the log levels filtered and the username removed:

2015-12-03 17:08:51 ERROR SEARCHQUEUE-DAILY-SEARCH Failed to find item in cache: Black-ish.S02E09.Man.At.Work.720p.EXTENDED.HULU.WEBRip.AAC2.0.H264-NTb[rartv]
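If the username block can shift position between log lines, a sed substitution that matches the :: [name] :: pattern itself is a more robust alternative to counting columns. This is a sketch with a made-up sample file; the bracket pattern assumes usernames never contain a ] character:

```shell
printf '%s\n' '2015-12-03 17:08:51 ERROR SEARCHQUEUE-DAILY-SEARCH :: [User1] :: Failed to find item in cache' > sample.log

# Delete the ':: [username] ::' block wherever it appears on the line
sed 's/:: \[[^]]*\] :: //' sample.log
```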

That’s it!

Searching Log Files With Bash: Many Commands To Reach the Same Outcome

I hope you learned how you can use different Bash shell commands to accomplish log data filtering, searching, and transforming.

As you may have noticed, you always have different possibilities and commands to accomplish the same goal. Furthermore, Bash allows you to chain multiple commands. For example, you might chain commands to read a log file, filter for certain log levels, or transform log data to a different format.

If you want to learn more about logging, read Scalyr’s article about the 10 commandments of logging. And make sure to check out Scalyr’s solutions to search log files.
