Handling Multiline Log Formats Using Regex and GROK Parser

July 20, 2023

To effectively analyze logs from multiple products, security operations teams must first comprehend the diverse landscape of log types. We will provide an overview of common log types encountered, such as system logs, application logs, network logs, and security logs. By understanding the characteristics and formats of each log type, teams can better prepare for the complexities that lie ahead.

Security operations teams face challenges in analyzing different log types from multiple products. Some products have complicated log structures that require advanced rules and GROK patterns to extract fields from the raw message.

The Challenge of Complicated Log Structures:

Certain products generate logs with intricate structures that pose challenges for analysis. We will examine the reasons behind these complexities, including proprietary log formats, inconsistent field naming conventions, and unstructured log data. Through examples, we will showcase the difficulties faced by security operations teams and how these complicated log structures can hinder their ability to extract relevant information effectively.

Regex and GROK Patterns – Unleashing the Power of Pattern Matching and Log Parsing:

Regular expressions, or regex, are a powerful tool for pattern matching in log analysis. We will explore techniques such as using anchors, modifiers, quantifiers, and capture groups to identify and extract relevant data from multiline log entries.
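
As a small illustration of these building blocks, the Python sketch below matches a stack-trace continuation line inside a two-line entry. The log entry, the class names in it, and the field names `method`, `file`, and `line` are hypothetical and chosen only for the example.

```python
import re

# Hypothetical two-line entry, loosely modelled on a Java stack trace.
entry = "webui SEVERE: request failed\n\tat com.example.Handler.run(Handler.java:42)"

# ^ and $ are anchors; the re.MULTILINE modifier lets them match at every line.
# \t+ and \d+ use quantifiers; (?P<name>...) is a named capture group.
frame_re = re.compile(
    r"^\t+at (?P<method>[\w.$]+)\((?P<file>[\w.]+):(?P<line>\d+)\)$",
    re.MULTILINE,
)

match = frame_re.search(entry)
frame = match.groupdict() if match else {}
print(frame)
```

Without `re.MULTILINE`, `^` would only match at the start of the whole entry, so the indented frame on the second line would not be found.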

GROK patterns are a powerful tool for log parsing, enabling security operations teams to extract fields from raw log messages efficiently. Through practical examples, we will demonstrate how GROK patterns can be customized to handle complex log structures and extract valuable information. We will also highlight the importance of maintaining a GROK pattern library for consistent and scalable log analysis.
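
To make the idea of a pattern library concrete, here is a minimal Python sketch that mimics how a `%{PATTERN:field}` token can expand into a named capture group. It is not how Logstash or Elasticsearch implement Grok; the library entries are simplified stand-ins, and `grok_to_regex` and `grok_match` are hypothetical helper names.

```python
import re

# A tiny, hypothetical pattern library. These regexes are simplified
# stand-ins, not the definitions shipped with any real Grok implementation.
PATTERN_LIBRARY = {
    "WORD": r"\w+",
    "LOGLEVEL": r"[A-Z]+",
    "GREEDYDATA": r".*",
}

# Matches %{PATTERN} or %{PATTERN:field}.
GROK_TOKEN = re.compile(r"%\{(?P<pattern>\w+)(?::(?P<field>\w+))?\}")


def grok_to_regex(expression: str) -> str:
    """Expand each %{PATTERN:field} token into a (named) regex group."""
    def expand(token):
        body = PATTERN_LIBRARY[token.group("pattern")]
        field = token.group("field")
        return f"(?P<{field}>{body})" if field else f"(?:{body})"

    return GROK_TOKEN.sub(expand, expression)


def grok_match(expression: str, line: str) -> dict:
    """Return the extracted fields, or an empty dict when nothing matches."""
    match = re.match(grok_to_regex(expression), line)
    return match.groupdict() if match else {}


fields = grok_match(
    "%{WORD:logtype} %{LOGLEVEL:log_level}: %{GREEDYDATA:message}",
    "webui SEVERE: request failed",
)
print(fields)
```

Keeping the named patterns in one shared dictionary is the point of the sketch: every parser that reuses `LOGLEVEL` or `WORD` extracts those fields the same way.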

Overcoming Log Analysis Challenges:

We will address the specific challenges encountered by security operations teams when analyzing logs from multiple products with diverse log structures. We will discuss issues such as data normalization, log integration, and log source identification. Moreover, we will provide strategies and techniques to overcome these challenges, including log aggregation, log enrichment, and normalization processes.

This blog explains how to analyze F5 BIG-IP logs, which mix different timestamp formats and pack multiple lines into a single log entry, and how to convert them into a queryable, readable format.

Objective: Process complex logs that have irregular or inconsistent patterns, using various tools and frameworks.

  1. Understand the log format: Familiarize yourself with the structure and format of the log messages. Identify the different components, fields, and patterns within the logs.
  2. Define the parsing strategy: Determine the approach used to parse the logs. This can include using regular expressions (regex), Grok patterns, or specific log parsing libraries or frameworks.
  3. Identify key fields: Identify the specific fields or information to extract from the logs. These may include timestamps, log levels, error codes, user IDs, or any other relevant data.
  4. Write the parser: Define regex/Grok patterns that capture the required information and use them to extract the data in a pipeline. The pipeline processes incoming log messages by extracting relevant information, transforming it with the parser, and taking actions based on conditions.
  5. Utilize log parsing libraries or frameworks: For more complex log formats, leverage log parsing libraries or frameworks that provide built-in functionality to handle log parsing. Examples include Logstash, Elasticsearch ingest pipelines, and Fluentd (often fed by a transport such as Apache Kafka), or language-specific log parsing libraries.
  6. Test and refine: Test the parsing strategy and patterns against sample log messages to ensure they accurately extract the desired fields. Adjust and refine the approach as needed.
  7. Process and analyze: Once the logs are parsed and the relevant fields extracted, process and analyze the data. This might involve storing the data in a database, performing aggregations or calculations, generating reports, or integrating it with other systems.
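
The steps above can be sketched as a very small pipeline in plain Python. This is an illustration under stated assumptions, not a replacement for a real log shipper: the line layout (`<logtype> <LEVEL>: <message>`), the sample messages, the `alert` field, and the function names are all hypothetical.

```python
import re
from collections import Counter

# Assumed layout for this sketch: "<logtype> <LEVEL>: <message>".
LINE_RE = re.compile(r"^(?P<logtype>\w+) (?P<log_level>[A-Z]+): (?P<message>.*)$")


def parse(line):
    """Extract the key fields, or return None when the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def run_pipeline(lines):
    """Parse each line, normalize it, and flag events based on a condition."""
    parsed, unparsed = [], []
    for line in lines:
        event = parse(line)
        if event is None:
            unparsed.append(line)  # keep failures so the pattern can be refined
            continue
        event["log_level"] = event["log_level"].lower()  # transformation
        event["alert"] = event["log_level"] in {"severe", "error"}  # condition
        parsed.append(event)
    return parsed, unparsed


# Hypothetical sample messages used to test the pattern.
samples = [
    "webui INFO: deployment finished",
    "webui SEVERE: servlet threw exception",
    "not a structured line",
]

parsed, unparsed = run_pipeline(samples)
level_counts = Counter(event["log_level"] for event in parsed)  # simple aggregation
print(level_counts, unparsed)
```

Collecting the lines that fail to parse, rather than dropping them, supports the test-and-refine step: they show which formats the patterns do not yet cover.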

F5 BIG-IP logs

An F5 BIG-IP load balancer distributes traffic evenly across the servers in a network so that no single server is overwhelmed. The BIG-IP continuously monitors the servers' incoming and outgoing traffic and routes each user request to the available server that can best handle it.

It also improves application performance, scalability and reliability while enhancing security and user experience.

Encountering various timestamp formats

For example, the timestamp format “May 11, 2023 8:54:13 AM” does not have a default Grok pattern.

To extract the above timestamp, define a custom Grok pattern using the regular expression below, which captures the timestamp components (month, day, year, hour, minute, second, AM/PM) and assigns them to the field vendor_timestamp.


(?<vendor_timestamp> [A-Z][a-z]{2,3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)
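
Outside of Grok, the same timestamp shape can be captured and converted with Python's standard library. The sketch below is an approximation: it uses Python's `(?P<name>...)` group syntax, omits any leading space, assumes English month abbreviations (the `%b` directive is locale-dependent), and returns a naive `datetime` because the log line carries no time zone. The function name `extract_vendor_timestamp` is hypothetical.

```python
import re
from datetime import datetime

# Python flavour of the custom timestamp pattern (named group via ?P<...>).
VENDOR_TS = re.compile(
    r"(?P<vendor_timestamp>[A-Z][a-z]{2,3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)"
)


def extract_vendor_timestamp(line):
    """Return the vendor timestamp as a naive datetime, or None if absent."""
    match = VENDOR_TS.search(line)
    if match is None:
        return None
    # %b expects an English month abbreviation under the default C locale.
    return datetime.strptime(match.group("vendor_timestamp"), "%b %d, %Y %I:%M:%S %p")


line = (
    "webui INFO: Deployment of configuration descriptor "
    "/etc/tomcat/Catalina/localhost/tmui.xml has finished in 55,602 ms "
    "May 10, 2023 6:34:02 AM org.apache.catalina.startup.HostConfig deployDescriptor"
)

parsed_ts = extract_vendor_timestamp(line)
print(parsed_ts.isoformat())
```

Converting the captured text into a real date value early makes it easier to compare events whose sources use different timestamp layouts.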

Below are sample logs with multiple timestamp formats.


webui INFO: Deployment of configuration descriptor /etc/tomcat/Catalina/localhost/tmui.xml has finished in 55,602 ms May 10, 2023 6:34:02 AM org.apache.catalina.startup.HostConfig deployDescriptor
webui WARNING: [SetPropertiesRule]{Server/Service/Engine/Host} Setting property 'xmlValidation' to 'false' did not find a matching property. May 12, 2023 5:35:14 AM org.apache.tomcat.util.digester.SetPropertiesRule begin usage: java org.apache.catalina.startup.Catalina [ -config {pathname} ] [ -nonaming ] { -help | start | stop } Fri May 12 05:35:11 PDT 2023
webui 2023-05-10T13:34:02Z ERROR [Thread-4] controller.SubscriberServlet:subscribe : MCP subscribe error: Unable to read POST response data java.net.ConnectException: Connection refused (Connection refused)

Regex for Multiline log formats


(\t+)?(?[\w\W\.\d\(\):]+$)

Below is a sample multiline log.


webui SEVERE: Servlet.service() for servlet [org.apache.jsp.tmui.overview.welcome.introduction_jsp] in context with path [/tmui] threw exception May 24, 2023 11:14:39 PM org.apache.catalina.core.StandardWrapperValve invoke java.lang.NullPointerException
at com.f5.util.UsernameHolder.getUsername(UsernameHolder.java:72)
at com.f5.util.UsernameHolder.updateConnection(UsernameHolder.java:270)
at com.f5.util.UsernameHolder.updateConnection(UsernameHolder.java:245)
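
Before field extraction, a multiline entry like this usually has to be stitched back into a single event. The Python sketch below shows one common way to do that, under the assumption that continuation lines are stack frames starting with optional whitespace followed by `at `. The shortened log lines and the name `merge_multiline` are hypothetical and only loosely follow the sample above.

```python
import re

# Assumption: a continuation line is a stack frame that starts with optional
# whitespace (such as tabs) followed by "at ".
CONTINUATION = re.compile(r"^\s*at\s")


def merge_multiline(lines):
    """Append continuation lines to the event that precedes them."""
    events = []
    for line in lines:
        if events and CONTINUATION.match(line):
            events[-1] += "\n" + line.rstrip()
        else:
            events.append(line.rstrip())
    return events


# Shortened, hypothetical lines for illustration.
raw_lines = [
    "webui SEVERE: Servlet.service() threw exception java.lang.NullPointerException",
    "at com.f5.util.UsernameHolder.getUsername(UsernameHolder.java:72)",
    "at com.f5.util.UsernameHolder.updateConnection(UsernameHolder.java:270)",
    "webui INFO: next event",
]

events = merge_multiline(raw_lines)
for event in events:
    print(repr(event))
```

Once the frames are attached to their parent event, a single pattern can treat the whole stack trace as one field instead of parsing each frame as a separate log.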

Regex with GROK

Optional GROK

Below is the GROK pattern, combined with regex, that parses multiline logs with different timestamp formats.


%{WORD:logtype} (%{TIMESTAMP_ISO8601:event_created})?%{SPACE}(%{LOGLEVEL:log_level})?((%{GREEDYDATA:message})?(?<vendor_timestamp> [A-Z][a-z]{2,3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) %{DATA:class} %{WORD:action})?( usage: java %{DATA:class1} \[ -config %{DATA:config_path} \] \[ -nonaming \](.*)? %{DATESTAMP_OTHER:timestamp})?(?[^\t]+)?(\t+)?(?[\w\W\.\d\(\):]+$)?
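
For readers who want to experiment without a Grok engine, the Python sketch below approximates a subset of the fields with a single regular expression. It is not a translation of the full pattern: the `usage:` and tab-continuation parts are left out, the ISO 8601 and log-level sub-patterns are simplified stand-ins for the Grok built-ins, and `java_class` is a hypothetical substitute for the `class` field. The two input lines are taken from the samples earlier in this post.

```python
import re

# Simplified approximation: optional ISO 8601 timestamp, log level, message,
# and an optional trailing vendor timestamp followed by class and action.
LINE_RE = re.compile(
    r"^(?P<logtype>\w+) "
    r"(?:(?P<event_created>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) )?"
    r"(?P<log_level>[A-Z]+):? "
    r"(?P<message>.*?)"
    r"(?: (?P<vendor_timestamp>[A-Z][a-z]{2,3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)"
    r" (?P<java_class>\S+) (?P<action>\w+).*)?$"
)


def parse_line(line):
    """Return the captured fields, or an empty dict when the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else {}


first = parse_line(
    "webui INFO: Deployment of configuration descriptor "
    "/etc/tomcat/Catalina/localhost/tmui.xml has finished in 55,602 ms "
    "May 10, 2023 6:34:02 AM org.apache.catalina.startup.HostConfig deployDescriptor"
)
second = parse_line(
    "webui 2023-05-10T13:34:02Z ERROR [Thread-4] "
    "controller.SubscriberServlet:subscribe : MCP subscribe error: "
    "Unable to read POST response data java.net.ConnectException: "
    "Connection refused (Connection refused)"
)

print(first["vendor_timestamp"], first["action"])
print(second["event_created"], second["log_level"])
```

Making each timestamp style optional lets one expression accept both layouts; fields that do not apply to a given line come back as `None`.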

Analyzing logs from multiple products with complex log structures presents significant challenges for security operations teams. However, with the right approach, including the use of advanced rules and GROK patterns, these challenges can be overcome. By understanding diverse log types, leveraging advanced techniques, and embracing automation, security operations teams can extract valuable insights from log data, enabling them to proactively detect and respond to potential security incidents effectively.
