Handling Multiline Log formats using Regex and GROK Parser

To effectively analyze logs from multiple products, security operations teams must first comprehend the diverse landscape of log types. We will provide an overview of common log types encountered, such as system logs, application logs, network logs, and security logs. By understanding the characteristics and formats of each log type, teams can better prepare for the complexities that lie ahead.

Security operations teams face challenges when analyzing different log types from multiple products. A few products have complicated log structures that require advanced rules and GROK patterns to extract fields from the raw message.

The Challenge of Complicated Log Structures:

Certain products generate logs with intricate structures that pose challenges for analysis. We will examine the reasons behind these complexities, including proprietary log formats, inconsistent field naming conventions, and unstructured log data. Through examples, we will showcase the difficulties faced by security operations teams and how these complicated log structures can hinder their ability to extract relevant information effectively.

Regex and GROK Patterns – Unleashing the Power of Pattern Matching and Log Parsing:

Regular expressions, or regex, are a powerful tool for pattern matching in log analysis. We will explore techniques such as using anchors, modifiers, quantifiers, and capture groups to identify and extract relevant data from multiline log entries.

GROK patterns are a powerful tool for log parsing, enabling security operations teams to extract fields from raw log messages efficiently. Through practical examples, we will demonstrate how GROK patterns can be customized to handle complex log structures and extract valuable information. We will also highlight the importance of maintaining a GROK pattern library for consistent and scalable log analysis.

Overcoming Log Analysis Challenges:

We will address the specific challenges encountered by security operations teams when analyzing logs from multiple products with diverse log structures. We will discuss issues such as data normalization, log integration, and log source identification. Moreover, we will provide strategies and techniques to overcome these challenges, including log aggregation, log enrichment, and normalization processes.

This blog explains how to analyze F5 BIG-IP logs, which mix different timestamp formats and can span multiple lines within a single log entry, and how to convert them into a queryable, readable format.

Objective: Process complex logs that have irregular or inconsistent patterns using various tools and frameworks.

Understand the log format: Become familiar with the structure and format of the log messages. Identify the different components, fields, and patterns within the logs.
Define the parsing strategy: Determine the approach used to parse the logs. This can include regular expressions (regex), Grok patterns, or specific log parsing libraries or frameworks.
Identify key fields: Identify the specific fields or information to extract from the logs. These include timestamps, log levels, error codes, user IDs, or any other relevant data.
Write the parser: Define regex/Grok patterns that capture the required information and use them to extract the data in pipelines. A pipeline processes incoming log messages by extracting the relevant information, transforming it with the parser, and taking actions based on conditions.
Utilize log parsing libraries or frameworks: For more complex log formats, leverage log parsing libraries or frameworks that provide built-in functionality to handle log parsing. Examples include Logstash, Elasticsearch, Fluentd, Apache Kafka, or language-specific log parsing libraries.
Test and refine: Test the parsing strategy and patterns against sample log messages to ensure they accurately extract the desired fields. Adjust and refine the approach as needed.
Process and analyse: Once the logs are parsed and the relevant fields extracted, process and analyse the data. This might involve storing the data in a database, performing aggregations or calculations, generating reports, or integrating it with other systems.
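The "Test and refine" step above can be sketched as a small Python harness. This is only an illustration under assumptions: the function name check_pattern, the trial pattern, and the shortened sample lines are hypothetical and not part of any Grok tooling. It runs one candidate pattern over sample messages and reports the ones that do not match or that leave a required field empty.

```python
import re

def check_pattern(pattern, samples, required_fields):
    """Return (sample, problem) pairs for samples the pattern does not parse cleanly."""
    compiled = re.compile(pattern)
    failures = []
    for sample in samples:
        match = compiled.search(sample)
        if match is None:
            failures.append((sample, "no match"))
            continue
        # A field counts as missing when the group is absent or captured nothing.
        fields = match.groupdict()
        missing = [name for name in required_fields if not fields.get(name)]
        if missing:
            failures.append((sample, "missing: " + ", ".join(missing)))
    return failures

# Shortened, illustrative samples in the style of WebUI log lines.
samples = [
    "webui INFO: Deployment finished",
    "webui 2023-05-10T13:34:02Z ERROR  [Thread-4] subscribe error",
]

# Hypothetical first attempt: it expects the log level right after the log type.
trial = r"^(?P<logtype>\w+) (?P<log_level>[A-Z]+):"
for sample, problem in check_pattern(trial, samples, ["logtype", "log_level"]):
    print(problem, "->", sample)
```

Each reported failure points at a sample the pattern still has to account for, which is the refinement loop the step describes.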

F5 BIG-IP logs

An F5 BIG-IP load balancer distributes the communications evenly across the servers in a network, so that no single server is overwhelmed. The BIG-IP keeps a constant check on the incoming and outgoing traffic of the servers and it will route the user requests to the most available server that can best handle them.

It also improves application performance, scalability and reliability while enhancing security and user experience.

F5 BIG-IP has a complex log structure, with multiple formats within a single log type. Logs of this kind require appropriate parsing techniques and tools to ensure the desired fields are extracted.
In this blog, F5 BIG-IP WEBUI logs are taken as an example and parsed using GROK and REGEX patterns.

Encountering various timestamps formats

F5 BIG-IP WEBUI logs consist of multiple timestamp formats.
Grok provides default patterns for commonly used timestamp formats, making it easier to extract timestamps without writing custom regular expressions.
However, there may be cases where a specific timestamp format doesn’t have a default Grok pattern. In such scenarios, use a custom regular expression that matches the desired timestamp format. This ensures accurate extraction of timestamps from log messages.

For example,

“May 11, 2023 8:54:13 AM” is a timestamp format that does not have a default Grok pattern.

To extract the above timestamp, define a custom Grok pattern using the regular expression below, which matches the timestamp components (month, day, year, hour, minute, second, AM/PM) and assigns the whole match to the field vendor_timestamp.


(?<vendor_timestamp>[A-Z][a-z]{2,3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)
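To try the same expression outside a Grok pipeline, here is a minimal Python sketch. It assumes the field name vendor_timestamp from the text, English three-letter month abbreviations, and a hypothetical helper name extract_vendor_timestamp. Python writes named groups as (?P<name>...), whereas Grok uses (?<name>...).

```python
import re
from datetime import datetime

# Same expression as the custom Grok pattern, in Python's named-group syntax.
VENDOR_TS = re.compile(
    r"(?P<vendor_timestamp>[A-Z][a-z]{2,3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)"
)

def extract_vendor_timestamp(line):
    """Return the first vendor-style timestamp in the line as a datetime, or None."""
    match = VENDOR_TS.search(line)
    if match is None:
        return None
    # %b assumes English three-letter month names such as "May";
    # %I and %p handle the 12-hour clock with AM/PM.
    return datetime.strptime(match.group("vendor_timestamp"), "%b %d, %Y %I:%M:%S %p")

line = ("webui INFO: Deployment of configuration descriptor "
        "/etc/tomcat/Catalina/localhost/tmui.xml has finished in 55,602 ms "
        "May 10, 2023 6:34:02 AM org.apache.catalina.startup.HostConfig deployDescriptor")
print(extract_vendor_timestamp(line))
```

Converting the captured text to a datetime right away makes it easier to compare it with timestamps that arrive in the other formats.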


Below are sample logs with multiple timestamp formats.


webui INFO: Deployment of configuration descriptor /etc/tomcat/Catalina/localhost/tmui.xml has finished in 55,602 ms May 10, 2023 6:34:02 AM org.apache.catalina.startup.HostConfig deployDescriptor
webui WARNING: [SetPropertiesRule]{Server/Service/Engine/Host} Setting property 'xmlValidation' to 'false' did not find a matching property. May 12, 2023 5:35:14 AM org.apache.tomcat.util.digester.SetPropertiesRule begin usage: java org.apache.catalina.startup.Catalina [ -config {pathname} ] [ -nonaming ]  { -help | start | stop } Fri May 12 05:35:11 PDT 2023
webui 2023-05-10T13:34:02Z ERROR  [Thread-4] controller.SubscriberServlet:subscribe     : MCP subscribe error: Unable to read POST response data java.net.ConnectException: Connection refused (Connection refused)
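The three timestamp shapes visible in these samples can be told apart with one expression per shape. The sketch below is an assumption-laden illustration, not the parser used in this blog: the labels iso8601, vendor, and unix_date and the helper find_timestamps are hypothetical names, and the expressions are only as strict as these sample lines require.

```python
import re

# One expression per timestamp shape seen in the sample logs (labels are illustrative).
TIMESTAMP_FORMATS = {
    # e.g. 2023-05-10T13:34:02Z
    "iso8601": re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z"),
    # e.g. May 12, 2023 5:35:14 AM
    "vendor": re.compile(r"[A-Z][a-z]{2,3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M"),
    # e.g. Fri May 12 05:35:11 PDT 2023
    "unix_date": re.compile(
        r"[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} [A-Z]{3,4} \d{4}"
    ),
}

def find_timestamps(line):
    """Return (format_label, text) for every timestamp in the line, in order of appearance."""
    found = []
    for label, pattern in TIMESTAMP_FORMATS.items():
        for match in pattern.finditer(line):
            found.append((match.start(), label, match.group(0)))
    return [(label, text) for _, label, text in sorted(found)]

line = ("webui WARNING: Setting property 'xmlValidation' to 'false' did not find a "
        "matching property. May 12, 2023 5:35:14 AM "
        "org.apache.tomcat.util.digester.SetPropertiesRule begin usage: java "
        "org.apache.catalina.startup.Catalina Fri May 12 05:35:11 PDT 2023")
for label, text in find_timestamps(line):
    print(label, "->", text)
```

A single log line can carry more than one timestamp, as the second sample shows, so the helper returns all of them instead of stopping at the first.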


Regex for Multiline log formats

Multiline logs often have more complex structures compared to single-line logs. They may span multiple lines and contain line breaks, making it difficult to extract information using predefined Grok patterns.
The regex pattern below provides more flexibility and control when capturing text that spans multiple lines.


(\t+)?(?[\w\W\.\d\(\):]+$)

The pattern above optionally matches leading tab characters and then, because [\w\W] matches any character including line breaks, captures the multiline content up to the end of the string.
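The behaviour of that character class can be checked with a short Python sketch. The group name stack_trace and the sample text are hypothetical, chosen only for illustration. The point it shows is that [\w\W] crosses line breaks while a plain dot stops at the first one.

```python
import re

# `stack_trace` is a hypothetical group name used only for this illustration.
MULTILINE_TAIL = re.compile(r"(\t+)?(?P<stack_trace>[\w\W\.\d\(\):]+$)")

# For comparison: a dot does not match "\n" unless DOTALL is set.
FIRST_LINE_ONLY = re.compile(r"(\t+)?(?P<first_line>.+)")

# Illustrative continuation lines in the style of a Java stack trace.
text = (
    "\tat com.f5.util.UsernameHolder.getUsername(UsernameHolder.java:72)\n"
    "\tat com.f5.util.UsernameHolder.updateConnection(UsernameHolder.java:270)"
)

whole = MULTILINE_TAIL.match(text).group("stack_trace")
first = FIRST_LINE_ONLY.match(text).group("first_line")

print(whole.count("\n"))  # line breaks captured by [\w\W]
print(first)              # text captured by the dot-based pattern
```

Since \w and \W together already cover every character, the extra entries inside the class (\., \d, parentheses, colon) do not change what it matches.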

Below is a sample multiline log.


webui SEVERE: Servlet.service() for servlet [org.apache.jsp.tmui.overview.welcome.introduction_jsp] in context with path [/tmui] threw exception May 24, 2023 11:14:39 PM org.apache.catalina.core.StandardWrapperValve invoke java.lang.NullPointerException 
     at com.f5.util.UsernameHolder.getUsername(UsernameHolder.java:72) 
     at com.f5.util.UsernameHolder.updateConnection(UsernameHolder.java:270)  
     at com.f5.util.UsernameHolder.updateConnection(UsernameHolder.java:245)

Regex with GROK

Using GREEDYDATA in the middle of a pattern captures as many of the remaining characters in the log line as it can, and the engine then has to backtrack to match whatever follows, which makes parsing more expensive and the captured field less precise.
Regex patterns provide the flexibility to capture only the required fields without capturing unnecessary ones, which improves the performance of the parser.
Customize parsing: GROK comes with a set of predefined patterns, but some log formats are unique and not covered by them.
By using regex within GROK, you can create custom patterns for the specific log format, allowing more precise parsing and extraction of data.
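The effect of a greedy capture in the middle of a pattern can be seen in plain Python. In the standard Grok pattern set, GREEDYDATA corresponds to the regex .* and DATA to .*?. The log line below is synthetic, built only to contain two timestamps, and the variable names are illustrative.

```python
import re

# The custom timestamp expression from this blog, reused as a fragment.
TS = r"[A-Z][a-z]{2,3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M"

# GREEDYDATA behaves like ".*"; DATA behaves like ".*?".
greedy = re.compile(rf"(?P<message>.*) (?P<vendor_timestamp>{TS})")
lazy = re.compile(rf"(?P<message>.*?) (?P<vendor_timestamp>{TS})")

# Synthetic line with two timestamps, for illustration only.
line = "first event May 10, 2023 6:34:02 AM second event May 12, 2023 5:35:14 AM"

g = greedy.match(line)
l = lazy.match(line)

# The greedy version swallows the first timestamp into `message`
# and only gives back enough text to match the last one.
print(repr(g.group("message")), "|", g.group("vendor_timestamp"))
# The lazy version stops at the first timestamp it can reach.
print(repr(l.group("message")), "|", l.group("vendor_timestamp"))
```

Anchoring the capture on an explicit expression such as the timestamp keeps the message field from absorbing text that belongs to other fields.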

Optional GROK

F5 BIG-IP WebUI logs have different formats or structures depending on the specific event or action being logged.
Optional Grok patterns extract such fields when they are present and let the parser handle different log formats without parsing errors.
When a field is absent, the parser skips that part of the pattern, so a single pattern can handle several log formats.
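The same idea can be sketched with an optional group in Python. This is a simplified stand-in for the Grok pattern, not a translation of it. The field names logtype, event_created, log_level, and message follow the names used in this blog, while the expression itself is an assumption that covers only two of the sample line shapes.

```python
import re

# The ISO 8601 timestamp sits in an optional non-capturing group "(?: ... )?",
# so lines with and without it are handled by the same expression.
WEBUI_LINE = re.compile(
    r"^(?P<logtype>\w+) "
    r"(?:(?P<event_created>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) +)?"
    r"(?P<log_level>[A-Z]+):? +"
    r"(?P<message>.*)$"
)

lines = [
    "webui INFO: Deployment of configuration descriptor has finished",
    "webui 2023-05-10T13:34:02Z ERROR  [Thread-4] MCP subscribe error",
]

parsed = [WEBUI_LINE.match(entry).groupdict() for entry in lines]
for fields in parsed:
    # event_created is None when the optional group did not take part in the match.
    print(fields["event_created"], fields["log_level"], fields["message"])
```

Fields that were skipped come back as None, so downstream code can tell an absent field from an empty one.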

Below is the combined GROK and REGEX pattern that parses logs with multiple lines and different timestamp formats.


%{WORD:logtype} (%{TIMESTAMP_ISO8601:event_created})?%{SPACE}(%{LOGLEVEL:log_level})?((%{GREEDYDATA:message})?(?<vendor_timestamp>[A-Z][a-z]{2,3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) %{DATA:class} %{WORD:action})?( usage: java %{DATA:class1} \[ -config %{DATA:config_path} \] \[ -nonaming \](.*)? %{DATESTAMP_OTHER:timestamp})?(?[^\t]+)?(\t+)?(?[\w\W\.\d\(\):]+$)?

Analyzing logs from multiple products with complex log structures presents significant challenges for security operations teams. However, with the right approach, including the use of advanced rules and GROK patterns, these challenges can be overcome. By understanding diverse log types, leveraging advanced techniques, and embracing automation, security operations teams can extract valuable insights from log data, enabling them to proactively detect and respond to potential security incidents effectively.

About Loginsoft

For over 20 years, leading companies in Telecom, Cybersecurity, Healthcare, Banking, New Media, and more have come to rely on Loginsoft as a trusted resource for technology talent. From startups to product companies and enterprises, organizations rely on our services. Whether Onsite, Offsite, or Offshore, we deliver. With a track record of successful partnerships with leading technology companies globally, and specifically in the past 6 years with Cybersecurity product companies, Loginsoft offers a comprehensive range of security offerings, including Software Supply Chain, Vulnerability Management, Threat Intelligence, Cloud Security, Cybersecurity Platform Integrations, creating content packs for Cloud SIEM, Logs onboarding and more. Our commitment to innovation and expertise has positioned us as a trusted player in the cybersecurity space. Loginsoft continues to provide traditional IT services which include Software development & Support, QA automation, Data Science & AI, etc.

Expertise in Integrations with Threat Intelligence and Security Products: Built more than 250 integrations with leading TIP, SIEM, SOAR, and Ticketing Platforms such as Cortex XSOAR, Anomali, ThreatQ, Splunk, IBM QRadar & Resilient, Microsoft Azure Sentinel, ServiceNow, Swimlane, Siemplify, MISP, Maltego, Cryptocurrency Digital Exchange Platforms, CISCO, Datadog, Symantec, Carbonblack, F5, Fortinet, and so on. Loginsoft is a partner of industry-leading technology vendors such as Palo Alto, Splunk, Elastic, and IBM Security.

In addition, Loginsoft offers Research as a service: We're more than just experts in cybersecurity; we're your accredited in-house research team focused on unraveling the complexities of cybersecurity and future technologies. From Application Security to Threat Research, our seasoned professionals have cultivated expertise in every facet of the field. We've earned the trust of over 20 security platform companies, who count on our research and analysis to strengthen their cybersecurity solutions.

