SIEM Log Sources — Collection, Onboarding & Normalisation

Types of Log Sources — Firewalls, Endpoints, Servers, Cloud & Apps

Understanding the diverse range of SIEM log sources is fundamental for effective Security Information and Event Management (SIEM) operations. Log sources are the origins of security-related data that feed into the SIEM system for analysis, alerting, and incident response. Each type of log source provides unique insights into different aspects of an organization’s security posture. Recognizing the characteristics, typical log formats, and challenges associated with these sources enhances log collection, onboarding, and normalization processes.

**Firewalls** are primary perimeter security devices that monitor and control incoming and outgoing network traffic based on predetermined security rules. Firewall logs include details such as source and destination IP addresses, port numbers, protocol types, action taken (allow or block), and rule identifiers. These logs are crucial for detecting unauthorized access attempts, lateral movement, and policy violations. Firewalls from vendors like Cisco ASA, Palo Alto Networks, and Fortinet produce syslog or proprietary format logs.

**Endpoints** encompass workstations, laptops, servers, and mobile devices. Endpoints generate logs related to user activity, process execution, application errors, malware detection, and system events. Endpoint Detection and Response (EDR) tools like CrowdStrike, SentinelOne, and Microsoft Defender produce rich data streams that, once onboarded, can reveal malicious activities or policy violations at the device level.

**Servers**—including Windows, Linux, and database servers—produce logs covering system events, application logs, authentication attempts, and resource access. Windows servers generate Event Logs, while Linux servers produce syslog entries and application-specific logs, which, when correlated, can reveal sophisticated attack patterns or misconfigurations.

**Cloud & Applications** have become vital SIEM data sources. Cloud platforms like AWS, Azure, and GCP generate logs related to user activities, resource provisioning, network flow data, and audit trails. These logs are often collected via APIs or native integrations and are essential for cloud security monitoring and compliance.

Effective threat detection hinges on understanding the nuances of each log source type, their log formats, and the best practices for collection and normalization. Combining logs from diverse sources offers comprehensive visibility, enabling security teams to detect complex attack chains and respond swiftly. For organizations undergoing cloud migrations or adopting hybrid environments, mastering the management of varied log sources is a critical skill, as highlighted in the best cloud security courses at Networkers Home.

Log Collection Methods — Syslog, Agent, API, WEC & SNMP Traps

Collecting logs efficiently from diverse SIEM log sources requires understanding various log collection methods. Each method offers different advantages, deployment complexities, and suitability depending on the source type and organizational requirements.

Syslog remains the most widely used method for network devices, firewalls, switches, and many Linux servers. It relies on the UDP or TCP protocols to send log messages in a standardized format to a centralized syslog server or SIEM collector. Configuration involves setting the device or server to forward logs to the SIEM’s syslog receiver, often using tools like rsyslog or syslog-ng.

# Example: Configuring Cisco ASA to send logs via syslog to SIEM
# ("inside" is an assumed interface name; use the interface that reaches the collector)
logging enable
logging trap informational
logging host inside 192.168.1.100
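To show what travels over the wire, here is a minimal Python sketch of a syslog sender. It is a simplification: the message omits the timestamp and tag that a full BSD-style header carries, and the facility value and function names are assumptions chosen for illustration.

```python
import socket

# Standard syslog severity codes (0 = most severe).
SEVERITY = {"emergency": 0, "alert": 1, "critical": 2, "error": 3,
            "warning": 4, "notice": 5, "informational": 6, "debug": 7}
FACILITY_LOCAL4 = 20  # assumed facility, chosen for illustration


def build_syslog_message(severity: str, hostname: str, text: str,
                         facility: int = FACILITY_LOCAL4) -> bytes:
    """Build a simplified syslog datagram: <PRI>hostname text."""
    # The priority value combines facility and severity in one number.
    pri = facility * 8 + SEVERITY[severity]
    return f"<{pri}>{hostname} {text}".encode("utf-8")


def send_udp(message: bytes, host: str, port: int = 514) -> None:
    """Send one datagram; UDP offers no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))
```

The priority calculation is why a collector can filter on severity without parsing the message body: `logging trap informational` on the device corresponds to severity 6 and below.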

Agents are lightweight software installed directly on endpoints or servers to collect logs and forward them to the SIEM. Agents like Filebeat, Winlogbeat, or proprietary solutions from security vendors gather logs from local sources such as Windows Event Logs or Linux syslog files. They provide flexibility and improve log fidelity, especially in remote or distributed environments.

APIs facilitate direct log extraction from cloud platforms, SaaS applications, or modern security tools. For instance, AWS CloudTrail logs can be fetched via AWS APIs, enabling real-time or scheduled ingestion into SIEM systems. APIs are essential for integrating cloud-native logs and custom applications, offering structured data suitable for advanced analysis.
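Most log APIs return results in pages linked by a continuation token. The sketch below shows that polling pattern in generic Python; `fetch_page` and `fake_fetch` are hypothetical stand-ins for a real client call, not part of any vendor SDK.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# A page fetcher takes a continuation token and returns (events, next_token).
PageFetcher = Callable[[Optional[str]], Tuple[List[Dict], Optional[str]]]


def iter_events(fetch_page: PageFetcher) -> Iterator[Dict]:
    """Yield every event by following continuation tokens until none is left."""
    token: Optional[str] = None
    while True:
        events, token = fetch_page(token)
        yield from events
        if token is None:
            break


def fake_fetch(token: Optional[str]) -> Tuple[List[Dict], Optional[str]]:
    """Hypothetical two-page API used only to exercise the loop."""
    pages = {
        None: ([{"id": 1}, {"id": 2}], "page2"),
        "page2": ([{"id": 3}], None),
    }
    return pages[token]


collected = [event["id"] for event in iter_events(fake_fetch)]
```

A production collector would also persist the last token or timestamp it processed so that a restart resumes without gaps or duplicates.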

Windows Event Collector (WEC) is the Windows service that centralizes Event Logs from multiple endpoints. A WEC server holds subscriptions that define which events to gather from which source computers, so Windows logs can be consolidated on a few collectors and onboarded into the SIEM from there. This reduces the need for a SIEM agent on every system and scales to large fleets.

SNMP Traps are used primarily for network devices to notify management systems of specific events, such as interface failures or threshold breaches. While not as detailed as syslog, SNMP traps provide early alerts for critical network issues, which can be correlated with other logs for comprehensive security monitoring.

Choosing the right collection method depends on factors like source type, network topology, security policies, and scalability requirements. Combining multiple methods ensures comprehensive coverage, with each method complementing others to provide a holistic security view.

Syslog Configuration — rsyslog, syslog-ng & CEF/LEEF Formats

Configuring syslog for SIEM log collection involves setting up a syslog daemon such as rsyslog or syslog-ng. Proper configuration ensures reliable, secure, and standardized log forwarding, which is critical for effective log onboarding and normalization.

rsyslog is the default syslog daemon on many Linux distributions. Its configuration files, typically located in /etc/rsyslog.conf or /etc/rsyslog.d/, define where logs are sent and in what format. To forward logs to a SIEM, add a line like:

*.* @192.168.1.200:514

This rule forwards all logs to the SIEM at IP 192.168.1.200 on port 514. A single @ sends over UDP; for TCP transmission, use @@. To protect confidentiality and integrity in transit, enable TLS with certificates on both the sender and the receiver.

syslog-ng offers more advanced filtering, rewriting, and format customization capabilities. Its configuration, located in /etc/syslog-ng/syslog-ng.conf, allows defining sources, destinations, and log paths. Example configuration snippet:

destination d_siem { tcp("192.168.1.200" port(514)); };
log { source(s_src); destination(d_siem); };

Both syslog daemons can be configured to export logs in formats compatible with SIEM ingestion, such as CEF (Common Event Format) or LEEF (Log Event Extended Format). These formats standardize log data, easing normalization and correlation.

**CEF/LEEF formats** encode structured log data with headers and extensions. For example, a CEF log might look like:

CEF:0|FireEye|FireEye EDR|1.0|1001|Malicious activity detected|10|src=192.168.0.1 dst=10.0.0.5 spt=1232 dpt=80

Configuring syslog to output in CEF/LEEF involves custom formatting rules or using dedicated agents that support these formats. Proper implementation of syslog configuration ensures that logs are both comprehensive and easily parsable during onboarding and normalization stages.
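On the receiving side, a CEF record can be split into its seven header fields and its key-value extension. The Python sketch below is a simplified illustration, not a complete CEF implementation: it handles a backslash-escaped pipe in the header but not every escape rule, and the field names in `CEF_HEADER_FIELDS` are labels chosen here.

```python
import re
from typing import Dict

CEF_HEADER_FIELDS = ["version", "device_vendor", "device_product",
                     "device_version", "signature_id", "name", "severity"]


def parse_cef(line: str) -> Dict[str, object]:
    """Split a CEF record into header fields and an extension dictionary."""
    if not line.startswith("CEF:"):
        raise ValueError("not a CEF record")
    # Split on pipes not preceded by a backslash: 7 header fields + extension.
    parts = re.split(r"(?<!\\)\|", line[4:], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    record: Dict[str, object] = dict(zip(CEF_HEADER_FIELDS, parts[:7]))
    extension = parts[7] if len(parts) > 7 else ""
    # Each value runs until the next " key=" token or the end of the line.
    pairs = re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", extension)
    record["extension"] = dict(pairs)
    return record


sample = ("CEF:0|FireEye|FireEye EDR|1.0|1001|Malicious activity detected|10|"
          "src=192.168.0.1 dst=10.0.0.5 spt=1232 dpt=80")
parsed = parse_cef(sample)
```

Because values may contain spaces, the extension is matched by looking ahead for the next `key=` token rather than splitting on whitespace.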

Windows Event Log Collection — WEF, Sysmon & NXLog

Windows environments are prolific sources of security-relevant logs, including user activities, process creations, and system errors. Effective collection of Windows Event Logs involves using technologies like Windows Event Forwarding (WEF), Sysmon, and NXLog.

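Windows Event Forwarding (WEF) is a native Windows feature that enables centralized collection of Event Logs. It operates in two modes: source-initiated and collector-initiated. In source-initiated mode, endpoints are pointed at a collector (typically through Group Policy) and push matching events to it; in collector-initiated mode, the collector connects to each listed source and pulls events. Both modes transport events over the WS-Management (WinRM) protocol. Configuration involves defining subscriptions on the collector, often as XML files, and applying Group Policy settings or PowerShell commands on the sources.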

Example command to configure the Windows Event Collector service on the collector (run from an elevated prompt):

wecutil qc /q

Sysmon (System Monitor) is a Windows Sysinternals tool that enhances Windows logging by capturing detailed process creation, network connection, and file modification events. It provides high-fidelity logs that help detect sophisticated threats. Deployment involves installing Sysmon with an XML configuration file, for example:

Sysmon.exe -accepteula -i sysmon.xml

A typical sysmon.xml contains include and exclude rules for events such as process creation and network connections, enabling detailed monitoring beyond native Event Logs.

NXLog is a versatile log collector supporting Windows Event Logs, Sysmon, and custom scripts. Configured via its nxlog.conf file, it can forward logs using various protocols and formats, including CEF, JSON, or plain text. Example snippet for forwarding Windows logs:


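# xm_syslog provides the to_syslog_ietf() procedure used below
<Extension _syslog>
  Module xm_syslog
</Extension>

<Input in_win_eventlog>
  Module im_msvistalog
</Input>

<Output out_siem>
  Module om_udp
  Host 192.168.1.150
  Port 514
  Exec to_syslog_ietf();
</Output>

# The route name is arbitrary
<Route eventlog_to_siem>
  Path in_win_eventlog => out_siem
</Route>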

Combining these tools supports complete, centralized collection of Windows logs and eases onboarding into SIEM systems for threat detection and compliance. Where confidentiality matters, prefer a TCP or TLS output over plain UDP.

Cloud Log Sources — AWS CloudTrail, Azure Activity Logs & GCP Audit

Cloud environments generate a wealth of security logs that are essential for comprehensive SIEM coverage. Major cloud providers like AWS, Azure, and Google Cloud Platform (GCP) offer native logging services designed for security monitoring, compliance, and forensic analysis.

AWS CloudTrail captures API activity across AWS accounts, including user actions, resource modifications, and service events. It can be configured to send logs to Amazon S3, CloudWatch Logs, or external SIEMs via API or CloudWatch subscriptions. Example: Setting up CloudTrail to deliver logs to CloudWatch:

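# <log-group-arn>, <role-arn>, and <log-group-name> are placeholders for your own resources
aws cloudtrail create-trail --name myTrail --s3-bucket-name myBucket --cloud-watch-logs-log-group-arn <log-group-arn> --cloud-watch-logs-role-arn <role-arn>
aws cloudtrail start-logging --name myTrail
aws logs put-metric-filter --log-group-name <log-group-name> --filter-name myFilter --filter-pattern "ERROR" --metric-transformations metricName=myMetric,metricNamespace=myNamespace,metricValue=1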

Azure Activity Logs provide insights into subscription-level operations such as resource provisioning, configuration changes, and security alerts. These logs are accessible via the Azure Monitor REST API, can be streamed to Event Hubs through diagnostic settings, or can be ingested into Microsoft Sentinel (formerly Azure Sentinel).

GCP Audit Logs track admin activity, data access, and system events within GCP projects. They can be exported via Pub/Sub topics or Cloud Logging sinks to SIEM platforms. For example, configuring a sink:

gcloud logging sinks create my-sink pubsub.googleapis.com/projects/my-project/topics/my-topic --log-filter='logName:"cloudaudit.googleapis.com"'

Integrating cloud logs into SIEMs requires secure API access, proper permissions, and normalization. Many SIEM solutions support native connectors for these cloud services, simplifying log onboarding. This integration ensures visibility into cloud-native threats, compliance violations, and operational anomalies, complementing on-premise log sources.
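As an illustration of the normalization step for cloud logs, the Python sketch below flattens CloudTrail records into a small set of common fields. The output field names and the sample values are assumptions made for this example; the input keys (`Records`, `eventTime`, `eventName`, `eventSource`, `awsRegion`, `sourceIPAddress`, `userIdentity`, `errorCode`) follow the CloudTrail record layout.

```python
import json
from typing import Dict, List


def flatten_cloudtrail(payload: str) -> List[Dict[str, object]]:
    """Map each CloudTrail record to a flat dictionary of common fields."""
    records = json.loads(payload).get("Records", [])
    events = []
    for rec in records:
        identity = rec.get("userIdentity", {})
        events.append({
            "timestamp": rec.get("eventTime"),
            "action": rec.get("eventName"),
            "service": rec.get("eventSource"),
            "region": rec.get("awsRegion"),
            "source_ip": rec.get("sourceIPAddress"),
            "user": identity.get("arn"),
            "error": rec.get("errorCode"),  # absent on successful calls
        })
    return events


# Made-up sample record, for illustration only.
SAMPLE = json.dumps({"Records": [{
    "eventTime": "2023-05-05T12:34:56Z",
    "eventName": "ConsoleLogin",
    "eventSource": "signin.amazonaws.com",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "203.0.113.10",
    "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"},
}]})
flat = flatten_cloudtrail(SAMPLE)
```

Using `.get()` throughout keeps the function tolerant of records that lack optional keys, which is common across event types.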

Log Normalisation — Parsing Raw Logs into Structured Fields

Raw logs are often unstructured or semi-structured, making analysis and correlation challenging. Log normalization is the process of transforming diverse log formats into structured, standardized data fields that facilitate efficient querying, alerting, and investigation.

During normalization, raw log entries are parsed to extract key attributes such as timestamp, source IP, destination IP, user IDs, event types, and severity levels. This process involves defining parsing rules, regex patterns, or leveraging log processors and agents that support schemas.

For example, a syslog message like:

May  5 12:34:56 firewall1 %ASA-6-106100: Allowed inbound TCP connection from 192.168.1.10/12345 to 10.0.0.5/80

can be parsed into structured fields:

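Timestamp: 2023-05-05 12:34:56 (the raw message carries no year; it is inferred at ingestion time)
Device: firewall1
Event ID: ASA-6-106100
Action: Allowed
Direction: inbound
Protocol: TCP
Source IP: 192.168.1.10
Source Port: 12345
Destination IP: 10.0.0.5
Destination Port: 80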

Normalization enables correlation across sources, reduces false positives, and enhances automated responses. Tools like custom parsers, Logstash, Fluentd, and SIEM-native parsers are commonly used for this purpose.
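The parsing step for the firewall message above can be sketched with a named-group regular expression in Python. The pattern is written only for the simplified sample line in this section; real ASA messages vary by message ID and would need one pattern per format. The field names are assumptions chosen for illustration.

```python
import re
from typing import Dict, Optional

ASA_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<device>\S+)\s"
    r"%(?P<event_id>ASA-\d-\d+):\s"
    r"(?P<action>\w+)\s(?P<direction>\w+)\s(?P<protocol>\w+)\sconnection\sfrom\s"
    r"(?P<src_ip>[\d.]+)/(?P<src_port>\d+)\sto\s"
    r"(?P<dst_ip>[\d.]+)/(?P<dst_port>\d+)$"
)


def parse_asa_line(line: str) -> Optional[Dict[str, object]]:
    """Return structured fields, or None when the line does not match."""
    match = ASA_PATTERN.match(line.strip())
    if match is None:
        return None
    fields: Dict[str, object] = match.groupdict()
    # Ports are numeric, so store them as integers for range queries.
    fields["src_port"] = int(match.group("src_port"))
    fields["dst_port"] = int(match.group("dst_port"))
    return fields


raw = ("May  5 12:34:56 firewall1 %ASA-6-106100: Allowed inbound TCP "
       "connection from 192.168.1.10/12345 to 10.0.0.5/80")
event = parse_asa_line(raw)
```

Returning `None` for non-matching lines lets the caller route unparsed messages to a fallback queue instead of dropping them silently.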

Effective normalization maximizes the value of collected logs, turning raw data into actionable intelligence, and aligns with best practices discussed at Networkers Home Blog.

Common Event Format Standards — CEF, LEEF & ECS

Standardized event formats streamline log onboarding and normalization in SIEM systems. The most prevalent standards include CEF (Common Event Format), LEEF (Log Event Extended Format), and ECS (Elastic Common Schema).

| Format | Description | Typical Use Cases | Example Snippet |
| --- | --- | --- | --- |
| CEF | Open standard by ArcSight, structured with a pipe-delimited header and key-value extensions. | Network devices, firewalls, endpoint security tools. | CEF:0\|Vendor\|Product\|Version\|SignatureID\|Name\|Severity\|src=1.2.3.4 dst=5.6.7.8 |
| LEEF | Format defined by IBM for QRadar, similar to CEF but with a shorter header and key-value attributes that are tab-delimited in LEEF 1.0. | QRadar integrations, log forwarding from various tools. | LEEF:1.0\|Vendor\|Product\|Version\|EventID\|src=1.2.3.4 dst=5.6.7.8 |
| ECS | Open schema by Elastic, designed for structured, consistent log data across diverse sources. | Elastic Stack integrations, cloud-native environments. | {"@timestamp":"2023-05-05T12:34:56Z","event.category":["network"],"source.ip":"192.168.1.10"} |

Implementing these standards during log onboarding improves interoperability, simplifies normalization, and enhances analytical capabilities. Many SIEM solutions and log agents natively support CEF and LEEF, which eases integration with diverse log sources.
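To make the schema idea concrete, the Python sketch below maps a flat, already-parsed firewall event onto ECS-style nested fields. The input keys are assumptions for this example; the output uses the ECS field names `@timestamp`, `event.category`, `event.action`, `observer.hostname`, `network.transport`, `source.*`, and `destination.*`.

```python
from typing import Dict


def to_ecs(event: Dict[str, object]) -> Dict[str, object]:
    """Convert a flat firewall event into an ECS-style nested document."""
    return {
        "@timestamp": event["timestamp"],
        "event": {"category": ["network"], "action": event["action"]},
        "observer": {"hostname": event["device"]},
        "network": {"transport": str(event["protocol"]).lower()},
        "source": {"ip": event["src_ip"], "port": event["src_port"]},
        "destination": {"ip": event["dst_ip"], "port": event["dst_port"]},
    }


# Hypothetical flat event, as a parser might produce it.
flat_event = {
    "timestamp": "2023-05-05T12:34:56Z",
    "device": "firewall1",
    "action": "allowed",
    "protocol": "TCP",
    "src_ip": "192.168.1.10",
    "src_port": 12345,
    "dst_ip": "10.0.0.5",
    "dst_port": 80,
}
ecs_doc = to_ecs(flat_event)
```

Once every source is mapped this way, a single query on `source.ip` covers firewall, endpoint, and cloud events alike.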

Log Source Onboarding Checklist & Best Practices

Onboarding new log sources into a SIEM requires meticulous planning and adherence to best practices. A comprehensive checklist ensures completeness and reduces gaps in visibility.

Identify the Source Type and Data Format: Determine whether the source is a firewall, endpoint, cloud service, or application. Understand the log format and required fields.
Configure Secure Log Transmission: Use encrypted protocols like TLS for syslog, secure API calls, or VPN tunnels for agent-based collection.
Standardize Log Formats: Convert logs into common schemas like CEF, LEEF, or ECS for consistency.
Implement Log Filtering and Prioritization: Filter out noise and focus on security-relevant events during collection.
Test Log Delivery: Verify logs are received accurately, complete, and in the correct format in the SIEM.
Document Configuration and Mapping: Record source details, log formats, and normalization rules for future reference and audits.
Establish Regular Monitoring and Maintenance: Continuously review log health, update configurations as needed, and ensure ongoing compliance.
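The "Test Log Delivery" step in the checklist above can be partly automated. The Python sketch below counts how many received events are missing fields the normalization rules are expected to fill; the `REQUIRED_FIELDS` list and the function names are assumptions for illustration and should reflect your own schema.

```python
from typing import Dict, Iterable, List, Sequence

# Assumed minimum field set for a network event.
REQUIRED_FIELDS = ("timestamp", "device", "src_ip", "dst_ip", "action")


def missing_fields(event: Dict[str, object],
                   required: Sequence[str] = REQUIRED_FIELDS) -> List[str]:
    """List required fields that are absent or empty in one event."""
    return [name for name in required if event.get(name) in (None, "")]


def delivery_report(events: Iterable[Dict[str, object]]) -> Dict[str, object]:
    """Summarize how often each required field is missing across events."""
    total = 0
    gaps: Dict[str, int] = {}
    for event in events:
        total += 1
        for name in missing_fields(event):
            gaps[name] = gaps.get(name, 0) + 1
    return {"total": total, "missing_by_field": gaps}


sample_events = [
    {"timestamp": "t1", "device": "fw1", "src_ip": "192.168.1.10",
     "dst_ip": "10.0.0.5", "action": "Allowed"},
    {"timestamp": "t2", "device": "fw1", "src_ip": "", "action": "Denied"},
]
report = delivery_report(sample_events)
```

A rising count for one field after a device upgrade is often the first sign that a log format changed and a parser needs updating.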

Adhering to these best practices maximizes the value derived from SIEM log sources and ensures robust security monitoring. For tailored onboarding strategies, consider exploring courses at Networkers Home.

Key Takeaways

Understanding different SIEM log sources is critical for comprehensive security monitoring.
Methods like syslog, agents, APIs, WEC, and SNMP traps serve distinct collection needs based on source type.
Proper syslog configuration using rsyslog or syslog-ng supports standardized log forwarding, especially with CEF/LEEF formats.
Windows log collection benefits from WEF, Sysmon, and NXLog, enabling detailed and centralized logs.
Cloud platforms offer native logging solutions that integrate seamlessly with SIEMs for hybrid security visibility.
Normalization transforms raw logs into structured data, facilitating efficient analysis and correlation.
Adopting standards like CEF, LEEF, and ECS enhances interoperability and simplifies onboarding.
A systematic onboarding checklist ensures consistent, secure, and complete log source integration.

Frequently Asked Questions

What are the best practices for onboarding new log sources into a SIEM?

Effective onboarding involves identifying the log source type, configuring secure transmission protocols, standardizing log formats (preferably in CEF, LEEF, or ECS), implementing filters to reduce noise, verifying log integrity, documenting configurations, and establishing ongoing monitoring routines. Proper documentation and periodic reviews ensure continuous visibility and compliance. Leveraging automation tools and templates accelerates onboarding while maintaining consistency. For comprehensive training, explore courses at Networkers Home.

How does log normalization improve SIEM efficiency?

Log normalization converts raw, unstructured logs into structured, standardized formats, enabling easier parsing, correlation, and analysis. It facilitates quick identification of security incidents, reduces false positives, and enhances automation. Normalized data supports cross-source correlation, making detection of complex attack patterns more accurate. Using schemas like ECS or formats like CEF/LEEF ensures interoperability across diverse log sources and SIEM platforms, ultimately leading to faster incident response and better security posture.

Which are the most common SIEM log sources in cloud environments?

In cloud environments, the most common log sources include AWS CloudTrail for API activity, Azure Activity Logs for subscription operations, and GCP Audit Logs for admin and system events. Additionally, cloud-native network flow logs, security alerts, and application logs from SaaS platforms are vital. These logs are typically collected via APIs or native integrations, then normalized for analysis. Managing cloud logs alongside on-premise data sources provides comprehensive visibility, enabling effective hybrid security monitoring, as emphasized in courses offered by Networkers Home.