CASE STUDY:
Centralized log management and security integration development for capturing, indexing, and analyzing structured and unstructured data from security endpoints
- Store and index logs, and map unstructured data into structured data
- Develop several Java plugins that integrate with GCP, Google Workspace, CarbonBlack, Azure Cloud, and other platforms, providing the following services:
- Fetch logs by log type (VPC Flow logs, Audit logs, Firewall logs) as defined in the application configuration settings
- Query the fetched logs at scheduled intervals
- Normalize the plain-text logs to the customer’s standard format, which facilitates compression
- Ingest the logs into the application for analysis
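The normalization step above can be illustrated with a small sketch. The customer's standard format is not described in this case study, so the `key=value` input layout, the `log_type` field, and the `LogNormalizer` class below are assumptions made purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical normalizer: converts a plain "KEY=value KEY=value" log line
// into an ordered map of fields. The real target format is an assumption.
final class LogNormalizer {

    static Map<String, String> normalize(String logType, String plainLine) {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("log_type", logType);
        for (String token : plainLine.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq <= 0) {
                continue; // skip tokens that are not key=value pairs
            }
            // Lower-case the key so field names are uniform across sources.
            String key = token.substring(0, eq).toLowerCase(Locale.ROOT);
            String value = token.substring(eq + 1);
            record.put(key, value);
        }
        return record;
    }

    public static void main(String[] args) {
        System.out.println(normalize("firewall", "SRC=10.0.0.5 DST=10.0.0.9 ACTION=DENY"));
    }
}
```

Uniform field names are one reason a normalized format tends to compress well: the same keys repeat across every record regardless of the originating platform.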
- A service account is used for authentication because it supports app-to-app authorization and requires no human intervention
- Logs streamed from the Cloud Logging API are routed to a logging sink
- The logging sink filters the logs based on the log types configured in the application
- The destination for the logging sink is a BigQuery dataset
- The Java application creates the BigQuery dataset, and tables are created automatically inside it when logs become available. The table schema is based on the log type
- A separate table is created for each log type on a daily basis
- The Java application queries the BigQuery tables at scheduled intervals to fetch the logs and ingest them into the application. Query results can be fetched in CSV or JSON format
- After fetching the logs, the application deletes the tables based on table creation time, which keeps storage usage to a minimum
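The per-day tables and the cleanup by creation time could be sketched as follows. This is a minimal illustration, not the project's actual code: the `TableInfo` class stands in for whatever table metadata the BigQuery client returns, and the `<logType>_<yyyyMMdd>` naming scheme and the retention window are assumptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

final class TableCleanup {

    // Minimal stand-in for table metadata (name and creation time only).
    static final class TableInfo {
        final String name;
        final Instant creationTime;

        TableInfo(String name, Instant creationTime) {
            this.name = name;
            this.creationTime = creationTime;
        }
    }

    // Hypothetical naming scheme: one table per log type per day.
    static String dailyTableName(String logType, LocalDate day) {
        return logType + "_" + day.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    // Returns the names of tables created before (now - maxAge).
    static List<String> tablesToDelete(List<TableInfo> tables, Instant now, Duration maxAge) {
        Instant cutoff = now.minus(maxAge);
        List<String> expired = new ArrayList<>();
        for (TableInfo table : tables) {
            if (table.creationTime.isBefore(cutoff)) {
                expired.add(table.name);
            }
        }
        return expired;
    }
}
```

Passing `now` in as a parameter, rather than reading the clock inside the method, keeps the selection logic deterministic and easy to test; the actual delete call would then be issued for each returned name.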
Pub/Sub was considered as an alternative approach, but BigQuery was chosen based on performance.
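For the sink-based approach, the filter that restricts the sink to the configured log types might be assembled roughly like this. The short type keys (`vpc_flow`, `firewall`, `audit`) and the `SinkFilterBuilder` class are hypothetical; the log IDs are the standard Cloud Logging names for VPC Flow Logs, Firewall Rules Logging, and Admin Activity audit logs, and only Admin Activity is included for the audit type here as a simplification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SinkFilterBuilder {

    // Maps hypothetical configuration keys to URL-encoded Cloud Logging log IDs.
    private static final Map<String, String> LOG_IDS = new LinkedHashMap<>();
    static {
        LOG_IDS.put("vpc_flow", "compute.googleapis.com%2Fvpc_flows");
        LOG_IDS.put("firewall", "compute.googleapis.com%2Ffirewall");
        // Audit logs also include data_access and system_event; omitted here.
        LOG_IDS.put("audit", "cloudaudit.googleapis.com%2Factivity");
    }

    // Builds a Logging query-language filter that matches any configured log type.
    static String buildFilter(String projectId, List<String> logTypes) {
        List<String> clauses = new ArrayList<>();
        for (String type : logTypes) {
            String logId = LOG_IDS.get(type);
            if (logId == null) {
                throw new IllegalArgumentException("Unknown log type: " + type);
            }
            clauses.add("logName=\"projects/" + projectId + "/logs/" + logId + "\"");
        }
        return String.join(" OR ", clauses);
    }
}
```

The resulting string would be supplied as the sink's filter when the sink is created, so that only the selected log types reach the BigQuery dataset.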
Google Workspace Logs
Approach:
Logs are fetched through the Reports API, which programmatically retrieves activity and usage reports. Domain-wide delegation authorizes a third-party application to access the Admin SDK Reports API. A service account in Google Cloud Platform is authorized through domain-wide delegation to access the logs from Workspace.
- Authentication is done using a service account for app-to-app authorization
- This service account has domain-wide delegation enabled and can access Workspace logs
- The Java application authenticates as the service account and pulls the activity report. The admin activity report lists all activities of all administrators and is organized by event names
- Users can visualize admin, drive, login, calendar, and token logs retrieved through the Reports API
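As a rough illustration of the Reports API calls described above, the sketch below builds the `activities.list` request URL for one application and all users. The `ReportsRequest` class and its parameters are hypothetical; authentication is deliberately left out, and a real request would also carry an OAuth 2.0 bearer token obtained for the delegated service account.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class ReportsRequest {

    // Application names mentioned in this case study.
    private static final Set<String> APPLICATIONS =
            new HashSet<>(Arrays.asList("admin", "drive", "login", "calendar", "token"));

    // Builds the activity list URL for all users of one application.
    // startTime is an RFC 3339 timestamp, e.g. "2024-01-01T00:00:00Z".
    static String activityUrl(String application, String startTime, int maxResults) {
        if (!APPLICATIONS.contains(application)) {
            throw new IllegalArgumentException("Unsupported application: " + application);
        }
        try {
            return "https://admin.googleapis.com/admin/reports/v1/activity/users/all/applications/"
                    + application
                    + "?startTime=" + URLEncoder.encode(startTime, "UTF-8")
                    + "&maxResults=" + maxResults;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }
}
```

Calling this once per application at each scheduled interval, with `startTime` set to the end of the previous window, is one way the periodic pull could be organized.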
Gmail logs to BigQuery
Gmail logs can be fetched through BigQuery by configuring an export of Gmail logs into BigQuery, specifying the service account and the BigQuery dataset name. This feature is available to Enterprise and Education subscriptions with Standard and Plus subtypes.
When email logs are turned on, a template table named daily_ is created in the BigQuery dataset and serves as the schema for the daily tables. The daily tables are created automatically as logs become available.
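A query against one day's Gmail log table might be assembled as in the sketch below. It assumes the daily tables are named `daily_` followed by the date as `yyyyMMdd`; the project and dataset names and the `GmailLogQuery` class are placeholders rather than values from this project.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class GmailLogQuery {

    // Fully qualified table ID for one day, assuming a daily_<yyyyMMdd> naming scheme.
    static String dailyTable(String project, String dataset, LocalDate day) {
        return project + "." + dataset + ".daily_" + day.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    // Standard SQL statement selecting every row of that day's table.
    static String selectAll(String project, String dataset, LocalDate day) {
        return "SELECT * FROM `" + dailyTable(project, dataset, day) + "`";
    }
}
```

The resulting statement would then be submitted through the BigQuery client in the same scheduled job that handles the other log types.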