This article exclusively helps the beginners out there, who are willing to learn and explore Osquery.
Initially, we would like to explain the mechanism and the technical terms related to Osquery functionality and also give examples to demonstrate how Osquery works.
We all know that Osquery is a tool that exposes an operating system as a high-performance relational database by enabling the developers to write SQL-based queries.
SQL tables are thus created to understand the performance or data related to:
- Processes under execution
- Loaded kernel modules
- Open network connections
- Browser plugins
- Hardware events
- File hashes
Let us understand, how technically Osquery works and what all are required for a successful query processing.
Query Engine
The query engine of Osquery is completely SQLite, thus making it simpler to the users while querying, parsing, optimizing and executing the data as required. Also, the data here is not stored in SQLite, but captured under Virtual Tables or RocksDB Database.
Virtual Tables
Virtual Tables are the soul of Osquery, which are defined through a DSL implemented in Python. These tables are generated during a query execution either by parsing a file or calling a system API, which further plays a vital role in analytics.
How Query Engine and Virtual Tables are related?
- Initially, Osquery build powers the SQLite with all the required data definitions and file hierarchies, which helps to retrieve the data dynamically.
- When the user passes a query, the engine requests the Virtual Tables to generate the data
- Now, the Osquery code translates the SQLite table constraints so that the virtual table can generate the requested data using APIS.
Event System
The Osquery has an intelligent Event system, which has a keen eye for monitoring sense. Each time a virtual table is queried, the event system can generate, filter and store data to be exposed. A continuous visibility is promised with the help of this Event system, where the Publishers are also granted the permission to develop their own event scheduler thread and can choose the API of their choice.
Scheduler
A smart move, which allows the queries to run on an approximate scheduled time interval , rather than exact one, thus preventing resource spikes across the fleet.
Diff Engine
Osquery has an excellent feature of Differential Engine, which helps to output only the most relevant data. Each time a query runs, the results of that query are stored in the internal RocksDB store.
RocksDB
In order to support the buffering of events efficiently, an outer agent called RocksDB, is used by Osquery, which is a highly write-optimized, embedded key-value database that is compiled into the Osquery binary.
Configuration and Logger Plugins
As we know that Osquery runs in various environments, the Configuration Plugins and the result of log plugins need to be adaptable to all the platforms. Hence these Osquery plugins provide the required configuration to the daemon.
Distributed Plugins
Though not essential, the Distributed plugins enable remote live querying of data from Osquery.
Static Compilation
Osquery binds most of its dependencies into a single static executable form, by easing the deployment of wide array of endpoints.
Watchdog
A watchdog of Osquery can be defined as a performance monitor guard for every query processing. It keeps tabs on the currently executing processes and kills the unwanted worker process if they exceed beyond the defined thresholds. Watchdog can also blacklist the least performing or problem causing queries.
How Does Osquery Work?
Osquery can be widely implemented to know system related information.
Let us try to find the following:
- To know the tables and the columns
- Query: .schema users
- List users
Query: SELECT * FROM users;
- Get the process name, port and PID for all processes
- Query: SELECT DISTINCT processes.name,listening_ports.port,processes.pid FROM listening_ports JOIN processes using (pid);
- To know logged in type, user, host
- Query: SELECT type, user, host FROM logged_in_users WHERE type = 'active';
- To know open ipv4 local_address, remote_address, local_port, remote port
- Query: SELECT local_address, remote_address, local_port, remote_port, state FROM process_open_sockets WHERE family ='2';
- To know ipv6 open local_address, remote_address, local_port, remote port
- Query: SELECT local_address, remote_address, local_port, remote_port, state FROM process_open_sockets WHERE family ='10';
- To know routes information
- Query: SELECT * FROM routes;
- Check Whether your system is compromised or not
- Query: SELECT name, path, pid FROM processes WHERE on_disk=0;
- Query: SELECT pid, name FROM processes;
- Query: SELECT pid, name FROM processes WHERE name="chrome.exe";
How Osquery File Integrity Monitoring Helps a Security Researcher
Security breaches start with a small change in the existing network or files, thus exposing all the sensitive data beyond limits, welcoming a potential threat further.
File Integrity Monitoring is one of the combat mechanisms for security threats, which tracks and records every single change that may have occurred in either stored or exposed file systems further alerting you on anomaly activities.
What is File Integrity Monitoring?
FIM is a technology that helps to monitor and detect the changes in files or any suspicious activity which may lead to a future cyber attack. FIM helps us to understand the file integrity monitoring by analyzing every file and checks how, when and by whom they are changed. It also suggests the process to restore those files if in case the modifications done were un-authorized.
Importance of FIM in security perspective
- High Protection to IT Infrastructure is always ensured through FIM which monitors the file changes on servers, databases, network devices, directory servers, applications, cloud environments, virtual images and also alerts about unauthorized changes.
- Helps always to stay focused and Compliant to many regulatory compliance standards like PCI-DSS, NERC CIP, FISMA, SOX, NIST and HIPAA, and also assists to adhere to the best practice frameworks like the CIS security benchmarks.
- Threat Intelligence is one more extended advantage of FIM which notifies about the change intelligence when needed along with business context and remediation steps.
- Encourages organizations to set up initial policies and baseline guidelines , which explains what kind of files are needed to be monitored under FIM
- Intense monitoring and instant reporting on malicious activity, which can help in further exploitation.
Conclusion:
Osquery is the most implemented and recognized way of exposing a system level data with a relational-database-like API which enables to query and understand one's own operating system to detect the zombie processes or any threat exposures.
Security researches consider Osquery as a powerful weapon to efficiently manage File Integrity Monitoring, Anomaly detection, Malware analysis and a robust threat intelligence adaptation.