We're thrilled to share our approach for identifying security risks in open-source software using Semgrep. Our exploration led us to the `node:vm` module, a part of Node.js that runs code within V8 virtual machine contexts. The `vm.runInNewContext` function, which compiles and executes code in a new context, caught our attention. The module's documentation clearly states: "The `node:vm` module is not a security mechanism. Do not use it to run untrusted code."
This statement piqued our interest, prompting us to investigate further using Sourcegraph community edition. Its ability to search across multiple repositories for specific code patterns made it an ideal tool for finding projects that call this function.
Our search led us to Medplum, an open-source healthcare developer platform, which uses the `vm.runInNewContext` function. Given the sensitive nature of healthcare data, we decided to investigate Medplum more closely.
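For readers unfamiliar with the API, here is a minimal sketch of what `vm.runInNewContext` does. It is not taken from any project; the `settings` object and the evaluated strings are illustrative assumptions. It shows that the new context has its own globals, yet code running inside it can still modify host objects that are passed in.

```javascript
const vm = require('node:vm');

// Host-side object shared with the new context (illustrative name).
const settings = { admin: false };

// The string is compiled and run with `settings` and `x` as its globals.
// The return value is the result of the last statement in the script.
const result = vm.runInNewContext('settings.admin = true; x + 40', { settings, x: 2 });
console.log(result); // 42

// The host object was modified from inside the context.
console.log(settings.admin); // true

// The context has separate globals: Node's `process` is not defined there.
const processType = vm.runInNewContext('typeof process', {});
console.log(processType); // undefined
```

Separate globals give scope isolation, not a security boundary, which is consistent with the warning in the documentation.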
Using Semgrep, we scanned Medplum for potential vulnerabilities. Semgrep is a static analysis tool that matches source code against rules and is well suited to spotting issues such as SQL injection, cross-site scripting (XSS), remote code execution, and other injection attacks. Our scan revealed a code injection vulnerability, highlighting the value of Static Application Security Testing (SAST) scanners.
Vulnerability Analysis
We found the vulnerability in a specific line of code in the `runInVmContext` function, which executes user-provided code within a new VM context. In that function, `${code}` is a placeholder for user-provided code. The `wrappedCode` variable directly interpolates this code, which is the source of the vulnerability.
Executing `wrappedCode` with `vm.runInNewContext` triggers the vulnerability.
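The pattern can be illustrated with a small, hypothetical sketch. This is not Medplum's actual source: the wrapper body, the `exports.handler` convention, and the payloads are assumptions made for illustration. Only the names `runInVmContext`, `code`, and `wrappedCode` come from the description above.

```javascript
const vm = require('node:vm');

// Hypothetical stand-in for the function described above; NOT Medplum's source.
function runInVmContext(code, event) {
  // User-provided text is pasted directly into the program that will run.
  const wrappedCode = `
    const exports = {};
    ${code}
    exports.handler(event);
  `;
  // Whatever the user wrote is now compiled and executed.
  return vm.runInNewContext(wrappedCode, { event }, { timeout: 1000 });
}

// A benign snippet behaves as intended.
const ok = runInVmContext(
  'exports.handler = (e) => e.input.toUpperCase();',
  { input: 'hello' }
);
console.log(ok); // HELLO

// A malicious snippet uses the same entry point. It walks from the context's
// global object to the host's Function constructor and reads host state.
const leaked = runInVmContext(
  "exports.handler = () => this.constructor.constructor('return process.platform')();",
  { input: 'ignored' }
);
console.log(leaked); // the host's platform string, e.g. "linux"
```

Because the payload reaches the host's `Function` constructor, an attacker is not limited to reading `process.platform`; the same route gives access to anything the host process can do.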
Medplum uses bots to build highly automated, custom workflows. These bots form the backbone of Medplum's automation capabilities, enabling a wide range of automated tasks and processes.
We validated our findings through manual penetration testing and identified a privilege escalation vulnerability that allows a low-privileged user to access the bot editor and execute malicious code.
The Importance of Secure Coding Practices
The vulnerability we identified in Medplum's use of the Node.js `vm` module highlights the importance of secure coding practices when dealing with user-supplied input. Developers must use the `vm` module with extreme caution, because running untrusted input through it can easily lead to code injection vulnerabilities.
In the case of Medplum, the direct interpolation of user-provided code into the `wrappedCode` variable was the root cause of the vulnerability. This is a classic code injection vulnerability: untrusted data is executed as code, potentially allowing an attacker to gain unauthorized access or execute malicious actions.
To mitigate such vulnerabilities, it's crucial to follow secure coding practices, such as:
- Input Validation: Thoroughly validate and sanitize all user-supplied input before using it in any sensitive operations, such as executing code within a virtual machine context.
- Least Privilege: Ensure that the execution context of the virtual machine has the minimum required permissions to perform its intended functionality, limiting the potential impact of a successful attack.
- Sandboxing: Implement robust sandboxing mechanisms to isolate the virtual machine execution from the rest of the application and the underlying system, reducing the attack surface and potential damage.
- Logging and Monitoring: Implement comprehensive logging and monitoring to detect and respond to any suspicious activity or attempted exploits.
- Regular Security Assessments: Perform regular security assessments, including static code analysis, penetration testing, and bug bounty programs, to identify and address vulnerabilities before they can be exploited.
By following these practices, developers can significantly reduce the risk of introducing vulnerabilities like the one we found in Medplum's use of the `vm` module.
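As one hedged illustration of the least-privilege and sandboxing points above, the sketch below runs a snippet in a separate Node.js process with an empty environment and a timeout. The helper name `runIsolated` and the `SECRET_TOKEN` variable are hypothetical. A child process alone is not a complete sandbox (it still shares the filesystem and network with the host), so treat this as a starting point rather than a fix.

```javascript
const { spawnSync } = require('node:child_process');

// Pretend the host process holds a secret (illustrative only).
process.env.SECRET_TOKEN = 'do-not-leak';

// Hypothetical helper: run a snippet in a separate Node.js process
// with an empty environment and a time limit.
function runIsolated(code) {
  return spawnSync(process.execPath, ['-e', code], {
    env: {},          // least privilege: no inherited environment variables
    timeout: 2000,    // stop runaway code after 2 seconds
    encoding: 'utf8', // return stdout and stderr as strings
  });
}

// The child cannot see the parent's secret.
const { stdout } = runIsolated('console.log(String(process.env.SECRET_TOKEN))');
console.log(stdout.trim()); // undefined
```

In practice this would be combined with OS-level controls such as containers, restricted users, or network policies.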
Semgrep Rule Explanation
The Semgrep rule we developed to detect the code injection vulnerability in Medplum's `runInVmContext` function operates in taint mode, which tracks the flow of data from source to sink. Let's break down the different components of the rule:
- Pattern Sources: The `pattern-sources` section identifies where the potentially untrusted data originates. In this case, we are looking for variables that do not have a hardcoded string value assigned to them (i.e., `pattern-not: const $SOURCE_VAR = "...";`). This allows us to identify variables that may contain user-supplied or otherwise untrusted input.
- Pattern Sinks: The `pattern-sinks` section identifies where the potentially untrusted data is used. In this case, we are searching for calls to the `vm.runInNewContext`, `vm.runInContext`, or `vm.runInThisContext` functions, since these are the sensitive sinks where untrusted data can be executed.
- Taint Tracking: The `mode: taint` setting enables taint tracking, which follows the flow of data from the identified sources to the identified sinks. This lets the rule detect cases where user-supplied input is used in a potentially unsafe manner, even if it passes through several intermediate variables.
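A simplified sketch of a rule with this shape might look like the following. It is illustrative rather than the exact rule used for the scan: the `id`, `message`, `severity`, the `$VALUE` and `$CODE` metavariables, and the `focus-metavariable` lines are assumptions, and a source definition this broad will likely need tuning to reduce false positives.

```yaml
rules:
  - id: node-vm-untrusted-code-sketch   # hypothetical rule id
    mode: taint
    languages: [javascript, typescript]
    severity: ERROR
    message: Non-literal data flows into a node:vm execution function.
    pattern-sources:
      - patterns:
          # Any const whose initializer is not a hardcoded string...
          - pattern: const $SOURCE_VAR = $VALUE;
          - pattern-not: const $SOURCE_VAR = "...";
          # ...with the initializer treated as the tainted expression (assumption).
          - focus-metavariable: $VALUE
    pattern-sinks:
      - patterns:
          - pattern-either:
              - pattern: vm.runInNewContext($CODE, ...)
              - pattern: vm.runInContext($CODE, ...)
              - pattern: vm.runInThisContext($CODE, ...)
          # Only the code argument is the sink (assumption).
          - focus-metavariable: $CODE
```

Saved under a file name of your choice (for example `vm-taint.yaml`), a rule like this can be run with `semgrep --config vm-taint.yaml <path>`.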
By using this Semgrep rule, you can scan your codebase for similar vulnerabilities and identify places where user-supplied input reaches the `node:vm` module or is otherwise used in potentially unsafe ways. This can help you proactively address security issues and improve the overall security posture of your application.
Scan Results
The scan results showed that the taint originates from the `codeUrl` variable, which is user-controlled input. It then flows through several intermediate variables, including `binary`, `stream`, and `wrappedCode`, before reaching the sink at `returnValue`.
Conclusion
Our discovery of this code injection vulnerability in Medplum through a Semgrep SAST scan underscores the importance of regular and thorough security testing in the software development life cycle. By leveraging powerful tools like Semgrep and Sourcegraph, we were able to dive deep into the application codebase and uncover critical security flaws.