github.com-robertdavidgraham-masscan_-_2019-11-07_18-58-35

TCP port scanner, spews SYN packets asynchronously, scanning entire Internet in under 5 minutes.

MASSCAN: Mass IP port scanner

This is an Internet-scale port scanner. It can scan the entire Internet in under 6 minutes, transmitting 10 million packets per second, from a single machine.

Its input/output is similar to nmap, the most famous port scanner. When in doubt, try one of those features.

Internally, it uses asynchronous transmissions, similar to port scanners like scanrand, unicornscan, and ZMap. It’s more flexible, allowing arbitrary port and address ranges.

NOTE: masscan uses its own custom TCP/IP stack. Anything other than simple port scans may cause conflict with the local TCP/IP stack. This means you need to either use the --src-ip option to run from a different IP address, or use --src-port to configure which source ports masscan uses, then also configure the internal firewall (like pf or iptables) to firewall those ports from the rest of the operating system.

This tool is free, but consider contributing money to its development:
Bitcoin wallet address: 1MASSCANaHUiyTtR3bJ2sLGuMw5kDBaj4T

Building

On Debian/Ubuntu, it goes something like this:

$ sudo apt-get install git gcc make libpcap-dev
$ git clone https://github.com/robertdavidgraham/masscan
$ cd masscan
$ make

This puts the program in the masscan/bin subdirectory. You’ll have to manually copy it to something like /usr/local/bin if you want to install it elsewhere on the system.

The source consists of a lot of small files, so building goes a lot faster by using the multi-threaded build:

$ make -j

While Linux is the primary target platform, the code runs well on many other systems. Here’s some additional build info:

  • Windows w/ Visual Studio: use the VS10 project
  • Windows w/ MinGW: just type make
  • Windows w/ cygwin: won’t work
  • Mac OS X w/ XCode: use the XCode4 project
  • Mac OS X w/ cmdline: just type make
  • FreeBSD: type gmake
  • other: try just compiling all the files together

PF_RING

To get beyond 2 million packets/second, you need an Intel 10-gbps Ethernet adapter and a special driver known as “PF_RING ZC” from ntop. Masscan doesn’t need to be rebuilt in order to use PF_RING. To use PF_RING, you need to build the following components:

  • libpfring.so (installed in /usr/lib/libpfring.so)
  • pf_ring.ko (their kernel driver)
  • ixgbe.ko (their version of the Intel 10-gbps Ethernet driver)

You don’t need to build their version of libpcap.so.

When Masscan detects that an adapter is named something like zc:enp1s0 instead of something like enp1s0, it’ll automatically switch to PF_RING ZC mode.

Regression testing

The project contains a built-in self-test:

$ make regress
bin/masscan --regress
selftest: success!

This tests a lot of tricky bits of the code. You should do this after building.

Performance testing

To test performance, run something like the following:

$ bin/masscan 0.0.0.0/4 -p80 --rate 100000000 --router-mac 66-55-44-33-22-11

The bogus --router-mac keeps packets on the local network segment so that they won’t go out to the Internet.

You can also test in “offline” mode, which is how fast the program runs without the transmit overhead:

$ bin/masscan 0.0.0.0/4 -p80 --rate 100000000 --offline

This second benchmark shows roughly how fast the program would run if it were using PF_RING, which has near zero overhead.

Usage

Usage is similar to nmap. To scan a network segment for some ports:

# masscan -p80,8000-8100 10.0.0.0/8

This will:

  • scan the 10.x.x.x subnet, all 16 million addresses
  • scan port 80 and the range 8000 to 8100, or 102 ports total
  • print output to <stdout> that can be redirected to a file

To see the complete list of options, use the --echo feature. This dumps the current configuration and exits. This output can be used as input back into the program:

# masscan -p80,8000-8100 10.0.0.0/8 --echo > xxx.conf
# masscan -c xxx.conf --rate 1000

Banner checking

Masscan can do more than just detect whether ports are open. It can also complete the TCP connection and interact with the application at that port in order to grab simple “banner” information.

The problem with this is that masscan contains its own TCP/IP stack separate from the system you run it on. When the local system receives a SYN-ACK from the probed target, it responds with a RST packet that kills the connection before masscan can grab the banner.

The easiest way to prevent this is to assign masscan a separate IP address. This would look like the following:

# masscan 10.0.0.0/8 -p80 --banners --source-ip 192.168.1.200

The address you choose has to be on the local subnet and not otherwise be used by another system.

In some cases, such as WiFi, this isn’t possible. In those cases, you can firewall the port that masscan uses. This prevents the local TCP/IP stack from seeing the packet, but masscan still sees it since it bypasses the local stack. For Linux, this would look like:

# iptables -A INPUT -p tcp --dport 61000 -j DROP
# masscan 10.0.0.0/8 -p80 --banners --source-port 61000

You probably want to pick ports that don’t conflict with ports Linux might otherwise choose for source-ports. You can see the range Linux uses, and reconfigure that range, by looking in the file:

/proc/sys/net/ipv4/ip_local_port_range

On the latest version of Kali Linux (2018-August), that range is 32768 to 60999, so you should choose ports either below 32768 or 61000 and above.
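The check described above can be sketched in C. This is only an illustration: parse_port_range() and is_safe_source_port() are hypothetical helpers, not part of masscan, and the sketch assumes the file holds two whitespace-separated decimal numbers.

```c
#include <assert.h>
#include <stdio.h>

/* Parse text in the format assumed for /proc/sys/net/ipv4/ip_local_port_range,
 * e.g. "32768\t60999\n". Returns 1 on success, 0 on malformed input. */
int parse_port_range(const char *text, unsigned *first, unsigned *last)
{
    assert(text != NULL && first != NULL && last != NULL);
    if (sscanf(text, "%u %u", first, last) != 2)
        return 0;
    return *first <= *last && *last <= 65535;
}

/* Returns 1 if `port` is a valid TCP port outside [first, last], i.e. a port
 * the local stack would not normally pick as an ephemeral source port. */
int is_safe_source_port(unsigned port, unsigned first, unsigned last)
{
    if (port == 0 || port > 65535)
        return 0;
    return port < first || port > last;
}
```

With the Kali range quoted above, port 61000 passes the check and port 40000 does not.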

Setting an iptables rule only lasts until the next reboot. You need to look up how to save the configuration depending upon your distro, such as using iptables-save and/or iptables-persistent.

On Mac OS X and BSD, there are similar steps. To find out the ranges to avoid, use a command like the following:

# sysctl net.inet.ip.portrange.first net.inet.ip.portrange.last

On FreeBSD and older MacOS, use an ipfw command:

# sudo ipfw add 1 deny tcp from any to any 40000 in
# masscan 10.0.0.0/8 -p80 --banners --source-port 40000

On newer MacOS and OpenBSD, use the pf packet-filter utility. Edit the file /etc/pf.conf to add a line like the following:

block in proto tcp from any to any port 40000

Then to enable the firewall, run the command:

# pfctl -E

If the firewall is already running, then either reboot or reload the rules with the following command:

# pfctl -f /etc/pf.conf

Windows doesn’t respond with RST packets, so neither of these techniques is necessary. However, masscan is still designed to work best using its own IP address, so you should run that way when possible, even when it’s not strictly necessary.

The same thing is needed for other checks, such as the --heartbleed check, which is just a form of banner checking.

How to scan the entire Internet

While useful for smaller, internal networks, the program is really designed with the entire Internet in mind. It might look something like this:

# masscan 0.0.0.0/0 -p0-65535

Scanning the entire Internet is bad. For one thing, parts of the Internet react badly to being scanned. For another thing, some sites track scans and add you to a ban list, which will get you firewalled from useful parts of the Internet. Therefore, you want to exclude a lot of ranges. To blacklist or exclude ranges, you want to use the following syntax:

# masscan 0.0.0.0/0 -p0-65535 --excludefile exclude.txt

This just prints the results to the command-line. You probably want them saved to a file instead. Therefore, you want something like:

# masscan 0.0.0.0/0 -p0-65535 -oX scan.xml

This saves the results in an XML file, allowing you to easily dump the results in a database or something.

But, this only goes at the default rate of 100 packets/second, which will take forever to scan the Internet. You need to speed it up as so:

# masscan 0.0.0.0/0 -p0-65535 --max-rate 100000

This increases the rate to 100,000 packets/second, which will scan the entire Internet (minus excludes) in about 10 hours per port (or 655,360 hours if scanning all ports).
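The arithmetic behind these estimates is simple enough to sketch. The helper below is hypothetical and assumes one probe per (address, port) pair at a constant rate. The full 2^32 address space works out to roughly 11.9 hours per port at this rate; the 3.6 billion address count used in the usage note is only an illustrative assumption for "minus excludes", not a figure from masscan.

```c
#include <assert.h>

/* Estimated scan duration in hours: one probe per (address, port) pair,
 * sent at a constant `rate` in packets per second. */
double scan_hours(double addresses, double ports, double rate)
{
    assert(rate > 0.0);
    return addresses * ports / rate / 3600.0;
}
```

For example, scan_hours(3.6e9, 1, 100000) is about 10 hours, and scan_hours(3.6e9, 65536, 100000) is about 655,360 hours.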

The thing to notice about this command-line is that these are all nmap compatible options. In addition, “invisible” options compatible with nmap are also set for you: -sS -Pn -n --randomize-hosts --send-eth. Likewise, the format of the XML file is inspired by nmap. There are, of course, a lot of differences, because the asynchronous nature of the program leads to a fundamentally different approach to the problem.

The above command-line is a bit cumbersome. Instead of putting everything on the command-line, it can be stored in a file instead. The above settings would look like this:

# My Scan
rate =  100000.00
output-format = xml
output-status = all
output-filename = scan.xml
ports = 0-65535
range = 0.0.0.0-255.255.255.255
excludefile = exclude.txt

To use this configuration file, use the -c option:

# masscan -c myscan.conf

This also makes things easier when you repeat a scan.

By default, masscan first loads the configuration file /etc/masscan/masscan.conf. Any later configuration parameters override what’s in this default configuration file. That’s where I put my “excludefile” parameter, so that I don’t ever forget it. It just works automatically.
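To make the "key = value" layout of the configuration file concrete, here is a minimal sketch of how one line could be split. parse_config_line() is a hypothetical helper written for this illustration; it is not masscan's actual parser, and it assumes that lines starting with '#' are comments, as in the example file above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Split one "key = value" line in place. Returns 1 and sets *key and *value
 * on success; returns 0 for blank lines, '#' comments, and lines without
 * '=' or with an empty key. */
int parse_config_line(char *line, char **key, char **value)
{
    char *eq, *end;

    assert(line != NULL && key != NULL && value != NULL);

    while (isspace((unsigned char)*line))
        line++;
    if (*line == '\0' || *line == '#')
        return 0;

    eq = strchr(line, '=');
    if (eq == NULL)
        return 0;

    /* Trim trailing whitespace from the key and terminate it. */
    end = eq;
    while (end > line && isspace((unsigned char)end[-1]))
        end--;
    if (end == line)
        return 0;
    *end = '\0';

    /* Skip whitespace after '=' and trim the end of the value. */
    eq++;
    while (isspace((unsigned char)*eq))
        eq++;
    end = eq + strlen(eq);
    while (end > eq && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';

    *key = line;
    *value = eq;
    return 1;
}
```

Applied to the line "rate =  100000.00", this yields the key "rate" and the value "100000.00".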

Getting output

By default, masscan produces fairly large text files, but it’s easy to convert them into any other format. There are five supported output formats:

  1. xml: Just use the parameter -oX <filename>. Or, use the parameters --output-format xml and --output-filename <filename>.

  2. binary: This is the masscan builtin format. It produces much smaller files, so that when I scan the Internet my disk doesn’t fill up. They need to be parsed, though. The command line option --readscan will read binary scan files. Using --readscan with the -oX option will produce an XML version of the results file.

  3. grepable: This is an implementation of the Nmap -oG output that can be easily parsed by command-line tools. Just use the parameter -oG <filename>. Or, use the parameters --output-format grepable and --output-filename <filename>.

  4. json: This saves the results in JSON format. Just use the parameter -oJ <filename>. Or, use the parameters --output-format json and --output-filename <filename>.

  5. list: This is a simple list with one host and port pair per line. Just use the parameter -oL <filename>. Or, use the parameters --output-format list and --output-filename <filename>. The format is:

    open tcp 80 XXX.XXX.XXX.XXX 1390380064
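A line in the list format is easy to parse with sscanf(). The sketch below assumes the five fields shown above, in that order (status, protocol, port, IP address, Unix timestamp). The struct and function names are hypothetical, and skipping lines that start with '#' is an assumption about comment lines in the file.

```c
#include <assert.h>
#include <stdio.h>

struct list_record {
    char status[16];         /* e.g. "open" */
    char proto[8];           /* e.g. "tcp" */
    unsigned port;
    char ip[16];             /* dotted-quad IPv4 text */
    unsigned long timestamp; /* Unix time in seconds */
};

/* Parse one line of list output. Returns 1 if all five fields were read,
 * 0 for comment lines, blank lines, and malformed input. */
int parse_list_line(const char *line, struct list_record *rec)
{
    assert(line != NULL && rec != NULL);
    if (line[0] == '#')
        return 0;
    return sscanf(line, "%15s %7s %u %15s %lu",
                  rec->status, rec->proto, &rec->port,
                  rec->ip, &rec->timestamp) == 5;
}
```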

Comparison with Nmap

Where reasonable, every effort has been taken to make the program familiar to nmap users, even though it’s fundamentally different. Two important differences are:

  • no default ports to scan, you must specify -p
  • target hosts are IP addresses or simple ranges, not DNS names, nor the funky subnet ranges nmap can use (like 10.0.0-255.0-255).

You can think of masscan as having the following settings permanently enabled:

  • -sS: this does SYN scan only (currently, will change in the future)
  • -Pn: doesn’t ping hosts first, which is fundamental to the async operation
  • -n: no DNS resolution happens
  • --randomize-hosts: scan completely randomized
  • --send-eth: sends using raw libpcap

If you want a list of additional nmap compatible settings, use the following command:

# masscan --nmap

Transmit rate (IMPORTANT!!)

This program spews out packets very fast. On Windows, or from VMs, it can do 300,000 packets/second. On Linux (no virtualization) it’ll do 1.6 million packets-per-second. That’s fast enough to melt most networks.

Note that it’ll only melt your own network. It randomizes the target IP addresses so that it shouldn’t overwhelm any distant network.

By default, the rate is set to 100 packets/second. To increase the rate to a million use something like --rate 1000000.

Design

This section describes the major design issues of the program.

Code Layout

The file main.c contains the main() function, as you’d expect. It also contains the transmit_thread() and receive_thread() functions. These functions have been deliberately flattened and heavily commented so that you can read the design of the program simply by stepping line-by-line through each of these.

Asynchronous

This is an asynchronous design. In other words, it is to nmap what the nginx web-server is to Apache. It has separate transmit and receive threads that are largely independent from each other. It’s the same sort of design found in scanrand, unicornscan, and ZMap.

Because it’s asynchronous, it runs as fast as the underlying packet transmit allows.

Randomization

A key difference between Masscan and other scanners is the way it randomizes targets.

The fundamental principle is to have a single index variable that starts at zero and is incremented by one for every probe. In C code, this is expressed as:

for (i = 0; i < range; i++) {
    scan(i);
}

We have to translate the index into an IP address. Let’s say that you want to scan all “private” IP addresses. That would be the table of ranges like:

192.168.0.0/16
10.0.0.0/8
172.16.0.0/12

In this example, the first 64k indexes are appended to 192.168.x.x to form the target address. Then, the next 16-million are appended to 10.x.x.x. The remaining indexes in the range are applied to 172.16.x.x.
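To see where the 64k and 16-million figures come from, each CIDR block can be expanded into an inclusive address range. The sketch below uses hypothetical names (struct ip_range, cidr_to_range(), range_size()) and is not taken from masscan's source. For the three blocks above the sizes are 65,536, 16,777,216, and 1,048,576, for 17,891,328 indexes in total.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct ip_range {
    uint32_t begin;   /* first address, host byte order */
    uint32_t end;     /* last address, inclusive */
};

/* Parse "a.b.c.d/len" into an inclusive range. Returns 1 on success,
 * 0 if the text is malformed or a field is out of bounds. */
int cidr_to_range(const char *cidr, struct ip_range *out)
{
    unsigned a, b, c, d, len;
    uint32_t addr, mask;

    assert(cidr != NULL && out != NULL);
    if (sscanf(cidr, "%u.%u.%u.%u/%u", &a, &b, &c, &d, &len) != 5)
        return 0;
    if (a > 255 || b > 255 || c > 255 || d > 255 || len > 32)
        return 0;

    addr = ((uint32_t)a << 24) | ((uint32_t)b << 16)
         | ((uint32_t)c << 8)  |  (uint32_t)d;
    /* A shift by 32 is undefined in C, so /0 is handled separately. */
    mask = (len == 0) ? 0 : (uint32_t)(0xFFFFFFFFu << (32 - len));

    out->begin = addr & mask;
    out->end   = out->begin | (uint32_t)~mask;
    return 1;
}

/* Number of addresses in the range; 64-bit so that 0.0.0.0/0 fits. */
uint64_t range_size(const struct ip_range *r)
{
    return (uint64_t)r->end - r->begin + 1;
}
```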

In this example, we only have three ranges. When scanning the entire Internet, we have in practice more than 100 ranges. That’s because you have to blacklist or exclude a lot of sub-ranges. This chops up the desired range into hundreds of smaller ranges.

This leads to one of the slowest parts of the code. We transmit 10 million packets per second, and have to convert an index variable to an IP address for each and every probe. We solve this by doing a “binary search” in a small amount of memory. At this packet rate, cache efficiencies start to dominate over algorithm efficiencies. There are a lot of more efficient techniques in theory, but they all require so much memory as to be slower in practice.

We call the function that translates from an index into an IP address the pick() function. In use, it looks like:

for (i = 0; i < range; i++) {
    ip = pick(addresses, i);
    scan(ip);
}
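A minimal sketch of such a pick() function follows, assuming a compact table in which each entry stores the first address of a range and the scan index at which that range starts. The struct layout and field names are assumptions made for illustration; masscan's real data structures may differ. The binary search finds the last entry whose starting index is not greater than the requested index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range_entry {
    uint32_t begin;        /* first address in this range */
    uint64_t first_index;  /* scan index that maps to `begin` */
};

struct range_table {
    const struct range_entry *entries;  /* sorted by first_index, entries[0] at 0 */
    size_t count;
    uint64_t total;                     /* number of addresses in all ranges */
};

/* Translate a scan index into an IP address with a binary search. */
uint32_t pick(const struct range_table *t, uint64_t index)
{
    size_t lo = 0, hi;

    assert(t != NULL && t->count > 0 && index < t->total);
    hi = t->count - 1;

    /* Invariant: entries[lo].first_index <= index. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;   /* upper middle, always > lo */
        if (t->entries[mid].first_index <= index)
            lo = mid;
        else
            hi = mid - 1;
    }
    return t->entries[lo].begin
         + (uint32_t)(index - t->entries[lo].first_index);
}
```

With the three private ranges from the earlier example, index 65535 maps to 192.168.255.255 and index 65536 maps to 10.0.0.0.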

Masscan supports not only IP address ranges, but also port ranges. This means we need to pick from the index variable both an IP address and a port. This is fairly straightforward:

range = ip_count * port_count;
for (i = 0; i < range; i++) {
    ip   = pick(addresses, i / port_count);
    port = pick(ports,     i % port_count);
    scan(ip, port);
}

This leads to another expensive part of the code. The division/modulus instructions are around 90 clock cycles, or 30 nanoseconds, on x86 CPUs. When transmitting at a rate of 10 million packets/second, we have only 100 nanoseconds per packet. I see no way to optimize this any better. Luckily, though, two such operations can be executed simultaneously, so doing two of these as shown above is no more expensive than doing one.

There are actually some easy optimizations for the above performance problems, but they all rely upon i++, the fact that the index variable increases one by one through the scan. Actually, we need to randomize this variable. We need to randomize the order of IP addresses that we scan or we’ll blast the heck out of target networks that aren’t built for this level of speed. We need to spread our traffic evenly over the target.

The way we randomize is simply by encrypting the index variable. By definition, encryption is random, and creates a 1-to-1 mapping between the original index variable and the output. This means that while we linearly go through the range, the output IP addresses are completely random. In code, this looks like:

range = ip_count * port_count;
for (i = 0; i < range; i++) {
    x = encrypt(i);
    ip   = pick(addresses, x / port_count);
    port = pick(ports,     x % port_count);
    scan(ip, port);
}

This also has a major cost. Since the range is an unpredictable size instead of a nice even power of 2, we can’t use cheap binary techniques like AND (&) and XOR (^). Instead, we have to use expensive operations like MODULUS (%). In my current benchmarks, it’s taking 40 nanoseconds to encrypt the variable.
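To illustrate what a 1-to-1 "encryption" over an arbitrary range can look like, here is a sketch that is deliberately not masscan's cipher. It swaps in a different, generic technique: a small Feistel network over the next power-of-two domain, combined with "cycle-walking" (re-encrypting until the result falls back inside the range). The round function, the round count of 4, and the seed parameter are all arbitrary choices for this example, and it is not cryptographically strong.

```c
#include <assert.h>
#include <stdint.h>

/* Round function: any deterministic mix of (half, round, seed) works,
 * because a Feistel network is invertible regardless of this function. */
static uint32_t round_fn(uint32_t half, unsigned round, uint32_t seed)
{
    uint32_t x = half ^ seed ^ (round * 0x9E3779B9u);
    x ^= x >> 16;
    x *= 0x85EBCA6Bu;
    x ^= x >> 13;
    x *= 0xC2B2AE35u;
    x ^= x >> 16;
    return x;
}

/* Map index i to another index in [0, range) such that the mapping is a
 * permutation of [0, range). */
uint64_t encrypt(uint64_t i, uint64_t range, uint32_t seed)
{
    unsigned half_bits = 1;
    unsigned r;
    uint32_t half_mask, left, right;

    assert(range > 0 && range <= ((uint64_t)1 << 62) && i < range);

    /* Smallest block of 2*half_bits bits whose domain covers the range. */
    while (half_bits < 31 && ((uint64_t)1 << (2 * half_bits)) < range)
        half_bits++;
    half_mask = (uint32_t)(((uint64_t)1 << half_bits) - 1);

    do {
        left  = (uint32_t)(i >> half_bits) & half_mask;
        right = (uint32_t)i & half_mask;
        for (r = 0; r < 4; r++) {
            uint32_t next = left ^ (round_fn(right, r, seed) & half_mask);
            left = right;
            right = next;
        }
        i = ((uint64_t)left << half_bits) | right;
    } while (i >= range);   /* cycle-walk until back inside [0, range) */

    return i;
}
```

Because the Feistel network is a permutation of the power-of-two domain and every starting index lies inside the range, the cycle-walk always terminates and each index in the range is produced exactly once. The power-of-two domain is at most four times the range, so only a few extra iterations are expected per call.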

This architecture allows for lots of cool features. For example, it supports “shards”. You can setup 5 machines each doing a fifth of the scan, or range / shard_count. Shards can be multiple machines, or simply multiple network adapters on the same machine, or even (if you want) multiple IP source addresses on the same network adapter.

Or, you can use a ‘seed’ or ‘key’ to the encryption function, so that you get a different order each time you scan, like x = encrypt(seed, i).
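One simple way to split the index space between shards is to let each shard stride through it, starting at its own shard number and stepping by the shard count. The sketch below assumes that scheme; the text above only says each shard does range / shard_count of the work, so the striding, the 0-based shard numbering, and the function names are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of indexes that shard `shard` (0-based) visits when it walks
 * i = shard, shard + shard_count, ... while i < range. */
uint64_t shard_probe_count(uint64_t range, uint64_t shard, uint64_t shard_count)
{
    assert(shard_count > 0 && shard < shard_count);
    if (shard >= range)
        return 0;
    return (range - shard - 1) / shard_count + 1;
}

/* Write up to `cap` of this shard's indexes into `out`, in scan order.
 * Returns how many were written. */
size_t shard_indexes(uint64_t range, uint64_t shard, uint64_t shard_count,
                     uint64_t *out, size_t cap)
{
    size_t n = 0;
    uint64_t i;

    assert(shard_count > 0 && shard < shard_count && out != NULL);
    for (i = shard; i < range && n < cap; i += shard_count)
        out[n++] = i;
    return n;
}
```

Since every index is later passed through the encryption step, each shard still probes targets that are spread across the whole range, and no two shards probe the same index.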

We can also pause the scan by exiting out of the program, and simply remembering the current value of i, and restart it later. I do that a lot during development. I see something going wrong with my Internet scan, so I hit <ctrl-c> to stop the scan, then restart it after I’ve fixed the bug.

Another feature is retransmits/retries. Packets sometimes get dropped on the Internet, so you can send two packets back-to-back. However, something that drops one packet may drop the immediately following packet. Therefore, you want to send the copy about 1 second later. This is simple. We already have a ‘rate’ variable, which is the number of packets per second we are transmitting at, so the retransmit function is simply to use i + rate as the index. One of these days I’m going to do a study of the Internet, and differentiate “back-to-back”, “1 second”, “10 second”, and “1 minute” retransmits this way in order to see if there is any difference in what gets dropped.

C10 Scalability

The asynchronous technique is known as a solution to the “c10k problem”. Masscan is designed for the next level of scalability, the “C10M problem”.

The C10M solution is to bypass the kernel. There are three primary kernel bypasses in Masscan:

  • custom network driver
  • user-mode TCP stack
  • user-mode synchronization

Masscan can use the PF_RING DNA driver. This driver DMAs packets directly from user-mode memory to the network driver with zero kernel involvement. That allows software, even with a slow CPU, to transmit packets at the maximum rate the hardware allows. If you put 8 10-gbps network cards in a computer, this means it could transmit at 100-million packets/second.

Masscan has its own built-in TCP stack for grabbing banners from TCP connections. This means it can easily support 10 million concurrent TCP connections, assuming of course that the computer has enough memory.

Masscan has no “mutex”. Modern mutexes (aka. futexes) are mostly user-mode, but they have two problems. The first problem is that they cause cache-lines to bounce quickly back-and-forth between CPUs. The second is that when there is contention, they’ll do a system call into the kernel, which kills performance. Mutexes on the fast path of a program severely limit scalability. Instead, Masscan uses “rings” to synchronize things, such as when the user-mode TCP stack in the receive thread needs to transmit a packet without interfering with the transmit thread.
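As an illustration of the general idea, here is a minimal single-producer/single-consumer ring written with C11 atomics. It is a sketch, not masscan's ring implementation: the names, the fixed size of 256 slots, and the use of <stdatomic.h> (which needs a C11 compiler rather than C90) are all assumptions. One thread may call ring_push() and one other thread may call ring_pop(), with no locks and no system calls on either side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 256u   /* must be a power of two */

struct ring {
    atomic_size_t head;        /* next slot to write; stored only by the producer */
    atomic_size_t tail;        /* next slot to read; stored only by the consumer */
    void *slots[RING_SIZE];
};

void ring_init(struct ring *r)
{
    assert(r != NULL);
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side. Returns 1 on success, 0 if the ring is full. */
int ring_push(struct ring *r, void *item)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return 0;
    r->slots[head & (RING_SIZE - 1)] = item;
    /* Release: the slot write becomes visible before the new head does. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 1;
}

/* Consumer side. Returns 1 and stores the item, or 0 if the ring is empty. */
int ring_pop(struct ring *r, void **item)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return 0;
    *item = r->slots[tail & (RING_SIZE - 1)];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 1;
}
```

In the scenario described above, the receive thread would be the producer, pushing packets it wants sent, and the transmit thread would be the consumer, popping and sending them between its own probes.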

Portability

The code runs well on Linux, Windows, and Mac OS X. All the important bits are in standard C (C90). It therefore compiles on Visual Studio with Microsoft’s compiler, the Clang/LLVM compiler on Mac OS X, and GCC on Linux.

Windows and Macs aren’t tuned for packet transmit, and get only about 300,000 packets-per-second, whereas Linux can do 1,500,000 packets/second. That’s probably faster than you want anyway.

Safe code

A bounty is offered for vulnerabilities, see the VULNINFO.md file for more information.

This project uses safe functions like strcpy_s() instead of unsafe functions like strcpy().

This project has automated unit regression tests (make regress).

Compatibility

A lot of effort has gone into making the input/output look like nmap, which everyone who does port scans is (or should be) familiar with.

Authors

This tool created by Robert Graham:
email: robertdavidgraham@yahoo.com
twitter: @ErrataRob

To restore the repository download the bundle

wget https://archive.org/download/github.com-robertdavidgraham-masscan_-_2019-11-07_18-58-35/robertdavidgraham-masscan_-_2019-11-07_18-58-35.bundle

and run:

 git clone robertdavidgraham-masscan_-_2019-11-07_18-58-35.bundle 

Source: https://github.com/robertdavidgraham/masscan
Uploader: robertdavidgraham
Upload date: 2019-11-07