By now you've probably come to realize that security can be a difficult thing. You've fixed a few security flaws in project 1, and every year thousands more like it are fixed in real world software, yet fixing flaws does not prove the security of a system. Moreover, flawed software often remains in use for quite some time, even when its flaws are known and a fixed version is available. Your job for this project is to develop a network intrusion detection system that monitors network packets for exploit attempts and notifies the user of any suspicious activity.
Using Java, create a network intrusion detection system (IDS) which is capable of monitoring traffic to or from a single host on the network. You should use the jpcap library to receive packets, and regular expressions to create your rules. Since you will not have root access to the host in question, you will not monitor live network traffic; instead, you will read in a trace file of captured packets. You are still simulating a program that would process network traffic as it appears, however, and thus you should not need to process the packet trace more than once. Your program should take two command-line arguments: the first is a rule file (whose syntax is defined below), and the second is the pcap trace file, several of which will be provided for your testing. Your IDS should process each packet in the trace, and as rules are matched print an alert (which should include the name of the matched rule) to standard out. One packet may figure into any number of alerts, so your IDS should not stop processing a packet or stream when a single rule is matched.
There are two types of rules: stream rules and protocol rules. Stream rules require you to reconstruct the send or receive stream of a TCP connection, and then apply a regular expression to the entire stream. Stevens' TCP/IP Illustrated and Unix Network Programming are good references if you are unfamiliar with how TCP reconstruction works. Protocol rules specify an exchange of messages which are assumed to be in single packets (a naive assumption it turns out, but a decent first step). The sub-rules may match the flags, body, or both on each packet. Each sub-rule must be matched in order and with no intermediate packets to/from the same ports and IPs for a protocol rule to match. In particular, once you have ordered the TCP packets (which may arrive out of order) the sub-rules match the packets in TCP sequence order.
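The reconstruction step can be pictured as buffering each data segment under its sequence number and joining the segments in order once the session is over. The following is only a minimal sketch under stated assumptions: the class name `StreamSketch` is hypothetical, sequence numbers are treated as plain offsets that never wrap around, payloads have already been converted to strings, and gaps in the stream are simply ignored.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: collects the payloads of one direction of one TCP
// connection and joins them in sequence-number order.
final class StreamSketch {
    // Sorted by sequence number; a retransmission overwrites its earlier copy.
    private final TreeMap<Long, String> segments = new TreeMap<Long, String>();

    // Record a data-carrying segment. Segments with no payload are skipped.
    void addSegment(long seq, String payload) {
        if (payload.length() > 0) {
            segments.put(seq, payload);
        }
    }

    // Concatenate buffered payloads in order, trimming overlapping prefixes.
    String assemble() {
        StringBuilder stream = new StringBuilder();
        long next = -1; // sequence number expected next; -1 until the first segment
        for (Map.Entry<Long, String> e : segments.entrySet()) {
            long seq = e.getKey();
            String data = e.getValue();
            if (next >= 0 && seq < next) {
                // Part of this segment is already in the stream: drop that part.
                long overlap = next - seq;
                if (overlap >= data.length()) {
                    continue;
                }
                data = data.substring((int) overlap);
                seq = next;
            }
            // A gap (seq > next) is ignored in this sketch.
            stream.append(data);
            next = seq + data.length();
        }
        return stream.toString();
    }
}
```

A real implementation would also have to decide how to handle sequence-number wraparound, missing segments, and the SYN and FIN flags, each of which consumes one sequence number.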
Exception: TCP protocol rules should allow any number of acknowledgments (ACKs) between the packets that match the sub-rules, as long as those ACKs carry no other data. For simplicity, you may skip such packets regardless of their sequence numbers, as long as they fall between two sub-rules of a TCP protocol rule. Note that you may still need to process these packets in other ways for other reasons depending on your implementation strategy; similarly, you should not in general ignore sequence numbers, as they are a crucial part of TCP.
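The ACK exception can be sketched as a loop that skips data-free ACKs only between two sub-rules. This is an illustration under assumptions, not a reference design: `Pkt` is a hypothetical stand-in for a packet that has already been placed in sequence order, each sub-rule is reduced to a direction and a regular expression, `Pattern.DOTALL` is an assumed choice so that `.` can match line terminators, and `with flags` clauses are left out entirely.

```java
import java.util.List;
import java.util.regex.Pattern;

final class AckSkipSketch {
    // Hypothetical stand-in for a TCP packet already placed in sequence order.
    static final class Pkt {
        final String dir;   // "send" or "recv"
        final String flags; // e.g. "A", "PA", "S"
        final String body;  // payload as a string, possibly empty
        Pkt(String dir, String flags, String body) {
            this.dir = dir;
            this.flags = flags;
            this.body = body;
        }
        // An acknowledgment that carries no data and no other flags.
        boolean isPureAck() {
            return flags.equals("A") && body.length() == 0;
        }
    }

    // subRules[i] = { direction, regexp }. Returns true if the sub-rules match
    // consecutive packets starting at index start, where pure ACKs that fall
    // between two sub-rules are skipped.
    static boolean matchesAt(List<Pkt> pkts, int start, String[][] subRules) {
        int i = start;
        for (int r = 0; r < subRules.length; r++) {
            if (r > 0) {
                while (i < pkts.size() && pkts.get(i).isPureAck()) {
                    i++;
                }
            }
            if (i >= pkts.size()) {
                return false;
            }
            Pkt p = pkts.get(i);
            boolean ok = p.dir.equals(subRules[r][0])
                && Pattern.compile(subRules[r][1], Pattern.DOTALL).matcher(p.body).find();
            if (!ok) {
                return false;
            }
            i++;
        }
        return true;
    }
}
```

Note that this sketch would also skip a pure ACK that the next sub-rule was meant to match; whether and how to handle that case is one of the interpretation choices worth recording in your documentation.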
For this project it is acceptable to wait until a TCP session is finished (or there are no more packets to process) before checking for matches to TCP stream and protocol rules. Extra credit will be given to projects that give warnings as soon as they are applicable, as long as the rest of the project is fully implemented. You may use tools like JLex and CUP, if you are familiar with them, to construct your rule file parser, but the grammar has been designed to be easily parsed without such tools.
Your program must work on the eniac.seas.upenn.edu machine pool.
This project is to be done in groups of two or three, with no exceptions. One member of each group should e-mail the names and e-mail addresses of everyone in their group to vaughan2 @ seas.upenn.edu by Thursday, February 22.
When submitting this project, please only do so from one group member's account. If you submit from multiple accounts you must e-mail the course staff before the submission deadline telling us which one to grade.
Your final submission should consist of a well commented program, a Makefile (see the discussion of makefiles below), and a README giving a brief description of each file in your directory. You should also submit a file documenting the design of your intrusion detection system, and how you evaluated its correctness (e.g. extra rulefiles). Please submit your documentation as text or PDF. Do not submit any object code (i.e. .class files), including either of the library files linked to from this page.
The Java class that contains your main method should be named IDS; you can name other classes however you'd like, but each name should reflect the purpose of its class. Your archive should be named username.tar.gz and expand into a directory that matches your username. For example, if your username is adent and you've been placing your files in ~/cis551/project2, you could run the following commands:
adent@minus:~/cis551/project2$ make clean
adent@minus:~/cis551/project2$ cd ..
adent@minus:~/cis551$ mv project2 adent
adent@minus:~/cis551$ tar czf adent.tar.gz adent
You would then submit the file adent.tar.gz.
Regular expressions are a standard tool used for pattern matching; in this project, they will be used to specify patterns either in individual packets or in reconstructed TCP streams which should trigger an alert. Java provides several classes for dealing with regular expressions; an introduction can be found in the Java API entry for the Pattern class. You will need to import the package java.util.regex in order to use the Pattern and Matcher classes. Note that, unlike the examples given in the API entry for Pattern, you may find the method Matcher.find to be more useful than Matcher.matches.
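The difference between the two methods is that Matcher.find succeeds if the pattern occurs anywhere in the input, while Matcher.matches succeeds only if the pattern accounts for the whole input. A small sketch follows; the class and method names are illustrative, and compiling with `Pattern.DOTALL` is an assumption made here so that `.` also matches line terminators inside packet data.

```java
import java.util.regex.Pattern;

final class FindVsMatches {
    // True if the pattern occurs anywhere in the input.
    static boolean occursIn(String regex, String input) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(input).find();
    }

    // True only if the pattern accounts for the entire input.
    static boolean coversAll(String regex, String input) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(input).matches();
    }

    public static void main(String[] args) {
        String stream = "GET / HTTP/1.0\r\nNow I own your computer\r\n";
        System.out.println(occursIn("Now I own your computer", stream));  // prints true
        System.out.println(coversAll("Now I own your computer", stream)); // prints false
    }
}
```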
If you are having difficulty understanding regular expressions, you might try reading some of the many tutorials available on the web. Although Java handles regular expressions through objects and methods, the actual regular expression syntax is nearly identical to that used in scripting languages like Perl and Python.
A rule file consists of exactly one host entry (which must be first), and arbitrarily many rule entries. The grammar for the configuration file is:
<file> ::= <host><rule>*
<host> ::= host=<ip>\n\n
<rule> ::= name=<string>\n
           (<tcp_stream_rule>|<protocol_rule>)\n
<tcp_stream_rule> ::= type=tcp_stream\n
                      src_port=(any|<port>)\n
                      dst_port=(any|<port>)\n
                      ip=(any|<ip>)\n
                      (send|recv)=<regexp>\n
<protocol_rule> ::= type=protocol\n
                    proto=(tcp|udp)\n
                    src_port=(any|<port>)\n
                    dst_port=(any|<port>)\n
                    ip=(any|<ip>)\n
                    <sub_rule>
                    <sub_rule>*
<sub_rule> ::= (send|recv)=<regexp> (with flags=<flags>)?\n
<string> ::= alpha-numeric string
<ip> ::= string of form [0-255].[0-255].[0-255].[0-255]
<port> ::= string of form [0-65535]
<regexp> ::= Perl Regular Expression
<flags> ::= <flag>*
<flag> ::= S|A|F|R|P|U
Each rule begins with a name, which should be used when printing a notice every time that rule is matched by a connection or packet. "ip", "dst_port", and "recv" all refer to the remote side of a connection - these values cannot be naively matched against similarly named packet fields without first considering the direction in which the packet is traveling. (This means that for protocol rules, just as you have to switch the IP addresses when matching the "recv" subrules, you also have to switch port numbers.) The string "name" does not affect which packets are matched, but it should appear in the alert printed when a match occurs. The flags are SYN, ACK, FIN, RST, PSH, and URG, and they are present only in TCP packets.
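The direction handling described above can be sketched by first deciding whether the protected host is the sender or the receiver, and only then comparing rule fields against packet fields. This is a sketch under one reading of the rules: `src_port` is taken to be the port on the protected host and `dst_port` the port on the remote side, which is consistent with the example rules but is an interpretation. All names are hypothetical, and addresses are plain strings.

```java
final class Direction {
    // Returns "send", "recv", or null if the packet does not involve the host.
    static String directionOf(String hostIp, String pktSrcIp, String pktDstIp) {
        if (hostIp.equals(pktSrcIp)) {
            return "send";
        }
        if (hostIp.equals(pktDstIp)) {
            return "recv";
        }
        return null;
    }

    // A rule field of "any" matches every value.
    static boolean fieldMatches(String ruleValue, String actual) {
        return ruleValue.equals("any") || ruleValue.equals(actual);
    }

    // Compares the rule's src_port, dst_port and ip entries with a packet,
    // swapping the packet's source and destination when the host is receiving.
    static boolean endpointsMatch(String hostIp,
                                  String ruleSrcPort, String ruleDstPort, String ruleIp,
                                  String pktSrcIp, int pktSrcPort,
                                  String pktDstIp, int pktDstPort) {
        String dir = directionOf(hostIp, pktSrcIp, pktDstIp);
        if (dir == null) {
            return false;
        }
        boolean sending = dir.equals("send");
        String remoteIp = sending ? pktDstIp : pktSrcIp;
        int localPort = sending ? pktSrcPort : pktDstPort;
        int remotePort = sending ? pktDstPort : pktSrcPort;
        return fieldMatches(ruleSrcPort, String.valueOf(localPort))
            && fieldMatches(ruleDstPort, String.valueOf(remotePort))
            && fieldMatches(ruleIp, remoteIp);
    }
}
```

With the Plaintext POP rule (src_port=110, dst_port=any, ip=any), a packet from a client to port 110 on the host and the host's reply from port 110 both satisfy `endpointsMatch`, while a packet arriving from a remote port 110 at some other port on the host does not.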
A with flags=<flags> clause in a UDP rule is erroneous, while in a TCP rule it indicates that the matched packet must have exactly those flags, and no others.
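As the grammar was designed to be parsed without a parser generator, a single sub-rule line can be taken apart with one regular expression. The sketch below is illustrative only: the class name `SubRuleSketch` is hypothetical, it assumes the regular expression is wrapped in double quotes as in the example rules on this page, and it assumes each flag letter is listed at most once.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SubRuleSketch {
    final String direction; // "send" or "recv"
    final String regexp;    // body pattern, without the surrounding quotes
    final String flags;     // null when no "with flags=" clause was given

    // Assumes the regexp is quoted, as in the example rules.
    private static final Pattern LINE =
        Pattern.compile("^(send|recv)=\"(.*)\"( with flags=([SAFRPU]*))?$");

    SubRuleSketch(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("bad sub-rule: " + line);
        }
        direction = m.group(1);
        regexp = m.group(2);
        flags = (m.group(3) == null) ? null : m.group(4);
    }

    // "Exactly those flags, and no others": compare as sets, ignoring order.
    static boolean sameFlags(String wanted, String actual) {
        return sorted(wanted).equals(sorted(actual));
    }

    private static String sorted(String s) {
        char[] c = s.toCharArray();
        Arrays.sort(c);
        return new String(c);
    }
}
```

An empty flag list (as in the NULL scan example) parses to the empty string rather than null, which keeps "no flags allowed" distinct from "flags not checked".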
Blame Attack 1
A very simple rule which looks for the "Now I own your computer" string contained in Project 1's shellcode.
host=192.168.0.1

name=Blame Attack 1
type=protocol
proto=tcp
src_port=5551
dst_port=any
ip=any
recv="Now I own your computer"
Try with: trace1.pcap (false positive), trace2.pcap, trace3.pcap (false negative)
Blame Attack 2
Same as Blame Attack 1, except use TCP stream reconstruction.
host=192.168.0.1

name=Blame Attack 2
type=tcp_stream
src_port=5551
dst_port=any
ip=any
recv="Now I own your computer"
Try with: trace1.pcap (false positive), trace2.pcap, trace3.pcap
General Buffer Overflow
A more sophisticated rule which matches a sequence of NOPs followed by a syscall, which might be found in many buffer overflow attacks.
host=192.168.0.1

name=Blame Attack 3
type=tcp_stream
src_port=5551
dst_port=any
ip=any
recv="\x90{10}.*\xcd\x80"
Try with: trace1.pcap, trace2.pcap, trace3.pcap
Plaintext POP
Detect insecure (plaintext) logins to a mail server.
host=192.168.0.1

name=Plaintext POP
type=protocol
proto=tcp
src_port=110
dst_port=any
ip=any
send="\+OK.*\r\n"
recv="USER .*\r\n"
send="\+OK.*\r\n"
recv="PASS.*\r\n"
send="\+OK.*\r\n"
Try with: trace4.pcap
XMAS port scan
Detect someone attempting an XMAS port scan on any port.
host=192.168.0.1

name=XMAS scan
type=protocol
proto=tcp
src_port=any
dst_port=any
ip=any
recv=".*" with flags=FUP
Try with: trace5.pcap
NULL scan against webserver
Detect someone attempting a NULL port scan on the webserver port (80).
host=192.168.0.1

name=NULL scan
type=protocol
proto=tcp
src_port=80
dst_port=any
ip=any
recv=".*" with flags=
Try with: trace6.pcap
Simulated remote Linux boot
A UDP example, which might be used to detect a compromised host attempting a network boot via TFTP
host=192.168.0.1

name=TFTP remote boot
type=protocol
proto=udp
src_port=any
dst_port=69
ip=any
send="vmlinuz"
recv="\x00\x03\x00\x01"
Try with: trace7.pcap
All pcap files provided have been sanitized. The host to protect has been given the IP address 192.168.0.1, and other IPs have been changed if needed. To see the contents of these files, you can use tcpdump -r file.pcap
If you are interested in generating your own trace files, we recommend you look into ethereal, a tool for examining network traffic under both Windows and most Unix-based systems, including Linux and Mac OS X. Actually capturing network traffic will require root access, so this is best done on your personal machine.
You might also be interested in reading about port scanning techniques (for example at insecure.org) to see how some of the above examples were constructed.
Java 5.0 (sometimes called Java 1.5) updates the Java collection library with generics, which, from a user's point of view, are somewhat similar to C++ templates (although their technical details are quite a bit different). Generic collections can save you time and help you write better code; for example, if you want an ArrayList (a collection which combines the functionality of arrays and lists in an efficient manner) where every element is a 32-bit integer, you can declare a variable of type ArrayList<Integer>. Similarly, the type Hashtable<String, Boolean> denotes a hash table that always maps strings to booleans. This eliminates the instanceof checks and subsequent typecasts required when using these classes in previous versions of Java.
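A short illustration of the two types just mentioned follows. The class name and the values stored are made up for the example; the point is only that elements come back out with their declared type, with no cast from Object.

```java
import java.util.ArrayList;
import java.util.Hashtable;

final class GenericsDemo {
    // Adds up the elements of a list that can only contain Integers.
    static int sumPorts(ArrayList<Integer> ports) {
        int total = 0;
        for (int i = 0; i < ports.size(); i++) {
            Integer p = ports.get(i); // no cast from Object is needed
            total += p.intValue();
        }
        return total;
    }

    public static void main(String[] args) {
        ArrayList<Integer> ports = new ArrayList<Integer>();
        ports.add(Integer.valueOf(80));
        ports.add(Integer.valueOf(110));

        Hashtable<String, Boolean> alerted = new Hashtable<String, Boolean>();
        alerted.put("XMAS scan", Boolean.TRUE);
        Boolean seen = alerted.get("XMAS scan"); // already a Boolean

        System.out.println(sumPorts(ports) + " " + seen); // prints "190 true"
    }
}
```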
To use Java 5.0 on eniac and other SEAS machines, you must first add it to your path. Under bash, the default shell, this can be done with the command export PATH=/usr/java/jdk1.5.0/bin:$PATH - under tcsh, the command is setenv and the '=' should be replaced by a space. We will test submissions for this project using the Java 5.0 compiler, although you are not required to make use of any new features of the language. We will revert to the previous Java compiler in the unlikely event of a backwards compatibility error.
Java 5.0 has several other enhancements that you might find useful, including automatic conversion between primitive types (int) and object types (Integer), as well as an enhanced "foreach"-style for loop for dealing with collections. You can read a summary of these features here. The latest version of the Java API gives full documentation on the collection library.
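Both enhancements can be seen together in a small counting routine. This is only an illustrative sketch; the class name and the idea of counting alerts per rule name are assumptions made for the example, not part of the assignment.

```java
import java.util.HashMap;
import java.util.Map;

final class BoxingDemo {
    // Counts how many times each rule name appears in the input array.
    static Map<String, Integer> countAlerts(String[] ruleNames) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String name : ruleNames) {        // enhanced "foreach" loop
            Integer old = counts.get(name);
            int n = (old == null) ? 0 : old;   // Integer unboxed to int
            counts.put(name, n + 1);           // int boxed back to Integer
        }
        return counts;
    }
}
```

The explicit null check matters: unboxing a null Integer throws a NullPointerException.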
In order to use the jpcap library, you will need to add the file jpcap.jar to your CLASSPATH and the directory containing the file libjpcap.so to your LD_LIBRARY_PATH. If you are using bash, you can accomplish this with the export command, as in:
adent@minus:~/cis551/project2$ export CLASSPATH=~/cis551/lib/jpcap.jar:$CLASSPATH
adent@minus:~/cis551/project2$ export LD_LIBRARY_PATH=~/cis551/lib:$LD_LIBRARY_PATH
You may choose to put these files in the same directory as your project, but do not include them in the archive you submit. We will already have both files in our test environment, along with JLex and CUP.
Documentation for jpcap is available. In order to process a packet capture file, you will want to create a PacketCapture object and initialize it with the openOffline method. You can then call the addPacketListener method to enable a listener you've created - which will perform the work of your intrusion detection system - and finally call capture with a negative argument. Your program will then process packets until a CapturePacketException is thrown; you should catch this exception and terminate gracefully, as it most likely indicates the end of your capture file.
The addPacketListener method will accept any object that implements the PacketListener interface. Your listener class - which may be declared as, for example, class IDSListener implements PacketListener - must provide a method packetArrived with return type void and argument type Packet. This method will be called on each packet in the capture file (in the order in which they appear); from there you can deal with the packet in whatever way you choose.
You may also use the setFilter method to place a global filter on the packets you examine; any packets that do not match this filter will be ignored, and your packetArrived method will never be called on them. You can find the syntax of filter expressions by looking at the tcpdump man page under "expressions". Note that you do not need to use this method, as there is nothing wrong with having your PacketListener examine every packet, so if you're having difficulty constructing a filter that will match all the packets you're interested in, feel free to ignore this feature of jpcap.
Traditionally, network programmers have had to look carefully at flags and match against special constants to separate out different varieties of packets. Luckily for you, the jpcap library puts this distinction into Java's type system. The type hierarchy under Packet reflects the various different protocols used on the Internet; this allows you to use Java's instanceof operator to determine whether an IP packet (of type IPPacket) is in fact a TCP packet (of type TCPPacket). Once you have made this distinction, we highly recommend that you cast the packet to its more specific type and pass it to a method designed for that packet type. You can safely ignore any non-Ethernet packets you encounter; in fact, the current implementation of jpcap has no way of producing any such packets.
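The capture classes mentioned above (PacketCapture, PacketListener, and CapturePacketException) live in the package net.sourceforge.jpcap.capture, which you will probably want to import (import net.sourceforge.jpcap.capture.*;) at the top of any file that refers to them. Other classes in jpcap live in other packages; in particular, the packet types such as Packet, IPPacket, and TCPPacket are in net.sourceforge.jpcap.net. Details are given in the documentation.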
This project will require you to write a good deal more code than was necessary for the first project. Make sure you have a plan before you begin, especially for the reconstruction of TCP streams. Feel free to create as many auxiliary classes as you think are helpful, and divide your functionality into multiple methods along logical lines.
Each packet type defines new methods for giving you information on that variety of packet. Make use of these whenever possible; trying to interpret raw header data directly is much more likely to lead to subtle bugs. All of the packet types also include toString, toColoredString, and (sometimes) toColoredVerboseString methods; these should be very useful in debugging.
In order to match packet data against a regular expression, you will need to convert the packet data (a byte array) to a string. The class String has a constructor that takes a byte array, which should do what you want. If it does not seem to be behaving properly - which may be the case if you are using non-standard locale settings - there is also a constructor that takes, as a second argument, a character set name; try "ISO-8859-1" if you want to prevent any unintended processing of special characters. You may also use the constant 0 as the second argument in order to force correct behavior; this constructor is deprecated, but it is acceptable for use in this project.
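The reason "ISO-8859-1" is a safe choice is that it maps each byte value 0-255 to the character with the same code, so a pattern such as \x90 in a rule lines up with the byte 0x90 in the packet. A minimal sketch follows; the class and method names are hypothetical, and `Pattern.DOTALL` is an assumed choice so that `.` can also match line terminators in binary data.

```java
import java.io.UnsupportedEncodingException;
import java.util.regex.Pattern;

final class PayloadText {
    // Converts raw packet bytes to a string, one character per byte.
    static String toText(byte[] payload) {
        try {
            return new String(payload, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support ISO-8859-1.
            throw new IllegalStateException("ISO-8859-1 is not available", e);
        }
    }

    // Applies the "General Buffer Overflow" pattern from the example rules.
    static boolean looksLikeOverflow(byte[] payload) {
        Pattern p = Pattern.compile("\\x90{10}.*\\xcd\\x80", Pattern.DOTALL);
        return p.matcher(toText(payload)).find();
    }
}
```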
When constructing TCP streams, you may, especially if you are going for the extra credit, feel constrained by Java's immutable String class. If this is the case, you might want to look at StringBuffer, which provides insertion and deletion methods not present in the immutable String. (Note, however, that it is certainly possible to take a different approach to this part of the project and have no need of this class.) You may also use StringBuilder instead of StringBuffer as long as you do not use StringBuilder objects in a multithreaded way.
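One way a mutable buffer helps is that data arriving out of order can be written at its final position right away. The sketch below uses StringBuilder's setLength and replace methods for this; the helper name `writeAt` and the idea of using the offset from the start of the stream as an index are assumptions for illustration, and this is only one of several workable approaches.

```java
final class BufferDemo {
    // Writes data at the given offset. If the buffer is too short it is first
    // padded with '\u0000' characters; existing characters in the target range
    // are overwritten.
    static void writeAt(StringBuilder buf, int offset, String data) {
        int end = offset + data.length();
        if (buf.length() < end) {
            buf.setLength(end); // new positions are filled with '\u0000'
        }
        buf.replace(offset, end, data);
    }

    public static void main(String[] args) {
        StringBuilder stream = new StringBuilder();
        writeAt(stream, 6, "own ");   // later segment arrives first
        writeAt(stream, 0, "Now I "); // earlier segment fills the gap
        System.out.println(stream);   // prints "Now I own "
    }
}
```

Because padding characters are indistinguishable from real zero bytes in the data, a fuller implementation would need to track separately which ranges have actually been received.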
There are many corner cases to consider for this project, and the details of exactly what functionality you should provide are up to some interpretation. When you encounter such a situation, make sure you explain the choice your group made in your documentation. Similarly, if you can think of some cases that your IDS does not handle - due to time constraints, for example - document them as well.
Finally, make sure you handle exceptions in a useful way. Liberal use of catch (Exception e) { ... } blocks is not a good use of exceptions, especially if the block serves only to return a trivial value or print an unhelpful error message. It's best to catch only the specific exception types that you're looking for, to throw them when they are better dealt with by the calling function, and to tell the user exactly what sort of error occurred.
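As an illustration of this advice, the sketch below parses the value of a port entry from a rule file. It catches only NumberFormatException, reports exactly what was wrong, and leaves it to the caller to decide what to do next. The class name, the use of -1 to stand for "any", and the choice of IllegalArgumentException are all assumptions made for the example.

```java
final class PortParser {
    // Parses the value of a src_port or dst_port entry.
    // Returns -1 for "any"; otherwise a port number in the range 0-65535.
    static int parsePort(String value) {
        if (value.equals("any")) {
            return -1;
        }
        try {
            int port = Integer.parseInt(value);
            if (port < 0 || port > 65535) {
                throw new IllegalArgumentException("port out of range 0-65535: " + value);
            }
            return port;
        } catch (NumberFormatException e) {
            // Catch only the failure we expect, and say what went wrong.
            throw new IllegalArgumentException("port is not a number: " + value, e);
        }
    }
}
```

The code that reads the rule file can then catch IllegalArgumentException in one place and print the message along with the offending line number.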
You should also include a Makefile (called Makefile) which can be used to automatically build your project. A Makefile should also be helpful while you're working; if properly set up, you will be able to rebuild any .class file by typing make filename, and make will be certain to rebuild any other .class files that are relevant and that may have changed. This lets you avoid recompiling your entire project while eliminating strange bugs caused by out of date files.
More information on Makefiles can be found in The GNU Make Manual; the following simple example should help you get started. It assumes that we have three source files, Foo.java, Bar.java, and Baz.java. The class Foo makes reference to both Bar and Baz, but neither of those classes references either the other or Foo.
# A good Makefile begins with variable declarations that capture important
# parts of the project. In this case we have a list of class files that
# comprise the program, as well as the name and command line arguments for
# the Java compiler.

# If more classes are added to the program, we can list them here to ensure
# that they will be compiled.
CLASSES = Foo.class Bar.class Baz.class

# Putting the compiler and compiler options in variables lets us change them
# later and have these changes apply to every file.
JAVAC = javac
JAVAOPTS =

# The remainder of a makefile consists of rules of the form
#
# target: prerequisites
#         command
#
# If you run the command "make target", make will first check that all
# the prerequisites are present and up to date, then run the command.
# If prerequisites are missing or out of date, make will look for rules
# with those prerequisites as their targets.
#
# You MUST put a tab before the command; a sequence of spaces will not
# work.

# Some targets do not correspond to actual files; you can declare that
# they are "phony" like this.
.PHONY: all clean

# The first target in the file is selected automatically if you call
# make with no argument. In this case it will compile all the classes
# that comprise our program.
all: $(CLASSES)

# A "clean" target allows us to quickly delete all automatically generated
# files. This should be run right before you submit your project, and may
# be helpful any time you would like to quickly get rid of old files. It
# uses a $(RM) variable that we did not define; this is fine, however, as
# make includes many predefined variables.
clean:
	$(RM) *.class

# This rule tells us that class Foo depends on classes Bar and Baz, and
# thus Bar and Baz should be built first. Every class depends on its
# source file. The special variable $< refers to the first prerequisite,
# which is generally the source file.
Foo.class: Foo.java Bar.class Baz.class
	$(JAVAC) $(JAVAOPTS) $<

# Bar and Baz have no other dependencies, and we imagine that this will be
# the case with most other source files that we add to our program. The
# following target establishes a pattern, saying that any .class file can
# be built from its .java file. This pattern will apply whenever we do
# not give a specific rule, so we can save those for when we have
# dependencies.
%.class: %.java
	$(JAVAC) $(JAVAOPTS) $<