What is cross site scripting?
On this site (archived snapshot) under “The Theory of XSS’, it says:
the hacker infects a legitimate web page with his malicious client-side script
My first question on reading this is: if the application is deployed on a server that is secure (as is the case with a bank for example), how can the hacker ever get access to the source code of the web page? Or can he/she inject the malicious script without accessing the source code?
With cross-site scripting, it's possible to infect the HTML document produced without causing the web server itself to be infected. An XSS attack uses the server as a vector to present malicious content back to a client, either instantly from the request (a reflected attack), or delayed though storage and retrieval (a stored attack).
An XSS attack exploits a weakness in the server's production of a page that allows request data to show up in raw form in the response. The page is only reflecting back what was submitted in a request... but the content of that request might hold characters that break out of ordinary text content and introduce HTML or JavaScript content that the developer did not intend.
Here's a quick example. Let's say you have some sort of templating language made to produce an HTML page (like PHP, ASP, CGI, or a Velocity or Freemarker script). It takes the following page and substitutes "<?=$name?>" with the unescaped value of the "name" query parameter.
<html>
<head><title>Example</title></head>
<body>Hi, <?=$name?></body>
</html>
Someone calling that page with the following URL:
http://example.com/unsafepage?name=Rumplestiltskin
Should expect to see this message:
Hi, Rumplestiltskin
Calling the same page with something more malicious can be used to alter the page or user experience substantially.
http://example.com/unsafepage?name=Rumplestiltskin<script>alert('Boo!')</script>
Instead of just saying, "Hi, Rumplestiltskin", this URL would also cause the page to pop up an alert message that says, "Boo!". That is, of course, a simplistic example. One could provide a sophisticated script that captures keystrokes or asks for a name and password to be verified, or clears the screen and entirely rewrites the page with shock content. It would still look like it came from example.com, because the page itself did, but the content is being provided somewhere in the request and just reflected back as part of the page.
So, if the page is just spitting back content provided by the person requesting it, and you're requesting that page, then how does a hacker infect your request? Usually, this is accomplished by providing a link, either on a web page or sent to you by e-mail, or in a URL-shortened request, so it's difficult to see the mess in the URL.
<a href="http://example.com?name=<script>alert('Malicious content')</script>">
Click Me!
</a>
A server with an exploitable XSS vulnerability does not run any malicious code itself-- its programming remains unaltered-- but it can be made to serve malicious content to clients.
That attacker doesn't need access to the source code.
A simple example would be a URL parameter that is written to the page. You could change the URL parameter to contain script tags.
Another example is a comment system. If the website doesn't properly sanitize the input/output, an attacker could add script to a comment, which would then be displayed and executed on the computers of anyone who viewed the comment.
These are simple examples. There's a lot more to it and a lot of different types of XSS attacks.
It's better to think of the script as being injected into the middle of the conversation between the badly coded web page and the client's web browser. It's not actually injected into the web page's code; but rather into the stream of data going to the client's web browser.
There are two types of XSS attacks:
- Non-persistent: This would be a specially crafted URL that embeds a script as one of the parameters to the target page. The nasty URL can be sent out in an email with the intent of tricking the recipient into clicking it. The target page mishandles the parameter and unintentionally sends code to the client's machine that was passed in originally through the URL string.
- Persistent: This attack uses a page on a site that saves form data to the database without handling the input data properly. A malicious user can embed a nasty script as part of a typical data field (like Last Name) that is run on the client's web browser unknowingly. Normally the nasty script would be stored to the database and re-run on every client's visit to the infected page.
See the following for a trivial example: What Is Cross-Site Scripting (XSS)?