CS 161 Notes Lec 13-14

Course website: https://su21.cs161.org/
Course textbook: https://textbook.cs161.org/

Overview

Lecture 13: Cookies and CSRF

Cookies
1. Parts of a cookie
Cookie policy
1. Setting cookies
2. Sending cookies
Session authentication
Cross-site request forgery(CSRF)
CSRF defenses

Lecture 14: XSS and UI Attacks

XSS
UI
1. Clickjacking
2. Phishing

Cookies

Motivation for cookies come from the fact that HTTP is a request-response protocol, which makes the web server process each request independently of other requests. While we want our responses to be customized, for example, we want the website to remain in dark mode after we’ve set it once. Also, we want our log in to some websites, I want their future responses to be related to my account.

Definition of cookie:

Cookie is a piece of data used to maintain state across multiple requests.

How do we create cookies? (Note that servers, JavaScript in the browser, and users are all capable of creating cookies)

The server can create a cookie by including a Set-Cookie header in its response.
JavaScript in the browser can create a cookie.
Users can manually create cookies in their browser.

How do we store cookies?

Cookies are stored in the web browser (not the web server).
The browser’s cookie storage is sometimes called a cookie jar.

How do we send cookies?

The browser automatically attaches relevant cookies in every request.
The server uses received cookies to customize responses and connect related requests.

Name and Value:
- The actual data in the cookie is stored as a name-value pair.
- The name and value can be any string(But some special values can’t be used, including semicolons).
- For example here’re some possible cookies:
Domain and Path
- The domain attribute and path attribute define which requests the browser should attach this cookie for.
- The domain attribute usually looks like the domain in a URL.
- The path attribute usually looks like a path in a URL.
Secure and HttpOnly:
- The Secure attribute and HttpOnly attribute restrict the cookie for security purposes.
- They can only be True/False.
- If the Secure attribute is True, then the browser only sends the cookie if the request is made over HTTPS (not HTTP).
- If the HttpOnly attribute is True, then JavaScript in the browser is not allowed to access the cookie.
Expires
- The Expires attribute defines when the cookie is no longer valid.
- The expires attribute is usually a timestamp.
- If the timestamp is in the past, then the cookie has expired, and the browser deletes it from the cookie jar.

Name	Theme
Value	Dark
Domain	toon.cs161.org
Path	/xorcist
Secure	True
HttpOnly	False
Expires	12 Aug 2021 20:00:00

Note that there can be other fields

Recall that the server can create a cookie by including a Set-Cookie header in its response. The browser automatically attaches relevant cookies in every request.

An issue to worry about is that evil.com may try to Set-Cookie for google.com.

Another issue is that what if my cookie for google.com login information is somehow sent to the evil.com.

Cookie policy:

It’s a set of rules enforced by the browser.
It mainly considers about two issues:
1. When the browser receives a cookie from a server, should the cookie be accepted?
2. When the browser makes a request to a server, should the cookie be attached?
Note that cookie policy is not the same thing as same-origin policy in cookie.

Now let’s introduce some more stuffs about domain hierarchy.

Separated by dots, domains can be sorted into a hierarchy.
In the graph below, .edu is the top-level domain(TLD), because it is directly below the root of the tree.
We say that eecs.berkeley.edu is a subdomain of berkeley.edu.

Setting cookies

It deals with the issue: When the browser receives a cookie from a server, should the cookie be accepted?

Server with domain X can set a cookie with domain attribute Y if
- The domain attribute is a domain suffix of the server’s domain
  - X ends in Y
  - X is below Y on the hierarchy
  - X is more specific than Y
- The domain attribute Y is not a top-level domain (TLD)
- No restrictions for the Path attribute (the browser will accept any path)
Examples:
- mail.google.com can set cookies for Domain=google.com
- google.com can set cookies for Domain=google.com
- google.com cannot set cookies for Domain=com, because com is a top-level domain

Sending cookies

It deals with the issue: When the browser makes a request to a server, should the cookie be attached?

The browser sends the cookie if both of the following are true
1. The domain attribute is a domain suffix of the server’s domain
2. The path attribute is a prefix of the server’s path

An example would be

Here’s my conclusion of setting & setting cookie policies

Session authentication

Session is a sequence of requests and responses associated with the same authenticated user.

When you are checking your emails, you make many requests to Gmail, so the Gmail server needs your email address and password to verify your authentication. However, this process would take too much time if you have to manually provide authentication for every single request. So, here comes session! It works as a wristband for audience in a concert. Only for the first entrance, should they provide their tickets to the security guard, after that they’ll receive a wristband which allows them to walk in and out of the hall without showing tickets frequently.

“Session” actually means “session token”, which is a cookie provided by server to users. It works in the following logic:

The first time you visit the website:
- Present your username and password.
- If they’re valid, you receive a session token.
- The server associates you with the session token.
When you make future requests to the website:
- Attach the session token in your request.
- The server checks the session token to figure out that the request is from you.
- No need to re-enter your username and password!
When you log out (or when the session times out):
- The browser and server delete the session token.

Security concerns related with session tokens:

If an attacker steals your session token, they can log in as you!
- The attacker can make requests and attach your session token.
- The browser will think the attacker’s requests come from you.
Servers need to generate session tokens randomly and securely.
Browsers need to make sure malicious websites cannot steal session tokens
- Enforce isolation with cookie policy and same-origin policy(recall that same-origin policy claims that only websites with the same origin, that’s same domains && same protocols && same ports can access each other’s resources).
Browsers should not send session tokens to the wrong websites :)
- Enforced by cookie policy.

What attributes should a session token contain?

Domain and Path: Set so that the cookie is only sent on requests that require authentication.
Secure: Can set to True to so the cookie is only sent over secure HTTPS connections.
HttpOnly: Can set to True so JavaScript can’t access session tokens.
Expires: Set so that the cookie expires when the session times out.

Name	token
Value	{random value}
Domain	mail.google.com
Path	/
Secure	True
HttpOnly	True
Expires	{15 minutes later}
(other fields omitted)

Cross-site request forgery(CSRF)

Recall that session tokens will be automatically attached to relevant requests. This means that authentication will no longer be done by username + password, but instead just by the session token. So, what if the attacker tricks the victim into making an unintended request?

Cross-site request forgery(CSRF or XSRF) is an attack that exploits cookie-based authentication to perform an action as the victim.

Steps of a CSRF attack

User authenticates to the server
- User receives a cookie with a valid session token
Attacker tricks the victim into making a malicious request to the server
The server accepts the malicious request from the victim
- Recall: The cookie is automatically attached in the request

Note that the attacker doesn’t even need to know the content of session token, since the browser will automatically attach it.

How may the attacker trick the victim into making malicious request?

To answer this question, let’s first recall the difference between GET and POST request. In fact, I need to add some extra information about these two methods, since lecture 12 didn’t cover them in a clear way.

The GET method:

Used to send request data from a specified resource.
The query string is sent in the URL of a GET request.
- e.g. /test/demo_form.php?name1=value1&name2=value2
GET requests can be cached.
GET requests remain in the browser history.
GET requests can be bookmarked.
GET requests should never be used when dealing with sensitive data.
GET requests have length restrictions.
GET requests are only used to request data (not modify).

The POST method:

Used to send data to a server to create/update a resource.
The data sent to the server with POST is stored in the request body of the HTTP request.
- e.g.
- POST /test/demo_form.php HTTP/1.1
- Host: w3schools.com
- name1=value1&name2=value2
POST requests are never cached.
POST requests do not remain in the browser history.
POST requests cannot be bookmarked.
POST requests have no restrictions on data length.

Now, back to the question “how might the attacker trick the victim into making a malicious request”

How might we trick the victim into making a GET request?
- Strategy #1: Trick the victim into clicking a link.
  - Later we’ll see how to trick a victim into clicking a link.
  - The link can directly make a GET request:
    https://www.bank.com/transfer?amount=100&to=Mallory.
  - The link can also be designed to open an attacker’s website, which contains some JavaScript that makes the actual malicious request.
- Strategy #2: Put some HTML on a website the victim will visit.
  - Example: The victim will visit a forum. Make a post with some HTML on the forum.
  - HTML to automatically make a GET request to a URL:
    - This HTML will probably return an error or a blank 1 pixel by 1 pixel image, but the GET request will still be sent…with the relevant cookies!
How might we trick the victim into making a POST request?
- Strategy #1: Trick the victim into clicking a link.
  - Note: Clicking a link in your browser makes a GET request, not a POST request, so the link cannot directly make the malicious POST request.
  - The link can open an attacker’s website, which contains some JavaScript that makes the actual malicious POST request.
- Strategy #2: Put some JavaScript on a website the victim will visit.
  - Example: Pay for an advertisement on the website, and put JavaScript in the ad.
  - Recall: JavaScript can make a POST request.

Here’s how dangerous CSRF is among the top 25 most dangerous software weakness(2020)

CSRF defenses

How to defend CSRF attacks? It’s actually a task done by the server!

There’re three ways to defend the CSRF attack.

CSRF tokens
Referer validation
SameSite cookie attribute

CSRF tokens

Idea: Add a secret value in the request that the attacker doesn’t know.
- The server only accepts requests if it has a valid secret.
- Now, the attacker can’t create a malicious request without knowing the secret.
CSRF token: A secret value provided by the server to the user. The user must attach the same value in the request for the server to accept the request.
- CSRF tokens cannot be sent to the server in a cookie!
  - The token must be sent somewhere else (e.g. a header, GET parameter, or POST content).
  - The field that transmits the CSRF token should be set to hidden.
- CSRF tokens are usually valid for only one or two requests.

Here’s how it looks like

Referer validation

Idea: In a CSRF attack, the victim usually makes the malicious request from a different website.
Referer header: A header in an HTTP request that indicates which webpage made the request.
- “Referer” is a 30-year typo in the HTTP standard (supposed to be “Referrer”)!
- Example: If you type your username and password into the Facebook homepage, the Referer header for that request is https://www.facebook.com
- Example: If an img HTML tag on a forum forces your browser to make a request, the Referer header for that request is the forum’s URL.
- Example: If JavaScript on an attacker’s website forces your browser to make a request, the Referer header for that request is the attacker’s URL.
Checking the Referer header
- Allow same-site requests: The Referer header matches an expected URL.
  - Example: For a login request, expect it to come from https://bank.com/login.
- Disallow cross-site requests: The Referer header does not match an expected URL.
If the server sees a cross-site request, reject it.
However, there’re issues related to this approach.
- The Referer header may leak private information
  - Example: If you made the request on a top-secret website, the Referer header might show you visited http://intranet.corp.apple.com/projects/iphone/competitors.html
  - Example: If you make a request to an advertiser, the Referer header gives the advertiser information about how you saw the ad.
- The Referer header might be removed before the request reaches the server.
  - Example: Your company firewall removes the header before sending the request.
  - Example: The browser removes the header because of your privacy settings.
- The Referer header is optional. What if the request leaves the header blank?
  - Allow requests without a header?
    - Less secure: CSRF attacks might be possible.
  - Deny requests without a header?
    - Less usable: Legitimate requests might be denied.
  - Need to consider fail-safe defaults: No clear answer.

Idea: Implement a flag on a cookie that makes it unexploitable by CSRF attacks.
- This flag must specify that cross-site requests will not contain the cookie.
SameSite flag: A flag on a cookie that specifies it should be sent only when the domain of the cookie exactly matches the domain of the origin
- SameSite=None: No effect.
- SameSite=Strict: The cookie will not be sent if the cookie domain does not match the origin domain.
- Example: If https://evil.com/ causes your browser to make a request to https://bank.com/transfer?to=mallory, cookies for bank.com will not be sent if SameSite=Strict, because the origin domain (evil.com) and cookie domain (bank.com) are different.
Issue: Not yet implemented on all browsers.

XSS

XSS stands for Cross-Site Scripting. It’s the rank 1 weakness in this Top 25 Most Dangerous Software Weaknesses (2020) form! ;)

Websites use untrusted content as control data

Recall that same-origin policy claims that two webpages with different origins should not be able to access each other’s resources.

Also recall that JavaScript is a programming language running on the web browser, the script itself will be sent by the server, but run on the browser.

So, what if the attacker adds malicious JavaScript to a legitimate website, so that the malicious JavaScript will be sent to users and run on their browsers? Those JavaScript will surely satisfy the same-origin policy, and it can access information on the legitimate website.

This kind of attack is based on the idea of injection. It uses the fact that websites use untrusted content as control data, that’s to say, any html syntax will be parsed as html components even if the server want to just display them in plaintext.

Here’s an example,

Consider a Go HTTP handler

func handleSayHello(w http.ResponseWriter, r *http.Request) {
    name := r.URL.Query()["name"][0]
    fmt.Fprintf(w, "<html><body>Hello %s!</body></html>", name)
}

that runs on https://vulnerable.com
Clearly, if the user makes a request which looks like https://vulnerable.com/hello?name=EvanBot
The corresponding website will look like this

However if the response is something like

https://vulnerable.com/hello?name=<script>alert(1)</script>

Then when the server send the Fprintf result back to the browser, the browser will parse the

This example shows that by sending injected JavaScripts in HTML syntax from server to browser, the attacker can make the users run malicious scripts on their browser.

Two major XSS attacks are

Stored XSS
Reflected XSS

Stored XSS

The main idea is that the attacker’s malicious JavaScript will be stored on the legitimate server and sent to browsers.

Classic example: Facebook pages
- Anybody can load a Facebook page with content provided by users.
- An attacker puts some JavaScript on their Facebook page.
- Anybody who loads the attacker’s page will see JavaScript (with the origin of Facebook).
Stored XSS requires the victim to load the page with injected JavaScript.

A graph demonstrating the process,

Note that 5a and 5b are attacks contained in the JavaScript.

Reflected XSS

Different from Stored XSS, in Reflected XSS, the attacker will trick users to put the malicious JavaScript into a request. Later the request might be reflected(copied) in the response from the server thus reflected back to perform the attack.

Classic example: Search
If you make a request to http://google.com/search?q=blueberry, the response will say “10,000 results for blueberry”.
If you make a request to http://google.com/search?q=<script>alert(1)</script>, the response will say “10,000 results for ”

Here’s a graph illustrating the process,

Note that Reflected XSS is quite similar with CSRF, they both want the victim to make a request to the legitimate website.

In XSS, the response must be parsed as HTML for JavaScript to run.
In CSRF, the request must cause server-side action (e.g. sending money or logging in).

Defense: HTML sanitization

Since XSS is based on the fact that HTML syntax in any context will be recognized and executed by the browser, a way to defend is to use HTML sanitization, which use some sequences to represent special characters that may match with HTML syntax.

Start with an ampersand (&) and end with a semicolon (;)
- Instead of <, use <
- Instead of “, use "
- And many more!
  - It is important to escape all dangerous characters (lists of them can be found), or you will still be vulnerable!

With the HTML sanitization, the response from the XSS example above will instead be transmitted as

<html>
<body>
Hello &lt;script&gt;alert(1)&lt;/script&gt;!
</body>
</html>

Now it looks harmless ;)

Defense: Content Security Policy (CSP)

This defense holds the main idea that browsers should only use resources loaded from specific places.

Standard approach:
- Disallow all inline scripts (prevent inline XSS).
- Only allow scripts from specified domains (prevent XSS from linking to external scripts).
Also works with other content (e.g. iframes, images, etc.).
Concern:
- Relies on the browser to enforce security, so more of a mitigation for defense-in-depth.

UI

UI stands for User Interface. Recall that human factor is always a major security hole ;)

The main idea of UI attacks is to trick the victims into thinking they are taking an intended action, when they are actually taking a malicious action.

Two main types of UI attacks are

Clickjacking: Trick the victim into clicking on something from the attacker.
Phishing: Trick the victim into sending the attacker personal information.

Clickjacking

Clickjacking is to trick users into clicking on something from the attacker.

Here’s a typical webpage which contains too much buttons, so the users might got tricked into clicking on attacker’s button.

There’re many ways to trick users into clicking things they don’t want.

Here’re some examples
- Invisible iframe(hide some buttons beneath harmless content/use tricking messages to cover some real content)
- Temporal attack(since JavaScript is able to know where the cursor is, the attacker can write a script which automatically change the button to another one at the moment when users are about to click on something else).
- Cursorjacking(Attacker can draw the cursor at somewhere else, and hide the original image of the cursor, so the user may perform the clicking on somewhere he/she has no idea of).

Defenses against clickjacking

Enforce visual integrity: Ensure clear visual separation between important dialogs and content.

In windows, the User Account Control window darkens everything else on the screen, which is something can’t be done by attackers on the webpage.

Firefox provides this kind of “out of boundary” message, which clearly marks that this message is not generated by the webpage itself.

Enforce temporal integrity: Ensure that there is sufficient time for a user to register what they are clicking on.

On mac, when downloading files from the website, the dialog that asks user whether to continue will set the “OK” button to un-clickable state for 1 second. This delay ensures that users won’t be tricked by temporal attack to download something they don’t want.

Require confirmation from users
- The browser needs to confirm that the user’s click was intentional.
- Drawbacks: Asking for confirmation annoys users (consider human factors!).
Frame-busting: The legitimate website forbids other websites from embedding it in an iframe
- Defeats the invisible iframe attacks.
- Can be enforced by Content Security Policy (CSP).
- Can be enforced by X-Frame-Options (an HTTP header).

Phishing

Here’s a typical scenario of phishing

You received a warning email from PayPal…

You click the button without seeing where it leads to…

Provide some information…

Provide some more information…

Boom! Now you’re back to the login page…

What just happened? You have input all your critical information on attacker’s fake PayPal website, and in the end they directed you back to the real PayPal website. At this time, they got all the information they need to steal money from you!

This is what we called “phishing”.

Phishing is based on the fact that users sometimes can hardly tell the difference between real websites and the fake ones.

Now let’s take a look at some associated fun facts about phishing.

Firstly, is this a real website for PNC?

No!

Look at the link at the bottom-right of the browser, www.pnc.com/webapp/unsec/homepage.var.cn is in fact an entire domain registered by the attacker!

Another example, is the apple website below a real one?

No!

It’s actually impossible to find anything malicious from this link! It looks just the same as the real https://www.apple.com . But in fact the “apple” word above contains letters from Cyrillic alphabet, which are rendered to be the same on the browser!

Such attack is called homograph attack, which creates malicious URLs that look similar(or the same) to legitimate URLs.

There’s also a funny thing called Browser-in-browser attack, it looks like this

This webpage manually draws those fancy verification messages provided by typical browsers, to trick victims.

Now let’s introduce a defense against phishing, two-factor authentication.

Two-factor authentication

As we know, the goal of phishing is to steal victim’s information, such as password.
So what if we add some more security to the login system, so that login can’t be performed with only password.
Such additional verification is called Two-factor authentication (2FA).
Here are three main ways for a user to prove their identity:
- Something the user knows: Password, security question (e.g. name of your first pet).
- Something the user owns: Their phone, their email, their security token.
- Something the user is: Fingerprint, face ID.
Note that 2FA can be used to at lot of other scenarios where users’ passwords might leak.

Bad news: there are attacks on 2FA!

Relay attacks
- Also called transient phishing.
- The idea is that the attacker sits between user and server as a middle man, and seek whatever information he/she needs from the user. In this way, no matter how many factors for authentication you added, the attacker can always ask the user to provide that for him/her.

A graph illustrating relay attacks

Also, there’s something called social engineering ;) .

Base on these attacks, here’re some possible improvements on 2FA

Authentication token: A device that generates secure second-factor codes
- Something the user owns
- Examples: RSA SecureID and Google Authenticator
- Usage
  - The token and the server share a common secret key k
  - When the user wants to log in, the token generates a code HMAC(k, time)
    - The time is often truncated to the nearest 30 seconds for usability
    - The code is often truncated to 6 digits for usability
  - The user submits the code to the website
  - The website uses its secret key to verify the HMAC
Security key: A second factor designed to defend against phishing
- Something the user owns
- Usage
  - When the user signs up for a website, the security key generates a new public/private key pair and gives the public key to the website
  - When the user wants to log in, the server sends a nonce to the security key
  - The security key signs the nonce, website name (from the browser), and key ID, and gives the signature to the server

Written on May 10, 2023

Comments

Comment on Github