Jonas Magazinius, Andrei Sabelfeld – Chalmers University of Technology
Billy K. Rios – Cylance Inc.
CROSSING ORIGINS BY CROSSING FORMATS
ABOUT
• PhD Student, Chalmers
• until Nov 1st, then Dr. Magazinius
• Securing the mashed up web
• 10:00 HA4 – Hörsalsvägen, Chalmers
• Co-leader of OWASP Gothenburg
• Part of Cure53
• @internot_
• Father – as some of you might remember
LANGUAGE-BASED SECURITY
• Using programming language theory for finding and mitigating security vulnerabilities
• Static vs. dynamic analysis
• Information-flow monitoring
• Declassification
• Decentralized
• Crossing origins by crossing formats
• Byproduct of research
• Joint work with Billy K. Rios
• Greatly inspired by the work of Julia Wolf
BACKGROUND
• GIFAR – content smuggling attack
• Billy Rios (@XSSniper), Petko D. Petkov (@pdp)
• Attacker uploads GIF/JAR file
• Cross-origin CSS attack
• Chris Evans (@scarybeasts) et al.
• Attacker injects fragments of CSS into HTML
• Content-type sniffing attacks
• Adam Barth (@adambarth) et al.
• Attacker uploads PS/HTML file
THINGS IN COMMON…
• … mixing formats
• … re-interpretation of the content
POLYGLOT
• Definition:
• “…a person who speaks several languages.”
• “…a program that is valid in multiple programming languages.”
• Content that can be interpreted as multiple formats
• Example 1 – HTML / JavaScript
• data:text/html,alert('<script src="%23"></script>')
• Example 2 – C / Pascal / PostScript / TeX / Bash / Perl / Befunge98
• (*a/*/ % #)(PostScript)/Helvetica 40 selectfont 9 400 moveto show%v"f"a0 true showpage quit%#) 2>/dev/null;echo bash;exit #*/); int main()/*>"eb"v %a*0)unless print"perl\n"__END__*/{printf("C\n");/*>>#;"egnu">:#,_@;,,,< *)begin writeln(*\output={\setbox0=\box255}\eject\shipout\hbox{\TeX}\end *)('pascal');end.{*/ return 0;}
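As a much smaller illustration of "content that can be interpreted as multiple formats", the following Python sketch reads one string under two formats. The string, and the choice of JSON and Python literal syntax, are assumptions made for this example and do not come from the slides.

```python
import ast
import json

# One piece of content, chosen so that it is valid in both formats.
content = '[1, 2, "polyglot"]'

as_json = json.loads(content)          # interpreted as JSON
as_python = ast.literal_eval(content)  # interpreted as a Python literal

# Both interpreters accept the same characters and agree on the value.
print(as_json == as_python)
```

Neither interpreter has any way of telling that the content was "meant" for the other one, which is the property the attacks in this talk rely on.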
MALICIOUS POLYGLOTS
• Two formats (or more)
• One benign
• One malicious
• GIFAR – GIF/JAR (Java archive)
• Cross-origin CSS – HTML/CSS
• Content-type sniffing – PS/HTML
• Preferred format characteristics
• Widespread, commonly used format
• Error tolerant parsing, or other ways to hide foreign syntax
• Cross-origin communication
POLYGLOT ATTACKS
• Infiltrate
• Syntax injection – Cross-origin CSS attack
• Content smuggling – GIFAR
• Embed
• Context based re-interpretation
• The content-type provided by the server is overridden
• Tags that allow re-interpretation of content:
• CSS – <link>-tag
• Java – <applet>-tag
• Content sniffing – <iframe>-tag
• <object> and <embed> allow arbitrary interpretation based on the type attribute
ATTACK VECTORS – SYNTAX INJECTION
• A vulnerable web service reflects parameters into content
• Fragments of syntax are injected, resulting in a polyglot
• The polyglot is embedded under the origin of the attacker
• The polyglot has the origin of, and can communicate with, the vulnerable service
• Visitors of the attacker's domain are exploited
• Known attack instances
• Cross-origin CSS attack
• (Cross-site scripting)
[Diagram: attacker.com and vulnerable.com, with attack steps numbered (1) to (4)]
ATTACK VECTORS – CONTENT SMUGGLING
• A vulnerable web service allows users to upload content
• Attacker uploads a polyglot to the vulnerable origin
• The polyglot is embedded under the origin of the attacker
• The polyglot has the origin of, and can communicate with, the vulnerable service
• Visitors of the attacker's domain are exploited
• Known attack instances
• GIFAR
• Content sniffing attack
[Diagram: attacker.com and vulnerable.com, with attack steps numbered (1) to (5)]
PAYLOADS – EXPLOITING THE ORIGIN
• Cross-origin information leakage
• Request sensitive user information
• Leak to attacker across origins
• Cross-site request forgery
• Traditionally, issue requests with the credentials of the victim
• Protect using tokens
• Impact is far greater if it is possible to read the response
• Extract token
• Make request
• Standardized document format – ISO32000-1
• Container format
• Embed related resources
• Contain foreign syntax by design
• Error tolerant parsing
• Powerful capabilities
• Display text
• Render 2D/3D graphics
• Animations
• Forms
• Launch commands (restricted)
• Execute JavaScript
• Embed Flash – just fantastic
• Issue HTTP-request
• With cookies!!
PORTABLE DOCUMENT FORMAT
• Header
%PDF-1.7
• Objects
1 0 obj << /Length 14>> stream
Content stream
endstream
endobj
• Cross-reference
xref
00000012 0000 n
endxref
• Trailer
• startxref 105
• trailer << /Root 1 0 R >>
• %%EOF
DOCUMENT STRUCTURE
%PDF-1.4
1 0 obj
<< /Type /Catalog /Outlines 2 0 R /Pages 3 0 R >>
endobj
2 0 obj
<< /Type /Outlines /Count 0 >>
endobj
3 0 obj
<< /Type /Pages /Kids [4 0 R] /Count 1 >>
endobj
4 0 obj
<< /Type /Page /Parent 3 0 R /MediaBox [0 0 612 792] /Contents 5 0 R /Resources << /ProcSet 6 0 R >> >>
endobj
5 0 obj
<< /Length 35 >>
stream
endstream
endobj
6 0 obj
[/PDF]
endobj
xref
0 7
0000000000 65535 f
0000000009 00000 n
0000000074 00000 n
0000000120 00000 n
0000000179 00000 n
0000000300 00000 n
0000000384 00000 n
trailer
<< /Size 7 /Root 1 0 R >>
startxref
408
%%EOF
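The byte offsets in the cross-reference table have to match the position where each object actually starts, and startxref has to point at the xref keyword. A minimal Python sketch of that bookkeeping follows. `build_pdf` and the two object bodies are hypothetical, chosen only for illustration, and the output is meant to show the layout rather than to be a complete, renderable document.

```python
def build_pdf(bodies):
    """Serialize object bodies and record the byte offset of each object."""
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(out))  # where "<num> 0 obj" begins
        out += f"{num} 0 obj\n{body}\nendobj\n".encode("ascii")

    xref_pos = len(out)     # startxref must point at the "xref" keyword
    size = len(bodies) + 1  # entry 0 is the head of the free list
    out += f"xref\n0 {size}\n".encode("ascii")
    out += b"0000000000 65535 f \n"
    for off in offsets:
        # Each entry is 20 bytes: 10-digit offset, 5-digit generation, flag.
        out += f"{off:010d} 00000 n \n".encode("ascii")
    out += f"trailer\n<< /Size {size} /Root 1 0 R >>\n".encode("ascii")
    out += f"startxref\n{xref_pos}\n%%EOF\n".encode("ascii")
    return bytes(out), offsets, xref_pos


# Hypothetical object bodies, only to have something to index.
pdf, offsets, xref_pos = build_pdf([
    "<< /Type /Catalog /Pages 2 0 R >>",
    "<< /Type /Pages /Kids [] /Count 0 >>",
])
print(offsets, xref_pos)
```

The first object lands at offset 9 because the header line `%PDF-1.4` plus its newline is nine bytes long, which agrees with the `0000000009` entry in the listing above.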
MINIMAL PDF (ACCORDING TO SPECIFICATION)
%PDF1 0 obj<</Pages<<>>>>trailer<</Root 1 0 R>>
…or even shorter…
%PDF trailer% 1 0 obj<</Root 1 0 R/Pages<<>>>>
…or even shorter…
%PDF trailer<</Root% 1 0 obj<</Pages1 0 R>>
%PDF-1.trailer<</Root<</Pages<<>>>>
…or executing JavaScript…
%PDF-1.trailer<</Root<</Pages<<>>/OpenAction<</S/JavaScript
/JS(app.alert('PDF'))>>>>
MINIMAL PDF (ACCORDING TO INTERPRETER)
Adobe Reader Google Chrome PDF Reader
ERROR TOLERANT PARSING
This text would also be a valid %PDF-1.
With the condition that the
trailer %begins on a new line and that there isn’t
<</too /much /garbage /in /Root<</Pages<<>>>> the dictionary.
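The two conditions named in the text above (a %PDF- header somewhere early in the content, and trailer at the start of a line) can be turned into a rough server-side check. The Python sketch below is a toy heuristic built only from those two conditions. The function name and the 1024-byte window are assumptions of this sketch, and it does not reproduce the parsing logic of Adobe Reader or any other interpreter.

```python
def looks_like_lenient_pdf(data: bytes, window: int = 1024) -> bool:
    """Toy check: header early in the data and a line starting with 'trailer'."""
    has_header = b"%PDF-" in data[:window]
    has_trailer_line = any(
        line.startswith(b"trailer") for line in data.splitlines()
    )
    return has_header and has_trailer_line


# The example text from the slide, with a plain ASCII apostrophe.
sample = (
    "This text would also be a valid %PDF-1.\n"
    "With the condition that the\n"
    "trailer %begins on a new line and that there isn't\n"
    "<</too /much /garbage /in /Root<</Pages<<>>>> the dictionary.\n"
).encode("ascii")

print(looks_like_lenient_pdf(sample))
```

A check like this errs on the side of flagging content, so it would suit reviewing uploads or reflected parameters better than deciding what a real PDF reader will accept.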
• PDF
• URL Action – Redirects the browser
• Embedded Flash
• Inherits the origin of the document
• Two-way communication
• Uses its own set of cookies
%PDF-1.trailer <</Root <</Pages<<>> /OpenAction <</S/URI/URI(javascript:alert(location))>>>>>>
• JavaScript
• Inherits the origin of the document
• Uses the cookies of the browser
• launchURL() – Redirects the browser
• getURL() – Redirects the browser
• submitForm() – POST request via the browser
• XML External Entity
• Two-way communication
• Patched in latest version of Adobe Reader (FINALLY)
COMMUNICATION
• Mixes well with just about any format
• Server can verify benign format
• Impact
• CSRF
• Cross-origin leakage
• Easy to inject
• Token-set overlaps with HTML
• Context dependent
• Can extract sensitive information
• CSRF protection token
• User information
• Impact
• CSRF
• Cross-origin leakage
PDF POLYGLOTS
Syntax injection Content smuggling
PDF-BASED SYNTAX INJECTION ATTACK
PDF-BASED CONTENT SMUGGLING ATTACK
• PDF as the malicious format
• User provided content of any kind
• PDF as the benign format
• CV database
• Conference systems
• User supplied content reflected
• XSS vulnerabilities
• JSON
• XML
POTENTIAL TARGETS
Syntax injection Content smuggling
DEMO
http://internot.noads.biz
EVALUATION
• Syntax injection
• Approach
• Alexa top100
• Results
• Content smuggling
• Approach
• Results
• Responsible disclosure
ALEXA TOP100
• Determine context
• Send expected content-type as header
• Content-Type: application/pdf
• Content-Type: image/*
• Server decides whether content matches expected content-type
• Gives the server control over the interpretation of contents
• Error code (404, 500)
• Alternate content
MITIGATION APPROACHES
Forward notification approach
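The forward notification exchange (the browser announces the content-type its embedding context expects, and the server decides whether the stored content fits) can be sketched in Python as follows. The function `serve`, its parameters, and the choice of 404 as the error code are assumptions of this sketch, not a description of an existing browser or server feature.

```python
from fnmatch import fnmatchcase


def serve(expected_type: str, actual_type: str, body: bytes):
    """Return (status, body); serve only if the stored type fits the expectation.

    expected_type is what the browser announced for the embedding context,
    e.g. "application/pdf" or "image/*". actual_type is what the server
    knows about the stored resource.
    """
    if fnmatchcase(actual_type.lower(), expected_type.lower()):
        return 200, body
    # Mismatch: refuse, so the content is never re-interpreted.
    return 404, b""


# An uploaded GIF requested from an image context is served normally.
print(serve("image/*", "image/gif", b"GIF89a..."))
# The same GIF requested as a PDF is refused.
print(serve("application/pdf", "image/gif", b"GIF89a..."))
```

The point of the design is that the decision moves to the party that knows what the content was uploaded as, instead of leaving it to the embedding page.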
• Browser
• Strict enforcement of server provided content-type
• Disallow type-attribute
• Interpreter
• Strict(er) parsing?
• Limit communication methods
• Syntax injection
• Filtering? In general, no!
• Content-smuggling
• Serve content from a sandboxed domain (googleusercontent.com)
MITIGATION APPROACHES
Server side (application) Client side
• Improvements in latest version
• Matching first bytes against known magic values
• Already found a bypass!
• Limit worst communication method
• Filtering
• PDF tokens and keywords { <, >, trailer }
• Content Security Policy
• DO NOT!!!
PDF MITIGATION APPROACHES
Server side Client side
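Filtering on the PDF tokens and keywords listed on this slide could look like the Python sketch below. The function name and the case-insensitive comparison are assumptions of this sketch, and a filter this blunt will also reject plenty of harmless input.

```python
# Tokens named on the slide: { <, >, trailer }
PDF_TOKENS = ("<", ">", "trailer")


def contains_pdf_tokens(value: str) -> bool:
    """Return True if a reflected parameter contains any listed PDF token."""
    lowered = value.lower()  # stricter than PDF itself, which is case-sensitive
    return any(token in lowered for token in PDF_TOKENS)


print(contains_pdf_tokens("%PDF-1.trailer<</Root<</Pages<<>>>>"))
print(contains_pdf_tokens("q=holiday photos"))
```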
DO NOT!!!
Content-Disposition: attachment; filename="fname.ext"
Content-Type: application/octet-stream
“If this header is used in a response with the application/octet-stream content-type, the implied suggestion is that the user agent should not display the response, but directly enter a `save response as...' dialog.”
• This is NOT respected by Adobe Reader
SUMMARY
• Polyglot attacks – New breed of cross-origin attacks
• Syntax injection
• Content-smuggling
• PDF-based polyglot attacks
• Flexible error tolerant format
• Powerful beyond necessity
• Mitigation approaches
• Forward notification approach
• Specific approaches
THANK YOU!
CROSS-ORIGIN CSS ATTACK
• Minimal amount of CSS syntax injected in target HTML page
• {}#f{font-family:'
• … arbitrary HTML content …
• '}
• Attacker uses the HTML page as a style sheet in his own page
• Victim visits the attacker's page
• Attacker can extract the arbitrary content from imported style-sheet
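Why the two injected fragments are enough can be seen in a small Python sketch. The regular expression is only a stand-in for an error-tolerant style sheet parser that skips what it cannot parse and keeps the `#f` rule. The page content is made up for the example, and real parsers differ between browsers.

```python
import re

# A page as the victim's browser would receive it from the target service.
page = (
    "<html><body>"
    "{}#f{font-family:'"    # injected fragment 1
    "<p>secret: 12345</p>"  # stand-in for arbitrary HTML content
    "'}"                    # injected fragment 2
    "</body></html>"
)

# Stand-in for lenient CSS parsing: everything between the two fragments
# becomes the string value of font-family in the rule for #f.
match = re.search(r"#f\{font-family:'(.*?)'\}", page, re.DOTALL)
leaked = match.group(1) if match else None
print(leaked)
```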
GIFAR – CONTENT SMUGGLING ATTACK
• GIF-image
• Parsed top-down, content after trailer ignored
• JAR-file
• Based on ZIP-archives
• Parsed bottom-up, content before header ignored
• GIF + JAR = GIFAR
• copy /b benign.gif + malicious.jar gifar.gif
• The GIFAR is uploaded to a vulnerable service
• The GIFAR is embedded from the vulnerable service on the attacker's page as an applet
• Any visitor to the attacker's page will execute the applet
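The top-down versus bottom-up parsing that makes the concatenation work can be checked with Python's standard `zipfile` module. In the sketch below the GIF bytes are intended as a tiny 1x1 image (only the signature is actually checked), and the archive holds a placeholder text file instead of class files. Both are assumptions made for illustration.

```python
import io
import zipfile

# Bytes intended as a tiny 1x1 GIF; only the signature matters here.
GIF_BYTES = (
    b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
    b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01\x00"
    b"\x00\x02\x02D\x01\x00;"
)

# Build a small ZIP archive in memory (a JAR is a ZIP with a manifest).
jar_buffer = io.BytesIO()
with zipfile.ZipFile(jar_buffer, "w") as jar:
    jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    jar.writestr("Placeholder.txt", "stand-in for class files")

# Same idea as: copy /b benign.gif + malicious.jar gifar.gif
combined = GIF_BYTES + jar_buffer.getvalue()

# Read from the top, the file starts with a GIF signature.
starts_like_gif = combined.startswith((b"GIF87a", b"GIF89a"))

# Read from the end, the ZIP central directory is found and the
# leading GIF bytes are skipped.
with zipfile.ZipFile(io.BytesIO(combined)) as archive:
    names = archive.namelist()
    manifest = archive.read("META-INF/MANIFEST.MF")

print(starts_like_gif, names)
```

A server that validates only the leading bytes of an upload would see nothing but the image in `combined`.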
CONTENT SNIFFING ATTACK
• Browser performs content sniffing when server provides unknown content-type
• Content is matched against a series of signatures
• If a match is found the content is interpreted as the matched type
• Attacker creates a “chameleon” file
• Benign format + HTML
• The file is crafted to match HTML signature
• The chameleon is uploaded to a vulnerable service
• The chameleon is embedded in an iframe on the attacker's page
• Any visitor will trigger the content sniffing and render the HTML
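Signature matching of this kind can be imitated with a toy sniffer in Python. The marker list, the 256-byte window, and the order of the checks are assumptions of this sketch and do not reproduce any particular browser's algorithm.

```python
# Hypothetical HTML signatures for the toy sniffer.
HTML_MARKERS = (b"<html", b"<script", b"<body", b"<title")


def toy_sniff(data: bytes, window: int = 256) -> str:
    """Guess a content type from the first bytes, checking HTML first."""
    head = data[:window].lower()
    if any(marker in head for marker in HTML_MARKERS):
        return "text/html"
    if head.startswith(b"%!ps"):
        return "application/postscript"
    return "application/octet-stream"


# A "chameleon": PostScript header, with HTML hidden in a PostScript comment.
chameleon = b"%!PS-Adobe-3.0\n% <html><body>rendered as HTML</body></html>\n"
plain_ps = b"%!PS-Adobe-3.0\n/Helvetica findfont 12 scalefont setfont\n"

print(toy_sniff(chameleon))
print(toy_sniff(plain_ps))
```

Because the HTML signatures are tried before the PostScript one in this sketch, the chameleon is classified as HTML even though it also begins with a PostScript header.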