It's a little less obvious how this is a problem. Obviously, handing over control of the database is bad, but what harm can come from plain HTML? The answer is JavaScript. Because executable JavaScript can be inserted into HTML, it's not just a passive data format—in effect, HTML becomes running code.
For example, consider adding a search engine to your intranet application. First you'd create a simple form to accept the query:
Code:
<% form_tag search_url, :method => :get do %>
  <p><%= text_field_tag :q %> <%= submit_tag "Search" %></p>
<% end %>
The action behind search_url might then be implemented like this:
Code:
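class SearchController < ApplicationController
  def index
    @q = params[:q]
    # The query is bound as a named placeholder, so it is never
    # interpolated into the SQL. The % wildcards let LIKE match
    # the term anywhere in the body.
    @posts = Post.find :all,
      :conditions => ["body like :query",
                      { :query => "%#{@q}%" }]
  end
end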
And finally, the view displays the results:
Code:
<p>Your search for <em><%= @q %></em>
returned <%= pluralize @posts.size, "result" %>:</p>
<ul>
<% @posts.each do |post| %>
  <li><%= link_to post.title, post_url(:id => post) %>:
  <%= excerpt post.body, @q %></li>
<% end %>
</ul>
Can you spot the security hole? The problem is that user input—notably the search query string—is being directly passed to the page output. That means an attacker can feed arbitrary data, such as JavaScript, into the page. Consider a URL like this, with a JavaScript command in URL-encoded form:
http://example.com/search?q=%3Cscript%3Ealert('XSS')%3B%3C%2Fscript%3E
If an attacker is able to trick a user of the system into following that URL (perhaps by including it in an email), then the attacker can execute arbitrary JavaScript in the context of a logged-in user. In this example, the attack payload is merely a JavaScript alert. But the injected script could just as easily use Ajax to modify the intranet, or even silently send private information (like the user's session key) back to the attacker. The private system is effectively wide open.
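To see what actually reaches the page, the encoded query can be decoded and dropped into a template by hand. The following is a minimal standalone sketch in plain Ruby, not Rails code: the template string, the constants, and the method names are invented for illustration, and URI.decode_www_form_component stands in for the query-string decoding a framework performs before filling params[:q].

```ruby
require "erb"
require "uri"

# Hypothetical stand-in for the search view: the query is written out as-is.
UNSAFE_TEMPLATE = ERB.new("<p>Your search for <em><%= q %></em></p>")

# The q parameter from the example URL, still URL-encoded.
ENCODED_QUERY = "%3Cscript%3Ealert('XSS')%3B%3C%2Fscript%3E"

# Decode a query-string value (assumed here to mirror what the framework does).
def decode_query(encoded)
  URI.decode_www_form_component(encoded)
end

# Render the template with the raw query; nothing is escaped.
def render_unsafe(q)
  UNSAFE_TEMPLATE.result(binding)
end

puts decode_query(ENCODED_QUERY)
puts render_unsafe(decode_query(ENCODED_QUERY))
```

The second line of output contains a live script element inside the em tag, which is exactly what a browser would execute.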
The solution is simple: the h helper, also known as html_escape. This helper (actually provided by the ERB templating system, not Rails itself) escapes HTML strings by making four simple substitutions: it converts &, ", >, and < into &amp;, &quot;, &gt;, and &lt;, respectively. The result is that any attempt to inject <script> tags is rendered as harmless text instead of being executed.

Use it like any other helper:
Code:
<p>Your search for <em><%= h @q %></em>
<%= link_to h(@user.name), user_url(@user) %>
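Outside of a Rails view, the same escaping can be tried directly through ERB::Util.html_escape from Ruby's standard library. This is a rough sketch rather than the Rails implementation: the method names and the template are hypothetical, and newer Ruby versions may escape the single quote as well, so the sample strings avoid it.

```ruby
require "erb"

# Hypothetical stand-in for the fixed view: the query is escaped before output.
SAFE_TEMPLATE = ERB.new("<p>Your search for <em><%= ERB::Util.html_escape(q) %></em></p>")

# Escape a string the way the h helper does.
def escape_for_html(text)
  ERB::Util.html_escape(text)
end

# Render the template; the query is escaped inside the template itself.
def render_safe(q)
  SAFE_TEMPLATE.result(binding)
end

puts escape_for_html("<script>alert(1)</script>")
puts escape_for_html('Tom & "Jerry"')
puts render_safe("<script>alert(1)</script>")
```

The script tag comes out as &lt;script&gt;, so the browser displays it as text instead of running it.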


October 24, 2008, 1:45 pm
Newbie question - what should you do if for some reason you need to present the user-inputted data back on the view at some point?
For example, if your site includes a forum or a wiki, and you want user-added links here to be active?
Is it enough to use url_encode (or u) just as you would use h? Or will you need something more sophisticated, like, say, blacklist-based checks in the model using regular expressions to scrub out urls that contain suspicious javascript?
(and if so, could you suggest somewhere I could find such a thing ready made and tested? Convention over configuration and all that... :D)
Thanks