James Crisp

Software dev, tech, mind hacks and the occasional personal bit

Category: Ruby / Rails (Page 1 of 6)

Rails serving big password protected files – Capistrano Rsync & X-Sendfile

Say you’ve got a few app servers, and you want to serve up some largish files from your rails app (eg, pdfs) behind a login screen. Well, you could put them on s3 and redirect the user to s3 with expiring links. However, this would mean the eventual URL the user gets in their browser is going to be a s3 URL that expires in an ugly way with an XML error when the link expires. And if the link (copied from the address bar) is shared around, it’ll work for non-authorised people for a little while. Then when the s3 link expires, the receivers of the link will never get to see your site/product (maybe they might want to register/subscribe), instead they’ll just get a yucky s3 API looking error in XML, and nowhere to go.

Well, where could we put the files? How about using the file system on the app servers?

Two things to solve.. how to efficiently ship the files to the app servers, and how to serve them without tying up expensive Ruby processes.

1. Shipping the files to the app servers with Capistrano and Rsync

If you’re using docker or similar, you might want to bake the files into the images, if there aren’t too huge.

Assuming you’ve got a more classic style of deploy though..

Welcome old friends Capistrano and Rsync. Using Rsync we can ensure we minimise time / data sending files using binary diffs. We can also do the file transfers simultaneously to app app servers using cap. Here’s the tasks I put together. The deploy_pdfs task will even set up the shared directory for us.

We’re sticking the files into the ‘shared/pdfs’ directory created by capistrano on the app servers. Locally, we have them sitting in a ‘pdfs’ directory in the root of the rails app. This might seem inconsistent (and it is), but the reason is due to limitations/security restrictions with X-Sendfile.

2. Serving the files with X-Sendfile/Apache to let Rails get on with other things

So Rails provides a helpful send_file method. Awesome! Just password protect a controller action and then send_file away! Oh but wait, that will tie up our expensive/heavy ruby processes sending files. Fine for Dev, but not so great for production. The good news is we can hand this work off to our lightweight Apache/nginx processes. Since I use Apache/Ubuntu, that’s what I’ll cover here, but the nginx setup is similar. Using the X-Sendfile header, our rails app can tell the web server a path to a file to send to the client.

How to set up Apache & Rails for X-Sendfile

Ok let’s get Apache rolling:

You need to whitelist the path that files can be sent from, and it can’t be a soft link. It needs to be an absolute path on disk. Hence we are using the ‘shared’ directory capistrano creates, rather than a soft linked directory in ‘current’. X-Sendfile header itself lets you send files without a path (just looks for the files in the whitelisted path), but unfortunately we can’t use this as Rails send_file checks the path exists and raises if it can’t find the file.

In your rails app in production.rb add:

  # Hand off to apache to send big files
  config.action_dispatch.x_sendfile_header = 'X-Sendfile'

In development you probably don’t need this since you won’t be using a server that supports x_sendfile. Without this config, rails will just read the file on disk and send it itself.

In a controller somewhere, just use send_file to hand off to Apache. You’ll need to specify the absolute path to the file in the ‘shared’ directory. I’d suggest putting the path to the shared directory in an environment variable or config file (however you do this usually for your app per environment), and then just append the relevant filename on to it. Also, remember to validate the requested filename (I use a whitelist of filenames to be sure), to avoid the possibility of malicious requests getting sent private files they shouldn’t from elsewhere on disk.

BigDecimal fix for Rails 4 with Ruby 2.4

Rails 4.2.9 works well with Ruby 2.4.2 except for an incompatible change with invalid decimal values and String#to_d. BigDecimal was changed in 1.3.0 (which ships with Ruby 2.4) to throw an exception on invalid values passed to the constructor. This also impacted String#to_d and caused it to raise an exception when it didn’t previously.

If string_amount is an empty String, code like this:

string_amount.to_d

will throw an exception like:

ArgumentError:
  invalid value for BigDecimal(): ""

rather than returning 0 as it did on ruby 2.3 and below.

This is handled in Rails 5 but the change has not been ported back to Rails 4.

If you’re on Rails 4, you’ve got a few options. You could add a monkey patch in your application.rb or initializer:

class String
  def to_d
    begin
      BigDecimal(self)
    rescue ArgumentError
      BigDecimal(0)
    end
  end
end

Or alternatively upgrade the BigDecimal version in your bundle. Ie, add to your Gemfile:

gem "bigdecimal", ">= 1.3.2"

Bundler will ensure you get the later version of BigDecimal rather than the one that ships with Ruby 2.4, and the behaviour was later fixed in BigDecimal 1.3.2 (but Ruby does not include it yet). See the Ruby issue for more details.

Moving to HTTPS, Rails force_ssl and Rollback

Background
We recently moved the Getup site from mixed HTTP/HTTPS to completely HTTPS. The primary driver was to ensure that sessions were never sent in plain text over the wire, to avoid session hijacking. There are other benefits too, such as protecting personal details from eavesdropping over the wire, proving site authenticity and generally simplifying the code. Checking around the web, Twitter is HTTPS only, and even Google search is all HTTPS (when you are logged in).

Rails HTTPS and force_ssl
The easiest and simplest way of moving a Rails 3 app to all HTTPS is to simply set force_ssl = true in relevant environment files. This then causes Rack ssl middleware to be loaded as the first middleware. As you can see from the code, this middleware does a variety of good stuff, that really ensures once HTTPS, always HTTPS!

  • 301 permanent redirect to HTTPS (cached forever in most browsers so you never hit the http url again)
  • Secure on the cookies so they can never be sent over HTTP (important they don’t accidentally go with a redirect for example!)
  • HSTS to ensure that in supported browsers, nobody can ever go to HTTP again!

HSTS
HSTS (HTTP Strict Transport Security) is offered by Chrome and Firefox ensures that for a given time (usually a long time, eg 20 years in the case of Twitter!), it is not possible to go to the site over HTTP again. To activate this, the site only needs to send the Strict-Transport-Security header once. You can check and manage what’s in your HSTS store in Chrome with chrome://net-internals/#hsts

Rollback
When moving to HTTPS, we wanted to ensure we could rollback to a previous release in case of problems. Using force_ssl out of the box precludes this – if you roll back after a 301 redirect or HSTS loaded by a client browser, your site will no longer be accessible!

We used a small monkey patch which turns off HSTS and uses a 302 temporary redirect, rather than a 301. This means that rollback to a previous release works fine. Here’s the patch:

require 'rack/ssl'

module ::Rack
  class SSL

    def redirect_to_https(env)
      req        = Request.new(env)
      url        = URI(req.url)
      url.scheme = "https"
      url.host   = @host if @host
      status     = 302
      headers    = { 'Content-Type'  => 'text/html',
                     'Location'      => url.to_s,
                     'Cache-Control' => 'no-cache, no-store'
                   }
      [status, headers, []]
    end

    def hsts_headers
      {}
    end

  end
end

This patch is only needed temporarily, until you decide that you no longer would want to deploy a release before force_ssl.

Performance
So is HTTPS slower than HTTP? It is to start with in initiating the first request, as ssl needs to be negotiated and set up. This leads to a few more round trips. If your clients and servers are in same country, this is pretty insignificant. Round trips from Australia to USA for example are more significant but not a major stumbling block, as long as you use Keep-Alive on the connection to ensure that later requests re-use the set up from the first request.

Assets are fully cacheable over HTTPS using the usual HTTP headers, and have been even since early Internet Explorer versions. You do need to make sure that you always load HTTPS assets though, to ensure you don’t get mixed-mode warnings in the web browser. We found some useful HTTPS performance tips here.

Load on the servers was not an issue for us as we are using Amazon Elastic Load Balancers (ELB) for our HTTPS implementation. The web/app servers don’t get involved as they are just reverse proxied by the ELB, which manages the HTTPS sessions.

Redirect Gotchas!
We have a few other domains which simply redirect to the canonical www.getup.org.au domain. Out of the box, the rack ssl middleware loads first, before our redirect middleware. This meant that for these additional domains, we got a nasty certificate warning in the browser as it is sent to HTTPS first (on the wrong domain), and then gets the redirect to the canonical domain that has the valid certificate. Changing the order of middleware to do redirects first, and then HTTPS is an easy solution.

Conclusion
The move to full HTTPS has gone smoothly and we didn’t end up needing rollback. However, it was worth having the monkey patch so that rollback was possible as an insurance policy against unexpected major problems.

Talk on Tues: Moving to HTTPS

I’ll be giving a talk at Sydney ALT.NET on Tues:

After recently moving the Getup site fully to HTTPS, James will share with you security pitfalls, the justification for the move from mixed HTTP/HTTPS, lessons learnt, and performance tips. A romp through the protocols of the web with riffs on status codes, HSTS, domain verification, and interesting headers. This talk could save your bacon.

From 6pm at ThoughtWorks Sydney office on Pitt St. Remember to RSVP on the Sydney ALT.NET site to help with catering. See you there!

VPS Performance Comparison and Benchmarking

What VPS / cloud instance provider gives you the best bang for buck for a small linux server? I’ve been focusing on evaluating providers at the low cost end, and reasonable latency to Australia – Linode Tokyo, Octopus Australia, RailsPlayground USA and Amazon Singapore. From the tech specs on a VPS, most providers look similar. However, when you have an instance, you often find huge variation in performance due to contention for CPU and disk on the node. For a simple test, I ran a small ruby script on each server every 20 minutes on cron, and logged the results to a CSV file over the same 5 days.

Here is the script:

start = Time.now

`man ls`
`ls /`

disk_time = Time.now - start

start = Time.now

(1..200000).each { |i| i*i/3.52345*i*i*i%34 }

cpu_time = Time.now - start

total = disk_time + cpu_time
puts "#{total},#{disk_time},#{cpu_time},#{start.strftime('%Y-%m-%d')},#{start.strftime('%H:%M:%S')}"

It uses ls and man to test wait for disk and then some made up maths to test the CPU. It’s not perfect but gives a good indication of performance.

Results
All times are in seconds, and you can download the full data to get a better picture. I’ve provided a summary and notes for each provider below.

Linde 512mb Tokyo (Zen)

  Average Max
Total time 0.81 1.74
Disk 0.03 0.13
CPU 0.78 1.63

Full Results

This is my second Linode VPS. I asked to be moved to a different node as the first one started fast when I got it but performance degraded (due to contention on the node?) within a few days. Nothing else is running on the VPS besides the script. Overall this is consistent, good performance in both CPU and disk. Latency to this VPS is around ~130ms from my ADSL in Sydney.

Octopus 650mb Australia (VMWare)

  Average Max
Total time 0.74 3.40
Disk 0.19 2.88
CPU 0.54 1.08

Full Results

Octopus is running a site for me with a significant cron job on the hour. I’ve therefore removed results collected on the hour. Despite running my live site, Octopus has the fastest average performance of all VPS tested. The higher max time could have been caused by load on my site. Octopus costs a bit more but is hosted in Australia so has very low latency of around 20ms from my ADSL in Sydney.

Amazon Small Instance 1.7gb Singapore (Zen)

  Average Max
Total time 1.25 2.42
Disk 0.20 0.40
CPU 1.04 2.03

Full Results

Amazon EC2 Small instances are now in the same ballpark cost as small VPS instances, when you go for high usage reserved. Many people think that EC2 disk is poor. However, from my tests (and experience) it is not super fast, but it is consistent and reliable. Amazon is also very generous with memory and provides other benefits like free transfer to S3. Where it falls down is processor speed, which, although fairly consistent, is about 50% slower than Linode and Octopus. Latency from Sydney ADSL is around 120ms.

Rails Playground 640mb USA (Virtuozzo)

  Average Max
Total time 3.53 24.12
Disk 2.44 23.42
CPU 1.09 2.66

Full Results

My RailsPlayground VPS is running various services and sites for me but none of them are high load. As you can see, the CPU performance is similar to Amazon and doesn’t vary too much. The problem is disk which varies hugely and can sometimes lead to unresponsive sites and services while waiting for IO. Latency from Sydney ADSL is around 200ms.

Conclusion
If you want a fast VPS with low latency in Australia and are willing to pay a bit more than the other providers listed, Octopus VPS will serve you well.

For lots of memory but slower CPU, Amazon small instance will be good.

For faster CPU but less memory, Linode is your best choice.

It’s worth testing out any new VPS and keeping an eye on performance over time. Contention on your node could change dramatically without you being aware of it, dramatically impacting performance of your VPS.

Rails Refactor is now a Gem!

Rails Refactor, a small command line tool to make rails refactoring more fun, is now available as a gem. To use:
gem install rails_refactor

More info available on github.

VIM is Sydney Rails Devs’ Favourite Editor

Outstanding news! As part of the rails refactor talk at RORO (Sydney Rails Group) tonight (great evening by the way!), I asked for a show of hands on people’s favoured editors. I was amazed to discover the vim has edged out TextMate with just over half of the people at the group using it as their editor of choice! As an aside, Netbeans had one supporter, RubyMine and Emacs had zero. The groundswell of support for vim (and the cheering) was impressive!

PS – this is a very nerdy post, but as a long time vim fan, I had to report on it 🙂

Short Talk on rails_refactor at Rails group

I’ll be giving a short talk with Ryan Bigg on Rails Refactor at the next Sydney Rails Group (RORO) meet (Tuesday, Feb 8 from 7pm) . We’ll be talking about Rails Refactor’s birth at a hack night last year, what it can do for you right now, and its bright future as your refactoring tool of choice. Hope to see you there.

nRake Microsoft Case Study

nRake is now the subject of a Microsoft case study. Check it out here:

UPDATE: Now on the Microsoft Case Study site.

Rails Refactor & Hack Night

During the RORO hack night last Wednesday, Ryan Bigg (@ryanbigg) and I worked on a Rails Refactor, something I’ve been meaning to get going for a long time.

How often have you wasted time doing renames in rails? Sure it’s hard to automate everything without understanding the code, but there sure are a lot of mechanical steps that you can easily automate. Ryan and I took on controller renames and got a fair way in the few hours we spent on the night.

Code is on github

To rename a controller:
$ rails_refactor.rb rename OldController NewController

  • renames controller file & class name in file
  • renames controller spec file & class name in file
  • renames view directory
  • renames helper file & module name in file
  • updates routes

To rename a controller action:
$ rails_refactor.rb rename DummyController.old_action new_action

  • renames controller action in controller class file
  • renames view files for all formats

Looking to extend it with model renames, and then more complex refactoring.
If you like it, please fork and contribute 🙂

Page 1 of 6

Powered by WordPress & Theme by Anders Norén