Software dev, tech, mind hacks and the occasional personal bit

Category: Ruby / Rails

Solving mysterious null values in MySQL date columns, stored by a Rails app

A few months ago, when I was doing some detailed database backup and restore testing, I discovered there were, out of millions of records which had user-entered dates, a handful that had null dates in a database column. I scratched my head for a while and couldn’t work out how this had happened, as the field is validated in Rails for empty/null.

Just today, I got an exception report from a different part of the system which does a query based on the user entered date, and it revealed the source of this extremely rare problem! So.. drum roll..

Accidentally, a user had entered a date with the year 20223.

This is valid in Ruby/Rails but too big to be stored in a MySQL DATE column (which supports years up to 9999), so it had ended up (silently) being stored as null!

Easily fixed by limiting the date range a bit!
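As a rough sketch of what limiting the date range could look like, here is a plain Ruby check against MySQL’s documented supported range for DATE columns (1000-01-01 to 9999-12-31). The constant and method names are hypothetical, and a real app would probably use a much tighter business range and call something like this from a custom Rails validation.

```ruby
require 'date'

# MySQL's documented supported range for DATE columns.
# (Hypothetical constant name; a tighter business range is usually better.)
MYSQL_DATE_RANGE = (Date.new(1000, 1, 1)..Date.new(9999, 12, 31)).freeze

# Returns true only if the value is a Date that a MySQL DATE column can hold.
def storable_date?(date)
  date.is_a?(Date) && MYSQL_DATE_RANGE.cover?(date)
end
```

With this, a date in the year 20223 is rejected before it ever reaches the database, rather than being silently stored as null.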

ChatGPT Programming Test

I was writing a little function in Ruby and thought I’d ask ChatGPT to have a go at it.

The result is easy to read, explicit, and fairly idiomatic Ruby (though not concise). It is not the most performant implementation, but nothing terrible. It also mostly works, but does have a bug in some cases.

Here is the code:

def working_days_between(start_date, end_date)
  # Make sure the start date is before the end date
  start_date, end_date = end_date, start_date if start_date > end_date
  
  # Calculate the number of days between the two dates
  days = (end_date - start_date).to_i
  
  # Calculate the number of weekends between the two dates
  weekends = ((start_date..end_date).count { |date| date.saturday? || date.sunday? })
  
  # Subtract the weekends from the total number of days to get the number of working days
  working_days = days - weekends
  
  return working_days
end

If the start and end dates fall on the same weekend, you get a negative answer. Eg,

working_days_between(Date.parse("Sat, 04 Mar 2023"), Date.parse("Sun, 05 March 2023"))
 => -1

This is because the weekend count includes both the start date and the end date, while the total number of days excludes the end date. Ie, working_days = 1 - 2 = -1

A human could easily have made the same mistake, mind you.

A better / simpler implementation is:

(from_date...to_date).count { |date| date.on_weekday? }

Note the 3 dots (...) for the date range, which excludes the end date. Also note that on_weekday? comes from ActiveSupport, so it is available in Rails but not in plain Ruby.
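For use outside Rails, the same idea can be sketched in plain Ruby using only the standard library. The method and parameter names here are my own choice, and swapping the arguments when they arrive out of order is an assumption carried over from the ChatGPT version.

```ruby
require 'date'

# Counts weekdays from from_date up to (but not including) to_date.
def working_days_between(from_date, to_date)
  # Accept the dates in either order.
  from_date, to_date = to_date, from_date if from_date > to_date

  # The exclusive range (...) leaves out the end date, so the day count and
  # the weekend check always cover exactly the same set of days.
  (from_date...to_date).count { |date| !(date.saturday? || date.sunday?) }
end
```

Because every day is tested exactly once, the result can never go negative, whatever day of the week the range starts or ends on.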

Later, I tried asking ChatGPT to regenerate the answer multiple times. It gave me quite a different version every time – some versions with bugs, some with no functions, some with support for public holidays, etc.

Rails ActiveModel, with nested objects and validation

So maybe you have a model that is not backed by a database table? ActiveModel is meant to cover this scenario and give you the usual ActiveRecord goodness and validation.

But the story gets much harder if you want to have nested objects. In a normal ActiveRecord::Base backed model, you can simply say:

accepts_nested_attributes_for :order_lines

and Rails will manage form submitted nested parameters for you.

Life is not so simple, or so well documented, with nested objects on ActiveModel. accepts_nested_attributes_for is not available, but some of the underpinnings are.

So enough talk, how do you make it work? I’ll show you with an Order / OrderLines example.

Note the very special name: order_lines_attributes=. This hooks into the Rails handling of nested form parameters. Also the valid? method propagates the child errors up to the parent object, so that they show up at the top of the form.
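A minimal sketch of the two ideas (the specially named writer, and valid? pushing child errors up to the parent) might look like the following. It is written in plain Ruby so it runs standalone; in a real app the classes would include ActiveModel::Model and use its validations and errors object instead of the hand-rolled arrays. The attribute names (product, quantity, customer_name) and the validation rules are assumptions for illustration only.

```ruby
# Plain-Ruby stand-in for an ActiveModel-style child object.
class OrderLine
  attr_accessor :product, :quantity
  attr_reader :errors

  def initialize(attributes = {})
    @errors = []
    attributes.each { |name, value| public_send("#{name}=", value) }
  end

  def valid?
    @errors = []
    @errors << "Product can't be blank" if product.to_s.strip.empty?
    @errors << "Quantity must be greater than 0" unless quantity.to_i > 0
    @errors.empty?
  end
end

class Order
  attr_accessor :customer_name
  attr_reader :order_lines, :errors

  def initialize(attributes = {})
    @order_lines = []
    @errors = []
    attributes.each { |name, value| public_send("#{name}=", value) }
  end

  # The specially named writer. Nested form params arrive as a hash keyed by
  # index, eg { "0" => {...}, "1" => {...} }; an array of hashes also works.
  def order_lines_attributes=(attributes)
    rows = attributes.respond_to?(:values) ? attributes.values : attributes
    @order_lines = rows.map { |row| OrderLine.new(row) }
  end

  # Validates the parent, then copies each child's errors up to the parent.
  def valid?
    @errors = []
    @errors << "Customer name can't be blank" if customer_name.to_s.strip.empty?
    order_lines.each_with_index do |line, index|
      next if line.valid?
      line.errors.each { |message| @errors << "Line #{index + 1}: #{message}" }
    end
    @errors.empty?
  end
end
```

The form builder’s fields_for checks whether the object responds to order_lines_attributes=, which is why the method name matters so much.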

Now how do you do the nested form? It’s similar to normal database backed nested records.

Hope this helps. It is not documented anywhere I could find, and I worked it out mainly through reading the Rails source.

Rails: Removing error divs around labels

Rails makes it very easy to style fields with errors on your form. Unfortunately, the same error DIV with class ‘field_with_errors’ is applied around labels, as well as inputs/checkboxes/selects. This tends to mess up the layout and double up on error display. To fix this, you can configure the field_error_proc in your application.rb to ignore labels: call the original error decoration proc for all types of tags except for labels.
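One way to sketch this is a small wrapper that takes the original proc and returns a new one that leaves labels alone. The helper name is hypothetical and the label detection (a simple regex on the generated tag) is an assumption. The wiring shown in the comment uses ActionView::Base.field_error_proc, and where exactly you set it will depend on your app.

```ruby
# Wraps an existing field error proc so that <label> tags pass through
# untouched while every other tag still gets the original decoration.
def skip_labels_error_proc(original_proc)
  proc do |html_tag, instance|
    if html_tag.to_s.match?(/\A\s*<label\b/i)
      html_tag
    else
      original_proc.call(html_tag, instance)
    end
  end
end

# Possible wiring in a Rails app (an assumption, adjust to your setup):
#   ActionView::Base.field_error_proc =
#     skip_labels_error_proc(ActionView::Base.field_error_proc)
```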

Importing Excel 365 CSVs with Ruby on OSX

Up until 2016, Excel for the Mac provided a handy CSV export format called “Windows CSV”, which used iso-8859-1/Windows-1252 encoding. It was reliable, handled simple extended characters like degree signs, and could be read in Ruby with code like:

CSV.foreach(file, encoding:'iso-8859-1', headers:true, skip_blanks:true) do |row|
    ....
end

Unfortunately, support for this format was dropped in Excel 365. Many RiskAssess data files were in this format, as the earlier versions of Excel did not properly export to valid UTF-8.

Excel 365 now has a Save As option of “CSV UTF-8 (Comma-delimited)”. This is UTF-8.. with a few twists! Sometimes it includes a UTF-8 format first character, and sometimes not. Sometimes it includes the UTF-8 format first character plus a BOM (byte order mark), another invisible character. According to Wikipedia “The Unicode Standard permits the BOM in UTF-8 but does not require or recommend its use”. This makes it trickier to import. Code like this:

CSV.foreach(file, encoding:'UTF-8', headers:true, skip_blanks:true) do |row|
    ....
end

will handle the first 2 possibilities, but not the BOM. The BOM ends up as an invisible character in the data, causing trouble. It is possible to import with encoding ‘bom|utf-8’ which will handle this case. Another option is to run data through the dos2unix command (easily installed with brew) which does general tidying including removing the unnecessary BOM.
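A small sketch of the ‘bom|utf-8’ approach, wrapped in a hypothetical helper that returns each row as a hash:

```ruby
require 'csv'

# Reads a CSV saved from Excel 365 as "CSV UTF-8 (Comma-delimited)".
# The 'bom|utf-8' encoding strips a leading BOM when one is present and
# reads the file as ordinary UTF-8 when it is not.
def read_excel_utf8_csv(path)
  rows = []
  CSV.foreach(path, encoding: 'bom|utf-8', headers: true, skip_blanks: true) do |row|
    rows << row.to_h
  end
  rows
end
```

Without the bom| prefix, the invisible BOM ends up glued to the front of the first header name, so lookups by that header quietly return nil.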

Also to watch out for, “Windows CSV” format previously used “\r” to denote new lines inside of cells. The new UTF-8 export uses “\n” for new lines inside of cells.

Fixing ‘Invalid query parameters: invalid %-encoding’ in a Rails App

Sometimes users manually edit query strings in the address bar, and make requests that have invalid encodings. Unfortunately Rails does not handle this neatly and the exception bubbles up. Eg,

ActionController::BadRequest
ActionView::Template::Error: Invalid query parameters: invalid %-encoding (%2Fsearch%2Fall%Forder%3Ddescending%26page%3D5%26sort%3Dcreated_at)

from:
/rack/lib/rack/utils.rb:127:in `rescue in parse_nested_query'

[Note: This was with Passenger, which passed the request through to the app – your mileage may vary with other servers]

In the case of my app, these corrupted query strings are not that important, but users are receiving 500 server error pages. Sometimes they end up with a bad query string URL cached in browser history, so they keep going back to it rather than to the home page.

A simple solution, that gives a good user experience for my app, is to simply drop the query string on a request completely if it has invalid encoding. See my implementation using Rack middleware below:
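A minimal sketch of such a middleware (not necessarily identical to the original) is shown here. It relies on URI.decode_www_form_component from the standard library, which raises ArgumentError on a bad %-encoding. The class name is hypothetical, and it only deals with malformed % escapes; other kinds of bad input would need their own handling.

```ruby
require 'uri'

# Rack middleware that blanks the query string when it contains an invalid
# %-encoding, so the request continues as if no query string had been sent.
class DropInvalidQueryString
  def initialize(app)
    @app = app
  end

  def call(env)
    begin
      # Raises ArgumentError if a '%' is not followed by two hex digits.
      URI.decode_www_form_component(env['QUERY_STRING'].to_s)
    rescue ArgumentError
      env['QUERY_STRING'] = ''
    end
    @app.call(env)
  end
end

# Possible wiring in a Rails app (an assumption, adjust to your setup):
#   config.middleware.insert_before 0, DropInvalidQueryString
```

The middleware needs to sit early in the stack so that it runs before anything tries to parse the request parameters.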

Rails serving big password protected files – Capistrano Rsync & X-Sendfile

Say you’ve got a few app servers, and you want to serve up some largish files from your Rails app (eg, PDFs) behind a login screen. Well, you could put them on S3 and redirect the user to S3 with expiring links. However, this would mean the eventual URL the user gets in their browser is an S3 URL that fails in an ugly way, with an XML error, once the link expires. And if the link (copied from the address bar) is shared around, it’ll work for non-authorised people for a little while. Then when the S3 link expires, the receivers of the link will never get to see your site/product (maybe they might want to register/subscribe); instead they’ll just get a yucky S3 API-looking error in XML, and nowhere to go.

Well, where could we put the files? How about using the file system on the app servers?

Two things to solve.. how to efficiently ship the files to the app servers, and how to serve them without tying up expensive Ruby processes.

1. Shipping the files to the app servers with Capistrano and Rsync

If you’re using Docker or similar, you might want to bake the files into the images, if they aren’t too huge.

Assuming you’ve got a more classic style of deploy though..

Welcome old friends Capistrano and Rsync. Using Rsync we can minimise the time and data spent sending files by using binary diffs. We can also do the file transfers simultaneously to all app servers using cap. Here are the tasks I put together. The deploy_pdfs task will even set up the shared directory for us.

We’re sticking the files into the ‘shared/pdfs’ directory created by capistrano on the app servers. Locally, we have them sitting in a ‘pdfs’ directory in the root of the rails app. This might seem inconsistent (and it is), but the reason is due to limitations/security restrictions with X-Sendfile.

2. Serving the files with X-Sendfile/Apache to let Rails get on with other things

So Rails provides a helpful send_file method. Awesome! Just password protect a controller action and then send_file away! Oh but wait, that will tie up our expensive/heavy ruby processes sending files. Fine for Dev, but not so great for production. The good news is we can hand this work off to our lightweight Apache/nginx processes. Since I use Apache/Ubuntu, that’s what I’ll cover here, but the nginx setup is similar. Using the X-Sendfile header, our rails app can tell the web server a path to a file to send to the client.

How to set up Apache & Rails for X-Sendfile

Ok let’s get Apache rolling:

You need to whitelist the path that files can be sent from, and it can’t be a soft link. It needs to be an absolute path on disk. Hence we are using the ‘shared’ directory Capistrano creates, rather than a soft-linked directory in ‘current’. The X-Sendfile header itself lets you send files without a path (it just looks for the files in the whitelisted path), but unfortunately we can’t use this, as Rails send_file checks that the path exists and raises if it can’t find the file.

In your rails app in production.rb add:

  # Hand off to apache to send big files
  config.action_dispatch.x_sendfile_header = 'X-Sendfile'

In development you probably don’t need this since you won’t be using a server that supports x_sendfile. Without this config, rails will just read the file on disk and send it itself.

In a controller somewhere, just use send_file to hand off to Apache. You’ll need to specify the absolute path to the file in the ‘shared’ directory. I’d suggest putting the path to the shared directory in an environment variable or config file (however you do this usually for your app per environment), and then just append the relevant filename on to it. Also, remember to validate the requested filename (I use a whitelist of filenames to be sure), to avoid the possibility of malicious requests getting sent private files they shouldn’t from elsewhere on disk.
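The filename check can be sketched as a small helper that only ever builds paths for known files. The whitelist contents, the helper name and the SHARED_PDFS_DIR environment variable are all hypothetical. The controller usage in the comment assumes a standard send_file call.

```ruby
# Hypothetical whitelist of files that may be served.
ALLOWED_PDFS = %w[guide.pdf pricing.pdf].freeze

# Returns the absolute path for a whitelisted filename, or nil for anything
# else (including path traversal attempts such as "../../etc/passwd").
def pdf_path_for(requested_name, shared_dir)
  return nil unless ALLOWED_PDFS.include?(requested_name)

  File.join(shared_dir, requested_name)
end

# Possible controller usage (an assumption, adjust to your app):
#   path = pdf_path_for(params[:name], ENV.fetch('SHARED_PDFS_DIR'))
#   return head(:not_found) unless path
#   send_file path, type: 'application/pdf', disposition: 'inline'
```

Matching against an exact list, rather than trying to sanitise the requested name, means there is no user-supplied path component left to get wrong.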

BigDecimal fix for Rails 4 with Ruby 2.4

Rails 4.2.9 works well with Ruby 2.4.2 except for an incompatible change with invalid decimal values and String#to_d. BigDecimal was changed in 1.3.0 (which ships with Ruby 2.4) to throw an exception on invalid values passed to the constructor. This also impacted String#to_d and caused it to raise an exception when it didn’t previously.

If string_amount is an empty String, code like this:

string_amount.to_d

will throw an exception like:

ArgumentError:
  invalid value for BigDecimal(): ""

rather than returning 0 as it did on ruby 2.3 and below.

This is handled in Rails 5 but the change has not been ported back to Rails 4.

If you’re on Rails 4, you’ve got a few options. You could add a monkey patch in your application.rb or initializer:

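require 'bigdecimal'
require 'bigdecimal/util' # load the stock String#to_d first so this patch wins

class String
  # Return 0 for values BigDecimal() rejects (such as an empty string)
  # instead of raising.
  def to_d
    BigDecimal(self)
  rescue ArgumentError
    BigDecimal(0)
  end
end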

Or alternatively upgrade the BigDecimal version in your bundle. Ie, add to your Gemfile:

gem "bigdecimal", ">= 1.3.2"

The behaviour was fixed in BigDecimal 1.3.2, but Ruby 2.4 does not include that version yet. With this Gemfile entry, Bundler will ensure you get the later version of BigDecimal rather than the one that ships with Ruby 2.4. See the Ruby issue for more details.

Moving to HTTPS, Rails force_ssl and Rollback

Background
We recently moved the Getup site from mixed HTTP/HTTPS to completely HTTPS. The primary driver was to ensure that sessions were never sent in plain text over the wire, to avoid session hijacking. There are other benefits too, such as protecting personal details from eavesdropping over the wire, proving site authenticity and generally simplifying the code. Checking around the web, Twitter is HTTPS only, and even Google search is all HTTPS (when you are logged in).

Rails HTTPS and force_ssl
The easiest and simplest way of moving a Rails 3 app to all HTTPS is to set config.force_ssl = true in the relevant environment files. This causes the Rack::SSL middleware to be loaded as the first middleware. As you can see from the code, this middleware does a variety of good stuff that really ensures once HTTPS, always HTTPS!

  • 301 permanent redirect to HTTPS (cached forever in most browsers so you never hit the http url again)
  • Secure on the cookies so they can never be sent over HTTP (important they don’t accidentally go with a redirect for example!)
  • HSTS to ensure that in supported browsers, nobody can ever go to HTTP again!

HSTS
HSTS (HTTP Strict Transport Security), which is supported by Chrome and Firefox, ensures that for a given time (usually a long time, eg 20 years in the case of Twitter!), it is not possible to go to the site over HTTP again. To activate this, the site only needs to send the Strict-Transport-Security header once. You can check and manage what’s in your HSTS store in Chrome with chrome://net-internals/#hsts

Rollback
When moving to HTTPS, we wanted to ensure we could rollback to a previous release in case of problems. Using force_ssl out of the box precludes this – if you roll back after a 301 redirect or HSTS loaded by a client browser, your site will no longer be accessible!

We used a small monkey patch which turns off HSTS and uses a 302 temporary redirect, rather than a 301. This means that rollback to a previous release works fine. Here’s the patch:

require 'rack/ssl'

module ::Rack
  class SSL

    def redirect_to_https(env)
      req        = Request.new(env)
      url        = URI(req.url)
      url.scheme = "https"
      url.host   = @host if @host
      status     = 302
      headers    = { 'Content-Type'  => 'text/html',
                     'Location'      => url.to_s,
                     'Cache-Control' => 'no-cache, no-store'
                   }
      [status, headers, []]
    end

    def hsts_headers
      {}
    end

  end
end

This patch is only needed temporarily, until you are sure you will never want to roll back to a release from before force_ssl.

Performance
So is HTTPS slower than HTTP? It is for the first request, as SSL needs to be negotiated and set up, which adds a few more round trips. If your clients and servers are in the same country, this is pretty insignificant. Round trips from Australia to the USA, for example, are more significant but not a major stumbling block, as long as you use Keep-Alive on the connection so that later requests re-use the setup from the first request.

Assets are fully cacheable over HTTPS using the usual HTTP headers, and have been even since early Internet Explorer versions. You do need to make sure that you always load HTTPS assets though, to ensure you don’t get mixed-mode warnings in the web browser. We found some useful HTTPS performance tips here.

Load on the servers was not an issue for us as we are using Amazon Elastic Load Balancers (ELB) for our HTTPS implementation. The web/app servers don’t get involved as they are just reverse proxied by the ELB, which manages the HTTPS sessions.

Redirect Gotchas!
We have a few other domains which simply redirect to the canonical www.getup.org.au domain. Out of the box, the Rack::SSL middleware loads first, before our redirect middleware. This meant that for these additional domains, we got a nasty certificate warning in the browser, as the browser is sent to HTTPS first (on the wrong domain) and only then gets the redirect to the canonical domain that has the valid certificate. Changing the order of the middleware to do redirects first, and then HTTPS, is an easy solution.

Conclusion
The move to full HTTPS has gone smoothly and we didn’t end up needing rollback. However, it was worth having the monkey patch so that rollback was possible as an insurance policy against unexpected major problems.

Talk on Tues: Moving to HTTPS

I’ll be giving a talk at Sydney ALT.NET on Tues:

After recently moving the Getup site fully to HTTPS, James will share with you security pitfalls, the justification for the move from mixed HTTP/HTTPS, lessons learnt, and performance tips. A romp through the protocols of the web with riffs on status codes, HSTS, domain verification, and interesting headers. This talk could save your bacon.

From 6pm at ThoughtWorks Sydney office on Pitt St. Remember to RSVP on the Sydney ALT.NET site to help with catering. See you there!
