From: Rick Bradley <rick@...>
Subject: (long!) An "Enterprise" Rails story and a help with nested transaction support
Newsgroups: gmane.comp.lang.ruby.rails
Date: 2005-10-10 17:32:22 GMT
Hello, everyone, 
A tiny bit of background first:  I'm managing one of those so-called
"Enterprise" projects: N * 1,000 dedicated users spread over multiple
states (probably 80+ locations at deployment), 24/7 uptime required,
$multi-million project budget (see also: training), multiple platforms
for both servers and clients, etc.

We started out presuming we were going to be forced to use a Java stack
(primarily due to 3rd parties we are partnered with).  We implemented a
vertical slice of the app (db tables, Hibernate mappings, pojos/beans,
Struts config, JSPs, AJAX stuff, unit tests) for one group of models.
Only, actually we couldn't get that far in a reasonable amount of time:
we began to get bogged down in "shotgun code smell" (aka "the domino
effect") with all the inter-layer coupling, and AJAX was damned-near
impossible to get working with the Struts/JSP layer we were using.  That
and we still hadn't had the time yet to make a decision on which testing
tools to use (JUnit, Cactus, HTTPUnit, JHTTPsomethingorother, etc.).

I came into this project a big Rails fan, but, given the external
requirements, I understood that avenue had been foreclosed.

As we began to run the boat aground on the big Java stack our developers
(who were of their own accord looking at how Rails does the AJAX views
for help in figuring out how to do such things in Java) started asking
"Why don't we just do this stuff in Ruby?"  These are Java developers,
hired for their Java skills, asking why not move to Ruby on Rails?

At about that time, some of the external constraints started to loosen.
It may not be *necessary* to deploy the project in Java, after all.
But, we'd need a good case for switching architectures.

Shortly thereafter our boss (The Director) said, "How would you guys
feel about taking a couple of weeks to implement some of this in Ruby to
see how well it would work?"  There wasn't a hint of a "no" at the
table.

Our technical lead hacked a quick vertical version of one model class in
Rails that had already been implemented in Java.  He found an 8:1
reduction in code size going from Java to Rails.

About a week ago (just a couple of days after our boss gave us the "go"
to try a Rails version of the module we'd already written in Java), we
started Rails development, using 3 developers (2 of whom had only a day
or two's worth of knowledge of Ruby), one very part-time DBA, and me
fielding the occasional question nobody else wants to deal with (see
below...).

After a week we have ~85% of the Rails development done for the module,
including a slick AJAX interface (which we never managed in Java), and a
subset of our needed unit/functional tests (see below...).  Best
estimate for equivalent effort we put in on the Java side (because our
data modelling can be reused, clearly) is about 6 weeks.

Here are some figures from today's most recent SVN pulls:

Java version:

10361 lines of Java code
1143  lines of JSP
8082  lines of XML
1267  lines of build configuration
-----------------------------------------------------------
20853 TOTAL lines of stuff

Rails version:

494   lines of code (386 "LOC" per rake stats)
254   lines of RHTML
75    lines of configuration (includes comments in routes.rb)
0     lines of build configuration
-----------------------------------------------------------
823   TOTAL lines of stuff

Code reduction alone is right around 21:1, and for overall lines produced
(config, templates, code) the ratio is just over 25:1.  I'm using the
larger (494) number for Rails LOC because I counted the Java LOC with a
simple 'wc -l', which doesn't strip out comments, whitespace, whatever.
We are also using 2 DB platforms, hence the ability to get a full 75
lines of config for the Rails app.
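
For anyone who wants to check the arithmetic, here is a small Ruby sketch that recomputes the totals and ratios from the counts listed above. The constant names JAVA and RAILS are just labels chosen for this example.

```ruby
# Line counts copied from the two tables above.
JAVA  = { code: 10_361, jsp: 1_143, xml: 8_082, build: 1_267 }.freeze
RAILS = { code: 494, rhtml: 254, config: 75, build: 0 }.freeze

java_total  = JAVA.values.sum    # 20853
rails_total = RAILS.values.sum   # 823

# Code-only ratio uses the raw 494-line figure, matching the 'wc -l' Java count.
code_ratio  = JAVA[:code].to_f / RAILS[:code]
total_ratio = java_total.to_f / rails_total

puts format("code only: %.1f:1", code_ratio)   # prints "code only: 21.0:1"
puts format("everything: %.1f:1", total_ratio) # prints "everything: 25.3:1"
```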

So, bottom line from where we sit:  the guys complaining that reported
10:1 savings over Java was bullsh*t were right on target:  10:1 is way
too low from what we're seeing.

I'm sure there will be plenty of people saying "Well, you should have
used {JSF, Shale, Tapestry, Spring, Echo{1,2}, Castor, Cayenne, etc.}"
(fwiw, we were using draft EJB3.0 w/ annotations), but I've seen zero
evidence that any combination of the most cutting edge Java components
will get us down to a functional application using <= 823 lines of total
stuff -- really, not even w/in a factor of 5 of that from all I've read
lately.

Note that I haven't even mentioned to this point the app server (JBoss),
which includes a few lines of XML complexity of its own:

& Mon Oct 10 10:49:57 jrbradle@rick
~/svn/phoenix/srv$ find . -type f -name '*.xml' | xargs wc -l | grep total 
44472 total

Add to that the sheer heft of running JBoss under CruiseControl, and
our build server has to be downright stoked with RAM.

  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Enough.  As Arlo Guthrie would say, "But that's not what I came here to
talk to you about." I came to talk to you about nested transactions in
ActiveRecord.  More background?  Glad you asked.

Our project is building out a healthcare application, one which will be
open-sourced within a consortium of mental health care providers around
the US.  The reason the consortium exists (and that it's not just an
open source free-for-all) is that we, a community-based mental
healthcare nonprofit, want to promote research in the field (see also
industry buzzword "evidence-based treatment").  We will give away the
system, even host it if you need it, so long as you provide your patient
data (anonymized!) for use in a research database to be used by
researchers in academia and elsewhere to develop new mental health care
protocols.  The protocols feed back into the system, lather, rinse,
repeat.

Great fun, until you run into HIPAA, Sarbanes-Oxley, and the restrictive
auditing requirements for drug trials.  For anything related to patient
data we not only have to log who did what to the data, but also who saw
the data, and we typically have to preserve the data in the database.
This ultimately requires (among other things) a database-level no
deletions policy.  In our data model one of the ways this is implemented
is by constraints (we're currently targeting both Oracle and PostgreSQL
though others will be supported later) which say for certain tables
"sorry, you can't delete that record!".  Since access to the database
can come from outside the Rails application (there is a policy document
which enumerates what we will be supporting), app-level constraints do
not suffice.

Which brings me to the crux of the matter.  To actually test the
functionality we've implemented, using fixtures, we discovered something
interesting (and, in retrospect, obvious).  When we specify fixtures on a
protected table, the fixtures will load in properly, we use
transactional tests so that our individual unit tests run in isolation,
and then at the end of the test the fixtures attempt to unload from the
database but can't -- because the data can't be deleted from the table.
Further, the next test case which uses those fixtures is in a bit of a
bind because the fixture records in question are already in the table.

This is one of those cases, to me, where we don't have a choice but to
use the constraints in the database.  And it's important for us to be
able to test that deletions can't be performed -- otherwise how will we
know if the application behaves properly under the real production
constraints?  It really looks like we need a further layer of
transactions (a nested transaction) around the overall test case.
Start a transaction, load the fixtures, wrap each unit test in a
transaction (like we're currently doing) with a rollback, and at the end
roll back the big transaction.
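
To make that flow concrete, here is a toy, in-memory sketch. It does not touch a database or ActiveRecord: ToyTable, begin_level and rollback_level are names invented for this example, and the row strings are placeholders. The point is only that the fixtures disappear when the outer level rolls back, without a delete ever being allowed.

```ruby
# Toy stand-in for a table protected by a "no deletions" constraint.
# ToyTable and its methods are hypothetical, not ActiveRecord API.
class ToyTable
  class DeleteForbidden < StandardError; end

  attr_reader :rows

  def initialize
    @rows = []
    @snapshots = [] # one saved copy of the rows per open transaction level
  end

  def insert(row)
    @rows << row
  end

  # Mimics the database constraint: deletes are always refused.
  def delete(_row)
    raise DeleteForbidden, "sorry, you can't delete that record!"
  end

  # Open a (possibly nested) transaction level.
  def begin_level
    @snapshots.push(@rows.dup)
  end

  # Undo everything done since the matching begin_level.
  def rollback_level
    @rows = @snapshots.pop
  end
end

table = ToyTable.new
table.begin_level                      # big transaction around the test case
table.insert("fixture row")            # fixtures load once

2.times do |i|
  table.begin_level                    # inner transaction around one test
  table.insert("scratch row #{i}")
  begin
    table.delete("fixture row")
  rescue ToyTable::DeleteForbidden
    # the test can confirm that the constraint really blocks deletion
  end
  table.rollback_level                 # only the fixture row is left
end

table.rollback_level                   # fixtures vanish, no DELETE needed
puts table.rows.empty?                 # prints "true"
```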

So I did some legwork (on the 0.13.1 version, not yet on the svn trunk,
though I spot-checked some things in the Trac browser and didn't see any
eyebrow-raisers).  I see that Transaction::Simple and ActiveRecord::Base
handle the core transactional support, ultimately delegating to the
actual connection adapters to do a "BEGIN"/"ROLLBACK"/"COMMIT".  Support
for nested transactions would introduce the notion of savepoints, and
rolling back thereto.
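
As a rough illustration of the savepoint idea (not how Transaction::Simple or the adapters actually work), the sketch below keeps a nesting depth and sends BEGIN/COMMIT/ROLLBACK at the outermost level and savepoint statements below it. RecordingConnection, NestedTransactions and the sp_N savepoint names are all made up for this example; the connection only records the statements it would send.

```ruby
# Stand-in for a real connection adapter: records SQL instead of running it.
# All class and method names in this sketch are hypothetical.
class RecordingConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def execute(sql)
    @statements << sql
  end
end

class NestedTransactions
  # Raise this inside a block to roll back that level quietly.
  class Rollback < StandardError; end

  def initialize(connection)
    @connection = connection
    @depth = 0
  end

  def transaction
    outermost = @depth.zero?
    name = "sp_#{@depth}"
    @connection.execute(outermost ? "BEGIN" : "SAVEPOINT #{name}")
    @depth += 1
    begin
      yield
    rescue Rollback
      # Undo this level only, then carry on in the enclosing level.
      @connection.execute(outermost ? "ROLLBACK" : "ROLLBACK TO SAVEPOINT #{name}")
    rescue StandardError
      # Any other error: undo this level and let the error propagate.
      @connection.execute(outermost ? "ROLLBACK" : "ROLLBACK TO SAVEPOINT #{name}")
      raise
    else
      # RELEASE SAVEPOINT is PostgreSQL syntax; to my knowledge Oracle has no
      # equivalent statement, so a real adapter would need to skip it there.
      @connection.execute(outermost ? "COMMIT" : "RELEASE SAVEPOINT #{name}")
    ensure
      @depth -= 1
    end
  end
end

conn = RecordingConnection.new
tx = NestedTransactions.new(conn)

tx.transaction do                      # big transaction around the test case
  conn.execute("-- load fixtures")     # placeholder, not real SQL
  tx.transaction do                    # one unit test
    conn.execute("-- test body")
    raise NestedTransactions::Rollback
  end
  raise NestedTransactions::Rollback   # finally undo the fixtures too
end

puts conn.statements
```

Run as written, the recorded statements are BEGIN, the fixture placeholder, SAVEPOINT sp_1, the test placeholder, ROLLBACK TO SAVEPOINT sp_1, and ROLLBACK.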

Note that right now the only thing I'm aware of that's holding us up
from saying "We did in Ruby on Rails in 8 days, in 1/20th the amount of
code, what took us more than 6 weeks to do in Java" is the inability to
test our actual database setup, and that smells like it's just limited
by the lack of nested transactions.  Everything else we've tried
we've had no problem getting to work.

My fears about adding in nested transaction support entirely by myself:

 1- I'd be tempted to do "just what works", which would probably be
 bastardizing the pretty AR way into just hammering nested transactions
 in for OCI and postgres to get our tests to work.  That would break on
 the next release, and wouldn't be much help to anyone else.

 2- Somebody else is probably working on this already.  In which case
 I'd be working at cross purposes and our version would have to be
 dumped on the next release anyway.

 3- I'd probably invent an unwieldy abstraction that wouldn't work well
 for the rest of AR -- and I probably wouldn't be able to test it well
 for the various adapters in use out there.  I don't really have a 100%
 handle on the Test::Unit changes that AR does, nor quite what
 everything in the fixture flow is doing -- I think I could get there
 quickly though.

 4- SVN trunk is a moving target, that's for sure.

Anyway, I'm looking for some advice on the best way to
implement this.

If someone's working on this already, can I be of help?  If not, what
would be the best way to communicate savepoint information up and down
through AR to/from the various connection adapters?  What about adapters
which don't support nested transactions?  On the testing side,
presumably there needs to be a class variable to turn things on/off.
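
One possible shape for those last two questions, purely as a strawman: each adapter answers a capability query, and the test case carries a class-level switch. Nothing here is existing ActiveRecord 0.13.1 API; FakeAdapter, supports_savepoints?, FixtureTestCase, use_nested_transactions and the strategy names are all assumptions made up for this sketch.

```ruby
# Hypothetical adapter that simply reports whether it can do savepoints.
class FakeAdapter
  def initialize(savepoints)
    @savepoints = savepoints
  end

  def supports_savepoints?
    @savepoints
  end
end

# Hypothetical test-case base class with a class-level on/off switch.
class FixtureTestCase
  class << self
    attr_accessor :use_nested_transactions
  end
  self.use_nested_transactions = true

  # Pick how fixtures get cleaned up for a given adapter.
  def self.strategy_for(adapter)
    if use_nested_transactions && adapter.supports_savepoints?
      :savepoints        # wrap the whole test case, roll back at the end
    else
      :delete_fixtures   # fall back to unloading fixtures by deleting rows
    end
  end
end

puts FixtureTestCase.strategy_for(FakeAdapter.new(true))   # prints "savepoints"
puts FixtureTestCase.strategy_for(FakeAdapter.new(false))  # prints "delete_fixtures"
```

The fallback branch is exactly the behavior that fails on our protected tables, so an adapter without savepoints would still need some other answer there.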

Just looking for some guidance.

Thanks for any insights.  Oh, and see you at RubyConf!

Rick
-- 
 http://www.rickbradley.com    MUPRN: 67
                       |  fixes needed" Heh,
   random email haiku  |  it was on my todo list
                       |  for the past few months.