Aug 1, 2013

Spent for Trello - Google Apps integration through Apps Script and Chrome extensions

Note: The apps script backend described here is no longer used as Plus for Trello now supports all those backend features (and much more) without needing the backend. The rest of this post shows how Plus for Trello worked back in 2013 with an apps script backend and google spreadsheet storage.

At Cloudware360 we use Trello to organize and document our development process. Each of our products has a Trello board to keep track of tasks and its stages (columns in Trello as in To do, Doing, Done). Here is a simple example:



However, Trello does not support Spent/Estimate on tasks (called cards in Trello). We found a Chrome Extension called Trello 3000, which does allow assignment of  Spent/Estimate units (hours, days, etc) per card but it was too basic:
  • You had to type the Spent/Estimate in a special format at the beginning of each card title: For example, a card named "(1/9) Upgrade interface" meant that you had advanced 1 unit out of 9 units.
  • There was no way to know who consumed the units, or get any type of reports like Spent units per day per member, Spent vs. Estimate over time, etc.
  • Anyone could change the title and without a historical log there was no traceability.
  • To get information about several boards you had to enter each board manually and collect the information.
  • You couldn't see Spent per users, unless only one user worked on a board, and you had to manually add up all its board's stats.

It was great that Trello 3000 took care of making it look pretty, plus added totals per columns and board, however it was not enough for us. We needed user and card traceability and arbitrary historical reporting.

We already had a system based on Google Spreadsheets with Apps Script but it was double work to enter tasks there and in Trello. We wanted to integrate our existing system with trello, and we saw that S/E card/board/user tracking was a highly requested feature for Trello.

We modified our Apps Script so that data entry would no longer be done from spreadsheets or a separate interface. Instead, you can enter them directly in Trello as card comments:





Our system uses the Trello API to read every minute these special comments. "specialUser" is a special Trello user an admin creates, which receives notifications from the board users. To set up a board all you need to do is add you own special user to the board.

In Trello you can "tag" users with @, so "@specialUser S/E comment" is notifying this special user about its Spent/Estimate progress, timestamped with the comment post time.

From bottom to top, you can see that user "Zig" has advanced 5 hours out of 90 estimated hours.  (5/90)
Next, he advanced 1 more hour, while keeping the estimate equal (1/0).  Spent/Estimate are additive, thus so far the card is at (6/90).
Finally, Zig realized that he needs more time and increased the estimate by 3 hours, leaving the card at (6/93).

Since we interact with trello just with comments, we can also use the Trello mobile app (Android, iOS) to report our Spent/Estmates as well, which is a really great plus.

The system automatically updates the card balances in the card titles. Spents and Estimates are independent for each user, so you can even have a single card shared by multiple users, each with its own estimate. In that case the card estimate is the sum of estimates from all user's estimates.

All this information is also written to Google spreadsheets. There is one spreadsheet per user, and one global "master" spreadsheet, which contains all the information (the union of all the information in all the user spreadsheets). the "Admin" page takes care of all the user setup, which creates and automatically configures all the necessary spreadsheets plus a configuration spreadsheet.

We also updated Trello 3000 which is open source, and made an improved Chrome Extension called "Plus for Trello". This extension includes a weekly Spent report for the user and links to their spreadsheets. Administrators get additional links to the master spreadsheet and system configuration. we also changed its color scheme and a few other fixes:




Users can also see spent/estimates for All Boards in the trello page:




On that same page, users see embedded charts. and lists. Administrators get this view on the Trello home page:



Users get the same view except for the first chart (team spent this week) These reports and charts load very quickly because they use the user's Google credentials to read the data directly from the spreadsheets, without having to call the apps script.

In the weekly report, W31 refers to the current week number and clicking the report opens the spreadsheet with the Dashboard sheet selected:



Users can go from a spreadsheet row directly to the trello card by clicking its URL (note: screenshots dont yet show this.)

The Dashboard sheet has two reports, both built using  the spreadsheet QUERY function. All historical data is stored on the "trello" sheet. There are other sheets with more reports, for example this report shows Spent units for a given week. Changing the orange cell to "2" will show the report from 2 weeks ago (zero shows the current week). QUERY's  filtering, group by and pivot takes care of all the reports, which are more powerful than pivot tables:




There are other reports, like spent all weeks, spent versus estimate, a powerful search by any field, and others.




Administrators get more reports on the master spreadsheet, like burn down charts per board, all cards, units spent per user per week over time, spent per board over time, spent versus estimate, search, etc:

Burn Down




Hours per Day



Hours per week




spent per board per week



Users can also make their own reports. They can read (but not change) the base trello sheet, create new sheets and put their own queries or pivot tables.

Users can use a special format to report past tasks (in case they forget to report them, etc), for example: 
@specialUser -2d 3/0 finished rewrite in cobol
reports 3 spent units for the card, 2 days ago, with comment "finished rewrite in cobol."

Users get a daily mail with their current week statistics, and also get emails when they report S/E with an incorrect format:


There are some error cases, for example when an unregistered user sends a comment to the special Trello reporting user, where we don't know that user's email address but we do know her Trello name. On those cases we send a Trello notification to the user using the Trello API to tell them about the error.

The Apps Script is written in such a way that its able to recover from any possible errors at any critical point in the code. To allow this, we augmented Apps Scripts debugging and logging capabilities by writing our own unit-testing module with a powerful logger.
This module allows us to use asserts and save a log of events with 3 levels of importance (informational, unusual, error). Whenever an error happens, it will also save the call stack of the error in the log, which is great for debugging it later:

if (trelloMasterCache.idSince===undefined) {
      assert(trelloMasterCache.table===undefined);
      assert(dbss.obj.bMaster); //should always be a cache of master
      trelloMasterCache.idSince=idSsSince;
      atStep("After getting trello data, before updating user scriptdb obj");
      saveUserData(user, userData);
            

In this snippet the code is calling assert. Assertion failures cause an exception so it forces the code to stop, plus it writes it to the log along with its call stack for later debugging.
Later you can see we call our atStep function. This declares a critical point in the code, and we have lots of those in the code. Its best to put them before starting any "write" operation that is independent of others.
We have several such states, for example, we read from Trello, then sync that info with the spreadsheets (allowing por partial syncs in case of errors), then updates some ScriptDb objects, then writes new card titles to Trello, then again writes some ScriptDb objects, with many more in-between states that handle recovery from partial syncs. We placed an atStep between all these in-between states, and later we randomly cause exceptions inside atStep thus forcing partial syncs at various possible points in the code. On the next sync, we can see on the log how the code detects and corrects the partial syncs:


Here it shows that an error was detected. This is an actual error caused by a downtime in the Trello servers. After all, we call a urlFetch on Trello 1,440 times a day, so every few days we get this exception. The error contains the call stack, and later you can see an "unusual" entry. This is because it detected that the previous sync failed. Any inconsistencies that are corrected are also logged as "unusual".
Its always good to know they are happening when they should (instead of silently patching some other bug, for example.) Admins can also turn on "full logging", which will also show all informational errors (each atStep all generates one.) This is very useful to see, in real-time, the code execution logic as testers enter card comments and play with it.

A testing module allows for tweaking of testing parameters like: how often should random exceptions happen (say, 10% of the time), and how "deep" to wait until random exceptions happen (say, wait until 5 events until staring to generate errors, or wait randomly between 1 and 20 steps before generating exceptions, etc). There, we also randomize several other global parameters, like buffer and paging sizes for querying the Trello api, among others:

function checkEnableTestMode() {
  if (!g_test.bEnableRandomTests)  //this is controlled by the admin web page
    return;
  
  
  //from time to time, have no errors
  if (randomInt(0,2)==0) {
    g_test.bEnableRandomTests=false;  //enable random failures half the time
    return;
  }
  
  g_test.percentRandomErrorSimulation=70;      //fail on 70% of critical points
  g_test.cStepsBeforeFailure=randomInt(1,40);  //wait this many steps before fail
  
  g_cItemsSSBatchSize=randomInt(1,30);         //spreadsheet process batch size
  g_cCardsBlockOperations=randomInt(1,3);      //card process batch size
  g_cLimitPageSizeTrelloNotif=randomInt(1,20); //pagesize for getting trello data
  
  if (randomInt(0,2)==2) {
    //remove unused cards after 1sec (instead of the normal 30 day wait)
    g_msPreserveRecentCards=1000; 
}


This testing module allows us to define critical points in the code (for example, after inserting blank rows, but before setting the row values), and these critical points are where the random exceptions can happen. We also have a testing trigger that randomly posts new comments to Trello. This way, during testing, the system automatically posts new S/E to cards thus testing is fully automated. 

This allowed us to implement automatic testing. We simply configure the testing parameters, and leave the system running while we use continue to use it. We check the error logs to see if there was anything unexpected like an "assert" failure.

We implemented this robust testing technique because it would have been very hard to test every single possible partial-sync case by hand and over time. After running tests we can later analyze the logs and determine if all cases were simulated and recovered correctly. We still do the regular manual tests, but we have to do much less so.

Our assert function is simply:

function assert(bValue) {
  var type = typeof bValue;
  if (type =="boolean" && bValue) {
  } else if (type =="number" && bValue!=0) {
  } else {
    throw new Error("assertion error. "+type+":"+bValue);
  }
}

The "throw" raises an exception, which is later caught at the top level of our Apps Script sync trigger:

  } catch (e) {
    bLog=true;
    errorMsg=messageFromException(e);  //extract the info along with call stack
  }
  ...
  logError(errorMsg);
}

function messageFromException(e) {
  var errorMsg="";
  if (e.message===undefined)
      errorMsg=Utilities.jsonStringify(e);
    else {
      //exceptions come with a special stack property                      
      errorMsg=e.message+" - "+e.stack;
    }
  return errorMsg;
}

A special testing page allows a tester to setup the step filter in atStep below as well as other parameters like percentRandomErrorSimulation, cStepsBeforeFailure, etc.

The atStep function looks like this:

function atStep(strStep) {
  logInformational(strStep);
  //informational logs only get written if an admin
  // or tester has configured it.
  var strThrowCause=""; //blank means dont throw
  
  //this allows a tester to trigger errors on specific (not random) spots.
  if (g_test.bEnableFailByMatchLog) {
    var txtMatchFail=ScriptProperties.getProperty(g_strPropLogMatchFail);
    if (!isBlankValue(txtMatchFail) &&
     strStep.toLowerCase().indexOf(txtMatchFail)>=0)
      strThrowCause="FailByMatchLog";
  }
  
  //Also supports random failures by number of steps and percent failure.
  if (g_test.bEnableRandomTests) {
    if (g_test.cStepsBeforeFailure>0) {
      g_test.cStepsBeforeFailure--;
    }
    else if (g_test.percentRandomErrorSimulation>0) {
      var percent= 100*Math.random();
      if (percent<=g_test.percentRandomErrorSimulation)
        strThrowCause="FailByRandom";
    }
  }
  
 //this exception will get caught the same way an assert it caught
  //which will end up in the special log.
  if (strThrowCause != "")
    throw new Error(strStep+" - "+strThrowCause);
}

Our logError function will append log data to a special sheet, and also colors the row depending on severity. Besides logError, we also have logInformational and logUnusual. Errors and Unusuals are always logged. Informational messages are logged only of its enabled, plus it has a great feature which, in case of an error, will show the informational messages as well (for the current trigger call), so we can see the exact context of the error. This is accomplished by keeping unlogged messages on a global array, and only outputting those logs when needed.

We used UI Services to quickly make the administration and testing web pages, and the published script also handles calls from the Chrome extension. The service is published as anonymous, so the Trello web page can directly show user statistics and system links:



Here the user reported on the current week #31 10 units on Tuesday, 10 on Wednesday, and zero today (Thursday). Administrators also get the other two links to the master spreadsheet and the administration page:


On this page, administrators can do the initial, one-time configuration of the Trello connection. It handles getting the necessary Trello oauth tokens, as well as configuring and modifying users and roles. There is also a link to the Test page, where the automatic testing framework can be enabled and configured, among others.

Our sync is highly optimized for the most common cases to prevent reaching execution time quotas. Calling sync with no new data only takes 2 seconds to complete, which includes the urlFetch call to the Trello API, and a sync with data takes under 20 seconds to complete. It might take longer to complete a partial sync, as more that one Trello api call might be needed to get everything back in sync. Even if for any reason it times-out, it will pick up where it left off on the previous sync (at most one minute later.)
We use cache services to keep things like options (users, roles, etc) at hand so we dont have to read the configuration spreadsheet and scriptdb for often-used information. 

The Chrome extension is not really needed for the apps script to work, however it makes Trello look better and easier to read, and it provides easy access to the weekly report and spreadsheets right from Trello.

We had a great time with Apps Script and we accomplished all of our goals and more. We also ended up with a robust script that is very easy to copy and install on any Google account, allowing anyone to use this system for free.

The extension "Plus" is already open-sourced and the Plus chrome extension  is published free, you can use it without this script.

In the near future it would be easy to extend the script and, instead of writting to user and master spreadsheets, the rows of data could be appended to a single BigQuery table. We have already done some of this work to append the row data into BigQuery. Given that rows never change in the current system (data is always appended), it would work very well in BigQuery. Partial Syncs would be easier to handle, there would be no storage limits, and all queries could be displayed with Google Charts using Dashboards.

We are also currently adding an extra card classification, so that multiple cards (in other boards or the same board) can be tracked as a single item. We will use this to classify cards by their "Requirement#". Each requirement can have multiple cards, and the requirement is on a spreadsheet column, thus reports and charts can be made grouping or pivoting by it.

Currently you can already enter them by using the #hashtag notation. You can add these #hashtags anywhere in the card title, and may add or remove them at any time by editing the card title. Spent currently just ignores them (strips them from the card title in the spreadsheets), but the extension does display them in a nice way in the Trello web page.