Tuesday, April 17, 2018

AWS Step Functions Early Best Practice Learnings

Let's be honest. Anybody who has used AWS Simple Workflow knows there is nothing simple about it.  The "simple" part is that you don't have to maintain the state engine but it is so feature rich that actually using it can be tricky, which is why there are so many frameworks built on top of it.  (I know, I built my own).

AWS Step Functions is a welcome addition from Amazon that can be thought of as a watered-down Simple Workflow with a flow language, which removes the need to create custom "deciders".

I've found it covers about 90% of what I need but I do miss some SWF features: N-branches, kicking off child-flows, etc.

One large feature I hope they add to Step Functions is a way to manipulate the output json before sending it to the next step in a process.

Best Practices:

  • Lambda functions should be written to be general, not specific to a single step function flow.
    •  Instead of writing an ExportS3ToCSVFile consider writing a more generic ExportS3ToFile lambda that is configurable for several output file types.
  • Lambda functions should, if possible, add result fields to the input json and output that json.
  • Lambda functions should take any json shape as input as long as its required fields are present.
    • Do not throw a validation error if extra fields are present in the input json.
Rationale:

It is much easier to chain lambdas within a step function flow if the above practices are followed.
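As a rough illustration of these practices, here is a minimal sketch of a generic export lambda. The field names (bucket, key, format, exportedFile) and the helper exportS3ToFile are hypothetical, and the actual S3 work is left out; the point is only the shape of the input checking and the output.

```javascript
// Hypothetical generic export step: only `bucket` and `key` are required.
const REQUIRED_FIELDS = ['bucket', 'key'];

function exportS3ToFile(event) {
  // Check only the fields this lambda needs; extra fields are not an error.
  const missing = REQUIRED_FIELDS.filter((f) => event[f] === undefined);
  if (missing.length > 0) {
    throw new Error(`Missing required fields: ${missing.join(', ')}`);
  }

  // Configurable output type instead of a CSV-only lambda (assumed default).
  const format = event.format || 'csv';

  // ... the real export work would happen here ...

  // Copy the input and add result fields so unrelated fields pass through.
  return { ...event, exportedFile: `${event.key}.${format}` };
}

exports.handler = async (event) => exportS3ToFile(event);
```

Because the input is echoed back with the result added, a field like "trace" set by an earlier step is still there for later steps.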

Missing Functionality: Json Transformation

One large piece of the puzzle that AWS Step Functions currently does not provide is a way to transform the output json of one lambda before passing it into the next lambda step.

This is unfortunate because it forces you to write your lambdas to be workflow-specific, which violates the "write generalized lambdas" best practice.

Workaround

Step Functions does provide an under-appreciated step type called "Pass".  Pass allows you to mock up an output and inject it anywhere in the input json doc.  This injection gave me the idea for a workaround: a json transform lambda.


The gist of the idea (sorry) is that you use a Pass step to inject a "transformScript" field into the incoming json document.  That field contains all the transform code in plain javascript.   Then your step function flow calls the JsonTransform Lambda to act on that script.

For example I may have a lambda that returns: 

{
  "trace": "abc123",
  "field1": "value1",
  "field2": "value2"
}

and I want to add a third field that combines field1 and field2 and a dateUpdated field.  So I use a Pass step to inject a transformScript field that produces:

{
  "trace": "abc123",
  "field1": "value1",
  "field2": "value2",
  "transformScript": "event.field3=event.field1 + \" \" + event.field2; event.dateUpdated=new Date()"
}

Forwarding this to the JsonTransform lambda produces the expected json:

{
  "trace": "abc123",
  "field1": "value1",
  "field2": "value2",
  "field3": "value1 value2",
  "dateUpdated": "2018-03-19T21:20:08.571Z"
}
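For readers who want to see the moving parts, here is a minimal sketch of what such a JsonTransform lambda could look like. It is not the original implementation; it assumes Node's built-in vm module for the sandbox, a one-second timeout, and that the transformScript field is dropped from the output (as in the example above).

```javascript
const vm = require('vm');

// Sketch only: runs event.transformScript against the rest of the event.
function jsonTransform(event) {
  // Separate the script from the data so the script is not echoed back.
  const { transformScript, ...rest } = event;
  if (typeof transformScript !== 'string') {
    return rest; // nothing to transform
  }

  // The script sees a single global named `event`, as in the example above.
  const sandbox = { event: rest };
  vm.runInNewContext(transformScript, sandbox, { timeout: 1000 });

  // Round-trip through JSON so values like Date become plain strings.
  return JSON.parse(JSON.stringify(sandbox.event));
}

exports.handler = async (event) => jsonTransform(event);
```

One caveat worth knowing: the Node.js documentation states that the vm module is not a security mechanism, so a sketch like this is only reasonable when the scripts come from your own flow definitions.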

Downsides?

Some won't like the code smell of injecting JavaScript into their step-function flow.  My counter argument is:
  • The script is run inside a Node.js sandbox so nothing too funky can happen.
  • The transforming of json from the output of one lambda to fit the expected shape of the next lambda is flow code since that transform only matters to that particular step function flow.
Another downside is that it takes two step function steps to do a transformation.
  • A "Pass" step to inject the transform script
  • A "Task" step to run the JsonTransform lambda.
Currently there is no way around this.  One thing I do to keep things straight is name both the steps as:

"CreateFields" -> "CreateFields!"

Using the same name but adding a "!" to indicate the transform.  This makes it easier to visualize where transforms are happening in the flow.
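To make the two-step pattern concrete, here is a sketch of the flow definition built as a JavaScript object and printed as json. The ARN is a made-up placeholder, and the flow ends after the transform only to keep the example short.

```javascript
// Hypothetical ARN; substitute the ARN of your own JsonTransform lambda.
const JSON_TRANSFORM_ARN =
  'arn:aws:lambda:us-east-1:123456789012:function:JsonTransform';

const definition = {
  StartAt: 'CreateFields',
  States: {
    // Step 1: a Pass step injects the script into the input document.
    'CreateFields': {
      Type: 'Pass',
      Result: 'event.field3=event.field1 + " " + event.field2; event.dateUpdated=new Date()',
      ResultPath: '$.transformScript',
      Next: 'CreateFields!'
    },
    // Step 2: a Task step runs the JsonTransform lambda on that document.
    'CreateFields!': {
      Type: 'Task',
      Resource: JSON_TRANSFORM_ARN,
      End: true
    }
  }
};

// Print the definition in the json form Step Functions expects.
console.log(JSON.stringify(definition, null, 2));
```

Setting ResultPath to "$.transformScript" is what keeps the rest of the input intact: the Pass step adds one field rather than replacing the whole document.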

Summary

I wrote this up quickly but hopefully it will help inspire more discussion on Step Functions best practices and on missing functionality Amazon may introduce in the future.  Step Functions is great but very limited in this first release.

Wednesday, August 29, 2012

Create Reminder in AppleScript with Intelligent Date Time Parsing

Although you can make a new reminder object in Mountain Lion's Reminders application directly with AppleScript, you lose the cool built-in parsing that extracts the date/time settings.

For example, we want to simply write "Bowling with Bob 6PM on Friday" and have Reminders parse out the date and time to create a reminder(name="Bowling with Bob", datetime="2012-08-31 18:00:00")

Here is the script. Note that I have it in Alfred-friendly form since I use the Alfred app launcher:

on alfred_script(q)
try
  tell application "Reminders"
    activate
    activate -- Single activate doesn't always work, esp if Reminders is closed.
    show list "Reminders"
   
    tell application "System Events" to keystroke "n" using command down
    tell application "System Events" to keystroke q
    tell application "System Events" to key code 36 -- enter
  end tell
 
on error a number b
  display dialog a
end try
end alfred_script

-- how to call:
--alfred_script("Bowling with Bob 6PM on Friday")

Tuesday, April 5, 2011

Fixing iChat GoogleTalk (gchat) connection issues.

iChat kept dropping my two Google chat accounts so I did some of my own Googling for a fix:

Create a script to reconnect iChat.


I created a file '~/dev/ichat-reconnect.sh':


#!/usr/bin/osascript


if appIsRunning("iChat") then
    tell application "iChat"
        if status is offline then
            log in
        end if
        set originalStatus to the status message
    end tell
end if


on appIsRunning(appName)
    tell application "System Events" to (name of processes) contains appName
end appIsRunning


Create a LaunchAgent


LaunchAgents are the OS X equivalent of cron jobs.

In the folder ~/Library/LaunchAgents/ I created a file 'com.ichat.reconnect.plist':


<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>com.ichat.reconnect</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Users/gcoller/dev/ichat-reconnect.sh</string>
    </array>
    <key>OnDemand</key>
    <false/>
    <key>Nice</key>
    <integer>1</integer>
    <key>StartInterval</key>
    <integer>5</integer>
    <key>StandardErrorPath</key>
    <string>/tmp/AlTest1.err</string>
    <key>StandardOutPath</key>
    <string>/tmp/AlTest1.out</string>
  </dict>
</plist>


Basically it means run my ichat-reconnect.sh script every 5 seconds.

Tell launch agent about your file


Issue the command:

launchctl load ~/Library/LaunchAgents/com.ichat.reconnect.plist

That gets it started. It will start automatically on reboots.

Thursday, March 17, 2011

IntelliJ IDEA: Key command for right-click context menu

Ever wanted that right-click menu without having to touch the mouse when using IDEA?

Note: This is for OS X; it is probably available on Windows too.

Turns out to be easy enough:
1) Open Keymap in Settings
2) Find "Show Context Menu" in the "Other" folder
3) Assign it a key (I used F13 since it is easy to find and available)

Wednesday, January 5, 2011

Groovy to Scala: Closures

Defining a closure that takes two variables and returns a String.

Groovy:
{ x, y -> ...}

Scala:
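(x: String, y: String) => { ... }

Note: Scala needs types for x and y here because nothing else tells the compiler what they are. The String return type is inferred from the body; it cannot be written after the parameter list of an anonymous function.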

Groovy to Scala: Regular Expressions

I'm coming to Scala from Groovy/Java. Scala seems to have a bit more overhead in learning basic concepts, so I'm planning on keeping my notes here in a series of short posts. This is not meant by any means to be comprehensive, just the "hello world" for each topic that I wish I had.

Regular Expressions:

Defining: Just add a ".r" after a normal string:

val regEx = "apple*".r

Using: Use in a typical Scala match statement:

val name = ....
name match {
  // Note the parentheses: a bare `regEx` would bind a new variable and match anything.
  case regEx() => // do your processing here
  case entry => // like default, possibly throw an error
}

Grouping:

val zipMatch = "(\\d+)-(\\d+)".r
val zip = "12345-1234"
zip match {
  case zipMatch(num1, num2) => // num1 == "12345", num2 == "1234" (both Strings)
  case entry => // do nothing
}