Thursday, April 23, 2009

Making asynchronous requests appear real time.

For many business processes, an asynchronous approach makes a lot of sense, especially in cases when the user doesn't really care about the execution of a background process. Just the same, many processes need to provide real time or near real time feedback to the user. Async and Real Time don't have to be mutually exclusive, and a recent project challenge I faced illustrates one approach to blurring the line between these two concepts.

Situation
In brief, I was working on a project that retrieved data from an external API. Our requirements stated that we must support this retrieval in both a batch and an on-demand model. The implementation of the API was asynchronous: a query request would be submitted, and a callback with the data would fire anywhere from 1 to 10 minutes later, depending on the volume of data being handled.

Batch operation was no problem, and we implemented a background process that ran overnight. In this case, the asynchronous nature of the API was a non-issue. Our process would handle incoming data as it was received.

However, the on-demand case was a little harder, due in part to the following challenges:

  1. Response time for the callback was unpredictable. Some requests might return results in less than a minute; others might take longer.
  2. We had to consider how long the user would be willing to wait for the data to be retrieved.
  3. We had a requirement that on-demand data not be persisted to the data store (the batch process handled that task).
Problem
How can we make an asynchronous process appear real time to a user?

Solution
There were several key components to our solution.

Created a Request Tracking Id
One of the first things we did was work with the API vendor to embed a tracking Id into the API call and response. This Id allowed us to match up the request with the response, even though they were detached from one another. The Tracking Id itself included a marker that designated a request as 'batch' or 'on-demand'.

The Tracking Id was a one-time-use value and needed to be 10 characters long (due to how the API vendor would embed it in the response). I used a partial Guid and a leading character of 'B' for batch and 'D' for on-demand:


' 'D' marks an on-demand request ('B' would mark batch); the remaining 9 characters come from a new Guid
Dim trackId As String = "D" + Left(Guid.NewGuid.ToString.Replace("-", ""), 9)


Updated Callback Receiver
Since responses to both batch and on-demand requests were returned to the same callback (in our case, a web service), we had to update that method to evaluate the Tracking Id in the response and handle the response according to whether it originated from a batch or an on-demand request.

Batch responses would be persisted to an interim data store that a separate ETL process would operate against.

On-demand responses would be cached in memory for 30 minutes on the web server hosting the web services. The Tracking Id value became the key for the Cache entry.


' Hold the on-demand response for 30 minutes (sliding expiration), keyed by its Tracking Id
HttpContext.Current.Cache.Insert(curCCR.TrackingId.ToString, curCCR, Nothing, System.Web.Caching.Cache.NoAbsoluteExpiration, New System.TimeSpan(0, 30, 0))
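
Putting the two paths together, the callback method became little more than a dispatch on the Tracking Id prefix. Here is a rough sketch of that dispatch, sitting in the same web service as the cache call above; the method and helper names (ReceiveApiResponse, PersistToInterimStore, CacheOnDemandResponse) are illustrative rather than our actual service contract:


<WebMethod()> _
Public Sub ReceiveApiResponse(ByVal curCCR As BusinessObject)

    If curCCR.TrackingId.StartsWith("B") Then
        ' Batch: persist to the interim data store for the overnight ETL process
        PersistToInterimStore(curCCR)
    Else
        ' On-demand: hold in memory for 30 minutes using the Cache.Insert call shown above
        CacheOnDemandResponse(curCCR)
    End If

End Sub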


Added a Cache Retrieval Method
Next, we created a new web service method that accepted a Tracking Id value and returned (if present) the cached object representing the response.


<WebMethod()> _
Public Function RetrieveHistoryForTrackingId(ByVal trackingId As String) As BusinessObject

    ' Returns Nothing if the entry has expired or was never cached
    Dim newObj As BusinessObject = TryCast(HttpContext.Current.Cache(trackingId), BusinessObject)

    Return newObj

End Function


Creating the User Experience
Now that all the back-end handling of the response was in place, we turned to the user interface for the on-demand request.

Our client was an ASP.NET page with some basic search values (first name, last name, date of birth, gender and zip code). We had a GridView control to display the history results if and when they returned, as well as a status Label control to show any messages to the user.

When the user would submit the search form, we would prepare the search and submit it to a web service that was a facade to the API call. Our Tracking Id was generated at this point and included in the request.


' 'encoded' is the prepared search request, with the Tracking Id embedded
Dim response As webservices.Response = ws.SendRequest(encoded)


The next step was to set up a polling mechanism to check for the response. We determined that two minutes was the maximum wait the user would tolerate. We then used the following loop to poll the retrieval web service and check whether the results had been received.


If response.Errors.Length = 0 Then

    Dim obj As webservices.BusinessObject = Nothing
    Dim timeout As DateTime = DateAdd(DateInterval.Minute, 2, Date.Now)

    ' Poll the retrieval web service every 30 seconds until the timeout is reached
    While DateDiff(DateInterval.Second, Date.Now, timeout) >= 0

        obj = ws.RetrieveHistoryForTrackingId(trackId)

        ' If the response has been received, stop polling
        If Not obj Is Nothing Then Exit While

        System.Threading.Thread.Sleep(30000)

    End While

    If Not obj Is Nothing Then
        grdHistory.DataSource = obj.HistoryItems
        grdHistory.DataBind()
    Else
        ' No response yet; let the user retry later with the same Tracking Id
        lblStatus.Text = "History records have not yet been received. If you would like to check for results again, please use the link below. History records are held for 30 minutes upon receipt."
        txtTrackingId.Text = trackId
    End If
Else
    lblStatus.Text = "Error submitting request: " + response.Errors(0).ToString
End If



If the web service call yielded a result set, we loaded up our data grid and completed the rendering of the page. If the timeout occurred, we simply posted a message to the user, but also put the Tracking Id into a secondary request form that the user could use to make a follow-up attempt to retrieve the results.
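
That follow-up attempt was essentially a one-line reuse of the retrieval web method. A minimal sketch of what the secondary handler might look like, assuming the same proxy instance (ws) as above and a hypothetical btnCheckAgain button:


Protected Sub btnCheckAgain_Click(ByVal sender As Object, ByVal e As EventArgs) Handles btnCheckAgain.Click

    ' Re-check the web server's cache using the Tracking Id from the first attempt
    Dim obj As webservices.BusinessObject = ws.RetrieveHistoryForTrackingId(txtTrackingId.Text)

    If Not obj Is Nothing Then
        grdHistory.DataSource = obj.HistoryItems
        grdHistory.DataBind()
    Else
        lblStatus.Text = "History records have not yet been received. Please check again in a few minutes."
    End If

End Sub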

Tuesday, April 21, 2009

Recommendation: Codespaces.com

One of the first tasks I faced when setting up a new development project was getting a version control system in place. After a bit of searching, I learned of CodeSpaces, and it was a quick decision to sign up.

At its core, Codespaces is a hosted Subversion provider, but their feature set goes well beyond simple version control. They have a very capable Agile-oriented project management platform that is free for a two-user subscription. Their platform includes defining tasks (called work items), milestones (which can equate to sprints), as well as bug tracking. It has a clean user interface, and the Subversion repository management is integrated into the product. Very nice.

After creating an account, it was easy to create a new repository and configure TortoiseSVN to use it.

I had also looked at Google Code and Microsoft's CodePlex, but these platforms both require that the projects they host be open source, which mine was not.

My hat goes off to the CodeSpaces team for a great product.

Friday, April 17, 2009

Real life separation of concerns

One of the most frequent questions faced in the process of software design is "Where do I locate a given piece of logic?" In fact, most of the practice of software architecture is focused on this very question.

The concept of Separation of Concerns leads one to think carefully before implementing any given solution, with the purposeful intention of reusing and compartmentalizing the parts and pieces of a system. We want loose coupling, for sure, but in real life there are additional considerations that must be weighed and applied to the architecture.

In working with a client recently, I found myself pondering where to perform a data retrieval task as part of a larger process. My solution included a Workflow that called out to an API. That separation of concerns was no problem - the API was available to any client. The question was whether the Workflow package should manage retrieval of the source data for the API, or whether the Workflow should be passed the source data as a parameter.

Considering Scale
The primary discussion about this architectural question centered on whether the Workflow would be used to manage additional API calls, or only the integration with this single API. If the former, it would make sense to retrieve the data externally and pass it to the Workflow with an indicator of which API to call. However, if this was a single-purpose utility, there would be little risk in embedding the data retrieval into the Workflow itself, making a direct reference to the Business Objects and Data Access layers specific to the application domain.

Considering Time
Another contextual element in this decision was the project timeline, and whether it could afford the more significant effort required to provide a flexible multi-API implementation. In my experience, the additional time required may be just enough to push a team toward the shorter-term decision, under the premise that when time permits (perhaps in the next sprint) the initial decision can be revisited.

Considering The Deployment
Finally, while I strongly believe that good architectural practice should hold true in any project, it has to be acknowledged that some project scopes are prone to over-architecting due to the particularities of the project. Certainly a one-off utility that everyone agrees will operate only in a given domain may not require an architecture aimed at highly portable outcomes. As a team, it is important to establish a set of criteria for determining whether a project or component is a one-off or not.

In the end, we decided to pass in data to the Workflow in anticipation of future data sources and applications also leveraging the Workflow. All the same, I felt this was a good example of how to evaluate the architectural needs in the proper context.
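
For what it's worth, with Windows Workflow Foundation passing the data in is just a matter of handing it to the workflow as named parameters at creation time. A minimal sketch, assuming a hypothetical ApiCallWorkflow type with public SourceData and ApiName properties, and a workflowRuntime and sourceData already in scope:


' Host side: pass the already-retrieved source data into the workflow as parameters
Dim parameters As New Dictionary(Of String, Object)
parameters.Add("SourceData", sourceData)
parameters.Add("ApiName", "HistoryApi")   ' indicator of which API the Workflow should call

Dim instance As WorkflowInstance = workflowRuntime.CreateWorkflow(GetType(ApiCallWorkflow), parameters)
instance.Start()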

Thursday, April 16, 2009

Sending credentials from the InvokeWebService activity.

Using Windows Workflow Foundation (WF) to make a call to a web service is pretty straightforward. In most cases, you just drop on the InvokeWebService activity and do some basic configuration.

However, if you are invoking a web service that requires HTTP-based authentication before it can be used, you'll need to attach credentials to the outbound request. It took me a while to sort this out, but I found it's actually pretty easy. You'll know if you need to send credentials because you'll get an HTTP 401: Unauthorized response back if you don't provide them.

The trick is to leverage the InvokeWebServiceActivity.Invoking event. Here are the steps:

1. Right click on your invokeWebService activity and click 'Generate Handlers'. This will add two handlers to your code file:


Private Sub invokeWebServiceActivity1_Invoked(ByVal sender As System.Object, ByVal e As System.Workflow.Activities.InvokeWebServiceEventArgs)

End Sub

Private Sub invokeWebServiceActivity1_Invoking(ByVal sender As System.Object, ByVal e As System.Workflow.Activities.InvokeWebServiceEventArgs)

End Sub

2. In the Invoking handler, set your credentials on the WebServiceProxy object:


Private Sub invokeWebServiceActivity1_Invoking(ByVal sender As System.Object, ByVal e As System.Workflow.Activities.InvokeWebServiceEventArgs)
e.WebServiceProxy.Credentials = New System.Net.NetworkCredential("username", "password", "domain")

End Sub


You can also send the current user's credentials instead, which in this case would be the account that the Workflow host is running as.
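
For example, a variation of the same Invoking handler that sends the current account's credentials (via CredentialCache rather than an explicitly constructed NetworkCredential) might look like this:


Private Sub invokeWebServiceActivity1_Invoking(ByVal sender As System.Object, ByVal e As System.Workflow.Activities.InvokeWebServiceEventArgs)

    ' Sends the credentials of the account the Workflow host process is running as
    e.WebServiceProxy.Credentials = System.Net.CredentialCache.DefaultCredentials

End Sub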