
Workflow Activities

Workflow Activities are plugin points into the WebGUI workflow engine. They allow you to do offline or asynchronous tasks, as well as a few online or synchronous tasks. Examples of workflow activities include asking a user a question, sending an email, publishing some content, deleting temporary files, and running external programs.

 

The Workflow Engine

To understand how to write good workflow activities, you really need to understand how the WebGUI workflow engine works. A workflow engine is an event-triggered state machine and execution system. In plain words, something happens to cause a workflow to be executed, the workflow engine keeps track of where it is in the process, and runs each task along the way.

 

Steps of WebGUI Workflow

Everything starts with a trigger: something must kick off the workflow. It could be that it's noon on Saturday, or that a user published some content, or that a user registered on the site, or that you received some email in your inbox. You can even have one workflow be the trigger of another, as shown in the diagram below.

 

 

Then you have the workflow itself. A workflow is a list of activities (or tasks) to be executed in a particular order. A workflow can be a single activity, or many of them strung together.

 

An activity is a task, or a single atomic unit of work, that you need done. It could be deleting an asset, rolling back a version tag, emailing a user, running an external program, checking a folder for a file, and so on.

 

Synchronicity

WebGUI's workflow engine is both synchronous and asynchronous. Most workflow systems are one or the other.

 

Synchronous (in the user interface called “realtime”) workflows execute to completion immediately upon being triggered. Because of this, there can't be anything in a realtime workflow that could get stuck waiting for external input, or that would take a long time to execute. For example, if a workflow is set up to be realtime, and you have an approval process in the workflow, the user will have to wait hours, or even days, for the page to return after clicking on a link. In reality, what would happen is that the page would time out and the user would get an error. Obviously, this is unacceptable, so realtime workflows should only be used in situations where you aren't blocking for input, and where things can happen very quickly.

 

Asynchronous workflows, on the other hand, happen in the background. They can take all the time they need to execute, because no user is immediately waiting on the result. In asynchronous workflows, there is a controller, also called a governor, that executes one task in a workflow, then another task perhaps in a completely different workflow, and so on. The controller is in charge of what gets executed and when. Because of this, workflows can be assigned priorities, and the system can queue many more workflows than it could handle all at once, executing them over a longer period of time. For example, if 10,000 workflows all started at once, your server would likely be unable to handle them and would crash. An asynchronous system, however, just puts all the workflows into a queue, prioritizes them, and steadily works through all 10,000.
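
The queuing behaviour described above can be sketched in a few lines of Perl. This is an illustration only, not Spectre's actual scheduling code: the hash layout, the run_queue name, and the convention that a lower number means a more urgent workflow are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical queue entries: each workflow has a priority (1 = most urgent)
# and a list of remaining tasks. None of these names come from WebGUI.
my @queue = (
    { name => 'commit',     priority => 1, tasks => [ 'publish' ] },
    { name => 'newsletter', priority => 2, tasks => [ 'build', 'send' ] },
    { name => 'cleanup',    priority => 2, tasks => [ 'purge', 'vacuum' ] },
);

# Run one task at a time. Each turn goes to the first workflow with the
# lowest priority number; a workflow with tasks left rejoins the back of
# the queue, so workflows of equal priority take turns.
sub run_queue {
    my @pending = @_;
    my @log;
    while (@pending) {
        my $best = 0;
        for my $i (1 .. $#pending) {
            $best = $i if $pending[$i]{priority} < $pending[$best]{priority};
        }
        my ($workflow) = splice @pending, $best, 1;
        next unless @{ $workflow->{tasks} };      # nothing left to run
        my $task = shift @{ $workflow->{tasks} };
        push @log, "$workflow->{name}:$task";
        push @pending, $workflow if @{ $workflow->{tasks} };
    }
    return @log;
}

print join(', ', run_queue(@queue)), "\n";
```

The urgent workflow runs first, and the two equal-priority workflows then alternate one task at a time instead of one of them monopolizing the worker.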

 

Workflow Engine Components

The workflow engine consists of a controller that hands out tasks, a worker that executes the tasks, and some data to operate on.

 

 

The controller is called Spectre. It keeps a list of what workflows need to be executed and at what priority. When it's time to execute a workflow, it tells WebGUI to actually do the work. WebGUI is the worker because it already has all the code loaded and waiting to run, so there is no reason to write and run a separate worker. This has another advantage: since WebGUI can be load balanced, the workflow engine can scale out along with it.

 

 

Spectre and WebGUI work in conjunction like this to accomplish all their tasks.

 

Note: A common mistake is to think that Spectre is the workflow engine. It is not. Spectre's entire job is to tell the workflow engine when it's time to run the next task. Spectre really has no idea what work is being done or how it's being done. It only knows that something needs to be done, and tells the workflow engine when to do it.

 

Spectre and WebGUI Communicate

Spectre and WebGUI talk to each other. WebGUI lets Spectre know when a new workflow has been triggered by a user or some other mechanism. Likewise, it lets Spectre know when priorities change, or when a user modifies a workflow. In turn, Spectre tells WebGUI when it's time to run a workflow, or when something has been triggered based upon time or date. From a high level, the communication looks like this:

 

 

A more detailed view of the interaction between Spectre (black) and WebGUI (orange) looks like this:

 

 

Let's zoom in a little on that. First, Spectre starts up and asks WebGUI what workflows and cron jobs (events triggered by time) are pending in the queue for each site. Then it starts its workflow and cron job handlers.

 

 

Upon starting the cron handler, it starts checking to see if any cron tasks are ready to be triggered. That process looks like this:

 

 

And if something is ready to be triggered, then it tells WebGUI that it's time to trigger that event.

 

 

At this point WebGUI has to do a little sanity check to make sure that it actually is Spectre making the request, and that the workflow being requested to be triggered exists. If it does, it creates a new workflow instance (a running workflow), which is discussed next.

 

Once Spectre has started its workflow handler, it can start running through its queue to get the next workflow to be run. It also has to keep track of how many workflows it already has running that haven't completed so it doesn't overwhelm WebGUI with a lot of extra requests.

 

 

Then, it can tell WebGUI that it's time to run a particular workflow. It does the same kind of validity handshake that it did with the scheduler.

 

 

Then, WebGUI determines what the next activity to run is, loads the module for that activity, and runs it. Depending upon whether it's the last activity in the workflow or not, and whether it executed successfully, it will tell Spectre if it completed the activity, if it's waiting for input, if it had an error, or if the workflow is done.
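
The step just described, running the next activity and reporting one of four outcomes, can be sketched as follows. This is not WebGUI's implementation: the instance is a plain hash, the activities are code references, and the status strings 'complete', 'waiting', 'error', and 'done' are names assumed for the illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a workflow instance: an ordered list of
# activities (code refs) and the position of the next one to run.
sub run_next_activity {
    my ($instance) = @_;
    my $activity = $instance->{activities}[ $instance->{position} ];
    return 'done' unless $activity;

    # An activity that dies is treated the same as one that reports an error.
    my $status = eval { $activity->($instance) } || 'error';

    if ($status eq 'complete') {
        $instance->{position}++;
        # Finishing the last activity finishes the whole workflow.
        return 'done' if $instance->{position} >= @{ $instance->{activities} };
    }
    return $status;    # 'complete', 'waiting', or 'error'
}

# Two activities: the second asks to wait once before completing.
my $tries    = 0;
my $instance = {
    position   => 0,
    activities => [
        sub { 'complete' },
        sub { ++$tries < 2 ? 'waiting' : 'complete' },
    ],
};
my @statuses;
push @statuses, run_next_activity($instance) for 1 .. 3;
print "@statuses\n";    # prints "complete waiting done"
```

Only a 'complete' result advances the position, so an activity that waits or fails is the one that runs again on the next call.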

 

 

 

WebGUI::Workflow::Activity API

Though there are many methods provided by the WebGUI::Workflow::Activity class, there are five methods that you absolutely must know about in order to write a workflow activity. They are definition(), execute(), and the three state methods ERROR(), WAITING(), and COMPLETE(). Look at the API for the version of WebGUI you're working with for full API details.

 

definition()

If you're familiar with writing assets or form controls, then the definition method should be familiar, because it works in much the same way. The definition method tells the activity what data it consumes, and sets a few properties about the activity. Here's an example from an existing activity:

 

sub definition {
    my $class      = shift;
    my $session    = shift;
    my $definition = shift;
    my $i18n = WebGUI::International->new($session, "Workflow_Activity_CleanTempStorage");
    push(@{$definition}, {
        # human readable name of the activity
        name       => $i18n->get("activityName"),
        # settings shown to the person editing the workflow
        properties => {
            storageTimeout => {
                fieldType    => "interval",
                label        => $i18n->get("storage timeout"),
                defaultValue => 6*60*60,
                hoverHelp    => $i18n->get("storage timeout help")
            }
        }
    });
    return $class->SUPER::definition($session, $definition);
}

 

As you can see, you set the human readable name of the activity (in this case using internationalization), and set what fields should be displayed to the user as settings.

 

States

When developing your activity, you need a way to communicate to the workflow engine whether or not your activity successfully completed. That's where states come in. There are currently three states that you can use with your activity. Each one is a method on the activity, and you return its result from your code.

 

return $self->COMPLETE;

Complete is used to let the workflow engine know that you have completed your unit of work successfully and it's okay to move on to the next step in the workflow.

 

return $self->ERROR;

The error state means something went wrong. It could be that you tried to connect to a mail server and it was down. The workflow engine doesn't need to know why your activity failed, only that it did. However, the human system administrator does, so in the case of an error, put something in the log.

 

return $self->WAITING;

Waiting lets the workflow engine know that you didn't encounter an error, but the activity didn't complete just the same. This can happen if you're waiting for a user to give you some input, or perhaps whatever the activity is doing (maybe sending out 100,000 emails) is taking a really long time. Therefore, you need to give the resources back to the workflow engine, and it will call this activity again when there's time.

 

execute()

The execute method is the equivalent of the view method in an asset. It's the work that's going to get done. This is where you put your working code, and also where you use the state methods.

 

The execute method gets three objects passed into it. As with any object, the first object passed in is a reference to the activity itself. With that you can get access to the session, the states, properties set by the workflow editor, etc. The second object is a reference to the data this activity will be operating on. Some activities don't work on an object, but most do. So it might be a WebGUI::Group, a WebGUI::VersionTag, a WebGUI::User, or some other object. The third, and final, object passed in is a reference to the running workflow; this is the instance object.

 

Here's a typical execute method from an existing activity:

 

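sub execute {
    my $self       = shift;
    my $versionTag = shift;
    # give the commit at most 55 seconds before it hands control back
    my $completion = $versionTag->commit({timeout=>55});
    if ($completion == 1) {
        # the commit finished
        return $self->COMPLETE;
    } elsif ($completion == 2) {
        # the commit is not finished yet, so ask to be called again
        return $self->WAITING;
    }
    # anything else is a failure
    return $self->ERROR;
}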

 

WebGUI::Workflow::Instance API

There is generally not a lot that you need to use out of the instance API except for one category of methods: scratch. Scratch methods allow you to attach data to the workflow instance. This can be useful for a couple of things.

 

The first, and most used, is to maintain state information. For example, let's say your workflow activity sends out 100,000 emails, and that your mail server is capable of accepting a maximum of 10 emails per second. After 60 seconds your workflow activity should give up, and return the resources back to the workflow engine (more on this later in the chapter), so that means you will have only sent out 600 emails. You can set a scratch variable in the workflow instance to remind your activity where it left off the last time it ran.

 

The second is to pass information between two or more workflow activities within a workflow. Since the workflow activities run independently, there's no way for them to pass data from one to the next, at least directly. By setting a scratch variable in one activity, another down the line can read that scratch variable and make use of its data. You must be careful when using scratch variables for this reason. If you set a scratch variable that you don't intend for another workflow activity to use later, then you should delete it when you no longer need it.

 

setScratch()

Use the setScratch method to set a variable. The first parameter is the name of the variable, and the second is the value. Note that scratch variables can only be scalars, not references, hashes, arrays, or objects. Therefore, if you need to store a hash, for example, you should serialize it into a scalar using JSON. Here's an example use of setScratch using JSON:

 

$instance->setScratch( "cars", JSON::to_json(\%cars) );

 

getScratch()

Use the getScratch method to retrieve a variable. Its only parameter is the name of the variable you wish to retrieve. Here's an example, again using JSON to deserialize a hash:

 

my $cars = JSON::from_json( $instance->getScratch("cars") );

 

deleteScratch()

Use the deleteScratch method to delete a variable. Its only parameter is the name of the variable you wish to delete. Here's an example:

 

$instance->deleteScratch("cars");
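
To try the three scratch calls together outside of WebGUI, the following self-contained sketch may help. StubInstance is a hypothetical stand-in written for this example, not the real WebGUI::Workflow::Instance, and it uses the core JSON::PP module in place of the JSON module shown above so that it runs on a stock Perl.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical stand-in for a workflow instance. It only mimics the three
# scratch methods described above and refuses anything that is not a scalar.
package StubInstance;
sub new { return bless { scratch => {} }, shift }
sub setScratch {
    my ($self, $name, $value) = @_;
    die "scratch values must be plain scalars\n" if ref $value;
    $self->{scratch}{$name} = $value;
    return;
}
sub getScratch    { return $_[0]{scratch}{ $_[1] } }
sub deleteScratch { delete $_[0]{scratch}{ $_[1] }; return }

package main;

my $json     = JSON::PP->new->canonical;
my $instance = StubInstance->new;
my %cars     = ( civic => 'Honda', focus => 'Ford' );

$instance->setScratch( 'cars', $json->encode( \%cars ) );     # hash -> scalar
my $cars = $json->decode( $instance->getScratch('cars') );    # scalar -> hash ref
print "$cars->{civic}\n";                                     # prints "Honda"
$instance->deleteScratch('cars');                             # clean up
```

Note that decoding gives back a hash reference, so the values are read with the arrow syntax.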

 

Rules to Work By

There are a few things to keep in mind as you design your activities. Let's call them rules to work by.

 

First, no workflow activity should run for longer than 60 seconds at a time. There are two practical reasons for this. One is that workflows are handled between WebGUI and Spectre talking through Apache. Therefore, if it takes longer than 60 seconds, the connection might time out. The other is that workflow is about cooperative multitasking. That means that processes should give up their resources once in a while in case there is something higher priority to get done. Without this, the workflow queue could become bogged down very quickly on busy and complex sites. If you need something to run longer than 60 seconds, then either maintain state using scratch variables so you can pick up where you left off, or hand off the task to a daemon that is designed to handle long running processes.
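
A minimal sketch of the "pick up where you left off" approach follows. It is not taken from an existing activity: run_batch, the nextIndex key, the status strings, and the optional clock argument are all assumptions made for the example, and a plain hash stands in for the instance's scratch storage.

```perl
use strict;
use warnings;

# Process as many items as fit in the time budget, starting where the last
# run stopped. $scratch is a plain hash standing in for scratch storage,
# $send does the real work for one item, and $clock (optional) returns the
# current time in seconds so the budget can be tested without waiting.
sub run_batch {
    my ($items, $scratch, $send, $budget_seconds, $clock) = @_;
    $clock ||= sub { time() };
    my $deadline = $clock->() + $budget_seconds;
    my $next     = $scratch->{nextIndex} || 0;

    while ($next < @$items) {
        if ($clock->() >= $deadline) {
            $scratch->{nextIndex} = $next;    # remember where to resume
            return 'waiting';
        }
        $send->( $items->[$next] );
        $next++;
    }
    delete $scratch->{nextIndex};             # clean up after finishing
    return 'complete';
}

my (%scratch, @sent);
my $status = run_batch(
    [ 'a@example.com', 'b@example.com' ],
    \%scratch,
    sub { push @sent, $_[0] },
    55,
);
print "$status after ", scalar(@sent), " messages\n";
```

The budget of 55 seconds mirrors the timeout used in the commit example earlier, leaving a small margin under the 60 second limit.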

 

Second, clean up after yourself. Don't leave temporary files, scratch variables, and other data lying around. If you make a mess, clean it up so as not to gum up the works.

 

Third, make your activities small and uncomplicated. Rather than making a really complicated activity, break up the activity into small parts. This will make coding and testing easier, and will allow your users to extend the uses of your activities in ways you never dreamed of.

 

Workflow Activity Examples

Here are a few workflow activity examples to get you started.

 

Hello World

The following example demonstrates creating a bare-bones workflow activity that simply prints “Hello World” to a file. It prints to a file because workflow activities have no human-viewable output.

 

package WebGUI::Workflow::Activity::HelloWorld;

use strict;
use base 'WebGUI::Workflow::Activity';

sub definition {
    my ($class, $session, $definition) = @_;
    push(@{$definition}, {
        name       => 'Hello World',
        properties => { }
    });
    return $class->SUPER::definition($session, $definition);
}

sub execute {
    my ($self, $object, $instance) = @_;
    # report an error to the workflow engine if the file can't be opened
    open my $file, '>', '/tmp/helloworld.txt' or return $self->ERROR;
    print {$file} "Hello World\n";
    close($file);
    return $self->COMPLETE;
}

1;



 

Email Poll Example

Let's say that your boss tells you that she wants to create a poll, but instead of putting it out on the web site, she wants to conduct the poll via email. “Great!” you say, and think to yourself, “How the hell am I going to do that?”

 

Functional Specification

Your boss has given you instructions on how her email poll should work.

 

  1. It should email a group of users from the web site.

  2. The email should contain the question, and a list of possible answers as links.

  3. The users should be able to click on the links to respond to the poll.

  4. The responses should be stored somewhere so reports can be generated later.

  5. The poll should end after one month and reject further responses after that time.

 

Technical Specification

From your boss's list of features, you can draft a simple technical specification.

 

  1. The workflow type for this activity should be WebGUI::Group, so that the group object passed into the activity will be the list of users emailed.

  2. The email will have to be HTML based so you can put links in it.

  3. You can use WebGUI's built-in send email to group feature in WebGUI::Mail::Send to email the group.

  4. You'll have to create some www_ methods in the activity to handle the responses when users click on a link.

  5. The workflow activity should return WAITING during the response period so that it can collect responses.

 

Ideally, at this point you'd also create a database schema for storing the data, a flow chart diagram illustrating the use case, and some sample screen shots of what the email and response pages would look like. Then have your boss sign off on these things before proceeding with development.

 

The Code

Now that your boss has signed off, you can begin coding.

 

Note: Internationalization, heavy comments, and POD have been left out for the sake of clarity in this example, but when you're writing your own activities you should always use them.

 

Always start writing an activity by using the code skeleton provided in lib/WebGUI/Workflow/Activity/_activity.skeleton. Begin with the typical header information.

 

package WebGUI::Workflow::Activity::EmailPoll;

use strict;
use base 'WebGUI::Workflow::Activity';
use WebGUI::Mail::Send;
use WebGUI::Workflow::Instance;    # needed by the response handler

 

Now, you need to set up your definition. Call the activity something simple yet meaningful: “Email Poll”. Then, define a few properties about the activity that you expect your workflow editor to set. You need to have a subject for the email, the question, and the list of answers. Those are all pretty obvious, but because you know your boss likes to change her mind a lot, also build in a timeout setting, so that if she decides that the poll should last for two weeks or two months instead of four weeks, it's an easy change.

 

sub definition {
    my ($class, $session, $definition) = @_;
    push(@{$definition}, {
        name       => "Email Poll",
        properties => {
            subject => {
                fieldType => "text",
                label     => "Subject",
                hoverHelp => "Put the email subject here."
            },
            question => {
                fieldType => "HTMLArea",
                label     => "Question",
                hoverHelp => "Put your question/description here."
            },
            answers => {
                fieldType => "textarea",
                label     => "Answers",
                hoverHelp => "Put your answers here. One per line."
            },
            timeout => {
                fieldType    => "interval",
                label        => "Timeout",
                # four weeks, matching the one month asked for in the specification
                defaultValue => 60*60*24*7*4,
                hoverHelp    => "How long should we allow people to respond?"
            },
        }
    });
    return $class->SUPER::definition($session, $definition);
}

 

Next, create your execute method, which does the work. Because this activity contains a multi-step process, use the execute method as a “main” or controller. You could also break up these steps into separate activities, but that might be too complicated for your first example, so this example keeps them in one activity. This example makes use of scratch variables to maintain the state of which step you're on.

 

sub execute {
    my ($self, $group, $instance) = @_;
    my $endDate = $instance->getScratch("pollEndDate");
    # haven't sent the email yet
    if (!$endDate) {
        $self->sendMessage($group, $instance);
        $instance->setScratch("pollEndDate", time() + $self->get("timeout"));
        return $self->WAITING;
    }
    # time isn't up
    elsif (time() < $endDate) {
        return $self->WAITING;
    }
    # time is up
    else {
        $instance->deleteScratch("pollEndDate");
        return $self->COMPLETE;
    }
}

 

Because the execute method calls a sendMessage method, you need to define it in your code. Here, you're going to make use of the WebGUI::Mail::Send package to send out your emails to the group. That way you don't actually have to look at the group, find the users, look up their email addresses, and send out a number of individual emails. In the email, define URLs for the responses. Each URL carries the ID of the running workflow instance, so that the response handler can check whether the poll is still open, and one of its parameters is the “method” name that you're going to use to handle those responses. That method is built next.

 

sub sendMessage {
    my ($self, $group, $instance) = @_;
    my $url  = $self->session->url;
    my $mail = WebGUI::Mail::Send->create($self->session, {
        toGroup => $group->groupId,
        subject => $self->get("subject"),
    });
    my $message = $self->get("question").'<ul>';
    foreach my $answer (split("\n", $self->get("answers"))) {
        # skip blank lines in the answers field
        next if $answer =~ m/^\s*$/;
        # link back to www_respond, identifying the running workflow instance
        my $link = $url->page(
            "op=activityHelper;class=EmailPoll;method=respond;response="
                .$url->escape($answer)
                .";instanceId=".$instance->getId,
            1
        );
        $message .= qq|<li><a href="$link">$answer</a></li>|;
    }
    $message .= '</ul>';
    $mail->addHtml($message);
    $mail->queue;
}

 

Now, you can build the handler to deal with users clicking on the response links in their emails. Note that if the workflow instance isn't still valid, then the user won't be able to vote and will get a friendly rejection message.

 

sub www_respond {
    my $session = shift;
    # look up the running workflow named in the link
    my $instance = WebGUI::Workflow::Instance->new(
        $session,
        $session->form->get("instanceId")
    );
    my $output = q|<h1>Thank You</h1>|;
    if (defined $instance) {
        my $response = $session->form->process("response", "text");
        # store the response somewhere
        $session->db->setRow("EmailPollResponses", "instanceId", {
            instanceId => $instance->getId,
            response   => $response
        });
        # append to the heading rather than overwriting it
        $output .= q|<p>Thank you for your response.</p>|;
    }
    else {
        $output .= q|<p>Thank you for your response, but unfortunately our poll has ended.</p>|;
    }
    return $session->style->userStyle($output);
}

 

And that's all. With only four subroutines, you've built an email based polling system that can be tied into WebGUI.

 

Configuration

Now that you've built your activity, you simply need to put it in the WebGUI config file so that your users can use it. This is fairly simple. Just add the workflow object type of “WebGUI::Group” to the “workflowActivities” directive in the config file. Then, put your activity in that new object type.

 

"workflowActivities" : {

"WebGUI::Group" : [ "WebGUI::Workflow::Activity::EmailPoll" ],

...

},

 

After you've made this change, restart WebGUI for the change to take effect.

 

Exception Handling

As you build more complicated workflow activities you'll probably need to handle exceptions gracefully. For example, let's say you're using WebGUI's workflow engine in an order processing system. You need to submit the order to the warehouse for delivery, but it turns out that the order is missing some necessary information that the warehouse API needs. That's an exception.

 

Rather than letting the activity continuously error out, you can handle the exception gracefully in one of two ways. You can either end this workflow, or make it wait until the error is corrected. Either way, you should kick off some other workflow that gets the problem corrected.

 

The following example shows how to kick off another workflow, and end the current one.

 

sub execute {
    my ($self, $order, $workflowInstance) = @_;
    ...
    if ($order->hasMissingInfo) {
        # start the workflow that gets the problem corrected
        WebGUI::Workflow::Instance->create($self->session, {
            className  => 'MyApp::Order',
            methodName => 'new',
            parameters => $order->getId,
            workflowId => $theIdOfTheExceptionHandlingWorkflow,
            priority   => 3,
        });
        # end the current workflow
        $workflowInstance->delete;
        return $self->COMPLETE;
    }
    ...
}

 

If you wanted it to wait instead, you could replace these lines:

 

$workflowInstance->delete;

return $self->COMPLETE;

 

with this line:

 

return $self->WAITING(60*60);

 

This tells the current workflow to wait for an hour and then check again whether the order is still missing information. Meanwhile, the other workflow has been kicked off to notify someone to fix the problem.

Keywords: event triggers scheduler spectre workflow workflow activities

© 2018 Plain Black Corporation | All Rights Reserved