Emlyn Tech

August 19, 2009

Job Processing Engine

Filed under: Uncategorized — emlyn @ 11:11 pm

I blogged about creating an online MP3-to-YouTube service on my main blog, point7. I’ve been thinking about it a lot, so here’s a start on a design for it.

The basic design is that we have a website, which presents the service. However, the real work is done by one or more Job Processing Engines, which are the subject of this post.

A Job Processing Engine runs on a Linux box and consists of a webservice and a daemon. The webservice is how the website talks to the engine. The daemon performs jobs that can’t be kicked off by calls to the webservice (e.g. notifying the website of startup and shutdown, and scheduled job processing).

As ever, I want to approach this stuff incrementally. The smallest useful piece looks like this:

- Job Processing Library

This is a framework for processing jobs. It will have a set of interfaces (IJob, IJobProcessor):

IJob:
- string JobID; // unique id for the job, probably a GUID
- JobState State; // {Created, Started, InProgress, Success, Failed}
- DateTime LastStateChange; // last time the state changed
- int ProgressAmount; // 0 to 100, or -1 for unknown
- string ErrorMessage; // an error message in case of failure
- int ResultCode; // a result code, 0 for success, >0 for the error (not sure about bothering with this)
- void Start(); // moves the job from Created to Started, and possibly to InProgress
- void Cancel(); // moves the job from Started or InProgress to Failed (cancelled)
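
As a rough illustration, the IJob members listed above could be written down as an abstract base class. This is only a sketch in Python (the post does not commit to an implementation language); the snake_case names are a Python rendering of the members above, not something the design fixes.

```python
from abc import ABC, abstractmethod
from datetime import datetime
from enum import Enum


class JobState(Enum):
    CREATED = "Created"
    STARTED = "Started"
    IN_PROGRESS = "InProgress"
    SUCCESS = "Success"
    FAILED = "Failed"


class IJob(ABC):
    """Python rendering of the IJob members listed in the post."""

    job_id: str                  # unique id for the job, probably a GUID
    state: JobState
    last_state_change: datetime  # last time the state changed
    progress_amount: int         # 0 to 100, or -1 for unknown
    error_message: str           # an error message in case of failure
    result_code: int             # 0 for success, >0 for the error

    @abstractmethod
    def start(self) -> None:
        """Move the job from Created to Started, and possibly to InProgress."""

    @abstractmethod
    def cancel(self) -> None:
        """Move the job from Started or InProgress to Failed (cancelled)."""
```

Because both methods are abstract, the interface itself cannot be instantiated; only concrete job classes can.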

IJobProcessor:
- int CreateJob (string aJobID, string aJobType, string aJobDetails, string aCallback, out string aErrorMessage);
    // aJobID must not exist, aJobType must make sense, aJobDetails must be for aJobType. State of new job is Created. Returns result code.
- int GetJob (string aJobID, out IJob aJob, out string aErrorMessage); // just a copy of the job
- int StartJob (string aJobID, out string aErrorMessage); // job must exist, starts it. Returns result code.
- int GetProgress (string aJobID, out int aProgress, out string aErrorMessage); // job must exist; aProgress is 0 to 100, or -1 for unknown
- int CancelJob (string aJobID, out string aErrorMessage); // job must be Started or InProgress; moves it to Failed (cancelled)
- int DeleteJob (string aJobID, out string aErrorMessage); // completely delete all trace of a job
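
To make the result-code convention concrete, here is a minimal in-memory sketch of a job processor in Python. Several things are assumptions for illustration only: Python has no out parameters, so each method returns a tuple of (result code, ..., error message); the specific error code values and the job type names "Encoder" and "YoutubeUploader" are made up; and jobs are plain records that do no real work.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

OK = 0             # 0 for success, as in the post
ERR_EXISTS = 1     # the error codes below are hypothetical
ERR_NOT_FOUND = 2
ERR_BAD_TYPE = 3
ERR_BAD_STATE = 4

KNOWN_JOB_TYPES = {"Encoder", "YoutubeUploader"}  # assumed type names


@dataclass
class JobRecord:
    job_id: str
    job_type: str
    job_details: str
    callback: str
    state: str = "Created"   # state of a new job is Created
    progress: int = -1       # -1 means unknown


class JobProcessor:
    def __init__(self) -> None:
        self._jobs: Dict[str, JobRecord] = {}

    def create_job(self, job_id, job_type, job_details, callback) -> Tuple[int, str]:
        if job_id in self._jobs:
            return ERR_EXISTS, f"job {job_id} already exists"
        if job_type not in KNOWN_JOB_TYPES:
            return ERR_BAD_TYPE, f"unknown job type {job_type}"
        self._jobs[job_id] = JobRecord(job_id, job_type, job_details, callback)
        return OK, ""

    def start_job(self, job_id) -> Tuple[int, str]:
        job = self._jobs.get(job_id)
        if job is None:
            return ERR_NOT_FOUND, f"no such job {job_id}"
        if job.state != "Created":
            return ERR_BAD_STATE, f"job {job_id} is {job.state}"
        job.state = "Started"
        return OK, ""

    def get_progress(self, job_id) -> Tuple[int, int, str]:
        job = self._jobs.get(job_id)
        if job is None:
            return ERR_NOT_FOUND, -1, f"no such job {job_id}"
        return OK, job.progress, ""

    def cancel_job(self, job_id) -> Tuple[int, str]:
        job = self._jobs.get(job_id)
        if job is None:
            return ERR_NOT_FOUND, f"no such job {job_id}"
        if job.state not in ("Started", "InProgress"):
            return ERR_BAD_STATE, f"job {job_id} is {job.state}"
        job.state = "Failed"
        return OK, ""

    def delete_job(self, job_id) -> Tuple[int, str]:
        if self._jobs.pop(job_id, None) is None:
            return ERR_NOT_FOUND, f"no such job {job_id}"
        return OK, ""
```

Returning a code plus a message from every method, rather than raising exceptions, keeps each call easy to expose through a webservice later.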

To begin with we need two implementations of IJob: EncoderJob (calls mencoder) and YoutubeUploaderJob (uploads a video to YouTube). Each of these implementations requires its own JobDetails (set via CreateJob), which is essentially a serialised version of the job.
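
One possible shape for JobDetails, sketched in Python with JSON as the serialisation format. The format, the field names (input_path, output_path, video_path, title) and the type names used as dictionary keys are all assumptions; the post only says that the details are a serialised version of the job and must match the job type.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EncoderJobDetails:
    input_path: str    # hypothetical field
    output_path: str   # hypothetical field


@dataclass
class YoutubeUploaderJobDetails:
    video_path: str    # hypothetical field
    title: str         # hypothetical field


# Maps an aJobType string (assumed names) to its details class.
DETAILS_BY_TYPE = {
    "Encoder": EncoderJobDetails,
    "YoutubeUploader": YoutubeUploaderJobDetails,
}


def serialise_details(details) -> str:
    """Turn a details object into the string passed as aJobDetails."""
    return json.dumps(asdict(details), sort_keys=True)


def parse_details(job_type: str, job_details: str):
    """Rebuild the details object for job_type.

    Raises KeyError for an unknown job type and TypeError when the
    fields in job_details do not fit that type.
    """
    cls = DETAILS_BY_TYPE[job_type]
    return cls(**json.loads(job_details))
```

A CreateJob implementation could call parse_details up front and turn either exception into a non-zero result code.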

A base job class could be implemented as a state machine. That would let the state transitions (Created to Started to InProgress, then Success or Failed) be handled in one place.

The subclasses could then just implement the bits that are specific to them; actually doing the work, and figuring out how to report progress.
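
A minimal Python sketch of that split follows. The transition table is read off the IJob description above; the names BaseJob, do_work and EchoJob are hypothetical, and start() runs the work synchronously purely to keep the example short.

```python
from datetime import datetime
from enum import Enum


class JobState(Enum):
    CREATED = "Created"
    STARTED = "Started"
    IN_PROGRESS = "InProgress"
    SUCCESS = "Success"
    FAILED = "Failed"


# Allowed transitions; Success and Failed are terminal.
TRANSITIONS = {
    JobState.CREATED: {JobState.STARTED},
    JobState.STARTED: {JobState.IN_PROGRESS, JobState.FAILED},
    JobState.IN_PROGRESS: {JobState.SUCCESS, JobState.FAILED},
    JobState.SUCCESS: set(),
    JobState.FAILED: set(),
}


class BaseJob:
    """Owns the state handling; subclasses only supply do_work()."""

    def __init__(self, job_id: str) -> None:
        self.job_id = job_id
        self.state = JobState.CREATED
        self.last_state_change = datetime.now()
        self.progress_amount = -1   # unknown until the subclass reports
        self.error_message = ""

    def _move_to(self, new_state: JobState) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state
        self.last_state_change = datetime.now()

    def start(self) -> None:
        self._move_to(JobState.STARTED)
        self._move_to(JobState.IN_PROGRESS)
        try:
            self.do_work()
        except Exception as exc:
            self.error_message = str(exc)
            self._move_to(JobState.FAILED)
        else:
            self._move_to(JobState.SUCCESS)

    def cancel(self) -> None:
        self._move_to(JobState.FAILED)
        self.error_message = "cancelled"

    def do_work(self) -> None:
        raise NotImplementedError


class EchoJob(BaseJob):
    """Hypothetical stand-in for EncoderJob or YoutubeUploaderJob."""

    def do_work(self) -> None:
        self.progress_amount = 100
```

With this layout, a subclass never touches the state directly, so an illegal transition (such as cancelling a job that was never started) is rejected in one place.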

The JobProcessor class is a good candidate to use for implementing the webservice, with similar methods. It’s designed to be webservice friendly. I’m thinking that it can actually hold references to instances of all “running” jobs (all jobs that anyone has asked about basically). Oddly enough, because there’s nothing scheduled here, we can have the entire thing running from the webservice. As long as all methods that talk to jobs are essentially asynchronous, that’ll be fine.
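
One way to keep the job-facing methods asynchronous is to run each job on a background thread and have the webservice calls only read shared state. The Python sketch below assumes a thread-per-job model; AsyncJobRunner, fake_encode and the report callback are hypothetical names, not part of the design above.

```python
import threading
import time


class AsyncJobRunner:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._progress = {}   # job id -> last reported percent
        self._threads = {}    # job id -> worker thread

    def start_job(self, job_id, work) -> None:
        """Return immediately; work(report) runs on a background thread."""
        def report(percent: int) -> None:
            with self._lock:
                self._progress[job_id] = percent

        report(0)
        thread = threading.Thread(target=work, args=(report,), daemon=True)
        self._threads[job_id] = thread
        thread.start()

    def get_progress(self, job_id) -> int:
        """Non-blocking read; -1 means unknown, as in IJob."""
        with self._lock:
            return self._progress.get(job_id, -1)

    def wait(self, job_id, timeout=None) -> None:
        """Only for tests and shutdown; a webservice call would not block."""
        self._threads[job_id].join(timeout)


def fake_encode(report) -> None:
    """Stand-in for real work such as calling mencoder."""
    for percent in (25, 50, 75, 100):
        time.sleep(0.01)
        report(percent)
```

A caller would start a job with runner.start_job("j1", fake_encode) and then poll runner.get_progress("j1") from later requests.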

There’ll also need to be some kind of file upload service, since files have to be uploaded before our jobs can work on them.

If processing is interrupted (say the machine reboots), the jobs will not restart. I might just leave this as a known issue for now. It sounds like we need something on machine startup (in the daemon?) to look for unfinished jobs and pick them up.
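
A startup scan for unfinished jobs could look something like the Python sketch below. It assumes each job's state is persisted as one small JSON file per job in a directory, which is an invented storage layout (the post does not say how jobs are stored); save_job and find_unfinished are hypothetical helpers.

```python
import json
from pathlib import Path

# States that mean a job was interrupted mid-flight.
INTERRUPTED_STATES = {"Started", "InProgress"}


def save_job(jobs_dir: Path, job_id: str, state: str) -> None:
    """Persist the job's current state (assumed layout: <job_id>.json)."""
    record = {"job_id": job_id, "state": state}
    (jobs_dir / f"{job_id}.json").write_text(json.dumps(record))


def find_unfinished(jobs_dir: Path) -> list:
    """Return the ids of jobs that were running when the engine stopped."""
    unfinished = []
    for path in sorted(jobs_dir.glob("*.json")):
        record = json.loads(path.read_text())
        if record["state"] in INTERRUPTED_STATES:
            unfinished.append(record["job_id"])
    return unfinished
```

The daemon could call find_unfinished once at startup and then decide per job whether to restart it or mark it Failed.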

Getting this going would be a nice start. The next pieces after that would involve calling back to the master website (which will of course also require something at the master website that they can call). First, there is an “I’m alive” function, handled by a daemon. On startup it would call the website to tell it that this Job Processing Engine is up, and on shutdown the converse. This allows the master website to know that the engine is available to process jobs. Second, the JobProcessor should be able to call the master website when significant progress occurs (more than X seconds pass and progress has changed since last call?), for any job in the InProgress state. It would report the progress. It would also call to report completion (success or failure).
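
The "more than X seconds pass and progress has changed since last call" rule can be sketched as a small throttle, shown here in Python. ProgressReporter and its send callback are hypothetical; send stands in for whatever call the master website ends up exposing, and the clock value is passed in so the rule is easy to test.

```python
class ProgressReporter:
    """Decides when a progress update for one job is worth sending."""

    def __init__(self, min_interval_seconds: float, send) -> None:
        self._min_interval = min_interval_seconds  # the "X seconds"
        self._send = send                          # send(job_id, progress)
        self._last_time = None      # time of the last report, if any
        self._last_progress = None  # progress value last reported

    def update(self, job_id: str, progress: int, now: float) -> bool:
        """Report if progress changed and enough time has passed.

        Returns True when a report was sent.
        """
        changed = progress != self._last_progress
        due = (self._last_time is None
               or (now - self._last_time) > self._min_interval)
        if changed and due:
            self._send(job_id, progress)
            self._last_time = now
            self._last_progress = progress
            return True
        return False
```

Completion (success or failure) would bypass this throttle and always be reported.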
