The Robertson Team


 

 
 
 
 
 
  Programmers' Tools >  The Free Stuff >  Free ColdFusion Tutorials >  How Can I Throttle Down CFMAIL's Speed (and do it better)?
 

How Can I Throttle Down CFMAIL's Speed (and do it better)?

The mail trickler is really a very cool tool.  Its found a lot of use in a lot of places, both as a way to reduce server load on shared boxes (I'm told by many users that mail volume on shared servers is still an issue despite ColdFusion's ability to handle it) and to keep legitimate mailings 'under the radar' of antispam systems by dribbling them out over an extended period.

However it has one nasty flaw:  It relies on the browser refreshing the page X number of times.  Typically the user sets it, minimizes the window and forgets it.  The mailer runs and no problem.

So what happens when the user accidentally closes the window?  Or the box blue-screens itself, or the cat finally chews all the way thru the power cord (which is something I'd like to be on hand with a camera to see, but I digress), or ...

You get the idea.  Some form of graceful recovery is necessary.  You need the ability to restart a mailing and have it pick up where it left off.  the original technique will, if it blows up, require that you start over again, which will result in some of your recipients getting double mailings.

So lets fix that.  Before I discuss this code, let me first say that you should familiarize yourself with the original method and how it works, as we will only be covering the new ground here.

Next, I'll say that what I will describe here is by no means the only way to skin this cat (there I go with the cats again.  I don't dislike them.  Much).  What you see here is what I think is the simplest way to lay out the concept.  You can pretty this up significantly by incorporating ancillary programming that builds your list and allows either a full run or a failover restart, which uses persistent structures to hold onto and fill in things like sender email, mail server value etc., pulls the email message from another table and so on.  What we are displaying here is an absolutely generic, dumb system that should work with anything you throw at it, including email messages and subject lines that are unique to each recipient.

Finally, if you get the bright idea of copying the email list itself into a persistent structure, be advised we tried this out first ourselves.  It worked great but wasn't completely fail-safe.  We found the safest way to do this and preserve the absolute ability to pick up where we left off is with a plain old query and a query loop.

So Lets Get On With The Show

This mail trickler system differs principally from the original in that we are using a special table dedicated to holding all information for the mailing.  We are not trying to pull from any other existing permanent list, for reasons that will become apparent.  Here's what the table structure looks like:

The ID field is the only indexed field, and is an Identity field (autoincrement in mySQL, AutoNumber in Access).  The EmailMsg is a long text-type field (i.e. text in mySQL and memo in Access).  The rest are simple character fields.  I picked lengths I thought would always work with some extra space left over, but you can cinch them up if you like.

When you create your mailing, you dump the entire contents of the mailing - to, from, server name, message -- the whole thing -- into this table.  One record per message.  The trickler will use all of this data (pretty efficiently) to send the mail.

It does this in two stages.  Stage 1 is the MailList query, which pulls ONLY the email record's ID numbers.  We'll use this ID number to pull the rest of the data out individually, and give ourselves a record count, which we'll need further on.

Further on you'll see that this time around we are deciding whether or not the job is done by simply asking if any records were returned in the query.  If not the end-of-job screen is displayed.

The next noteworthy item is the query loop itself.  We are restricting it via the startrow and endrow parameters to only run the first X rows... in this case the number is 5.  Very simple and yes I know... not terribly efficient.  This is why we only pulled the ID field for this query and not the entire contents.  You could use SELECT TOP [X] or a variation on it, depending on your platform, to limit the records returned, but unless you really need the resources don't bother, as speed is already fast (as noted below) and this will torpedo your ability to display a progress report.

The first thing we do inside the loop is pull the rest of the data that belongs to the message.  Then we simply plug it in and send it.  Then we delete the record we just emailed.  Lastly, a simple display tells the user how many records remain in the job.  This is another place where you can spiff up the display and use the data at hand to provide things like estimated time to completion and whatnot.

Total execution time for each run, running on a busy shared SQL Server, was between 150 and 250 ms, where we were sending html emails ranging in size from 10k to 70k each.

So How Do I Restart After A Failure?

Simple:  Once you bring your computer and browser back up just call the mail trickler template.  It will pick up right where it left off.  We can be sure of that because we expended the extra resoureces to delete the individual db records item by item in our query loop, immediately after the mail was transmitted.

Where To Go From Here?

What do you do when you have multiple users mailing all at once?  The answer to that is fairly simple.  Add a new (indexed) field to the db identifying the user running the mailing.  Adjust your query filters to account for this and poof.  Instant simultaneous multiuser system.  Want to let one user run more than one mailing?  Add another indexed field for the document ID or somesuch (Our CMPro cms does both of these things).

What about if you want to use this technique for scheduled mailings?  Nobody is going to initiate the browser call to start the trickler in that case.  Things get trickier, but you can do it -- on a Windows server at least -- if you have full server access and are willing to put up with a bit of extra maintenance.  ColdFusion's scheduler will run the template, but only once.  The client-side refreshes you need will be ignored.  Calling a batch file via cfexecute that in turn calls the browser won't work around the problem.  You have to use the Windows scheduler to directly schedule the browser to start itself.  The program to run will look something like this:

"c:\progra~1\intern~1\iexplore.exe http://mydomain.com/mailtrickler.cfm"

You will have to also tell Windows to terminate the process if it is still running after X amount of time (otherwise it won't and the process cannot be run a second time).  For daily processes I choose 22 hours; which is far longer than I need but enough in case a particular job gets really big.

Lastly, even though you told Windows to terminate the process... it won't.  It'll just terminate it in the scheduler.  An instance of iexplore.exe will be left running -- invisible outside of the Task Manager's process list -- forever until you go in and terminate the process.  This will eat about 6-9mb of server memory for every instance, and is obviously something you want to attend to as, while harmless, it will slowly eat up your server memory.  If anyone knows a way to get around this please let me know.  UPDATE: There is a free command line utility that will do this for you and automate this last step here.

Have fun with it,

--Matt Robertson--
MSB Web Systems

<cfsilent>
<!---
Run Rate is the number of seconds between refresh.  To
optimize this make this setting about 2 seconds longer
than the ColdFusion Server's mail spool fetch rate. 
Otherwise mail will pile up in the spooler and partially
defeat the purpose of trickling out the mail
 --->

<cfset variables.RunRate=5>
<!---
Query Run is the number of query rows (email addresses)
that will be processed on each execution of this template
--->

<cfset variables.QueryRun=5>
<!---
pull the ID field so we can get a record count. 
--->

<cfquery
   datasource="#request.myDSN"
   name="MailList">

   SELECT
      myMessages.ID
   FROM myMessages
   WHERE
      myMessages.EmailAddr IS NOT NULL
   ORDER BY
      myMessages.ID ASC
</cfquery>
<!---
Do some stuff to prevent page caching, which was found to
be a problem on some client browsers
--->

<cfheader
   name="Expires"
   value="Sun, 06 Nov 1994 08:49:37 GMT">
<cfheader
   name="Pragma"
   value="no-cache">
<cfheader
   name="cache-control"
   value="no-cache, no-store, must-revalidate">
</cfsilent>

<!---
Are we there yet?
--->

<cfif MailList.RecordCount lt 1>
   <!---
   yes.  Inform the user
   --->

   <html><head><title>FINISHED</title></head>
   <h1>All Done</h1>
   <b>Close this browser window!</b>
   <p><b>Do NOT press your BACK button</b> to leave.</p>
   </body></html>
<cfelse>
   <!---
   display the "in-progress" page
   --->

   <html>
   <cfoutput>
   <!---
   This next line re-runs the template automatically at the run rate.

Note the UUID inserted into the url.  This is another part of ensuring

the browser doesn't cache this page.
   --->

   <meta
      http-equiv="REFRESH"
      content="#variables.RunRate#;URL=#cgi.script_name#
?UniqueURL=#UrlEncodedFormat(CreateUUID())#">

   </cfoutput>
   <head><title>Low Volume Mass Mailer with Failover</title></head>
   <!--- 
   Send the actual mail. 
   --->

   <cfloop
      query="MailList"
      startrow="1"
      endrow="#variables.QueryRun#">

      <cfquery
         name="ThisEmail"
         datasource="#request.myDSN#">

         SELECT
            myMessages.EmailAddr,
            myMessages.EmailMsg,
            myMessages.EmailType,
            myMessages.EmailServer,
            myMessages.EmailSubject,
            myMessages.EmailFrom
         FROM myMessages
         WHERE
         myMessages.ID=
         <cfqueryparam
            cfsqltype="CF_SQL_NUMERIC"
            value="#MailList.ID#">

      </cfquery>
      <cfmail
         to="#ThisEmail.EmailAddr#"
         from="#ThisEmail.EmailFrom#"
         subject="#ThisEmail.EmailSubject#"
         server="#ThisEmail.EmailServer#"
         type="HTML">

      <cfmailparam
         name="Message-ID"
         value="<#CreateUUID()#@#ThisEmail.EmailServer#>">

      #ThisEmail.EmailMsg#
      </cfmail>
      <cfquery
         datasource="#request.myDSN#">

         DELETE FROM myMessages
         WHERE
            myMessages.ID=
            <cfqueryparam
               cfsqltype="CF_SQL_NUMERIC"
               value="#MailList.ID#">
      </cfquery>
   </cfloop>

   <!---
   give the user a visual cue as to where the operation is at the moment.
   You can get qite fancy here with pretty html and time-to-completion
   calculations based on your record count, refresh rate etc.
   --->

   <cfset variables.LeftToGo=MailList.RecordCount-variables.QueryRun>
   <cfoutput>

   <p><b>Total left to send: #variables.LeftToGo#</b>.</p>
   <p>Each run is #variables.RunRate# seconds apart.</p>
   </cfoutput>
   <p>Leave this system alone and wait for this job to complete.</p>
   <p>When it finishes, you will be notified.</p>
   </body></html>
</cfif>

 

Wait... There's More.

Someone on the CF-Talk list had an idea
 


The Robertson Team, TheKing@mysecretbase.com
1.559.360.1717 


HostMySite.com is a leader in ColdFusion web hosting and managed dedicated servers.