Sunday, November 21, 2010

COBOL Performance Tuning - Using Conditional Statements

This performance tuning tip is one that can be applied to any language that you program in.

When you evaluate a field for more than one possible value, whether you use an IF, EVALUATE, GO TO DEPENDING ON, or some other conditional statement, check for the most common value first, then the next most common, and so on.

For example, suppose you are reading through a file and you only want records that are in "A" (Active), "W" (Waiting) or "P" (Pending) status. If you know that out of the one million records you will be reading, about 900,000 are active, 30,000 are waiting, 20,000 are pending and the rest are something else, you will want to do something like the following:

IF MY-STATUS = "A"
    PERFORM STATUS-IS-A THRU STATUS-IS-A-EXIT
    GO TO PARA-EXIT
ELSE
    IF MY-STATUS NOT EQUAL "W" AND
       MY-STATUS NOT EQUAL "P"
        GO TO PARA-EXIT
    ELSE
        IF MY-STATUS = "W"
            PERFORM STATUS-IS-W THRU STATUS-IS-W-EXIT
        ELSE
            PERFORM STATUS-IS-P THRU STATUS-IS-P-EXIT
        END-IF
        GO TO PARA-EXIT
    END-IF
    GO TO PARA-EXIT
END-IF.
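
The same ordering applies to EVALUATE, because its WHEN clauses are tested from top to bottom. Here is a minimal sketch of that idea, not a drop-in replacement for the code above. The program name STATORD, the three counters, and the hard-coded test values are made up for illustration and stand in for a real file read.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STATORD.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  MY-STATUS            PIC X.
      * Hypothetical counters standing in for the real processing.
       01  ACTIVE-COUNT         PIC 9(7) VALUE 0.
       01  WAITING-COUNT        PIC 9(7) VALUE 0.
       01  PENDING-COUNT        PIC 9(7) VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Three sample statuses in place of reading a file.
           MOVE "A" TO MY-STATUS
           PERFORM CHECK-STATUS
           MOVE "P" TO MY-STATUS
           PERFORM CHECK-STATUS
           MOVE "X" TO MY-STATUS
           PERFORM CHECK-STATUS
           DISPLAY "A=" ACTIVE-COUNT " W=" WAITING-COUNT
                   " P=" PENDING-COUNT
           STOP RUN.
       CHECK-STATUS.
      * WHEN clauses are listed from most to least common value.
           EVALUATE MY-STATUS
               WHEN "A"
                   ADD 1 TO ACTIVE-COUNT
               WHEN "W"
                   ADD 1 TO WAITING-COUNT
               WHEN "P"
                   ADD 1 TO PENDING-COUNT
               WHEN OTHER
                   CONTINUE
           END-EVALUATE.
```

With the volumes described above, roughly nine records out of ten would be matched by the first WHEN and never reach the other comparisons.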

I don't know how many programmers think about this scenario when they code, but I suspect many do not.

Sunday, November 14, 2010

COBOL Performance Tuning - INITIALIZE Statement And Large Arrays

We have covered the INITIALIZE statement and initializing large arrays, so now let's put them together.

Say you have a large copybook with a large array in it.

01  LARGE-RECORD.
(several 05 levels)
    05  MY-TABLE.
        10  TABLE-ENTRY OCCURS 999 TIMES.
(multiple fields here)
(more 05 levels)

Using the above copybook, you would have to modify it so that you can initialize parts of the copybook, but not the array, as you want to use a separate routine to do that.  Here is what I would do:

01  LARGE-RECORD.
  03  LARGE-RECORD-PART-1.
(several 05 levels)
  03  LARGE-RECORD-PART-2.
    05  MY-TABLE.
        10  TABLE-ENTRY OCCURS 999 TIMES.
(multiple fields here)
  03  LARGE-RECORD-PART-3.
(more 05 levels)

Now you can initialize LARGE-RECORD like this:

MOVE SPACES TO LARGE-RECORD.
INITIALIZE LARGE-RECORD-PART-1
    REPLACING NUMERIC DATA BY ZEROS.
INITIALIZE LARGE-RECORD-PART-3
    REPLACING NUMERIC DATA BY ZEROS.

You would then initialize MY-TABLE as discussed in COBOL Performance Tuning - Large Arrays.
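
Here is a small self-contained sketch of the split layout. The elementary fields (CUST-NAME, CUST-BALANCE, ITEM-CODE, ITEM-QTY, ORDER-COUNT, REGION-CODE) and the program name SPLITREC are invented placeholders for the "(several 05 levels)" in the copybook. It only shows that the two INITIALIZE statements leave the table alone; the table fill itself is not included.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SPLITREC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical stand-in for the split copybook layout.
       01  LARGE-RECORD.
           03  LARGE-RECORD-PART-1.
               05  CUST-NAME          PIC X(20).
               05  CUST-BALANCE       PIC S9(7)V99 COMP-3.
           03  LARGE-RECORD-PART-2.
               05  MY-TABLE.
                   10  TABLE-ENTRY OCCURS 999 TIMES.
                       15  ITEM-CODE  PIC X(4).
                       15  ITEM-QTY   PIC 9(5).
           03  LARGE-RECORD-PART-3.
               05  ORDER-COUNT        PIC 9(5).
               05  REGION-CODE        PIC X(2).
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE SPACES TO LARGE-RECORD
           INITIALIZE LARGE-RECORD-PART-1
               REPLACING NUMERIC DATA BY ZEROS
           INITIALIZE LARGE-RECORD-PART-3
               REPLACING NUMERIC DATA BY ZEROS
      * The table was skipped by both INITIALIZE statements, so it
      * still holds the spaces from the group MOVE.
           IF LARGE-RECORD-PART-2 = SPACES
               DISPLAY "TABLE UNTOUCHED, FILL IT SEPARATELY"
           END-IF
           IF CUST-BALANCE = ZERO AND ORDER-COUNT = ZERO
               DISPLAY "PARTS 1 AND 3 READY"
           END-IF
           STOP RUN.
```

Note that ITEM-QTY still contains spaces at this point, which is not valid numeric data, so the separate table routine has to run before the table is used.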

Contact me if you know of other ways (especially better ones) of accomplishing this objective.

Friday, November 5, 2010

COBOL Performance Tuning - Large Arrays

When initializing large arrays with default values, most programmers will perform a paragraph or internal loop, moving the default values to each field of each occurs until all of the occurrences are populated.  This is not necessarily the most efficient way to perform this task.

One way to perform this task more efficiently is to move your default values to the fields of the first occurrence.  Then move that first occurrence to the second (as a block, not as individual fields).  Then move the first two occurrences to the third and fourth.  Then move the first four occurrences to the next four, and so on, doubling each time.  Be careful when you get to the end of your occurs that you do not move data beyond the end of your array, or you will likely clobber whatever follows it in storage.

Here is a quick example:

01  EXAMPLE-TABLE.
    05  MY-TABLE.
        10  TABLE-ENTRY OCCURS 999 TIMES.
            15  FIRST-NAME         PIC X(15).
            15  LAST-NAME          PIC X(15).
            15  SEX-CODE           PIC X.
            15  DOB.
                20  DOB-YYYY       PIC 9(4).
                20  DOB-MM         PIC 99.
                20  DOB-DD         PIC 99.
            15  SSN                PIC 9(9).
            15  SALARY             PIC S9(9)V99 COMP-3.
...
    MOVE SPACES TO MY-TABLE.
    MOVE ZEROS TO DOB (1)
                  SSN (1)
                  SALARY (1).
    MOVE MY-TABLE (1:54)
      TO MY-TABLE (55:54).
    MOVE MY-TABLE (1:108)
      TO MY-TABLE (109:108).
    MOVE MY-TABLE (1:216)
      TO MY-TABLE (217:216).

You obviously do not want to hard code like in the example above, but this shows the concept in a simple form. You would want to make your start position and length variable fields and perform this in a loop. How you code it is up to you.
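
Here is one way the loop could look. This is a sketch, not the routine I mention below. The work fields WS-TABLE-LEN, WS-FILLED-LEN and WS-COPY-LEN and the program name FILLTBL are names invented for this example, and the lengths come from FUNCTION LENGTH so that nothing is hard coded.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FILLTBL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  EXAMPLE-TABLE.
           05  MY-TABLE.
               10  TABLE-ENTRY OCCURS 999 TIMES.
                   15  FIRST-NAME     PIC X(15).
                   15  LAST-NAME      PIC X(15).
                   15  SEX-CODE       PIC X.
                   15  DOB.
                       20  DOB-YYYY   PIC 9(4).
                       20  DOB-MM     PIC 99.
                       20  DOB-DD     PIC 99.
                   15  SSN            PIC 9(9).
                   15  SALARY         PIC S9(9)V99 COMP-3.
      * Hypothetical work fields, all byte counts.
       01  WS-TABLE-LEN               PIC 9(9) COMP.
       01  WS-FILLED-LEN              PIC 9(9) COMP.
       01  WS-COPY-LEN                PIC 9(9) COMP.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Seed the first occurrence with the default values.
           MOVE SPACES TO TABLE-ENTRY (1)
           MOVE ZEROS TO DOB (1) SSN (1) SALARY (1)
           COMPUTE WS-TABLE-LEN = FUNCTION LENGTH (MY-TABLE)
           COMPUTE WS-FILLED-LEN =
               FUNCTION LENGTH (TABLE-ENTRY (1))
      * Copy the filled prefix onto the bytes after it, doubling
      * the filled area each pass. The last copy is trimmed so it
      * stops exactly at the end of the table.
           PERFORM UNTIL WS-FILLED-LEN >= WS-TABLE-LEN
               COMPUTE WS-COPY-LEN = WS-TABLE-LEN - WS-FILLED-LEN
               IF WS-COPY-LEN > WS-FILLED-LEN
                   MOVE WS-FILLED-LEN TO WS-COPY-LEN
               END-IF
               MOVE MY-TABLE (1:WS-COPY-LEN)
                 TO MY-TABLE (WS-FILLED-LEN + 1:WS-COPY-LEN)
               ADD WS-COPY-LEN TO WS-FILLED-LEN
           END-PERFORM
           IF FIRST-NAME (999) = SPACES AND SSN (999) = ZERO
                   AND SALARY (999) = ZERO
               DISPLAY "ALL 999 OCCURRENCES FILLED"
           END-IF
           STOP RUN.
```

Because the copy length is never larger than the area already filled, the sending and receiving areas never overlap, and the trimmed final move is what keeps the loop from running past the end of the array. For 999 occurrences this takes 10 block moves instead of 999 passes through a field-by-field loop.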

The routine I am familiar with became more efficient than the typical performed loop at about 15 occurs. Yours may differ. You may have to look at the assembler code generated to make that determination. 

If anyone has other techniques to populate large arrays, please share them in a comment or email the concept to me and I'll post it on this blog.

Tuesday, November 2, 2010

COBOL Performance Tuning - The INITIALIZE Statement

The INITIALIZE statement in COBOL is really a convenient statement.  It will move SPACES to alphanumeric fields and ZEROS to numeric fields.  But with this convenience comes a cost when initializing large copybooks.

Here is what happens when you initialize a copybook.  The INITIALIZE statement will initialize each field that is defined with a PIC clause, except FILLER fields.  FILLER fields are skipped, so they keep whatever they already contained (often LOW-VALUES or nulls).  The underlying Assembler code that is created for each field being initialized is at least 3 Assembler statements.  So, if you have a copybook with 500 individual fields being initialized, that becomes at least 1,500 Assembler statements that are created when the code is compiled.  These 1,500 Assembler statements will be executed each time that INITIALIZE statement is executed.

Here are some simple options to tune code where an INITIALIZE needs to be done.

  • Do not initialize the copybook at all.  You will have to determine if this is an option.
  • If the copybook contains entirely alphanumeric fields that can be initialized as spaces, move SPACES to the 01 level.  This will generate 3 Assembler statements.
  • If the copybook contains a mixture of alphanumeric and numeric fields, and all of the numeric fields need to be initialized with ZEROS, then perform the following:
    01  BIG-RECORD.              COPY BIGREC.
    .
    .
    .
    PROCEDURE DIVISION.
    .
    .
    .
        MOVE SPACES TO BIG-RECORD.
        INITIALIZE BIG-RECORD
          REPLACING NUMERIC DATA BY ZEROES.

  • The previous INITIALIZE statement will only initialize the numeric fields with ZEROS.  This would have had to be done either way, so here the INITIALIZE statement with the REPLACING clause will save you several lines of COBOL code, and will create the same amount of underlying Assembler code.
  • Your FILLER fields will contain SPACES when initialized using the 2nd and 3rd options.

When large arrays are present, there is another technique that can be used in conjunction with this way of initializing fields.  This will be discussed in a later entry.
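
To see the third option and the FILLER behavior in one place, here is a minimal sketch. The record layout is a made-up stand-in for COPY BIGREC (the field names and the program name INITDEMO are assumptions), and the FILLER bytes are inspected by position because FILLER cannot be referenced by name.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INITDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical layout standing in for COPY BIGREC.
       01  BIG-RECORD.
           05  CUST-NAME        PIC X(10).
           05  FILLER           PIC X(5).
           05  CUST-BALANCE     PIC S9(7)V99 COMP-3.
           05  ORDER-COUNT      PIC 9(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * One group move blanks every byte, FILLER included.
           MOVE SPACES TO BIG-RECORD
      * Only the numeric items are revisited and set to zero.
           INITIALIZE BIG-RECORD
               REPLACING NUMERIC DATA BY ZEROS
      * Bytes 11 through 15 of the record are the FILLER item.
           IF CUST-NAME = SPACES AND BIG-RECORD (11:5) = SPACES
               DISPLAY "ALPHANUMERIC AND FILLER BYTES ARE SPACES"
           END-IF
           IF CUST-BALANCE = ZERO AND ORDER-COUNT = ZERO
               DISPLAY "NUMERIC FIELDS ARE ZERO"
           END-IF
           STOP RUN.
```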

Monday, November 1, 2010

The Best COBOL Tuning Advice I Can Give You

This performance tuning advice is applicable to all programming languages, not just COBOL.  It is such a simple piece of advice that you may just facepalm when you hear it, or think "duh!"  Those familiar with Lean and Six Sigma will be very familiar with this concept.

"Remove the code that is the slowest and do not execute it."

That's it.  It's that simple.  Of course, this is not always an option, but sometimes it is.

For example, most COBOL modules begin with an initialization paragraph where you are moving zeros, spaces, high values, low values, and so on to many work fields and copybooks.  Many of these fields do not need to be initialized because the code is moving a value to them anyway. 

For example, how many times have you created a field called SUB1 that you are using to access elements in an array?  How many times have you moved zero to SUB1 in the initialization paragraph, only to later do a PERFORM VARYING SUB1 FROM 1 BY 1 UNTIL...?  This statement is moving 1 to SUB1 on the first execution of the statement, so moving zero to SUB1 in the initialization paragraph was a complete waste of code.
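
Here is a small sketch of the SUB1 case. The table AMOUNT-TABLE, the total WS-TOTAL, and the program name SUBDEMO are invented for the example. SUB1 is never initialized anywhere, and the loops still behave correctly.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SUBDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * SUB1 has no VALUE clause and is never set to zero.
       01  SUB1                 PIC 9(4) COMP.
      * WS-TOTAL does need its starting value: it is only added to.
       01  WS-TOTAL             PIC 9(7) VALUE 0.
       01  AMOUNT-TABLE.
           05  AMOUNT           PIC 9(5) OCCURS 5 TIMES.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * FROM 1 sets SUB1 before the first test, so an earlier
      * MOVE ZERO TO SUB1 would be overwritten unread.
           PERFORM VARYING SUB1 FROM 1 BY 1 UNTIL SUB1 > 5
               COMPUTE AMOUNT (SUB1) = SUB1 * 10
           END-PERFORM
           PERFORM VARYING SUB1 FROM 1 BY 1 UNTIL SUB1 > 5
               ADD AMOUNT (SUB1) TO WS-TOTAL
           END-PERFORM
           DISPLAY "TOTAL: " WS-TOTAL
           STOP RUN.
```

The same reasoning covers AMOUNT: every occurrence is assigned in the first loop before it is read in the second, so it does not need to be zeroed beforehand either.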

That one line of code may not seem like much, but when that line of code is executed a million times per night, each night, for roughly 250 nights each year, it adds up.  Now multiply that for each job you run in your batch cycles, your onlines, and the number of fields that are just like SUB1, and it really begins to make a difference.

Keep this tip in mind when you are coding, whatever you are coding in.  Eliminate code, because MIPS are a terrible thing to waste.

And your CEO will thank you (okay, probably not).