Perl Tutorial


Perl Tutorial Overview

Before You Start

This tutorial assumes...

Perl Tutorial Scenario

You have exported a number of email messages to a text file. You want to extract the name of the sender and the contents of each email, and convert them to XML format. You intend to eventually transform the XML to HTML using XSLT. To create an XML file from a text source file, you will use a Perl program that parses the data and places it within XML tags. In this tutorial you will:

  1. Install a Perl module for parsing text files containing comma-separated values.
  2. Open the Perl Tutorial Project and associated files.
  3. Analyze parse.pl, the Perl program included in the Tutorial Project.
  4. Generate output by running the program.
  5. Debug the program using the Komodo debugger.

Installing Perl Modules Using PPM

One of the great strengths of Perl is the wealth of free modules available for extending the core Perl distribution. ActivePerl includes the Perl Package Manager (PPM), which makes it easy to browse, download and update Perl modules from module repositories on the internet. These modules are added to the core ActivePerl installation.

Running the Perl Package Manager

The Text::CSV_XS Perl module is optional for this tutorial. If the program finds that it hasn't been installed, it will use a "Fake" module that implements the minimum functionality needed for this tutorial.

To install it using PPM:

  1. Open the Run Command dialog box. Select Tools|Run Command.
  2. In the Run field, enter the command:

    ppm install Text::CSV_XS

  3. Click the Run button to run the command. PPM connects to the default repository, downloads the necessary files and installs them.

About PPM

Perl Pointer: It is also possible to install Perl modules without PPM using the CPAN shell. See the CPAN FAQ for more information.

Opening Files

Open the Perl Tutorial Project

Select File|Open|Project and choose perl_tutorial.kpf from the perl_tutorials subdirectory. The location differs depending on your operating system.

Windows

<komodo-install-directory>\lib\support\samples\perl_tutorials

Linux

<komodo-install-directory>/lib/support/samples/perl_tutorials

Mac OS X

<User-home-directory>/Library/Application Support/Komodo/3.x/samples/perl_tutorials

The files included in the tutorial project are displayed on the Projects tab in the Left Pane. No files open automatically in the Editor Pane.

Open the Perl Tutorial Files

On the Projects tab, double-click the files parse.pl, mailexport.xml and mailexport.txt. These files will open in the Editor Pane; a tab at the top of the pane displays their names.

Overview of the Tutorial Files

Analyzing the Program

Introduction

In this step, you will examine the Perl program on a line-by-line basis. Ensure that Line Numbers are enabled in Komodo (View|View Line Numbers). Ensure that the file "parse.pl" is displayed in the Komodo Editor Pane.

Setting Up the Program

Line 1 - Shebang Line

Komodo Tip: Notice that syntax elements are displayed in different colors. You can adjust the display options for language elements in the Preferences dialog box.

Lines 2 to 3 - External Modules

Creating the CSV Parser

Line 7 - the BEGIN line

Lines 8 to 11 - Load Text::CSV_XS, if it's installed
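The listing for these lines is not reproduced here. As a minimal sketch of the general technique, and not the actual code in parse.pl, an optional module can be loaded inside a BEGIN block by wrapping "require" in "eval". The variable name $have_csv_xs is a hypothetical flag chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical flag for illustration; not a name taken from parse.pl.
my $have_csv_xs;

BEGIN {
    # "require" dies if the module is missing, so wrap it in eval.
    # The trailing 1 makes the eval block return true on success.
    $have_csv_xs = eval { require Text::CSV_XS; 1 } ? 1 : 0;
}

print $have_csv_xs
    ? "Using Text::CSV_XS\n"
    : "Text::CSV_XS not found; a fallback parser would be used\n";
```

Because the check runs in a BEGIN block, the result is known at compile time, before the rest of the program executes.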

Lines 12 to 64 - Define a Text::CSV_XS::Fake class

Writing the Output Header

Lines 67 to 68 - Open Files

Perl Pointer: Scalar variables store "single" items; their symbol ("$") is shaped like an "s", for "scalar".
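As a rough illustration of scalars holding file handles, the sketch below opens one handle for reading and one for writing. It is not the code from parse.pl: to stay self-contained it uses in-memory strings in place of mailexport.txt and mailexport.xml, and the sample data is invented.

```perl
use strict;
use warnings;

# In-memory strings stand in for the tutorial's input and output files.
my $input_text  = "From,Subject\nalice\@example.com,Hello\n";
my $output_text = '';

# $in and $out are scalars; each one holds a single file handle.
open my $in,  '<', \$input_text  or die "Cannot open input: $!";
open my $out, '>', \$output_text or die "Cannot open output: $!";

my $first_line = <$in>;            # read one line from the input handle
print {$out} "Read: $first_line";  # write to the output handle

close $in;
close $out;

print $output_text;
```

With real files, the third argument to "open" would be a file name rather than a reference to a string.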

Lines 70 to 74 - Print the Header to the Output File
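The debugging steps later in this tutorial note that these lines are raw output delimited by "here" document markers. A minimal sketch of that technique follows. The XML element name EMAILS and the marker EOT are assumptions made for illustration, and an in-memory string stands in for the output file.

```perl
use strict;
use warnings;

my $header = '';
open my $out, '>', \$header or die "Cannot open output: $!";

# Everything between <<'EOT' and the closing EOT line is printed verbatim.
# Single quotes around the marker turn off variable interpolation.
print {$out} <<'EOT';
<?xml version="1.0"?>
<EMAILS>
EOT

close $out;
print $header;
```

A here document keeps multi-line output readable because the text appears in the source exactly as it will appear in the file.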

Setting Up Input Variables

Lines 77 to 80 - Assign Method Call to Scalar Variable

Perl Pointer: Good Perl code is liberally annotated with comments (indicated by the "#" symbol).

Lines 80 to 81 - Method "getline"

Starting the Processing Loop

Line 82 - "while" Loop

Komodo Tip: Click on the minus symbol to the left of line 82. The entire section of nested code will be collapsed. This is Code Folding.

Komodo Tip: Click on line 82. Notice that the opening brace changes to a bold red font. The matching closing brace on line 113 is displayed the same way.
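For readers new to Perl loops, here is a minimal sketch of a "while" loop that processes input one record at a time. It reads plain lines from an in-memory string with invented data; parse.pl itself may drive its loop differently, for example through the CSV parser.

```perl
use strict;
use warnings;

# Invented sample data standing in for the input file.
my $data = "a,b\nc,d\n";
open my $in, '<', \$data or die "Cannot open input: $!";

my $count = 0;
while (my $line = <$in>) {   # the loop ends when no lines remain
    chomp $line;             # remove the trailing newline
    $count++;
    print "Record $count: $line\n";
}

close $in;
```

Everything between the opening and closing braces runs once per record, which is why the debugger revisits the same lines on each iteration.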

Lines 83 to 87 - Extracting a Line of Input Data

Perl Pointer: Array variables store lists of items indexed by number; their symbol ("@") is shaped like an "a", for "array".
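A short sketch of array indexing follows. The sample line is invented, and "split" is used only to keep the example small; it is not a substitute for a real CSV parser such as Text::CSV_XS, which also handles quoted fields.

```perl
use strict;
use warnings;

# Invented sample line; single quotes keep the "@" literal.
my $line = 'alice@example.com,Hello,See you at noon';

# split returns a list, which is stored in the array @fields.
my @fields = split /,/, $line;

print "Number of fields: ", scalar(@fields), "\n";
print "First field: $fields[0]\n";    # indexes start at 0
print "Last field: $fields[-1]\n";    # negative indexes count from the end
```

Note that a single element is written with "$" (as in $fields[0]) because one element is a scalar.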

Converting Characters with a Regular Expression

Lines 89 to 93 - "foreach"

Komodo Tip: Komodo's Rx Toolkit is a powerful tool for creating and debugging regular expressions. See Regular Expressions Primer for more information.
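As an illustration of converting characters inside a "foreach" loop, the sketch below replaces the characters that are special in XML with their entities. The exact substitutions performed by parse.pl are not shown in this section, so treat the three below as an assumption; the sample data is invented.

```perl
use strict;
use warnings;

my @fields = ('Tom & Jerry', 'a < b', 'x > y');

# Inside foreach, $field is an alias for each array element,
# so the substitutions change @fields in place.
foreach my $field (@fields) {
    # Replace "&" first so the entities added below are not escaped again.
    $field =~ s/&/&amp;/g;
    $field =~ s/</&lt;/g;
    $field =~ s/>/&gt;/g;
}

print "$_\n" for @fields;
```

The /g modifier makes each substitution apply to every match in the string rather than only the first.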

Combining Field Reference and Field Data

Lines 95 to 97 - hash slice

Perl Pointer: Hash variables are indicated by the symbol "%", and store lists of items indexed by string.
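A hash slice assigns several keys at once, which is a compact way to pair field names with field data. The sketch below shows the syntax; the field names and values are hypothetical and are not taken from mailexport.txt.

```perl
use strict;
use warnings;

# Hypothetical field names and values for illustration.
my @names  = ('From', 'Subject', 'Body');
my @values = ('alice@example.com', 'Hello', 'See you at noon');

my %record;
# The "@" sigil with curly braces marks a hash slice:
# each name in @names becomes a key paired with the matching value.
@record{@names} = @values;

print "$_: $record{$_}\n" for @names;
```

The slice is equivalent to assigning $record{From}, $record{Subject} and $record{Body} one at a time.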

Writing Data to the Output File

Lines 99 to 112 - Writing Data to the Output File
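One way to wrap record data in XML tags is an interpolating here document. This is a sketch under stated assumptions: the element names (EMAIL, FROM, SUBJECT, BODY) and the record contents are invented for illustration and may differ from what parse.pl writes.

```perl
use strict;
use warnings;

# Invented record; in the tutorial the data comes from the input file.
my %record = (
    From    => 'alice@example.com',
    Subject => 'Hello',
    Body    => 'See you at noon',
);

# Double quotes around the marker turn on variable interpolation,
# so each $record{...} is replaced by its value.
my $xml = <<"EOT";
<EMAIL>
  <FROM>$record{From}</FROM>
  <SUBJECT>$record{Subject}</SUBJECT>
  <BODY>$record{Body}</BODY>
</EMAIL>
EOT

print $xml;
```

In a real program the string would be printed to the output file handle instead of standard output.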

Closing the Program

Line 113 - Closing the Processing Loop

Lines 114 to 116 - Ending the Program

Run the Program to Generate Output

To start, you will simply generate the output by running the program through the debugger without setting any breakpoints.

  1. Clear the contents of mailexport.xml: Click on the "mailexport.xml" tab in the Editor Pane. Delete the contents of the file (you will regenerate it in the next step). Save the file.
  2. Run the Debugger: Click on the "parse.pl" tab in the editor. From the menu, select Debug|Go/Continue. In the Debugging Options dialog box, click OK to accept the defaults.
  3. View the contents of mailexport.xml: Click on the "mailexport.xml" tab in the editor. Komodo informs you that the file has changed. Click OK to reload the file.

Debugging the Program

In this step you'll add breakpoints to the program and "debug" it. Adding breakpoints lets you run the program in chunks, making it possible to watch variables and view output as it is generated. Before you begin, ensure that line numbering is enabled in Komodo (View|View Line Numbers).

  1. Set a breakpoint: On the "parse.pl" tab, click in the grey margin immediately to the left of the code on line 70 of the program. This will set a breakpoint, indicated by a red circle.
  2. Run the Debugger: Select Debug|Go/Continue. In the Debugging Options dialog box, click OK to accept the defaults. The debugger will process the program until it encounters the first breakpoint.

Komodo Tip: Debugger commands can be accessed from the Debug menu, by shortcut keys, or from the Debug Toolbar. For a summary of debugger commands, see Debugger Command List.

  3. Watch the debug process: A yellow arrow on the breakpoint indicates the position at which the debugger has halted. Click on the "mailexport.xml" tab. Komodo informs you that the file has changed. Click OK to reload the file.
  4. View variables: In the Bottom Pane, see the Debug tab. The variables "$in" and "$out" appear in the Locals tab.
  5. Line 70 - Step In: Select Debug|Step In. "Step In" is a debugger command that causes the debugger to execute the current line and then stop at the next processing line (notice that the lines between 70 and 74 are raw output indicated by "here" document markers).
  6. Line 77 - Step In: On line 77, the processing transfers to the module Text::CSV_XS. Komodo opens the file CSV_XS.pm and stops the debugger at the active line in the module. If you haven't installed Text::CSV_XS, stepping in will take you to line 18 of parse.pl.
  7. Line 61 in CSV_XS.pm, line 19 in parse.pl - Step Out: Select Debug|Step Out. The Step Out command will make the debugger execute the function in Text::CSV_XS and pause at the next line of processing, which is back in parse.pl on line 80.
  8. Line 80 - Step Over: Select Debug|Step Over. The debugger will process the function in line 80 without opening the module containing the "getline" function.

Komodo Tip: What do the debugger commands do?

  • Step In executes the current line of code and pauses at the following line.
  • Step Over executes the current line of code. If the line of code calls a function or method, the function or method is executed in the background and the debugger pauses at the line that follows the original line.
  • Step Out: when the debugger is within a function or method, Step Out executes the rest of that function or method without stepping through it line by line. The debugger stops on the line of code following the function or method call in the calling program.

  9. Line 83 - Set Another Breakpoint: After the debugger stops at line 82, click in the grey margin immediately to the left of the code on line 83 to set another breakpoint.

Perl Pointer: The Perl debugger will not break on certain parts of control structures, such as lines containing only braces ({ }).
With Perl 5.6 and earlier, the debugger will also not break at the start of while, until, for, or foreach statements.

  10. Line 83 - Step Out: It appears that nothing happened. However, the debugger actually completed one iteration of the "while" loop (lines 82 to 113). To see how this works, set another breakpoint at line 99, and Step Out again. The debugger will stop at line 99. On the Debug Session tab, look at the data assigned to the $record variable. Then Step Out, and notice that $record is no longer displayed, and the debugger is back on line 83. Step Out again, and look at the $record variable: it now contains data from the next record in the input file.
  11. Line 99 - Stop the Debugger: Select Debug|Stop to stop the Komodo debugger.

Perl Pointer: Did you notice that output wasn't written to mailexport.xml after every iteration of the while loop?
This is because Perl maintains an internal buffer for writing to files. You can set the buffer to "autoflush" using the special Perl variable "$|".
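The following sketch shows one way to turn on autoflush for a file handle. It is an illustration only, not a change the tutorial asks you to make to parse.pl. A temporary file created with the core File::Temp module stands in for mailexport.xml, and the record text is invented.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for the tutorial's output file.
my ($out, $path) = tempfile(UNLINK => 1);

# "$|" applies to the currently selected handle, so select $out,
# set the variable, then restore the previously selected handle.
my $previous = select($out);
$| = 1;
select($previous);

print {$out} "first record\n";

# With autoflush on, the data reaches the file before the handle is closed.
my $size_before_close = -s $path;
print "Bytes on disk before close: $size_before_close\n";

close $out;
```

Autoflush makes output visible immediately, which helps while debugging, but writing after every print can be slower than buffered output.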

More Perl Resources

ASPN, the ActiveState Programmer Network

ASPN, the ActiveState Programmer Network, provides extensive resources for Perl programmers:

Documentation

There is a wealth of documentation available for Perl. The first source for language documentation is the Perl distribution installed on your system. To access the documentation contained in the Perl distribution, use the following commands:

Tutorials and Reference Sites

There are many Perl tutorials and beginner Perl sites on the Internet, such as: