Monday, May 2, 2011

Getting Started: Perl: Chapter 1


Welcome to Teach Yourself Perl 5 in 21 Days. Today you'll learn about the following:


  • What Perl is and why Perl is useful
  • How to get Perl if you do not already have it
  • How to run Perl programs
  • How to write a very simple Perl program
  • The difference between interpretive and compiled programming languages
  • What an algorithm is and how to develop one

What Is Perl?

Perl is an acronym, short for Practical Extraction and Report Language. It was designed by Larry Wall as a tool for writing programs in the UNIX environment and is continually being updated and maintained by him.
For its many fans, Perl provides the best of several worlds. For instance:
  • Perl has the power and flexibility of a high-level programming language such as C. In fact, as you will see, many of the features of the language are borrowed from C.
  • Like shell script languages, Perl does not require a special compiler and linker to turn the programs you write into working code. Instead, all you have to do is write the program and tell Perl to run it. This means that Perl is ideal for producing quick solutions to small programming problems, or for creating prototypes to test potential solutions to larger problems.
  • Perl provides all the features of the script languages sed and awk, plus features not found in either of these two languages. Perl also supports a sed-to-Perl translator and an awk-to-Perl translator.

In short, Perl is as powerful as C but as convenient as awk, sed, and shell scripts.


NOTE



This book assumes that you are familiar with the basics of using the UNIX operating system.




As you'll see, Perl is very easy to learn. Indeed, if you are familiar with other programming languages, learning Perl is a snap. Even if you have very little programming experience, Perl can have you writing useful programs in a very short time. By the end of Day 2, "Basic Operators and Control Flow," you'll know enough about Perl to be able to solve many problems.

How Do I Find Perl?

To find out whether Perl already is available on your system, do the following:
  • If you are currently working in a UNIX programming environment, check to see whether the file /usr/local/bin/perl exists.
  • If you are working in any other environment, check the place where you normally keep your executable programs, or check the directories accessible from your PATH environment variable.

If you do not find Perl in this way, talk to your system administrator and ask whether she or he has Perl running somewhere else. If you don't have Perl running in your environment, don't despair; read on!

Where Do I Get Perl?

One of the reasons Perl is becoming so popular is that it is available free of charge to anyone who wants it. If you are on the Internet, you can obtain a copy of Perl with File Transfer Protocol (FTP). The following is a sample FTP session that transfers a copy of the Perl distribution. The items you would enter during the session are the text that follows the $ and ftp> prompts and the login name anonymous; you would also type a password, which is not displayed.
$ ftp prep.ai.mit.edu

Connected to prep.ai.mit.edu.

220 aeneas FTP server (Version wu-2.4(1) Thu Apr 14 20:21:35 EDT 1994) ready.

Name (prep.ai.mit.edu:dave): anonymous

331 Guest login ok, send your complete e-mail address as password.

Password:

230-Welcome, archive user!

230-

230-If you have problems downloading and are seeing "Access denied" or

230-"Permission denied", please make sure that you started your FTP 

230-client in a directory to which you have write permission.

230-

230-If you have any problems with the GNU software or its downloading, 

230-please refer your questions to <gnu@PREP.AI.MIT.EDU>. If you have any

230-other unusual problems, please report them to <root@aeneas.MIT.EDU>.

230-

230-If you do have problems, please try using a dash (-) as the first 

230-character of your password - this will turn off the continuation

230-messages that may be confusing your FTP client.

230-

230 Guest login ok, access restrictions apply.

ftp> cd pub/gnu

250-If you have problems downloading and are seeing "Access denied" or

250-"Permission denied", please make sure that you started your FTP

250-client in a directory to which you have write permission.

250-

250-Please note that all files ending in '.gz' are compressed with 

250-'gzip', not with the unix 'compress' program.  Get the file README

250- and read it for more information.

250-

250-Please read the file README

250-  it was last modified on Thu Feb 1 15:00:50 1996 - 32 days ago

250-Please read the file README-about-.diff-files

250-  it was last modified on Fri Feb 2 12:57:14 1996 - 31 days ago

250-Please read the file README-about-.gz-files

250-  it was last modified on Wed Jun 14 16:59:43 1995 - 264 days ago

250 CWD command successful.

ftp> binary

200 Type set to I.

ftp> get perl-5.001.tar.gz

200 PORT command successful.

150 Opening BINARY mode data connection for perl-5.001.tar.gz (1130765 bytes).

226 Transfer complete.

1130765 bytes received in 9454 seconds (1.20 Kbytes/s)

ftp> quit

221 Goodbye.

$


The commands entered in this session are explained in the following steps. If some of these steps are not familiar to you, ask your system administrator for help.
  1. The command

    $ ftp prep.ai.mit.edu

    connects you to the main Free Software Foundation source repository at MIT.

  2. The user ID anonymous tells FTP that you want to perform an anonymous FTP operation.

  3. When FTP asks for a password, enter your user ID and network address (in other words, your e-mail address). This lets the MIT system administrator know who is using the MIT archives. (For security reasons, the password is not displayed when you type it.)

  4. The command cd pub/gnu sets your current working directory to be the directory containing the Perl source.

  5. The binary command tells FTP that the file you'll be receiving contains binary (non-text) data, so it must be transferred byte for byte.

  6. The get command copies the file perl-5.001.tar.gz from the MIT source repository to your own site. (It's usually best to do this in off-peak hours to make things easier for other Internet users; it takes a while.) This file is quite large because it contains all the source files for Perl bundled together into a single file.

  7. The quit command disconnects from the MIT source repository and returns you to your own system.

Once you've retrieved the Perl distribution, do the following:
  1. Create a directory and move the file you just received, perl-5.001.tar.gz, to this directory. (Alternatively, move it to a directory already reserved for this purpose.)

  2. The perl-5.001.tar.gz file is compressed to save space. To uncompress it, enter the command

    $ gunzip perl-5.001.tar.gz

    gunzip is the GNU uncompress program. If it's not available on your system, see your system administrator. (You can, in fact, retrieve it from prep.ai.mit.edu using anonymous FTP with the same commands you used to retrieve the Perl distribution.)

    When you run gunzip, the file perl-5.001.tar.gz is replaced by perl-5.001.tar, which is the uncompressed version of the Perl distribution file.

  3. The next step is to unpack the Perl distribution. In other words, use the information in the Perl distribution to create the Perl source files. To do this, enter the following command:

    $ tar xvf - <perl-5.001.tar

    As this command executes, it creates each source file in turn and displays the name and size of each file as it is created. The tar command also creates subdirectories where appropriate; this ensures that the Perl source files are organized in a logical way.

  4. Using your favorite C compiler, build Perl from the source code. The distribution includes a Configure script; running it generates the makefile that is then used to compile Perl for your system. (See the documentation files included in the distribution for the exact steps.)

  5. Place the compiled Perl executable into the directory where you normally keep your executables. On UNIX systems, this directory usually is called /usr/local/bin, and Perl usually is named /usr/local/bin/perl.


You might need your system administrator's help to do this because you might not have the necessary permissions.

Other Places to Get Perl

If you cannot access the MIT site from where you are, you can get Perl from the following sites using anonymous FTP:
North America

Site                     Internet address    Directory
ftp.netlabs.com          192.94.48.152       /pub/outgoing/perl5.0
ftp.cis.ufl.edu          128.227.100.198     /pub/perl/src/5.0
ftp.uu.net               192.48.96.9         /languages/perl
ftp.khoros.unm.edu       198.59.155.28       /pub/perl
ftp.cbi.tamucc.edu       165.95.1.3          /pub/duff/Perl
ftp.metronet.com         192.245.137.1       /pub/perl/sources
genetics.upenn.edu       128.91.200.37       /perl5

Europe

Site                     Internet address    Directory
ftp.cs.ruu.nl            131.211.80.17       /pub/PERL/perl5.0/src
ftp.funet.fi             128.214.248.6       /pub/languages/perl/ports/perl5
ftp.zrz.tu-berlin.de     130.149.4.40        /pub/unix/perl
src.doc.ic.ac.uk         146.169.17.5        /packages/perl5

Australia

Site                     Internet address    Directory
sungear.mame.mu.oz.au    128.250.209.2       /pub/perl/src/5.0

South America

Site                     Internet address    Directory
ftp.inf.utfsm.cl         146.83.198.3        /pub/gnu


You also can obtain Perl from most sites that store GNU source code, or from any site that archives the Usenet newsgroup comp.sources.unix.

A Sample Perl Program

Now that Perl is available on your system, it's time to show you a simple program that illustrates how easy it is to use Perl. Listing 1.1 is a simple program that asks for a line of input and writes it out.


Listing 1.1. A simple Perl program that reads and writes a line of input.
1: #!/usr/local/bin/perl

2: $inputline = <STDIN>;

3: print( $inputline );



$ program1_1

This is my line of input.

This is my line of input.

$



Line 1 is the header comment. Line 2 reads a line of input. Line 3 writes the line of input back to your screen.

The following sections describe how to create and run this program, and they describe it in more detail.

Running a Perl Program

To run the program shown in Listing 1.1, do the following:
  1. Using your favorite editor, type the previous program and save it in a file called program1_1.

  2. Tell the system that this file contains executable statements. To do this in the UNIX environment, enter the command

    $ chmod +x program1_1

  3. Run the program by entering the command

    $ program1_1

When you run program1_1, it waits for you to enter a line of input. After you enter the line of input, program1_1 prints what you entered, as follows:
$ program1_1

This is my line of input.

This is my line of input.

$ 

If Something Goes Wrong

If Listing 1.1 is stored in the file program1_1 and run according to the preceding steps, the program should run successfully. If the program doesn't run, one of two things has likely happened:
  • The system can't find the file program1_1.


  • The system can't find Perl.

If you receive the error message
program1_1 not found

or something similar, your system couldn't find the file program1_1. To tell the system where program1_1 is located, you can do one of two things in a UNIX environment:
  • Enter the command ./program1_1, which gives the system the pathname of program1_1 relative to the current directory.
  • Add the current directory . to your PATH environment variable. This tells the system to search in the current directory when looking for executable programs such as program1_1.

If you receive the message
/usr/local/bin/perl not found

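or something similar, this means that the system could not find the Perl interpreter at /usr/local/bin/perl: either Perl is not installed on your machine or it is installed in a different location. See the section "How Do I Find Perl?" earlier today for more details.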
If you don't understand these instructions or are still having trouble running Listing 1.1, talk to your system administrator.
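
If Perl is installed somewhere other than /usr/local/bin/perl but can still be started by typing perl, a short script can list where it lives. The following is a minimal sketch, not part of the original chapter: the helper name find_in_dirs and the file name whereperl are made up for illustration, and the script assumes a UNIX-style PATH whose directories are separated by colons. Save it as whereperl and run it with the command perl whereperl.

```perl
use strict;
use warnings;

# Return the full path of every executable file called $name found in @dirs.
# find_in_dirs is a helper invented for this sketch, not a Perl built-in.
sub find_in_dirs {
    my ($name, @dirs) = @_;
    my @found;
    foreach my $dir (@dirs) {
        next if $dir eq '';                  # skip empty PATH entries
        my $candidate = "$dir/$name";
        push @found, $candidate if -f $candidate && -x $candidate;
    }
    return @found;
}

# On UNIX, PATH is a colon-separated list of directories.
my $path = defined $ENV{PATH} ? $ENV{PATH} : '';
my @candidates = find_in_dirs('perl', split(/:/, $path));

# Print one candidate location per line.
print("$_\n") foreach @candidates;

# $^X holds the name that was used to start the interpreter running this script.
print("This script is running under: $^X\n");
```

Any path the script prints is a candidate for the first line of your own programs in place of /usr/local/bin/perl.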

The First Line of Your Perl Program: How Comments Work

Now that you've run your first Perl program, let's look at each line of Listing 1.1 and figure out what it does.
Line 1 of this program is a special line that tells the system that this is a Perl program:
#!/usr/local/bin/perl

Let's break this line down, one part at a time:
  • The first character in the line, the # character, is the Perl comment character. It tells the Perl interpreter that this line is not an executable instruction.
  • The ! character is a special character. When #! appears at the very start of a file, it tells the UNIX system that the rest of the line names the program that should run the file. (You don't need to worry about the details. All you have to do is remember to include it.)
  • The path /usr/local/bin/perl is the location of the Perl executable on your system. This executable interprets your program; in other words, it figures out what you want to do and then does it. Because the Perl executable has the job of interpreting Perl instructions, it usually is called the Perl interpreter.

If, after reading this, you still don't understand the meaning of the line #!/usr/local/bin/perl, don't worry. The actual specifics of what it does are not important for our purposes in this book. Just remember to include it as the first line of your program, and Perl will take it from there.


NOTE



If you are running Perl on a system other than UNIX, you might need to replace the line #!/usr/local/bin/perl with some other line indicating the location of the Perl interpreter on your system. Ask your system administrator for details on what you need to include here.

After you have found out what the proper first line is in your environment, include that line as the first line of every Perl program you write, and you're all set.




Comments

As you have just seen, the first character of the line
#!/usr/local/bin/perl

is the comment character, #. When the Perl interpreter sees the #, it ignores the rest of that line.
Comments can be appended to lines containing code, or they can be lines of their own:
$inputline = <STDIN>;    # this line contains an appended comment

# this entire line is a comment

You can, and should, use comments to make your programs easier to understand. Listing 1.2 is the simple program you saw earlier, but it has been modified to include comments explaining what the program does.


NOTE



As you work through the lessons in this book and create your own programs, such as the one in Listing 1.2, you can, of course, name them anything you want. For illustration and discussion purposes, I've adopted the convention of using a name that corresponds to the listing number. For example, the program in Listing 1.2 is called program1_2.

The program name is used in the Input-Output examples such as the one following this listing, as well as in the Analysis section where the listing is discussed in detail. When you follow the Input-Output example, just remember to substitute your program's name for the one shown in the example.






Listing 1.2. A simple Perl program with comments.
1: #!/usr/local/bin/perl

2: # this program reads a line of input, and writes the line

3: # back out

4: $inputline = <STDIN>;    # read a line of input

5: print( $inputline );     # write the line out



$ program1_2

This is a line of input.

This is a line of input.

$



The behavior of the program in Listing 1.2 is identical to that of Listing 1.1 because the actual code is the same. The only difference is that Listing 1.2 has comments in it.
Note that in an actual program, comments normally are used only to explain complicated code or to indicate that the following lines of code perform a specific task. Because Perl instructions usually are pretty straightforward, Perl programs don't need to have a lot of comments.





DO use comments whenever you think that a line of code is not easy to understand.



DON'T clutter up your code with unnecessary comments. The goal is readability. If a comment makes a program easier to read, include it. Otherwise, don't bother.




DON'T put anything else after /usr/local/bin/perl in the first line:


#!/usr/local/bin/perl



This line is a special comment line, and it is not treated like the others.




Line 2: Statements, Tokens, and <STDIN>

Now that you've learned what the first line of Listing 1.1 does, let's take a look at line 2:
$inputline = <STDIN>;

This is the first line of code that actually does any work. To understand what this line does, you need to know what a Perl statement is and what its components are.

Statements and Tokens

The line of code you have just seen is an example of a Perl statement. Basically, a statement is one task for the Perl interpreter to perform. A Perl program can be thought of as a collection of statements performed one at a time.
When the Perl interpreter sees a statement, it breaks the statement down into smaller units of information. In this example, the smaller units of information are $inputline, =, <STDIN>, and ;. Each of these smaller units of information is called a token.

Tokens and White Space

Tokens can normally be separated by as many spaces and tabs as you like. For example, the following statements are identical in Perl:
$inputline = <STDIN>;

$inputline=<STDIN>;

$inputline      =     <STDIN>;


Your statements can take up as many lines of code as you like. For example, the following statement is equivalent to the ones above:
$inputline

=

<STDIN>

;

The collection of spaces, tabs, and newlines separating one token from another is known as white space.
When programming in Perl, you should use white space to make your programs more readable. The examples in this book use white space in the following ways:
  • New statements always start on a new line.
  • One blank space is used to separate one token from another (except in special cases, some of which you'll see today).

What the Tokens Do: Reading from Standard Input

As you've seen already, the statement
$inputline = <STDIN>;

consists of four tokens: $inputline, =, <STDIN>, and ;. The following subsections explain what each of these tokens does.

The $inputline and = Tokens

The first token in line 2, $inputline (at the left of the statement), is an example of a scalar variable. In Perl, a scalar variable can store one piece of information.
The = token, called the assignment operator, tells the Perl interpreter to store the item specified by the token to the right of the = in the place specified by the token to the left of the =. In this example, the item on the right of the assignment operator is the <STDIN> token, and the item to the left of the assignment operator is the $inputline token. Thus, the line read by <STDIN> is stored in the scalar variable $inputline.
Scalar variables and assignment operators are covered in more detail on Day 2, "Basic Operators and Control Flow."

The <STDIN> Token and the Standard Input File

The next token, <STDIN>, represents a line of input from the standard input file. The standard input file, or STDIN for short, typically contains everything you enter when running a program.
For example, when you run program1_1 and enter
This is a line of input.

the line you enter is stored in the standard input file.
The <STDIN> token tells the Perl interpreter to read one line from the standard input file, where a line is defined to be a set of characters terminated by a new line. In this example, when the Perl interpreter sees <STDIN>, it reads in
This is a line of input.

If the Perl interpreter then sees another <STDIN> in a different statement, it reads another line of data from the standard input file. The line of data you read earlier is destroyed unless it has been copied somewhere else.


NOTE



If there are more lines of input than there are <STDIN> tokens, the extra lines of input are ignored.




Because the <STDIN> token is to the right of the assignment operator =, the line
This is a line of input.

is assigned to the scalar variable $inputline.
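
To see that a line read this way keeps its terminating newline, here is a minimal sketch. It does not read from the real standard input file; instead it assumes Perl 5.8 or later and opens a string as a stand-in file, so it behaves the same every time it runs. The variable names $typed and $input, the sample text, and the use of use strict, use warnings, my, open, and length are assumptions of this sketch rather than part of Listing 1.1.

```perl
use strict;
use warnings;

# Stand-in for the standard input file: a string holding two made-up lines.
my $typed = "first line\nsecond line\n";
open(my $input, '<', \$typed) or die "cannot open in-memory file: $!";

# Read one line; the newline that ends the line is kept.
my $inputline = <$input>;

print($inputline);                  # prints: first line
print(length($inputline), "\n");    # prints: 11 (ten characters plus the newline)

close($input);
```

The second line in $typed is never read because the sketch contains only one read, in the same way that extra lines typed at the keyboard go unused.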

The ; Token

The ; token at the end of the statement is a special token that tells Perl the statement is complete. You can think of it as a punctuation mark that is like a period in English.

Line 3: Writing to Standard Output

Now that you understand what statements and tokens are, consider line 3 of Listing 1.1, which is
print ($inputline);

This statement refers to the library function that is called print. Library functions, such as print, are provided as part of the Perl interpreter; each library function performs a useful task.
The print function's task is to send data to the standard output file. The standard output file stores data that is to be written to your screen. The standard output file sometimes appears in Perl programs under the name STDOUT.
In this example, print sends $inputline to the standard output file. Because the second line of the Perl program assigns the line
This is a line of input.

to $inputline, this is what print sends to the standard output file and what appears on your screen.

Function Invocations and Arguments

When a reference to print appears in a Perl program, the Perl interpreter calls, or invokes, the print library function. This function invocation is similar to a function invocation in C, a GOSUB statement in BASIC, or a PERFORM statement in COBOL. When the Perl interpreter sees the print function invocation, it executes the code contained in print and returns to the program when print is finished.
Most library functions require information to tell them what to do. For example, the print function needs to know what you want to print. In Perl, this information is supplied as a sequence of comma-separated items located between the parentheses of the function invocation. For example, the statement you've just seen:
print ($inputline);

supplies one piece of information that is passed to print: the variable $inputline. This piece of information commonly is called an argument.
The following call to print supplies two arguments:
print ($inputline, $inputline);

You can supply print with as many arguments as you like; it prints each argument starting with the first one (the one on the left). In this case, print writes two copies of $inputline to the standard output file.
You also can tell print to write to any other specified file. You'll learn more about this on Day 6, "Reading From and Writing To Files."
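
The argument rules described above can be tried out with a short sketch. It is an illustration only: instead of reading from <STDIN>, it assigns a made-up line directly to $inputline so that the output is predictable, and it adds use strict, use warnings, and my, which the listings in this chapter do not use.

```perl
use strict;
use warnings;

# A stand-in for a line read from <STDIN>; the text is made up for this sketch.
my $inputline = "This is a line of input.\n";

# One argument: writes the line once.
print($inputline);

# Three arguments: print writes them left to right with nothing added between.
print("You typed: ", $inputline, "Done.\n");

# Naming STDOUT explicitly sends the data to the same place as the calls above.
print STDOUT ($inputline);

# Output:
# This is a line of input.
# You typed: This is a line of input.
# Done.
# This is a line of input.
```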

Error Messages

If you incorrectly type a statement when creating a Perl program, the Perl interpreter will detect the error and tell you where the error is located.
For example, look at Listing 1.3. This program is identical to the program you've been seeing all along, except that it contains one small error. Can you spot it?



Listing 1.3. A program containing an error.
1: #!/usr/local/bin/perl

2: $inputline = <STDIN>

3: print ($inputline);



$ program1_3

Syntax error in file program1_3 at line 3, next char (

Execution of program1_3 aborted due to compilation errors. 

$



When you try to run this program, an error message appears. The Perl interpreter has detected that line 2 of the program is missing its closing ; character. The error message tells you that there is a syntax error and identifies where the interpreter noticed it: line 3, at the ( that follows print, which is the first place the missing ; becomes apparent.


TIP



You should fix errors starting from the beginning of your program and working down.


When the Perl interpreter detects an error, it tries to figure out what you meant to say and carries on from there; this feature is known as error recovery. Error recovery enables the interpreter to detect as many errors as possible at one time, which speeds up the development process.

Sometimes, however, the Perl interpreter can get confused and think you meant to do one thing when you really meant to do another. In this situation, the interpreter might start trying to detect errors that don't really exist. This problem is known as error cascading.

It's usually pretty easy to spot error cascading. If the interpreter is telling you that errors exist on several consecutive lines, it usually means that the interpreter is confused. Fix the first error, and the others might very well go away.





Interpretive Languages Versus Compiled Languages

As you've seen, running a Perl program is easy. All you need to do is create the program, mark it as executable, and run it. The Perl interpreter takes care of the rest. Languages such as Perl that are processed by an interpreter are known as interpretive languages.
Some programming languages require more complicated processing. If a language is a compiled language, the program you write must be translated into machine-readable code by a special program known as a compiler. In addition, library code might need to be added by another special program known as a linker. After the compiler and linker have done their jobs, the result is a program that can be executed on your machine, assuming, of course, that you have written the program correctly. If not, you have to compile and link the program all over again.
Interpretive languages and compiled languages both have advantages and disadvantages, as follows:
  • As you've seen with Perl, it takes very little time to get a program running in an interpretive language, because there is no separate compile and link step.
  • Interpretive languages, however, cannot run unless the interpreter is available. Compiled programs, on the other hand, can be transferred to any machine that understands them.

As you'll see, Perl is as powerful as a compiled language. This means that you can do a lot of work quickly and easily.

Summary

Today you learned that Perl is a programming language that provides many of the capabilities of a high-level programming language such as C. You also learned that Perl is easy to use; basically, you just write the program and run it.
You saw a very simple Perl program that reads a line of input from the standard input file and writes the line to the standard output file. The standard input file stores everything you type from your keyboard, and the standard output file stores everything your Perl program sends to your screen.
You learned that Perl programs contain a header comment, which indicates to the system that your program is written in Perl. Perl programs also can contain other comments, each of which must be preceded by a #.
Perl programs consist of a series of statements, which are executed one at a time. Each statement consists of a collection of tokens, which can be separated by white space.
Perl programs call library functions to perform certain predefined tasks. One example of a library function is print, which writes to the standard output file. Library functions are passed chunks of information called arguments; these arguments tell a function what to do.
The Perl interpreter executes the Perl programs you write. If it detects an error in your program, it displays an error message and uses the error-recovery process to try to continue processing your program. If Perl gets confused, error cascading can occur, and the Perl interpreter might display inappropriate error messages.
Finally, you learned about the differences between interpretive languages and compiled languages, and that Perl is an example of an interpretive language.

Q&A

Q: Is there any particular editor I need to use with Perl?

A: No. Perl programs are ordinary text files. You can use any text editor you like.

Q: Why do I need to enter the chmod +x command before running my program?

A: Because Perl programs are ordinary text files, the UNIX operating system does not know that they are executable programs. By default, text files have read and write permissions granted, which means you can look at your file or change it. The chmod +x command adds execute permission to the file; when this permission is granted, the system knows that this is an executable program.
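
Execute permission can also be inspected and changed from inside a Perl program. The sketch below is an illustration under stated assumptions, not something this chapter covers: make_executable is a helper name invented here, and it uses Perl's built-in stat and chmod functions and the -e and -x file tests to do roughly what the command chmod a+x does, adding execute permission for the owner, the group, and everyone else.

```perl
use strict;
use warnings;

# Add execute permission for owner, group, and others to $file.
# make_executable is a helper invented for this sketch.
# Returns true on success, false if the file cannot be examined or changed.
sub make_executable {
    my ($file) = @_;
    my @info = stat($file);
    return 0 unless @info;                  # file is missing or cannot be examined
    my $mode = $info[2] & 07777;            # keep only the permission bits
    return chmod($mode | 0111, $file) == 1; # chmod returns the number of files changed
}

my $program = 'program1_1';                 # file name used for Listing 1.1
if (-e $program) {
    make_executable($program) or warn "could not change $program: $!\n";
    print("$program is ", (-x $program ? "" : "not "), "executable\n");
}
```

If program1_1 does not exist in the current directory, the sketch does nothing.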


Q: Can I use print to print other things besides input lines?

A: Yes. You'll learn more about how you can use print on Day 3, "Understanding Scalar Values."

Q: Why is Perl available for free?

A: This encourages the dissemination of computer knowledge and capabilities.

It works like this: You can get Perl for free, and you can use it to write interesting and useful programs. If you want, you can then give these programs away and let other people write interesting and useful programs based on your programs. This way, everybody benefits.

You also can modify the source for Perl, provided you tell everybody that your version is a modification of the original. This means that if you think of a clever thing you want Perl to do, you can add it yourself. (However, you can't blame anybody else if your modification breaks something or if it doesn't work.)


Of course, you don't have to give your Perl programs away for free. In fact, you even can sell your Perl programs, provided you don't borrow anything from somebody else's program.

Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered and exercises to give you experience in using what you've learned. Try to understand the quiz and exercise answers before continuing to the next day.

Quiz

  1. What do Perl's fans appreciate about Perl?

  2. What does the Perl interpreter do?

  3. Define the following terms:

    a. statement
    b. token
    c. argument
    d. error recovery
    e. standard input file

  4. What is a comment, and where can it appear?

  5. Where is Perl usually located on a UNIX machine?

  6. What is a header comment, and where does it appear in a program?

  7. What is a library function?

Exercises

  1. Modify program1_1 to print the input line twice.

  2. Modify program1_1 to read and print two different input lines.

  3. Modify program1_1 to read two input lines and print only the second one.


  4. BUG BUSTER: What is wrong with the following program?




    #!/usr/local/bin/perl


    $inputline = <STDIN>;


    print ($inputline)


  5. BUG BUSTER: What is wrong with the following program?





    #!/usr/local/bin/perl


    $inputline = <STDIN>;


    # print my line! print($inputline);


  6. What does the following program do?


    #!/usr/local/bin/perl


    $inputline = <STDIN>;


    $inputline2 = <STDIN>;


    print ($inputline2);


    print ($inputline);





