Misplaced Pages

Perl: Difference between revisions

Revision as of 22:35, 13 May 2006 by Harmil (minor edit; summary: "Comparitive performance: spelling")
Revision as of 22:43, 13 May 2006 by -Barry- (summary: "Comparative performance: Wait for RfC to play out. Too much info lost. Unsuppoted claims made. I think I agree, but they could be better worded and supported.")
Line 321:
In Perl 5, database interfaces are implemented by Perl DBI modules. The <tt>DBI</tt> (Database Interface) module presents a single, database-independent interface to Perl applications, while the <tt>DBD::</tt> (Database Driver) modules handle the details of accessing some 50 different databases. There are <tt>DBD::</tt> drivers for most ANSI SQL databases.


==Benchmarks==
==Comparative performance==
===General===
Benchmarks that compare Perl to other languages are often quite difficult, as ''working'' code and the most ''efficient'' code are not always the same thing (a result of the "there's more than one way to do it" philosophy). Combined with the fact that benchmarks almost always favor the languages which the benchmark author(s) understand the most, this leads to a strong skepticism of such measurements. However, it can be useful to use benchmarks as a way to understand the very large-scale strengths and weaknesses of different languages.
Benchmarks are designed to mimic a particular type of workload on a component or system. Many exist on CPAN. The module Benchmark.pm comes with Perl, though one of the slides used in a 2006 talk by Perl Review editor Brian D Foy says it "sux." Another slide quotes a page on http://shootout.alioth.debian.org, warning people about the limitations of benchmarks:


<blockquote>
One such benchmark which is widely cited is the "Computer Language Shootout Benchmarks" which measure the comparative performance of many languages on the ] operating system platform. These benchmarks showed that Perl performed relatively poorly on most tests against most other dynamic languages such as ], Python and ], and well against others such as PHP, ] and ]. Another interesting note from this benchmark is that Perl often performs far better in comparison to another language in memory usage while just the opposite is true of its CPU usage.
How can we benchmark a programming language?<br>
We can't &mdash; we benchmark programming language implementations.<br>
How can we benchmark language implementations?<br>
We can't &mdash; we measure particular programs.
</blockquote>


Other people's benchmark data is sometimes published and may have some value to others, but proper interpretation is important, which brings many challenges.
Historically, Perl has performed well for tasks which involved textual transformation and the creation and destruction of complex data structures, but poorly for arithmetic and tasks which involve many, very short subroutine or method invocations.

===Comparison===
<div style = "font-size: 13pt; text-align: center; padding-bottom: 2px;">
Number of tests won (Debian : AMD™ Sempron™ / Gentoo : Intel® Pentium® 4)
</div>
{| class="wikitable" align = "left" style = "font-size: 10pt; background-color: rgb(245,245,245); margin-top: 39px; font-weight: bold;"
|-
|-
| Speed
|-
| Memory
|-
| Size
|-
|}

{| class="wikitable" align = "left" style = "font-size: 10pt;"
|-
! Perl !! C (gcc)
|-
| 1/1 || 12/15
|-
| 0/1 || 13/15
|-
| 11/14 || 2/2
|-
|}

{| class="wikitable" align = "left" style = "font-size: 10pt;"
|-
! Perl !! C++ (g++)
|-
| 0/2 || 14/12
|-
| 0/0 || 14/14
|-
| 10/14 || 4/0
|-
|}

{| class="wikitable" align = "left" style = "font-size: 10pt;"
|-
! Perl !! Java JDK Server
|-
| 3/3 || 13/13
|-
| 12/12 || 4/4
|-
| 13/16 || 2/0
|-
|}

{| class="wikitable" align = "left" style = "font-size: 10pt;"
|-
! Perl !! PHP
|-
| 9/8 || 4/6
|-
| 10/10 || 3/5
|-
| 10/11 || 3/4
|-
|}

{| class="wikitable" align = "left" style = "font-size: 10pt;"
|-
! Perl !! Python
|-
| 5/7 || 11/9
|-
| 8/8 || 8/8
|-
| 6/3 || 9/13
|-
|}

{| class="wikitable" align = "left" style = "font-size: 10pt;"
|-
! Perl !! Ruby
|-
| 14/14 || 2/2
|-
| 10/9 || 6/7
|-
| 8/2 || 6/14
|-
|}
<div style = "clear: left;">
<p>
Data comes from Debian : AMD™ Sempron™ benchmarks from May 7, 2006 and Gentoo : Intel® Pentium® 4 benchmarks from May 10, 2006. The Debian and Gentoo tests used equivalent benchmarks, but on Gentoo, some benchmarks had a higher workload, most language implementations were built from source, and <i>Size</i> tests measured GZip bytes instead of lines of code.
</p>

<p style = "margin-left: auto; margin-right: auto; margin-top: 30px; text-align: justify; background-color: rgb(245,245,245); padding: 8px; border-style: solid; border-width: 1px; width: 90%;">
<em>The computer programs used in these tests may not have been fully optimized, and the relevance of the data is disputed. The only truly relevant benchmark is one that's customized to your particular situation. See page about flawed benchmarks and comparisons.</em>
</p>


==Opinion==

Revision as of 22:43, 13 May 2006

For other uses, see Perl (disambiguation).
Perl

Paradigm: functional, object-oriented, procedural
Designed by: Larry Wall
First appeared: 1987
Stable release: 5.8.8 / February 2, 2006
Typing discipline: dynamic
OS: Cross-platform
License: GPL or Artistic License
Website: www.perl.org
Influenced by: C, shell, awk, sed, lisp
Influenced: Python, PHP, Ruby

Perl, also Practical Extraction and Report Language (a backronym, see below), is a dynamic procedural programming language designed by Larry Wall and first released in 1987. Perl borrows features from C, shell scripting (sh), awk, sed, Lisp, and, to a lesser extent, many other programming languages.

Overview

The perlintro(1) man page states:

Perl is a general-purpose programming language originally developed for text manipulation and now used for a wide range of tasks including system administration, web development, network programming, GUI development, and more.
The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal). Its major features are that it's easy to use, supports both procedural and object-oriented (OO) programming, has powerful built-in support for text processing, and has a large collection of third-party modules.

Design

The design of Perl can be understood as a response to three broad trends in the computer industry: falling hardware costs, rising labor costs, and improvements in compiler technology. Many earlier computer languages, such as Fortran and C, were designed to make efficient use of expensive computer hardware. In contrast, Perl is designed to make efficient use of expensive computer programmers.

Perl has many features that ease the programmer's task at the expense of greater CPU and memory requirements. These include automatic memory management; dynamic typing; strings, lists, and hashes; regular expressions; introspection and an eval() function.

Larry Wall was trained as a linguist, and the design of Perl is very much informed by linguistic principles. Examples include Huffman coding (common constructions should be short), good end-weighting (the important information should come first), and a large collection of language primitives. Perl favors language constructs that are natural for humans to read and write, even where they complicate the Perl interpreter.

Perl syntax reflects the idea that "things that are different should look different". For example, scalars, arrays, and hashes have different leading sigils. Array indices and hash keys use different kinds of braces. Strings and regular expressions have different standard delimiters. This approach can be contrasted with languages like Lisp, where the same S-expression construct and basic syntax is used for many different purposes.

Perl has features that support a variety of programming paradigms, such as procedural, functional, and object-oriented. At the same time, Perl does not enforce any particular paradigm, or even require the programmer to choose among them.

There is a broad practical bent to both the Perl language and the community and culture that surround it. The preface to Programming Perl begins, "Perl is a language for getting your job done." One consequence of this is that Perl is not a tidy language. It includes features if people use them, tolerates exceptions to its rules, and employs heuristics to resolve syntactical ambiguities. Discussing the variant behaviour of built-in functions in list and scalar context, the perlfunc(1) man page says "In general, they do what you want, unless you want consistency."

Perl has several mottos that convey aspects of its design and use. One is "There's more than one way to do it." (TMTOWTDI, usually pronounced 'Tim Toady'). Others are "Perl: the Swiss Army Chainsaw of Programming Languages" and "No unnecessary limits". A stated design goal of Perl is to make easy tasks easy and difficult tasks possible. Perl has also been called "The Duct Tape of the Internet".

Features

The overall structure of Perl derives broadly from the programming language C. Perl is a procedural programming language, with variables, expressions, assignment statements, brace-delimited code blocks, control structures, and subroutines.

Perl also takes features from shell programming. All variables are marked with leading sigils. Sigils unambiguously identify variable names, allowing Perl to have a rich syntax. Importantly, sigils allow variables to be interpolated directly into strings. Like the Unix shells, Perl has many built-in functions for common tasks, like sorting, and for accessing system facilities.

Perl takes lists from Lisp, associative arrays from awk, and regular expressions from sed. These simplify and facilitate all manner of parsing, text handling, and data management tasks.

In Perl 5, features were added that support complex data structures, first-class functions (i.e. closures as values), and an object-oriented programming model. These include references, packages, and class-based method dispatch. Perl 5 also saw the introduction of lexically scoped variables, which make it easier to write robust code, and modules, which make it practical to write and distribute libraries of Perl code.
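As a rough illustration of closures as values combined with lexically scoped variables, the following sketch builds independent counters. The name make_counter is hypothetical and chosen only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an anonymous sub that remembers its own $count.
sub make_counter {
    my ($count) = @_;                 # lexical variable, private to this call
    return sub { return $count++ };   # the closure captures $count
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# Each closure advances its own copy of $count.
my @seen = ($from_one->(), $from_one->(), $from_ten->(), $from_one->());
print "@seen\n";   # prints "1 2 10 3"
```

Because $count is declared with my, no code outside the returned subroutine can reach it.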

All versions of Perl do automatic data typing and memory management. The interpreter knows the type and storage requirements of every data object in the program; it allocates and frees storage for them as necessary. Legal type conversions are done automatically at run time; illegal type conversions are fatal errors.
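A small sketch of what this can look like in practice. It assumes the strict pragma is enabled, since the fatal error shown here depends on "strict refs"; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $sum  = "10" + 5;          # the string "10" is converted to a number
my $n    = 10;
my $text = $n . " apples";    # the number 10 is converted to a string

# Treating a plain string as an array reference is fatal under "strict refs".
# eval { ... } traps the error so that it can be inspected.
my $not_a_ref = "plain string";
my $ok    = eval { my @list = @$not_a_ref; 1 };
my $error = $ok ? '' : $@;

print "sum=$sum text=$text\n";
print "conversion failed: $error" unless $ok;
```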

Applications

Perl has many and varied applications.

It has been used since the early days of the Web to write CGI scripts, and is an integral component of the popular LAMP (Linux / Apache / MySQL / Perl, PHP, and Python) platform for web development. Large projects written in Perl include Slash, early implementations of PHP, and UseModWiki, the wiki software used in Misplaced Pages until 2002. It's known as one of "the three Ps" (Perl, Python, and PHP), which are the most popular server-side, open source scripting languages for the Web, though open source Java and C# implementations as well as Ruby have grown popular in recent years.

Perl is often used as a "glue language", tying together systems and interfaces that were not specifically designed to interoperate. Systems administrators use Perl as an all-purpose tool; short Perl programs can be entered and run on a single command line.

Perl is widely used in finance and bioinformatics, where it is valued for rapid application development, ability to handle large data sets, and the availability of many standard and third-party modules.

Implementation

Perl is implemented as a core interpreter, written in C, together with a large collection of modules, written in Perl and C. The source distribution is, as of 2005, 12 MB when packaged in a tar file and compressed. The interpreter is 150,000 lines of C code and compiles to a 1 MB executable on typical machine architectures. Alternatively, the interpreter can be compiled to a link library and embedded in other programs. There are nearly 500 modules in the distribution, comprising 200,000 lines of Perl and an additional 350,000 lines of C code. Much of the C code in the modules consists of character encoding tables.

The interpreter has an object-oriented architecture. All of the elements of the Perl language—scalars, arrays, hashes, coderefs, file handles—are represented in the interpreter by C structs. Operations on these structs are defined by a large collection of macros, typedefs and functions; these constitute the Perl C API. The Perl API can be bewildering to the uninitiated, but its entry points follow a consistent naming scheme, which provides guidance to those who use it.

The execution of a Perl program divides broadly into two phases: compile-time and run-time. At compile time, the interpreter parses the program text into a syntax tree. At run time, it executes the program by walking the tree. The text is parsed only once, and the syntax tree is subject to optimization before it is executed, so the execution phase is relatively efficient. Compile-time optimizations on the syntax tree include constant folding, context propagation, and peephole optimization.
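One way to observe the two phases is a BEGIN block, which Perl runs as soon as it has been compiled, before any ordinary statement executes. The sketch below is only an illustration; the array name @order is an assumption of this example.

```perl
use strict;
use warnings;

our @order;   # package variable, so the BEGIN block below can see it

push @order, 'run time: first statement';

# Runs during compilation, even though it appears second in the file.
BEGIN { push @order, 'compile time: BEGIN block' }

push @order, 'run time: second statement';

print "$_\n" for @order;
```

The BEGIN entry is printed first, because the whole file is parsed before the run-time statements start.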

Perl is a dynamic language and has a context-sensitive grammar that cannot be parsed by a straight Lex/Yacc lexer/parser combination. Instead, it implements its own lexer, which coordinates with a modified GNU bison parser to resolve ambiguities in the language. It is said that "only perl can parse Perl", meaning that only the Perl interpreter (perl) can parse the Perl language (Perl). The truth of this is attested to by the persistent imperfections of other programs that undertake to parse Perl, such as source code analyzers and auto-indenters.

Maintenance of the Perl interpreter has become increasingly difficult over the years. The code base has been in continuous development since 1994. The code has been optimized for performance at the expense of simplicity, clarity, and strong internal interfaces. New features have been added, yet virtually complete backward compatibility with earlier versions is maintained. The size and complexity of the interpreter is a barrier to developers who wish to work on it.

Perl is distributed with some 90,000 functional tests. These run as part of the normal build process, and extensively exercise the interpreter and its core modules. Perl developers rely on the functional tests to ensure that changes to the interpreter do not introduce bugs; conversely, Perl users who see the interpreter pass its functional tests on their system can have a high degree of confidence that it is working properly.

There is no written specification or standard for the Perl language, and no plans to create one for the current version of Perl. There has only ever been one implementation of the interpreter. That interpreter, together with its functional tests, stands as a de facto specification of the language.

Availability

Perl is free software, and is licensed under both the Artistic License and the GNU General Public License. It is available for most operating systems. It is particularly prevalent on Unix and Unix-like systems (such as Linux, FreeBSD, and Mac OS X), and is growing in popularity on Microsoft Windows systems.

Perl has been ported to over a hundred different platforms, and can, with only six reported exceptions, be compiled from source on all Unix-like, POSIX-compliant or otherwise Unix-compatible platforms, including AmigaOS, BeOS, Cygwin, and Mac OS X (See ports). A special port, MacPerl, is available for Mac OS Classic.

Perl can be compiled from source on Windows; however, many Windows installations lack a C compiler, so Windows users typically install a binary distribution, such as ActivePerl or IndigoPerl. Users without a C compiler are also limited to pure Perl modules if they wish to add to the module library that comes with Perl. Free software exists that lets these users install C modules; however, it tends to be poorly documented, especially for beginners.

Language structure

Example Program

In Perl, the canonical "Hello world" program is:

#!/usr/bin/perl
print "Hello, world!\n";

The first line is the shebang, which tells the operating system where to find the Perl interpreter. The second line prints the string Hello, world! and a newline (like a person pressing 'Return' or 'Enter').

The shebang is the usual way to invoke the interpreter on Unix systems. Windows systems may rely on the shebang, or they may associate a .pl file extension with the Perl interpreter.

Here is a one-line, throw-away Perl program that does ROT13 encoding/decoding. It is entered and run directly on the command line:

perl -pe 'tr/A-Za-z/N-ZA-Mn-za-m/' < input_file > output_file
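The same transliteration can also be wrapped in a subroutine inside a script. This is a minimal sketch; the subroutine name rot13 is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the ROT13 encoding of one string.
sub rot13 {
    my ($text) = @_;
    # tr/// replaces each letter with the one 13 places later, wrapping around
    (my $out = $text) =~ tr/A-Za-z/N-ZA-Mn-za-m/;
    return $out;
}

print rot13("Hello, world!"), "\n";   # prints "Uryyb, jbeyq!"
```

Applying rot13 twice returns the original string, which is why one command serves for both encoding and decoding.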

Data types

Perl has three fundamental data types: scalars, lists, and hashes:

  • A scalar is a single value; it may be a number, a string or a reference
  • A list is an ordered collection of scalars (a variable that holds a list is called an array)
  • A hash, or associative array, is a map from strings to scalars; the strings are called keys and the scalars are called values.

All variables are marked by a leading sigil, which identifies the data type. The same name may be used for variables of different types, without conflict.

 $foo   # a scalar
 @foo   # a list
 %foo   # a hash

Numbers are written in the usual way; strings are enclosed by quotes of various kinds.

 $n     = 42;
 $name  = "joe";
 $color = 'red';

A list is written by listing its elements, separated by commas, and enclosed by parentheses where required by operator precedence.

 @scores = (32, 45, 16, 5);

A hash may be initialized from a list of key/value pairs.

 %favorite = (joe => 'red',
              sam => 'blue');

Individual elements of a list are accessed by providing a numerical index, in square brackets. Individual values in a hash are accessed by providing the corresponding key, in curly braces. The $ sigil identifies the accessed element as a scalar.

 $scores[0]      # an element of @scores
 $favorite{joe}  # a value in %favorite

Multiple elements may be accessed by using the @ sigil instead (identifying the result as a list).

 @scores[0, 1, 2]        # three elements of @scores
 @favorite{'joe', 'sam'} # two values in %favorite

The number of elements in an array can be obtained by evaluating the array in scalar context or with the help of the $# sigil. The latter gives the index of the last element in the array, not the number of elements.

 $count = @friends;
 $#friends       # the index of the last element in @friends
 $#friends+1     # usually the number of elements in @friends
                 # this is one more than $#friends because the first element is at
                 # index 0, not 1

There are a few functions that operate on entire hashes.

 @names     = keys   %address;
 @addresses = values %address;
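The access forms described in this section can be combined in one short script. The sketch reuses the @scores and %favorite data from the examples above; the other variable names are illustrative.

```perl
use strict;
use warnings;

my @scores   = (32, 45, 16, 5);
my %favorite = (joe => 'red', sam => 'blue');

my $first  = $scores[0];                # one element, so the $ sigil
my @pair   = @scores[1, 2];             # array slice, so the @ sigil
my @colors = @favorite{'joe', 'sam'};   # hash slice
my $count  = @scores;                   # array in scalar context
my $last   = $#scores;                  # index of the last element
my @names  = sort keys %favorite;       # keys have no fixed order, so sort

print "first=$first count=$count last=$last\n";
print "pair=@pair colors=@colors names=@names\n";
```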

Control structures

Main article: Perl control structures

Perl has several kinds of control structures.

It has block-oriented control structures, similar to those in the C and Java programming languages. Conditions are surrounded by parentheses, and controlled blocks are surrounded by braces:

label while ( cond ) { ... }
label while ( cond ) { ... } continue { ... }
label for ( init-expr ; cond-expr ; incr-expr ) { ... }
label foreach var ( list ) { ... }
label foreach var ( list ) { ... } continue { ... }
if ( cond ) { ... }
if ( cond ) { ... } else { ... } 
if ( cond ) { ... } elsif ( cond ) { ... } else { ... } 

Where only a single statement is being controlled, statement modifiers provide a lighter syntax:

statement if      cond ;
statement unless  cond ;
statement while   cond ;
statement until   cond ;
statement foreach list ;

Short-circuit logical operators are commonly used to effect control flow at the expression level:

expr and expr
expr or  expr

The flow control keywords next, last, return, and redo are expressions, so they can be used with short-circuit operators.

Perl also has two implicit looping constructs:

 results = grep { ... } list
 results = map  { ... } list

grep returns all elements of list for which the controlled block evaluates to true. map evaluates the controlled block for each element of list and returns a list of the resulting values. These constructs enable a simple functional programming style.
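A brief sketch of both constructs applied to the same list; the variable names are illustrative. Inside the block, $_ holds the current element.

```perl
use strict;
use warnings;

my @scores = (32, 45, 16, 5);

my @high    = grep { $_ > 20 } @scores;   # keep elements for which the block is true
my @doubled = map  { $_ * 2 }  @scores;   # transform every element

print "high=@high doubled=@doubled\n";    # prints "high=32 45 doubled=64 90 32 10"
```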

There is no switch statement (multi-way branch) in Perl 5. The Perl documentation describes a half-dozen ways to achieve the same effect by using other control structures, none entirely satisfactory. A very general and flexible switch statement has been designed for Perl 6. The Switch module makes most of the functionality of the Perl 6 switch available to Perl 5 programs, although it is often criticised for being a source filter, and thus failure-prone. The next stable version of Perl 5, Perl 5.10, will have the Perl 6 given/when switch-statement.
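One widely used idiom for a multi-way branch in Perl 5 is a dispatch table, a hash that maps each case to a code reference. The sketch below shows that idiom only; it is not claimed to be one of the approaches listed in the Perl documentation, and the command names and the subroutine run_command are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical cases: each key maps to the code to run for that case.
my %dispatch = (
    start => sub { return "starting" },
    stop  => sub { return "stopping" },
);

sub run_command {
    my ($name) = @_;
    my $handler = $dispatch{$name};
    # Fall through to a default when no case matches.
    return $handler ? $handler->() : "unknown command: $name";
}

print run_command($_), "\n" for qw(start stop reboot);
```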

Perl includes a goto label statement, but it is rarely used. Some consider its use poor coding practice. The implementation is slow, and situations where a goto is called for in other languages don't occur as often in Perl and are often better handled with other control structures, such as labeled loops.

There is also a goto &sub statement that performs a tail call. It terminates the current subroutine and immediately calls the specified sub. Use of this form is culturally accepted but unusual because it is rarely needed.

Subroutines

Subroutines are defined with the sub keyword, and invoked simply by naming them. Subroutine definitions may appear anywhere in the program. Parentheses are required for calls that precede the definition.

foo();
sub foo { ... }
foo;

A list of arguments may be provided after the subroutine name. Arguments may be scalars, lists, or hashes.

foo $x, @y, %z;

The parameters to a subroutine need not be declared as to either number or type; in fact, they may vary from call to call. Arrays are expanded to their elements, hashes are expanded to a list of key/value pairs, and the whole lot is passed into the subroutine as one undifferentiated list of scalars.

Whatever arguments are passed are available to the subroutine in the special array @_. The elements of @_ are aliased to the actual arguments; changing an element of @_ changes the corresponding argument.

Elements of @_ may be accessed by subscripting it in the usual way.

$_[0], $_[1]

However, the resulting code can be difficult to read, and the parameters have pass-by-reference semantics, which may be undesirable.

One common idiom is to assign @_ to a list of named variables.

my($x, $y, $z) = @_;

This effects both mnemonic parameter names and pass-by-value semantics. The my keyword indicates that the following variables are lexically scoped to the containing block.

Another idiom is to shift parameters off of @_. This is especially common when the subroutine takes only one argument.

my $x = shift;
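The difference between working on @_ directly and copying it first can be seen in a short sketch. The subroutine names are hypothetical; count_args is included to show how arrays and hashes are flattened into a single argument list.

```perl
use strict;
use warnings;

# Changes the caller's variable through the alias in @_.
sub double_in_place { $_[0] *= 2; return }

# Copies the argument first, so the caller's variable is left alone.
sub doubled {
    my ($n) = @_;
    $n *= 2;
    return $n;
}

# Reports how many scalars arrived after flattening.
sub count_args { return scalar @_ }

my $x = 5;
double_in_place($x);                 # $x becomes 10
my $y = doubled($x);                 # $y is 20, $x stays 10

my @list = (1, 2, 3);
my %map  = (a => 1);
my $n    = count_args(0, @list, %map);   # 1 + 3 + 2 scalars

print "x=$x y=$y n=$n\n";            # prints "x=10 y=20 n=6"
```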

Subroutines may return values.

return 42, $x, @y, %z;

If the subroutine does not exit via a return statement, then it returns the last expression evaluated within the subroutine body. Arrays and hashes in the return value are expanded to lists of scalars, just as they are for arguments.

The returned expression is evaluated in the calling context of the subroutine; this can surprise the unwary.

sub list  {      (4, 5, 6)     }
sub array { @x = (4, 5, 6); @x }
$x = list;   # returns 6 - last element of list
$x = array;  # returns 3 - number of elements in list
@x = list;   # returns (4, 5, 6)
@x = array;  # returns (4, 5, 6)

A subroutine can discover its calling context with the wantarray function.

sub either { wantarray ? (1, 2) : "Oranges" }
$x = either;    # returns "Oranges"
@x = either;    # returns (1, 2)

Regular expressions

See also: Perl regular expression examples

The Perl language includes a specialized syntax for writing regular expressions (REs), and the interpreter contains an engine for matching strings to regular expressions. The regular expression engine uses a backtracking algorithm, extending its capabilities from simple pattern matching to string capture and substitution.

The Perl regular expression syntax was originally taken from Unix Version 8 regular expressions. However, it diverged before the first release of Perl, and has since grown to include many more features. Some other languages and applications are now adopting Perl compatible regular expressions in favor of POSIX regular expressions.

The m// (match) operator introduces a regular expression match. (The leading m may be omitted for brevity.) In the simplest case, an expression like

 $x =~ m/abc/

evaluates to true if and only if the string $x matches the regular expression abc.

Portions of a regular expression may be enclosed in parentheses; corresponding portions of a matching string are captured. Captured strings are assigned to the sequential built-in variables $1, $2, $3, ..., and a list of captured strings is returned as the value of the match.

 $x =~ m/a(.)c/;  # capture the character between 'a' and 'c'

The s/// (substitute) operator specifies a search and replace operation:

 $x =~ s/abc/aBc/;   # upcase the b

Perl regular expressions can take modifiers. These are single-letter suffixes that modify the meaning of the expression:

 $x =~ m/abc/i;      # case-insensitive pattern match
 $x =~ s/abc/aBc/g;  # global search and replace

Regular expressions can be dense and cryptic. This is because regular expression syntax is extremely compact, generally using single characters or character pairs to represent its operations. Perl provides some relief from this problem with the /x modifier, which allows programmers to place whitespace and comments inside regular expressions:

 $x =~ m/a     # match 'a'
         .     # match any character
         c     # match 'c'
          /x;

One common use of regular expressions is to specify delimiters for the split operator:

 @words = split m/,/, $line;   # divide $line into comma-separated values

The split operator complements string capture. String capture returns the parts of a string that match a regular expression; split returns the parts that don't match.
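A short sketch contrasting the two operations on the same input; the sample line and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my $line = "joe,red,42";

# In list context, a match returns the captured substrings.
my ($name, $color) = $line =~ m/^(\w+),(\w+)/;

# split returns the pieces that lie between the matched delimiters.
my @fields = split m/,/, $line;

print "name=$name color=$color\n";    # prints "name=joe color=red"
print "fields=", scalar(@fields), "\n";   # prints "fields=3"
```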

Database interfaces

Perl is widely favored for database applications. Its text handling facilities are good for generating SQL queries; arrays, hashes and automatic memory management make it easy to collect and process the returned data.

In early versions of Perl, database interfaces were created by relinking the interpreter with a client-side database library. This was somewhat clumsy; a particular problem was that the resulting perl executable was restricted to using just the one database interface that it was linked to. Also, relinking the interpreter was sufficiently difficult that it was only done for a few of the most important and widely used databases.

In Perl 5, database interfaces are implemented by Perl DBI modules. The DBI (Database Interface) module presents a single, database-independent interface to Perl applications, while the DBD:: (Database Driver) modules handle the details of accessing some 50 different databases. There are DBD:: drivers for most ANSI SQL databases.

Benchmarks

General

Benchmarks are designed to mimic a particular type of workload on a component or system. Many benchmark modules exist on CPAN. The module Benchmark.pm comes with Perl, though one of the slides used in a 2006 talk by Perl Review editor Brian D Foy says it "sux." Another slide quotes a page on http://shootout.alioth.debian.org, warning people about the limitations of benchmarks:

How can we benchmark a programming language?
We can't — we benchmark programming language implementations.
How can we benchmark language implementations?
We can't — we measure particular programs.

Other people's benchmark data is sometimes published and may have some value to others, but proper interpretation is important, which brings many challenges.
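As a rough illustration of measuring "particular programs", the sketch below uses the timeit and timestr functions of the Benchmark module that ships with Perl to time two ways of building one string. The workload, the iteration count of 200, and the variable names are arbitrary choices for this example, and the timings will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

my @words = ('word') x 1000;   # arbitrary sample data

# timeit(COUNT, CODE) runs CODE COUNT times and returns a Benchmark object.
my $t_join   = timeit(200, sub { my $s = join '', @words; return $s });
my $t_concat = timeit(200, sub { my $s = ''; $s .= $_ for @words; return $s });

print "join:   ", timestr($t_join),   "\n";
print "concat: ", timestr($t_concat), "\n";
```

A comparison like this says something about these two snippets on one installation, not about Perl as a whole.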

Comparison

Number of tests won (Debian : AMD™ Sempron™ / Gentoo : Intel® Pentium® 4)

Perl vs.           Test     Perl     Other
C (gcc)            Speed    1/1      12/15
                   Memory   0/1      13/15
                   Size     11/14    2/2
C++ (g++)          Speed    0/2      14/12
                   Memory   0/0      14/14
                   Size     10/14    4/0
Java JDK Server    Speed    3/3      13/13
                   Memory   12/12    4/4
                   Size     13/16    2/0
PHP                Speed    9/8      4/6
                   Memory   10/10    3/5
                   Size     10/11    3/4
Python             Speed    5/7      11/9
                   Memory   8/8      8/8
                   Size     6/3      9/13
Ruby               Speed    14/14    2/2
                   Memory   10/9     6/7
                   Size     8/2      6/14

Data comes from Debian : AMD™ Sempron™ benchmarks from May 7, 2006 and Gentoo : Intel® Pentium® 4 benchmarks from May 10, 2006. The Debian and Gentoo tests used equivalent benchmarks, but on Gentoo, some benchmarks had a higher workload, most language implementations were built from source, and Size tests measured GZip bytes instead of lines of code.

The computer programs used in these tests may not have been fully optimized, and the relevance of the data is disputed. The only truly relevant benchmark is one that's customized to your particular situation. See this page about flawed benchmarks and comparisons.

Opinion

Perl engenders strong feelings among both its proponents and its detractors.

Pro

Programmers who like Perl typically cite its power, expressiveness, and ease of use. Perl provides infrastructure for many common programming tasks, such as string and list processing. Other tasks, such as memory management, are handled automatically and transparently. Programmers coming from other languages to Perl often find that whole classes of problems that they have struggled with in the past just don't arise in Perl. As Larry Wall put it,

What is the sound of Perl? Is it not the sound of a wall that people have stopped banging their heads against?

Besides its practical benefits, many programmers simply seem to enjoy working in Perl. Early issues of The Perl Journal had a page titled "What is Perl?" that concluded:

Perl is fun. In these days of self-serving jargon, conflicting and unpredictable standards, and proprietary systems that discourage peeking under the hood, people have forgotten that programming is supposed to be fun. I don't mean the satisfaction of seeing our well-tuned programs do our bidding, but the literary act of creative writing that yields those programs. With Perl, the journey is as enjoyable as the destination ...

Whatever the reasons, there is clearly a broad community of people who are passionate about Perl, as evidenced by the thousands of modules that have been contributed to CPAN, and the hundreds of design proposals that were submitted as RFCs for Perl 6.

Con

A common complaint is that Perl is ugly. In particular, its prodigious use of punctuation puts off some people; Perl source code is sometimes likened to "line noise". In The Python Paradox, Paul Graham both acknowledges and responds to this:

At the mention of ugly source code, people will of course think of Perl. But the superficial ugliness of Perl is not the sort I mean. Real ugliness is not harsh-looking syntax, but having to build programs out of the wrong concepts. Perl may look like a cartoon character swearing, but there are cases where it surpasses Python conceptually.

Another criticism is that Perl is excessively complex and compact, and that it leads to "write-only" code, that is, to code that is virtually impossible to understand after it has been written. It is, of course, possible to write obscure code in any language, but Perl has perhaps more than the usual share of terse, complex and arcane language constructs to exacerbate the problem. Perl supports many such features for backward compatibility, and for use where maintainability is expressly not a concern, such as programs that are entered and run directly on the command line.

In a 2003 study titled Are Scripting Languages Any Good? A Validation of Perl, Python, Rexx, and Tcl against C, C++, and Java, Lutz Prechelt wrote:

...the potential uglyness of any aspect of Perl has carefully been balanced with the power that can be derived from it. Nevertheless, Perl makes it relatively easy to shoot oneself in the foot or to produce programs that are very short but also very difficult to understand. For this reason, somewhat like for C++, writing good Perl programs depends more strongly on the knowledge and discipline of the programmer than in most other languages.

The free-wheeling language style that delights some Perl programmers concerns and dismays others. For example, the Perl 5 object model does not enforce data privacy: access to private data is restricted only by convention, not by the language itself. An object created in one place may easily be modified in another; there may not be any single place where its state is definitively established. There are techniques for addressing these issues, but they are non-native and little used.
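The point about convention-only privacy can be illustrated with a minimal sketch. The Counter class and its _count field are hypothetical names chosen for this example; the leading underscore is a common naming convention, not something the language enforces.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

# Constructor: a Perl 5 object is typically just a blessed hash reference.
sub new {
    my ($class) = @_;
    my $self = { _count => 0 };    # underscore marks "private" by convention only
    return bless $self, $class;
}

sub increment { my ($self) = @_; return ++$self->{_count}; }
sub count     { my ($self) = @_; return $self->{_count}; }

package main;

my $c = Counter->new;
$c->increment;

# Nothing in the language stops outside code from reaching into the hash
# and changing the object's state directly.
$c->{_count} = 42;

print $c->count, "\n";    # prints 42
```

Because the object is an ordinary hash reference, any code holding the reference can read or overwrite its fields, which is why its state may be modified far from where it was created.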

There is also criticism of a less technical nature that may be no less important to some: Perl's popularity has declined. As of May 2006, the TIOBE Programming Community Index (TPCI) Long Term Trends chart of the ten most popular programming languages shows that Perl's popularity is at its lowest since before June 2001 (the earliest date plotted), and has dropped more than that of any other language over the past year. In addition, OSCON, the open source convention sponsored by book publisher O'Reilly, is much less Perl-oriented than it used to be. Randal L. Schwartz, author of several Perl books published by O'Reilly, has said that OSCON's organizers are openly hostile to Perl, and that Perl isn't interesting to O'Reilly anymore.

Origins

General

Larry Wall began work on Perl in 1987, and released version 1.0 to the comp.sources.misc newsgroup on December 18, 1987. The language expanded rapidly over the next few years. Perl 2, released in 1988, featured a better regular expression engine. Perl 3, released in 1989, added support for binary data.

Until 1991, the only documentation for Perl was a single (increasingly lengthy) man page. In 1991, Programming Perl (the Camel Book) was published, and became the de facto reference for the language. At the same time, the Perl version number was bumped to 4, not to mark a major change in the language, but to identify the version that was documented by the book.

Perl 4 went through a series of maintenance releases, culminating in Perl 4.036 in 1993. At that point, Larry Wall abandoned Perl 4 to begin work on Perl 5. Perl 4 remains at version 4.036 to this day.

Development of Perl 5 continued into 1994. The perl5-porters mailing list was established in May 1994 to coordinate work on porting Perl 5 to different platforms. It remains the primary forum for development, maintenance, and porting of Perl 5.

Perl 5 was released on October 17, 1994. It was a nearly complete rewrite of the interpreter, and added many new features to the language, including objects, references, packages, and modules. Importantly, modules provided a mechanism for extending the language without modifying the interpreter. This allowed the core interpreter to stabilize, even as it enabled ordinary Perl programmers to add new language features.

On October 26, 1995, the Comprehensive Perl Archive Network (CPAN) was established. CPAN is a collection of web sites that archive and distribute Perl sources, binary distributions, documentation, scripts, and modules. Originally, each CPAN site had to be accessed through its own URL. Today, the single URL http://www.cpan.org automatically redirects to a CPAN site.

As of 2006, Perl 5 is still being actively maintained. It now includes Unicode support. The latest stable release is Perl 5.8.8.

Name

Perl was originally named "Pearl", after "the pearl of great price" of Matthew 13:46. Larry Wall wanted to give the language a short name with positive connotations; he claims that he looked at (and rejected) every three- and four-letter word in the dictionary. He also considered naming it after his wife Gloria. Wall discovered before the language's official release that there was already a programming language named PEARL and changed the spelling of the name.

The name is normally capitalized (Perl) when referring to the language and uncapitalized (perl) when referring to the interpreter program itself, since Unix-like filesystems are case-sensitive. Before the release of the first edition of Programming Perl, it was common to refer to the language as perl; Randal L. Schwartz, however, capitalized the language's name in the book to make it stand out better when typeset. The case distinction was subsequently adopted by the community.

It is not appropriate to write "PERL", as it is not an acronym. The spelling of PERL in all caps is therefore used as a shibboleth for detecting community outsiders. However, several backronyms have been suggested, including the humorous Pathologically Eclectic Rubbish Lister. The more serious Practical Extraction and Report Language has prevailed in many of today's manuals, including the official Perl man page. It is also consistent with the old name "Pearl": Practical Extraction And Report Language.

The camel symbol

Perl is generally symbolized by a camel, a result of the picture that publisher O'Reilly Media chose for the cover of Programming Perl, which consequently acquired the name The Camel Book. O'Reilly owns the symbol as a trademark, but claims to use its legal rights only to protect the "integrity and impact of that symbol". O'Reilly allows non-commercial use of the symbol, and provides Programming Republic of Perl logos (see above) and Powered by Perl buttons.

Future

Main article: Perl 6

At the 2000 Perl Conference, Jon Orwant made a case for a major new language initiative. This led to a decision to begin work on a redesign of the language, to be called Perl 6. Proposals for new language features were solicited from the Perl community at large, and over 300 RFCs were submitted.

Larry Wall spent the next few years digesting the RFCs and synthesizing them into a coherent framework for Perl 6. He has presented his design for Perl 6 in a series of documents called Apocalypses, which are numbered to correspond to chapters in Programming Perl ("The Camel Book"). The current, unfinalized specification of Perl 6 is encapsulated in design documents called Synopses, which are numbered to correspond to the Apocalypses.

Perl 6 is not intended to be backward compatible, though there will be a compatibility mode.

In 2001, it was decided that Perl 6 would run on a cross-language virtual machine called Parrot. This means that other languages targeting Parrot will gain native access to CPAN, allowing some level of cross-language development.

In 2005, Audrey Tang created the Pugs project, an implementation of Perl 6 in Haskell. It was, and continues to be, a test platform for the Perl 6 language (separate from the development of the actual implementation), allowing the language designers to explore. The Pugs project resulted in an active Perl/Haskell cross-language community centred around the Freenode #perl6 IRC channel.

A number of features in the Perl 6 language now show similarities with Haskell, and Perl 6 has been embraced by the Haskell community as a potential scripting language.

As of 2006, Perl 6, Parrot, and Pugs are under active development.

CPAN

Main article: CPAN

CPAN, the Comprehensive Perl Archive Network, is a collection of mirrored web sites that serve as a primary archive and distribution channel for Perl sources, distributions, documentation, scripts, and—especially—modules. It is commonly browsed with the search engine http://search.cpan.org/.

There are currently over 8,800 modules available on CPAN, contributed by over 2,500 authors. Modules are available for a wide variety of tasks, including advanced mathematics, database connectivity, and networking. Essentially everything on CPAN is freely available; much of the software is licensed under either the Artistic License, the GPL, or both. Anyone can upload software to CPAN via PAUSE, the Perl Authors Upload Server.

Modules on CPAN can be downloaded and installed by hand. However, it is common for modules to depend on other modules, and following module dependencies by hand can be tedious. Both the CPAN.pm module (included in the Perl distribution) and the improved CPANPLUS module offer command line installers that understand module dependencies; they can be configured to automatically download and install a module and, recursively, all modules that it requires.

Since many Windows installations don't include a C compiler, Windows users may be limited to pure Perl modules when downloading from CPAN.

Fun with Perl

As with C, obfuscated code competitions are a popular feature of Perl culture. The annual Obfuscated Perl contest makes an arch virtue of Perl's syntactic flexibility. The following program prints the text "Just another Perl / Unix hacker", using 32 concurrent processes coordinated by pipes. A complete explanation is available on the author's Web site.

 @P=split//,".URRUU\c8R";@d=split//,"\nrekcah xinU / lreP rehtona tsuJ";sub p{
 @p{"r$p","u$p"}=(P,P);pipe"r$p","u$p";++$p;($q*=2)+=$f=!fork;map{$P=$P[$f^ord
 ($p{$_})&6];$p{$_}=/ ^$P/ix?$P:close$_}keys%p}p;p;p;p;p;map{$p{$_}=~/^/&&
 close$_}%p;wait until$?;map{/^r/&&<$_>}%p;$_=$d;sleep rand(2)if/\S/;print

Similar to obfuscated code but with a different purpose, "Perl Poetry" is the practice of writing poems that can actually be compiled by perl. This hobby is more or less unique to Perl due to the large number of regular English words used in the language. New poems are regularly published in the Perl Monks site's Perl Poetry section.

Another popular pastime is "Perl Golf." As with the physical sport, the goal is to reduce the number of strokes that it takes to complete a particular objective, but here "strokes" refers to keystrokes rather than swings of a golf club. A task, such as "scan an input string and return the longest palindrome that it contains", is proposed and participants try to outdo each other by writing solutions that require fewer and fewer characters of Perl source code.
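As a point of reference for the palindrome task mentioned above, here is a deliberately ungolfed sketch of a solution; golfers would then compete to shrink something like it. The function name longest_palindrome is hypothetical, and the brute-force approach is an assumption chosen for clarity, not a contest entry.

```perl
use strict;
use warnings;

# Return the longest palindromic substring of $s.
# On ties, the earliest such substring is returned.
sub longest_palindrome {
    my ($s) = @_;
    my $best = '';
    my $n    = length $s;

    for my $i (0 .. $n - 1) {
        # Try the longest candidates starting at position $i first.
        for my $len (reverse 1 .. $n - $i) {
            last if $len <= length $best;    # cannot beat the current best
            my $sub = substr($s, $i, $len);
            if ($sub eq scalar reverse $sub) {
                $best = $sub;
                last;
            }
        }
    }
    return $best;
}

print longest_palindrome("xabbay"), "\n";    # prints abba
```

A golfed entry would perform the same job in far fewer characters, usually at a considerable cost in readability.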

Another tradition among Perl hackers is writing JAPHs, which are short obfuscated programs that print out the phrase "Just another Perl hacker,". The "canonical" JAPH includes the comma at the end, although this is often omitted, and many variants on the theme have been created (one, for example, prints "Just Another Perl Pirate!").

One interesting Perl module is Lingua::Romana::Perligata. This module translates the source code of a script that uses it from Latin into Perl, allowing the programmer to write executable programs in Latin.

The Perl community has set aside the "Acme" namespace for modules that are fun or experimental in nature. Some of the Acme modules are deliberately implemented in amusing ways.

References

Perl man pages

The Perl man pages are included in the Perl source distribution. They are available on the web from http://perldoc.perl.org/.
