Perl
Perl, often expanded as "Practical Extraction and Report Language", is one of the older interpreted languages still in wide use: it was first released in 1987, four years before Python's first public release in 1991. The perl interpreter is written in C, a compiled language.
Perl is flexible and can be used to write web applications, command line applications, or services.
Contents
- 1 Basics
- 1.1 Development Environment
- 1.2 Your first program
- 1.3 Variables & Data Types
- 1.4 Boolean Logic
- 1.5 Loops
- 1.6 User Input
- 1.7 User-Defined Functions
- 2 Helpful Libraries
Basics
Development Environment
To develop in perl you need only a perl interpreter and a text editor. For those of you who find unhighlighted perl hard to read, there are a variety of Windows and Linux text editors with syntax highlighting support.
Windows:
- notepad++
- cygwin's vim implementation
- gvim for windows
Linux:
- vim
- nano
- emacs
- geany
- gedit
Linux & Unix
On most distributions, perl and cpan come bundled by default. If they are not, your package manager (apt-get, emerge, yum, pacman, or similar) should install them quickly. You can determine whether perl is installed by typing `which perl' at the bash command line. If a path is returned, you're good to go.
Windows
You can do everything we're going over by installing perl on Cygwin. Cygwin is available at http://www.cygwin.com/install.html
For compilation to .exe, we recommend "pp", which you can install by typing `cpan -i pp' from your Cygwin shell.
There is also a perl distribution for Windows from ActiveState; searching for "activestate perl" in any search engine will find it.
CPAN
CPAN (the Comprehensive Perl Archive Network) is the central repository of perl modules, and `cpan' is the command line client that installs them. It can be run on most distributions simply by typing `cpan'. On Windows, you can run it by typing `cpan' in your Cygwin shell. Note: If `cpan' does not work, try `perl -MCPAN -e shell'. If this does not work, your installation may be broken.
Your first program
Code
To run this code, you'll only need to put it in a text file. Save it as "hello.pl", and then you can execute the following to run it from either cygwin or bash:
- chmod +x hello.pl
- ./hello.pl
Alternatively you can simply type:
- perl hello.pl
<syntaxhighlight lang="perl">
#!/usr/bin/perl
use strict;
use warnings;

print "Hello world!\n";
</syntaxhighlight>
Analysis
The shebang (the first line, starting with #!) declares the location of the code's interpreter. For example, if you're writing bash, you'll need to put:
#!/bin/bash
at the top of your file. In perl, it's typically:
#!/usr/bin/perl
This should be the first line in any perl script you write. If you are unsure of the interpreter's path but perl is on your PATH, you can also use:
#!/usr/bin/env perl
If for some reason neither `#!/usr/bin/env perl' nor `#!/usr/bin/perl' works, running `which perl' from the bash command line will return the proper path.
The shebang is only required if you want to execute your script directly (e.g. ./script.pl). If you get permission errors when attempting this, you can run it via `perl script.pl' or run `chmod +x script.pl' before running `./script.pl'.
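With perl in particular, it is very easy to write sloppy code. To counter this, the next lines are:
<syntaxhighlight lang="perl">
use strict;
use warnings;
</syntaxhighlight>
Strict perl forces you to maintain some semblance of discipline: variables must be declared before use, and several unsafe constructs become compile-time errors. The warnings pragma reports questionable code, such as using an undefined value. Without them you can basically run amok, and perl will not care.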
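As a minimal sketch of what strict catches, the snippet below feeds a deliberately misspelled variable name to a string eval, so that the compile error can be reported instead of stopping the script. The names $total and $totl are made up for illustration, and the exact wording of the error message varies between perl versions.

```perl
use strict;
use warnings;

my $total = 10;

# A string eval compiles its code at run time, so the strict error can be
# caught here instead of aborting the script. "$totl" is a deliberate typo.
my $result = eval q{ $totl = $total + 1; 1 };

# eval returns undef on failure and leaves the error message in $@.
print "strict rejected the typo: $@" unless $result;
```

Without `use strict', the same typo would silently create a new global variable and the mistake would go unnoticed.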
The print "Hello world!\n" line simply prints "Hello world!" with a newline character on the end. On windows you may need to change "\n" to "\r\n", depending on which interpreter you've installed.
You can also write these characters by their hex codes using \x escapes: "\x0d\x0a" is a carriage return followed by a line feed.
Variables & Data Types
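In strict perl, variables must be declared with "my" or "our". "my" declares a lexical variable that is visible only inside the enclosing block or file. "our" declares a package (global) variable that is visible throughout the package and, by its full name (e.g. $main::name), from other packages. Neither has anything to do with threads; sharing data between threads is handled separately by the threads::shared module.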
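The difference between the two declarations can be sketched as follows. The names $counter, $step and bump are hypothetical and chosen only for this illustration.

```perl
use strict;
use warnings;

our $counter = 10;          # package variable, also reachable as $main::counter

sub bump {
    my $step = shift;       # lexical variable: exists only inside this call
    $counter += $step;      # the package variable is visible inside the sub
    return $counter;
}

bump(5);

print "counter is $counter\n";             # counter is 15
print "same variable: $main::counter\n";   # same variable: 15
```

Trying to use $step outside of bump() would be a compile-time error under strict, because a "my" variable ends with its enclosing block.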
Scalars
Scalars in perl are prefixed with a $. A scalar may contain any string, integer, or floating point value. It may also contain a reference. An example declaration:
<syntaxhighlight lang="perl">
my $message = "Hello world!\n";
print $message;
</syntaxhighlight>
Arrays
Arrays (or lists) hold an ordered sequence of elements. Each element is a scalar, so an array can mix strings, numbers, and references to hashes or other arrays.
Arrays are prefixed by the @ character, while a single element is accessed with $ and a zero-based index:
<syntaxhighlight lang="perl">
my @messages = ("Hello world!\n", "I like perl!\n");
print $messages[0];
print $messages[1];
# scalar(@messages) is the element count; $#messages is the last index (1 here)
print "Size of messages array: " . scalar(@messages) . "\n";
</syntaxhighlight>
You can access and modify array elements directly:
<syntaxhighlight lang="perl">
$messages[0] = "Hello world!\n";
</syntaxhighlight>
Helper Functions
join()
join() concatenates the elements of an array into a single scalar, placing a separator between them. Using the array example above, @messages, the following code will generate the string "Hello world!\n, I like perl!\n" as a scalar:
<syntaxhighlight lang="perl">
my @messages = ("Hello world!\n", "I like perl!\n");
my $joined_message = join(", ", @messages);
print $joined_message;
</syntaxhighlight>
split()
split() takes a scalar and converts it to an array, breaking it wherever a delimiter pattern matches. Using our string from earlier:
<syntaxhighlight lang="perl">
my $joined_message = "Hello world!\n, I like perl!\n";
# The delimiter is a regular expression, written between slashes without quotes
my @messages = split(/, /, $joined_message);
print $messages[0];
print $messages[1];
print "Size of messages array: " . scalar(@messages) . "\n";
</syntaxhighlight>
push()
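The push() function appends one or more elements to the end of an array. It is similar to the push instruction in assembly in that it treats the array like a stack.
<syntaxhighlight lang="perl">
my @array;
push(@array, 'element one');
push(@array, ('element two', 'element three'));
</syntaxhighlight>
The same effect can be achieved by assigning to the index one past the last element:
<syntaxhighlight lang="perl">
$array[$#array + 1] = "new element";
</syntaxhighlight>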
pop()
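The pop() function removes the last element of an array and returns it. It is similar to the pop instruction in assembly in that it treats the array like a stack.
<syntaxhighlight lang="perl">
my @array = (0, 1);
my $popped = pop(@array);   # $popped is 1 and @array now holds only 0
</syntaxhighlight>
The same effect can be achieved with:
<syntaxhighlight lang="perl">
my $popped = $array[$#array];   # read the last element first
$#array--;                      # then shrink the array by one
</syntaxhighlight>
Executing pop() on an array removes its highest-index element.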
unshift()
The unshift() function is the counterpart of push() at the other end of the array: instead of adding to the end (the top of the stack), it inserts elements at the front and moves the existing elements up by one index.
<syntaxhighlight lang="perl">
my @array = (1);
unshift(@array, 0);
# $array[0] now contains 0 and $array[1] now contains 1
</syntaxhighlight>
shift()
The shift() function is the counterpart of pop() at the other end of the array: instead of removing from the end (the top of the stack), it removes the first element, returns it, and moves the remaining elements down by one index.
<syntaxhighlight lang="perl">
my @array = (0, 1);
my $first_element = shift(@array);
# $first_element is 0; @array now holds the single element 1
</syntaxhighlight>
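The four functions above can be seen working together in one short sketch; the array name @items and its contents are made up for this illustration.

```perl
use strict;
use warnings;

my @items = ('b', 'c');

push(@items, 'd');          # add to the end:        b c d
unshift(@items, 'a');       # add to the front:      a b c d

my $last  = pop(@items);    # remove from the end:   returns 'd'
my $first = shift(@items);  # remove from the front: returns 'a'

print "first=$first last=$last remaining=@items\n";
print "count: " . scalar(@items) . ", last index: " . $#items . "\n";
```

Using push() with pop() gives stack (last in, first out) behaviour, while push() with shift() gives queue (first in, first out) behaviour.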
Hashes
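A hash is an unordered collection of key => value pairs (an associative array). It can be used much like a struct in C, with the keys acting as field names.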
Introduction
Hashes are prefixed by the % character. Individual hash values are accessed with $ and the key in curly braces. Each value is a scalar, so it may be a string, a number, or a reference to another hash or an array.
- You can directly set or modify the value stored under a key
<syntaxhighlight lang="perl">
$hash{'key'} = 'value';
</syntaxhighlight>
- You can also create key => value pairs on declaration
<syntaxhighlight lang="perl">
my %hash = ('key' => 'value', 'key2' => 'value2');
</syntaxhighlight>
- Example:
<syntaxhighlight lang="perl">
my %user;
$user{'username'} = "hatter";
$user{'network'}  = "irc.blackhatacademy.org";
print "The user " . $user{'username'} . " is connected to " . $user{'network'} . "\n";
</syntaxhighlight>
Helper Functions
each()
each() returns the next ($key, $value) pair of a hash, so a while loop around it walks every pair. With our %user hash:
<syntaxhighlight lang="perl">
while (my ($key, $value) = each(%user)) {
    print "Key: $key, Value: $value\n";
}
</syntaxhighlight>
keys
keys returns a list of all the keys in a hash, which a foreach loop can iterate over. We can visit the same $key => $value pairs as above using keys instead of each; sorting the keys gives a predictable order:
<syntaxhighlight lang="perl">
foreach my $key (sort keys %user) {
    print "Key: $key, Value: " . $user{$key} . "\n";
}
</syntaxhighlight>
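A common use of a hash is counting how often each value occurs. The sketch below assumes a made-up word list and uses the exists built-in to check for a key without creating it.

```perl
use strict;
use warnings;

my @words = qw(perl python perl ruby perl python);
my %count;

# A missing key starts out undefined; ++ treats that as 0 and stores 1.
$count{$_}++ for @words;

foreach my $word (sort keys %count) {
    print "$word: $count{$word}\n";
}

print "no entry for 'lua'\n" unless exists $count{lua};
```

Sorting the keys matters here because perl does not keep hash keys in insertion order.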
References
A reference is similar to a pointer in C: it is a scalar that refers to another value, although perl does not allow pointer arithmetic on it.
Hash References
A hash reference is a scalar created using the \ operator as follows:
<syntaxhighlight lang="perl">
my %user;
$user{'name'}    = "hatter";
$user{'network'} = "irc.blackhatacademy.org";
my $hashref = \%user;
</syntaxhighlight>
Once you've created a hashref (hash reference) you use the arrow operator (->) to access a key through it:
<syntaxhighlight lang="perl">
print $hashref->{'name'} . "\n";
print $hashref->{'network'} . "\n";
</syntaxhighlight>
Callback References
This involves user-defined functions, which are covered later in this article. A callback reference (also called a code reference) is a scalar that points to a function. To create a callback reference:
<syntaxhighlight lang="perl">
my $callback = \&function_name;
</syntaxhighlight>
To execute the callback function and pass it arguments:
<syntaxhighlight lang="perl">
$callback->($arg1, $arg2);
</syntaxhighlight>
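To make the callback idea concrete, here is a small sketch in which one function receives another as its first argument. The function names double, square and apply_to_all are hypothetical and exist only in this example.

```perl
use strict;
use warnings;

sub double { my $n = shift; return $n * 2; }
sub square { my $n = shift; return $n * $n; }

# Call the given callback once per value and return the list of results.
sub apply_to_all {
    my ($callback, @values) = @_;
    return map { $callback->($_) } @values;
}

my @doubled = apply_to_all(\&double, 1, 2, 3);
my @squared = apply_to_all(\&square, 1, 2, 3);
my @negated = apply_to_all(sub { -$_[0] }, 1, 2, 3);   # anonymous function

print "@doubled\n";   # 2 4 6
print "@squared\n";   # 1 4 9
print "@negated\n";   # -1 -2 -3
```

The last call shows that a callback does not need a name: sub { ... } without a name returns a code reference directly.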
Casting
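Casting, in this article, means converting from one data type to another. In perl this is mostly dereferencing: turning a reference back into the data it points to, using curly brackets {} preceded by a data type designator ($, %, or @).
- To cast a hash reference back to a hash:
<syntaxhighlight lang="perl">
my %hash;
my $hashref = \%hash;        # create the hash reference
my %casted  = %{$hashref};   # cast back to a hash (this makes a copy)
</syntaxhighlight>
- To get the list of keys in a hash as an array, no cast is needed because keys already returns a list:
<syntaxhighlight lang="perl">
my @keys = keys %hash;
</syntaxhighlight>
- To cast a scalar value to an integer (int() truncates toward zero):
<syntaxhighlight lang="perl">
my $integer = int($scalar);
</syntaxhighlight>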
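Dereferencing becomes more useful once data structures are nested. The following sketch uses a made-up host record whose 'ports' value is an array reference; all names in it are assumptions for illustration.

```perl
use strict;
use warnings;

my @ports    = (22, 80, 443);
my %host     = (name => 'example', ports => \@ports);   # value is an array reference
my $host_ref = \%host;

my @copy  = @{ $host_ref->{ports} };          # dereference into a new array
my $count = scalar @{ $host_ref->{ports} };   # number of elements behind the reference
my $first = $host_ref->{ports}->[0];          # arrow syntax for a single element

push @copy, 8080;                             # changes the copy only

print "$host_ref->{name} has $count ports, first is $first\n";
print "original: @ports\n";
print "copy:     @copy\n";
```

Because @{ ... } copies the elements, later changes to @copy leave the original @ports untouched.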
Boolean Logic
Operators
Mathematical
Regular Expression
Statements
if
unless
AND and OR
switch
Golfing
Helper Natives
exists
defined
undef
Bitwise Manipulations
AND
NOT
OR
XOR
Bit Shifting
Bit Rotation
Loops
A loop is a block of code that executes repeatedly until a condition is met.
While
- A while loop executes while a condition is true.
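<syntaxhighlight lang="perl">
my $switch;
my $counter = 0;
# "undef $switch" would always be false, so test with !defined instead
while (!defined $switch) {
    print "$counter\n";
    $counter++;
    $switch = 1 if ($counter > 100);
}
</syntaxhighlight>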
The above code will execute until $switch is defined.
It is possible to create an infinite loop using while (1) { ... }.
Until
- An until loop executes until a condition is true.
<syntaxhighlight lang="perl">
my $switch;
my $counter = 0;
until (defined $switch) {
    print "$counter\n";
    $counter++;
    $switch = 1 if ($counter > 100);
}
</syntaxhighlight>
The above code will execute until $switch is defined.
For
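- A C-style for loop combines the counter's initialization, the condition for continuing, and the step into one statement.
<syntaxhighlight lang="perl">
my @messages = ("Hello world!\n", "I like perl!\n");
# In numeric comparison @messages gives the element count, so every index is visited
for (my $counter = 0; $counter < @messages; ++$counter) {
    print $messages[$counter];
}
</syntaxhighlight>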
The above code will iterate through every element in an array.
It is possible to create an infinite loop using for (;;) {...}.
Foreach
- A foreach loop is built specifically for array handling and iterates through all of the elements in an array.
<syntaxhighlight lang="perl">
my @messages = ("Hello world!\n", "I like perl!\n");
foreach my $message (@messages) {
    print $message;
}
</syntaxhighlight>
The above code will iterate through every element in an array.
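An infinite loop such as while (1) needs an explicit way out. A minimal sketch, assuming we want the odd numbers up to 10, uses perl's last and next statements to leave the loop or skip to its next pass.

```perl
use strict;
use warnings;

my $n = 0;
my @odd;

while (1) {
    $n++;
    last if $n > 10;        # leave the loop entirely
    next if $n % 2 == 0;    # skip even numbers and start the next pass
    push @odd, $n;
}

print "@odd\n";   # 1 3 5 7 9
```

Both statements also work inside until, for and foreach loops.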
User Input
Command Line Arguments
Command line arguments are passed at execution time; e.g.
perl script.pl -a arg1 -b arg2 ...
Getopt::Std
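This uses the Getopt::Std module, which ships with perl. Its documentation is available via `perldoc Getopt::Std'.
Code
<syntaxhighlight lang="perl">
use strict;
use warnings;
use Getopt::Std;

my %opts;
getopts('m:b', \%opts);

print $opts{m} . "\n" if defined $opts{m};
print "The boolean -b option was set!\n" if defined $opts{b};
# "undef $opts{b}" would erase the value; use "unless defined" to test it
print "The boolean -b option was not set!\n" unless defined $opts{b};
</syntaxhighlight>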
Analysis
The getopts() function takes a string of flags to parse as well as a hash reference. You can execute the script as follows:
perl script.pl -m "hello" -b
perl script.pl -m "hello"
In the above example, we see the line:
getopts('m:b',\%opts);
The 'm:b', the first argument to the function, designates what command line arguments to parse. The colon after the 'm' specifies that it takes an additional parameter, in this case, the message to say. The -b does not have a colon; we are using it to demonstrate a flag that does not require an additional parameter.
The second argument is a hash reference designating where the parsed values are stored; in this case, $opts{m} contains "hello" and $opts{b} is either defined (set to 1) or undefined, depending on whether -b was present when the script was executed.
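To try getopts() without typing a command line, the arguments can be placed in @ARGV by hand. This is only a sketch: the argument values and the file name extra.txt are invented for the example.

```perl
use strict;
use warnings;
use Getopt::Std;

# Simulate: perl script.pl -m hello -b extra.txt
local @ARGV = ('-m', 'hello', '-b', 'extra.txt');

my %opts;
getopts('m:b', \%opts) or die "Usage: $0 [-b] -m message\n";

my $message = defined $opts{m} ? $opts{m} : '(none)';
my $flag    = $opts{b} ? 'on' : 'off';

print "message: $message\n";
print "flag:    $flag\n";
print "left in \@ARGV: @ARGV\n";   # getopts removes the options it parsed
```

getopts() returns false when it meets an unknown flag, which is why the result is checked with "or die".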
Getopt::Long
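This uses the Getopt::Long module, which ships with perl. Its documentation is available via `perldoc Getopt::Long'.
Code
<syntaxhighlight lang="perl">
use strict;
use warnings;
use Getopt::Long;

# Parentheses are required to declare several variables with one "my"
my ($message, $boolean);
GetOptions('message=s' => \$message, 'boolean' => \$boolean);

print $message . "\n" if defined $message;
print "The --boolean option was set!\n" if defined $boolean;
print "The --boolean option was not set!\n" unless defined $boolean;
</syntaxhighlight>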
Analysis
The GetOptions() function receives option specifications and references to the variables that will hold their values. You can execute the script as follows:
perl script.pl --message "hello" --boolean
perl script.pl --message "hello"
In the above example, we see the line:
GetOptions('message=s' => \$message, 'boolean' => \$boolean);
You can see from the execution pattern above that the GetOptions() function provides an interface for "double-dash" style command line arguments. It receives a list of pairs, each mapping an option specification to a reference to the variable that stores the value. The =s after message designates that the --message parameter takes a string. An =i changes it to an integer. Omitting the = makes the option a boolean flag, similar to an argument without a colon in Getopt::Std. Notice that each variable is passed as a reference.
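The string, integer and boolean specifications can be compared side by side in the sketch below. It uses GetOptionsFromArray(), which Getopt::Long exports on request, so that the arguments can be supplied from an array instead of the real command line; the --count option and the default values are assumptions made for this example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Simulate: perl script.pl --message hello --count 3 --boolean
my @args = ('--message', 'hello', '--count', '3', '--boolean');

my $message = 'default';
my $count   = 1;
my $boolean = 0;

GetOptionsFromArray(
    \@args,
    'message=s' => \$message,   # =s: takes a string
    'count=i'   => \$count,     # =i: takes an integer
    'boolean'   => \$boolean,   # no =: a flag, set to 1 when present
) or die "Bad options\n";

print "$message\n" x $count;    # print the message $count times
print "boolean is ", ($boolean ? "set" : "not set"), "\n";
```

Giving each variable a default before parsing avoids having to test for undefined values afterwards.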
STDIN (Standard Input)
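Reading from standard input in perl is very simple: read a line from the STDIN filehandle and strip its trailing newline with chomp().
<syntaxhighlight lang="perl">
print "Enter your name: ";
my $name = <STDIN>;
chomp($name);   # remove the newline typed by the user
print "Your name is $name\n";
</syntaxhighlight>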
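Reading several lines works the same way inside a while loop. So that the sketch below runs unattended, it reads from an in-memory filehandle holding made-up input; with a real user you would read from STDIN instead of $in.

```perl
use strict;
use warnings;

# Stand-in for keyboard input; the names are invented for this example.
my $typed = "alice\nbob\n\ncarol\n";
open(my $in, '<', \$typed) or die "Cannot open in-memory input: $!";

my @names;
while (my $line = <$in>) {    # the loop ends when no more input is available
    chomp($line);             # strip the trailing newline
    next if $line eq '';      # ignore blank lines
    push @names, $line;
}
close($in);

print "Read " . scalar(@names) . " names: @names\n";
```

When reading from a terminal, the loop ends once the user signals end of input (Ctrl-D on Linux and Unix).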
User-Defined Functions
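A function is defined by the programmer to create reusable code. Arguments arrive in the special array @_, and shift takes the first one. In our example, we will make an is_integer function that returns either 1 or undef depending on whether the scalar passed is an integer or not.
<syntaxhighlight lang="perl">
sub is_integer {
    my $scalar = shift;
    # Match an optional sign followed only by digits; comparing with int()
    # would wrongly accept non-numeric strings, which evaluate to 0.
    return 1 if (defined $scalar && $scalar =~ /^[+-]?\d+$/);
    return undef;
}
</syntaxhighlight>
Usage:
<syntaxhighlight lang="perl">
if (defined is_integer($scalar)) {
    print "This scalar is an integer.\n";
} else {
    print "This is not an integer.\n";
}
</syntaxhighlight>
A function can also return several values at once as a list:
<syntaxhighlight lang="perl">
return($scalar,@array);
</syntaxhighlight>
The caller unpacks that list on assignment; the array must come last because it absorbs every remaining value:
<syntaxhighlight lang="perl">
my ($scalar,@array) = function();
</syntaxhighlight>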
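Putting arguments and list return values together, a complete function might look like the following sketch. The function name min_max and its sample numbers are hypothetical.

```perl
use strict;
use warnings;

# Return the smallest and largest of the numbers passed in.
sub min_max {
    my @numbers = @_;          # arguments arrive in the special array @_
    return unless @numbers;    # empty list when called without arguments

    my ($min, $max) = ($numbers[0], $numbers[0]);
    foreach my $n (@numbers) {
        $min = $n if $n < $min;
        $max = $n if $n > $max;
    }
    return ($min, $max);
}

my ($low, $high) = min_max(7, -2, 10, 4);
print "min=$low max=$high\n";   # min=-2 max=10
```

Copying @_ into a named lexical first keeps the function from accidentally modifying the caller's variables, since the elements of @_ are aliases to the original arguments.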