open and the <> operator
Opening and reading files with Perl is simple. Here's how to open a file, read it line by line, check it for text matching a regular expression, and print the lines that match.
open( my $fh, '<', $filename ) or die "Can't open $filename: $!";
while ( my $line = <$fh> ) {
    if ( $line =~ /wanted text/ ) {
        print $line;
    }
}
close $fh;
Always check the return value of open() for truthiness. If open() fails, the reason is in $!.
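As a rough sketch of the same pattern wrapped in a reusable form, the helper below collects matching lines instead of printing them. The name matching_lines and the made-up path in the failure check are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: return every line of $filename that matches $pattern.
sub matching_lines {
    my ( $filename, $pattern ) = @_;

    # open() returns false on failure and stores the reason in $!
    open( my $fh, '<', $filename ) or die "Can't open $filename: $!";

    my @matches;
    while ( my $line = <$fh> ) {
        push @matches, $line if $line =~ $pattern;
    }
    close $fh;

    return @matches;
}

# Opening a path that does not exist shows what ends up in $!.
# The path below is made up for the example.
if ( !open( my $missing, '<', '/no/such/dir/no-such-file.txt' ) ) {
    print "open failed: $!\n";
}
```

A call such as matching_lines( $filename, qr/wanted text/ ) returns the matching lines with their newlines still attached.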
chomp
Lines read from a file have their trailing linefeed still attached. If the first line of a text file is Aaron, the string you read is actually "Aaron\n", six characters. This code will fail:
my $line = <$fh>;
if ( $line eq 'Aaron' ) {
    # won't reach here, because it's really "Aaron\n"
}
To remove the "\n", call chomp. Note that chomp removes only the trailing input record separator (a newline by default), not other trailing whitespace.
my $line = <$fh>;
chomp $line;
Now $line is five characters long.
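The sketch below illustrates what chomp does and does not strip. The strings stand in for lines read from a filehandle, and the variable names are made up for the example. It assumes $/ still has its default value of "\n".

```perl
use strict;
use warnings;

# Simulate a line as it would arrive from <$fh>.
my $line = "Aaron\n";
print length($line), "\n";    # 6: the newline is still attached

my $count = chomp($line);     # chomp returns the number of characters removed
print length($line), "\n";    # 5

# Trailing spaces before the newline survive chomp.
my $padded = "Aaron  \n";
chomp($padded);
print "[$padded]\n";          # [Aaron  ]

# A substitution strips all trailing whitespace, newline included.
( my $trimmed = "Aaron  \n" ) =~ s/\s+\z//;
print "[$trimmed]\n";         # [Aaron]
```

If stray spaces or tabs at the end of a line matter for a comparison, the substitution is the safer choice.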
$/
It's possible to change your input record separator, $/. It's set to "\n" only by default.
Set $/ to the empty string ("") to read a paragraph at a time. Set $/ to undef to read the entire file at once. See perlvar for details.
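Here is a minimal sketch of paragraph mode, assuming (per perlvar) that the empty string is the value that enables it. The helper name read_paragraphs is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: read $filename a paragraph at a time.
sub read_paragraphs {
    my ($filename) = @_;
    open( my $fh, '<', $filename ) or die "Can't open $filename: $!";

    # "" turns on paragraph mode: records end at one or more blank lines.
    # local restores the previous value of $/ when the sub returns.
    local $/ = "";

    my @paragraphs = <$fh>;
    close $fh;

    # chomp honors the current $/, so in paragraph mode it strips
    # all of the trailing newlines from each record.
    chomp @paragraphs;
    return @paragraphs;
}
```

Because $/ is localized inside the sub, code that runs afterwards still reads one line at a time.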
Typically you'll see novices read a file using one of these two methods:
open (FILE,$filename) || die "Cannot open '$filename': $!";
undef $/;
my $file_as_string = <FILE>;
or
open (FILE,$filename) || die "Cannot open '$filename': $!";
my $file_as_string = join '', <FILE>;
Of those two, choose the former. The second one reads all the lines into a list, and then joins them together into one big string. The first one just reads into a string, without creating the intervening list of lines.
The best way yet is like so:
my $file_as_string = do {
    # three-argument open, as in the first example
    open( my $fh, '<', $filename ) or die "Can't open $filename: $!";
    local $/ = undef;
    <$fh>;
};
The do block returns the last value evaluated in the block. This method localizes $/ so that it gets set back to its previous value outside the scope of the block. Without localizing $/, it keeps the value assigned to it, and another piece of code might not be expecting it to have been set to undef.
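To see the localization at work, the sketch below wraps the idiom in a hypothetical slurp helper, writes a scratch file with File::Temp (a core module), and checks $/ after the call. The file contents are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Hypothetical helper wrapping the do-block idiom.
sub slurp {
    my ($filename) = @_;
    my $contents = do {
        open( my $fh, '<', $filename ) or die "Can't open $filename: $!";
        local $/ = undef;    # slurp mode, only until the block ends
        <$fh>;
    };
    return $contents;
}

# Write a small scratch file to read back.
my ( $out, $path ) = tempfile( UNLINK => 1 );
print {$out} "line one\nline two\n";
close $out;

my $text = slurp($path);
print length($text), " characters read\n";
print "separator is back to the default\n" if $/ eq "\n";
```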
Here's another way:
use File::Slurp qw( read_file );
my $file_as_string = read_file( $filename );
File::Slurp is a handy CPAN module for reading and writing a whole file at a time, and it is optimized for speed behind the scenes.
glob()
Use standard shell globbing patterns to get a list of files.
my @files = glob( "*" );
Pass them through grep to do quick filtering. For example, to get files and not directories:
my @files = grep { -f } glob( "*" );
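The following sketch runs both calls against a throwaway directory so the result does not depend on where the script is started. The directory layout and file names are invented for the example. File::Temp is a core module.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Build a scratch directory with three files and one subdirectory.
my $dir = tempdir( CLEANUP => 1 );
mkdir "$dir/subdir" or die "Can't mkdir $dir/subdir: $!";
for my $name ( 'a.txt', 'b.txt', 'c.log' ) {
    open( my $fh, '>', "$dir/$name" ) or die "Can't create $name: $!";
    close $fh;
}

my @everything = glob("$dir/*");              # files and directories
my @files      = grep { -f } glob("$dir/*");  # plain files only
my @text_files = glob("$dir/*.txt");          # shell-style pattern

print scalar(@everything), " entries, ", scalar(@files), " files, ",
    scalar(@text_files), " .txt files\n";
```

One caveat: glob splits its argument on whitespace, so a pattern containing a path with spaces in it needs extra quoting.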
unlink to remove a file
The Perl built-in delete deletes elements from a hash, not files from the filesystem.
my %stats;
$stats{filename} = 'foo.txt';
unlink $stats{filename}; # RIGHT: Removes "foo.txt" from the filesystem
delete $stats{filename}; # WRONG: Removes the "filename" element from %stats
The term "unlink" comes from the Unix idea of removing a link to the file from the directory nodes.
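Like open, unlink reports success through its return value. This short sketch uses a scratch file from File::Temp and shows one way to check it. The variable names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Create a scratch file to delete; File::Temp generates the name.
my ( $fh, $doomed ) = tempfile();
close $fh;

# unlink returns the number of files it removed, so 0 means failure
# and $! holds the reason.
my $removed = unlink $doomed;
print "removed $removed file(s)\n";

# A second attempt fails because the file is already gone.
my $again = unlink $doomed;
print "second unlink removed $again: $!\n" if !$again;
```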
Even though Unix uses paths like /usr/local/bin and Windows uses C:\foo\bar\bat, you can still use forward slashes in your filenames.
my $filename = 'C:/foo/bar/bat';
open( my $fh, '<', $filename ) or die "Can't open $filename: $!";
This works because Windows itself accepts forward slashes as path separators, so Perl can hand C:/foo/bar/bat to the operating system unchanged. It also avoids the problem where an unescaped backslash in a double-quoted string corrupts a filename, as in:
my $filename = "C:\tmp";
In this case, $filename contains five characters: 'C', ':', a tab character, 'm' and 'p'. Instead, it should have been written as one of:
my $filename = 'C:\tmp';
my $filename = "C:\\tmp";
Or, you can let Perl take care of it for you with:
my $filename = 'C:/tmp';
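The quoting rules are easy to confirm by measuring the strings. This small sketch compares the four spellings. The variable names are chosen for the example, and nothing here touches the filesystem.

```perl
use strict;
use warnings;

my $broken  = "C:\tmp";     # \t is interpolated as a tab
my $single  = 'C:\tmp';     # single quotes keep the backslash
my $escaped = "C:\\tmp";    # doubled backslash inside double quotes
my $forward = 'C:/tmp';     # forward slash needs no escaping

print length($broken), "\n";                               # 5
print length($single), "\n";                               # 6
print $single eq $escaped ? "same\n" : "differ\n";         # same
print $broken =~ /\t/ ? "contains a tab\n" : "no tab\n";   # contains a tab
```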
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.