Extract Mail Addresses from CSV

By :m) on November 20, 2011 12:05 PM

What do you think about the code below?
I have a file containing information about people, where the fifth element of the tab separated line contains the mail address.
Each mail address appears multiple times, and I need to print each one only once.

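use strict;
use warnings;
use feature 'say';    # say needs Perl 5.10 or later

my %mails;
open my $csv, '<', 'my.csv' or die $!;
while (<$csv>) {
    chomp;                                # drop the newline in case the address is the last field
    my $mail = ( split /\t/ )[4];         # fifth tab-separated field
    $mails{$mail} = 1 if defined $mail;   # hash keys keep each address only once
}
close $csv;
say foreach sort keys %mails;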

~~(and how do I indent code on blogs.perl.org?)~~
thx @Aristotle

4 comments

Aristotle | November 20, 2011 1:49 PM | Reply

how do I indent code on blogs.perl.org?

The way you do it in HTML – with <pre> around <code>. I fixed it for you, check your post.

wtgee | November 20, 2011 5:09 PM | Reply

Not really what you are asking, but assuming you don't want to use some standard modules (Text::CSV) and don't want to validate your email addresses and that this isn't part of a larger program, you can do this right in the shell with:

cut -f 5 my.csv | sort -u

Just an FYI.

http://www.butteredham.com/blog/ | November 21, 2011 6:06 PM | Reply

Your example should work fine, as long as none of your fields ever contain a tab character. If they do, then in proper CSV they will be quoted in some manner, and sorting out things like that is what modules like Text::CSV are for. But for a one-off task where you know what your data looks like, there's nothing wrong with this.
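For the case where a field may contain a tab, here is a minimal sketch of the Text::CSV route. It assumes Text::CSV is installed from CPAN (it is not a core module), that quoted fields use double quotes, and that the address is still the fifth column. The sub name unique_mails and the sample rows are made up for illustration.

```perl
use strict;
use warnings;
use feature 'say';
use Text::CSV;    # CPAN module, not part of core Perl

# Return the sorted, de-duplicated values of the fifth column.
sub unique_mails {
    my ($fh) = @_;
    my $csv = Text::CSV->new(
        { sep_char => "\t", binary => 1, auto_diag => 1 } );
    my %mails;
    while ( my $row = $csv->getline($fh) ) {
        my $mail = $row->[4];
        next unless defined $mail && length $mail;    # skip short or empty rows
        $mails{$mail} = 1;
    }
    return sort keys %mails;
}

# Hypothetical sample data; the second row has a quoted field with a tab in it.
my $sample = join '',
    "1\tAnn\tLee\tBerlin\tann\@example.com\n",
    "2\tBob\t\"Smith\tJr.\"\tParis\tbob\@example.com\n",
    "3\tAnn\tLee\tBerlin\tann\@example.com\n";

# Read from an in-memory filehandle so the sketch runs without a file.
open my $fh, '<', \$sample or die $!;
say for unique_mails($fh);
close $fh;
```

On the second sample row a plain split(/\t/) would count the tab inside the quotes as a separator and pick up Paris as the fifth field. To run this against the real file, open 'my.csv' instead of the in-memory string.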

:m) | November 21, 2011 6:17 PM | Reply

Thanks for your answers. Very interesting!

What I especially like is
$mail = ( split(/\t/) )[4];
First split $_ at the tabs, then wrap the result in () so that it can be subscripted as a list, then take element 4 and assign it to $mail.

At the moment I am confident that I will not have to think more than five seconds to understand it.

But I would not directly assign to the hash:
$mails{ ( split(/\t/) )[4] } = 1
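For anyone who wants to try the list slice on its own, here is a small sketch on a single line of input. The sample line and the variable names are invented for the example, and it assumes the address is the last field, which is why the line is chomped first.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical sample line; the mail address is the last field,
# so the trailing newline has to go before splitting.
my $line = "1\tAnn\tLee\tBerlin\tann\@example.com\n";
chomp $line;

# split returns a list; the outer parentheses let us subscript it.
my $mail = ( split /\t/, $line )[4];

# A list slice can also pick several fields at once.
my ( $first, $last ) = ( split /\t/, $line )[ 1, 2 ];

say $mail;             # prints ann@example.com
say "$first $last";    # prints Ann Lee
```

Without the parentheses around split the subscript is a syntax error, so they are what makes the slice possible rather than what turns the result into a list.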
