November 2010 Archives

Adopt Randy Kobes's modules

This autumn, Randy Kobes, perhaps best known for the alternate CPAN web interface kobesearch, passed away.

If you care about one of the modules that he maintained, adopt it! Write to modules@perl.org to let one of the admins know that you'd like to be a maintainer.

I've updated his PAUSE account so I get his bug reports and so on, but I'm unlikely to do any maintenance on his modules unless I can merely apply the patch and …

What do you care what type it is?

I'm about to start writing about a bunch of stuff that will definitely show my lack of a computer science background. Unlike many of my posts, this is your chance to correct me rather than me explaining things to you. This has been on my desktop for a while, so I'm cutting bait and posting what I have instead of working on it more.

I've been reading Seven Languages in Seven Weeks: A Pragmatic Guide to Learning Programming Languages, which is an enjoyable book except for the parts where he starts to talk about types and, ahem, types of programming languages. It's mostly distracting, not very useful, probably misguided, if not outright wrong.

So, Ovid posts a clever summary of type arguments. This reminded me of the smart, educated, and quite entertaining "Strong Typing" talk that Mark Jason Dominus gave to several Perl mongers groups. It also reminds me that no one seems to think the same things about the same terms. Also see mjd's message in comp.lang.perl.moderated, in which he summarizes the several competing definitions of strong typing.

Wikipedia's entry on Strongly-typed Programming Languages isn't any help. Indeed, the discussion page, where mjd shows up, is better than the main article (and also demonstrates the underlying weakness of Wikipedia). The article did point me toward Types and Programming Languages (Google Books), which Amazon is already sending to me. I like the look of that book since it goes back to the math.

The math is where it's at, and although I don't have a background in Computer Science, I have a lot of experience with abstract algebra, which defines sets, groups, and so on, and what happens when they interact with each other.

I think this is why I get confused with most people's explanations. Most of the explanations I find come from people trying to explain a concept that they don't fully understand based on their limited experience. That is, more concretely, people think the C programming language means something when it comes to types. I like Real World Haskell's approach if only because it defines the term. They could have just as well said Haskell is a "blue language" because the particular word doesn't matter when you provide your own definition for it:

When we say that Haskell has a strong type system, we mean that the type system guarantees that a program cannot contain certain kinds of errors.

That provides an easy way for Haskell to compare itself to other languages. In Haskell, certain classes of errors can't occur in a valid program. In other languages, maybe those classes of errors can. The question is, does that matter to you, both personally as a matter of beauty, and economically, as a productive use of your time?

And now comes the bit where I try to do better and will fail.

There's this big mess of terms: strong, weak, loose, static, dynamic, concrete, abstract, data, variable, and so on. I like what Richard Feynman learned about bird names from his father. Dr. Feynman says:

You can know the name of a bird in all the languages of the world, but when you're finished, you'll know absolutely nothing whatever about the bird... So let's look at the bird and see what it's doing -- that's what counts. I learned very early the difference between knowing the name of something and knowing something.

The video is more interesting:


He tells the same story to an interviewer in "Take the World from Another Point of View":


In this telling, he adds one important bit to that story:

Names don't constitute knowledge. That's caused me a certain trouble since because I refuse to learn the name for anything. ... What he forgot to tell me was knowing the names of things is useful if you want to talk to someone else.

The names of birds, however, only matter if people call the bird by the same name.

Types are just a kind of thing, and not at all like birds. It doesn't matter how we define that thing or how it works. The type is not the algebra. Forget about the terms, which no one can agree on (mostly), and figure out what you want to know and why you want to know it. It doesn't really matter what you call it as long as you get what you want.

What can I put in this variable?

A lot of programmers immediately think of int, float, or char as types. That's fine. However, when they don't see those types, they tend to turn up their noses because they think something is type deficient. The sorts of types that you have really have nothing to do with it. Indeed, most of those types come from architecture-dependent factors, like exposing the storage and format details at the higher levels. The people who want these sorts of types are looking to define the set of data that belong in that type. However, that does not mean that larger sets are not also types.

Programmers typically want this so they have something that protects them from storing invalid values.

How soon do I find out about type errors?

Do I have to wait until I run the program or will the compiler tell me? Consider this Perl example:

 push $array, qw(a b c);

Is that a type error? Is it a type error in Perl 5.12? What about Perl 5.14? When does Perl find out about that error in each of those versions? Is it good or bad that it does that?
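One way to poke at those questions yourself is to hand the statement to a string eval, so that whatever perl thinks of it shows up in $@ instead of stopping the program. This is only a sketch: the variable names are mine, and I deliberately don't say what it prints because the answer depends on which perl runs it.

```perl
use strict;
use warnings;

# Compile and run the statement from a string, so a compile-time
# complaint lands in $@ instead of killing the whole program.
my $accepted = eval q{
    no warnings;               # some versions only warn about this form
    my $array = [];
    push $array, qw(a b c);
    1;
};
my $complaint = $accepted ? '' : $@;
chomp $complaint;

# The explicit dereference says what we mean, whatever perl made of the line above.
my $array = [];
push @$array, qw(a b c);

print "perl $]: bare scalar ",
    ( $accepted ? "accepted" : "rejected ($complaint)" ), "\n";
print "explicit dereference stored ", scalar @$array, " elements\n";
```

Run it under a few different perls and compare the first line of output.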

Can I change the type?

Is the type fixed, or can programmers play tricks to cast or coerce the thingy to void, or Object, or whatever, from which they can then recast the thingy to whatever they want? Do you want to allow or forbid that sort of thing?
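In Perl the answer is mostly "yes, go ahead". Here is a small sketch of the same scalar variable holding a number, then a string, then a reference, and finally a reference that has been relabeled with bless. The class name Local::Thingy is made up for the illustration.

```perl
use strict;
use warnings;

my $thingy = 42;                  # starts out as a number
my $sum    = $thingy + 1;         # 43

$thingy = "forty-two";            # same variable, now a string
my $length = length $thingy;      # 9

$thingy = [ 1, 2, 3 ];            # now a reference to an array
my $before = ref $thingy;         # 'ARRAY'

bless $thingy, 'Local::Thingy';   # hypothetical class name
my $after = ref $thingy;          # 'Local::Thingy'

print "$sum $length $before $after\n";
```

Nothing stopped any of those assignments. Whether that is freedom or a hazard is the question.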

When do I know the types?

People get confused about when the compiler (or interpreter) knows what the type is. Can you check it without a compiler (as with PPI or other static analysis tools), which really means can you infer all of the information that you need about types without actually running the program?

Mostly, people want to find errors through type-checking (so, the terms "type safety" and "type security"). The earlier you know about the types, the sooner your program can report problems. Some people don't even want the program to compile if there is a type mismatch.

What is the operator?

Some languages choose the operator by type, even if you type the same literal text for the code. How does that make you feel? Would you rather see the operation explicitly so you don't have to read through several lines of code to determine the type to know the operation, or do you want to be able to look at isolated statements and know what is going on?
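Perl goes the other way: the operator names the operation and the operands get coerced to fit. A minimal sketch with the same two values treated as numbers and then as strings:

```perl
use strict;
use warnings;

my ( $x, $y ) = ( "10", "20" );

my $added  = $x + $y;    # + always means numeric addition: 30
my $joined = $x . $y;    # . always means string concatenation: "1020"

# The comparison operators split the same way.
my $numeric_equal = ( "1.0" == "1" ) ? 1 : 0;    # 1, compared as numbers
my $string_equal  = ( "1.0" eq "1" ) ? 1 : 0;    # 0, compared as strings

print "$added $joined $numeric_equal $string_equal\n";
```

You can read each statement in isolation and know which operation happens, at the price of having to pick the operator yourself.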

What type is the result?

I always hated this about FORTRAN. Dividing 10 by 3 gave back 3, because, in someone's mind, an integer divided by an integer had to be an integer. Why can't some other type come out? That goes back to the algebra. Any operation in an algebra has to return another member of the group.
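For comparison, here is a sketch of the same division in Perl. By default the result leaves the integers, and you have to ask for the FORTRAN behavior, either with int or with the lexically scoped integer pragma.

```perl
use strict;
use warnings;

my $plain = 10 / 3;             # floating-point result, 3.333...

my $truncated = int( 10 / 3 );  # throw away the fraction afterward: 3

my $integer;
{
    use integer;                # integer arithmetic inside this block only
    $integer = 10 / 3;          # 3
}

printf "%.4f %d %d\n", $plain, $truncated, $integer;
```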

What ______ is _______?

There's more to this topic than I can imagine at the moment.

Get real random numbers from Perl's rand()

On Stackoverflow, someone asks how to get 100 random numbers without a loop. It's one of those dumb homework problems that tries to forbid only one of many things instead of specifying the technique it really wants the student to practice. In Perl land, that leaves the door open for the sick and twisted minds of people such as Tom Christiansen and Sinan Ünür. I wonder if the teacher would even understand their solutions, much less accept them.

Many of the answers went to great pains to avoid certain Perl keywords while still creating loops. Some people debated if map is a loop. Some people used recursion, forgetting that every time you recurse in Perl, God kills a kitten. Maybe someday people will realize that recursion in Perl isn't the same kung fu they see in other languages. Perl actually must recurse because it has no way of knowing if it's going to call the same function definition.

Most of the solutions used Perl's built-in rand, which I think ignores half of the problem, the random numbers themselves. I use rand too, but I replace its definition to use the random.org web service to get lists of random numbers generated from atmospheric noise. Not only that, but I change rand in very sick and twisted ways, adding a list context.
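To give a flavor of what replacing rand looks like, here is a rough sketch that overrides it globally through CORE::GLOBAL and adds a list context. It is not the code from the full post: the second argument (how many numbers to return) is my own invention for the illustration, and CORE::rand stands in for the call to the web service so the example runs without a network.

```perl
use strict;
use warnings;

BEGIN {
    no warnings 'once';

    # Replace rand everywhere. This has to happen at compile time,
    # before perl compiles the calls to rand further down.
    *CORE::GLOBAL::rand = sub {
        my ( $max, $count ) = @_;    # $count is a made-up extension
        $max = 1 unless $max;        # like the built-in, 0 or nothing means 1

        # Scalar context: behave like the built-in.
        return CORE::rand($max) unless wantarray;

        # List context: return a whole list at once. A real version
        # would fetch these from the web service instead.
        $count = 1 unless defined $count;
        return map { CORE::rand($max) } 1 .. $count;
    };
}

my @numbers = rand( 10, 100 );    # list context: 100 numbers below 10
my $single  = rand(10);           # scalar context: one number below 10

print scalar(@numbers), " numbers, and a single $single\n";
```

Whether map counts as a loop is, of course, still up for debate.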

My Programming-Related Todo List

Dave Rolsky posted his Programming-Related Todo List, and here's mine:

  • Make a proper programming-related to-do list
  • Write that new CPAN client I was talking about
  • I've been indexing BackPAN just fine, but I have to figure out a way to make all of the data available to people in a sane way. The plain text uncompressed data for about 140,000 distros is several gigabytes. I have half of a web service written, and even have mycpan.com to host it. I just need the tuits.
  • Get my CPAN stats and trending project going again. It used to be part of The Perl Review, but I just need to set up the cron jobs again and maybe use some prettier charting tools.

There are some modules that I want to write, or fool someone else into writing:

  • Something to give a Perl interface to Mac OS X's mdls, to both get and set attributes, especially file labels.
  • A URL shortener Perl module for goo.gl, but also with the ability to retrieve the stats into a hash.
  • A time duration module that has nothing to do with clocks or calendars. It's the sort of thing that iTunes does with the sum of durations for all audio files. I'd like to collect and partition times with no rollovers. I can't see how to do this with any of the Time-related modules already on CPAN. They are overly concerned with the clocks. Or maybe I'm missing something.
  • A blog aggregator proxy that can recognize the same content showing up in different feeds and filter out the duplicates. Maybe I need a better client, but I mostly like NetNewsWire otherwise.
  • A blogging engine that is more insane than anything Jon Rockway would make. I'm calling it Annoyed Porpoise.
  • Learn a lot more about foreign function interfaces in Perl so I can hook up some C libraries to Perl. I'm particularly curious about doing that with wireshark right now.

There's a lot more writing I want to do (although wanting is not having):

  • Start some mild reorganization and reduction of the perl pods. A lot of stuff has accumulated, so a fresh start on some of it might be good. And, with 150 files in pod/, isn't it time for subdirectories?
  • I'm writing a post a week for The Effective Perler, and I think I might do that again for 2011.
  • Update Learning Perl for Unicode throughout and Perl 5.14
  • Update Intermediate Perl for Perl 5.14, new module best (or good enough) practices, and some (just some) Moose (which Randal has already done, really, with his translation of the Animal chapters to Moose for Linux Magazine).
  • Update Mastering Perl as an eBook, for Devel::NYTProf, new regex hotness, and many other things.
  • Update Learning Perl Student Workbook for a new Learning Perl.
  • A secret project that chromatic and I might do.
  • Some iOS apps (native or otherwise) to put Perl docs, etc., in a lightweight and easily updatable form.
  • Many cool things for The Perl Review.

For conferences, I want to visit lots of new groups. That might mean I skip some of the usual suspects. So far I'm looking at YAPC::Riga and YAPC::Asia. I hear rumors of a YAPC::India too. I'll have to see about South America. For part of this, I need to find new, undiscovered groups of people who deserve a White Camel Award.

About brian d foy

I'm the author of Mastering Perl, and the co-author of Learning Perl (6th Edition), Intermediate Perl, Programming Perl (4th Edition) and Effective Perl Programming (2nd Edition).