Improving the CPAN experience (a GSoC summer tale)
What will MetaCPAN offer that other services don’t?
- Instant availability (new uploads are indexed within a minute)
- Personalisation - “follow your favourites”
- Searchable metadata
- Mashup of other CPAN-related services
- Unified (REST) API
- Back-end for Android/iPhone apps, command line tools etc.
- MetaCPAN.local for companies
- Includes BackPAN as well
- Open-source and free
Apply for GSoC to get this thing up and running
MetaCPAN is being developed by a group of Perl coders who have jobs and all kinds of other things on their minds, which makes it hard to build momentum. I was quickly hooked on the idea of having an API to CPAN that everyone could use and a front-end that could eventually replace search.cpan.org. So I joined the MetaCPAN group and started coding. And since I’m still a student, GSoC is a great opportunity to delve even deeper into the guts of MetaCPAN and do some serious work.
Community feedback to complete proposal
To finish my GSoC application I want to collect as much input as possible from the community. I compiled a list of features that I feel are nice to have and will improve the experience with CPAN, though not all of them may be feasible or even desirable.
My application will consist of two subprojects: improving the back-end and writing a state-of-the-art front-end. While search.metacpan.org is nice, it doesn’t offer any functionality beyond what search.cpan.org already has. I’d like to change that and leverage the power of MetaCPAN.
- Follow your favourite modules / authors
- Get instant notifications on updates, with a diff of the Changes file
- Add discussions to modules
- Tag modules as installed, broken, author unresponsive etc.
- Add metadata to your own distribution (e.g. “Looking for maintainer”, deprecated etc.)
- “CPAN of trust”
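To make the "diff of the Changes file" idea concrete, here is a minimal sketch in Python (used only for illustration; it is not how MetaCPAN does or will implement notifications). It assumes the notifier already has the text of the Changes file from the previous and the new release. The function name `changes_diff` and the sample release data are hypothetical.

```python
import difflib


def changes_diff(old_text: str, new_text: str, old_label: str, new_label: str) -> str:
    """Return a unified diff between two versions of a Changes file."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_label,
        tofile=new_label,
        n=1,  # one line of context is enough for a notification
    )
    return "".join(diff)


# Hypothetical Changes files for two releases of a made-up distribution.
old_changes = "0.01  2011-03-01\n  - Initial release\n"
new_changes = "0.02  2011-03-20\n  - Fix a typo in the docs\n\n" + old_changes

if __name__ == "__main__":
    print(changes_diff(old_changes, new_changes, "Changes (0.01)", "Changes (0.02)"))
```

A notification could then carry only the added lines, which for most distributions is exactly the entry describing the new release.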
Improved search results
Currently search.cpan.org does a decent job of searching. However, it can be improved. For example, it doesn’t show previews of the search results, and the relevance of the returned results is sometimes questionable.
Evaluate third-party data
The following resources can be used to adjust the scoring of search results:
- CPAN Testers
- CPAN Ratings
Using the dependency chain, one can create a graph of modules and calculate a PageRank for each module. This should noticeably improve search results, since modules that many other modules depend on (those with a high degree of centrality) will be ranked higher.
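The PageRank idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is a plain dictionary mapping each module to the modules it depends on, the module names are made up, and the damping factor of 0.85 is the conventional default rather than a value chosen for CPAN.

```python
def pagerank(depends_on, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dependency graph.

    depends_on maps each module to the list of modules it depends on.
    An edge A -> B ("A depends on B") passes rank from A to B.
    """
    modules = set(depends_on)
    for deps in depends_on.values():
        modules.update(deps)
    modules = sorted(modules)
    n = len(modules)
    rank = {m: 1.0 / n for m in modules}

    for _ in range(iterations):
        # Rank held by modules without dependencies is spread evenly over all modules.
        dangling = sum(rank[m] for m in modules if not depends_on.get(m))
        new_rank = {m: (1.0 - damping) / n + damping * dangling / n for m in modules}
        for m in modules:
            deps = depends_on.get(m, [])
            for dep in deps:
                new_rank[dep] += damping * rank[m] / len(deps)
        rank = new_rank
    return rank


# Hypothetical dependency graph; the module names are invented for this example.
graph = {
    "My::App": ["My::Web", "My::Util"],
    "My::Web": ["My::Util", "My::Logger"],
    "My::Logger": ["My::Util"],
    "My::Util": [],
}

if __name__ == "__main__":
    for module, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(f"{module:12s} {score:.3f}")
```

In this toy graph `My::Util`, which every other module depends on, ends up with the highest rank, while `My::App`, which nothing depends on, ends up with the lowest. A search back-end could combine such a score with text relevance when ordering results.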
- A full-text search that previews the relevant segments of the document
- Optionally limit search to a release / distribution
- Search for exact matches in the module name (autocompletion)
- Search for authors based on email, name and pauseid
- Exclude results with certain dependencies (e.g. modules using Moose or XS code)
- Keyboard navigation and shortcuts for super fast and mouse-less browsing
- Integrate grep.cpan.me
- Rate distributions from inside the new front-end, with no need to leave the page and log in again
- and many many more features
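As an illustration of the first item in this list, a search preview can be as simple as cutting an excerpt around the first match. The following Python sketch assumes the document text is already available as a string; the function name `preview`, the bracket highlighting and the default `width` are all hypothetical choices, and a real search engine would normally provide its own highlighting.

```python
import re


def preview(text, query, width=30):
    """Return a short excerpt around the first match of query, or None if there is no match."""
    match = re.search(re.escape(query), text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(match.start() - width, 0)
    end = min(match.end() + width, len(text))
    # Mark the matched text with brackets so a front-end could highlight it.
    excerpt = text[start:match.start()] + "[" + match.group(0) + "]" + text[match.end():end]
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + excerpt + suffix


if __name__ == "__main__":
    pod = "This module parses JSON documents quickly."
    print(preview(pod, "json", width=8))
```

The match is case-insensitive and the query is escaped, so characters such as `::` or `+` in module names are treated literally.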
MetaCPAN for companies
minicpan has made it easy for companies to take control of their local CPAN requirements, but they can search neither their local minicpan nor their own internal code.
- Will be a distribution that can be installed in your company network
- With all the features of MetaCPAN
- Add internal company modules to the index
- Either index the company’s minicpan or fall back to the live CPAN
- Every front-end developed for MetaCPAN will just work for MetaCPAN.local too
Nobody is going to use the MetaCPAN back-end if there is no documentation that guides you through the basic steps of querying the metabase or setting up your own front-end.
I’m very excited to hear your ideas. Please don’t think too much about implementation details. Let the developer in you rest for a moment and ask yourself:
- What do I need to access CPAN more easily?
- What information do I want to access through MetaCPAN?
- What data is required to further improve tools like cpanm?
- What am I missing from search.cpan.org?
- Basically, what can MetaCPAN and its front-end do for you?