Post on 11-Jan-2015
CPANTS
Kwalitative website and its tools
Kenichi Ishigaki (charsbar)
@YAPC::EU 2012 August 22, 2012
Kenichi Ishigaki (charsbar)
From Shibuya.pm, Tokyo, Japan.
Freelancer
- Perl programmer - Writer/Translator
Around 40 CPAN distributions
DBD::SQLite
Acme::CPANAuthors
We have been enjoying the
CPANTS game since 2005.
"Shine! The All-Japan Strongest CPAN Author Championship" (輝け!全日本最強 CPAN Author 決定選手権)
by Koichi Taniguchi
http://blog.livedoor.jp/nipotan/archives/16108466.html
He picked out the Japanese authors
by eye.
Our names are easy to find.
There were not so many authors.
- Total: ~4000
- Japanese: ~50
YAPC::Asia increased the number of
Japanese authors.
YAPC::Asia / Japanese authors
2006 (Mar)   98
2007 (Apr)  154
2008 (May)  191
2009 (Sep)  228
2010 (Oct)  255
2011 (Oct)  270
I needed something to pick
out Japanese authors more
easily.
That's why I created a list of Japanese authors
and a script to maintain it.
I've been reporting the
Japanese top 10 authors since
2008.
I've been adding something new
every year.
2008: sum of the kwalitee scores
per author
2009: authors who released
most in the year
2010: authors/population ratio
2011: launched a website (finally)
acme.cpanauthors.org
It had one big problem.
No data.
The official CPANTS site had
been down for some time.
I needed to set up mine.
I created a private repository and put everything
into it.
Merged recent commits from
domm's repository.
Added a few columns.
Tweaked Catalyst/DBIC
stuff.
It worked.
Warnings were left.
I needed to find some tuits to remove them.
Perl QA Hackathon
Warnings were removed.
Ported some of the changes I had made locally to daxim's
repository.
Showed a new acme.cpanauthors.org
featuring CPANTS info.
Unfortunately, the porting took too much time.
I didn't merge the changes back to my repository.
OSDC.TW
I finally merged the changes.
Got several reports that CPANTS was
broken.
What broke CPANTS was a small change.
"modules" : [
  {
    "file" : "lib/Path/Extended.pm",
    "in_basedir" : 0,
    "in_lib" : 1,
    "module" : "Path::Extended",
    "uses" : {
      "Sub::Install" : 1,
      "strict" : 1,
      "warnings" : 1
    }
  }
]
I don't think this change is bad.
Module::CPANTS::ProcessCPAN shouldn't have died from this.
It should have had tests.
It should have run faster.
It should have been easier to fix the
analysis.
Enough issues for a summer.
What should we do?
- We need tests.
- We need to find test cases.
- We need to do it many times.
Making it run faster is the first priority.
I wrote a bare-bones script to store data in
parallel.
JSON
create table if not exists analysis (
  id integer primary key autoincrement,
  path text unique,
  distv text,
  author text,
  json text,
  duration integer
);
Raw SQL statements
Parallel::ForkManager
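A minimal sketch of how these pieces might fit together, assuming DBI, DBD::SQLite, JSON::PP and Parallel::ForkManager are installed. The analyze() stub, the connect_db() helper, the second sample path, the temporary database file and the worker count are all hypothetical; none of this is taken from the actual CPANTS script.

```perl
use strict;
use warnings;
use DBI;
use JSON::PP;
use File::Temp qw(tempdir);
use Parallel::ForkManager;
use Time::HiRes ();

# Hypothetical database location; the real script's layout is not shown in the talk.
my $dir    = tempdir(CLEANUP => 1);
my $dbfile = "$dir/analysis.db";

# Open a handle with a busy timeout so concurrent writers wait for the lock.
sub connect_db {
    my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile", "", "",
        { RaiseError => 1, AutoCommit => 1 });
    $dbh->sqlite_busy_timeout(10_000);
    return $dbh;
}

# Stand-in for the real analysis step (assumption): parses the path only.
sub analyze {
    my ($path) = @_;
    my ($distv)  = $path =~ m{([^/]+)\.tar\.gz$};
    my ($author) = $path =~ m{^./../([^/]+)/};
    return { distv => $distv, author => $author, kwalitee => {} };
}

# Create the table once in the parent, and close the handle before forking.
{
    my $dbh = connect_db();
    $dbh->do(<<'SQL');
create table if not exists analysis (
  id integer primary key autoincrement,
  path text unique,
  distv text,
  author text,
  json text,
  duration integer
)
SQL
    $dbh->disconnect;
}

# The second path is a made-up example.
my @paths = qw(
    I/IS/ISHIGAKI/WorePAN-0.01.tar.gz
    A/AU/AUTHOR/Example-Dist-0.02.tar.gz
);

my $pm = Parallel::ForkManager->new(2);    # worker count is arbitrary here
for my $path (@paths) {
    $pm->start and next;                   # parent moves on to the next path
    my $dbh   = connect_db();              # each child opens its own handle
    my $start = Time::HiRes::time();
    my $info  = analyze($path);
    $dbh->do(
        "insert or replace into analysis (path, distv, author, json, duration)
         values (?, ?, ?, ?, ?)",
        undef, $path, $info->{distv}, $info->{author},
        JSON::PP->new->canonical->encode($info),
        int(Time::HiRes::time() - $start),
    );
    $dbh->disconnect;
    $pm->finish;                           # child exits here
}
$pm->wait_all_children;

my $dbh = connect_db();
my ($count) = $dbh->selectrow_array("select count(*) from analysis");
print "stored $count rows\n";
```

Keeping the whole analysis result as one JSON text column means the table needs no schema change when the analysis output gains a key, which is presumably part of the appeal of raw SQL over an ORM here.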
SQLite queue
Beware a race condition
my ($id) = $dbh->selectrow_array("
  SELECT id FROM queue WHERE status = 0 LIMIT 1
");
# Another process may select the same id before this UPDATE runs.
$dbh->do("
  UPDATE queue SET status = 1 WHERE id = ?
", undef, $id);
sqlite_update_hook
my $id;
$dbh->sqlite_update_hook(sub {
  (undef, undef, undef, $id) = @_;
});
$dbh->do("
  UPDATE queue SET status = 1
  WHERE id IN (
    SELECT id FROM queue WHERE status = 0 LIMIT 1
  )
");
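The two slides above can be combined into one small helper. This is only a sketch: it assumes DBI and DBD::SQLite are installed and that queue.id is declared as integer primary key, so the rowid passed to the hook is the id. The table layout, the sample paths and the claim_next() name are hypothetical.

```perl
use strict;
use warnings;
use DBI;

# In-memory database for illustration; the queue layout is an assumption.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
    { RaiseError => 1, AutoCommit => 1 });
$dbh->do("create table queue (
    id integer primary key, path text, status integer default 0)");
$dbh->do("insert into queue (path) values (?)", undef, $_)
    for qw(a.tar.gz b.tar.gz);

# Claim one pending row with a single UPDATE and return its id,
# or undef when nothing is pending.
sub claim_next {
    my ($dbh) = @_;
    my $claimed;
    # The hook is called with (action code, database, table, rowid).
    $dbh->sqlite_update_hook(sub { $claimed = $_[3] });
    $dbh->do("
        UPDATE queue SET status = 1
        WHERE id IN (SELECT id FROM queue WHERE status = 0 LIMIT 1)
    ");
    return $claimed;
}

my @claimed;
while (defined(my $id = claim_next($dbh))) {
    push @claimed, $id;
}
print "claimed: @claimed\n";
```

Because the selection and the status change happen in one statement, there is no gap in which a second worker could pick the same row, unlike the SELECT-then-UPDATE version.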
Archive::Any::Lite
Archive::Any::Plugin::Bzip2
WorePAN
- Bundling is bad
- We need a specific version
- Derived from OrePAN
use WorePAN;
my $worepan = WorePAN->new(
  root => 'path/to/a/directory/',
  files => [qw(
    I/IS/ISHIGAKI/WorePAN-0.01.tar.gz
  )],
  use_backpan => 1,
  no_network => 0,
  cleanup => 1,
);

use WorePAN;
my $worepan = WorePAN->new(
  root => 'path/to/a/directory/',
  files => [qw(
    I/IS/ISHIGAKI/WorePAN-0.01.tar.gz
  )],
  local_mirror => '/home/ishigaki/minicpan/',
  no_network => 1,
  cleanup => 1,
);

use WorePAN;
my $worepan = WorePAN->new(
  root => 'path/to/a/directory/',
  dists => {
    'Catalyst-Runtime' => 5.9,
    'DBIx-Class' => 0,
  },
  cleanup => 1,
);
Bonus features
my $worepan = WorePAN->new(
  root => 'path/to/a/CPAN/mirror/',
  cleanup => 0,
);
my $authors = $worepan->authors;
my $modules = $worepan->modules;
my $files = $worepan->files;
my $dists = $worepan->latest_distributions;
$worepan->add_files(qw{
  /path/to/a/local/distribution-0.01.tar.gz
});
$worepan->update_indices;
Now we have enough tools.
Processing time has decreased significantly.
What's next?
::Site refactoring
I'm preparing the data now.
Creating more databases/tables.
Merging information from external sources.
- CPAN indices
- CPAN uploads database
Calculating scores on prerequisite
modules.
It will be this year's something new in my annual
report.
And then, I'll move on to fixing
the metrics.
Some of them are badly broken.
"versions" : {
  "lib/Data/Phrasebook.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Debug.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Generic.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Loader.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Loader/Base.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Loader/Text.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/Plain.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/SQL.pm" : "use vars qw($VERSION);\n",
  "lib/Data/Phrasebook/SQL/Query.pm" : "use vars qw($VERSION);\n"
},
Error is not a stash.
"error" : {
  "easily_repackageable" : "easily_repackageable_by_fedora",
  "easily_repackageable_by_fedora" : "fits_fedora_license",
  "metayml_conforms_spec_current" : [
    "1.4",
    "Expected a map structure from data string or file. [Validation: 1.4]"
  ],
  "metayml_conforms_to_known_spec" : [
    "1.0",
    "Expected a map structure from data string or file. [Validation: 1.0]"
  ],
  "no_pod_errors" : " home cpants tmp analyze 11442 8001be43fb65..."
}
Should have initialize/finalize phases.
Module::CPANTS::Kwalitee::Distros
doesn't clean up after mirrored Debian CPANTS file
https://rt.cpan.org/Ticket/Display.html?id=51514
There is much more to do.
- JSON API for metacpan.org and so on
- Email reporting like CPAN Testers
- Evaluate new Kwalitee indicators
- New metrics like portable filenames
- Blog about recent trends
- More comprehensive tests
- Analysis per perl version/architecture
- Cover Perl::Critic, CPAN::Critic::Module::Abstract
- 35 RT tickets and several GitHub issues
Resources
github.com/charsbar/www-cpants
github.com/charsbar/worepan
github.com/daxim/Module-CPANTS-Analyse
Questions?
Thank you