Jordi Boggiano

Jordi Boggiano Passionate web developer, specialized in web performance and php. Partner at Nelmio, information junkie and speaker.

Common files in PHP packages

This one started in a peculiar way. Paul M. Jones announced a new version of his Producer tool, I had a look at it and saw that it recommended having a changelog called CHANGES.md by default. This irked me a bit because I always use CHANGELOG.md and hardly ever see CHANGES.md as a file name (it's the little things that matter, right?).

My first thought was to report an issue asking to change the default, but then I thought it's Paul, he will not just take my word for it, he will want hard facts. So here I am two days later. I queried GitHub's API for the file listing (only the root directory) of all PHP packages listed on packagist.org.
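For the curious, here is a minimal Python sketch of what fetching one root listing could look like. It assumes the GitHub REST "contents" endpoint (`GET /repos/{owner}/{repo}/contents/`), which returns a JSON array of entries with `name` and `type` fields; the post does not say which endpoint or client was actually used. The helper names and the sample payload are made up for illustration, and no network call is made here.

```python
import json

API_ROOT = "https://api.github.com"


def contents_url(owner, repo):
    """Build the URL of a repository's root directory listing (assumed endpoint)."""
    return f"{API_ROOT}/repos/{owner}/{repo}/contents/"


def split_listing(payload):
    """Split a contents response (JSON text) into sorted (dirs, files) name lists."""
    entries = json.loads(payload)
    # Directories get a trailing slash so they cannot be confused with files.
    dirs = sorted(e["name"] + "/" for e in entries if e["type"] == "dir")
    files = sorted(e["name"] for e in entries if e["type"] != "dir")
    return dirs, files


# Hand-written sample payload, not real API output.
sample = json.dumps([
    {"name": "src", "type": "dir"},
    {"name": "composer.json", "type": "file"},
    {"name": "LICENSE", "type": "file"},
])
dirs, files = split_listing(sample)
```

Only the root directory is listed this way, which matches the scope described above: nested paths such as Resources/doc would not show up.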

Show me the data!

This lets me look at which files are commonly present (and which are not), which gives an interesting picture of the whole ecosystem.
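Once the listings are on disk, the stats boil down to counting how many packages contain each name. A rough sketch, assuming the data is held as a mapping from package name to its root entries (the package names and listings below are invented, and this is not necessarily how the numbers in this post were computed):

```python
from collections import Counter


def entry_share(listings):
    """Return {entry name: percentage of packages whose root contains it}."""
    total = len(listings)
    if total == 0:
        return {}
    counts = Counter()
    for entries in listings.values():
        # A set makes sure each name is counted at most once per package.
        counts.update(set(entries))
    return {name: 100.0 * n / total for name, n in counts.items()}


# Invented sample data: package name => root directory entries.
listings = {
    "acme/one": ["src/", "LICENSE", "composer.json"],
    "acme/two": ["src/", "composer.json", "CHANGELOG.md"],
    "acme/three": ["lib/", "composer.json"],
    "acme/four": ["composer.json", "LICENSE"],
}
shares = entry_share(listings)
```

With this sample, `shares["src/"]` is 50.0 and `shares["lib/"]` is 25.0, the same kind of figure as the percentages quoted in this post.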

In total, this includes file listings from 78'992 packages (no GitHub API was harmed in the making of this blog post though). And here are a few interesting things that surfaced:

Common Directories

  • 58% of packages include a src/ directory and 5% a lib/ one. That's surprisingly low to me, that means a lot have the code simply in the root folder.
  • 8% have a DependencyInjection/ directory, which I believe indicates Symfony bundles, that's 6780 of them.
  • 4% have a bin/ directory, including some sort of CLI executables.
  • 3.6% have an examples/ and 3.5% a docs/ directory, not a whole lot of extensive out-of-README documentation out there it seems. Definitely something that could be improved.

Common Files

  • 55% have a LICENSE file, that's.. pretty disastrous but hopefully a lot of those that don't at least indicate the license in the README and composer.json
  • 49% have some file or directory indicating the presence of tests (phpunit.xml & co). I am not sure if this is good or bad news to be honest, that depends on your expectations.
  • 35% show a presence of a CI system running their tests (.travis.yml & co)
  • 14% have committed their composer.lock. As I have said in the past, for libraries it is not really necessary to commit it, and it seems most prefer not to. I hope you commit it in your private projects though!
  • 9% have a CHANGELOG, and that is composed of 8.5% CHANGELOG and 0.5% CHANGES, so there goes my answer for Paul ;)
  • 8% show a presence of some code quality/style CI (scrutinizer, codeclimate, styleci). That's not a lot but some might be running those tools as part of their regular CI so the numbers are not necessarily valid.
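Several of the figures above lump different file names together (CHANGELOG and CHANGES, for example). A minimal sketch of how such grouping could be done, assuming simple regular expressions on the root entry names. The patterns below are hypothetical and much narrower than the "& co" groupings used for the numbers above:

```python
import re

# Hypothetical patterns, for illustration only.
GROUPS = {
    "changelog": re.compile(r"^(changelog|changes)(\.\w+)?$", re.IGNORECASE),
    "license": re.compile(r"^licen[cs]e(\.\w+)?$", re.IGNORECASE),
    "ci": re.compile(r"^\.travis\.yml$"),
}


def groups_for(entries):
    """Return the set of group names matched by at least one root entry."""
    return {
        group
        for group, pattern in GROUPS.items()
        if any(pattern.match(entry) for entry in entries)
    }


def group_share(listings):
    """Return {group: percentage of packages with at least one matching entry}."""
    total = len(listings)
    if total == 0:
        return {}
    counts = {group: 0 for group in GROUPS}
    for entries in listings.values():
        for group in groups_for(entries):
            counts[group] += 1
    return {group: 100.0 * n / total for group, n in counts.items()}


# Invented sample data: package name => root directory entries.
sample = {
    "acme/one": ["CHANGELOG.md", "LICENSE", ".travis.yml"],
    "acme/two": ["CHANGES.md", "composer.json"],
    "acme/three": ["composer.json"],
    "acme/four": ["LICENSE.txt", "composer.json"],
}
share = group_share(sample)
```

Counting per group rather than per file name means a package with both a CHANGELOG.md and a CHANGES.md is only counted once.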

If you would like to access the full data to look at other numbers, you can get a readable version of the top 100 dirs and top 100 files plus a file containing the whole data set with file name => package counts.

April 21, 2016 // PHP

Comments

2016-04-21 16:06:23

Tomáš Votruba

Thanks for summing up. How long did it take you to analyze the data?

I'd love to read more articles like this. Something like SocialBakers for Packagist :)

2016-04-21 16:28:27

Christophe Coevoet

> 58% of packages include a src/ directory and 5% a lib/ one. That's surprisingly low to me, that means a lot have the code simply in the root folder.

This does not seem that weird to me: any package written in the target-dir times needed to do that. And the official recommendation for Symfony bundles still advocates this, even though PSR-4 is now a thing.

> 3.6% have an examples/ and 3.5% a docs/ directory, not a whole lot of extensive out-of-README documentation out there it seems. Definitely something that could be improved.

The official Symfony recommendation for bundles is Resources/doc (due to the previous point), so this might be a bit better in reality. Have you also counted the singular doc/? But I agree it will stay quite low.

> 55% have a LICENSE file, that's.. pretty disastrous but hopefully a lot of those that don't at least indicate the license in the README and composer.json

The old Symfony recommendation about Resources/meta/LICENSE might give a few more percents, but this is indeed quite bad.
Do you have a stat on what percentage of packagist packages have the license configured in the composer.json, to compare? This should be easy to extract from the database.
However, given the state of open-source packages on github 1 year ago, 55% is not the worst (it would be great if github could provide the updated metrics): https://github.com/blog/1964-open-source-license-usage-on-github-com

> 8% show a presence of some code quality/style CI (scrutinizer, codeclimate, styleci). That's not a lot but some might be running those tools as part of their regular CI so the numbers are not necessarily valid.

Some tools (for instance Scrutinizer and SensiolabsInsight) can also be used without having a config file in the project root.

Last modification : 2016-04-21 - 16:54:59

2016-04-21 16:46:42

Seldaek

Tomáš: It didn't take too long, just quite some time to make 80K calls to the GitHub API, but now that I have the data on disk I can run stats over it in a few minutes.