Typo Squatting and Packagist

Earlier this month an article was published summarizing Nikolai Philipp Tschacher's thesis about typosquatting. In short, typosquatting is a way to attack users of a package manager by registering a package with a name similar to that of a popular package, hoping that someone will mistype the name and end up installing the attacker's version, which contains malware.

The thesis mentions https://packagist.org as a good example because we use vendor namespaces:

[...] it is much more secure, if a package is named ntschacher/GoogleScraper instead of just GoogleScraper. The reason is: If the package name is misspelled and not the author name, this will not have any consequences, because the typo version cannot be registered in this namespace, since this author name is already reserved. [...] Because package names are much longer with two attributes, it is more likely that users will copy and paste the package name instead of remembering it.

Despite this mitigating fact, it is still technically possible to squat the vendor name, so I wanted to take a look at our repository data and see if I could spot any bad actors. I wrote a script that basically does the following:

  • Read the list of all vendor names which have packages with at least 1000 downloads, as the others are unlikely targets or at least low value targets.
  • Check the Levenshtein distance of every vendor name against all others.
  • If the distance is 1, check the package names within those two vendors to see if any of them intersect. Those pairs are then candidates for being typosquatters.
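The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the script that was actually run: the shape of the input data (a mapping from vendor name to its packages and their download counts), the function names, and the sample numbers are all assumptions made for the example. It also assumes that "all others" means the other vendors that passed the download filter.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def find_candidates(vendors, min_downloads=1000):
    """vendors: {vendor name: {package name: downloads}} (hypothetical shape)."""
    # Step 1: keep vendors that have a package with enough downloads.
    popular = sorted(
        name for name, packages in vendors.items()
        if any(count >= min_downloads for count in packages.values())
    )
    candidates = []
    # Step 2: compare every remaining vendor name against all others.
    for first, second in combinations(popular, 2):
        if levenshtein(first, second) != 1:
            continue
        # Step 3: look for package names that exist under both vendors.
        shared = sorted(set(vendors[first]) & set(vendors[second]))
        if shared:
            candidates.append((first, second, shared))
    return candidates


# Made-up sample data, for illustration only.
sample = {
    "monolog": {"monolog": 5_000_000},
    "momolog": {"monolog": 1_200},
    "symfony": {"console": 9_000_000},
}
print(find_candidates(sample))
```

Comparing every pair is quadratic in the number of vendors, which is fine for a one-off audit over a filtered list but would need more thought for anything run continuously.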

What did I find? 21 vendor pairs that conflict to some degree. Only one looked like an actual typosquatting attempt, momolog/monolog, and its package description even stated that it was a demonstration of typosquatting. I deleted it along with 5 other packages that were useless, but the rest are still in place. Most of the conflicts are due to people renaming their vendors, or to people who simply picked similar names but don't seem to be abusing anything.

In the future it would be nice to automate this, or to prevent the creation of vendors whose names are too similar to popular ones. However, it is reassuring to see that there is no widespread abuse going on.
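As a rough idea of what such a check could look like at registration time, here is a hypothetical Python sketch. Nothing like this is described as existing in Packagist; the function names and the list of popular vendors are assumptions made for illustration. Since only a distance of exactly 1 matters here, it uses a direct single-edit comparison instead of computing a full edit distance.

```python
def is_one_edit_away(a: str, b: str) -> bool:
    """True when a and b differ by exactly one insertion, deletion or substitution."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    # Skip the common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # Substitution: everything after the first mismatch must match.
        return a[i + 1:] == b[i + 1:]
    # Insertion: the rest of a must match b once one character of b is skipped.
    return a[i:] == b[i + 1:]


def similar_popular_vendors(new_vendor, popular_vendors):
    """Return the popular vendor names that a new vendor name could be mistaken for."""
    return sorted(v for v in popular_vendors if is_one_edit_away(new_vendor, v))


# Hypothetical usage: flag the registration for manual review if anything matches.
print(similar_popular_vendors("momolog", ["symfony", "monolog", "doctrine"]))
```

Flagging for review rather than rejecting outright would leave room for the legitimate cases mentioned above, such as people who simply picked similar names.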

June 29, 2016 // PHP

Comments

2016-06-29 21:07:49


Maybe it will catch on now that we all get the idea :)

2016-06-30 01:35:26


Thanks for keeping on top of this!

Would usage of signed packages somehow help with this?

2016-06-30 08:26:05


@gggeek I don't think signed packages would help very much, unless we made it mandatory to add the signatures of each vendor by hand, but that would be a huge UX hurdle. If signatures are automatic then they wouldn't help here, because you would just get a perfectly signed package, but it would be the wrong one.

2016-09-05 09:11:04


Isn't it still possible to impersonate a vendor? I could create a GitHub repo with a composer.json containing { "name": "symfony/get-rich-now" }, then submit it to Packagist as symfony/get-rich-now. I think there would be enough people browsing the package database and giving it a try. How can Packagist prevent this kind of vendor namespace mimicry?

2016-09-05 09:23:36


@Alex: Nope, you can't, because the symfony vendor exists already and we prevent new people from submitting packages to existing vendors.

2016-09-05 10:00:07


I see, good to know. So this means that the vendor prefix is implicitly bound to and reserved for the user account that uses it first?

2016-09-05 10:20:32


That is correct.

2016-10-09 13:21:20


Cool.. didn't know that levenshtein distance was already a PHP function.