fix parsing of debian/copyright files #4708

@tmuehlbacher-bnr

Description

What happened:

Copyright information is parsed poorly for too many Debian packages.
The problem is even acknowledged in a comment in at least one test case:

// note: this should not capture #, Permission, This, see ... however it's not clear how to fix this (this is probably good enough)

What you expected to happen:

Only parse debian/copyright files according to the machine-readable format if they make the claim to be in that format.

https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ specifies a mandatory field called Format: whose value is a link (http or https) to the spec.

Only files that contain this field should be parsed as machine-readable. All other files should instead have their entire content put into the .text.content of a license object.
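
A minimal sketch of the check described above, assuming the parser is written in Go. The helper name isMachineReadable is hypothetical and not taken from the codebase. It only looks at the first stanza of the file and accepts any http or https value for Format; a stricter check could compare against the spec URL itself.

```go
package main

import (
	"fmt"
	"strings"
)

// isMachineReadable is a hypothetical helper (not an existing API).
// It reports whether the first stanza of a debian/copyright file contains
// a Format field whose value is an http or https URL.
func isMachineReadable(content string) bool {
	inHeader := false
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimRight(line, "\r")
		if strings.TrimSpace(line) == "" {
			if inHeader {
				// The header stanza ended without a Format field.
				return false
			}
			continue // skip blank lines before the first stanza
		}
		if strings.HasPrefix(line, "#") {
			continue // comment line
		}
		inHeader = true
		// Continuation lines start with whitespace, so their "name"
		// part never matches "Format".
		name, value, ok := strings.Cut(line, ":")
		if !ok || !strings.EqualFold(name, "Format") {
			continue
		}
		value = strings.TrimSpace(value)
		return strings.HasPrefix(value, "http://") || strings.HasPrefix(value, "https://")
	}
	return false
}

func main() {
	structured := "Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/\n" +
		"Upstream-Name: example\n\n" +
		"Files: *\nCopyright: 2024 Example Author\nLicense: MIT\n"
	freeform := "This package was debianized by an example maintainer.\n\n" +
		"Permission is hereby granted, see the upstream sources.\n"

	fmt.Println(isMachineReadable(structured)) // true
	fmt.Println(isMachineReadable(freeform))   // false
}
```

With a check like this, only the first input would go through the machine-readable parser; the second would be stored verbatim as the license text.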

Steps to reproduce the issue:

It's part of a test case in the repo.

Labels: bug, good-first-issue
