Databases are generally protected by copyright law as compilations. Under the Copyright Act, a compilation is defined as a "collection and assembling of preexisting materials or of data that are selected in such a way that the resulting work as a whole constitutes an original work of authorship." 17 U.S.C. § 101. The preexisting materials or data may be protected by copyright, or may be unprotectable facts or ideas (see the BitLaw discussion on unprotected ideas for more information).
(I did not use AI, but this appeared at the top of my search and I think the search engine used AI to generate it):
In the European Union, databases are protected under the Database Directive, which provides legal protection based on the originality of the selection or arrangement of their contents... Some countries offer additional protections for databases that do not meet the originality requirement, often through sui generis rights.
That means the organization and selection of data are copyrightable, but only if they are creative. If you write your own tags for the codes and make a compilation of them all, none of that will cover your database.
Also, I think the BitLaw interpretation is incorrect. “Sweat of the brow” doesn’t magically produce copyright protection, and they don’t mention that.
Taking their example, if you had a collection of quotes from presidents, and I got a bunch of similar collections, then made my own ultimate definitive collection based partially on your list, then there’s very little chance I’d be liable for violating your copyright. If I copied the list and typesetting verbatim, you’d have a better case.
Also, modern rulings about LLM training (the topic of this thread) certainly mean copyrights on compilations of facts don’t survive training + inference cycles.
That was changed.
https://www.bitlaw.com/copyright/database.html