The link field is not well specified, leading to several issues.

Semantic
What does link mean?
- Is it the link to the official product page on the producer’s website?
- Is it a link to the producer’s website homepage?
- Can it be another link?
The documentation says “Link to the product page on the official site of the producer”. In practice, however, a majority of the addresses point to the producer’s site in general rather than to a product page.
As of today (2025-04-04):
- a link value has been entered for 73,500+ products
- 37,500+ (51%) are not related to a product page, but to the producer’s website (I used the following regexp to catch them: ^(https?)?(www\.)?(.*)?(\.)([^\/]*)?\/?$)
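For anyone who wants to reproduce this kind of count without a regexp, here is a minimal sketch that parses each value instead. It is not the query used for the figures above, so its counts may differ; the function name points_to_site_root is made up for illustration, and treating a value without a protocol as https is an assumption.

```python
from urllib.parse import urlsplit


def points_to_site_root(link: str) -> bool:
    """Return True when the link has no path beyond "/", no query and no fragment."""
    link = link.strip()
    # urlsplit only recognises the host when a scheme is present, so bare
    # values such as "www.example.com" get a placeholder scheme (assumption).
    if "://" not in link:
        link = "https://" + link
    parts = urlsplit(link)
    return (
        bool(parts.netloc)
        and parts.path in ("", "/")
        and not parts.query
        and not parts.fragment
    )


if __name__ == "__main__":
    for value in ["www.example.com", "https://example.com/products/foo.html"]:
        print(value, points_to_site_root(value))
```

Parsing the address avoids having to decide by hand which characters may appear in a path, which is the fragile part of a regexp approach.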
What do we really want?
1. any link if it’s related to the food manufacturer
2. the official product page
3. the home page of the producer
4. the customer service page
5. another link (please explain)
I would say that option 3, the home page of the producer, could be the rule, as it’s the most stable address (and it already accounts for 51% of the values). But we could also accept other addresses linking to the producer’s website. We could regularly check the addresses for 404 errors or scam sites.
Format
As of today, 52,300+ (71%) begin with “http”.
- https://example.com/ should be ok as it’s the complete protocol + address
- when the address is the root of the website, the trailing slash might be automatically removed to improve comparison or aggregation
- is www.example.com ok?
A “data quality error” facet should list bad links, e.g. gttp://bad-protocol.net.
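To make the idea concrete, here is a rough sketch of a check that could feed such a facet. The function name, the issue labels and the rule that a host must contain a dot are all assumptions for illustration, not existing Open Food Facts code or tags.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumption: only web links are acceptable in this field.
ALLOWED_SCHEMES = {"http", "https"}


def link_quality_issue(link: str) -> Optional[str]:
    """Return a short issue label for a suspicious link, or None if it looks fine."""
    link = link.strip()
    if not link:
        return "empty"
    if "://" not in link:
        return "missing-protocol"
    parts = urlsplit(link)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        return "bad-protocol"
    # Hypothetical rule: a public website host contains at least one dot.
    if "." not in (parts.hostname or ""):
        return "bad-host"
    return None


if __name__ == "__main__":
    for value in ["gttp://bad-protocol.net", "www.example.com", "https://example.com/"]:
        print(value, link_quality_issue(value))
```

Whether www.example.com should count as an error or simply be normalized is exactly the open question above, so the "missing-protocol" label could just as well be a warning.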
Should we create a link_tags field holding a normalized version of the link field?
- it would contain https://www.example.com instead of www.example.com
- it would contain https://www.example.com instead of https://www.example.com/
- etc.
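The two rules listed for link_tags could look like the sketch below. The name normalize_link is hypothetical, and defaulting to https when the protocol is missing is an assumption (some producer sites may only answer on http). Lowercasing the protocol and host is an extra step I added, since both are case-insensitive.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_link(link: str) -> str:
    """Normalize a raw link value for comparison and aggregation."""
    link = link.strip()
    # Rule 1: add a protocol when it is missing (assumption: https).
    if "://" not in link:
        link = "https://" + link
    parts = urlsplit(link)
    # Rule 2: drop the trailing slash only when the address is the site root.
    path = "" if parts.path == "/" else parts.path
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(normalize_link("www.example.com"))
    print(normalize_link("https://www.example.com/"))
```

Paths other than the root are left untouched, because /a and /a/ can be different pages on some servers.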
Name
Isn’t the database field name link too confusing? Shouldn’t it be producer_link?
How should it work?
I think that Open Food Facts should not endorse these links. On Wikipedia, external links carry the HTML attribute rel="nofollow" to tell search engines that Wikipedia does not endorse them. I think we should do the same.
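As an illustration only, this is the kind of markup I have in mind. The helper render_external_link is hypothetical and is not how the Open Food Facts templates are written; it only shows the attribute on an escaped link.

```python
from html import escape


def render_external_link(url: str, label: str) -> str:
    """Render an external link that search engines should not treat as endorsed."""
    # Escape both values so a user-entered link cannot break out of the attribute.
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True), escape(label)
    )


if __name__ == "__main__":
    print(render_external_link("https://www.example.com", "Producer website"))
```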
Any more ideas about this field?