API yields illegal XML

Hi - I am writing an Android application that includes a feature to look up product info by scanning EAN barcodes. I tried to use the OpenFoodFacts API for this, but the XML it returns is rejected by my parser.

This is easy to reproduce in any modern browser. For example, requesting “https://world.openfoodfacts.org/api/v2/product/9001508014026.xml”
(the code is that of a beloved Austrian sweet) yields a browser page stating:

This page contains the following errors:

    error on line 150 at column 27: Namespace prefix en on maybe-vegetarian is not defined
    error on line 152 at column 20: Namespace prefix en on non-vegan is not defined
    error on line 154 at column 19: Namespace prefix en on palm-oil is not defined
    error on line 340 at column 8: StartTag: invalid element name

My application’s parser yields the exact same error messages, so it really is the produced XML that is buggy. Whom can I contact about this issue, or where can I report it?

Hi @mmo,

Thanks for reporting.

My best advice would be: use JSON instead of XML.

We have XML support, but it’s a bit neglected because it sees very little usage. Indeed, we might be better off dropping it.

Is it OK for you to switch to JSON?

As a reminder, please read “Introduction to Open Food Facts API documentation” (Product Opener, the Open Food Facts server) before anything else.

We might fix the issue if it’s impossible for you to switch to JSON.

FTR, the problem is here:

    <ingredients_analysis>
      <en:maybe-vegetarian>en:e322</en:maybe-vegetarian>
      <en:maybe-vegetarian>en:natural-flavouring</en:maybe-vegetarian>
      <en:non-vegan>en:skimmed-milk-powder</en:non-vegan>
      <en:non-vegan>en:butterfat</en:non-vegan>
      <en:palm-oil>en:palm-fat</en:palm-oil>
    </ingredients_analysis>
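
To check a fragment like this outside the browser, a minimal sketch along these lines may help. It uses Python's standard-library parser rather than the Android parser from the original post, and the helper name `parse_error` is made up for illustration. The fragment is a shortened copy of the one above.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Shortened copy of the fragment above; the "en" prefix is never declared.
FRAGMENT = """<ingredients_analysis>
  <en:maybe-vegetarian>en:e322</en:maybe-vegetarian>
  <en:non-vegan>en:butterfat</en:non-vegan>
  <en:palm-oil>en:palm-fat</en:palm-oil>
</ingredients_analysis>"""


def parse_error(xml_text: str) -> Optional[str]:
    """Return the parser's error message, or None if the text parses cleanly."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # A namespace-aware parser refuses the undeclared "en" prefix.
    print(parse_error(FRAGMENT))
```

Any namespace-aware parser should reject the fragment for the same reason the browser does, although the exact wording of the message differs between parsers.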

Yes, I can switch to JSON. Not a big deal.
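
For anyone else landing here, a rough sketch of the JSON route in Python (not the Android code from the original post). It assumes the `.json` URL mirrors the `.xml` URL quoted above and that the record sits under a `product` key with a `product_name` field; the function names are made up for illustration, so check the API documentation before relying on any of this.

```python
import json
from typing import Optional
from urllib.request import urlopen

API_BASE = "https://world.openfoodfacts.org/api/v2/product"


def product_url(barcode: str) -> str:
    # Assumption: the .json suffix mirrors the .xml URL quoted in this thread.
    return f"{API_BASE}/{barcode}.json"


def extract_name(payload: str) -> Optional[str]:
    """Return the product name from a JSON response body, or None if absent."""
    data = json.loads(payload)
    # Assumption: the record lives under "product" and has a "product_name" field.
    product = data.get("product") or {}
    return product.get("product_name")


def fetch_name(barcode: str) -> Optional[str]:
    """Fetch one product over HTTPS and return its name (performs a network call)."""
    with urlopen(product_url(barcode), timeout=10) as resp:
        return extract_name(resp.read().decode("utf-8"))
```

Keeping `extract_name` separate from the network call makes the parsing logic easy to test against a saved response.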

To fix it you would need to add an xmlns declaration for the “en” prefix (either on the root element or on the “ingredients_analysis” element). No idea how easy or complex that would be in your setup.
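
As an illustration of that idea, here is a small Python sketch showing that the fragment quoted earlier in this thread parses once the prefix is bound. The namespace URI is purely hypothetical; any URI works as long as the “en” prefix is declared.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, chosen only for this example.
EN_NS = "https://example.org/taxonomy/en"

FIXED = f"""<ingredients_analysis xmlns:en="{EN_NS}">
  <en:maybe-vegetarian>en:e322</en:maybe-vegetarian>
  <en:non-vegan>en:butterfat</en:non-vegan>
  <en:palm-oil>en:palm-fat</en:palm-oil>
</ingredients_analysis>"""

root = ET.fromstring(FIXED)

# ElementTree expands each prefixed name to "{uri}local-name".
tags = [child.tag for child in root]
values = [child.text for child in root]

if __name__ == "__main__":
    for tag, value in zip(tags, values):
        print(tag, value)
```

Note that consumers would then see namespace-qualified element names, which may or may not be what API clients expect.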

Looking at it again: could it be that this “ingredients_analysis” element was meant to contain plain text only, and someone “smuggled in” an XML snippet that happens to contain these “en:…” elements? They would then simply be emitted verbatim and thereby corrupt the produced XML.

I think the solution is that we should

  1. either remove the en: (it comes from the taxonomy)
  2. or use a different separator like _

IMO option 1 is the best solution.
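
To make the two options concrete, here is a sketch in Python. It is not the actual server code, and the helper names are made up for illustration; it only shows what the element names would look like under each option and that both results are well-formed.

```python
import xml.etree.ElementTree as ET


def strip_prefix(tag_id: str) -> str:
    """Option 1: drop the taxonomy prefix, 'en:non-vegan' -> 'non-vegan'."""
    return tag_id.split(":", 1)[-1]


def replace_separator(tag_id: str, sep: str = "_") -> str:
    """Option 2: keep the prefix but swap ':' for another separator."""
    return tag_id.replace(":", sep)


def build_analysis(entries, rename) -> str:
    """Serialize (taxonomy id, value) pairs, using `rename` for element names."""
    root = ET.Element("ingredients_analysis")
    for tag_id, value in entries:
        ET.SubElement(root, rename(tag_id)).text = value
    return ET.tostring(root, encoding="unicode")


# Pairs taken from the fragment quoted earlier in this thread.
ENTRIES = [
    ("en:maybe-vegetarian", "en:e322"),
    ("en:non-vegan", "en:butterfat"),
    ("en:palm-oil", "en:palm-fat"),
]

if __name__ == "__main__":
    print(build_analysis(ENTRIES, strip_prefix))
    print(build_analysis(ENTRIES, replace_separator))
```

The “en:” inside the text values is harmless, because a colon only acts as a namespace separator in element and attribute names.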