How to read values if possible CDATA #307

Closed
ejurgensen opened this issue Nov 5, 2023 · 2 comments

@ejurgensen

First of all, thanks for making and maintaining mxml. It's been used by OwnTone (previously forked-daapd) for many years, and has been really solid.

I maintain OwnTone, where this issue came up recently. It involves mxml parsing RSS XML in which a tag contains CDATA. This causes a bug, since OwnTone reads values only with mxmlGetOpaque, which doesn't return anything when the value is CDATA. So I am seeking advice on the best way to read a value when it is not known whether the input is CDATA or not.

Right now I'm thinking I should add a convenience function like getStringValue() below. However, it seems a bit crude to me. Is there a better way to do this?

#include <stdio.h>
#include <mxml.h>

#define CDATAXML "<?xml version=\"1.0\" ?><foo>FOO</foo><bar><![CDATA[BAR]]></bar>"

const char *
getStringValue(mxml_node_t *xml, const char *key)
{
  mxml_node_t *parent = mxmlFindElement(xml, xml, key, NULL, NULL, MXML_DESCEND);
  mxml_node_t *child  = mxmlGetFirstChild(parent);
  mxml_type_t type    = mxmlGetType(child);

  if (type == MXML_ELEMENT)
    return mxmlGetCDATA(child);

  return mxmlGetOpaque(child);
}

int
main(int argc, char *argv[])
{
  mxml_node_t *xml = mxmlLoadString(NULL, CDATAXML, MXML_OPAQUE_CALLBACK);

  printf("foo is '%s'\n", getStringValue(xml, "foo"));
  printf("bar is '%s'\n", getStringValue(xml, "bar"));
  mxmlDelete(xml);
  return 0;
}
@michaelrsweet
Owner

🤷‍♂️ That's probably as clean as you'll be able to make it.

@michaelrsweet michaelrsweet self-assigned this Nov 6, 2023
@michaelrsweet michaelrsweet added the question General usage question label Nov 6, 2023
@ejurgensen
Author

FWIW, I ended up with these wrappers:

mxml_node_t *
xml_get_node(mxml_node_t *top, const char *path)
{
  mxml_node_t *node;
  mxml_type_t type;

  // This example shows why we can't just return the result of mxmlFindPath:
  // <?xml version="1.0"?><rss>
  //	<channel>
  //		<title><![CDATA[Tissages]]></title>
  // mxmlFindPath(top, "rss/channel") will return an OPAQUE node where the
  // opaque value is just the whitespace. What we want is the ELEMENT parent,
  // because that's the one we can use to search for children nodes ("title").
  node = mxmlFindPath(top, path);
  type = mxmlGetType(node);
  if (type == MXML_ELEMENT)
    return node;

  return mxmlGetParent(node);
}

// Walks through the children of the "path" node until it finds one that is
// not just whitespace and returns a trimmed value (except for CDATA). This
// means that these variations will all give the same result:
//
// <foo>FOO FOO</foo><bar>\nBAR BAR \n</bar>
// <foo>FOO FOO</foo><bar><![CDATA[BAR BAR]]></bar>
// <foo>\nFOO FOO\n</foo><bar>\n<![CDATA[BAR BAR]]></bar>
const char *
xml_get_val(mxml_node_t *top, const char *path)
{
  mxml_node_t *parent;
  mxml_node_t *node;
  mxml_type_t type;
  const char *s = "";

  parent = xml_get_node(top, path);
  if (!parent)
    return NULL;

  for (node = mxmlGetFirstChild(parent); node; node = mxmlGetNextSibling(node))
    {
      type = mxmlGetType(node);
      if (type == MXML_OPAQUE)
        s = trim(mxmlGetOpaque(node));
      else if (type == MXML_ELEMENT)
        s = mxmlGetCDATA(node);

      if (s && *s != '\0')
        break;
    }

  return s;
}
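
The trim() helper called in xml_get_val() is not shown in this thread. As a rough illustration only, here is a minimal sketch of what such a helper might look like, assuming it strips leading and trailing whitespace in place and returns a pointer into the same buffer. The name trim_in_place and its exact behavior are hypothetical and are not taken from OwnTone or mxml.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption, not OwnTone's actual trim()):
 * strips leading and trailing whitespace in place and returns a pointer
 * to the first non-whitespace character inside the same buffer. */
char *
trim_in_place(char *s)
{
  char *end;

  if (!s)
    return NULL;

  /* Skip leading whitespace. */
  while (isspace((unsigned char)*s))
    s++;

  /* Nothing left: the input was empty or whitespace only. */
  if (*s == '\0')
    return s;

  /* Walk back over trailing whitespace and terminate the string there. */
  end = s + strlen(s) - 1;
  while (end > s && isspace((unsigned char)*end))
    end--;
  end[1] = '\0';

  return s;
}
```

Because a helper like this writes into its argument, passing it the const char * returned by mxmlGetOpaque would require a cast and would alter the string stored in the node. A variant that copies into a caller-supplied buffer would avoid that, at the cost of managing the buffer.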
