`clone_node()` doesn't duplicate nonstandard tag names
stevecheckoway opened this issue · comments
Stephen Checkoway commented
While investigating #3098, I noticed that Gumbo's `clone_node()` function doesn't make a copy of nonstandard tag names.
I think this is the fix:
diff --git a/gumbo-parser/src/parser.c b/gumbo-parser/src/parser.c
index 67812b23..c3e5e038 100644
--- a/gumbo-parser/src/parser.c
+++ b/gumbo-parser/src/parser.c
@@ -1377,6 +1377,9 @@ static GumboNode* clone_node (
   *new_node = *node;
   new_node->parent = NULL;
   new_node->index_within_parent = -1;
+
+  if (node->v.element.tag == GUMBO_TAG_UNKNOWN)
+    new_node->v.element.name = gumbo_strdup(node->v.element.name);
   // Clear the GUMBO_INSERTION_IMPLICIT_END_TAG flag, as the cloned node may
   // have a separate end tag.
   new_node->parse_flags &= ~GUMBO_INSERTION_IMPLICIT_END_TAG;
But first, I'd like to understand why this hasn't been causing a bunch of memory leaks.
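To make the concern concrete, here is a minimal standalone sketch of the aliasing that a plain struct assignment (`*new_node = *node;`) produces, and of what the proposed `gumbo_strdup` line changes. The `Sketch*` types and `sketch_*` functions are hypothetical stand-ins written for this illustration, not Gumbo's real definitions. The sketch also assumes that the destroy path frees the name only for unknown tags, which is not verified against Gumbo's source here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for Gumbo's tag enum and element struct. */
typedef enum { SKETCH_TAG_B, SKETCH_TAG_UNKNOWN } SketchTag;

typedef struct {
  SketchTag tag;
  char *name; /* assumed to be heap-allocated only for SKETCH_TAG_UNKNOWN */
} SketchElement;

/* Copy a string into freshly allocated memory. */
static char *sketch_strdup(const char *s) {
  size_t n = strlen(s) + 1;
  char *copy = malloc(n);
  assert(copy != NULL);
  memcpy(copy, s, n);
  return copy;
}

/* Create an element; unknown tags get their own heap copy of the name. */
static SketchElement *sketch_new(SketchTag tag, const char *name) {
  SketchElement *e = malloc(sizeof *e);
  assert(e != NULL);
  e->tag = tag;
  e->name = (tag == SKETCH_TAG_UNKNOWN) ? sketch_strdup(name) : NULL;
  return e;
}

/* Shallow clone: struct assignment copies the pointer, not the string,
 * so the clone and the original share one name buffer. */
static SketchElement *sketch_clone_shallow(const SketchElement *e) {
  SketchElement *c = malloc(sizeof *c);
  assert(c != NULL);
  *c = *e;
  return c;
}

/* Deep clone: same as above, plus a private copy of the name for
 * unknown tags, mirroring the line the proposed patch adds. */
static SketchElement *sketch_clone_deep(const SketchElement *e) {
  SketchElement *c = sketch_clone_shallow(e);
  if (e->tag == SKETCH_TAG_UNKNOWN)
    c->name = sketch_strdup(e->name);
  return c;
}

/* Assumed ownership rule: each element frees its own name. With a
 * shallow clone, destroying both elements would free one buffer twice. */
static void sketch_destroy(SketchElement *e) {
  if (e->tag == SKETCH_TAG_UNKNOWN)
    free(e->name);
  free(e);
}
```

With these definitions, a shallow clone of an unknown-tag element has `clone->name == original->name`, so only one of the two can safely go through `sketch_destroy`. A deep clone owns a separate buffer with the same contents, so both can be destroyed independently.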
Mike Dalessio commented
@stevecheckoway It looks like `clone_node` isn't being called for an unknown tag in the test suite.
Here's the patch I used:
diff --git a/gumbo-parser/src/parser.c b/gumbo-parser/src/parser.c
index 06f096f8..180ee746 100644
--- a/gumbo-parser/src/parser.c
+++ b/gumbo-parser/src/parser.c
@@ -20,6 +20,7 @@
 #include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
+#include <stdio.h>
 #include "ascii.h"
 #include "attribute.h"
@@ -1396,6 +1397,11 @@ static GumboNode* clone_node (
   *new_node = *node;
   new_node->parent = NULL;
   new_node->index_within_parent = -1;
+
+  if (node->v.element.tag == GUMBO_TAG_UNKNOWN) {
+    fprintf(stderr, "MIKE: unknown tag %s\n", node->v.element.name);
+  }
+
   // Clear the GUMBO_INSERTION_IMPLICIT_END_TAG flag, as the cloned node may
   // have a separate end tag.
   new_node->parse_flags &= ~GUMBO_INSERTION_IMPLICIT_END_TAG;
and it never prints anything!