How to use `$`?

Question

vitalydolgov opened this issue 3 years ago

This is not really an issue, but a question about using Lambdasoup. For some reason I cannot use the `$` family of selectors after taking a node by its index (the second statement after the binding below). It works, however, if I convert the node to a string and parse it again, or if I take the element of the node explicitly.

Is this intentional behavior? In the source code I see no restriction on the node type, so I'm a bit confused...

# #require "lambdasoup";;

# open Soup;;

# let s = "<p class=\"txtRed\">AA * A<span class=\"txtNormal\">B</span> * A<span class=\"txtNormal\">C</span></p>";;
val s : string = ...

# s |> parse $ "p" |> children |> R.nth 2 |> to_string;;
- : string = "<span class=\"txtNormal\">B</span>"

# s |> parse $ "p" |> children |> R.nth 2 $? "span";;
- : element node option = None

# s |> parse $ "p" |> children |> R.nth 2 |> R.element |> name;;
- : string = "span"

# s |> parse $ "p" |> children |> R.nth 2 |> to_string |> parse $ "span" |> to_string;;
- : string = "<span class=\"txtNormal\">B</span>"

Anton Bachin · Answer 1 · Mon Feb 14 2022 22:13:41 GMT+0800 (China Standard Time)

In

# s |> parse $ "p" |> children |> R.nth 2 $? "span";;

`$?` selects only from the descendants of the given node, never the node itself. Here the given node is already the `<span>`, so the selector is searching the DOM corresponding to the string `B` (the span's content), and of course there are no elements at all to find there.
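To make the point about descendants concrete, here is a minimal sketch built on the string from the question. It assumes the `lambdasoup` package is available, and the binding names (`second_child`, `inner_elements`, `below_span`, `span_name`) are made up for illustration.

```ocaml
(* Sketch only: assumes the lambdasoup package is installed. *)
open Soup

let s =
  "<p class=\"txtRed\">AA * A<span class=\"txtNormal\">B</span> * A<span class=\"txtNormal\">C</span></p>"

(* Second child of <p>: the first <span>, typed as a general node. *)
let second_child = parse s $ "p" |> children |> R.nth 2

(* Element nodes strictly below that node; the span holds only text. *)
let inner_elements = second_child |> descendants |> elements |> to_list

(* The selector looks below the span, so it has nothing to match. *)
let below_span = second_child $? "span"

(* The node itself is already the span; R.element only refines its type. *)
let span_name = second_child |> R.element |> name
```

If the goal is simply to treat the second child as an element, `R.element` (as in the third statement of the question) is probably enough, with no selector or round trip needed.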

This might be confusing because the top-level node returned by `parse` is not the `<p>` element, but a "soup" (document) node which contains the `<p>` element as its child. That is why `parse s $ "p"` finds the `<p>`: it is a descendant of the soup node. It is done that way because, in general, the string you pass to `parse` may contain multiple elements, and indeed multiple nodes, since it might contain text at the top level.
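The role of the soup node can be illustrated with a small sketch. The input string and the binding names (`soup`, `top_level_names`, `paragraphs`) are hypothetical, chosen only to show several top-level nodes at once.

```ocaml
(* Sketch only: assumes the lambdasoup package is installed. *)
open Soup

(* Hypothetical input: two top-level elements with text between them. *)
let soup = parse "<p>one</p> and <p>two</p>"

(* Names of the elements sitting directly under the soup (document) node. *)
let top_level_names = soup |> children |> elements |> to_list |> List.map name

(* Selecting from the soup node reaches both <p> elements, because each of
   them is a descendant of it. *)
let paragraphs =
  soup $$ "p" |> to_list |> List.map (fun p -> texts p |> String.concat "")
```

Since `parse` always returns this wrapper node, a selector applied directly to its result can match the outermost elements of the input.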

Anton Bachin · Answer 2 · Mon Feb 14 2022 22:15:57 GMT+0800 (China Standard Time)

Likewise, when you convert your span DOM to a string and pass it back to `parse`, you get back a DOM consisting of a document whose child is the span element, so `$ "span"` finds it there. I guess it's pretty annoying and non-algebraic that trying to round-trip an element through the parser doesn't give back an element, but a document containing that element.
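The round trip described here can be sketched as follows. The type annotations are added only to make the difference between the two nodes visible, and the binding names (`span`, `reparsed`, `found`) are made up for illustration.

```ocaml
(* Sketch only: assumes the lambdasoup package is installed. *)
open Soup

(* Start from a span element. *)
let span : element node = parse "<span class=\"txtNormal\">B</span>" $ "span"

(* Round trip: serialize the element, then parse the string again.
   The result is a soup (document) node, not an element. *)
let reparsed : soup node = span |> to_string |> parse

(* The span is now a child of the new document node, so a selector
   applied to that document can find it. *)
let found = reparsed $ "span" |> to_string
```

The serialized form is unchanged by the round trip; what changes is that the span gains a document node above it, and that is what lets the selector match.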

Vitaly Dolgov · Answer 3 · Mon Feb 14 2022 22:57:40 GMT+0800 (China Standard Time)

@aantron thank you for the quick answer, now I get it. That's not a problem, the library is very convenient to use 😊