facebook / starlark-rust

A Rust implementation of the Starlark language

How to unpack list, dicts, and structs?

GaryBoone opened this issue · comments

In the documentation at https://docs.rs/starlark/0.4.0/starlark/, there are several examples of reading simple Starlark types into Rust, then unpacking them to Rust primitives. But these are limited to unpack_str() and unpack_int(). I searched the #[test] code but didn't find anything. Please correct me if I've missed it.

What is the intended way of working in Rust with lists, dicts, structs? Do we read them into Rust Vecs, HashMaps, enums(?), or do we use starlark-rust API functions to access their parts? Naturally, these will likely be nested in practice, so can you show how to access parts of a nested object, like:

def animal(id):
    return {
        "kind": "giraffe",
        "name": "giraffe-%s" % id,
        "feeding": [
            {
                "name": "feeder",
                "image": "photos-%s" % id,
                "commands": [
                    "lift",
                    "roll-over",
                ],
            },
        ],
    }

Would you show some example code that illustrates how to work with the Starlark types parsed into Rust?

There are two issues here - how do you do it, and where would you expect the function/documentation to live. Let's work on the first, and once we have the right content, we can figure out where to put it.

Given a Value, there are two main ways to deconstruct it:

  • With primitive unpacking operations in O(1). Two examples are unpack_str and unpack_int. For more complex types, e.g. dict/list, https://docs.rs/starlark/0.4.0/starlark/values/trait.FromValue.html#tymethod.from_value is usually the way to go. As an example, Dict::from_value(x) will return Some containing a Dict if the type really is a dict. Once you have a Dict there is the content field which gives you the real stuff as a Rust-level map.
  • With complex unpacking operations, which might clone a portion of the data type into Rust-friendly types, using UnpackValue - https://docs.rs/starlark/0.4.0/starlark/values/trait.UnpackValue.html. For example, unpack_value() could produce a SmallMap<String, Value> in the above case, or a Vec<String>. You could even define your own Rust type and an UnpackValue instance for it, to produce an Animal struct. These usually do some allocation (irrelevant unless you are going for absolute maximum performance) and are built with the primitive operations.
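As a rough mental model of the two styles, here is a toy sketch in plain Rust. The `Value` enum, the `Unpack` trait, and every method in it are hypothetical stand-ins written for illustration; they are not the starlark crate's types or signatures. The sketch only shows the difference between a cheap borrow and an allocating conversion built on top of it.

```rust
// Toy model only: `Value` and `Unpack` are hypothetical stand-ins for this
// sketch, not the starlark crate's API.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
    List(Vec<Value>),
}

impl Value {
    // Primitive unpacking: O(1), borrows from the value, allocates nothing.
    fn unpack_str(&self) -> Option<&str> {
        match self {
            Value::Str(s) => Some(s.as_str()),
            _ => None,
        }
    }

    // Primitive unpacking of an integer: a copy of the stored number.
    fn unpack_int(&self) -> Option<i64> {
        match self {
            Value::Int(i) => Some(*i),
            _ => None,
        }
    }
}

// Complex unpacking: produces an owned, Rust-friendly value and may allocate.
trait Unpack: Sized {
    fn unpack(value: &Value) -> Option<Self>;
}

impl Unpack for String {
    fn unpack(value: &Value) -> Option<Self> {
        // Built from the primitive operation, plus one allocation.
        value.unpack_str().map(str::to_owned)
    }
}

impl Unpack for i64 {
    fn unpack(value: &Value) -> Option<Self> {
        value.unpack_int()
    }
}

impl<T: Unpack> Unpack for Vec<T> {
    fn unpack(value: &Value) -> Option<Self> {
        match value {
            // Fails as a whole if any single element has the wrong type.
            Value::List(items) => items.iter().map(T::unpack).collect(),
            _ => None,
        }
    }
}
```

In this model, `Vec::<String>::unpack(&v)` succeeds only when `v` is a list whose elements are all strings. That is one way a whole unpack can come back as `None` even though the outer container has the expected type.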

Thanks, Neil. With some Starlark like

people = {
    "Alice": 22,
    "Bob": 40,
    "Charlie": 55,
    "Dave": 14,
}

...the following extracted it into a Rust-level map as you said:

    let p2 = module.get("people").unwrap();
    println!("p2.get_type() = {}", p2.get_type()); // dict
    let d = Dict::from_value(p2).unwrap();
    let d2 = &d.content; // Extract the SmallMap
    println!("d2.len() = {}", d2.len()); // 4
    println!("d2.get_index(2) = {:?}", d2.get_index(2).unwrap()); // SmallMap methods work

Great. Next, how should unpack_value() be called? The docs at UnpackValue say that "The heap argument is usually not required." So what should be passed as that parameter?

With the animal example above, ending with animal("Joe") to return a concrete value:

    let res: Value = eval.eval_module(ast).unwrap();
    let h: Heap = Heap::new(); // ?
    let animal = SmallMap::<String, Value>::unpack_value(res, &h).unwrap();
    println!("animal = {:?}", animal);

...panics because unpack_value() returns None.

The heap argument for unpack_value can be most things, creating a new heap there is fine, although the eval.heap() would be the most correct thing to do. In HEAD we've actually eliminated the heap argument to unpack_value because it was confusing and unhelpful.

For debugging why with animal("Joe") it doesn't work, I suggest you println!("{:?}", res) and see what you get before unpacking. I tried this, and added it as a test in 5ce27c5 - it works for me on HEAD.

Ok, repeating with some deets:

use starlark::collections::SmallMap;
use starlark::values::{UnpackValue, Value};

// If Cargo.toml contains
//    starlark="0.4.0"
// then use
// let animal = SmallMap::<String, Value>::unpack_value(res, eval.heap()).unwrap();
//
// But the heap argument in unpack_value() is no longer needed in HEAD. If Cargo.toml contains
//     starlark={ git = "https://github.com/facebookexperimental/starlark-rust" }
// then
let animal = SmallMap::<String, Value>::unpack_value(res).unwrap();
println!("animal = {:?}", animal);

Now on to the last option mentioned above, defining your own type and an UnpackValue instance for it. Does that mean calling further unpack_value()s to do the extractions, so in essence manually extracting each field? Like this:

use starlark::collections::SmallMap;
use starlark::values::list::ListOf;
use starlark::values::{UnpackValue, Value};

#[derive(Debug)]
struct Animal {
    kind: String,
    name: String,
    feeding: Vec<Feed>,
}

#[derive(Debug)]
struct Feed {
    name: String,
    image: String,
    commands: Vec<String>,
}


impl UnpackValue<'_> for Animal {
    fn unpack_value(value: Value) -> Option<Animal> {
        let animal_sm = SmallMap::<String, Value>::unpack_value(value)?;
        let kind = animal_sm.get("kind")?.to_string();
        let name = animal_sm.get("name")?.to_string();
        let feeding_val = animal_sm.get("feeding")?;
        let feeding_list = <ListOf<Value>>::unpack_value(*feeding_val)?;
        let mut feeding = Vec::new();
        for f_val in feeding_list.to_vec() {
            let feed_sm = SmallMap::<String, Value>::unpack_value(f_val)?;
            let name = feed_sm.get("name")?.to_string();
            let image = feed_sm.get("image")?.to_string();
            let commands_val = feed_sm.get("commands")?;
            let commands_list = <ListOf<Value>>::unpack_value(*commands_val)?;
            let mut commands = Vec::new();
            for fs in commands_list.to_vec() {
                commands.push(fs.unpack_str()?.to_string())
            }
            feeding.push(Feed {
                name,
                image,
                commands,
            });
        }
        Some(Animal {
            kind,
            name,
            feeding,
        })
    }
}

...then calling like:

let animal = Animal::unpack_value(res).unwrap();
println!("animal = {:?}", animal);

...producing:

animal = Animal { kind: "giraffe", name: "giraffe-Joe", feeding: [Feed { name: "feeder", image: "photos-Joe", commands: ["lift", "roll-over"] }] }

Is this the intent? Or is there a simpler way?

  1. Could this extraction process be automated? I'm thinking of serde_yaml, where you would annotate your structs with #[derive(Debug, Serialize, Deserialize)] then just call

    let animal: Animal = serde_yaml::from_reader(f).unwrap();

  2. Finally, would you comment on the intended use of the various structs and collections? It seems like Values, including Dicts, are intended for working in Starlark, while collections, including SmallMap, are intended for use in Rust? Is that right? (So for example, I found that I couldn't use a Dict in my own unpack_value because it didn't pass in a heap, which was needed to allocate a key for get().)

So the full example with details at the top - are you saying that works for you? Or fails? If it fails, what is the print line producing? If it works, I guess this solves that?

Your unpack code looks reasonable. There are approximately two ways to go:

  • Use dict/list in Starlark to represent your type, much as you did.
  • Define a custom Animal type that shows up in Starlark as type "animal". Then you can unpack it easily, but the Starlark users have to call the animal function to create it etc.

Which one makes sense is very much dependent on what you are trying to do.

For point number 1, absolutely that would be possible, probably exactly using the Serde framework. That would be a cool contribution - Serde conversion.

For point number 2, there is Dict::get_str. Does that solve your issue of trying to unpack it? The types like Dict, List etc. are accessible from Rust and Starlark, but definitely tailored to Starlark. The type SmallMap can be used like HashMap or IndexMap in most places, so it's a general container, but I wouldn't particularly recommend it in normal Rust code - there are usually more Rust-appropriate types. But when working with Starlark, they can come in handy.

Yes, all of the above works correctly for me, thanks! Next, I'm trying your most recent suggestion: 'Define a custom Animal type that shows up in Starlark as type "animal".' That looks straightforward enough as there's an example in the starlark-rust docs, here. But wait, how do you use the new type in a starlark file? The example defines a and b on the Rust side, injecting them into the module with set. If I try to move them into the starlark like:

a = complex(
    real = 1,
    imaginary = 8,
)
b = complex(
    real = 4,
    imaginary = 2,
)
str(a + b)

...and try various syntaxes, I get Variable 'complex' not found. What's the right way to use a custom-defined type in Starlark?

See https://docs.rs/starlark/0.4.0/starlark/#collect-starlark-values - usually you'd define a function:

#[starlark_module]
fn my_module(builder: &mut GlobalsBuilder) {
    fn complex(x: Value) -> Complex {
        ...
    }
}

Then do something like:

let globals = GlobalsBuilder::new().with(my_module).build();

That lets you add complex into scope.

Ah, ok! Following that example now makes sense: You define your own functions or objects in a module, then inject them using with() into the globals. Got it. So then the full animals() looks like:

// ...include the Animal and Feed struct definitions from above, but change the traits for Feed to
// `#[derive(Debug, Clone)]`.

starlark_simple_value!(Feed);

// How we display them in Rust.
impl fmt::Display for Feed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&format!(
            "{}: {}, {:?}",
            self.name, self.image, self.commands
        ))
    }
}

impl<'v> StarlarkValue<'v> for Feed {
    starlark_type!("feed");

    // How we display them in Starlark.
    fn collect_repr(&self, collector: &mut String) {
        collector.push_str(&format!(
            "{}: {}, {:?}",
            self.name, self.image, self.commands
        ))
    }
}

starlark_simple_value!(Animal);

impl fmt::Display for Animal {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&format!("{}: {}, {:?}", self.kind, self.name, self.feeding))
    }
}

impl<'v> StarlarkValue<'v> for Animal {
    starlark_type!("animal");

    fn collect_repr(&self, collector: &mut String) {
        collector.push_str(&format!("{}: {}, {:?}", self.kind, self.name, self.feeding))
    }
}

#[starlark_module]
fn feed_module(builder: &mut GlobalsBuilder) {
    fn feed(name: Value, image: Value, commands: Value) -> Feed {
        let commands_list = <ListOf<Value>>::unpack_value(commands).unwrap();
        Ok(Feed {
            name: name.unpack_str().unwrap().to_string(),
            image: image.unpack_str().unwrap().to_string(),
            commands: commands_list
                .to_vec()
                .iter()
                .map(|l| l.to_string())
                .collect(),
        })
    }
    fn animal(kind: Value, name: Value, feeding: Value) -> Animal {
        let feeding_list = <ListOf<Value>>::unpack_value(feeding).unwrap();
        Ok(Animal {
            kind: kind.unpack_str().unwrap().to_string(),
            name: name.unpack_str().unwrap().to_string(),
            feeding: feeding_list
                .to_vec()
                .iter()
                .map(|l| Feed::from_value(*l).unwrap().clone())
                .collect(),
        })
    }
}

and

fn read_custom_starlark_struct2() {
    let path = Path::new("animal.star");
    let ast: AstModule = AstModule::parse_file(path, &Dialect::Standard).unwrap();
    let globals = GlobalsBuilder::new().with(feed_module).build();
    let module = Module::new();
    let mut eval = Evaluator::new(&module, &globals);
    eval.eval_module(ast).unwrap();

    // Feed --------------------------
    let val = module.get("f").unwrap();
    println!("feed = {}", &Feed::from_value(val).unwrap());
    // feed = feeder: photos-joe, ["lift", "roll-over"]

    // Animal --------------------------
    let val = module.get("a").unwrap();
    println!("animal = {}", &Animal::from_value(val).unwrap());
    // animal = giraffe: giraffe-joe, [Feed { name: "feeder", image: "photos-joe", commands: ["lift", "roll-over"] }, Feed { name: "feeder", image: "photos-joe", commands: ["lift", "roll-over"] }]
    // The feed, `f`, is included twice, as expected.
}

where animal.star contains:

f = feed(
    name = "feeder",
    image = "photos-joe",
    commands = [
        "lift",
        "roll-over",
    ],
)

a = animal(
    kind = "giraffe",
    name = "giraffe-joe",
    feeding = [
        f,
        f,
    ],
)

Please feel free to critique that code. Still seems like there remains a bit of manual unpacking, for example. Are there simpler ways? And of course it omits error-handling.
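On the error-handling point, one possible direction is to return a Result that names the offending field instead of unwrapping or returning a bare None. The sketch below is plain Rust under loud assumptions: the `Value` enum, `str_field`, and `unpack_feed` are hypothetical names invented for this illustration and are not part of the starlark crate. It only shows the shape of the idea.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a Starlark value; not the starlark crate's type.
#[derive(Debug, Clone)]
enum Value {
    Str(String),
    List(Vec<Value>),
}

#[derive(Debug, PartialEq)]
struct Feed {
    name: String,
    image: String,
    commands: Vec<String>,
}

// Look up a string field and name the field in the error message on failure.
fn str_field(map: &HashMap<String, Value>, key: &str) -> Result<String, String> {
    match map.get(key) {
        Some(Value::Str(s)) => Ok(s.clone()),
        Some(other) => Err(format!("field `{}`: expected a string, got {:?}", key, other)),
        None => Err(format!("missing field `{}`", key)),
    }
}

// Build a Feed from a dict-like map, reporting the first problem found.
fn unpack_feed(map: &HashMap<String, Value>) -> Result<Feed, String> {
    let commands = match map.get("commands") {
        Some(Value::List(items)) => items
            .iter()
            .map(|v| match v {
                Value::Str(s) => Ok(s.clone()),
                other => Err(format!("field `commands`: expected a string, got {:?}", other)),
            })
            .collect::<Result<Vec<_>, String>>()?,
        Some(other) => {
            return Err(format!("field `commands`: expected a list, got {:?}", other))
        }
        None => return Err("missing field `commands`".to_string()),
    };
    Ok(Feed {
        name: str_field(map, "name")?,
        image: str_field(map, "image")?,
        commands,
    })
}
```

With this shape, a map that lacks `commands` yields an error that says so, rather than a panic from unwrap() with no hint about which field was at fault.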

It looks like we've covered the various ways to unpack various objects. Feel free to close this issue. Thanks, Neil!

Cool! Two notes:

  1. If you are going to define Display, you can just reuse it in collect_repr with write!(collector, "{}", self).

  2. If you use the concrete Rust types in the #[starlark_module] function signature, the macro basically does the unpacking for you, with better error messages, so:

    fn feed(name: String, image: String, commands: Vec<String>) -> Feed
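To illustrate note 1 on its own, here is a small std-only sketch of reusing Display when filling a String. The `collect_repr` below is written as a plain inherent method that mimics the signature used earlier in this thread; it is an assumption for illustration and does not implement the StarlarkValue trait.

```rust
use std::fmt::{self, Write};

#[derive(Debug, Clone)]
struct Feed {
    name: String,
    image: String,
    commands: Vec<String>,
}

// The single place where the textual form is defined.
impl fmt::Display for Feed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}, {:?}", self.name, self.image, self.commands)
    }
}

impl Feed {
    // Stand-in with the same shape as the collect_repr shown above:
    // it appends the Display output to the collector.
    fn collect_repr(&self, collector: &mut String) {
        // Writing into a String does not fail, so unwrap() is safe here.
        write!(collector, "{}", self).unwrap();
    }
}
```

The `use std::fmt::Write` import is what makes `write!` work with a `String` target. The format string then lives in one place, so the Rust and Starlark renderings cannot drift apart.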

But basically looks as I'd expect, nice work. Closing this issue.