facebook / react

The library for web and native user interfaces.

Home Page:https://react.dev

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Alternative to Observe for Sideways Data Loading

sebmarkbage opened this issue · comments

For reference. We're exploring an alternative to the observe function proposed in #3398.

Instead of having a separate function that defines a list of dependencies on Observables, we could allow the render function itself trigger a bunch of side-effects that registers the currently executing component with some Observables.

render() {
  var data = request(this.props.url);
  if (data === null) return <div>Loading...</div>;
  return <div>{data}</div>;
}

Side-effects! 😲

The implementation of request wouldn't return an Observable. Instead it would tell React that the currently executing component has a dependency on this Observable and needs to rerender anytime the subscription fires. It returns the last value received by this particular Observable... somehow.

function request(url) {
  return React.readAndSubscribe(Observable.fetch(url));
}

If an Observable is no longer subscribed to in the next render pass, it is automatically unsubscribed. Perhaps the Observable API is not ideal for this model.

commented

I've posted a PR (#3920) that implements the underlying API that would power these new semantics, mostly to continue/inform the discussion.

commented

I have mixed feelings about this. But I have to say I like that this keeps API surface minimal and doesn't introduce new intermediate states (e.g. componentDidReceivePropsAndNowGonnaUpdateObservedData).

render() {
  var data = request(this.props.url);
  if (data === null) return <div>Loading...</div>;
  return <div>{data}</div>;
}
  • Mixing of concerns (render doesn't just render)
  • Magic: it looks like data = request() is synchronous, but the real logic is async. Might make people rage when they can't match their expected abstraction with the actual functionality of the tool.

needs to rerender anytime the subscription fires

Subscriptions don't "fire", Observables do. Subscriptions are sinks. If Subscriptions are to send any data forward to somewhere, then you are looking for a second Observable (as a function of the first), not a Subscription.

function request(url) {
  return React.readAndSubscribe(Observable.fetch(url));
}

There is a problem here because you can't keep the "last value received by this Observable" because everytime you call readAndSubscribe() you give it a new Observable, namely Observable.fetch(url). You can solve this by making Observable.fetch() have internally a persistent registry of url -> Observable, but that sounds really bad, it's global shared state hidden as framework magic. The core problem is that you need to separate Observable creation from Observable consumption, but the suggested implementation of request(url) conflates them. "last value received" should be an Observable, because, well, you "can observe" the last value.

I recommend fully embracing Observables as they are, by Erik Meijer, or dropping the whole idea.

It seems to me that this makes server rendering much harder, because you'd now need to call render() to extract data dependencies.

commented

You can solve this by making Observable.fetch() have internally a persistent registry of url -> Observable, but that sounds really bad, it's global shared state hidden as framework magic.

I think a version of this was also present in the original observe hook proposal from #3398:

This ordering is important since it allows the provider of data to do reference counting of their cache. I.e. I can cache data for as long as nobody listens to it. If I unsubscribed immediately, then the reference count would go down to zero before I subscribe to the same data again.

In the original proposal, the lifetime of the observable was "until new props/state come in", and in this proposal, its lifetime is "until the next value comes in" which I agree doesn't really make sense for Observables. You can see that #3920 does not use Observables at all, presumably for this reason. One might of course turn an observable factory into a “data tracker provider” a la #3290 but it doesn't seem to have any additional value over just using data tracker directly.

That said, I'm concerned about DataTracker suggested in #3920 as I don't understand how it can be compatible with immutable data and plain object (or ImmutableJS) models.

At this point I'm starting to think that sideways data injection is solved well by higher-order components and doesn't need to be in the core. I'd like to learn what makes React team think otherwise. I think even seems that the proposal from #3920 could be implemented in the userland as a higher order component that just forceUpdates its child. Is this not so? I actually already asked this question and it got answered, oops!

commented

@glenjamin It doesn't make server-side rendering any harder, because you can just re-run render on the client to discover the dependencies. In fact, we do re-run render on the client anyway (to setup refs and callbacks and other component data structures), so we aren't making things any worse.

@jimfb If i'm following correctly, it doesn't give you a way to collect data dependencies to be resolved server-side, without calling renderToString() multiple times on the server?

commented

@gaearon Correct, there would be no benefit to using observables, unless your system already uses observables, in which case, by all means, do so! Observables propagate state on write, whereas a react render utilizes data on read. While observables are perfectly compatible with React, the model is actually the reverse of what React does internally, so, yeah, that's why the observable API doesn't add any additional value when talking with the react core.

@gaearon With regards to your point about immutable/plain javascript, this is super important and one we've give lots of thought! This system actually plays super nicely with both immutable and plain javascript, and you are encouraged to use immutable/plain javascript as much as possible! If you don't use the data tracker, everything behaves exactly the same as it behaves today. Sometimes, however (eg. large cyclic graphs), immutable data starts to run into troubles and the recommended solution is to break up the cycle by having keys that you lookup in a map (effectively introducing mutability into your immutable graph). This API allows the data store implementor to do that, without exposing where the mutable break-points are located. If you are able to effectively implement your app with immutable data, you are encouraged to do that, and you won't ever learn about this api! But when that isn't practical, you would be able to call into this API for better performance and sideways data loading. However, please don't feel obligated to ditch plain/immutable javascript just because we add a new API! We still love immutable and plain js!

commented

@glenjamin Ah, I see what you're asking. Either API (observables or this) will require some minor nonisomorphism (is that a word?). With this API, you SSR datastores would need to return the data eagerly (eagerly fetching isn't usually a problem on SSR because your database (source of truth) lives near by), and with Observables your SSR datastore would need to close all observable streams after loading but not close them on the client in order to receive updates from the server. So we aren't really making the situation any worse than the other proposals.

commented

@jimfb

Thanks for addressing this! So it is a bit like shouldComponentUpdate in reverse. It would not be the primary means of subscribing to data changes, but could be used for finer-grained subscriptions in large trees. Is this correct?

In the Flux architecture, what would be the subscription point? A Store? Would the people writing Stores be exposed to this API, or would rather the people writing Flux libraries be encouraged to use it? Would this be an encouraged pattern at all?

commented

@gaearon Yes, exactly, it's like shouldComponentUpdate in reverse! It allows datastores to provide "hints" to a component that "data they depend on has changed, and they might want to rerender". It does the rerenders in an optimal way to be maximally efficient (in some cases, even more efficient than you can be with immutablejs). See @josephsavona's #3920 (comment) for details.

This API would probably be used by people writing datastores, but potentially also by people writing flux libraries. If this is the API we choose, this pattern would be recommended for sideways data loading (ie. you shouldn't setup subscriptions to flux stores manually). All sideways data loading is still "discouraged" (ie. people SHOULD TRY to find a way to use just immutable props), but if you're sideways loading data anyway, it would be recommended that you let React track that data. Of course, you could still always do the tracking manually, just like today, if you felt the need to do your own thing.

One of the motivations is that: There is an oft quoted statistic that something like 80% of mixins are trying to implement sideways data loading. I'm not sure where that statistic came from, but it is certainly true that sideways data loading is one of the main thing mixins are used for, and it's also one of the more error prone things that people do in React, and we want to simplify that. See the motivation for Observe (#3398).

After reading the 3 proposals I don't understand everything of them but I think React should not try to solve this problem and keep it simple: UI = f(state).

However just a crazy idea: have you thought about a render function that could return a promise?

For example:

render: function() {
  var renderSync = <Spinner/>;
  var renderAsync = getUser(this.props.userId).then(user => <div>{user.name}</div>)
  return React.asyncRender(renderSync, renderAsync);
}

Not sure it is a good idea but I guess it wouldn't be a problem for isomorphic apps for React to wait for all render promises to complete...

Interesting update:

If we can work around the need for push-updates for dirty marking, we can use the Iterable protocol for pulling data. E.g. polling of data.

commented

may be u need to call render to extract data dependencies.

Here's a cute integration of 'observable' data with react components: https://github.com/mweststrate/mobservable
Boils down to wrapping your data in an observable function, including a mixin, and referencing your observable data in render(). When your 'computed observable' render method changes by changing any observable data it refrences, forceUpdate() automatically happens.
Might be a little too much magic for my taste, and wrapping data isn't that appealing, but its cute how simple the react component becomes.
Anyways this is a great discussion, I'm torn between sideways data loading with event emitters and pure parameter === based top down style.
The problem with magical observation is that invariably it seems like I always have to end up deeply understanding and designing around how the magic works, which possibly defeats the purpose of the magic.

@slorber I've done something on this.. Look https://github.com/AlexeyFrolov/slt . Warning, code examples are quite outdated, read tests.
You can define something like this https://github.com/AlexeyFrolov/slt/blob/master/src/__tests__/data/deps.js
It's ready to use, well tested, but sadly, it's lack of examples yet.

@gaearon I've been using Meteor with React, where basically a lot of components are loading data sideways from Meteor. Since the react-packages' ReactMeteorData mixin similarly does forceUpdate() and doesn't add any more state transitions like shouldComponentUpdate, whenever I've needed to use PureRenderMixin on a component, I've had to wrap it with another component that loads the data and passes it in as props. I've felt annoyed having to do that refactoring rather than just implement a shouldUpdateForMeteorData (if it existed).

In some ways I like the idea of being able to load the observed data directly in the render() function. However, this could still be achieved in a mixin with a separate reactiveRender() method that caches its results and returns them in its implementation of render() (I'm not sure if mixins can actually define render, but if not, it would still be a smaller change to core: just make it possible for mixins to define render()). And since people may end up using a variety of APIs for sideways data loading (Meteor being the biggest example, but I'm sure other reactive frameworks will come) that probably can't integrate with this proposal anyway, I think it would be better for people to pick and choose the mixins they need for sideways data loading.

@slorber generally any time I use promises on the client side, I go ahead and render a spinner or whatever, then render something new when the promise resolves, instead of just leaving the user hanging until stuff shows up. I hope we can all agree it's kinder to users to do that. So I would avoid using promises in render().

@slorber whoa, sorry, I didn't actually read your code, I see you were doing just what I was talking about.

yes @jedwards1211 the idea is that the render returns a promise of ReactElement, but still you should be able to handle loading/error states

commented

I noticed this thread has been kinda quiet, so please redirect me if the discussion moved elsewhere

I quite like this idea, but introducing subtle side effects to render() doesn't seem appropriate as It breaks a pretty important contract.

A component's state and props (and sometimes context) usually dictate what the output of render() will be, so why not move the tracking logic to to the source? We could simply call setState(newData) in response to tracked changes, and keep the render() function nice and simple. Here's a code sample:

var users = new ReactiveCollection()

class Component extends React.Component {
    // A new lifecycle method. Gets wrapped
    // in Tracker.autorun. Returned object
    // simply gets passed to setState().
    trackedState () {
        return {
            users: users.find().fetch()
        }
    }

    // No side effects. We just read all
    // tracked data from this.state, as we would with
    // any piece of state.
    render () {
        return (
            <div>
                this.state.users.map(user => (
                    <div>{user.name}</div>
                    ...
                ))
            </div>
        )
    }
}

Behind the scenes, the base Component class will do something along these lines:

class React.Component {
    componentDidMount () {
        this.computation = Tracker.autorun(() => {
            this.setState(this.trackedState())
        })
    }

    componentWillUnmount () {
        this.computation.stop()
    }
    ...
}
commented

@oztune It's worth noting that the "side effects" are internal to the React core. It is semantically equivalent to returning a tuple (containing the jsx elements, and objects-read). For this reason, doing the tracking within render isn't an antipattern IFF React is responsible for tracking the reads.

Moving the logic out to a separate function does work, and was discussed (it is semantically equivalent to an observe function), but it leads to a duplication of logic. You are specifying (in the first method) what you are going to read, and you are actually doing the read in the second method. It's totally unnecessary, and error prone. If the two methods ever get out of sync, this can lead to some pretty subtle bugs. For simple components, it's relatively easy to keep them in sync (so long as you don't forget) but as your component starts getting more complex, it becomes exponentially more difficult to keep them in sync. The whole point of combining them is to avoid this unnecessary duplication of logic and to make sure that what-you-read-is-what-you-use.

commented

@jimfb Thanks for the thorough answer. Some questions:

You are specifying (in the first method) what you are going to read, and you are actually doing the read in the second method. It's totally unnecessary, and error prone.

You're specifying what you'll be reading and doing the read within trackedState(), then throwing the results into the state. The render() method isn't doing the read from the original data source anymore than it does when accessing this.props.

If the two methods ever get out of sync, this can lead to some pretty subtle bugs

But how is that different from the setState -> render way of doing things? Is the data coming from tracker conceptually different than any other piece of state? It seem to me like everything flowing through the same, established, channels could lead to less confusion down the line.

commented

A small example to illustrate my point:

Say you had a model library that was completely synchronous. How would you model a React component to use it?

A

class App {
    render() {
        var users = MySyncLib.getUsers()
        return <UsersList data={ users } />
    }
}

or B?

class App {
    componentDidMount () {
        setState({
            users: MySyncLib.getUsers()
        })
    }
    render() {
        var users = this.state.users
        return <UsersList data={ users } />
    }
}

Isn't B the "React way"? From what I understand the render() method should touch the least amount of globals possible. So why is it fine to do:

render() {
  var data = request(this.props.url);
  if (data === null) return <div>Loading...</div>;
  return <div>{data}</div>;
}

(from example in first post)

Where request() is pretty much a global? Isn't a render() method supposed to rely only on data within the scope of the component as much as possible? Isn't that why the separation between state and render was created in the first place?

commented

@oztune Neither of the above (Both options A and B are bad). For applications written in React, you're probably going to want some sort of data management architecture (like Flux or Relay) that is responsible for managing your application's data - you probably wouldn't want to store your application's data in the internal state of an App component at the root of your application.

Under this proposal, all your data remains in the data layer (flux/redux/relay/whatever), and components "subscribe" to that data by reading the data. When the data changes, the appropriate components will re-render.

If you do like you suggested, and copy/store the data prior to rendering, you have a few issues:

  • Duplication of state - your source of truth is now confusing, since you have two "copies" of the data. For more details on a similar anti-pattern, see: https://facebook.github.io/react/tips/props-in-getInitialState-as-anti-pattern.html
  • Under fetching - If your first function fails to copy/return one of the data elements needed by render, then your render function will crash (or at least return the wrong result). This was totally avoidable, because the data you need is in your datastore, you just didn't copy it out.
  • Over fetching - Rather than reasoning about what data will be needed when, it's easy to just "fetch everything" instead of trying to figure out the minimal set of what you'll need. This is wasteful from a performance perspective, and results in death-by-a-thousand-papercuts ,where each component is slightly wasteful and the result is a whole application that is slow. React apps are often fast because we intentionally make it difficult to fall into these anti-patterns.

All three issues just evaporate if your definition of "in use" is based on what "is actually used by render".

Additionally, if you copy the data manually, it means you're managing your subscriptions manually (because you need to know when your data changes), which is error prone. It's super easy to forget to unsubscribe from one of your data sources when your component unmounts (oh, and manual subscriptions are super side-effect-full).

commented

@jimfb Thanks! So you're saying that it's better to read data from the Flux/redux/whatever library directly right from within render(), instead of putting it into state and then reading it from there in render()? It makes sense, but I remember reading the opposite somewhere. So I looked it up, and this is one of the official examples on the Flux site:

image

(https://facebook.github.io/flux/docs/todo-list.html#content)

^ Isn't it copying from the flux store into state, then using this.state right within render? Is that still the recommended approach?


Another thing I noticed is that a bunch of places in the React source warns users with messages like:

"Render methods should be a pure function of props and state"

(https://github.com/facebook/react/search?utf8=%E2%9C%93&q=Render+methods+should+be+a+pure+function+of+props+and+state)

Doesn't your approach break that contract? Did I misunderstand something?

Thanks for bearing with me, I really appreciate all the thorough answers. We're starting to implement side loading in our apps and want to know what the React team recommends.

commented

Yeah, my thinking is that the flux example creates a shadow copy of state, and shadow copies are bad, so I tend to think that reflects negatively on the example. However, the example does have some redeeming qualities (namely that it's a commonly used pattern, especially in the context of data transformation/memoization, and also has a render function that is clearly pure). For the record, the same example shows a component manually subscribing/unsubscribing from the datastore, which is exactly the opposite of what we would want in a perfect world and is the whole reason for these issues discussing sideways data loading. To say the example is 100% good or 100% bad is too simplistic. We are still learning about "best practices" and will continue to improve the libraries and documentation over time.

Yes, components should be a pure function of props and state. However, once you implement flux, your state may be in a datastore rather than in your components. The primary point of that sentence is to communicate the fact that components should never imperatively mutate the data/ui. To see this, allow me to paste in the remainder of that sentence from the first search result you linked...

The render() function should be pure, meaning that it does not modify component state, it returns the same result each time it's invoked, and it does not read from or write to the DOM or otherwise interact with the browser (e.g., by using setTimeout).

commented

@jimfb Thanks for clearing that up, and thanks for all your work on React, it's been a real pleasure to use. I am looking forward to seeing what best practices the community brings forward with time.

@sebmarkbage was showing @jayphelps this, and I think I mentioned it in short-form on Twitter... but providing this functionality would enable people to do something like the following (typical autocomplete example):

import { Component } from 'react';
import { boundSubject } from 'bound-subject-decorator';
import * as Rx from 'rxjs';

export default MyComponent extends Component {
  @boundSubject // binds the subject's handlers to the subject instance
  keypresses = new Subject();

  observe() {
    return {
      // compose keypresses into a throttle stream of async auto complete data
      autoCompleteData: this.keypresses
        .map((e) => e.target.value)
        .throttle(500)
        .switchMap(query => getSomeAjaxObservable(query))
    };
  }

  render() {
    return (
      <div>
        <input onKeyPress={this.keypresses.next}/>
        <ul>{this.data.autoCompleteData.map(x => (<li>{x}</li>))</ul>
      </div>
    );
  }
}

I'm not sure if I'd recommend people use this pattern, but they certainly could and it's pretty ergonomic which is nice. I prefer mixing Redux and RxJS, myself, as I think it's better architecturally.

commented

See also https://mobxjs.github.io/mobx/getting-started.html for a related approach based on tracking reads and writes. I think it was mentioned in this thread before, but it's gotten more mature since then, and I've only heard good things about it from the people who used it.

cc @mweststrate

MobX indeed tracks reads and writes. From React perspective only the reads are interesting. The MobX @observer decorator wraps the render function of a component and tracks which observables are accessed during render.

The rest is pretty much the same as described by @sebmarkbage in the initial post of this thread; after each render the old subscriptions are disposed and new ones are made. It makes it extremely straight forward to write components, regardless where the observables used in render are coming from (props, state, context, closure, singletons etc..).

Performance wise this approach is very beneficial, as child and parent components can render independently, which usually saves the React reconciler a lot of work. MobX also adds the PureRenderMixin to all components; there is no need to deeply check the equality of objects if their properties are observable...

Since a few months I have actually stopped using React's own state mechanism, as observable instance fields are more straightforward to use (I have regularly seen colleagues storing state, which wasn't relevant for the render, in the components state. This resulted in unexpected component behavior. Or they struggled with the async nature of setState). Automatic subscriptions avoids these problems.

Technically this makes render indeed impure, but in reality I did never run into trouble with this, since rendering is still idempotent. If you force extra rendering calls, or when the subscriptions are lost, rendering will still work as expected. Because the observable subscriptions don't influence the how or what of the rendering, just the when.

See also this jsbin if you want to play a bit with MobX

@mweststrate it seems like the primary function of mobx is to create objects and arrays who notify when their properties are set, so some external consumer can be notified can get from those properties. All map and filter-type functionality I see in there were effectively maps and filters over subclassed Arrays. Other than that, there's some mechanism that is able to lookup even emitters that are bound to each property to wire up notifications. Is that accurate?

Where is reactive programming composition in this particular design? It seems like it's more of a paradigm for a sort of pull-based reactive model layer (very similar to Ember's, actually) with that @computed property.

It's really cool/interesting, I'm just not sure we're talking about the same thing here. Also the term "observable" in the case of MobX, while accurate by Merriam-Webster's definition, might be a misnomer. When I think "observable", I think Bacon, Most, Kefir or Rx, etc.

commented

Also the term "observable" in the case of MobX, while accurate by Merriam-Webster's definition, might be a misnomer. When I think "observable", I think Bacon, Most, Kefir or Rx, etc.

Yeah. I posted it because I think this is the sense the original issue post was implying (see also #3920 for something similar to MobX in spirit).

@Blesh there are many kinds of observables indeed, it's a very old concept after all (way older then just Rx). The observables of MobX are similar to the ones in javelin, ember, knockout and the killed Object.observe proposal, and different from the ones in Rx and Bacon. It are observable values instead of observable event streams. So there is no explicit reactive programming composition in MobX in terms of operators and such. That is exactly the thing it abstracts away from.

I doubt however whether the distinction between those categories is relevant in the context of this thread (otherwise I gladly elaborate on it). As long as an observable implements Symbol.observable it should be fine to mix and match them. I think that MobX proves that a mechanism that automatically (un)subscribes to observables as a side effect of rendering is a very viable mechanisms (even in the current versions of React).

If this mechanism is pulled into react, the thing that is needed is an api that allows an observable to signal React that it was accessed. Something like React.notifyThatObservableHasBeenAccessed(observable). So that React can link it to the currently rendering component. This way developers are not bothered with explicit dependencies for sideways loading, as React and the reactive library can settle this themselves.

This way developers are not bothered with explicit dependencies for sideways loading

Honestly, I'd prefer that React didn't have opinions about what async types it works best with at all, even if they're Rx observables. But rather handled generic async types via interoperability points like [Symbol.observable] or thennables

@Blesh indeed, standards are the best :). But as pointed out by @jimfb having observables / thennables in itself is not enough to solve the problem of over- or undersubscribing, nor does it allow for automatically subscribing to observables used by render.

commented

I’ll close since this discussion is pretty stale. For people who like this approach Mobx does a good job 🙂