davidmoten / state-machine

Finite state machine class generator for java, exports graphml, supports immutability!

How to push a stream of JSON messages through state-machine?

davidmoten opened this issue · comments

From email:

David-
Thank you so much for contributing to the code for RxJava2.

I have an application where I am receiving a single Flowable stream of JSON-formatted events that come from a number of different sources. Each event has the sourceID in it. These events are asynchronous. There can be a few events for a given source or a continual (never-ending) stream of events. I cannot know the specific sourceID beforehand, and there could be events from a thousand different sources intermingled in the single stream.

Question: If I need to evaluate the state of a given source using a Finite State Machine, do you have any ideas about adapting your state-machine to safely be used in such an environment? I could see trying to keep the state of a given source in a hashmap, but cannot quite envision how to fold your state machine into that scheme. Do any other approaches come to mind?

Kindest regards,

Ewin Barnett
Columbia, Missouri.

Hi Ewin

This sort of conversation is much better to put on github issues so other people can see now and later and also I can use markdown for formatting. Do you mind if we move the conversation there? We don't have to if you don't want to of course.

Ok you are getting a single Flowable stream that is the merge of all the source streams. Presumably the sources can be categorized such that any source is of type T and emits events that happen to a T with unique id in the T domain.

The first step is to make a state machine definition like:

https://github.com/davidmoten/state-machine/blob/master/state-machine-test-definition/src/main/java/com/github/davidmoten/fsm/example/StateMachineDefinitions.java

Then you define a Processor like the createProcessor method in:

https://github.com/davidmoten/state-machine/blob/master/state-machine-test/src/test/java/com/github/davidmoten/fsm/rx/ProcessorTest.java

To use the rx side, you also specify the signals when you build the Processor:

Observable<Signal<?, String>> signals = ...
Processor<String> processor = Processor //
    .behaviour(Microwave.class, behaviour) //
    .processingScheduler(Schedulers.immediate()) //
    .signalScheduler(signalScheduler) //
    .signals(signals) //
    .build();

then subscribe to the processor:

processor.observable().subscribe(subscriber);

By the way, Flowable is RxJava2; the equivalent in RxJava1 is Observable. This project supports RxJava1 at the moment, but I'll probably migrate to RxJava2 this week seeing as you're asking about it.
Suppose now you have an Observable which is a stream of JSON messages as you've described; then signals is constructed like this:

Observable<Signal<?, String>> signals = json.map(s -> {
    Class<?> cls = ...; // from json
    String id = ...; // from json
    Event<?> event = ...; // from json
    return Signal.create(cls, id, event);
});
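
The hashmap idea from the original question can be seen in isolation with a minimal sketch in plain Java, without RxJava or this library. It assumes each JSON message has already been reduced to a source id and an event name. `PerSourceRouter`, `SourceState`, and the "start"/"stop" event names are hypothetical and are not the library's API; the point is only that the id carried by each message selects which state gets updated.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the state-machine library's API:
// one current state per source id, looked up in a map.
class PerSourceRouter {

    enum SourceState { IDLE, ACTIVE }

    private final Map<String, SourceState> stateById = new HashMap<>();

    // Applies one event to the state kept for sourceId and returns the new state.
    // Not thread-safe: assumes events are delivered one at a time.
    SourceState onEvent(String sourceId, String eventName) {
        SourceState next = transition(stateOf(sourceId), eventName);
        stateById.put(sourceId, next);
        return next;
    }

    // A source that has not been seen yet starts in IDLE.
    SourceState stateOf(String sourceId) {
        return stateById.getOrDefault(sourceId, SourceState.IDLE);
    }

    // Illustrative rules only; unrecognised events leave the state unchanged.
    private static SourceState transition(SourceState current, String eventName) {
        switch (eventName) {
            case "start": return SourceState.ACTIVE;
            case "stop":  return SourceState.IDLE;
            default:      return current;
        }
    }

    public static void main(String[] args) {
        PerSourceRouter router = new PerSourceRouter();
        // Events from two sources arrive interleaved in a single stream.
        router.onEvent("unit-1", "start");
        router.onEvent("unit-2", "start");
        router.onEvent("unit-1", "stop");
        System.out.println(router.stateOf("unit-1")); // IDLE
        System.out.println(router.stateOf("unit-2")); // ACTIVE
    }
}
```

With the library, the id passed to Signal.create plays the role of the map key here, so this bookkeeping would not need to be written by hand.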

I think we need a fully worked example in the docs/tests for this.
Does that help?

From Ewin:

First, yes. Happy to move the conversation to GitHub.

Second. Thank you so much for your generous answer!

Third: Let me work my way through your answer, pondering much along the way. And yes, it would be great to develop a working example.

Cheers!
--Ewin

I've created an example at https://github.com/davidmoten/state-machine/blob/master/state-machine-test/src/test/java/com/github/davidmoten/fsm/rx/StreamingTest.java which accepts a JSON stream and pushes the events through the state machine using RxJava. Have a look at that.

RxJava2 migration done in branch rx2. Let me know if you'd be happy to be cutting edge and use that version. If so then I can release to Maven Central.

Version 0.2 released to Maven Central supporting RxJava 2 (Flowables).

David-

Super. I will take a look.
BTW, where do I fetch java.util.function.Supplier from? My quest to locate it has been fruitless. Must be my male refrigerator blindness.

--Ewin

Yes, yes. Brain fart question.
--Ewin

David -
Question on how to adapt my states and events to your architecture.

"As you can see the definition is pretty concise. This is largely because of the advisable constraint that any one State can only be arrived at via one Event type."

I have asynchronous status events from a number of different units that are received by this IOT box. There are several sequences. Most have only one event. One has more than one event in the sequence.

One event is "powerup" (PU event). Another event is "powerup-complete-unit-ready".(PUCR event) Another is "fault" (F). Another event is a chain of several events that each follow each other in a fixed format (DiagA, DiagB, DiagC events).

After each of these sequences, the State for that unitID is "ready for input". That seems to violate your architecture because I can arrive at this State from a PU, a PUCR, a F and a DiagC event.

Can you give any guidance about how to approach defining this?

--Ewin

David-

Please let me add that the units I am monitoring are remote and may suffer from an internet/power outage. So, if I start to receive events for a sequence that is supposed to have three events (like DiagA, DiagB and DiagC), but I only receive DiagA, I need to have a timeout that can take me to an ErrorReport state, and from there back to "ready for input" state. This only adds to the number of States that route me back to the single "ready for input" State, which appears to violate your architecture.

I didn't yet say that the unit emits a heartbeat event every 60 seconds and the "ready for input" state has a 70 second idle timer to alert when a heartbeat is not received. But this is why I must use a FSM as I am sure you can see.

Cheers!
--Ewin

There are at least a couple of approaches that you can use:

  • event inheritance
  • adding intermediate states

If you use inheritance the critical thing is that the onEntry procedure for the ReadyForInput state should not have

if event is of type blah
  do this
else if event is of type blah2 
  do that
...

That sort of treatment should ideally be extracted into extra states and events.

To use extra states, suppose you want

A -> R via E1
B -> R via E2

As you've noted that violates the rule so you can create extra states A', B':

A -> A' via E1
B -> B' via E2
A' -> R via E3
B' -> R via E3

The onEntry procedures for A', B' look very simple:

/onEntry
send E3 to self
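
The "one event type per state" rule can be checked mechanically. The following is a minimal sketch in plain Java that lists every state arrived at via more than one event type. `OneEventPerStateCheck`, `Transition`, and `violations` are hypothetical helpers, not part of the library, and the second table assumes that B (not A) moves to B' via E2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: checks a transition table against the
// "any one state is arrived at via only one event type" rule.
class OneEventPerStateCheck {

    static final class Transition {
        final String from;
        final String to;
        final String event;

        Transition(String from, String to, String event) {
            this.from = from;
            this.to = to;
            this.event = event;
        }
    }

    // Returns the target states that are reached via more than one event type.
    static List<String> violations(List<Transition> transitions) {
        Map<String, Set<String>> eventsByTarget = new LinkedHashMap<>();
        for (Transition t : transitions) {
            eventsByTarget.computeIfAbsent(t.to, k -> new LinkedHashSet<String>()).add(t.event);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : eventsByTarget.entrySet()) {
            if (entry.getValue().size() > 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Transition> direct = Arrays.asList(
                new Transition("A", "R", "E1"),
                new Transition("B", "R", "E2"));
        List<Transition> withIntermediates = Arrays.asList(
                new Transition("A", "A'", "E1"),
                new Transition("B", "B'", "E2"),
                new Transition("A'", "R", "E3"),
                new Transition("B'", "R", "E3"));
        System.out.println(violations(direct));            // [R]
        System.out.println(violations(withIntermediates)); // []
    }
}
```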

I'd be glad (in fact quite interested) to review the full state diagram in a form like above if you like.

BTW the fact that PowerUp and Fault events both take us to ReadyForInput looks like a prime candidate for extracting intermediate states especially as I imagine that a Fault event might normally involve some sort of side-effect for a lot of systems (like logging etc).

State1 -> PoweringUp via PowerUp
State2 -> FaultOccurred via Fault
PoweringUp -> ReadyForInput via Ready
FaultOccurred -> ReadyForInput via Ready

FaultOccurred onEntry procedure:

/entry
log failure
send Ready to self

PoweringUp onEntry procedure:

/entry
send Ready to self
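
As a rough illustration of how "send Ready to self" behaves, here is a minimal sketch in plain Java of the four transitions and two entry procedures above. It is hand-rolled and does not use the classes this library generates; `UnitMachine` and its queue of self-sent events are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical hand-rolled machine, not code generated by the library.
class UnitMachine {

    enum State { STATE1, STATE2, POWERING_UP, FAULT_OCCURRED, READY_FOR_INPUT }

    enum Event { POWER_UP, FAULT, READY }

    private State state;
    private final Deque<Event> pending = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    UnitMachine(State initial) {
        this.state = initial;
    }

    State state() {
        return state;
    }

    // Delivers an external event, then drains events the machine sent to itself.
    void signal(Event event) {
        pending.addLast(event);
        while (!pending.isEmpty()) {
            State next = next(state, pending.removeFirst());
            if (next != null) {
                state = next;
                onEntry(next);
            }
        }
    }

    // Transition table from the text; null means the event is ignored in that state.
    private static State next(State s, Event e) {
        if (s == State.STATE1 && e == Event.POWER_UP) return State.POWERING_UP;
        if (s == State.STATE2 && e == Event.FAULT) return State.FAULT_OCCURRED;
        if ((s == State.POWERING_UP || s == State.FAULT_OCCURRED) && e == Event.READY) {
            return State.READY_FOR_INPUT;
        }
        return null;
    }

    // Entry procedures: the intermediate states forward themselves with Ready.
    private void onEntry(State entered) {
        switch (entered) {
            case FAULT_OCCURRED:
                log.add("failure logged"); // stand-in for real logging
                pending.addLast(Event.READY);
                break;
            case POWERING_UP:
                pending.addLast(Event.READY);
                break;
            default:
                break;
        }
    }

    public static void main(String[] args) {
        UnitMachine machine = new UnitMachine(State.STATE2);
        machine.signal(Event.FAULT);
        System.out.println(machine.state()); // READY_FOR_INPUT
        System.out.println(machine.log);     // [failure logged]
    }
}
```

ReadyForInput is only ever entered via Ready, so its entry procedure needs no branching on the event type; the fault-specific side effect lives in FaultOccurred.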

Thanks, I'll take a stab at it.

The first thing I'm looking for is independence. The alarm event needs to be logged no matter what else is going on and presumably doesn't directly affect any other interaction with the device. The alarm event certainly may indicate something worrying has happened to the device but cancellation of other conversations with the device might be expected to happen via timeout rather than as a consequence of the alarm. So based on that assumption I treat alarm events completely separately.

The command conversation looks to be modelled by a standard looking state machine for DeviceCommand (I model each command conversation separately because they could presumably happen concurrently and I assume that each Data response is returned with a unique id to associate it with the command).

Symbology-wise, I represent a transition like this:

State1 -> State2: Event

DeviceCommand state machine

Transitions:

Initial State -> CommandSent: Command
CommandSent -> HasResponse: Data
HasResponse -> Terminal State: Done
CommandSent -> TimedOut: Timeout
TimedOut -> Terminal State: Done

onEntry procedures:

CommandSent:

send Timeout to self in N seconds
send Command to device

HasResponse:

cancel delayed signal to self 
persist Data
send Done to self

TimedOut:

persist timeout result
send Done to self
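
The DeviceCommand machine above can be sketched in plain Java as follows. This is a hand-written illustration, not the library's generated code: `DeviceCommandMachine` is a hypothetical name, and time is passed in explicitly as a number of seconds (a virtual clock) so the delayed Timeout and its cancellation can be followed deterministically. A real implementation would presumably rely on a scheduler instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the DeviceCommand machine using a virtual clock.
class DeviceCommandMachine {

    enum State { INITIAL, COMMAND_SENT, HAS_RESPONSE, TIMED_OUT, TERMINAL }

    enum Event { COMMAND, DATA, TIMEOUT, DONE }

    private final long timeoutSeconds;
    private State state = State.INITIAL;
    private long timeoutDueAt = -1; // -1 means no delayed Timeout is pending
    final List<String> persisted = new ArrayList<>();

    DeviceCommandMachine(long timeoutSeconds) {
        this.timeoutSeconds = timeoutSeconds;
    }

    State state() {
        return state;
    }

    // Moves the virtual clock to 'now' and fires the delayed Timeout if it is due.
    void advanceTo(long now) {
        if (timeoutDueAt >= 0 && now >= timeoutDueAt) {
            timeoutDueAt = -1;
            handle(Event.TIMEOUT, now);
        }
    }

    // Delivers an external event that arrives at virtual time 'now'.
    void signal(Event event, long now) {
        advanceTo(now);
        handle(event, now);
    }

    private void handle(Event event, long now) {
        State next = next(state, event);
        if (next == null) {
            return; // event not valid in the current state: ignored
        }
        state = next;
        switch (next) {
            case COMMAND_SENT:
                timeoutDueAt = now + timeoutSeconds; // send Timeout to self in N seconds
                // sending the command to the device would happen here
                break;
            case HAS_RESPONSE:
                timeoutDueAt = -1;      // cancel delayed signal to self
                persisted.add("data");  // persist Data
                handle(Event.DONE, now); // send Done to self (delivered immediately here)
                break;
            case TIMED_OUT:
                persisted.add("timeout"); // persist timeout result
                handle(Event.DONE, now);
                break;
            default:
                break;
        }
    }

    private static State next(State s, Event e) {
        switch (s) {
            case INITIAL:
                return e == Event.COMMAND ? State.COMMAND_SENT : null;
            case COMMAND_SENT:
                if (e == Event.DATA) return State.HAS_RESPONSE;
                if (e == Event.TIMEOUT) return State.TIMED_OUT;
                return null;
            case HAS_RESPONSE:
            case TIMED_OUT:
                return e == Event.DONE ? State.TERMINAL : null;
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        DeviceCommandMachine answered = new DeviceCommandMachine(5);
        answered.signal(Event.COMMAND, 0);
        answered.signal(Event.DATA, 3);
        System.out.println(answered.state() + " " + answered.persisted); // TERMINAL [data]

        DeviceCommandMachine silent = new DeviceCommandMachine(5);
        silent.signal(Event.COMMAND, 0);
        silent.advanceTo(5);
        System.out.println(silent.state() + " " + silent.persisted); // TERMINAL [timeout]
    }
}
```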

With regard to the Diagnostic events I assume that any sequence is uniquely identified so that we can handle two concurrently emitted sequences. If so then I would expect a state machine for each sequence that would look as below.

DeviceDiagnosticSequence state machine

Transitions:

Initial State -> DiagA Received: DiagA
DiagA Received -> DiagB Received: DiagB
DiagB Received -> DiagB Received: DiagB
DiagB Received -> DiagC Received: DiagC
DiagA Received -> Instrument Failed: InstrumentFailure
DiagC Received -> Terminal State: Done
DiagA Received -> Timed Out: Timeout
DiagB Received -> Timed Out: Timeout
DiagC Received -> Timed Out: Timeout
Timed Out -> Terminal State: Done
Instrument Failed -> Terminal State: Done

onEntry procedures:

DiagA Received:

persist DiagA
send Timeout to self in N seconds

DiagB Received:

cancel delayed signal to self
persist DiagB
send Timeout to self in N seconds

DiagC Received:

cancel delayed signal to self
persist DiagC
send Done to self

Timed Out:

persist timeout info
send Done to self

Instrument Failed:

cancel delayed signal to self
persist InstrumentFailure
send Done to self
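
The transition list for DeviceDiagnosticSequence can also be written down as a lookup table. The sketch below is plain Java with hypothetical names (`DiagnosticTransitions`, `next`, `run`); it covers only the transitions, not the entry procedures, and it assumes the state left via Done after a timeout is Timed Out. Events that are not valid in the current state are simply ignored here, which is a choice made for the example.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: the DeviceDiagnosticSequence transitions as a table.
class DiagnosticTransitions {

    enum State {
        INITIAL, DIAG_A_RECEIVED, DIAG_B_RECEIVED, DIAG_C_RECEIVED,
        INSTRUMENT_FAILED, TIMED_OUT, TERMINAL
    }

    enum Event { DIAG_A, DIAG_B, DIAG_C, INSTRUMENT_FAILURE, TIMEOUT, DONE }

    private static final Map<State, Map<Event, State>> TABLE = new EnumMap<>(State.class);

    private static void add(State from, Event event, State to) {
        TABLE.computeIfAbsent(from, k -> new EnumMap<Event, State>(Event.class)).put(event, to);
    }

    static {
        add(State.INITIAL, Event.DIAG_A, State.DIAG_A_RECEIVED);
        add(State.DIAG_A_RECEIVED, Event.DIAG_B, State.DIAG_B_RECEIVED);
        add(State.DIAG_B_RECEIVED, Event.DIAG_B, State.DIAG_B_RECEIVED);
        add(State.DIAG_B_RECEIVED, Event.DIAG_C, State.DIAG_C_RECEIVED);
        add(State.DIAG_A_RECEIVED, Event.INSTRUMENT_FAILURE, State.INSTRUMENT_FAILED);
        add(State.DIAG_C_RECEIVED, Event.DONE, State.TERMINAL);
        add(State.DIAG_A_RECEIVED, Event.TIMEOUT, State.TIMED_OUT);
        add(State.DIAG_B_RECEIVED, Event.TIMEOUT, State.TIMED_OUT);
        add(State.DIAG_C_RECEIVED, Event.TIMEOUT, State.TIMED_OUT);
        add(State.TIMED_OUT, Event.DONE, State.TERMINAL);
        add(State.INSTRUMENT_FAILED, Event.DONE, State.TERMINAL);
    }

    // Empty when the event is not valid in the given state.
    static Optional<State> next(State from, Event event) {
        Map<Event, State> row = TABLE.getOrDefault(from, Collections.<Event, State>emptyMap());
        return Optional.ofNullable(row.get(event));
    }

    // Folds a sequence of events from the initial state, ignoring invalid events.
    static State run(Event... events) {
        State current = State.INITIAL;
        for (Event event : events) {
            current = next(current, event).orElse(current);
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(run(Event.DIAG_A, Event.DIAG_B, Event.DIAG_B, Event.DIAG_C, Event.DONE)); // TERMINAL
        System.out.println(run(Event.DIAG_A, Event.TIMEOUT, Event.DONE)); // TERMINAL
        System.out.println(run(Event.DIAG_A, Event.DIAG_C)); // DIAG_A_RECEIVED
    }
}
```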

Bear in mind that what I've put above are short lived state machines that spring into existence when prompted to by an event that uniquely identifies a new device command or new device diagnostic sequence and then die when something terminal happens to them like a timeout or a finalising event.

If all the incoming command and diagnostic events from all devices were in one incoming stream then the stream would be grouped by commandId/diagnosticSequenceId and the member streams would then be processed synchronously against new state machine instances as described above.
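
The lifecycle described here (a machine springs into existence on the first event for an id and is discarded once it reaches a terminal state) can be sketched in plain Java. `SequenceRegistry` and `SequenceMachine` are hypothetical stand-ins, and the rule that "DiagC" or "Timeout" ends a sequence is a simplification for the example; with RxJava the grouping step would more likely be done with a groupBy on the id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: short-lived machines keyed by sequence id.
class SequenceRegistry {

    // Deliberately tiny stand-in for a real state machine instance.
    static final class SequenceMachine {
        final List<String> received = new ArrayList<>();
        private boolean terminal;

        void signal(String eventName) {
            received.add(eventName);
            // Simplification for this sketch: these two events end a sequence.
            terminal = eventName.equals("DiagC") || eventName.equals("Timeout");
        }

        boolean isTerminal() {
            return terminal;
        }
    }

    private final Map<String, SequenceMachine> live = new HashMap<>();
    private int created;

    // Routes one event to the machine for its sequence id, creating the
    // machine on first sight and dropping it once it becomes terminal.
    void onEvent(String sequenceId, String eventName) {
        SequenceMachine machine = live.computeIfAbsent(sequenceId, id -> {
            created++;
            return new SequenceMachine();
        });
        machine.signal(eventName);
        if (machine.isTerminal()) {
            live.remove(sequenceId);
        }
    }

    int liveCount() {
        return live.size();
    }

    int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        SequenceRegistry registry = new SequenceRegistry();
        // Two sequences interleaved in one incoming stream.
        registry.onEvent("seq-1", "DiagA");
        registry.onEvent("seq-2", "DiagA");
        registry.onEvent("seq-1", "DiagB");
        registry.onEvent("seq-2", "Timeout");
        System.out.println(registry.liveCount());    // 1
        registry.onEvent("seq-1", "DiagC");
        System.out.println(registry.liveCount());    // 0
        System.out.println(registry.createdCount()); // 2
    }
}
```

Removing terminal machines keeps the map from growing without bound when the incoming stream never ends.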

I think it helps mentally to distinguish event names from state names and I usually achieve this by using the odd verb in the state name like Has or Received.

David-

Let me code up one of these and get your input.

--Ewin