matz / streem

prototype of stream-based programming language


could there be more streams?

wdiechmann opened this issue · comments

@matz I watched the Full Stack Fest 2015 on YouTube just minutes ago and hurried to see what exciting things you are up to :)

Perusing the examples got me wondering whether the *nix stream descriptors could come in handy (and be (mis)used)?

Like

[ 1,2,3 ] | > ( { |x| [ x, x*x, x*x*x ] } |1> { |x| puts x } |2> { |x| puts x } |3> { |x| puts x } ) | stdout
# 
# output would be
1 
1 
1
2
4
8
3
9
27

In that sense you would build your own concurrency and leave it to the interpreter to hand each stream to whatever core resources are available -- in the above example 1, 2 or 3 cores might be used depending upon availability -- and the parentheses would indicate scopes of concurrency. Heck, you could even (given some pre-knowledge of the computational challenges) throw a seeding value into the mix, like

| stream_description, resource_allocation_weight >

 |1,1> { |x| puts x } |2,2> { |x| puts x } |3,3> { |x| puts x }

I cannot wait for you to 'build' this streem (and apply it to Ruby) in effect providing us with some relief when we do:

- resources.each do | resource |
  = render partial 'item', locals: { item: resource }

(and resources is a 10,000 row SQL select ActiveRecord::Relation object)


The stream descriptors would obviously have to work the other way around too. Think about a ticket system in the subway:

ticket_verifier 1< reader1 2< reader2 3< reader3 .... N< readerN |1> big_screen 2> logfile

with no one using reader24 - reader32, they are (today) just sitting there eating up cycles - with your streem, they stop being an embarrassment :)
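The fan-in half of this (many readers feeding one verifier) can be sketched in plain Ruby with a shared queue - a sketch only, with made-up reader/verifier names, not Streem syntax:

```ruby
# Fan-in sketch: several "reader" threads push events into one shared
# queue and a single "verifier" drains it. All names are illustrative.
events = Queue.new

readers = (1..3).map do |n|
  Thread.new do
    2.times { |i| events << "reader#{n}: ticket #{i}" }
  end
end

readers.each(&:join)   # all readers done producing
events << :done        # sentinel so the verifier knows when to stop

verified = []
while (e = events.pop) != :done
  verified << e
end

puts "verifier saw #{verified.length} tickets"  # => verifier saw 6 tickets
```

An idle reader here costs nothing but a parked thread; the verifier only wakes when something actually arrives on the queue.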

Full 'steem' ahead Matz :D

oh - and thank you so much for Ruby!!!!

cheers
Walther

ps: if this entire post is just about the dumbest you have ever read - please excuse me for wasting your bandwidth!

I am not sure what you want (yet). For your information, you can

[1,2,3] | {x->[x,x*x,x*x*x]|stdout}

Actually, I found a bug so this program does not work yet. But conceptually you can create your own pipelines within pipeline streaming.

Oh - my example is of course hopelessly mundane, but my point is that by allowing the stream(s) to branch you increase the opportunities for concurrency (and make it much clearer to the programmer that this is processing concurrently) -

On a different (and yet the same) note: we have to rethink the representation of code! With massive parallelism we have to abandon 'the scroll' and arrange streams from left (or right) - one big advantage will be that editors will be easy to access and use for programmers like yourself, coming from an RTL language, right? Even today many programmers try to keep chunks of code down to one 'display-page' - and for that very same reason: being easily overviewed/grasped

                     x xxx xxxx xxxxx xxxxxx xxx xxx xxxx xxx

   xxxxxx    xxxx xxxxxxx xxxxxxx xxxxxxx xxxxxx xxxxxxxx

                      xxxxxxx xxxxx xxxxxxx

Swiping left to right (or in reverse) will easily let us overview 20-30 processes in parallel, and making an IDE/editor able to 'unfold' the instructions within a stream would not be insurmountable - but I digress :)

In my experience branching is a clear indication of (possible) concurrency (take Git for example). Branches may reach to a point where they merge - or one branch will never make it past the first few commits/instructions.

In the real world things are concurrent too. Each morning my wife and I take different paths to work - and yet usually in the evening, we meet again.

But - I have not seen all the syntax - so perhaps it is already in there - and then I apologize again for entertaining you with this simple feature

OK, I haven't made a detailed document yet. Any input is welcome.

In Streem, a | b connects two streams a and b, and when you connect a multiple times as the left-hand side, it delivers the same values to all right-hand sides, e.g.

a = [1,2,3]
a | b
a | c

b and c will each receive the three values 1, 2, 3.
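A plain-Ruby sketch of that fan-out behaviour (illustrative only, not Streem itself):

```ruby
# Fan-out sketch: one source connected to several sinks delivers every
# value to each sink, mirroring "a | b" followed by "a | c" in Streem.
a = [1, 2, 3]
b = []
c = []

[b, c].each { |sink| a.each { |v| sink << v } }

p b  # => [1, 2, 3]
p c  # => [1, 2, 3]
```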

On the other hand, when you connect multiple left-hand sides to a stream, the right-hand side will receive all values from all left-hand sides, e.g.

[1,2,3]|stdout
[5,6,7]|stdout

will print 1, 2, 3, 5, 6, 7 in arbitrary order. I think we need to add more fork/join operators, such as zip (take one value from each argument stream and make them an array), etc.
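The zip idea described here maps directly onto Ruby's Array#zip - a sketch of the intended shape, not Streem's eventual operator:

```ruby
# Zip join sketch: take one value from each input stream and emit
# them together as an array, pairwise.
left  = [1, 2, 3]
right = [5, 6, 7]

zipped = left.zip(right)
p zipped  # => [[1, 5], [2, 6], [3, 7]]
```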

ok - yes, more fork/join operators will 'cut the mustard' :)

like with STDIN where all the 2's should go to process2, all the 1's to process1, the 3's to process3, and the rest to process4 - that was what I was trying to make a notation for, previously :)

you will always want to throw some information out (the stdout way), keep some (persisting it would be the decision of another stream), let some supervisor stream in on the deal with a few state updates here and there, and of course, finally tell the user about what is going on (displaying it is what we do today - but who knows, perhaps tomorrow we might just toss a 'display-object' into a stream connected to our neural cortex :)

it really is not

a | b
a | c 

but more like 'tagged' bins which 'a' can throw output into - without these bins being defined within the 'a' process :)

makes sense?

Perhaps, something like

a | switch_by_tag(:a,proc1,:b,proc2,:any,proc3)

would meet your idea, wouldn't it?
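switch_by_tag is a proposed operator here, not something Streem implements; the dispatch it describes can be sketched in plain Ruby:

```ruby
# Tag-based dispatch sketch: route each tagged value to the handler
# registered for its tag, with :any as the fallback. switch_by_tag is
# a hypothetical Streem operator; this Ruby only mimics the idea.
def switch_by_tag(stream, handlers)
  stream.map do |tag, value|
    handler = handlers[tag] || handlers[:any]
    handler.call(value)
  end
end

handlers = {
  a:   ->(v) { "proc1 got #{v}" },
  b:   ->(v) { "proc2 got #{v}" },
  any: ->(v) { "proc3 got #{v}" },
}

result = switch_by_tag([[:a, 1], [:b, 2], [:c, 3]], handlers)
p result  # => ["proc1 got 1", "proc2 got 2", "proc3 got 3"]
```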

yes - kind of

excuse me for not explaining this concisely - English is not my first language either :$

allow me one more try :)

stdin = [ pear, apple, tuna, broccoli, apple, apple, golf ball ]

now - the pears should be sliced and arranged on a plate, the apples squeezed to juice and bottled, the tuna arranged with the broccoli, the broccoli boiled lightly, and the golf ball has nothing to do here whatsoever :D

stdin |
(purveyor 1> pear_slicing 2> apple_squeezer 3> broccoli_boiler 4> other_stuff) | 
(1> pear_plating 2> apple_bottling 3> tuna_broccoli ) |
stdout

now we would have

stdout = [ sliced_pears_on_plate, bottled_apples, tuna_arranged_with_broccoli ]
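If it helps, here is a plain-Ruby sketch of that kitchen pipeline - the route table and handler names are made up, and the golf ball simply falls out of the stream because no route matches it:

```ruby
# Kitchen-pipeline sketch: classify each incoming item, run it through
# its own processing branch, and drop anything with no branch.
# All names are illustrative, not Streem syntax.
ROUTES = {
  "pear"     => ->(x) { "sliced_#{x}s_on_plate" },
  "apple"    => ->(x) { "bottled_#{x}s" },
  "tuna"     => ->(x) { "#{x}_arranged_with_broccoli" },
  "broccoli" => ->(x) { "boiled_#{x}" },
}

stdin_items = %w[pear apple tuna broccoli apple apple golf_ball]

stdout_items = stdin_items
  .uniq                                       # each product handled once here
  .filter_map { |item| ROUTES[item]&.call(item) }

p stdout_items
```

(In the original example the boiled broccoli is then joined with the tuna in a second stage; this sketch stops after the first stage.)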

Another 'real world' example is that of network traffic - with a lot of different protocols coming in and even stuff that does not belong at all. Today we handle that through a number of 'layers' - but imagine if your code/syntax could open this cacophony and make beautiful music/code out of it - in parallel :)

ok - I promise: I'll stop pestering you now and leave you with one last try at describing the syntax for parallelism :)

this 'system' receives tons of barcodes from a cold store warehouse and will update a ProductList on new barcodes, check the stocks (another system), build new requisition order items if stocks have reached the low-water mark, and send orders to suppliers. Sequentially this will work its way through every barcode and, with 80% of the barcodes being known, waste valuable time checking for Product, building Product and checking Stock.

stdin | 
  process1 |
    UnknownBarcode::Barcode> ( process2 | process5 )
    StockUnknownBarcode::Barcode> ( process3 StockUnknownBarcode::Barcode< process5 | process4 )
    StockedBarcode::Barcode> ( process4 | process5 )

(and this example adds object classes to the forks - to ease the interpreter on the job of shuffling objects into the correct queues on the heap)

what I really like about it is that

  • each process will know absolutely nothing about the outside world *)
  • editors will be able to make it very easy to do drag-dropping of streams
  • the code is able to document an entire workflow


*) that is a lie, obviously - but a thing to strive for :)