organix / pijFORTHos

A bare-metal FORTH operating system for Raspberry Pi

Update for Raspberry Pi Model B+

ul5255 opened this issue

Dale,
here is an update to part of your blinker example. This now works on
Raspberry Pi B+. This model has the LED moved to GPIO pin #47.
Also the logic here is that the LED goes ON when you set the pin.
Hope this helps,
Uwe.

16# 20200010 CONSTANT GPFSEL4 \ GPIO function select (pins 40..49)
16# 00E00000 CONSTANT GPIO47_FSEL \ GPIO pin 47 function select mask
16# 00200000 CONSTANT GPIO47_OUT \ GPIO pin 47 function is output
GPFSEL4 @ \ read GPIO function selection
GPIO47_FSEL INVERT AND \ clear function for pin 47
GPIO47_OUT OR \ set function to output
GPFSEL4 ! \ write GPIO function selection

16# 20200020 CONSTANT GPSET1 \ GPIO pin output set (pins 32..53)
16# 2020002C CONSTANT GPCLR1 \ GPIO pin output clear (pins 32..53)
16# 00008000 CONSTANT GPIO47_PIN \ GPIO pin 47 set/clear
: +gpio47 GPIO47_PIN GPSET1 ! ; \ set GPIO pin 47
: -gpio47 GPIO47_PIN GPCLR1 ! ; \ clear GPIO pin 47
: LED_ON +gpio47 ; \ turn on ACT/OK LED (set 47)
: LED_OFF -gpio47 ; \ turn off ACT/OK LED (clear 47)
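
The hex constants above follow from the BCM2835 GPIO register layout: each GPFSELn register covers ten pins with a 3-bit function field per pin, and each GPSETn/GPCLRn register covers 32 pins with one bit per pin. A minimal sketch that derives the values from the pin number is given here. The word names are hypothetical (they are not part of pijFORTHos or blinker.f), and it assumes MOD, LSHIFT, HEX and DECIMAL are available, which may not hold on every build.

```forth
\ Hypothetical helpers, not part of pijFORTHos or blinker.f.
\ Assumes MOD and LSHIFT exist with their usual stack effects.
: FSEL-SHIFT ( pin -- n )    10 MOD 3 * ;               \ offset of the pin's 3-bit field
: FSEL-MASK  ( pin -- mask ) FSEL-SHIFT 7 SWAP LSHIFT ; \ all three function bits
: FSEL-OUT   ( pin -- bits ) FSEL-SHIFT 1 SWAP LSHIFT ; \ function 001 = output
: PIN-BIT    ( pin -- bit )  32 MOD 1 SWAP LSHIFT ;     \ bit in GPSETn / GPCLRn

HEX
2F FSEL-MASK .   \ pin 47 is hex 2F; same value as GPIO47_FSEL (00E00000)
2F FSEL-OUT  .   \ same value as GPIO47_OUT  (00200000)
2F PIN-BIT   .   \ same value as GPIO47_PIN  (00008000)
DECIMAL
```

The same words give the GPIO16 values used later in this thread: field offset 18 in GPFSEL1 and bit 16 in GPSET0/GPCLR0.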

I've integrated your update. Can you provide better attribution information?

Thanks, Dale!

 

Re: Attribution Information
My name is Uwe LANGE from Germany.
Would it be sufficient to update my Github profile?

 

A more general question:

 

What is your goal with pijFORTHos? I am looking for a bare-metal
Forth that is hosted on the Raspberry Pi and has an execution speed
close to what is achievable with hand-written assembler.

 

I experimented with pijFORTHos. As a starting point I wanted to bit-bang
a single-wire protocol (shortest pulse width 5us, max. 10% jitter) on
one of the GPIO pins. While I got it working, the code I ended up with
is rather ugly for the following reasons:

  • access to CONSTANTs is slow, which is rather unfortunate given that
      GPIO access should use symbolic names for registers
  • factoring of words is not possible in tight code because pijFORTHos
      spends most of its time with stack shuffling thanks to its indirect
      threaded code execution

 

I measured several scenarios with a logic analyzer hooked up to
a GPIO pin. If I inline all words and CONSTANTs manually, then I
get a maximum pin toggle frequency of 1MHz. I'd say this is an
order of magnitude less than what I want. If you have access to
a logic analyzer you can try it yourself using your code from
blinker.f: the busy-waiting 'us' word works reliably only for
delays of (IIRC) 7us and up.

 

I'd like to change at least two things in pijFORTHos, both with more
or less impact:

  1. change from indirect threaded code to subroutine threaded code to
       support factoring of code w/o wasting so much time w/ stack shuffling
  2. make the colon compiler smarter with regards to inlining of small
       words to further reduce stack shuffling

  and possibly

  3. experiment with keeping TOS in a register

 

Would you be interested in such work? If yes, I'd try to transform the
pijFORTHos code base. If no, I would probably start from scratch and
just use part of your code base (Makefile, loadmap) as a starting
point (with proper attribution of course).

 

Let me know.

 

Cheers, Uwe.


Analysis of what happens when using CONSTANT:

defword "CONSTANT",8,,CONSTANT
.int WORD              @ get the name (the name follows CONSTANT)
.int CREATE            @ make the dictionary entry
.int DOCOL, COMMA      @ append _DOCOL (the codeword field of this word)
.int TICK, LIT, COMMA  @ append the codeword LIT
.int COMMA             @ append the value on the top of the stack
.int TICK, EXIT, COMMA @ append the codeword EXIT
.int EXIT @ Return.

 

in high level Forth:

 

: CONSTANT ( n -- )
  WORD CREATE
  DOCOL ,
  ' LIT ,
  ,
  ' EXIT ,
;

 

analysis of what will happen during run time:

 

CFA    DOCOL  -->    defconst "DOCOL",5,,DOCOL,_DOCOL
                      _DOCOL  -->   PUSHRSP FIP  --> str FIP, [RSP, #-4]!
                                    add FIP, r0, #4

                                    r0 has the CFA, so we advance by one
                                    cell to the DFA

DFA    LIT    -->    ldr r1, [FIP], #4
                      PUSHDSP r1  --> str r1, [DSP, #-4]!
                      NEXT    -->   b _NEXT

                                    [...]

                                    ldr r0, [FIP], #4
                                    ldr r1, [r0]
                                    bx r1
        n
        EXIT   -->    POPRSP FIP  --> ldr FIP, [RSP], #4
                      NEXT    -->   b _NEXT

                                    [...]

                                    ldr r0, [FIP], #4
                                    ldr r1, [r0]
                                    bx r1

 

condensed:

 

str FIP,  [RSP, #-4]!
add FIP,  r0, #4
ldr r1,   [FIP], #4       @ fetch the CONSTANTs value
str r1,   [DSP, #-4]!     @ push it on the data stack
b   _NEXT

[...]

ldr r0,   [FIP], #4
ldr r1,   [r0]
bx  r1

[...]

ldr FIP,  [RSP], #4
b   _NEXT

[...]

ldr r0,   [FIP], #4
ldr r1,   [r0]
bx  r1

 

while in essence it just needs to do this (inline the LIT actions/defcode):

 

ldr r1,   [FIP], #4       @ fetch the CONSTANTs value
str r1,   [DSP, #-4]!     @ push it on the data stack
b   _NEXT

[...]

ldr r0,   [FIP], #4
ldr r1,   [r0]
bx  r1

 

Sent: Wednesday, 20 August 2014 at 00:12
From: "Dale Schumacher" notifications@github.com
To: organix/pijFORTHos pijFORTHos@noreply.github.com
Cc: ul5255 uwe_lange@web.de
Subject: Re: [pijFORTHos] Update for Raspberry Pi Model B+ (#4)

I've integrated your update. Can you provide better attribution information?



Here are my notes for the GPIO pin toggle timings (measured
with a logic analyzer connected to GPIO16):

\ for test purposes we hard-wire the Si504 to GPIO16
16# 20200004 CONSTANT GPFSEL1
16# 001C0000 CONSTANT GPIO16_FSEL
16# 00040000 CONSTANT GPIO16_OUT

GPFSEL1 @
GPIO16_FSEL INVERT AND
GPIO16_OUT OR
GPFSEL1 !

16# 2020001C CONSTANT GPSET0
16# 20200028 CONSTANT GPCLR0
16# 00010000 CONSTANT GPIO16_PIN

: +Si504 GPIO16_PIN GPSET0 ! ;
: -Si504 GPIO16_PIN GPCLR0 ! ;

: t1 \ just toggle the pin as fast as we can -- 2.5us between toggles
-Si504
+Si504
-Si504
+Si504
-Si504
+Si504
-Si504

+Si504
-Si504
+Si504
-Si504
+Si504
-Si504
+Si504
-Si504

+Si504
;

\ let's try and make this faster: Looking at jonesforth.s
\ we notice that CONSTANT is a surprisingly time-intensive
\ operation. Let's inline those constants in our + and
\ - code:

BASE @ HEX

: +Si504-fast 00010000 2020001C ! ;
: -Si504-fast 00010000 20200028 ! ;

BASE !

: t2 \ this brings it down to 1.5us between toggles
-Si504-fast
+Si504-fast
-Si504-fast
+Si504-fast
-Si504-fast
+Si504-fast
-Si504-fast

+Si504-fast
-Si504-fast
+Si504-fast
-Si504-fast
+Si504-fast
-Si504-fast
+Si504-fast
-Si504-fast

+Si504-fast
;
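
One possible way to keep symbolic register names without paying for a CONSTANT call at run time is an "immediate constant": an IMMEDIATE word that compiles its value as an inline literal into the definition being built. The sketch below is an illustration only and has not been tried on pijFORTHos. The names ending in # are hypothetical, and it assumes [COMPILE], LITERAL and IMMEDIATE behave as they do in JonesFORTH. Such words are only meaningful inside a colon definition.

```forth
\ Sketch only: immediate words that compile their value as a literal.
\ Names with a trailing # are hypothetical. Use only inside : ... ;
BASE @ HEX
: GPSET0#     2020001C [COMPILE] LITERAL ; IMMEDIATE
: GPCLR0#     20200028 [COMPILE] LITERAL ; IMMEDIATE
: GPIO16_PIN# 00010000 [COMPILE] LITERAL ; IMMEDIATE
BASE !

\ Each name expands to "LIT value" at compile time, so these should
\ compile to the same threaded code as +Si504-fast and -Si504-fast.
: +Si504-lit GPIO16_PIN# GPSET0# ! ;
: -Si504-lit GPIO16_PIN# GPCLR0# ! ;
```

If the assumptions hold, a test word built from +Si504-lit and -Si504-lit should show the same timing as t2, since only the source text differs.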

\ How much overhead do we have because of the function calls to
\ +/-? Let's find out by inlining that code as well:

BASE @ HEX

: t3 \ this brings it down to 1.0us between toggles, so function
\ calls cost 500ns in this Forth version
00010000 20200028 !
00010000 2020001C !
00010000 20200028 !
00010000 2020001C !
00010000 20200028 !
00010000 2020001C !

00010000 20200028 !
00010000 2020001C !
00010000 20200028 !
00010000 2020001C !
00010000 20200028 !
00010000 2020001C !
00010000 20200028 !
00010000 2020001C !

00010000 20200028 !
;

BASE !
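
The three measurements can be broken down into per-operation costs. The following is just worked arithmetic on the figures above, with values in nanoseconds. The word names are hypothetical, and the 700 MHz core clock used for the cycle estimate is an assumption, not something measured in this thread.

```forth
\ Worked arithmetic on the measured figures (all values in ns).
\ Word names are hypothetical; 700 MHz is an assumed ARM core clock.
2500 CONSTANT T1-NS   \ t1: two CONSTANTs plus a call per toggle
1500 CONSTANT T2-NS   \ t2: two literals plus a call per toggle
1000 CONSTANT T3-NS   \ t3: two literals, no call

: CONSTANT-COST ( -- ns ) T1-NS T2-NS - 2 / ;   \ extra cost per CONSTANT
: CALL-COST     ( -- ns ) T2-NS T3-NS - ;       \ cost per call to +/-
: NS>CYCLES     ( ns -- cycles ) 700 * 1000 / ; \ at the assumed 700 MHz

CONSTANT-COST .     \ 500
CALL-COST .         \ 500
T3-NS NS>CYCLES .   \ 700
```

So each CONSTANT costs about as much as a whole call to a colon word, and even the fully inlined toggle takes on the order of 700 cycles under the assumed clock.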

You are always welcome to create a fork and explore a different direction in design-space. I intend to stick with the threaded model.

I'm not sure FORTH is the right platform for your application. However, if you still want to use FORTH, I would recommend creating a few low-level words implemented directly in assembly-language.

As I mentioned in the blinker tutorial, it is common to build up a DSL of FORTH words, and use them to construct your application. When you find that some words are performance-critical, you can always re-write those words in assembly. I did just that with many of the original JonesFORTH definitions.

Also, consider the XMODEM file upload. It would certainly be possible to write the whole thing in FORTH, and it would probably have adequate performance. But I found it easier to write the protocol in C instead. YMMV.

Would someone be willing to vector "emit" to use the Pi's screen and "key" to use the Pi's USB at boot-up?