fuzzy spec :)
parent ed8e20f0db
commit cbc7bff7f7

@ -0,0 +1,85 @@
# i swear this is what fuzzy actually does

## the stack

fuzzy works on a 16-bit cell-width, zero-page data stack indexed with the x register, as documented in Garth Wilson's [stack treatise](https://wilsonminesco.com/stacks/virtualstacks.html)

to push a byte onto the data stack, we just:

```asm
dex             ; decrement the stack pointer
lda some_value  ; load the byte we want on the stack into a
sta 0, x        ; put the byte on the stack!
```

and to pop a byte off it:

```asm
lda 0, x        ; pop the top of stack off into a
inx             ; increment the stack pointer
```
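the snippets above move one byte at a time, while a full cell is 16 bits, so a cell push or pop moves two bytes. here's a rough python model of that, not the real implementation. it assumes the low byte of a cell sits at `0,x` with the high byte at `1,x`, and that x starts at `$00` and wraps around; this spec doesn't confirm either. the names `ZeroPageStack`, `push_cell` and `pop_cell` are made up for the sketch.

```python
class ZeroPageStack:
    """Toy model of a zero-page data stack indexed by the 6502 X register."""

    def __init__(self, x=0x00):
        self.zp = bytearray(256)  # the 256 bytes of zero page
        self.x = x                # stack pointer; the stack grows downward

    def push_byte(self, value):
        # dex / lda value / sta 0,x
        self.x = (self.x - 1) & 0xFF
        self.zp[self.x] = value & 0xFF

    def pop_byte(self):
        # lda 0,x / inx
        value = self.zp[self.x]
        self.x = (self.x + 1) & 0xFF
        return value

    def push_cell(self, value):
        # Assumed layout: push the high byte first so the low byte lands at 0,x.
        self.push_byte(value >> 8)
        self.push_byte(value)

    def pop_cell(self):
        low = self.pop_byte()
        high = self.pop_byte()
        return (high << 8) | low


stack = ZeroPageStack()
stack.push_cell(0xABCD)
stack.push_cell(0x0041)
top = stack.pop_cell()    # the cell pushed last
below = stack.pop_cell()  # the cell pushed first
```

after both pops, x is back where it started, which is the invariant every balanced word should keep.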
## types

these are used in word definitions, and refer to the type of an individual stack cell:

| type                   | desc                                                        |
| ---------------------- | ----------------------------------------------------------- |
| **bool**               | a boolean value, represented by $0000 or $ffff              |
| **nat**                | an unsigned 16-bit integer                                  |
| **int**                | a signed 16-bit integer                                     |
| **char**               | an 8-bit george-ascii character, padded with leading zeroes |
| **string**             | a 16-bit pointer to a string in memory                      |
| **word** _`dangerous`_ | a 16-bit pointer to a fuzzy word or quotation               |
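to make the cell representations concrete, here's a small python sketch of how values of a few types could map onto 16-bit cells. it assumes `int` is two's complement (the usual choice on a 6502, but not stated here), and the helper names are hypothetical.

```python
TRUE, FALSE = 0xFFFF, 0x0000  # the two bool cell values from the table


def encode_bool(flag):
    return TRUE if flag else FALSE


def encode_int(n):
    # Assumed: signed values are stored as 16-bit two's complement.
    if not -0x8000 <= n <= 0x7FFF:
        raise ValueError("int does not fit in a 16-bit cell")
    return n & 0xFFFF


def decode_int(cell):
    # Reinterpret an unsigned cell value as a signed int.
    return cell - 0x10000 if cell & 0x8000 else cell


def encode_char(code):
    # An 8-bit character code in the low byte; the high byte stays zero.
    if not 0 <= code <= 0xFF:
        raise ValueError("char code does not fit in 8 bits")
    return code
```

one consequence: the `int` -1 and the `bool` true share the cell value `$ffff`, so only the declared type tells them apart.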
## operators

- `!` NOT: applies NOT to tos
- `&` AND: pops 2 off the stack and pushes the AND'ed result
- `|` OR: pops 2 off the stack and pushes the OR'ed result
- `+` add: pops 2 off the stack and pushes the sum
- `-` subtract: pops 2 off the stack and pushes the difference
- `*` multiply: pops 2 off the stack and pushes the result, truncating if it's >$FFFF
- `/` divide: pops 2 off the stack and pushes the remainder and quotient
- `=` equality: pushes true/false if the top 2 stack cells do/don't match
- `>` greater than: pushes true/false if tos-1 is/isn't greater than tos
- `<` less than: pushes true/false if tos-1 is/isn't less than tos
- `#` quote _`dangerous`_: pops tos and pushes a word that produces its value
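the list above can be sketched as a tiny python interpreter over a list of cell values. it's a model under assumptions, not the real thing: cells are treated as unsigned `nat`s, `!` is a bitwise complement, `+` and `-` wrap to 16 bits like `*` does, `-` computes tos-1 minus tos, `/` leaves the quotient on top of the remainder, `<` is read as less than, and the comparisons consume both operands. `apply_op` is a made-up name, and `#` is left out because it needs word storage.

```python
MASK = 0xFFFF                 # cells are 16 bits wide
TRUE, FALSE = 0xFFFF, 0x0000  # the two bool values


def apply_op(op, stack):
    """Apply one fuzzy operator to `stack`, a list of cell values with tos last."""
    if op == "!":
        stack.append(~stack.pop() & MASK)  # assumed: bitwise complement
        return
    if op == "#":
        raise NotImplementedError("quoting needs word storage, which this model lacks")
    tos = stack.pop()
    below = stack.pop()  # this is tos-1
    if op == "&":
        stack.append(below & tos)
    elif op == "|":
        stack.append(below | tos)
    elif op == "+":
        stack.append((below + tos) & MASK)
    elif op == "-":
        stack.append((below - tos) & MASK)
    elif op == "*":
        stack.append((below * tos) & MASK)  # truncates products above $FFFF
    elif op == "/":
        stack.append(below % tos)   # remainder first (assumed order)
        stack.append(below // tos)  # quotient ends up on top
    elif op == "=":
        stack.append(TRUE if below == tos else FALSE)
    elif op == ">":
        stack.append(TRUE if below > tos else FALSE)
    elif op == "<":
        stack.append(TRUE if below < tos else FALSE)
    else:
        raise ValueError(f"unknown operator {op!r}")


cells = [0x1234, 0x0010]
apply_op("*", cells)  # $1234 * $10 = $12340, which truncates to $2340
```

dividing by zero raises a python `ZeroDivisionError` here; what fuzzy itself should do in that case isn't specified.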
### supported types (this will need to be more clearly laid out later)

| operator | input type               | output type              | notes |
| -------- | ------------------------ | ------------------------ | ----- |
| `!`      | `bool`, `nat`, `int`     | `bool`, `nat`, `int`     |       |
| `&`      | `bool`, `nat`, `int`     | `bool`, `nat`, `int`     |       |
| `\|`     | `bool`, `nat`, `int`     | `bool`, `nat`, `int`     |       |
| `+`      | `nat` `nat`, `int` `int` | `nat`, `int`             |       |
| `-`      | `nat` `nat`, `int` `int` | `nat`, `int`             | subtracting two `nat`s |
| `*`      | `nat` `nat`, `int` `int` | `nat`, `int`             | most products will be truncated, since most 16 bit multiplications result in a >16 bit product, but in practice that shouldn't matter cause we're not doing science |
| `/`      | `nat` `nat`, `int` `int` | `nat` `nat`, `int` `int` | produces two cells, the quotient and remainder |
| `=`      | any any                  | `bool`                   | equality/order is checked based on stack cell value, not type (e.g. a `word` pointing to $abcd and a `nat` with the value $abcd are equivalent) |
| `>`      | any any                  | `bool`                   | see above |
| `<`      | any any                  | `bool`                   | see above |
| `#`      | any                      | `word`                   | _`dangerous`_ |
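one possible reading of this table as a checker, sketched in python. the table doesn't say whether `&` and `|` need both operands to share a type, so this sketch assumes they do and that the result has that same type; `result_types` is a hypothetical name.

```python
LOGIC = ("bool", "nat", "int")  # types accepted by ! & |
ARITH = ("nat", "int")          # types accepted by + - * /


def result_types(op, operands):
    """Return the output types of `op` for `operands` (tos last), or raise TypeError."""
    same_pair = len(operands) == 2 and operands[0] == operands[1]
    if op == "!" and len(operands) == 1 and operands[0] in LOGIC:
        return (operands[0],)
    if op in ("&", "|") and same_pair and operands[0] in LOGIC:
        return (operands[0],)
    if op in ("+", "-", "*") and same_pair and operands[0] in ARITH:
        return (operands[0],)
    if op == "/" and same_pair and operands[0] in ARITH:
        return (operands[0], operands[0])  # two cells come back
    if op in ("=", ">", "<") and len(operands) == 2:
        return ("bool",)                   # compares cell values, any types
    if op == "#" and len(operands) == 1:
        return ("word",)
    raise TypeError(f"{op} does not accept {' '.join(operands)}")
```

for example, `result_types("=", ("word", "nat"))` gives `("bool",)`, while `result_types("+", ("char", "char"))` raises, which is the kind of usage the unchecked path exists for.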
## `danger!`

the `danger!` keyword marks a word as being _`dangerous`_. certain language features can only be used in dangerous words, such as:

- inline assembly
- quotations
  - typechecking quotations is a difficult problem & probably too complex to implement on george if we ever want to fully self-host fuzzy
- unchecked operator usage
  - applying `+` to two chars, applying `&` to two strings, etc
  - this does not mean that _dangerous_ words are untyped! the result of an operation is just asserted to have the word's declared result type
    - `danger! dangerous_word nat nat is char: +` can't be used on a `nat char` stack, and any words used after `dangerous_word` treat the top of the stack as a `char` and don't care that it was made from two `nat`s

the program body cannot use any _dangerous_ features. this makes it so that _dangerous_ behavior is contained to specific words.
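the point about _dangerous_ words still being typed can be shown with a python sketch of how a call might be checked. it only looks at the declared signature: the body of a dangerous word is trusted, its inputs and result are not. `apply_word` is a hypothetical helper, and the operand type is spelled `nat` as in the types table.

```python
def apply_word(stack_types, pops, pushes):
    """Type-check one word call against a stack of types (tos last).

    The top of the stack must match `pops` exactly; it is replaced by `pushes`.
    """
    pops, pushes = list(pops), list(pushes)
    keep = len(stack_types) - len(pops)
    if keep < 0 or list(stack_types[keep:]) != pops:
        raise TypeError(f"expected {pops} on top of {list(stack_types)}")
    return list(stack_types[:keep]) + pushes


# signature of: danger! dangerous_word nat nat is char: +
DANGEROUS_WORD = (("nat", "nat"), ("char",))

after = apply_word(["bool", "nat", "nat"], *DANGEROUS_WORD)
```

here `after` is `["bool", "char"]`: callers see a `char` on top and never learn it came from adding two `nat`s. calling the same word on a `nat char` stack raises a `TypeError`.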
## memory layout

| start  | end    | use                          |
| ------ | ------ | ---------------------------- |
| `$200` | `$300` |                              |
|        |        | core language implementation |
|        |        | core language implementation |

@ -0,0 +1,78 @@
# fuzzy syntax in a well-defined grammar so i don't lose my mind

## notation

| notation | meaning                                       |
| -------- | --------------------------------------------- |
| abc      | syntactical production                        |
| :        | maps production to children (products?)       |
| ()       | groups items                                  |
| ʕ·ᴥ·ʔ    | any 8-bit georgesci character                 |
| `abc`    | exact character(s)                            |
| \x       | an escape character                           |
| x?       | optional                                      |
| x\*      | zero or more of x                             |
| x+       | one or more of x                              |
| x+y      | y or more of x                                |
| x.y      | y repetitions of x                            |
| \|       | one or another                                |
| [-]      | any characters in range (>=1 ranges accepted) |

(adapted from the rust reference cause i like how simple they do it)
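the repetition suffixes line up with regex quantifiers, which might help when reading the less common `x+y` and `x.y` forms. a small python sketch, where `quantify` is a made-up helper and not part of fuzzy:

```python
import re


def quantify(x, suffix):
    """Wrap regex fragment `x` in the quantifier for a notation suffix.

    Accepts '?', '*', '+', '+N' (N or more) and '.N' (exactly N).
    """
    if suffix in ("?", "*", "+"):
        return f"(?:{x}){suffix}"
    if suffix.startswith("+"):
        return f"(?:{x}){{{int(suffix[1:])},}}"
    if suffix.startswith("."):
        return f"(?:{x}){{{int(suffix[1:])}}}"
    raise ValueError(f"unknown suffix {suffix!r}")


# \n+2 from the notation: two or more newlines in a row
blank_lines = re.compile(quantify(r"\n", "+2"))
```

so `\n+2` becomes the regex `(?:\n){2,}`, which matches a blank line or more but not a single line break.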
## grammar

the only semantically significant whitespace is \n+2 after a word definition.

otherwise, assume tokens are delimited by an arbitrary amount of (not \n+2) whitespace, including no whitespace, e.g. the colon in `hello is: "hello"`

also order is significant! if `value` produced `word` first, it would make reserved words like `true` and `false` parse into word references.

```syntax
george: defs? body

defs: (def \n+2)*
body: values

def: signature `:` values
signature: `danger!`? word typedef

values: (value | op)*

typedef: pop? `is` push? effects?

pop: type*

push: type*

effects: effect*

type: `bool` | `nat` | `int` | `char` | `string` | `word`

effect: `paint` | `sing` | `store`

value: bool | num | char | string | word

op: `!` | `&` | `|` | `+` | `-` | `*` | `/` | `=` | `>` | `<` | `#`

quote: `[` values `]`

bool: `true` | `false`

word: [a-z A-Z]+

num: hexnum | binarynum

binarynum: binarydigit+
binarydigit: [0-9]
hexnum: (`$` hexdigit+)
hexdigit: [0-9 a-f A-F]

char: `'` ʕ·ᴥ·ʔ `'`

string: `"` ʕ·ᴥ·ʔ* `"`
```
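to show why the ordering matters, here's a python sketch of a tokenizer for the `value` and `op` productions only (no definitions, so `:` and `is` are rejected). it tries `bool` before `word`, follows the digit ranges exactly as written in the grammar, and assumes a string ends at the next `"` since the grammar defines no escapes. `TOKEN_SPECS` and `tokenize` are names invented for this sketch.

```python
import re

# Alternatives are tried in list order, mirroring the order of `value`.
TOKEN_SPECS = [
    ("bool", r"(?:true|false)(?![a-zA-Z])"),  # must come before `word`
    ("num", r"\$[0-9a-fA-F]+|[0-9]+"),        # hexnum, or a plain digit run
    ("char", r"'[\s\S]'"),                    # exactly one character in quotes
    ("string", r'"[^"]*"'),                   # assumed: no embedded quotes
    ("op", r"[!&|+\-*/=><#]"),
    ("word", r"[a-zA-Z]+"),
]
TOKEN = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPECS))


def tokenize(source):
    """Split a program body into (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize('true truest $ff 12+3 "hi"')
```

`true` comes out as a `bool` while `truest` is a `word`, and `12+3` splits into three tokens without any whitespace between them.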
## notes

fuzzy assumes the source text to be encoded in [georgesci](#), which is nearly ascii-compatible and should only cause minor headaches <3