fuzzy spec :)
		
parent ed8e20f0db
commit cbc7bff7f7

@ -0,0 +1,85 @@
			
# i swear this is what fuzzy actually does

## the stack

fuzzy works on a 16-bit cell-width, zero-page data stack indexed with the x register, as documented in Garth Wilson's [stack treatise](https://wilsonminesco.com/stacks/virtualstacks.html)

to push a byte onto the data stack, we just:

```asm
   dex            ; decrement the stack pointer
   lda some_value ; load the byte we want on the stack into a
   sta 0, x       ; put the byte on the stack!
```

and to pop a byte off it:

```asm
   lda 0, x       ; pop the top of stack off into a
   inx            ; increment the stack pointer
```
			
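the snippets above move one byte at a time, but a cell is 16 bits, so a full push or pop touches two bytes. here's a rough python model of that, not fuzzy's real implementation. it assumes the low byte sits at `0, x` and the high byte at `1, x`, and that the stack grows down from the top of zero page; `DataStack`, `push_cell` and `pop_cell` are made-up names:

```python
class DataStack:
    """toy model of a zero-page data stack indexed by the x register."""

    def __init__(self):
        self.zp = bytearray(256)  # the 256 bytes of zero page
        self.x = 0                # stack pointer; wraps to $fe on the first push

    def push_cell(self, value):
        value &= 0xFFFF
        self.x = (self.x - 2) & 0xFF                # dex / dex
        self.zp[self.x] = value & 0xFF              # sta 0, x (low byte, assumed)
        self.zp[(self.x + 1) & 0xFF] = value >> 8   # sta 1, x (high byte, assumed)

    def pop_cell(self):
        lo = self.zp[self.x]                        # lda 0, x
        hi = self.zp[(self.x + 1) & 0xFF]           # lda 1, x
        self.x = (self.x + 2) & 0xFF                # inx / inx
        return lo | (hi << 8)


stack = DataStack()
stack.push_cell(0x1234)
stack.push_cell(0xABCD)
print(hex(stack.pop_cell()))  # 0xabcd
print(hex(stack.pop_cell()))  # 0x1234
```

the `& 0xFF` on every index mirrors how zero-page indexed addressing wraps around inside zero page on the 6502.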
			
## types

these are used in word definitions, and refer to the type of an individual stack cell:

| type                   | desc                                                        |
| ---------------------- | ----------------------------------------------------------- |
| **bool**               | a boolean value, represented by $0000 or $ffff              |
| **nat**                | an unsigned 16-bit integer                                  |
| **int**                | a signed 16-bit integer                                     |
| **char**               | an 8-bit george-ascii character, padded with leading zeroes |
| **string**             | a 16-bit pointer to a string in memory                      |
| **word** _`dangerous`_ | a 16-bit pointer to a fuzzy word or quotation               |
			
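to make the table concrete, here's a small python sketch of how each value type could be packed into a cell. two's complement for `int` is an assumption (the table only says "signed"), chars are taken as raw 8-bit codes so nothing about the character mapping is assumed, and `encode` / `decode_int` are hypothetical helper names. `string` and `word` are just pointers, so they're left out:

```python
def encode(type_name, value):
    """pack a python value into one 16-bit cell, following the type table."""
    if type_name == "bool":
        return 0xFFFF if value else 0x0000
    if type_name == "nat":
        if not 0 <= value <= 0xFFFF:
            raise ValueError("nat out of range")
        return value
    if type_name == "int":
        if not -0x8000 <= value <= 0x7FFF:
            raise ValueError("int out of range")
        return value & 0xFFFF  # two's complement (assumed)
    if type_name == "char":
        if not 0 <= value <= 0xFF:
            raise ValueError("char code out of range")
        return value  # high byte stays zero
    raise ValueError(f"no cell encoding sketched for {type_name!r}")


def decode_int(cell):
    """read a cell back as a signed 16-bit integer."""
    return cell - 0x10000 if cell & 0x8000 else cell


print(hex(encode("bool", True)))  # 0xffff
print(hex(encode("int", -1)))     # 0xffff
print(decode_int(0xFFFF))         # -1
```

note that `true` and `-1` end up as the same cell, which is why the type lives in the word definition and not in the cell itself.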
			
## operators

- `!` NOT: applies NOT to tos
- `&` AND: pops 2 off the stack and pushes the AND'ed result
- `|` OR: pops 2 off the stack and pushes the OR'ed result
- `+` add: pops 2 off the stack and pushes the sum
- `-` subtract: pops 2 off the stack and pushes the difference
- `*` multiply: pops 2 off the stack and pushes the result, truncating if it's >$FFFF
- `/` divide: pops 2 off the stack and pushes the remainder and quotient
- `=` equality: pushes true/false if the top 2 stack cells do/don't match
- `>` greater than: pushes true/false if tos-1 is/isn't greater than tos
- `<` less than: pushes true/false if tos-1 is/isn't less than tos
- `#` quote _`dangerous`_: pops tos and pushes a word that produces its value
			
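a python model of the list above, treating cells as unsigned 16-bit values on a plain list (tos is the end of the list). a few things are assumptions because the list doesn't pin them down: binary ops compute `tos-1 op tos`, the comparison ops pop both operands, `!` is a bitwise NOT, and `/` leaves the quotient on top of the remainder. `#` is skipped, and `apply_op` is a made-up name:

```python
MASK = 0xFFFF
TRUE, FALSE = 0xFFFF, 0x0000
KNOWN_OPS = set("!&|+-*/=><")


def apply_op(op, stack):
    """apply one operator to a list of unsigned 16-bit cells (tos = last item)."""
    if op not in KNOWN_OPS:
        raise ValueError(f"unknown operator: {op!r}")
    if op == "!":
        stack.append(~stack.pop() & MASK)  # bitwise NOT (assumed)
        return stack
    b = stack.pop()  # tos
    a = stack.pop()  # tos-1
    if op == "&":
        stack.append(a & b)
    elif op == "|":
        stack.append(a | b)
    elif op == "+":
        stack.append((a + b) & MASK)
    elif op == "-":
        stack.append((a - b) & MASK)
    elif op == "*":
        stack.append((a * b) & MASK)  # anything above $ffff is cut off
    elif op == "/":
        stack.append(a % b)   # remainder first...
        stack.append(a // b)  # ...quotient on top (assumed order)
    elif op == "=":
        stack.append(TRUE if a == b else FALSE)
    elif op == ">":
        stack.append(TRUE if a > b else FALSE)
    else:  # "<"
        stack.append(TRUE if a < b else FALSE)
    return stack


print(apply_op("/", [7, 2]))            # [1, 3]
print(apply_op("*", [0x0100, 0x0100]))  # [0]
```

division by zero isn't covered by the list, so this model just lets python raise `ZeroDivisionError`.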
			
### supported types (this will need to be more clearly laid out later)

| operator | input type               | output type              | notes |
| -------- | ------------------------ | ------------------------ | ----- |
| `!`      | `bool`, `nat`, `int`     | `bool`, `nat`, `int`     |       |
| `&`      | `bool`, `nat`, `int`     | `bool`, `nat`, `int`     |       |
| `\|`     | `bool`, `nat`, `int`     | `bool`, `nat`, `int`     |       |
| `+`      | `nat` `nat`, `int` `int` | `nat`, `int`             |       |
| `-`      | `nat` `nat`, `int` `int` | `nat`, `int`             | subtracting two `nat`s |
| `*`      | `nat` `nat`, `int` `int` | `nat`, `int`             | most products will be truncated, since most 16-bit multiplications result in a >16-bit product, but in practice that shouldn't matter cause we're not doing science |
| `/`      | `nat` `nat`, `int` `int` | `nat` `nat`, `int` `int` | produces two cells, the quotient and remainder |
| `=`      | any any                  | `bool`                   | equality/order is checked based on stack cell value, not type (e.g. a `word` pointing to $abcd and a `nat` with the value $abcd are equivalent) |
| `>`      | any any                  | `bool`                   | see above |
| `<`      | any any                  | `bool`                   | see above |
| `#`      | any                      | `word`                   | _`dangerous`_ |
			
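the checked rows of this table could be turned into a lookup roughly like this. it's a sketch, not fuzzy's typechecker: it assumes both operands of a binary operator must have the same type (the table doesn't say whether something like `nat & bool` is allowed), it skips `#` since that one is only usable in dangerous words, and `result_types` is a hypothetical name:

```python
LOGIC = {"bool", "nat", "int"}
NUMERIC = {"nat", "int"}


def result_types(op, inputs):
    """return the output cell types for `op` on `inputs` (tos last), or raise TypeError."""
    inputs = list(inputs)
    if op == "!":
        if len(inputs) == 1 and inputs[0] in LOGIC:
            return [inputs[0]]
    elif len(inputs) == 2:
        a, b = inputs
        if op in ("&", "|") and a == b and a in LOGIC:
            return [a]
        if op in ("+", "-", "*") and a == b and a in NUMERIC:
            return [a]
        if op == "/" and a == b and a in NUMERIC:
            return [a, a]  # two cells: quotient and remainder
        if op in ("=", ">", "<"):
            return ["bool"]  # any any: compared by raw cell value
    raise TypeError(f"{op} is not allowed on {inputs} in checked code")


print(result_types("+", ["nat", "nat"]))   # ['nat']
print(result_types("=", ["word", "nat"]))  # ['bool']
```

something like `result_types("+", ["char", "char"])` raises `TypeError` here, which is exactly the kind of usage that needs a dangerous word.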
			
## `danger!`

the `danger!` keyword marks a word as being _`dangerous`_. certain language features can only be used in dangerous words, such as:

- inline assembly
- quotations
  - typechecking quotations is a difficult problem & probably too complex to implement on george if we ever want to fully self-host fuzzy
- unchecked operator usage
  - applying `+` to two chars, applying `&` to two strings, etc
  - this does not mean that _dangerous_ words are untyped! it just means the result of an operation is asserted to have the word's declared result type
    - `danger! dangerous_word nat nat is char: +` can't be used on a `nat char` stack, and any words used after `dangerous_word` treat the top of the stack as a `char` and don't care that it was made from two `nat`s

the program body cannot use any _dangerous_ features. this keeps _dangerous_ behavior contained to specific words.
			
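here's one way to picture that in python: a word's signature is checked against the types on the stack going in, and whatever it declares after `is` is simply trusted coming out. this assumes the pop list is written bottom-to-top (rightmost type is tos), and `apply_word` is a made-up helper, not part of fuzzy:

```python
def apply_word(stack_types, pops, pushes):
    """apply a word's declared stack effect to a list of cell types (tos last)."""
    pops, pushes = list(pops), list(pushes)
    split = len(stack_types) - len(pops)
    if split < 0 or stack_types[split:] != pops:
        raise TypeError(f"expected {pops} on top of the stack, found {stack_types}")
    # the body is not inspected here: the declared result types are taken on trust
    return stack_types[:split] + pushes


# a dangerous word that pops two cells of one type and declares a char result
print(apply_word(["bool", "nat", "nat"], ["nat", "nat"], ["char"]))  # ['bool', 'char']
```

calling it with `["nat", "char"]` on the stack raises `TypeError`, so the caller's side of a dangerous word is still fully checked.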
			
## memory layout

| start  | end    | use                          |
| ------ | ------ | ---------------------------- |
| `$200` | `$300` |                              |
|        |        | core language implementation |
|        |        | core language implementation |
				
			
@ -0,0 +1,78 @@

# fuzzy syntax in a well-defined grammar so i don't lose my mind

## notation

| notation | meaning                                       |
| -------- | --------------------------------------------- |
| abc      | syntactical production                        |
| :        | maps production to children (products?)       |
| ()       | groups items                                  |
| ʕ·ᴥ·ʔ    | any 8-bit georgesci character                 |
| `abc`    | exact character(s)                            |
| \x       | an escape character                           |
| x?       | optional                                      |
| x\*      | zero or more of x                             |
| x+       | one or more of x                              |
| x+y      | y or more of x                                |
| x.y      | y repetitions of x                            |
| \|       | one or another                                |
| [-]      | any characters in range (>=1 ranges accepted) |

(adapted from the rust reference cause i like how simple they do it)

## grammar

the only semantically significant whitespace is \n+2 after a word definition.

otherwise, assume tokens are delimited by an arbitrary amount of (not \n+2) whitespace, including no whitespace, e.g. the colon in `hello is: "hello"`

also order is significant! if `value` produced `word` first, it would make reserved words like `true` and `false` parse into word references.
			
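the ordering point can be shown with a tiny ordered-choice matcher in python. it tries the alternatives of `value` in the listed order and takes the first one that matches the whole token. the regexes only approximate the grammar's productions (strings are assumed not to contain `"`, `num` follows the digit ranges as written, and tokens are assumed to be split already), and `classify` is a hypothetical name:

```python
import re

# same order as the grammar: value: bool | num | char | string | word
VALUE_RULES = [
    ("bool", re.compile(r"true|false")),
    ("num", re.compile(r"\$[0-9a-fA-F]+|[0-9]+")),
    ("char", re.compile(r"'.'", re.DOTALL)),
    ("string", re.compile(r'"[^"]*"', re.DOTALL)),
    ("word", re.compile(r"[a-zA-Z]+")),
]


def classify(token, rules=VALUE_RULES):
    """return the name of the first rule that matches the whole token, or None."""
    for name, pattern in rules:
        if pattern.fullmatch(token):
            return name
    return None


# what happens if `word` is tried first instead of last
WORD_FIRST = [VALUE_RULES[-1]] + VALUE_RULES[:-1]

print(classify("true"))              # bool
print(classify("true", WORD_FIRST))  # word
```

with `word` first, `true` is swallowed as a word reference before `bool` ever gets a chance, which is the failure the ordering rule prevents.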
			
```syntax
george: defs? body

defs: (def \n+2)*
body: values

def: signature `:` values
signature: `danger!`? word typedef

values: (value | op)*

typedef: pop? `is` push? effects?

pop: type*

push: type*

effects: effect*

type: `bool` | `nat` | `int` | `char` | `string` | `word`

effect: `paint` | `sing` | `store`

value: bool | num | char | string | word

op: `!` | `&` | `|` | `+` | `-` | `*` | `/` | `=` | `>` | `<` | `#`

quote: `[` values `]`

bool: `true` | `false`

word: [a-z A-Z]+

num: hexnum | binarynum

binarynum: binarydigit+
binarydigit: [0-9]
hexnum: (`$` hexdigit+)
hexdigit: [0-9 a-f A-F]

char: `'` ʕ·ᴥ·ʔ `'`

string: `"` ʕ·ᴥ·ʔ* `"`
```
			
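the `defs: (def \n+2)*` rule means definitions can be peeled off the front of a file by splitting on runs of two or more newlines. a rough python sketch, assuming a chunk counts as a definition when it starts with something shaped like `signature :`. `DEF_HEAD` and `split_program` are made-up names, and the regex is only an approximation of `signature`:

```python
import re

# rough stand-in for "signature `:`": optional danger!, a word name,
# lowercase type names around `is`, then the colon
DEF_HEAD = re.compile(r"\s*(?:danger!\s*)?[a-zA-Z]+\s+(?:[a-z]+\s+)*is\b[a-z\s]*:")


def split_program(source):
    """split source text into (list of definition chunks, body text)."""
    chunks = re.split(r"\n{2,}", source)
    defs = []
    # definitions come first; each one is terminated by two or more newlines
    while chunks and DEF_HEAD.match(chunks[0]):
        defs.append(chunks.pop(0).strip())
    # blank lines inside the body are not significant, so glue the rest back together
    body = "\n\n".join(chunks).strip()
    return defs, body


source = 'hello is string: "hello"\n\ndanger! add nat nat is nat: +\n\nhello\n\n$01 $02 add\n'
defs, body = split_program(source)
print(len(defs))  # 2
```

the body here keeps its internal blank line, since the only blank lines that matter are the ones that end a definition.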
			
## notes

fuzzy assumes the source text to be encoded in [georgesci](#), which is nearly ascii-compatible and should only cause minor headaches <3