fuzzy/syntax.md

2024-10-06 21:54:35 -04:00
# fuzzy syntax in a well-defined grammar so i don't lose my mind
## notation
| notation | meaning |
| -------- | --------------------------------------------- |
| abc | syntactical production |
| : | maps production to children (products?) |
| () | groups items |
| ʕ·ᴥ·ʔ | any 8-bit georgesci character |
| `abc` | exact character(s) |
| \x | an escape character |
| x? | optional |
| x\* | zero or more of x |
| x+ | one or more of x |
| x+y | y or more of x |
| x.y | y repetitions of x |
| \| | one or another |
| [-] | any characters in range (>=1 ranges accepted) |

(adapted from the rust reference cause i like how simple they do it)
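
a rough sketch of how the repetition notations could map onto python regex quantifiers, mostly to pin down that `x+y` means "y or more" and `x.y` means "exactly y". the `repeat` helper and its arguments are made up for illustration; nothing here is part of fuzzy itself.

```python
import re
from typing import Optional


def repeat(x: str, kind: str, y: Optional[int] = None) -> str:
    """build a regex fragment for one repetition notation from the table.

    x is itself a regex fragment; kind is one of "?", "*", "+", ".".
    (hypothetical helper, not part of fuzzy)
    """
    atom = f"(?:{x})"
    if kind == "?":                # x?   optional
        return atom + "?"
    if kind == "*":                # x*   zero or more of x
        return atom + "*"
    if kind == "+" and y is None:  # x+   one or more of x
        return atom + "+"
    if kind == "+":                # x+y  y or more of x
        return atom + f"{{{y},}}"
    if kind == "." and y is not None:  # x.y  exactly y repetitions of x
        return atom + f"{{{y}}}"
    raise ValueError(f"unknown notation: {kind!r}")


# `\n+2` from the grammar section: two or more newlines in a row
blank_gap = re.compile(repeat(r"\n", "+", 2))
```

so `blank_gap.fullmatch("\n\n\n")` matches while `blank_gap.fullmatch("\n")` does not.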
## grammar
the only semantically significant whitespace is `\n+2` (two or more consecutive newlines) after a word definition.

otherwise, assume tokens are delimited by an arbitrary amount of whitespace that is not `\n+2`, including no whitespace at all, e.g. the colon in `hello is: "hello"`

also, order is significant! alternatives are tried left to right, so if `value` produced `word` first, reserved words like `true` and `false` would parse as word references.
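
to make the whitespace rule concrete, here is a minimal python sketch that cuts a source text at every `\n+2` gap. `split_source` and `DEF_GAP` are hypothetical names, and the sketch assumes no string literal contains a blank line, which a real parser would have to deal with.

```python
import re

# two or more newlines in a row: the `\n+2` separator described above
DEF_GAP = re.compile(r"\n{2,}")


def split_source(source: str) -> tuple[list[str], str]:
    """split source text into (definitions, body).

    every definition is followed by a `\\n+2` gap, so whatever comes
    after the last gap is the body (possibly empty).
    assumption: no gap hides inside a string literal.
    """
    *defs, body = DEF_GAP.split(source)
    return defs, body
```

for example, `split_source('hello is: "hello"\n\nhello hello')` returns `(['hello is: "hello"'], 'hello hello')`, and a source with no gap at all is treated as body only.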
```syntax
george: defs? body
defs: (def \n+2)*
body: values
def: signature `:` values
signature: `danger!`? word typedef
values: (value | op)*
typedef: pop? `is` push? effects?
pop: type*
push: type*
effects: effect*
type: `bool` | `nat` | `int` | `char` | `string` | `word`
effect: `paint` | `sing` | `store`
value: bool | num | char | string | quote | word
op: `!` | `&` | `|` | `+` | `-` | `*` | `/` | `=` | `>` | `<` | `#`
quote: `[` values `]`
bool: `true` | `false`
word: [a-z A-Z]+
num: hexnum | decimalnum
decimalnum: decimaldigit+
decimaldigit: [0-9]
hexnum: (`$` hexdigit+)
hexdigit: [0-9 a-f A-F]
char: `'` ʕ·ᴥ·ʔ `'`
string: `"` ʕ·ᴥ·ʔ* `"`
```
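
a small python sketch of why the order of alternatives in `value` matters: each rule is tried in turn and the first one that matches the whole token wins. the regexes are my approximation of the terminal productions above (python characters stand in for georgesci ones), and `VALUE_RULES` and `classify` are hypothetical names, not fuzzy's actual implementation.

```python
import re
from typing import Optional

# alternatives in the same order the `value` production lists them
VALUE_RULES = [
    ("bool", re.compile(r"true|false")),
    ("num", re.compile(r"\$[0-9a-fA-F]+|[0-9]+")),
    ("char", re.compile(r"'.'", re.DOTALL)),
    ("string", re.compile(r'".*"', re.DOTALL)),
    ("word", re.compile(r"[a-zA-Z]+")),
]


def classify(token: str, rules=VALUE_RULES) -> Optional[str]:
    """return the name of the first rule matching the whole token, else None."""
    for name, pattern in rules:
        if pattern.fullmatch(token):
            return name
    return None


# same rules, but with `word` moved to the front
WORD_FIRST = [VALUE_RULES[-1]] + VALUE_RULES[:-1]
```

with the rules in grammar order, `classify("true")` gives `"bool"`; with `WORD_FIRST`, the same token comes back as `"word"`, which is exactly the mix-up the ordering avoids.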
## notes
fuzzy assumes the source text to be encoded in [georgesci](#), which is nearly ascii-compatible and should only cause minor headaches <3