add a note about history
parent 5a88a00a0d
commit f5f206765c

@@ -104,6 +104,8 @@ below (note the invisible space on the lines with dots):
.\n
```

### Custom tokenizers

Users may supply any tokenizer via the `-F` option. The script below produces
line-sized tokens for demonstration (in turn, `werge` will do the usual line
merges), and can be used e.g. via `-F ./tokenize.py`:

@@ -119,6 +121,13 @@ for l in sys.stdin.readlines():
print('/'+l.replace('\\','\\\\'))
```

### History

I previously attempted to solve this in the `adiff` software, which failed
because the approach was too complex. Before that, the issue was tackled by
Arek Antoniewicz at MFF CUNI, who used regex-edged DFAs (REDFAs) to construct
user-specifiable tokenizers in a pretty cool way.

## Installation

```sh