add a note about history
parent 5a88a00a0d
commit f5f206765c
@@ -104,6 +104,8 @@ below (note the invisible space on the lines with dots):
 .\n
 ```
 
+### Custom tokenizers
+
 Users may supply any tokenizer via option `-F`. The script below produces
 line-size tokens for demonstration (in turn, `werge` will do the usual line
 merges), and can be used e.g. via `-F ./tokenize.py`:
@@ -119,6 +121,13 @@ for l in sys.stdin.readlines():
         print('/'+l.replace('\\','\\\\'))
 ```
 
+### History
+
+I previously made an attempt to solve this in `adiff` software, which failed
+because the approach was too complex. Before that, the issue was tackled by
+Arek Antoniewicz on MFF CUNI, who used regex-edged DFAs (REDFAs) to construct
+user-specifiable tokenizers in a pretty cool way.
+
 ## Installation
 
 ```sh