document, change non-space token mark
parent 6a2b2e3148
commit 44518ce946
README.md (37 lines changed)
@@ -1,20 +1,26 @@
 
 # werge (merge weird stuff)
 
-This is a partial work-alike of `diff3` and `git merge` and other merge-y tools
-that is capable of
+This is a partial work-alike of `diff3`, `patch`, `git merge` and other merge-y
+tools that is capable of:
 
-- merging token-size changes instead of line-size ones
-- largely ignoring changes in blank characters
+- merging token-size changes (words, identifiers, sentences) instead of
+  line-size ones
+- merging changes in blank characters separately or ignoring them altogether
 
 These properties are great for several use-cases:
 
-- merging free-flowing text changes (such as in TeX) irrespective of line breaks
-  etc,
-- merging of change sets that use different code formatters
+- combining changes in free-flowing text (such as in TeX or Markdown),
+  irrespectively of changed line breaks, paragraph breaking and justification,
+  etc.
+- merging of code formatted with different code formatters
 - minimizing the conflict size of tiny changes to a few characters, making them
   easier to resolve
 
+Separate `diff`&`patch` functionality is provided too for sending
+token-granularity patches. (The patches are similar to what `git diff
+--word-diff` produces, but can be applied to files.)
+
 ## Demo
 
 Original (`old` file):
@@ -85,21 +91,22 @@ type. This choice trades off some merge quality for (a lot of) complexity.
 
 Tokenizers are simple, implementable as linear scanners that print separate
 tokens on individual lines that are prefixed with a space mark (`.` for space
-and `|` for non-space), and also escape newlines and backslashes. A default
+and `/` for non-space), and also escape newlines and backslashes. A default
 tokenization of string "hello \ world" with a new line at the end is listed
 below (note the invisible space on the lines with dots):
 
 ```
-|hello
+/hello
 . 
-|\\
+/\\
 . 
-|world
+/world
 .\n
 ```
 
-Users may supply any tokenizer via option `-F`, e.g. this script makes
-line-size tokens (reproducing the usual line merges):
+Users may supply any tokenizer via option `-F`. The script below produces
+line-size tokens for demonstration (in turn, `werge` will do the usual line
+merges), and can be used e.g. via `-F ./tokenize.py`:
 
 ```py
 #!/usr/bin/env python3
@@ -107,9 +114,9 @@ import sys
 for l in sys.stdin.readlines():
     if len(l)==0: continue
     if l[-1]=='\n':
-        print('|'+l[:-1].replace('\\','\\\\')+'\\n')
+        print('/'+l[:-1].replace('\\','\\\\')+'\\n')
     else:
-        print('|'+l.replace('\\','\\\\'))
+        print('/'+l.replace('\\','\\\\'))
 ```
 
 ## Installation
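
To make the token format touched by this commit concrete, here is a minimal sketch of a word-level tokenizer that emits it with the new `/` mark, plus the inverse step. This is not werge's built-in tokenizer: the names `escape`, `unescape`, `tokenize` and `detokenize` are hypothetical, and grouping consecutive whitespace into one token is an assumption. Only the two marks and the two escapes (backslash and newline) are taken from the README text in the diff.

```python
import re

SPACE_MARK = "."     # prefix for whitespace tokens
NONSPACE_MARK = "/"  # prefix for non-space tokens ("|" before this commit)


def escape(token):
    # Double backslashes first, so the backslash introduced by the
    # newline escape is not doubled again.
    return token.replace("\\", "\\\\").replace("\n", "\\n")


def unescape(body):
    # Inverse of escape(): a backslash followed by "n" is a newline,
    # a backslash followed by a backslash is one literal backslash.
    out = []
    chars = iter(body)
    for c in chars:
        if c == "\\":
            nxt = next(chars, "")
            out.append("\n" if nxt == "n" else nxt)
        else:
            out.append(c)
    return "".join(out)


def tokenize(text):
    # Assumption: a token is a maximal run of whitespace or of
    # non-whitespace characters. Returns one output line per token.
    lines = []
    for match in re.finditer(r"\s+|\S+", text):
        token = match.group(0)
        mark = SPACE_MARK if token.isspace() else NONSPACE_MARK
        lines.append(mark + escape(token))
    return lines


def detokenize(lines):
    # Drop the one-character mark and undo the escaping.
    return "".join(unescape(line[1:]) for line in lines)


if __name__ == "__main__":
    # The README's sample: "hello \ world" followed by a newline.
    print("\n".join(tokenize("hello \\ world\n")))
```

Run as a script, this prints the same six lines as the `/`-marked example block on the new side of the diff. A tokenizer meant for `-F` would read standard input instead of a fixed string.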