doc

author: Mirek Kratochvil <exa.exa@gmail.com> 2025-07-14 10:55:43 +0200
committer: Mirek Kratochvil <exa.exa@gmail.com> 2025-07-14 10:55:43 +0200
commit: a8b38d647581b7bb6ae515b12bda81b7aae44fee (patch)
tree: 6e4b3d8d4a339071043a195ae863ba66f89cb117
parent: 79977cdf4b9a2fcac4c47b458cccca101686da63 (diff)
download: werge-a8b38d647581b7bb6ae515b12bda81b7aae44fee.tar.gz
werge-a8b38d647581b7bb6ae515b12bda81b7aae44fee.tar.bz2
1 files changed, 37 insertions, 1 deletions
diff --git a/README.md b/README.md
index 9de6a10..40dbb95 100644
--- a/README.md
+++ b/README.md
@@ -14,7 +14,41 @@ These properties are great for several use-cases:
 - minimizing the conflict size of tiny changes to a few characters, making them
   easier to resolve
 
-Better docs is WIP
+## How does it work?
+
+- Instead of lines, the files are split into small tokens (words, spaces, symbols,
+  ...) and these are diffed and merged individually.
+- Some tokens are marked as spaces by the tokenizer, which allows the merge
+  algorithm to be (selectively) more zealous when resolving conflicts on these.
+
+Tokenizers are simple, implementable as linear scanners that print separate
+tokens on individual lines that are prefixed with a space mark (`.` for space
+and `|` for non-space), and also escape newlines and backslashes. The default
+tokenization of the string "hello \ world" with a newline at the end is listed
+below (note the invisible space on the lines with dots):
+
+```
+|hello
+. 
+|\\
+. 
+|world
+.\n
+```
+
+Users may supply any tokenizer via option `-F`, e.g. this script makes
+line-size tokens (reproducing the usual line merges):
+
+```
+#!/usr/bin/env python3
+import sys
+for l in sys.stdin.readlines():
+    if len(l)==0: continue
+    if l[-1]=='\n':
+        print('|'+l[:-1].replace('\\','\\\\')+'\\n')
+    else:
+        print('|'+l.replace('\\','\\\\'))
+```
 
 ## Installation
 
@@ -74,3 +108,5 @@ Available commands:
 
 werge is a free software, use it accordingly.
 ```
+
+## External tokenizer
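
The token format described in the README hunk above (one token per line, `.` for space tokens and `|` for non-space tokens, with backslashes and newlines escaped) can be illustrated with a minimal Python sketch. This is not werge's built-in tokenizer: the token classes used here (whitespace runs, word runs, single symbols) and the names `escape` and `tokenize` are assumptions made for illustration, chosen so that the output matches the "hello \ world" listing in the patch.

```python
#!/usr/bin/env python3
import re
import sys


def escape(tok):
    # Escape backslashes first, then newlines, so the added backslashes
    # of the newline escape are not escaped a second time.
    return tok.replace('\\', '\\\\').replace('\n', '\\n')


def tokenize(text):
    # Assumed token classes (hypothetical, not taken from werge itself):
    # runs of whitespace, runs of word characters, single other symbols.
    lines = []
    for match in re.finditer(r'\s+|\w+|[^\w\s]', text):
        tok = match.group(0)
        mark = '.' if tok.isspace() else '|'
        lines.append(mark + escape(tok))
    return lines


if __name__ == '__main__':
    # Reads the whole input and prints one marked token per line.
    for line in tokenize(sys.stdin.read()):
        print(line)
```

Saved as an executable script, a tokenizer of this shape could presumably be passed to werge through the `-F` option mentioned in the patch; the exact invocation is not shown in this commit, so check the README before relying on it.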