expand README

Mirek Kratochvil 2025-07-17 15:53:30 +02:00
parent 44bd3e8c14
commit 554429b9b1
3 changed files with 166 additions and 32 deletions


@ -156,6 +156,11 @@ isKeepTok _ = False
isDelTok (Del, _) = True
isDelTok _ = False
-- TODO: Diff output is not necessarily deterministic; we could make the chunk
-- sequences more unique by rolling them to front (or back), possibly enabling
-- more conflict resolution and preventing mismerges.
--
-- Example: " a " can be made out of " {+a +}" or "{+ a+} "
chunks :: [(Op, String)] -> [Merged]
chunks [] = []
chunks xs@((Keep, _):_) =


@ -225,7 +225,7 @@ cmdGitMerge = do
[ fmap Just . some
$ strArgument
$ metavar "UNMERGED"
<> help "Unmerged git file (can be specified repeatedly)"
<> help "Unmerged file tracked by git (can be specified repeatedly)"
, flag'
Nothing
(long "unmerged"

README.md

@ -1,19 +1,77 @@
# werge (merge weird stuff)
This is a partial work-alike of `diff3` and `git merge` and other merge-y tools that is capable of
This is a partial work-alike of `diff3` and `git merge` and other merge-y tools
that is capable of
- merging token-size changes instead of line-size ones
- largely ignoring changes in blank characters
These properties are great for several use-cases:
- merging free-flowing text changes (such as in TeX) irrespective of linebreaks
- merging free-flowing text changes (such as in TeX) irrespective of line breaks
etc,
- merging of changesets that use different code formatters
- merging of change sets that use different code formatters
- minimizing the conflict size of tiny changes to a few characters, making them
easier to resolve
## Demo
Original (`old` file):
```
Roses are red. Violets are blue.
Patch is quite hard. I cannot rhyme.
```
Local changes (`my` file):
```
Roses are red. Violets are blue.
Patching is hard. I still cannot rhyme.
```
Remote changes (`your` file):
```
Roses are red.
Violets are blue.
Patch is quite hard.
I cannot do verses.
```
Token-merged version with `werge merge my old your` (conflicts on the space
change that is too close to the newly added "still" token):
```
Roses are red.
Violets are blue.
Patching is hard.<<<<< I still||||| I=====
I>>>>> cannot do verses.
```
(NOTE: option `-G` gives nicely colored output that is much easier to read.)
Token-merged version with separate space resolution using `-s` (conflicts get
fixed separately):
```
Roses are red.
Violets are blue.
Patching is hard.
I still cannot do verses.
```
A harder-conflicting file (`theirs`):
```
Roses are red.
Violets are blue.
Merging is quite hard.
I cannot do verses.
```
`werge merge my old theirs -s` highlights the actual unmergeable change:
```
Roses are red.
Violets are blue.
<<<<<Patching|||||Patch=====Merging>>>>> is hard.
I still cannot do verses.
```
## How does it work?
- Instead of lines, the files are torn into small tokens (words, spaces, symbols,
@ -21,6 +79,10 @@ These properties are great for several use-cases:
- Some tokens are marked as spaces by the tokenizer, which allows the merge
algorithm to be (selectively) more zealous when resolving conflicts on these.
This approach differs from various other structured-merge tools by being
completely oblivious about the file structure. Werge gives up some merge
quality in exchange for (a lot of) reduced complexity.
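To illustrate why token-level comparison ignores reflowed text, here is a
minimal Python sketch. It is not how `werge` is implemented (werge relies on an
external `diff`); the token regex, the `tokens` and `non_space_changes` helpers,
and the use of Python's `difflib` are all assumptions made for illustration
only. It compares the `old` and `your` files from the demo and reports only the
changes in non-space tokens.

```python
import difflib
import re

# Assumed token classes: whitespace runs, word runs, or single symbols.
# werge's own tokenizers may split the text differently.
TOKEN_RE = re.compile(r"\s+|\w+|[^\w\s]")


def tokens(text):
    """Split text into tokens; joining them back yields the original text."""
    return TOKEN_RE.findall(text)


def non_space_changes(old, new):
    """Return (deleted, inserted) non-space tokens between two texts."""
    a, b = tokens(old), tokens(new)
    deleted, inserted = [], []
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Space-only tokens are skipped, so pure reflowing is not reported.
        deleted += [t for t in a[i1:i2] if not t.isspace()]
        inserted += [t for t in b[j1:j2] if not t.isspace()]
    return deleted, inserted


old = "Roses are red. Violets are blue.\nPatch is quite hard. I cannot rhyme.\n"
your = ("Roses are red.\nViolets are blue.\n"
        "Patch is quite hard.\nI cannot do verses.\n")

deleted, inserted = non_space_changes(old, your)
print(deleted, inserted)
```

For the demo files this prints `['rhyme'] ['do', 'verses']`: the changed line
breaks do not show up as changes at all, while a line-based diff would flag
every line.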
Tokenizers are simple, implementable as linear scanners that print separate
tokens on individual lines that are prefixed with a space mark (`.` for space
and `|` for non-space), and also escape newlines and backslashes. A default
@ -61,50 +123,117 @@ with the one from [GNU diffutils](https://www.gnu.org/software/diffutils/). You
may set up a path to such `diff` (or a wrapper script) via environment variable
`WERGE_DIFF`.
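The tokenizer output format described in "How does it work?" (one token per
line, prefixed with `.` for space tokens and `|` for non-space tokens, with
newlines and backslashes escaped) can be sketched as a small filter in Python.
This is only a sketch under assumptions: the token classes, the `tokenize`,
`escape` and `main` names, and the exact escape sequences (`\\` and `\n`) are
guesses that the text above does not specify, so check the built-in tokenizer
before relying on them.

```python
import re
import sys

# Assumed token classes: whitespace runs, word runs, or single symbols.
TOKEN_RE = re.compile(r"\s+|\w+|[^\w\s]")


def escape(token):
    """Escape backslashes first, then newlines (assumed escaping scheme)."""
    return token.replace("\\", "\\\\").replace("\n", "\\n")


def tokenize(text):
    """Return one output line per token, prefixed with the space mark."""
    lines = []
    for token in TOKEN_RE.findall(text):
        mark = "." if token.isspace() else "|"
        lines.append(mark + escape(token))
    return lines


def main():
    """Read the whole input from stdin and print the marked tokens."""
    for line in tokenize(sys.stdin.read()):
        print(line)

# To use this file as a filter script, call main() at the end of the file.
```

With `main()` called at the end, such a script could presumably be passed to
`werge` through the `-F`/`--tok-filter` option, provided its output matches what
`werge` expects.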
## Help & features
## Use with `git`
`werge` can automatically process files that are marked in `git` as merge
conflicts:
```sh
$ git merge somebranch
$ werge git -ua
```
Options `-ua` (`--unmerged --add`) make `werge` find all files that are marked
as unmerged, try to merge them token-by-token, and, if the merge succeeds with
the current settings, run `git add` on them. The current contents of the files
are replaced by the merged (or partially merged) state; backups are written
automatically to `filename.werge-backup`.
## Current `--help` and features
```
werge -- blanks-friendly mergetool for tiny interdwindled changes
Usage: werge [(-F|--tok-filter FILTER) | (-i|--simple-tokens) |
(-I|--full-tokens)] [-s|--spaces (normal|conflict|my|old|your)]
[-C|--expand-context N] [--no-zeal | (-z|--zeal)]
[--label-start STRING] [--label-mo STRING] [--label-oy STRING]
[--label-end STRING] [--conflict-overlaps] [--conflict-separate]
COMMAND
(-I|--full-tokens)] [--no-zeal | (-z|--zeal)]
[-S|--space (keep|my|old|your)]
[-s | --resolve-space (normal|keep|my|old|your)]
[--conflict-space-overlaps] [--conflict-space-separate]
[--conflict-space-all] [-C|--expand-context N]
[--resolve (keep|my|old|your)] [--conflict-overlaps]
[--conflict-separate] [--conflict-all] [-G|--color]
[--label-start "<<<<<"] [--label-mo "|||||"] [--label-oy "====="]
[--label-end ">>>>>"] COMMAND
Available options:
-F,--tok-filter FILTER external program to separate the text to tokens
-i,--simple-tokens use wider character class to separate the tokens
-F,--tok-filter FILTER External program to separate the text to tokens
-i,--simple-tokens Use wider character class to separate the tokens
(results in larger tokens and ignores case)
-I,--full-tokens separate characters by all known character classes
-I,--full-tokens Separate characters by all known character classes
(default)
-s,--spaces (normal|conflict|my|old|your)
mode of merging the space-only changes; instead of
usual resolution one may choose to always conflict or
to default the space from the source files (default:
normal)
--no-zeal avoid zealous mode (default)
-z,--zeal Try to zealously minify conflicts, potentially
resolving them
-S,--space (keep|my|old|your)
Retain spacing from a selected version, or keep all
space changes for merging (default: keep)
-s Shortcut for `--resolve-space keep' (this separates
space-only conflicts, enabling better automated
resolution)
--resolve-space (normal|keep|my|old|your)
Resolve conflicts in space-only tokens separately,
and either keep unresolved conflicts, or resolve in
favor of a given version; `normal' resolves the
spaces together with other tokens, ignoring choices
in --resolve-space-* (default: normal)
--conflict-space-overlaps
Never resolve overlapping changes in space-only
tokens
--conflict-space-separate
Never resolve separate (non-overlapping) changes in
space-only tokens
--conflict-space-all Never resolve any changes in space-only tokens
-C,--expand-context N Consider changes that are at most N tokens apart to
be a single change. Zero may cause bad resolutions of
near conflicting edits. (default: 1)
--no-zeal avoid zealous mode (default)
-z,--zeal try to zealously minify conflicts, potentially
resolving them
--label-start STRING label for beginning of the conflict
(default: "<<<<<")
--label-mo STRING separator of local edits and original
(default: "|||||")
--label-oy STRING separator of original and other people's edits
(default: "=====")
--label-end STRING label for end of the conflict (default: ">>>>>")
--conflict-overlaps do not resolve overlapping changes
--conflict-separate do not resolve separate (non-overlapping) changes
near conflicting edits (default: 1)
--resolve (keep|my|old|your)
Resolve general conflicts in favor of a given
version, or keep the conflicts (default: keep)
--conflict-overlaps Never resolve overlapping changes in general tokens
--conflict-separate Never resolve separate (non-overlapping) changes in
general tokens
--conflict-all Never resolve any changes in general tokens
-G,--color Use shorter, gaily colored output markers by default
(requires ANSI color support; good for terminals or
`less -R')
--label-start "<<<<<" Label for beginning of the conflict
--label-mo "|||||" Separator of local edits and original
--label-oy "=====" Separator of original and other people's edits
--label-end ">>>>>" Label for end of the conflict
-h,--help Show this help text
--version Show version information
Available commands:
merge diff3-style merge of two changesets
git automerge unmerged files in git conflict
git Automerge unmerged files in git conflict
werge is a free software, use it accordingly.
```
#### Manual merging
```
Usage: werge merge MYFILE OLDFILE YOURFILE
diff3-style merge of two changesets
Available options:
MYFILE Version with local edits
OLDFILE Original file version
YOURFILE Version with other people's edits
-h,--help Show this help text
```
#### Git interoperability
```
Usage: werge git (UNMERGED | (-u|--unmerged)) [(-a|--add) | --no-add]
Automerge unmerged files in git conflict
Available options:
UNMERGED Unmerged file tracked by git (can be specified
repeatedly)
-u,--unmerged Process all files marked as unmerged by git
-a,--add Run `git add' for fully merged files
--no-add Prevent running `git add'
-h,--help Show this help text
```