From 554429b9b1f39b09cfef8dc3654f806655547a2a Mon Sep 17 00:00:00 2001
From: Mirek Kratochvil
Date: Thu, 17 Jul 2025 15:53:30 +0200
Subject: [PATCH] expand README

---
 Main.hs   |   5 ++
 Opts.hs   |   2 +-
 README.md | 191 +++++++++++++++++++++++++++++++++++++++++++++---------
 3 files changed, 166 insertions(+), 32 deletions(-)

diff --git a/Main.hs b/Main.hs
index d86a6cb..c3e18c9 100644
--- a/Main.hs
+++ b/Main.hs
@@ -156,6 +156,11 @@ isKeepTok _ = False
 isDelTok (Del, _) = True
 isDelTok _ = False
 
+-- TODO: Diff output is not necessarily deterministic; we could make the chunk
+-- sequences more unique by rolling them to front (or back), possibly enabling
+-- more conflict resolution and preventing mismerges.
+--
+-- Example: " a " can be made out of " {+a +}" or "{+ a+} "
 chunks :: [(Op, String)] -> [Merged]
 chunks [] = []
 chunks xs@((Keep, _):_) =
diff --git a/Opts.hs b/Opts.hs
index a32682f..7358951 100644
--- a/Opts.hs
+++ b/Opts.hs
@@ -225,7 +225,7 @@ cmdGitMerge = do
     [ fmap Just . some $
         strArgument $
           metavar "UNMERGED"
-            <> help "Unmerged git file (can be specified repeatedly)"
+            <> help "Unmerged file tracked by git (can be specified repeatedly)"
     , flag'
         Nothing
         (long "unmerged"
diff --git a/README.md b/README.md
index 45f786e..9af114d 100644
--- a/README.md
+++ b/README.md
@@ -1,19 +1,77 @@
 
 # werge (merge weird stuff)
 
-This is a partial work-alike of `diff3` and `git merge` and other merge-y tools that is capable of
+This is a partial work-alike of `diff3` and `git merge` and other merge-y tools
+that is capable of
 
 - merging token-size changes instead of line-size ones
 - largely ignoring changes in blank characters
 
 These properties are great for several use-cases:
 
-- merging free-flowing text changes (such as in TeX) irrespective of linebreaks
+- merging free-flowing text changes (such as in TeX) irrespective of line breaks
   etc,
-- merging of changesets that use different code formatters
+- merging of change sets that use different code formatters
 - minimizing the conflict size of tiny changes to a few characters, making them
   easier to resolve
 
+## Demo
+
+Original (`old` file):
+```
+Roses are red. Violets are blue.
+Patch is quite hard. I cannot rhyme.
+```
+
+Local changes (`my` file):
+```
+Roses are red. Violets are blue.
+Patching is hard. I still cannot rhyme.
+```
+
+Remote changes (`your` file):
+```
+Roses are red.
+Violets are blue.
+Patch is quite hard.
+I cannot do verses.
+```
+
+Token-merged version with `werge merge my old your` (conflicts on the space
+change that is too close to the disappearing "still" token):
+```
+Roses are red.
+Violets are blue.
+Patching is hard.<<<<< I still||||| I=====
+I>>>>> cannot do verses.
+```
+(NOTE: option `-G` gives nicely colored output that is much easier to read.)
+
+Token-merged version with separate space resolution using `-s` (conflicts get
+fixed separately):
+```
+Roses are red.
+Violets are blue.
+Patching is hard.
+I still cannot do verses.
+```
+
+A harder-conflicting file (`theirs`):
+```
+Roses are red.
+Violets are blue.
+Merging is quite hard.
+I cannot do verses.
+```
+
+`werge merge my old theirs -s` highlights the actual unmergeable change:
+```
+Roses are red.
+Violets are blue.
+<<<<<Patching|||||Patch=====Merging>>>>> is hard.
+I still cannot do verses.
+```
+
 ## How does it work?
 
 - Instead of lines, the files are torn to small tokens (words, spaces, symbols,
@@ -21,6 +79,10 @@ These properties are great for several use-cases:
 - Some tokens are marked as spaces by the tokenizer, which allows the merge
   algorithm to be (selectively) more zealous when resolving conflicts on these.
 
+This approach differs from various other structured-merge tools by being
+completely oblivious about the file structure. Werge trades off some merge
+quality for (a lot) less complexity.
+
 Tokenizers are simple, implementable as linear scanners that print separate
 tokens on individual lines that are prefixed with a space mark (`.` for space
 and `|` for non-space), and also escape newlines and backslashes. A default
@@ -61,50 +123,117 @@ with the one from [GNU diffutils](https://www.gnu.org/software/diffutils/). You
 may set up a path to such `diff` (or a wrapper script) via environment variable
 `WERGE_DIFF`.
 
-## Help & features
+## Use with `git`
+
+`werge` can automatically process files that are marked in `git` as merge
+conflicts:
+
+```sh
+$ git merge somebranch
+$ werge git -ua
+```
+
+Options `-ua` (`--unmerged --add`) find all files that are marked as unmerged,
+try to merge them token-by-token, and, if the merge succeeds with the current
+settings, run `git add` on them. The current changes in the files are
+replaced by the merged (or partially merged) state; backups are written
+automatically to `filename.werge-backup`.
+
+## Current `--help` and features
 
 ```
 werge -- blanks-friendly mergetool for tiny interdwindled changes
 
 Usage: werge [(-F|--tok-filter FILTER) | (-i|--simple-tokens) |
-             (-I|--full-tokens)] [-s|--spaces (normal|conflict|my|old|your)]
-             [-C|--expand-context N] [--no-zeal | (-z|--zeal)]
-             [--label-start STRING] [--label-mo STRING] [--label-oy STRING]
-             [--label-end STRING] [--conflict-overlaps] [--conflict-separate]
-             COMMAND
+             (-I|--full-tokens)] [--no-zeal | (-z|--zeal)]
+             [-S|--space (keep|my|old|your)]
+             [-s | --resolve-space (normal|keep|my|old|your)]
+             [--conflict-space-overlaps] [--conflict-space-separate]
+             [--conflict-space-all] [-C|--expand-context N]
+             [--resolve (keep|my|old|your)] [--conflict-overlaps]
+             [--conflict-separate] [--conflict-all] [-G|--color]
+             [--label-start "<<<<<"] [--label-mo "|||||"] [--label-oy "====="]
+             [--label-end ">>>>>"] COMMAND
 
 Available options:
-  -F,--tok-filter FILTER   external program to separate the text to tokens
-  -i,--simple-tokens       use wider character class to separate the tokens
+  -F,--tok-filter FILTER   External program to separate the text to tokens
+  -i,--simple-tokens       Use wider character class to separate the tokens
                            (results in larger tokens and ignores case)
-  -I,--full-tokens         separate characters by all known character classes
+  -I,--full-tokens         Separate characters by all known character classes
                            (default)
-  -s,--spaces (normal|conflict|my|old|your)
-                           mode of merging the space-only changes; instead of
-                           usual resolution one may choose to always conflict or
-                           to default the space from the source files (default:
-                           normal)
+  --no-zeal                avoid zealous mode (default)
+  -z,--zeal                Try to zealously minify conflicts, potentially
+                           resolving them
+  -S,--space (keep|my|old|your)
+                           Retain spacing from a selected version, or keep all
+                           space changes for merging (default: keep)
+  -s                       Shortcut for `--resolve-space keep' (this separates
+                           space-only conflicts, enabling better automated
+                           resolution)
+  --resolve-space (normal|keep|my|old|your)
+                           Resolve conflicts in space-only tokens separately,
+                           and either keep unresolved conflicts, or resolve in
+                           favor of a given version; `normal' resolves the
+                           spaces together with other tokens, ignoring choices
+                           in --resolve-space-* (default: normal)
+  --conflict-space-overlaps
+                           Never resolve overlapping changes in space-only
+                           tokens
+  --conflict-space-separate
+                           Never resolve separate (non-overlapping) changes in
+                           space-only tokens
+  --conflict-space-all     Never resolve any changes in space-only tokens
   -C,--expand-context N    Consider changes that are at most N tokens apart to
                            be a single change. Zero may cause bad resolutions of
-                           near conflicting edits. (default: 1)
-  --no-zeal                avoid zealous mode (default)
-  -z,--zeal                try to zealously minify conflicts, potentially
-                           resolving them
-  --label-start STRING     label for beginning of the conflict
-                           (default: "<<<<<")
-  --label-mo STRING        separator of local edits and original
-                           (default: "|||||")
-  --label-oy STRING        separator of original and other people's edits
-                           (default: "=====")
-  --label-end STRING       label for end of the conflict (default: ">>>>>")
-  --conflict-overlaps      do not resolve overlapping changes
-  --conflict-separate      do not resolve separate (non-overlapping) changes
+                           near conflicting edits (default: 1)
+  --resolve (keep|my|old|your)
+                           Resolve general conflicts in favor of a given
+                           version, or keep the conflicts (default: keep)
+  --conflict-overlaps      Never resolve overlapping changes in general tokens
+  --conflict-separate      Never resolve separate (non-overlapping) changes in
+                           general tokens
+  --conflict-all           Never resolve any changes in general tokens
+  -G,--color               Use shorter, gaily colored output markers by default
+                           (requires ANSI color support; good for terminals or
+                           `less -R')
+  --label-start "<<<<<"    Label for beginning of the conflict
+  --label-mo "|||||"       Separator of local edits and original
+  --label-oy "====="       Separator of original and other people's edits
+  --label-end ">>>>>"      Label for end of the conflict
   -h,--help                Show this help text
   --version                Show version information
 
 Available commands:
   merge                    diff3-style merge of two changesets
-  git                      automerge unmerged files in git conflict
+  git                      Automerge unmerged files in git conflict
 
 werge is a free software, use it accordingly.
 ```
+
+#### Manual merging
+```
+Usage: werge merge MYFILE OLDFILE YOURFILE
+
+  diff3-style merge of two changesets
+
+Available options:
+  MYFILE                   Version with local edits
+  OLDFILE                  Original file version
+  YOURFILE                 Version with other people's edits
+  -h,--help                Show this help text
+```
+
+#### Git interoperability
+```
+Usage: werge git (UNMERGED | (-u|--unmerged)) [(-a|--add) | --no-add]
+
+  Automerge unmerged files in git conflict
+
+Available options:
+  UNMERGED                 Unmerged file tracked by git (can be specified
+                           repeatedly)
+  -u,--unmerged            Process all files marked as unmerged by git
+  -a,--add                 Run `git add' for fully merged files
+  --no-add                 Prevent running `git add'
+  -h,--help                Show this help text
+```
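
Reviewer-side sketch, not part of the patch: the tokenizer protocol that the README describes (one token per line, `.` marking space tokens and `|` marking non-space tokens, with newlines and backslashes escaped) could look roughly like the Haskell filter below. The exact escape sequences (`\n` and `\\`), the stdin/stdout interface, and the names `escape`, `tokenize` and `markToken` are assumptions made for illustration; the README text in this patch does not spell them out. The sketch also splits only on space versus non-space, whereas the built-in tokenizers are described as separating more character classes.

```haskell
import Data.Char (isSpace)
import Data.Function (on)
import Data.List (groupBy)

-- Escape backslashes and newlines so that every token fits on one output line.
-- ASSUMPTION: the concrete escape sequences are illustrative, not confirmed.
escape :: String -> String
escape = concatMap esc
  where
    esc '\\' = "\\\\"
    esc '\n' = "\\n"
    esc c = [c]

-- Split the input into maximal runs of space or non-space characters.
-- This is coarser than werge's default tokenizers (hypothetical simplification).
tokenize :: String -> [String]
tokenize = groupBy ((==) `on` isSpace)

-- Prefix a token with its mark: '.' for space tokens, '|' for everything else.
markToken :: String -> String
markToken tok@(c : _) | isSpace c = '.' : escape tok
markToken tok = '|' : escape tok

-- Linear scanner: read the text on stdin, print one marked token per line.
main :: IO ()
main = interact (unlines . map markToken . tokenize)
```

A filter of this shape would presumably be handed to `werge` through `-F`/`--tok-filter`; the actual calling convention should be checked against the project's documentation before relying on it.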