# werge (merge weird stuff)

This is a partial work-alike of `diff3`, `git merge` and other merge-y tools
that is capable of:

- merging token-size changes instead of line-size ones
- largely ignoring changes in blank characters

These properties are useful in several cases:

- merging free-flowing text changes (such as in TeX) irrespective of line
  breaks etc.,
- merging change sets that use different code formatters,
- minimizing the size of conflicts caused by tiny changes to a few characters,
  making them easier to resolve.

## Demo

Original (`old` file):
```
Roses are red. Violets are blue.
Patch is quite hard. I cannot rhyme.
```

Local changes (`my` file):
```
Roses are red. Violets are blue.
Patching is hard. I still cannot rhyme.
```

Remote changes (`your` file):
```
Roses are red.
Violets are blue.
Patch is quite hard.
I cannot do verses.
```

Token-merged version with `werge merge my old your` (the merge conflicts on the
space change that is too close to the disappearing "still" token):
```
Roses are red.
Violets are blue.
Patching is hard.<<<<< I still||||| I=====
I>>>>> cannot do verses.
```
(NOTE: option `-G` gives nicely colored output that is much easier to read.)

Token-merged version with separate space resolution using `-s` (space-only
conflicts get resolved separately):
```
Roses are red.
Violets are blue.
Patching is hard.
I still cannot do verses.
```

A file with a harder conflict (`theirs`):
```
Roses are red.
Violets are blue.
Merging is quite hard.
I cannot do verses.
```

`werge merge my old theirs -s` highlights the actual unmergeable change:
```
Roses are red.
Violets are blue.
<<<<<Patching|||||Patch=====Merging>>>>> is hard.
I still cannot do verses.
```
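
When post-processing merged output, the remaining conflicts can be located by their markers. The following is a minimal sketch, assuming the default labels shown in the demo and that the labels never occur in the merged text itself; `find_conflicts` is a hypothetical helper and not part of `werge`.

```python
import re

# Default conflict labels, as they appear in the demo output.
CONFLICT = re.compile(
    r'<<<<<(.*?)\|\|\|\|\|(.*?)=====(.*?)>>>>>',
    re.DOTALL,  # a conflict may span line breaks
)

def find_conflicts(text):
    """Return a (my, old, your) triple for each conflict found in text."""
    return [m.groups() for m in CONFLICT.finditer(text)]

merged = "<<<<<Patching|||||Patch=====Merging>>>>> is hard.\n"
for my, old, your in find_conflicts(merged):
    print(repr(my), repr(old), repr(your))
```

The pattern would need adjusting if the labels are changed with `--label-start` and friends, or if the colored markers of `-G` are in use.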

## How does it work?

- Instead of lines, the files are split into small tokens (words, spaces,
  symbols, ...) and these are diffed and merged individually.
- Some tokens are marked as spaces by the tokenizer, which allows the merge
  algorithm to be (selectively) more zealous when resolving conflicts on these.

This approach differs from various other structured-merge tools by being
completely oblivious to the file structure. Werge trades off some merge
quality for (a lot of) reduced complexity.
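
The token-wise merging idea can be illustrated with a small diff3-style sketch. This is not `werge`'s implementation: `werge` relies on an external `diff`, whereas the sketch swaps in Python's `difflib.SequenceMatcher`, ignores the space marks and the context expansion, and uses the hypothetical names `unchanged` and `merge3`.

```python
from difflib import SequenceMatcher

def unchanged(old, new):
    """Map positions of tokens in old to their positions in new,
    for the tokens that the diff considers unchanged."""
    mapping = {}
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for a, b, size in matcher.get_matching_blocks():
        for k in range(size):
            mapping[a + k] = b + k
    return mapping

def merge3(my, old, your):
    """Three-way merge of token lists; returns the merged token list."""
    in_my, in_your = unchanged(old, my), unchanged(old, your)
    out = []
    o0 = m0 = y0 = 0  # starts of the current chunk in old, my, your
    for o in range(len(old) + 1):
        if o == len(old):                  # end of input closes the last chunk
            m1, y1 = len(my), len(your)
        elif o in in_my and o in in_your:  # token kept by both sides
            m1, y1 = in_my[o], in_your[o]
        else:
            continue
        co, cm, cy = old[o0:o], my[m0:m1], your[y0:y1]
        if cm == co:                       # only "your" side changed (or nobody)
            out += cy
        elif cy == co or cm == cy:         # only "my" side changed, or both agree
            out += cm
        else:                              # both changed differently: conflict
            out += ['<<<<<', *cm, '|||||', *co, '=====', *cy, '>>>>>']
        if o < len(old):
            out.append(old[o])             # the shared token itself
        o0, m0, y0 = o + 1, m1 + 1, y1 + 1
    return out

print(''.join(merge3(['Patching'], ['Patch'], ['Merging'])))
# prints "<<<<<Patching|||||Patch=====Merging>>>>>"
```

Because the chunks are delimited by single tokens instead of whole lines, two edits on the same line end up in different chunks and merge cleanly, which is the effect shown in the demo.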

Tokenizers are simple: they can be implemented as linear scanners that print
each token on its own line, prefixed with a space mark (`.` for space and `|`
for non-space), with newlines and backslashes escaped. The default
tokenization of the string "hello \ world" with a newline at the end is listed
below (note the invisible space on the lines with dots):

```
|hello
. 
|\\
. 
|world
.\n
```
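
A scanner producing this format can be sketched in a few lines. The token classes used here (runs of whitespace, runs of word characters, and single other characters) are an assumption chosen to reproduce the listing above; the classes actually used by `werge` may differ, and `tokenize` and `escape` are hypothetical names.

```python
import re

def escape(token):
    # Escape backslashes first so that the newline escape stays unambiguous.
    return token.replace('\\', '\\\\').replace('\n', '\\n')

def tokenize(text):
    """Return the marked token lines for text, one token per entry."""
    lines = []
    # Assumed classes: whitespace runs, word runs, single other characters.
    for match in re.finditer(r'\s+|\w+|[^\w\s]', text):
        token = match.group(0)
        mark = '.' if token.isspace() else '|'
        lines.append(mark + escape(token))
    return lines

# Prints the six token lines shown in the listing above.
for line in tokenize('hello \\ world\n'):
    print(line)
```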

Users may supply any tokenizer via option `-F`, e.g. this script makes
line-size tokens (reproducing the usual line merges):

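```python
#!/usr/bin/env python3
import sys

# Emit every input line as a single non-space token (mark `|`).
for line in sys.stdin:
    # Backslashes are doubled so they cannot be mistaken for an escape.
    if line.endswith('\n'):
        # The line break is part of the token, written as an escaped `\n`.
        print('|' + line[:-1].replace('\\', '\\\\') + '\\n')
    else:
        # Last line of a file that lacks a trailing newline.
        print('|' + line.replace('\\', '\\\\'))
```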
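
When writing a custom tokenizer, it may help to check that its output decodes back to the original input. The sketch below assumes that concatenating the unescaped tokens should reproduce the input exactly, and that only the backslash and newline escapes described above occur; `unescape` and `decode_tokens` are hypothetical helpers, not part of `werge`.

```python
def unescape(body):
    """Undo the backslash and newline escapes of one token body."""
    out, i = [], 0
    while i < len(body):
        if body[i] == '\\' and i + 1 < len(body):
            # `\n` stands for a line break, `\\` for a single backslash.
            out.append('\n' if body[i + 1] == 'n' else body[i + 1])
            i += 2
        else:
            out.append(body[i])
            i += 1
    return ''.join(out)

def decode_tokens(token_lines):
    """Rebuild the text from marked token lines (first character is the mark)."""
    text = []
    for line in token_lines:
        if line[:1] not in ('.', '|'):
            raise ValueError('missing space mark: %r' % line)
        text.append(unescape(line[1:]))
    return ''.join(text)

# One line-size token, as the script above would print it.
print(repr(decode_tokens(['|Roses are red.\\n'])))
```

Feeding a file through the tokenizer and comparing `decode_tokens` of its output lines with the file contents gives a quick sanity check before using the tokenizer with `-F`.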

## Installation

```sh
cabal install
```

Running `werge` requires a working installation of `diff` compatible with the
one from [GNU diffutils](https://www.gnu.org/software/diffutils/). You may set
the path to such a `diff` (or a wrapper script) via the environment variable
`WERGE_DIFF`.
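
For scripted use, the variable can be set for a single invocation only. The following is a minimal sketch, assuming `werge` is on `PATH`; the helper names `werge_command` and `run_werge` are hypothetical, and any `diff_path` passed in is up to the caller.

```python
import os
import subprocess

def werge_command(my, old, your, options=()):
    # Global options are placed before the `merge` command.
    return ['werge', *options, 'merge', my, old, your]

def run_werge(my, old, your, diff_path=None, options=()):
    """Run a three-way merge, optionally pointing werge at a specific diff."""
    env = dict(os.environ)
    if diff_path is not None:
        env['WERGE_DIFF'] = diff_path  # path to a GNU-compatible diff
    return subprocess.run(werge_command(my, old, your, options),
                          env=env, capture_output=True, text=True)
```

The returned `CompletedProcess` carries whatever the command printed and its exit status; how `werge` uses the exit status is not covered here.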

## Use with `git`

`werge` can automatically process files that are marked in `git` as merge
conflicts:

```sh
$ git merge somebranch
$ werge git -ua
```

Options `-ua` (`--unmerged --add`) find all files that are marked as unmerged,
try to merge them token-by-token, and run `git add` on those where the merge
is successful with the current settings. The current contents of the files are
replaced by the merged (or partially merged) state; backups are written
automatically to `filename.werge-backup`.
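
If an automatic merge turns out to be unwanted, the backups can be copied back. The sketch below assumes that each backup sits next to its file with `.werge-backup` appended to the file name; `restore_backups` is a hypothetical helper and not a `werge` command.

```python
import shutil
from pathlib import Path

SUFFIX = '.werge-backup'

def restore_backups(root='.'):
    """Copy every backup found under root over the file it was made from."""
    restored = []
    for backup in sorted(Path(root).rglob('*' + SUFFIX)):
        # Strip the suffix to get the name of the original file.
        target = backup.with_name(backup.name[:-len(SUFFIX)])
        shutil.copyfile(backup, target)
        restored.append(target)
    return restored
```

The backups themselves are left in place, so the call can be repeated safely.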

## Current `--help` and features

```
werge -- blanks-friendly mergetool for tiny interdwindled changes

Usage: werge [(-F|--tok-filter FILTER) | (-i|--simple-tokens) | 
               (-I|--full-tokens)] [--no-zeal | (-z|--zeal)] 
             [-S|--space (keep|my|old|your)] 
             [-s | --resolve-space (normal|keep|my|old|your)] 
             [--conflict-space-overlaps] [--conflict-space-separate] 
             [--conflict-space-all] [-C|--expand-context N] 
             [--resolve (keep|my|old|your)] [--conflict-overlaps] 
             [--conflict-separate] [--conflict-all] [-G|--color] 
             [--label-start "<<<<<"] [--label-mo "|||||"] [--label-oy "====="] 
             [--label-end ">>>>>"] COMMAND

Available options:
  -F,--tok-filter FILTER   External program to separate the text to tokens
  -i,--simple-tokens       Use wider character class to separate the tokens
                           (results in larger tokens and ignores case)
  -I,--full-tokens         Separate characters by all known character classes
                           (default)
  --no-zeal                avoid zealous mode (default)
  -z,--zeal                Try to zealously minify conflicts, potentially
                           resolving them
  -S,--space (keep|my|old|your)
                           Retain spacing from a selected version, or keep all
                           space changes for merging (default: keep)
  -s                       Shortcut for `--resolve-space keep' (this separates
                           space-only conflicts, enabling better automated
                           resolution)
  --resolve-space (normal|keep|my|old|your)
                           Resolve conflicts in space-only tokens separately,
                           and either keep unresolved conflicts, or resolve in
                           favor of a given version; `normal' resolves the
                           spaces together with other tokens, ignoring choices
                           in --resolve-space-* (default: normal)
  --conflict-space-overlaps
                           Never resolve overlapping changes in space-only
                           tokens
  --conflict-space-separate
                           Never resolve separate (non-overlapping) changes in
                           space-only tokens
  --conflict-space-all     Never resolve any changes in space-only tokens
  -C,--expand-context N    Consider changes that are at most N tokens apart to
                           be a single change. Zero may cause bad resolutions of
                           near conflicting edits (default: 1)
  --resolve (keep|my|old|your)
                           Resolve general conflicts in favor of a given
                           version, or keep the conflicts (default: keep)
  --conflict-overlaps      Never resolve overlapping changes in general tokens
  --conflict-separate      Never resolve separate (non-overlapping) changes in
                           general tokens
  --conflict-all           Never resolve any changes in general tokens
  -G,--color               Use shorter, gaily colored output markers by default
                           (requires ANSI color support; good for terminals or
                           `less -R')
  --label-start "<<<<<"    Label for beginning of the conflict
  --label-mo "|||||"       Separator of local edits and original
  --label-oy "====="       Separator of original and other people's edits
  --label-end ">>>>>"      Label for end of the conflict
  -h,--help                Show this help text
  --version                Show version information

Available commands:
  merge                    diff3-style merge of two changesets
  git                      Automerge unmerged files in git conflict

werge is a free software, use it accordingly.
```

#### Manual merging
```
Usage: werge merge MYFILE OLDFILE YOURFILE

  diff3-style merge of two changesets

Available options:
  MYFILE                   Version with local edits
  OLDFILE                  Original file version
  YOURFILE                 Version with other people's edits
  -h,--help                Show this help text
```

#### Git interoperability
```
Usage: werge git (UNMERGED | (-u|--unmerged)) [(-a|--add) | --no-add]

  Automerge unmerged files in git conflict

Available options:
  UNMERGED                 Unmerged file tracked by git (can be specified
                           repeatedly)
  -u,--unmerged            Process all files marked as unmerged by git
  -a,--add                 Run `git add' for fully merged files
  --no-add                 Prevent running `git add'
  -h,--help                Show this help text
```