/*
 * Copyright (C) 2010, Google Inc.
 * and other copyright owners as documented in the project's IP log.
 *
 * This program and the accompanying materials are made available
 * under the terms of the Eclipse Distribution License v1.0 which
 * accompanies this distribution, is reproduced below, and is
 * available at http://www.eclipse.org/org/documents/edl-v10.php
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials provided
 *   with the distribution.
 *
 * - Neither the name of the Eclipse Foundation, Inc. nor the
 *   names of its contributors may be used to endorse or promote
 *   products derived from this software without specific prior
 *   written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

package org.eclipse.jgit.diff;

Compares two Sequences to create an EditList of changes.

An algorithm's diff method must be callable from concurrent threads without data collisions. This permits some algorithms to use a singleton pattern, with concurrent invocations using the same singleton. Other algorithms may support parameterization, in which case the caller can create a unique instance per thread.

/** * Compares two {@link org.eclipse.jgit.diff.Sequence}s to create an * {@link org.eclipse.jgit.diff.EditList} of changes. * <p> * An algorithm's {@code diff} method must be callable from concurrent threads * without data collisions. This permits some algorithms to use a singleton * pattern, with concurrent invocations using the same singleton. Other * algorithms may support parameterization, in which case the caller can create * a unique instance per thread. */
public abstract class DiffAlgorithm {
Supported diff algorithm
/** * Supported diff algorithm */
public enum SupportedAlgorithm {
Myers diff algorithm
/** * Myers diff algorithm */
MYERS,
Histogram diff algorithm
/** * Histogram diff algorithm */
HISTOGRAM }
Get diff algorithm
Params:
  • alg – the diff algorithm for which an implementation should be returned
Returns:an implementation of the specified diff algorithm
/** * Get diff algorithm * * @param alg * the diff algorithm for which an implementation should be * returned * @return an implementation of the specified diff algorithm */
public static DiffAlgorithm getAlgorithm(SupportedAlgorithm alg) { switch (alg) { case MYERS: return MyersDiff.INSTANCE; case HISTOGRAM: return new HistogramDiff(); default: throw new IllegalArgumentException(); } }
Compare two sequences and identify a list of edits between them.
Params:
  • cmp – the comparator supplying the element equivalence function.
  • a – the first (also known as old or pre-image) sequence. Edits returned by this algorithm will reference indexes using the 'A' side: Edit.getBeginA(), Edit.getEndA().
  • b – the second (also known as new or post-image) sequence. Edits returned by this algorithm will reference indexes using the 'B' side: Edit.getBeginB(), Edit.getEndB().
Returns:a modifiable edit list comparing the two sequences. If empty, the sequences are identical according to cmp's rules. The result list is never null.
/** * Compare two sequences and identify a list of edits between them. * * @param cmp * the comparator supplying the element equivalence function. * @param a * the first (also known as old or pre-image) sequence. Edits * returned by this algorithm will reference indexes using the * 'A' side: {@link org.eclipse.jgit.diff.Edit#getBeginA()}, * {@link org.eclipse.jgit.diff.Edit#getEndA()}. * @param b * the second (also known as new or post-image) sequence. Edits * returned by this algorithm will reference indexes using the * 'B' side: {@link org.eclipse.jgit.diff.Edit#getBeginB()}, * {@link org.eclipse.jgit.diff.Edit#getEndB()}. * @return a modifiable edit list comparing the two sequences. If empty, the * sequences are identical according to {@code cmp}'s rules. The * result list is never null. */
public <S extends Sequence> EditList diff( SequenceComparator<? super S> cmp, S a, S b) { Edit region = cmp.reduceCommonStartEnd(a, b, coverEdit(a, b)); switch (region.getType()) { case INSERT: case DELETE: return EditList.singleton(region); case REPLACE: { if (region.getLengthA() == 1 && region.getLengthB() == 1) return EditList.singleton(region); SubsequenceComparator<S> cs = new SubsequenceComparator<>(cmp); Subsequence<S> as = Subsequence.a(a, region); Subsequence<S> bs = Subsequence.b(b, region); EditList e = Subsequence.toBase(diffNonCommon(cs, as, bs), as, bs); return normalize(cmp, e, a, b); } case EMPTY: return new EditList(0); default: throw new IllegalStateException(); } } private static <S extends Sequence> Edit coverEdit(S a, S b) { return new Edit(0, a.size(), 0, b.size()); }
Reorganize an EditList for better diff consistency.

DiffAlgorithms may return Type.INSERT or Type.DELETE edits that can be "shifted". For example, the deleted section

-a
-b
-c
 a
 b
 c
can be shifted down by 1, 2 or 3 locations.

To avoid later merge issues, we shift such edits to a consistent location. normalize uses a simple strategy of shifting such edits to their latest possible location.

This strategy may not always produce an aesthetically pleasing diff. For instance, it works well with

 function1 {
  ...
 }
+function2 {
+ ...
+}
+
function3 {
...
}
but less so for
 #
 # comment1
 #
 function1() {
 }
 #
+# comment3
+#
+function3() {
+}
+
+#
 # comment2
 #
 function2() {
 }
More sophisticated strategies are possible, say by calculating a suitable "aesthetic cost" for each possible position and using the lowest cost, but normalize just shifts edits to the end as much as possible.
Params:
  • cmp – the comparator supplying the element equivalence function.
  • e – a modifiable edit list comparing the provided sequences.
  • a – the first (also known as old or pre-image) sequence.
  • b – the second (also known as new or post-image) sequence.
Type parameters:
  • <S> – type of sequence being compared.
Returns:a modifiable edit list with edit regions shifted to their latest possible location. The result list is never null.
Since:4.7
/** * Reorganize an {@link EditList} for better diff consistency. * <p> * {@code DiffAlgorithms} may return {@link Edit.Type#INSERT} or * {@link Edit.Type#DELETE} edits that can be "shifted". For * example, the deleted section * <pre> * -a * -b * -c * a * b * c * </pre> * can be shifted down by 1, 2 or 3 locations. * <p> * To avoid later merge issues, we shift such edits to a * consistent location. {@code normalize} uses a simple strategy of * shifting such edits to their latest possible location. * <p> * This strategy may not always produce an aesthetically pleasing * diff. For instance, it works well with * <pre> * function1 { * ... * } * * +function2 { * + ... * +} * + * function3 { * ... * } * </pre> * but less so for * <pre> * # * # comment1 * # * function1() { * } * * # * +# comment3 * +# * +function3() { * +} * + * +# * # comment2 * # * function2() { * } * </pre> * <a href="https://github.com/mhagger/diff-slider-tools">More * sophisticated strategies</a> are possible, say by calculating a * suitable "aesthetic cost" for each possible position and using * the lowest cost, but {@code normalize} just shifts edits * to the end as much as possible. * * @param <S> * type of sequence being compared. * @param cmp * the comparator supplying the element equivalence function. * @param e * a modifiable edit list comparing the provided sequences. * @param a * the first (also known as old or pre-image) sequence. * @param b * the second (also known as new or post-image) sequence. * @return a modifiable edit list with edit regions shifted to their * latest possible location. The result list is never null. * @since 4.7 */
private static <S extends Sequence> EditList normalize( SequenceComparator<? super S> cmp, EditList e, S a, S b) { Edit prev = null; for (int i = e.size() - 1; i >= 0; i--) { Edit cur = e.get(i); Edit.Type curType = cur.getType(); int maxA = (prev == null) ? a.size() : prev.beginA; int maxB = (prev == null) ? b.size() : prev.beginB; if (curType == Edit.Type.INSERT) { while (cur.endA < maxA && cur.endB < maxB && cmp.equals(b, cur.beginB, b, cur.endB)) { cur.shift(1); } } else if (curType == Edit.Type.DELETE) { while (cur.endA < maxA && cur.endB < maxB && cmp.equals(a, cur.beginA, a, cur.endA)) { cur.shift(1); } } prev = cur; } return e; }
Compare two sequences and identify a list of edits between them. This method should be invoked only after the two sequences have been proven to have no common starting or ending elements. The expected elimination of common starting and ending elements is automatically performed by the diff(SequenceComparator<? super Sequence>, Sequence, Sequence) method, which invokes this method using Subsequences.
Params:
  • cmp – the comparator supplying the element equivalence function.
  • a – the first (also known as old or pre-image) sequence. Edits returned by this algorithm will reference indexes using the 'A' side: Edit.getBeginA(), Edit.getEndA().
  • b – the second (also known as new or post-image) sequence. Edits returned by this algorithm will reference indexes using the 'B' side: Edit.getBeginB(), Edit.getEndB().
Returns:a modifiable edit list comparing the two sequences.
/** * Compare two sequences and identify a list of edits between them. * * This method should be invoked only after the two sequences have been * proven to have no common starting or ending elements. The expected * elimination of common starting and ending elements is automatically * performed by the {@link #diff(SequenceComparator, Sequence, Sequence)} * method, which invokes this method using * {@link org.eclipse.jgit.diff.Subsequence}s. * * @param cmp * the comparator supplying the element equivalence function. * @param a * the first (also known as old or pre-image) sequence. Edits * returned by this algorithm will reference indexes using the * 'A' side: {@link org.eclipse.jgit.diff.Edit#getBeginA()}, * {@link org.eclipse.jgit.diff.Edit#getEndA()}. * @param b * the second (also known as new or post-image) sequence. Edits * returned by this algorithm will reference indexes using the * 'B' side: {@link org.eclipse.jgit.diff.Edit#getBeginB()}, * {@link org.eclipse.jgit.diff.Edit#getEndB()}. * @return a modifiable edit list comparing the two sequences. */
public abstract <S extends Sequence> EditList diffNonCommon( SequenceComparator<? super S> cmp, S a, S b); }